Monday, October 23, 2017

Re: How to handle non-ascii characters?

On 10/23/2017 1:50 PM, Eli the Bearded wrote:
> Barry Gold wrote:
>> None of these looks like themselves when I edit the file with vim in a
>> cygwin Terminal window. I can search for [^ -~^t] to find the non-ASCII
>> characters, then go to the original word document to find out what the
>> correct character is. If I had only a few of these, that would be
>> enough. But in a longer document, a given non-ASCII can occur hundreds
>> of times. So once I've found (e.g.) an emdash, I want to replace _all_
>> occurrences with "&mdash;". But I have no way of representing the
>> character I want to replace on the command line.
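
One way to name such a character on the command line is by code point instead of typing it. This is only a sketch. It assumes the file is being read as UTF-8 (under a Windows code page the em dash has a different value), and it assumes the wanted replacement is the HTML entity:

```vim
" With the cursor on the unknown character, in Normal mode:
"   ga   reports its code point (an em dash shows Hex 2014)
"   g8   reports its UTF-8 bytes (e2 80 94 for an em dash)

" Jump to the next character that is neither printable ASCII nor a tab.
/[^ -~\t]

" Replace every em dash (U+2014), addressed by code point.
" \& is a literal ampersand; the &mdash; target is an assumption.
:%s/\%u2014/\&mdash;/g
```

The character can also be typed directly on the command line with Ctrl-V followed by u2014.
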
> I have a very similar problem to yours and have evolved some fixes that
> I use. You've already gotten some replies, but maybe my methods would
> help, too.
>
> In my case, I paste content from web pages into Usenet posts and want to
> have as much US-ASCII as possible for best readability. To that end I
> have a specific vimrc for news that fixes things with map!s. It could
> easily be modified to a ':so script' usage, to fix things on command
> or an 'autocmd BufRead *.html' script to fix things on load.
>
> In my vimrc:
>
> autocmd BufRead .article.* :so ~eli/.news_vimrc
>
> And my news_vimrc looks like this:
>
> :r! cat ~/.news_vimrc | mmencode -q
> " smart quotes
> map! =E2=80=99 '
> map! =E2=80=98 '
> map! =E2=80=9C "
> map! =E2=80=9D "
> map! =E2=80=B3 "
> " ellipsis
> map! =E2=80=A6 ...
> " n-dash
> map! =E2=80=93 --
> " m-dash
> map! =E2=80=94 --
> " U+2212 minus
> map! =E2=88=92 -
> " U+2010 hyphen
> map! =E2=80=90 -
> "
> " find non-ascii
> map <F5> /[^ -~]<cr>
> " add mime headers if leaving in non-ascii
> map <F6> iContent-Type: text/plain; charset=3D"UTF-8"<cr>MIME-Version: 1.=
> 0<cr><esc>
> map! <F6> Content-Type: text/plain; charset=3D"UTF-8"<cr>MIME-Version: 1.=
> 0<cr>
> " general news settings
> set ai sw=3D4 tw=3D72
>
> Basically, I'm suggesting that you take all the characters you find and
> want to replace, and save the replacements in a script you can run
> easily before looking for new characters that you want to fix.
>
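
A note on reading the listing: it is shown quoted-printable encoded (the mmencode -q step), so =E2=80=99 stands for the three UTF-8 bytes of U+2019, =3D is a literal "=", and a trailing "=" joins a line to the next one. What follows is a rough sketch of the ':so script' variant suggested above, using substitutions by code point so the script itself stays pure ASCII. The file name fix-nonascii.vim and the augroup name are made up for illustration, and it assumes the buffer is read as UTF-8:

```vim
" ~/fix-nonascii.vim (hypothetical name); run with  :so ~/fix-nonascii.vim
" \%uXXXX matches one character by its Unicode code point.
" The e flag keeps :s quiet when a character does not occur in the buffer.
" smart quotes and double prime
%s/\%u2018/'/ge
%s/\%u2019/'/ge
%s/\%u201c/"/ge
%s/\%u201d/"/ge
%s/\%u2033/"/ge
" ellipsis
%s/\%u2026/.../ge
" n-dash and m-dash
%s/\%u2013/--/ge
%s/\%u2014/--/ge
" U+2212 minus and U+2010 hyphen
%s/\%u2212/-/ge
%s/\%u2010/-/ge
```

To run it on load instead of on command, something like this could go in the vimrc (not in the script itself):

```vim
augroup FixNonAscii
  autocmd!
  autocmd BufRead *.html source ~/fix-nonascii.vim
augroup END
```
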
> I use http://qaz.wtf/u/ "Show unicode character" if needed to identify
> characters; the plugin might suit you better.
>
> And I have a long-standing macro:
>
> " Use * to "run" a line from the edit buffer
> " Mnemonic: * is executable in "ls -F"
> " Uses register y
> :map * "yyy@y
>
> If I were you, I would make the commands, test them with *, then 'p'ut
> them in the fix script.
>
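
The * macro, unpacked. This is a sketch, not the original: nnoremap is swapped in for :map, which limits the mapping to Normal mode and keeps other mappings from interfering. Either form replaces Vim's built-in * (search for the word under the cursor).

```vim
" "yyy  yank the current line, including its newline, into register y
" @y    replay register y as if it had been typed; a line that starts
"       with : therefore runs as an Ex command, and the yanked newline
"       acts as the final <CR>
nnoremap * "yyy@y
```

For example, with a buffer line reading `:%s/\%u2014/--/ge` and the cursor anywhere on it, pressing * runs that substitution on the current buffer.
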
> That * command is one of three macros I consider essential. The other
> two I think are less likely to be universally useful, but anyway:
>
> " Find previous space and split line on it
> " Mnemonic: 'S'pace
> :map S F r<CR>
> "
> " Double the character under the cursor
> " Mnemonic: fix C code like "if (0 = i) ..."
> :map = y p
>
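
The other two macros, spelled out with comments. Again a sketch: nnoremap and an explicit <Space> are used here so the blank in each mapping is visible. Both shadow built-in Normal mode commands (S and the = operator).

```vim
" S: F<Space> jumps back to the previous space on the current line,
"    r<CR> replaces that space with a line break.
nnoremap S F<Space>r<CR>

" =: y<Space> yanks the character under the cursor (<Space> is a
"    one-character motion to the right), p puts it after the cursor.
"    On the = in "if (0 = i)" this gives "if (0 == i)".
nnoremap = y<Space>p
```
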
I'm impressed that you can type your entire vimrc from memory. I'm
tempted to use some of that. If only I _understood_ it.


--
On Beta, we'd have earrings for that. You could buy them in any jewelry store.
http://www.conchord.org/xeno/bdgsig.html

--
You received this message from the "vim_use" maillist.