Monday, October 23, 2017

Re: How to handle non-ascii characters?

To each his/her taste of course, but IMHO all-ASCII does not always go
hand in hand with best legibility. What it does go hand in hand with
is best typability on English-language keyboards when no keymap is in
use.

My first HTML pages were in French and I took the trouble to replace
all French non-ASCII characters by their ASCII entities: &agrave;
&ccedil; &eacute; &iuml; &oelig; &ucirc; etc. etc. etc. For French the
result is still (barely) human-readable thanks to the many unaccented
Latin letters in the text.

My current project is a Russian-French dictionary, and replacing all
Cyrillic with numeric entities would have turned it completely into
gobbledygook. Here I have chosen UTF-8, typed with the help of a
self-written keymap and an "international" Latin-alphabet keyboard.
The only exceptions are rare cases such as &nbsp;, which I write as an
entity in otherwise empty <td> table cells. This suits me perfectly;
even the French text looks more readable to me now that I write it in
UTF-8.
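
The difference between the two cases can be seen with a small script.
What follows is only a minimal Python sketch, not anything used in this
thread; the function name to_entities is made up for illustration. It
relies on the standard library table html.entities.codepoint2name, which
covers the HTML 4 named entities, and falls back to decimal numeric
entities for every other non-ASCII character, which is what happens to
Cyrillic.

```python
from html.entities import codepoint2name


def to_entities(text: str) -> str:
    """Replace every non-ASCII character in text with an HTML entity.

    Hypothetical helper for illustration: a named entity is used when
    HTML 4 defines one, otherwise a decimal numeric entity.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 128:
            # Plain ASCII is passed through unchanged.
            out.append(ch)
        elif cp in codepoint2name:
            # Named entity, e.g. U+00E9 -> &eacute;
            out.append(f"&{codepoint2name[cp]};")
        else:
            # No HTML 4 name (e.g. Cyrillic): decimal numeric entity.
            out.append(f"&#{cp};")
    return "".join(out)


if __name__ == "__main__":
    print(to_entities("déjà"))
    print(to_entities("да"))
```

With this, "déjà" comes out as d&eacute;j&agrave;, which is still
recognisable, while "да" becomes &#1076;&#1072;.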

Best regards,
Tony.

On Mon, Oct 23, 2017 at 10:50 PM, Eli the Bearded
<vim@eli.users.panix.com> wrote:
> Barry Gold wrote:
>> None of these looks like themselves when I edit the file with vim in a
>> cygwin Terminal window. I can search for [^ -~^t] to find the non-ASCII
>> characters, then go to the original word document to find out what the
>> correct character is. If I had only a few of these, that would be
>> enough. But in a longer document, a given non-ASCII can occur hundreds
>> of times. So once I've found (e.g.) an emdash, I want to replace _all_
>> occurrences with "&mdash;". But I have no way of representing the
>> character I want to replace on the command line.
>
> I have a very similar problem to yours and have evolved some fixes that
> I use. You've already gotten some replies, but maybe my methods would
> help, too.
>
> In my case, I paste content from web pages into Usenet posts and want to
> have as much US-ASCII as possible for best readability. To that end I
> have a specific vimrc for news that fixes things with map!s. It could
> easily be modified to a ':so script' usage, to fix things on command,
> or an 'autocmd BufRead *.html' script to fix things on load.
>
> In my vimrc:
>
> autocmd BufRead .article.* :so ~eli/.news_vimrc
>
> And my news_vimrc looks like this:
>
> :r! cat ~/.news_vimrc | mmencode -q
> " smart quotes
> map! =E2=80=99 '
> map! =E2=80=98 '
> map! =E2=80=9C "
> map! =E2=80=9D "
> map! =E2=80=B3 "
> " ellipsis
> map! =E2=80=A6 ...
> " n-dash
> map! =E2=80=93 --
> " m-dash
> map! =E2=80=94 --
> " U+2212 minus
> map! =E2=88=92 -
> " U+2010 hyphen
> map! =E2=80=90 -
> "
> " find non-ascii
> map <F5> /[^ -~]<cr>
> " add mime headers if leaving in non-ascii
> map <F6> iContent-Type: text/plain; charset=3D"UTF-8"<cr>MIME-Version: 1.=
> 0<cr><esc>
> map! <F6> Content-Type: text/plain; charset=3D"UTF-8"<cr>MIME-Version: 1.=
> 0<cr>
> " general news settings
> set ai sw=3D4 tw=3D72
>
> Basically, I'm suggesting that you take all the characters you find and
> want to replace, and save the replacements in a script you can run
> easily before looking for new characters that you want to fix.
>
> I use http://qaz.wtf/u/ "Show unicode character" if needed to identify
> characters; the plugin might suit you better.
>
> And I have a long-standing macro:
>
> " Use * to "run" a line from the edit buffer
> " Mnemonic: * is executable in "ls -F"
> " Uses register y
> :map * "yyy@y
>
> If I were you, I would make the commands, test them with *, then 'p'ut
> them in the fix script.
>
> That * command is one of three macros I consider essential. The other
> two I think are less likely to be universally useful, but anyway:
>
> " Find previous space and split line on it
> " Mnemonic: 'S'pace
> :map S F r<CR>
> "
> " Double the character under the cursor
> " Mnemonic: fix C code like "if (0 = i) ..."
> :map = y p
>
> Elijah
> ------
> can type his entire vimrc from memory, and often does
>
> --
> --
> You received this message from the "vim_use" maillist.
> Do not top-post! Type your reply below the text you are replying to.
> For more information, visit http://www.vim.org/maillist.php
>
> ---
> You received this message because you are subscribed to the Google Groups "vim_use" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to vim_use+unsubscribe@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
