Wednesday, September 19, 2018

Re: No Cyrillic text in CP866

On Thu, Sep 20, 2018 at 4:21 AM Charles E Campbell
<drchip@campbellfamily.biz> wrote:
>
> Anton Shepelev wrote:
> > Hello, all
> >
> > I am using text-mode Vim on Windows XP, where 'chcp'
> > tells me that the terminal encoding is cp866. Why
> > can't I type Russian characters with enc=cp866? It
> > works with enc=utf-8, though, but I expect Vim also
> > to support Cyrillic while using an encoding that
> > matches that of the terminal...
> >
> This sounds like a problem for Tony Mechelynck; from what I've seen,
> he's really good at encoding issues. Unfortunately, it appears that he
> didn't see your plaint, and he seems to be responding via github
> recently and so his email is hidden. I've sent him this message with
> BCC: .
>
> Regards,
> Chip Campbell

I answer via github to messages which originate on github, because
otherwise my name is replaced on github by some robot ID; but
basically I follow the vim_use and vim_dev newsgroups.

Yeah, I've made encoding issues in Vim a kind of "specialty" of mine,
ever since I came to Vim, found that it supported UTF-8 (which,
unbeknownst to me, was a sort of novelty at the time), tried to
understand what the help said about it, succeeded, and wrote a FAQ
chapter and a few wiki pages which IIRC Bram later used to fill up the
already existing multibyte documentation.

However, it's been years and years since I left Windows; my present
system is openSUSE Linux 15.0, and it has a "sane" locale policy,
using UTF-8 wherever possible: my system locale comes as
$LANG=en_US.UTF-8 (meaning "use that for all not otherwise specified
parts of the locale", and in particular for $LC_CTYPE, which Vim uses
at startup to set the default 'encoding').

I have absolutely no experience with CP866. The mixed Cyrillic/Latin
texts that I write are in UTF-8 (e.g. the dictionary accessed, among
others, at
http://users.skynet.be/antoine.mechelynck/slovarj/ru-fr.abbrev.html
; letters А "ah" to part of С "es" already exist). My reasoned opinion
in this matter is that even to read and write files in CP866,
Windows-1251 or KOI8-R, our friend Антон Шепелев ;-) should set
'encoding' to UTF-8 near the top of his vimrc (defining the *internal*
charset used by Vim to be the Universal one) while converting when
reading and writing by means of 'fileencodings' (plural) (q.v.) when
possible and of 'fileencoding' (singular) (see :help ++enc) when
necessary.

If the Windows locale is CP866 (which is an 8-bit encoding, so the
check for it can never fail, and it should therefore come last in
'fileencodings'), my guess is that setting 'encoding' to utf-8 and
'fileencodings' (plural) to ucs-bom,utf-8,cp866 ought to give good
results; however, it will read Latin1 as if it were CP866. The
alternative (if 'fileencodings' contains Latin1 instead of cp866) is
to _always_ read CP866 files with ++enc=cp866 as a modifier to the
:e[dit] command: Vim will then remember it when writing back the
modified file. Similarly, use ++enc=koi8-r or ++enc=Windows-1251 as
appropriate (assuming "of course" that either Vim is compiled with
+iconv, or it is compiled with +iconv/dyn and there is an iconv.dll or
libiconv.dll where Vim can find it).
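
For concreteness, the settings described above might look like the
following in a vimrc. This is only a sketch and untested on a CP866
Windows console; the file name notes.txt is a made-up example, and the
commented-out lines are meant to be typed at the Vim command line
rather than kept in the vimrc.

```vim
" Near the top of the vimrc: use UTF-8 as Vim's internal charset.
set encoding=utf-8

" Encodings tried in order when reading a file; the 8-bit cp866
" comes last because the check for an 8-bit encoding never fails.
set fileencodings=ucs-bom,utf-8,cp866

" Alternative, if 'fileencodings' ends in latin1 instead of cp866:
" force the encoding for one file (notes.txt is hypothetical).
" :e ++enc=cp866 notes.txt

" Show the encoding Vim will use when writing this buffer back.
" :setlocal fileencoding?
```

With the first variant, any file that is neither UTF-8 nor marked with
a BOM is taken to be CP866, which is the Latin1 caveat mentioned above.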

See https://vim.wikia.com/wiki/Working_with_Unicode for more
information. Don't miss it! It is a little verbose but it should
clarify the difficult parts which undoubtedly exist in the above
paragraph.

Best regards,
Tony.
