Saturday, July 9, 2011

Re: Encoding issue

On 07/07/11 17:27, Ben Fritz wrote:
>
>
> On Jul 7, 7:49 am, Михаил Голубев<qsolo...@gmail.com> wrote:
>> Another question to wizards)
>>
>> I run gvim on Windows 7 Professional x86 so my default encoding is set to
>> native cp1251. To avoid problems when opening files with Unicode encoding
>> ('fileencoding') I want to change the 'encoding' value to utf-8. But when I
>> do so, some standard command-line messages that were translated to Russian
>> become corrupted and look like this:
>>
>> http://dl.dropbox.com/u/14502217/messages_corrupted.png
>>
>> Is there any way to somehow "reencode" them or to turn off such translated
>> elements at all?
>>
>
> Where do you set your encoding? It should be pretty much the first
> thing in your .vimrc. The only thing it comes after for me, is "set
> nocompatible" and "let &termencoding = &encoding".
>
> If I understand correctly, when you change your encoding all the
> buffers, mappings, menus, and internal variables do not change their
> binary-encoded value; only their meaning changes. So, you need to set
> your encoding before doing anything else which might set these values.
> I.e., set it at the very beginning of your .vimrc.
>
> By the way, if you haven't figured it out already, it is probably a
> good idea to include cp1251 in your 'fileencodings' option, before any
> other 8-bit encoding, so that Vim correctly loads files in this
> encoding as well.

cp1251 is an 8-bit encoding, and as such it cannot give an "error"
signal when trying to open a file with it. In 8-bit encodings, there are
no invalid bytes. This means that anything after the first 8-bit
encoding in 'fileencodings' will never be tried. For instance, if you have

:set fencs=ucs-bom,utf-8,cp1251,iso-8859-15,latin1,shift-jis

the last three (including shift-jis, which is a multibyte encoding) will
never be tried. If there is a recognised BOM at the very start, it will
be used to determine the 'fileencoding'; otherwise utf-8 will be set if
there is not a single invalid UTF-8 sequence in the whole file, and
failing that cp1251 will be set -- period.

In short:
- ucs-bom, if used, should be first
- an 8-bit encoding, if used, should be last
- Since there is only one "last" item, there should be at most one 8-bit
encoding.
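
As a rough sketch only, assuming utf-8 for 'encoding' and cp1251 as the
single 8-bit fallback (the values discussed in this thread; substitute
your own), the top of a vimrc that follows these rules and the ordering
advice quoted above might look like this:

```vim
" Sketch only: adapt the encodings to the files you actually edit.
set nocompatible

" Remember the system encoding for keyboard/terminal I/O
" before 'encoding' is changed.
let &termencoding = &encoding

" Set 'encoding' early, before any mappings, menus or variables
" are defined, so they are not reinterpreted afterwards.
set encoding=utf-8

" ucs-bom first, then utf-8, then exactly one 8-bit encoding, last.
set fileencodings=ucs-bom,utf-8,cp1251
```

After opening a file, ":set fenc?" shows which encoding was actually
chosen, and ":verbose set fencs?" shows the current list and where it
was last set.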

>
> See the :help obviously, but also this for some reference:
>
> http://vim.wikia.com/wiki/Working_with_Unicode
>
> That's how I got started with my encoding setup.
>

Best regards,
Tony.
--
hundred-and-one symptoms of being an internet addict:
129. You cancel your newspaper subscription.

--
You received this message from the "vim_use" maillist.
Do not top-post! Type your reply below the text you are replying to.
For more information, visit http://www.vim.org/maillist.php
