Wednesday, November 3, 2010

Re: enc,fenc (again!?)

On 24/10/10 18:57, Alessandro Antonello wrote:
> Hi, all.
>
> I know that this has been discussed for a long time but I still don't
> understand why Vim behaves like that.
>
> Sometimes I need to work with several buffers. Some are in natural 'utf-8'
> encoding. I say natural because it is the system default (I am on a Mac) and
> is defined in Vim's 'encoding' option. But some buffers need to be in 'latin1'.
> Opening the files in the right encoding is easy, since I use the '++enc'
> option. So the files are read correctly. The problem begins when I use
> 'bnext' and 'bprevious' (or another similar command) to navigate between the
> buffers. Since my encoding is 'utf-8', when I navigate from a buffer with this
> encoding to a buffer that was opened with '++enc=latin1' this buffer gets its
> encoding changed to 'utf-8'.
>
> I have been trying to solve this problem for a long time without success.
> What I can't understand is why Vim ignores the buffer 'fileencoding' variable?
> It is set correctly and is set when the buffer is opened. To be sure about
> that I manually set the 'fileencoding' variable when the buffer is opened.
>
> Let me explain this right. I open a file with '++enc=latin1'. In the 'BufRead'
> autocommand I set the 'fenc' variable like this: call setbufvar('%', '&fenc',
> 'latin1').
>
> Then I navigate to another buffer that is in another encoding (like 'utf-8').
> When I come back to the first buffer, its 'fenc' variable is 'utf-8' and I
> don't understand why!
>
> My current global configuration is as follows:
>
> encoding=utf-8
> fileencodings=ucs-bom,utf-8,default,latin1
> fileencoding=utf-8
>

When Vim loads an existing file (or reloads one which you just edited)
that contains only bytes in the range 0-127, it detects the file as
UTF-8 without BOM, in preference to Latin1. This is not an error,
because these 128 characters are represented identically in US-ASCII,
Latin1 (and most other non-EBCDIC encodings) and UTF-8; and UTF-8 has to
come before Latin1 in 'fileencodings', or it would never be detected,
since any byte sequence is valid Latin1. As long as you don't add any
characters with the high bit set, the file will be read or written
exactly the same way whether its 'fileencoding' is set to utf-8, latin1,
or even us-ascii.
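
To see the byte-level point outside Vim, here is a minimal Python sketch.
It is only an illustration: the sample strings are made up, and Python's
codec names ("ascii", "latin-1", "utf-8") stand in for Vim's.

```python
# Bytes in the range 0-127 only (hypothetical sample content).
ascii_only = b"plain ASCII text\n"
encodings = ("ascii", "latin-1", "utf-8")

# Decoding gives the same text under all three encodings...
decoded = {enc: ascii_only.decode(enc) for enc in encodings}
same_text = len(set(decoded.values())) == 1

# ...and encoding that text back gives the same bytes again.
text = decoded["utf-8"]
same_bytes = all(text.encode(enc) == ascii_only for enc in encodings)

# One character above 127 (e-acute, U+00E9) breaks the equivalence.
accented = "caf\u00e9"
latin1_bytes = accented.encode("latin-1")  # one byte for the e-acute
utf8_bytes = accented.encode("utf-8")      # two bytes for the e-acute

print(same_text, same_bytes, latin1_bytes != utf8_bytes)
```

So for a pure-ASCII file the 'fileencoding' label makes no difference to
what is on disk; it only starts to matter once a character above 127 is
added.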

See ":help 'fileencodings'" (or ":h 'fencs'" if you're a lazy typist
;-) ) for an explanation of how the charset, and BOM if any, of an
existing file are detected.

If you want to be sure that a given file will be loaded as Latin1
(assuming 'fileencodings' is set to "ucs-bom,utf-8,latin1" -- or maybe
"ucs-bom,utf-8,default,latin1": trying UTF-8 once as itself and once as
default entails only a negligible performance loss in most cases), make
sure that it contains one or more characters in the range 128-255. You
could use accented letters, possibly in a string literal or in a
comment, or underline the top title of a plaintext file with a line of
"divided by" signs rather than dashes or equals signs. Then ":setlocal
fenc=latin1" followed by ":w" will have the file detected as Latin1 on
later loads.
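
The effect of that one high-bit character can be sketched in Python. This
is a rough model of "try each entry in order, take the first that fits",
not Vim's actual detection code: detect() is a hypothetical helper, the
"ucs-bom" step only looks for a UTF-8 BOM, and the "default" entry is
left out.

```python
import codecs

def detect(data, candidates=("ucs-bom", "utf-8", "latin-1")):
    """Return the first candidate that accepts `data` (hypothetical helper)."""
    for name in candidates:
        if name == "ucs-bom":
            # Simplification: only the UTF-8 BOM is checked here.
            if data.startswith(codecs.BOM_UTF8):
                return "utf-8-sig"
            continue
        try:
            data.decode(name)
        except UnicodeDecodeError:
            continue  # not valid in this encoding, try the next one
        return name
    return None

# Title underlined with equals signs: bytes 0-127 only.
plain = b"Title\n=====\n"
# Same title underlined with "divided by" signs (U+00F7), saved as Latin1.
underlined = "Title\n\u00f7\u00f7\u00f7\u00f7\u00f7\n".encode("latin-1")

print(detect(plain))       # utf-8
print(detect(underlined))  # latin-1
```

The Latin1 byte for the "divided by" sign is not valid UTF-8, so the
utf-8 entry is rejected and the latin-1 entry gets its turn. Putting
latin-1 first in the candidate list would make it win for every file,
which is the reason UTF-8 has to come before it.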


Best regards,
Tony.
--
ARTHUR: You are indeed brave Sir knight, but the fight is mine.
BLACK KNIGHT: Had enough?
ARTHUR: You stupid bastard. You havn't got any arms left.
"Monty Python and the Holy Grail" PYTHON (MONTY)
PICTURES LTD

--
You received this message from the "vim_use" maillist.
Do not top-post! Type your reply below the text you are replying to.
For more information, visit http://www.vim.org/maillist.php
