Sunday, July 6, 2014

Re: Encoding and Fileencoding of a latin1 file

On Sunday, July 6, 2014 5:32:32 AM UTC-5, rameo wrote:
> Thank you very much Ben for your great explication.
>
> Not easy to understand.
>
>

Agreed, encoding is hard to understand in the best of cases. I think Vim's mix of four options ('encoding', 'fileencoding', 'fileencodings' and 'termencoding', or enc, fenc, fencs and tenc for short) makes it even more confusing, especially since one of them (fenc) is buffer-local but also has a global value that serves as the default for new buffers. Read Tony's post for a great detailed explanation.

In summary, I think the best method is:

1. Set "encoding" to utf-8 and forget the option even exists
2. Set "fileencodings" to detect the files you edit most. Consider a plugin like autofenc for any rough edges.
3. Pay attention to the intended encoding of your file and set "fileencoding" accordingly if Vim guesses wrong.

You can forget about "termencoding" unless you use Vim in a terminal a lot and you have encoding/display problems.
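
As a rough sketch of the three steps above (the 'fileencodings' list is only an example, not a recommendation for every setup; substitute the encodings you actually use):

" 1. Vim's internal encoding; put this near the top of the vimrc
set encoding=utf-8
" 2. encodings to try, in order, when reading a file
set fileencodings=ucs-bom,utf-8,cp1252

And for step 3, in a buffer where Vim guessed wrong (cp1252 is just the example here):

" re-read the current file as cp1252
:e ++enc=cp1252
" or, if it was read correctly, only change the encoding used when writing
:setlocal fileencoding=cp1252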

>
> I still don't understand why my vimrc and menu.vim, containing both french characters as "œu", could be read in latin1 in the past, without any problem or error.
>
> (The only encoding line I had in my vimrc file at that moment was "set encoding=latin1")
>

Again, Windows likes to pretend cp1252 (a.k.a. Windows-1252) and Latin1 are synonymous. They are not, but Vim treats them as such if Vim's encoding is set to Latin1 (default value on most English Windows installations).

>
>
> What I also don't understand is that with the above setting, files in latin1 were encoded in latin1, but the fileencoding in my statusline was empty (no fileencoding was indicated by vim).
>
> If I changed the above setting to "set encoding=utf8", encoding and fileencoding both indicated utf-8. Does vim take as default encoding the default windows encoding?
>
>

The default encoding when saving a file, if the 'fileencoding' option is empty, matches Vim's 'encoding' option. Like I said above though, you should really forget that the 'encoding' option even exists, so your fileencoding should normally be set to something.

That's what the "setglobal fileencoding=utf-8" command I suggested is for. Or substitute your preferred default encoding.
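
To see what Vim will actually use, something like this should work. The statusline item is just one common way to write the fallback, not something taken from my own config, and it assumes you already build your statusline with "set statusline":

" show the current values
:set encoding? fileencoding?
:setglobal fileencoding?

" in the vimrc: default 'fileencoding' for new buffers
setglobal fileencoding=utf-8
" show 'fileencoding', falling back to 'encoding' when it is empty
set statusline+=%{&fenc!=''?&fenc:&enc}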

>
> Every now and then I write something in Russian that is why it might be better to change the default encoding to utf8, isn't it? I had also troubles to use a plugin using latin1 as the default encoding.
>

I suggest setting "encoding" to utf-8 regardless of whether you are writing files with special characters. With a utf-8 encoding, you can set various Vim options like 'listchars' and 'showbreak' to fancy Unicode characters instead of boring ASCII characters, plugins can show fancy arrows and the like for their UI, and other such niceties. True Latin1 should not give you any trouble with a Vim running in utf-8 mode, and if it does, a simple "scriptencoding" added at the top of the plugin, as Tony details, will fix that.
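
For example, a vimrc saved as utf-8 could contain something like this. The particular characters are only ones picked for illustration, and the lines need to come after "set encoding=utf-8":

" tell Vim this script itself is written in utf-8
scriptencoding utf-8
" show tabs, trailing spaces and end-of-line with Unicode symbols
set list
set listchars=tab:→\ ,trail:·,eol:¬
" mark the continuation of wrapped lines
set showbreak=↪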

> If I set my default encoding to utf-8, what would be the "fileencodings"?
>
> Set fileencodings=ucs-bom,utf8,cp1252,latin1?
>
> utf8 at the end or after ucs-bom?
>

utf-8 must come before any of the fixed 8-bit encodings as Tony says. I left it out of my config entirely because I didn't want my Latin1 files detected as UTF-8. I usually set the "bomb" option on my UTF-8 files so that the "ucs-bom" portion will detect my UTF-8 files. For files where a BOM is not valid (e.g. HTML) I have the AutoFenc plugin.
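
The two variants side by side, as a sketch (the second is only an approximation of my approach, not a copy of my config; pick one):

" utf-8 tried right after the BOM check, 8-bit encoding last
set fileencodings=ucs-bom,utf-8,cp1252
" no utf-8 entry: only files with a BOM are detected as Unicode
set fileencodings=ucs-bom,cp1252

With the second variant, a UTF-8 file needs a BOM to be detected:

" in a UTF-8 buffer, write a BOM on the next save
:setlocal bomb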

> Btw I'm on a windows OS, 8bit-cp1252 has to be cp1252, isn't it?
>

Yes, sorry about that. I'm normally on Windows too, it didn't occur to me I would have tweaked my config for the Linux system I'm on at the moment.

>
>
> If my default encoding will be utf-8, it is better to convert vimrc and menu.vim to utf8 as well to avoid that I see every time "Converted" after the filename, isn't it?

"converted" is for files you edit that don't match your global encoding. It doesn't have anything to do with scripts.

You could convert all your scripts, but it can be easier (and clearer as to the intended encoding) to add a "scriptencoding cp1252" or something to the top of each file.
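
That is, the first lines of such a script would look something like this (assuming the file really is saved as cp1252):

" lines after this are converted from cp1252 to Vim's 'encoding'
scriptencoding cp1252
" ... rest of the script, which may contain literal non-ASCII characters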

>
> Do you know a good software to convert cp1252 files to utf-8? (I used iconv in the past)
>
>

Vim can do it, if compiled with multibyte support (I have yet to see a Windows Vim without it).

gvim -N -u NONE -i NONE
:set encoding=utf-8
:e ++enc=cp1252 blah.txt
:setlocal fileencoding=utf-8
:wq
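
If there are many files, a sketch along these lines should handle them in one go. The *.txt pattern is a placeholder, and it assumes every matching file really is cp1252, so try it on copies first:

gvim -N -u NONE -i NONE
:set encoding=utf-8
:set fileencodings=cp1252
:args *.txt
:argdo setlocal fileencoding=utf-8 | update
:qa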

>
> Btw Ben, you noted in your reply "setglobal fileencoding=utf-8"
>
> What difference is there between "set fileencoding=utf-8" and "setglobal fileencoding=utf-8". I thought there was no local buffer encoding setting?

There is no buffer-local "encoding" option. "fileencoding" (the one that actually exists for daily use ;-) ) is almost entirely buffer-local, but using "set" instead of "setlocal" will also change the default value for new buffers. The setglobal command just sets that default value without changing the value for the current buffer.
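
Side by side, with the encodings chosen only as examples:

" current buffer only
:setlocal fileencoding=cp1252
" default for buffers created later; current buffer untouched
:setglobal fileencoding=utf-8
" both at once
:set fileencoding=utf-8
" check each value
:setlocal fileencoding?
:setglobal fileencoding?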

--
--
You received this message from the "vim_use" maillist.
Do not top-post! Type your reply below the text you are replying to.
For more information, visit http://www.vim.org/maillist.php

