Saturday, July 5, 2014

Re: Encoding and Fileencoding of a latin1 file

On Saturday, July 5, 2014 10:46:51 AM UTC-5, Ben Fritz wrote:
> Regardless, in your case, I would change your 'fileencodings'
> option to include the Windows-1252 encoding rather than Latin1. Or, you could
> manually override the encoding selection for that file.
>
> Using Windows-1252 depends on your system. For Windows, the proper value for
> your 'fileencoding' and 'fileencodings' options would be simply "cp1252". On
> Linux systems, it changes to "8bit-cp1252".
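
For reference, the two options quoted above might look roughly like this. It is only a sketch: the exact 'fileencodings' list is an assumption (yours may contain other entries), and the second command is typed at the command line for the file currently being edited rather than placed in a vimrc.

```vim
" Option 1: have Vim try cp1252 instead of latin1 when detecting encodings.
" (Use the "8bit-cp1252" spelling instead if your system needs it, as
" described in the quoted text.)
set fileencodings=ucs-bom,cp1252

" Option 2: re-read only the current file as cp1252, ignoring 'fileencodings'.
:edit ++enc=cp1252
```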

By the way, I'm more of a purist and want my Latin1 files to actually be Latin1,
using cp1252 only occasionally when I know it will work.

For that reason, my Vim config contains the encoding logic below (simplified
from my full config). It detects files as cp1252 by default, but switches
'fileencoding' to latin1 (without re-reading the file) if the buffer uses none
of the "special" characters that cp1252 defines and Latin1 does not:

if has('multi_byte')
  set encoding=utf-8
  setglobal fenc=latin1

  " Don't detect utf-8 without a BOM by default, I don't use UTF-8 normally
  " and any files in latin1 will detect as UTF. Detect cp1252 rather than
  " latin1 so files are read in correctly.
  set fileencodings=ucs-bom,8bit-cp1252,latin1
  if has('autocmd')
    augroup fenc_detect
      au!

      " Detect when a buffer should actually be latin1 (i.e. there are no cp1252
      " bytes in the buffer). cp1252 is a superset of latin1. See
      " http://en.wikipedia.org/wiki/Cp1252 for details.
      "
      " Since latin1 is a subset of cp1252, this does not ACTUALLY modify the
      " buffer, so bypass the modifiable option.
      let cp1252_latin1_diff =
            \ '\u20AC'. '\u201A'. '\u0192'. '\u201E'. '\u2026'. '\u2020'. '\u2021'. '\u02C6'. '\u2030'. '\u0160'. '\u2039'. '\u0152'. '\u017D'.
            \ '\u2018'. '\u2019'. '\u201C'. '\u201D'. '\u2022'. '\u2013'. '\u2014'. '\u02DC'. '\u2122'. '\u0161'. '\u203A'. '\u0153'. '\u017E'. '\u0178'
      autocmd BufReadPost * let s:oldmod = &modifiable | if !s:oldmod | setlocal modifiable | endif
      autocmd BufReadPost * if &fenc=~?'cp1252$' && search('['.cp1252_latin1_diff.']', 'nw') == 0 | setlocal fenc=latin1 nomodified | endif
      autocmd BufReadPost * if !s:oldmod | setlocal nomodifiable | endif
    augroup END
  endif
endif
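
To check what this logic decided for a given buffer, something like the following can be run from the command line. It is a rough sketch, and it assumes the config above was sourced at the top level of a vimrc, so that cp1252_latin1_diff is available as a global variable.

```vim
" Show the encoding the current buffer ended up with.
:setlocal fileencoding?

" Print the line number of the first cp1252-only character in the buffer,
" or 0 if there is none. This is the same search() call the autocmd uses.
:echo search('[' . g:cp1252_latin1_diff . ']', 'nw')
```

If the second command prints 0 for a file that was detected as cp1252, the buffer should already have been switched to latin1.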


For some file types (notably HTML) I use the "autofenc" plugin: http://www.vim.org/scripts/script.php?script_id=2721
