On Thursday, November 29, 2012 3:16:33 PM UTC-6, coot_. wrote:
>
> <feff> is the BOM character for UTF-16 encoding. UTF-16 uses 2 bytes to
> encode a character, but the order of them might differ. This BOM
> character tells which byte comes first.
feff (U+FEFF) is used as a BOM in UTF-8 as well. There it has no meaning in terms of byte ordering, since UTF-8 has only one byte order, but it can be used to identify a file as UTF-8.
In UTF-8, the feff character is represented as the three bytes ef bb bf, because UTF-8 encodes each code point in a variable number of bytes (one to four).
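
To see those byte sequences for yourself, here is a small Python sketch (Python is only my choice for illustration, and the helper name starts_with_utf8_bom is made up for this example). It encodes U+FEFF in UTF-8 and in both UTF-16 byte orders, and checks whether some bytes begin with the UTF-8 BOM:

```python
import codecs

BOM = "\ufeff"  # the code point U+FEFF

# The same code point, encoded three ways.
utf8_bom = BOM.encode("utf-8")         # three bytes, no byte-order question
utf16_le_bom = BOM.encode("utf-16-le")  # low byte first
utf16_be_bom = BOM.encode("utf-16-be")  # high byte first

print(utf8_bom.hex())      # efbbbf
print(utf16_le_bom.hex())  # fffe
print(utf16_be_bom.hex())  # feff


def starts_with_utf8_bom(data: bytes) -> bool:
    """Hypothetical helper: True if data begins with the UTF-8 BOM."""
    return data.startswith(codecs.BOM_UTF8)
```

A file that starts with ef bb bf is very likely UTF-8, but the absence of those bytes says nothing, since most UTF-8 files are written without a BOM.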
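The interesting thing about UTF-8 is that even if an editor misidentifies a UTF-8 file as Latin1 or windows-1252, for example, most of the file will often remain readable. UTF-8 encodes the ASCII characters (U+0000 to U+007F) as the same single bytes that Latin1 uses, so plain English text looks identical. Only non-ASCII characters come out wrong, because each of them takes two or more bytes in UTF-8 and each byte is then shown as a separate Latin1 character (a UTF-8 "é" shows up as "Ã©").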
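
A quick way to reproduce that effect, again as a Python sketch with a sample string I made up for the purpose:

```python
text = "café in UTF-8"

# Encode as UTF-8, then decode the same bytes as if they were Latin1,
# which is what an editor does when it guesses the encoding wrong.
raw = text.encode("utf-8")
misread = raw.decode("latin-1")

print(raw.hex(" "))
print(misread)  # cafÃ© in UTF-8
```

Only the "é" is damaged: its two UTF-8 bytes c3 a9 are shown as the two Latin1 characters "Ã" and "©", while every ASCII character survives unchanged.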