Saturday, October 3, 2015

Re: Unicode input

* Marcel Svitalský <marcel.svitalsky@centrum.cz> [151003 16:24]:
> It works all right for common ASCII characters, however for characters
> above 127 it creates unexpected results:
>
> """
> Got from "Convert to HEX"
> 3a :
> 25 %
> 73 s
> 2d -
> c2a0 Unicode non-breakable space
> c2ad Unicode soft-hyphen
>
> Test results:
> s - put in with Ctrl-V u 0073 - OK
> 슠 - put in with Ctrl-V u c2a0 (its real code is ec8aa0) - ERROR
> 슭 - put in with Ctrl-V u c2ad (its real code is ec8aad) - ERROR
> """

Code point c2a0 is not the non-breaking space; that is 00a0. The soft
hyphen is 00ad. The characters at c2a0 and c2ad are in the Hangul
Syllables block, and they display correctly for me (the same way they
display in the email you sent).

I believe your problem is that you are confusing Unicode code points
with their UTF-8 encoding. Use the code point when entering Unicode
characters with Ctrl-V u. So, if you type Ctrl-V u 00a0 you will get the
non-breaking space. If 'fileencoding' is utf-8, then when the file is
written, the non-breaking space will be written out as the two-byte
sequence c2 a0, because that is the UTF-8 encoding of code point 00a0.
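
If you want to check the difference outside Vim, here is a minimal
Python sketch. The helper name utf8_hex is my own, chosen only for
illustration; it prints each code point next to the bytes UTF-8 uses
for it, so you can see that the values from "Convert to HEX" are
encoded bytes rather than code points.

```python
import unicodedata

def utf8_hex(code_point: int) -> str:
    """Return the UTF-8 bytes of a code point as a lowercase hex string."""
    return chr(code_point).encode("utf-8").hex()

# Code points from the test above: 's', no-break space, soft hyphen,
# and the two Hangul syllables that Ctrl-V u c2a0 / c2ad produced.
for cp in (0x0073, 0x00A0, 0x00AD, 0xC2A0, 0xC2AD):
    name = unicodedata.name(chr(cp), "<unnamed>")
    print(f"U+{cp:04X}  utf-8: {utf8_hex(cp)}  {name}")
```

Code point 00a0 comes out as the bytes c2a0, while code point c2a0
comes out as ec8aa0, which matches the "real code" in your test
results.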

...Marvin
