Friday, October 2, 2015

Unicode input

Hi all,

I am having trouble entering non-printable Unicode characters. I am currently using (g)Vim 7.4.889 on Ubuntu Linux 12.04 (3.13.0.65 kernel), compiled with GCC 5.2.0.

Vim help states:
If everything else fails, you can type any character as four hex bytes:

    CTRL-V u 1234

"1234" is interpreted as a hex number.  You must type four characters, prepend
a zero if necessary.

It works fine for common ASCII characters, but for characters above 127 it produces unexpected results:

"""
Got from "Convert to HEX"
3a    :
25    %
73    s
2d    -
c2a0  Unicode non-breakable space
c2ad  Unicode soft-hyphen

Test results:
s   - put in with Ctrl-V u 0073 - OK
슠  - put in with Ctrl-V u c2a0 (its real code is ec8aa0) - ERROR
슭  - put in with Ctrl-V u c2ad (its real code is ec8aad) - ERROR
"""

As you can see, the last two hex digits I typed are interpreted correctly (i.e. a0 stays a0, ad stays ad), but the first two are changed: c2 becomes ec8a.
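
For reference, the bytes observed above can be reproduced outside Vim. The following is only a sketch in Python, under two assumptions that are not confirmed anywhere in this message: that the four hex digits after Ctrl-V u are taken as a Unicode code point (not as UTF-8 bytes), and that the buffer is stored as UTF-8. The helper name `utf8_hex` is made up for illustration.

```python
def utf8_hex(code_point: int) -> str:
    """Return the UTF-8 encoding of a code point as a lowercase hex string."""
    return chr(code_point).encode("utf-8").hex()


# Assumption: the digits typed after Ctrl-V u name a code point.
for typed in ("0073", "c2a0", "c2ad", "00a0", "00ad"):
    code_point = int(typed, 16)
    print(f"Ctrl-V u {typed} -> U+{code_point:04X} -> bytes {utf8_hex(code_point)}")

# Ctrl-V u 0073 -> U+0073 -> bytes 73
# Ctrl-V u c2a0 -> U+C2A0 -> bytes ec8aa0
# Ctrl-V u c2ad -> U+C2AD -> bytes ec8aad
# Ctrl-V u 00a0 -> U+00A0 -> bytes c2a0
# Ctrl-V u 00ad -> U+00AD -> bytes c2ad
```

If those assumptions hold, c2a0 and c2ad from "Convert to HEX" would be the UTF-8 bytes of U+00A0 and U+00AD, so the non-breakable space would be entered as Ctrl-V u 00a0 and the soft hyphen as Ctrl-V u 00ad.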

So I wonder whether I am doing it wrong or whether I should write a bug report. Any help or advice appreciated.

Thx
Marcel Svitalský

