Sunday, October 4, 2015

Re: Unicode input

On 4.10.2015 03:03, Marvin Renich wrote:
> * Marcel Svitalský <marcel.svitalsky@centrum.cz> [151003 16:24]:
>> It works all right for common ASCII characters, however for characters
>> above 127 it creates unexpected results:
>>
>> """
>> Got from "Convert to HEX"
>> 3a    :
>> 25    %
>> 73    s
>> 2d    -
>> c2a0  Unicode non-breakable space
>> c2ad  Unicode soft-hyphen
>>
>> Test results:
>> s   - put in with Ctrl-V u 0073 - OK
>> 슠  - put in with Ctrl-V u c2a0 (its real code is ec8aa0) - ERROR
>> 슭  - put in with Ctrl-V u c2ad (its real code is ec8aad) - ERROR
>> """
>
> Unicode c2a0 is not non-breaking space; that is 00a0.  Soft-hyphen is
> 00ad.  The c2a0 and c2ad characters are in the Hangul Syllables table,
> and are displayed correctly for me (the same way they display for me in
> the email you sent).
>
> I believe your problem is that you are confusing Unicode code points
> with utf-8.  You should use the code point when using Ctrl-V u to enter
> Unicode characters.  So, if you type Ctrl-V u 00a0 you will get the
> non-breaking space.  If fileencoding is utf-8, then when the file is
> written, the non-breaking space will be written out as the two-byte
> sequence c2 a0, because that is the utf-8 encoding for code point 00a0.
>
> ...Marvin

Marvin,

you are quite right: I misunderstood the Vim help and took it to suggest entering the four hex digits displayed by the "Convert to HEX" Vim menu, which represent the character's UTF-8 encoding rather than its code point.

In the meantime, however, I wrote another email to Bram (I had not known that this list was moderated and was confused that my emails had not appeared here). He explained the matter to me and suggested the 'ga' Vim command, which displays the codes required to input Unicode characters properly.
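
For the archive, the difference between a code point and its UTF-8 bytes can also be checked outside Vim. The following is only a minimal Python sketch, not something taken from the thread; the helper name `describe` is made up for illustration. It prints, for each character, the code point (the value to type after Ctrl-V u) next to the UTF-8 byte sequence (which appears to be what "Convert to HEX" displayed).

```python
def describe(ch):
    """Return (code point, UTF-8 bytes) of one character, both as hex strings."""
    code_point = f"{ord(ch):04x}"        # the value to enter after Ctrl-V u
    utf8_hex = ch.encode("utf-8").hex()  # the bytes written when fileencoding is utf-8
    return code_point, utf8_hex


# 's', non-breaking space, soft hyphen, and the Hangul syllable U+C2A0
for ch in ["s", "\u00a0", "\u00ad", "\uc2a0"]:
    cp, u8 = describe(ch)
    print(f"code point {cp}  utf-8 {u8}")
```

For the non-breaking space this gives code point 00a0 and bytes c2a0, while the Hangul syllable has code point c2a0 and bytes ec8aa0, which matches the test results quoted above.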

Thanks anyway,
Marcel

--
Marcel Svitalský 
