Saturday, July 2, 2011

Re: Success with non-BMP Unicode characters on Windows?

On 27/06/11 06:10, Yongwei Wu wrote:
> Hi gurus,
>
> Today somebody asked me about non-BMP support in Vim, and I took a
> quick test. These are my steps:
>
> 1) Open Word, choose Insert > Symbol, select SimSun-ExtB as the font,
> and insert a non-BMP character. I inserted U+2A6D6 in my test.
> 2) Open GVim, input "set guifontwide=SimSun-ExtB:h12", and then paste
> the character from Word to GVim.
> 3) I can use "ga" to confirm the character is pasted, but nothing is shown.
>
> Anybody has ideas about this?
>
> I also had a quick check in gui_w32.c, and can see the surrogate
> support is in gui_mch_draw_string for the UTF-8 encoding. ExtTextOutW
> is used to output the character. I do use UTF-8 as encoding.
>
> Best regards,
>
> Yongwei
>

In order to display a given character in gvim, you need:

1. a 'guifont' which has a glyph for that character. On Windows (and in
any gvim GUI flavour other than GTK2) it must be defined as a monospaced
font (i.e. all glyphs the same width, except that CJK "wide" glyphs are
exactly double the width of other glyphs).

Most fonts don't have glyphs for all Unicode codepoints. Even fonts
billed as "Chinese" or "Japanese" may lack glyphs for "additional" CJK
characters, depending on which encoding their creator had in mind: if
the font was originally created for CP950 it may lack most of the
"additional" glyphs. Your best bet is to find a recent font explicitly
supporting GB18030. (GB18030 is a Chinese mainland encoding which has
the potential to represent any Unicode codepoint, so fonts mentioning it
in their sales literature "ought" to support every possible hanzi, even
those for rare family names, those only used in old classical texts, etc.).

I would try leaving 'guifontwide' empty and searching for an appropriate
'guifont'. This is based on the following sentences under ":help
'guifontwide'":

For systems other than GTK2:
> When 'guifont' is set and a valid font is found in it and
> 'guifontwide' is empty Vim will attempt to find a matching
> double-width font and set 'guifontwide' to it.

For (Unix/Linux) gvim with GTK2 GUI:
> Vim does not attempt to find an appropriate value for 'guifontwide'
> automatically. If 'guifontwide' is empty Pango/Xft will choose the
> font for characters not available in 'guifont'. Thus you do not need
> to set 'guifontwide' at all unless you want to override the choice
> made by Pango/Xft.

2. for characters above U+FFFF, a gvim version which includes patch
7.1.116 (if your gvim is older than that, it's *really* time to
upgrade ;-) )

Entering and editing any Unicode codepoint has been possible since Vim
6.1, but the *displaying* of codepoints above U+FFFF was only added by
patch 7.1.116 (see ":help version7.txt" and search for 7.1.116).
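
To make the two requirements above a bit more concrete, here is a small
sketch (in Python, which is my choice for illustration and has nothing to
do with Vim's own code) showing how U+2A6D6 looks in the encodings
mentioned in this thread: the UTF-16 surrogate pair which a wide-character
Windows API call would receive, the UTF-8 bytes, the GB18030 bytes, and
the fact that CP950 has no representation for it at all. The helper names
utf16_surrogates and can_encode are made up for this example.

```python
# Illustrative sketch only: how U+2A6D6 is represented in several encodings.
CODEPOINT = 0x2A6D6
ch = chr(CODEPOINT)


def utf16_surrogates(cp):
    """Split a codepoint above U+FFFF into its UTF-16 (high, low) surrogates."""
    if cp <= 0xFFFF:
        raise ValueError("codepoint is in the BMP; no surrogates needed")
    offset = cp - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


def can_encode(text, encoding):
    """Return True if 'text' can be represented in 'encoding'."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


high, low = utf16_surrogates(CODEPOINT)
print("UTF-16 surrogates: %04X %04X" % (high, low))   # D869 DED6
print("UTF-8 bytes:", ch.encode("utf-8").hex())       # f0aa9b96
print("GB18030 bytes:", ch.encode("gb18030").hex())   # four bytes
print("encodable in cp950:", can_encode(ch, "cp950"))  # False
```

Of course, the fact that an encoding can represent a codepoint says
nothing about whether a given font has a glyph for it; it only shows why
a font designed around CP950 is less likely to have one than a font
designed around GB18030.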


I am on Linux with gvim with GTK2 GUI; when I type (in Insert mode)
^VU2a6d6^[ (where ^V means Ctrl-V and ^[ means Esc; see ":help
i_CTRL-V_digit"), a complex glyph is inserted which looks exactly the
same no matter which same-size font I select as 'guifont'. This makes me
think that (see above) Pango/Xft always uses the same "fallback glyph":
perhaps there is only one font on my system which has a glyph for that
codepoint.

The left half of this glyph seems to be the last ("Big Flute") radical
of Kangxi, the right half is even more complex: I would say that it
looks to me like an elephant head with a long trunk turning to the right
(my right, the elephant's left) near the bottom, plus two squarish
claws, or maybe wheat ears, or tea leaves, laid down horizontally, on
the left, a box with an X on it on the right, and the four dots of a
heart below the curving trunk. After writing all this, I would summarize
this character (probably erroneously, but it'll make me remember which
character we're talking about) as: «Let's sing the merits of the hearty
charge elephant, who carries a ton load of tea in a single big labeled box.»


Best regards,
Tony.
--
It is only people of small moral stature who have to stand on their
dignity.

--
You received this message from the "vim_use" maillist.
Do not top-post! Type your reply below the text you are replying to.
For more information, visit http://www.vim.org/maillist.php
