Saturday, June 7, 2014

Re: multibyte word boundary

On 06/06/14 06:40, Rick Howe wrote:
> I often use multibyte characters and w or \< will jump cursor to the CJK/Hiragana/Katakana/Hangul/Symbol boundaries. Is there any document or help page to describe such a multibyte word boundary?
>

In Vim, one character (even made up of several bytes) is one character.
You don't need anything to match the boundary of a CJK character, since
Vim cannot end (or start) a match in the middle of an ideogram. (If it
ever does, it's a bug, and it should be reported in the vim_dev group
with full details of how to make the bug appear in the latest version
and patchlevel of Vim.)
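
To illustrate the difference between characters and bytes outside of Vim, here is a minimal Python sketch. It assumes the text is encoded as UTF-8, and it uses Python's notion of a character (a code point), which is only an analogy for how Vim counts characters; the sample string is made up for the example.

```python
# Assumption: UTF-8 encoding; the sample string is invented for illustration.
text = "日本語 テキスト"

# Each CJK/kana character is one character but several bytes in UTF-8.
for ch in text:
    print(repr(ch), len(ch.encode("utf-8")), "byte(s)")

char_count = len(text)                  # number of characters (code points)
byte_count = len(text.encode("utf-8"))  # number of bytes after encoding

print(char_count, "characters,", byte_count, "bytes")
```

A match or a cursor position that is defined in terms of characters can never land between two of those bytes.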

If you want to treat a space-separated sequence of CJK characters as a
unit, treat them as a WORD — and note that in Vim terminology, a word
and a WORD are not the same thing: see
:help word-motions
:help WORD
:help aW
:help iW
See also
:help /\<
which refers to
:help 'iskeyword'
and from there to
:help 'isfname'
which says that multibyte characters above 0xFF are always included in
these 'is…something' options.
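
As a rough model of the WORD idea (a sequence of non-blank characters delimited by white space), the following Python sketch splits a line on spaces and tabs only. This is an approximation for illustration, not Vim's implementation; the helper name split_WORDS and the sample line are hypothetical.

```python
import re

def split_WORDS(line):
    """Return the runs of non-blank characters in line.

    Assumption: only space and tab count as blanks here; this is a
    simplified model of Vim's WORD, not its actual code.
    """
    return re.findall(r"[^ \t]+", line)

# Invented sample: four space-separated groups of kana and kanji.
line = "これは 日本語の テキスト です"
words = split_WORDS(line)
print(words)
```

Each element of the result corresponds to what W, aW and iW would treat as one unit under this simplified model.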

If you want to treat as a unit a multisyllabic Chinese word (i.e. a
combination of successive hanzi which have no meaning in isolation but
only as a group), then AFAIK you're out of luck: Vim doesn't know the
Chinese language (nor does it know Japanese or Korean, for that matter)
and it cannot determine where such a non-space-separated multi-hanzi
word would start or end (except maybe if spell checking is on, but I
don't know the subtleties of CJK spell checking). The same applies of
course to non-space-separated kana or hangul, or indeed to any
non-space-separated mixture of kanji and kana, or of hanja and hangul.


Best regards,
Tony.
--
Humor in the Court:
Q: Could you see him from where you were standing?
A: I could see his head.
Q: And where was his head?
A: Just above his shoulders.

--
--
You received this message from the "vim_use" maillist.
Do not top-post! Type your reply below the text you are replying to.
For more information, visit http://www.vim.org/maillist.php

