Monday, January 7, 2013

Re: Match word containing characters beyond a-zA-Z

On 2013-01-07 Andy Wokula wrote:

> The word motion w moves over those characters.
> :h w
> :h word

But what does that mean? The w motion recognises å as a
letter, but the regex \w does not.

w moves a *word* forward. :h word defines a word as follows:

A word consists of a sequence of letters, digits and underscores,
or a sequence of other non-blank characters, separated with white
space (spaces, tabs, <EOL>). This can be changed with the
'iskeyword' option.

My iskeyword setting is:

iskeyword=@,48-57,_,192-255

Why does w move over the word treść? The letters ś and ć should not
count as letters, right? iskeyword only lists the range 192-255, and
if I hit ga on ś and ć, it reports the codes 347 and 263.
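
To double-check those numbers outside Vim, here is a minimal Python
sketch. Python is not part of the discussion above, and the names
NUMERIC_RANGES, in_numeric_ranges and report are made up for
illustration. It only compares code points against the two numeric
ranges in my iskeyword value; it ignores the "@" and "_" entries and
does not try to reproduce how Vim actually classifies characters.

```python
# Numeric ranges copied from the iskeyword value above: 48-57 and 192-255.
# Assumption: "@" and "_" are left out; only the numeric ranges are checked.
NUMERIC_RANGES = [(48, 57), (192, 255)]


def in_numeric_ranges(ch):
    """Return True if the code point of ch lies in one of the ranges."""
    code = ord(ch)
    return any(lo <= code <= hi for lo, hi in NUMERIC_RANGES)


def report(chars):
    """Map each character to (code point, covered by a numeric range?)."""
    return {ch: (ord(ch), in_numeric_ranges(ch)) for ch in chars}


if __name__ == "__main__":
    # The four letters mentioned in this thread.
    for ch, (code, covered) in report("åäść").items():
        print(ch, code, covered)
```

This agrees with what ga shows: å and ä sit inside 192-255, while
ś (347) and ć (263) sit outside it, so the numeric ranges alone do
not explain why w and \k accept them.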

> :h 'isk
> :h /\k

\k seems to work. Is it safe to replace all occurrences of \w with
\k? That seems to be the easiest solution. And I still don't
understand *why* it works, since 347 seems to be out of range for
iskeyword.

> Also, [:upper:] and [:lower:] include more characters. Try
> /\c[[:lower:]]\+

This works for å and ä but fails on ś and ć.


Marco
