> On 29/03/10 02:31, Tim Chase wrote:
> >Tony Mechelynck wrote:
> >>Matt gave you a solution in several steps, but here's a single-step
> >>one, taking advantage of both case-matching and case-insensitive
> >>operators, of |sub-replace-expression| and of the ternary operator ?:
> >>as in (condition ? result_if_true : result_if_false) |expr1| :
> >>
> >>command -nargs=0 -range=% -bar Xeo
> >>\ <line1>,<line2>s/\c[scujgh]x/\=(
> >>\ submatch(0) ==# 'sx' ? 'ŝ' :
> >>\ submatch(0) ==# 'cx' ? 'ĉ' :
> >>\ submatch(0) ==# 'ux' ? 'ŭ' :
> >>\ submatch(0) ==# 'jx' ? 'ĵ' :
> >>\ submatch(0) ==# 'gx' ? 'ĝ' :
> >>\ submatch(0) ==# 'hx' ? 'ĥ' :
> >>\ submatch(0) ==? 'SX' ? 'Ŝ' :
> >>\ submatch(0) ==? 'CX' ? 'Ĉ' :
> >>\ submatch(0) ==? 'UX' ? 'Ŭ' :
> >>\ submatch(0) ==? 'JX' ? 'Ĵ' :
> >>\ submatch(0) ==? 'GX' ? 'Ĝ' : 'Ĥ' )/g
> >>
> >>For maximum efficiency, the most frequent cases should be tested
> >>first, but the use of ==# and ==? to enable (for instance) both Cx and
> >>CX for Ĉ but only cx for ĉ requires lowercase to come first. (This
> >>will identify cX as Ĉ but I think it can be tolerated.)
> >
> >In Vim7+, I'd be tempted to tweak Tony's solution so it uses a
> >literal/in-line dict for the conversions, something like (broken into
> >multiple lines without the requisite "\" characters but could just as
> >easily be one line):
> >
> >s/\c[scujgh]x/\=get({
> >'sx':'ŝ',
> >'cx':'ĉ',
> >'ux':'ŭ',
> >'jx':'ĵ',
> >'gx':'ĝ',
> >'hx':'ĥ',
> >'SX':'Ŝ',
> >'CX':'Ĉ',
> >'UX':'Ŭ',
> >'JX':'Ĵ',
> >'GX':'Ĝ',
> >'HX':'Ĥ'
> >}, submatch(0), '??default??')/g
> >
> >(which should have the benefit of a roughly constant lookup time,
> >since a Dictionary is looked up by hash, and is a lot less hassle to
> >maintain, IMHO)
> >
> >You might have to do some case-folding with tolower()/toupper() on the
> >submatch(0), or include additional entries for other case-combinations.
> >
> >-tim
> >
> >
> >
>
> IIUC, the usual practice (beyond lowercase) is to use CX, GX, HX etc.
> in all-caps titles, and Cx, Gx, Hx, etc. for the initial capital of
> words (proper names, first-in-sentence, etc.) where the other letters
> are in lowercase. (ŭ is extremely rare at the start of a word; it
> mostly occurs after a vowel. ĥ is rather infrequent in any position.)
>
> For the default ("not found in table"), I'd just use submatch(0)
> again, i.e., "don't change".
>
> Oh, and for logarithmic time my solution could use dichotomic
> searching (taking advantage of the fact that both the result_if_true
> and the result_if_false of a ?: construct can in turn be ?:
> expressions, each of which can, etc.) but for such a small set of
> possibilities I don't think there would be a very big performance
> gain.
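
For what it's worth, the suggestions quoted above (Tim's dict, falling back to submatch(0) for "not found", and treating Cx like CX) could be combined roughly as follows. This is only a sketch: the names g:xeo_map, XeoReplace() and :Xeo2 are made up here and appear nowhere in the thread, and it assumes the file is sourced as a UTF-8 script in Vim 7 or later.

```vim
scriptencoding utf-8

" Hypothetical names for this sketch: g:xeo_map, XeoReplace(), :Xeo2.
" Dictionary keys are case-sensitive, so 'cx' and 'CX' are distinct entries.
let g:xeo_map = {
      \ 'sx': 'ŝ', 'cx': 'ĉ', 'ux': 'ŭ', 'jx': 'ĵ', 'gx': 'ĝ', 'hx': 'ĥ',
      \ 'SX': 'Ŝ', 'CX': 'Ĉ', 'UX': 'Ŭ', 'JX': 'Ĵ', 'GX': 'Ĝ', 'HX': 'Ĥ'
      \ }

function! XeoReplace(match) abort
  " Try the exact key first ('cx' or 'CX').
  " If that fails, retry with the key uppercased, so that 'Cx' (and also
  " 'cX') give the capital letter.
  " If that fails too, return the match unchanged ("don't change").
  return get(g:xeo_map, a:match,
        \ get(g:xeo_map, toupper(a:match), a:match))
endfunction

" Same range handling as the :Xeo command quoted above; the 'e' flag
" keeps :s quiet when nothing in the range matches.
command! -nargs=0 -range=% -bar Xeo2
      \ <line1>,<line2>s/\c[scujgh]x/\=XeoReplace(submatch(0))/ge
```

With this sourced, :Xeo2 converts the whole buffer and :'<,'>Xeo2 only the selected lines. As with the ternary version, cX ends up as the capital letter.
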
I guess ĥ was once more commonly used but has since been replaced by k
or ĉ in common words such as ĥemio or ĥino. ŭ should occur a little
more frequently than ĥ.
At the risk of going far too off-topic: how does Esperanto handle the
accusative case for foreign proper names? E.g.
Einstein estimas Bohnn
How does one tell who is the agent and who is the direct object?
--
regards,
====================================================
GPG key 1024D/4434BAB3 2008-08-24
gpg --keyserver subkeys.pgp.net --recv-keys 4434BAB3
--
You received this message from the "vim_use" maillist.
Do not top-post! Type your reply below the text you are replying to.
For more information, visit http://www.vim.org/maillist.php
To unsubscribe from this group, send email to vim_use+unsubscribegooglegroups.com or reply to this email with the words "REMOVE ME" as the subject.