Sunday, March 28, 2010

Re: user function for multiple substitution

On 29/03/10 02:31, Tim Chase wrote:
> Tony Mechelynck wrote:
>> Matt gave you a solution in several steps, but here's a single-step
>> one, taking advantage of both case-matching and case-insensitive
>> operators, of |sub-replace-expression| and of the ternary operator ?:
>> as in (condition ? result_if_true : result_if_false) |expr1| :
>>
>> command -nargs=0 -range=% -bar Xeo
>> \ <line1>,<line2>s/\c[scujgh]x/\=(
>> \ submatch(0) ==# 'sx' ? 'ŝ' :
>> \ submatch(0) ==# 'cx' ? 'ĉ' :
>> \ submatch(0) ==# 'ux' ? 'ŭ' :
>> \ submatch(0) ==# 'jx' ? 'ĵ' :
>> \ submatch(0) ==# 'gx' ? 'ĝ' :
>> \ submatch(0) ==# 'hx' ? 'ĥ' :
>> \ submatch(0) ==? 'SX' ? 'Ŝ' :
>> \ submatch(0) ==? 'CX' ? 'Ĉ' :
>> \ submatch(0) ==? 'UX' ? 'Ŭ' :
>> \ submatch(0) ==? 'JX' ? 'Ĵ' :
>> \ submatch(0) ==? 'GX' ? 'Ĝ' : 'Ĥ' )/g
>>
>> For maximum efficiency, the most frequent cases should be tested
>> first, but the use of ==# and ==? to enable (for instance) both Cx and
>> CX for Ĉ but only cx for ĉ requires lowercase to come first. (This
>> will identify cX as Ĉ but I think it can be tolerated.)
>
> In Vim7+, I'd be tempted to tweak Tony's solution so it uses a
> literal/in-line dict for the conversions, something like (broken into
> multiple lines without the requisite "\" characters but could just as
> easily be one line):
>
> s/\c[scujgh]x/\=get({
> 'sx':'ŝ',
> 'cx':'ĉ',
> 'ux':'ŭ',
> 'jx':'ĵ',
> 'gx':'ĝ',
> 'hx':'ĥ',
> 'SX':'Ŝ',
> 'CX':'Ĉ',
> 'UX':'Ŭ',
> 'JX':'Ĵ',
> 'GX':'Ĝ',
> 'HX':'Ĥ'
> }, submatch(0), '??default??')/g
>
> (which should have the benefit of a roughly constant lookup time, since
> Vim dictionaries are hash tables, and is a lot less hassle to maintain,
> IMHO)
>
> You might have to do some case-folding with tolower()/toupper() on the
> submatch(0), or include additional entries for other case-combinations.
>
> -tim
>
>
>

IIUC, the usual practice (beyond lowercase) is to use CX, GX, HX etc. in
all-caps titles, and Cx, Gx, Hx, etc. for the initial capital of words
(proper names, first-in-sentence, etc.) where the other letters are in
lowercase. (ŭ is extremely rare at the start of a word; it mostly occurs
after a vowel. ĥ is rather infrequent in any position.)

For the default ("not found in table"), I'd just use submatch(0) again,
i.e., "don't change".
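
A minimal sketch of how the dictionary variant might look with that
default, assuming the names g:xeo_table, XeoConvert() and XeoDict (all
hypothetical, none of them from the thread) and adding Cx-style entries
for word-initial capitals. Unlike the ?: version above, a digraph that is
not in the table, such as cX, is left untouched:

```vim
scriptencoding utf-8

" Hypothetical table: lowercase, Initial-cap and ALL-CAPS digraphs.
let g:xeo_table = {
      \ 'sx': 'ŝ', 'cx': 'ĉ', 'ux': 'ŭ', 'jx': 'ĵ', 'gx': 'ĝ', 'hx': 'ĥ',
      \ 'Sx': 'Ŝ', 'Cx': 'Ĉ', 'Ux': 'Ŭ', 'Jx': 'Ĵ', 'Gx': 'Ĝ', 'Hx': 'Ĥ',
      \ 'SX': 'Ŝ', 'CX': 'Ĉ', 'UX': 'Ŭ', 'JX': 'Ĵ', 'GX': 'Ĝ', 'HX': 'Ĥ'
      \ }

" Return the accented letter for a digraph, or the digraph itself
" ("don't change") when it is not in the table.
function! XeoConvert(digraph)
  return get(g:xeo_table, a:digraph, a:digraph)
endfunction

" Hypothetical command name; the 'e' flag only silences the
" 'pattern not found' error when the range contains no digraph.
command! -nargs=0 -range=% -bar XeoDict
      \ <line1>,<line2>s/\c[scujgh]x/\=XeoConvert(submatch(0))/ge
```

With this sourced, :XeoDict converts the whole buffer and :10,20XeoDict
only lines 10 to 20.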

Oh, and for logarithmic time my solution could use dichotomic (binary)
searching (taking advantage of the fact that both the result_if_true and
the result_if_false of a ?: construct can in turn be ?: expressions, and
so on recursively) but for such a small set of possibilities I don't think
there would be a very big performance gain.
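
For illustration only, the nesting could be sketched as follows. The
function name XeoBisect() is hypothetical, and to keep it short the sketch
handles just the six lowercase digraphs, returning anything else
unchanged. It relies on the case-sensitive string comparison <# over the
sorted keys cx, gx, hx, jx, sx, ux, so at most three comparisons are
needed:

```vim
scriptencoding utf-8

" Hypothetical helper: binary search written as nested ?: expressions.
function! XeoBisect(digraph)
  " Anything that is not one of the six lowercase digraphs is kept as-is.
  if a:digraph !~# '^[cghjsu]x$'
    return a:digraph
  endif
  let m = a:digraph
  " First split the sorted keys in half at 'jx', then narrow each half.
  return m <# 'jx'
        \ ? (m <# 'gx' ? 'ĉ' : m <# 'hx' ? 'ĝ' : 'ĥ')
        \ : (m <# 'sx' ? 'ĵ' : m <# 'ux' ? 'ŝ' : 'ŭ')
endfunction
```

It would be called as \=XeoBisect(submatch(0)) in the replacement part of
the :s command.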


Best regards,
Tony.
--
All Finagle Laws may be bypassed by learning the simple art of doing
without thinking.
