Sunday, March 28, 2010

Re: OT accusative case [Re: user function for multiple substitution]

On 29/03/10 04:08, bill lam wrote:
> lun, 29 Mar 2010, Tony Mechelynck skribis:
>> On 29/03/10 02:31, Tim Chase wrote:
>>> Tony Mechelynck wrote:
>>>> Matt gave you a solution in several steps, but here's a single-step
>>>> one, taking advantage of both case-matching and case-insensitive
>>>> operators, of |sub-replace-expression| and of the ternary operator ?:
>>>> as in (condition ? result_if_true : result_if_false) |expr1| :
>>>> command -nargs=0 -range=% -bar Xeo
>>>> \<line1>,<line2>s/\c[scujgh]x/\=(
>>>> \ submatch(0) ==# 'sx' ? 'ŝ' :
>>>> \ submatch(0) ==# 'cx' ? 'ĉ' :
>>>> \ submatch(0) ==# 'ux' ? 'ŭ' :
>>>> \ submatch(0) ==# 'jx' ? 'ĵ' :
>>>> \ submatch(0) ==# 'gx' ? 'ĝ' :
>>>> \ submatch(0) ==# 'hx' ? 'ĥ' :
>>>> \ submatch(0) ==? 'SX' ? 'Ŝ' :
>>>> \ submatch(0) ==? 'CX' ? 'Ĉ' :
>>>> \ submatch(0) ==? 'UX' ? 'Ŭ' :
>>>> \ submatch(0) ==? 'JX' ? 'Ĵ' :
>>>> \ submatch(0) ==? 'GX' ? 'Ĝ' : 'Ĥ' )/g
>>>> For maximum efficiency, the most frequent cases should be tested
>>>> first, but the use of ==# and ==? to enable (for instance) both Cx and
>>>> CX for Ĉ but only cx for ĉ requires lowercase to come first. (This
>>>> will identify cX as Ĉ but I think it can be tolerated.)
>>> In Vim7+, I'd be tempted to tweak Tony's solution so it uses a
>>> literal/in-line dict for the conversions, something like (broken into
>>> multiple lines without the requisite "\" characters but could just as
>>> easily be one line):
>>> s/\c[scujgh]x/\=get({
>>> 'sx':'ŝ',
>>> 'cx':'ĉ',
>>> 'ux':'ŭ',
>>> 'jx':'ĵ',
>>> 'gx':'ĝ',
>>> 'hx':'ĥ',
>>> 'SX':'Ŝ',
>>> 'CX':'Ĉ',
>>> 'UX':'Ŭ',
>>> 'JX':'Ĵ',
>>> 'GX':'Ĝ',
>>> 'HX':'Ĥ'
>>> }, submatch(0), '??default??')/g
>>> (which should have the benefit of a roughly constant lookup time, and
>>> is a lot less hassle to maintain, IMHO)
>>> You might have to do some case-folding with tolower()/toupper() on the
>>> submatch(0), or include additional entries for other case-combinations.
>>> -tim
>> IIUC, the usual practice (beyond lowercase) is to use CX, GX, HX etc.
>> in all-caps titles, and Cx, Gx, Hx, etc. for the initial capital of
>> words (proper names, first-in-sentence, etc.) where the other letters
>> are in lowercase. (ŭ is extremely rare at the start of a word; it
>> mostly occurs after a vowel. ĥ is rather infrequent in any position.)
>> For the default ("not found in table"), I'd just use submatch(0)
>> again, i.e., "don't change".
>> Oh, and for logarithmic time my solution could use dichotomic
>> searching (taking advantage of the fact that both the result_if_true
>> and the result_if_false of a ?: construct can in turn be ?:
>> expressions, each of which can, etc.) but for such a small set of
>> possibilities I don't think there would be a very big performance
>> gain.
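
A minimal sketch combining the two suggestions quoted above: the dictionary lookup, with submatch(0) as the "not found" default so that unknown combinations are left unchanged. The names s:xeo_map, XeoReplace() and Xeo2 are hypothetical, chosen only for this illustration. It assumes a Vim 7+ script file saved as UTF-8. Unlike the ternary version, it folds Cx to CX but leaves cX alone.

```vim
scriptencoding utf-8

" Hypothetical lookup table: x-convention digraph -> accented letter.
let s:xeo_map = {
      \ 'sx': 'ŝ', 'cx': 'ĉ', 'ux': 'ŭ', 'jx': 'ĵ', 'gx': 'ĝ', 'hx': 'ĥ',
      \ 'SX': 'Ŝ', 'CX': 'Ĉ', 'UX': 'Ŭ', 'JX': 'Ĵ', 'GX': 'Ĝ', 'HX': 'Ĥ'
      \ }

" Hypothetical helper: return the replacement for one matched digraph.
function! XeoReplace(m)
  " If the first letter is uppercase, fold the whole match to uppercase,
  " so that both Cx and CX look up the key 'CX'.
  let key = a:m =~# '^\u' ? toupper(a:m) : a:m
  " Fall back to the matched text itself, i.e. "don't change".
  return get(s:xeo_map, key, a:m)
endfunction

" Same range handling as the Xeo command quoted above; the 'e' flag
" suppresses the error when the range contains no match.
command! -nargs=0 -range=% -bar Xeo2
      \ <line1>,<line2>s/\c[scujgh]x/\=XeoReplace(submatch(0))/ge
```

After sourcing the file, :Xeo2 converts the whole buffer and :'<,'>Xeo2 converts only the visually selected lines.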
> I guess ĥ was once more commonly used but has been replaced by k or ĉ
> in common words such as ĥemio or ĥino. The occurrence frequency of ŭ
> should be a little higher than that of ĥ.

ŭ after a vowel is quite frequent: laŭ (according to), preskaŭ (almost),
hodiaŭ (today), eŭfonio (euphony), poŭpo (poop [of a ship]), etc. The
Academy has condoned the replacement of -rĥ- by -rk- (Appendix 8 to the
list of official radicals); in other words usage may vary (e.g. ĥoro vs.
koruso for English "choir" [singing company]; for the choir of a church
[part of the building] I suppose ĥorejo or maybe korusejo would be used).

> At the risk of going far too off-topic, how does Esperanto handle
> foreign proper names in the accusative case? E.g.
> Einstein estimas Bohnn
> how to tell who is the agent and who is the direct object?

Either Esperantize the whole word according to established forms or to
Rule 15 of the Fundamenta Gramatiko ("En Nederlando mi vizitis
Mastriĥton, Hagon, Roterdamon, Amsterdamon, sed ne Groningon") or add
-on (for a substantive) to the object ("Einstein estimas Bohn-on", if it
is about someone named Bohn), if there is a risk of ambiguity. If the
presence of an adjective, or of several names in apposition, avoids
ambiguity, then the un-Esperantized name may remain unchanged: Johanon
Sebastianon Bach; Bruegel la Maljunan.

Best regards,
Eight Megabytes And Continually Swapping.

