On Tuesday, November 18, 2014 2:37:22 AM UTC-8, Erik Christiansen wrote:
> On 17.11.14 10:57, Graham Lawrence wrote:
> > For my test file the awk program tagged some 3500 words, with 1960 of them
> > unique, so this vim script must run within a loop to avoid the tedium and
> > 4000 odd keystrokes required to invoke it individually for each unique
> > error,
>
> Er, what script loop, and what "4000 odd keystrokes [per] error", if one
> may be so bold?
A while loop to enclose the mapping as you saw it; I never add such details until I have the rest of the code working satisfactorily. And not 4000 keystrokes per error: roughly 4000 for all 1960 unique errors, with a 2-keystroke code to invoke the mapping each time.
All of which is moot now, as I realized I could cut it to one keystroke per error by splitting the mapping in two, with the 2nd mapping ending by reinvoking the first. That eliminates the need for user input entirely.
> If the list of good words is read into an associative
> array (let's call it "list") in the BEGIN action, and membership tested
> with an "if (word in list) ..." in an unconditional action handling the
> input stream, _and_ the unrecognised words (sans @@) are printed to
> another file, then it is only necessary to open that file in vim, and
> for each word (one per line), hit ":.w >> /path/goodfile" for each word
> which we accept as good. With that aliased to a key of choice, only one
> keystroke is required to qualify each word. Both awk and vim are run
> once per session, handling thousands of words each time, if you have
> them. Four thousand keystrokes would handle 4000 errors.
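A rough sketch of the scheme Erik outlines, for illustration only. The file names (goodwords, input.txt, unknown) and the sample words are made up, the @@ tagging from the earlier awk program is left out, and the de-duplication via "seen" is an addition so that each unrecognised word lands in the output file once:

```shell
# Hypothetical sample data in a scratch directory.
tmp=$(mktemp -d)
printf '%s\n' the quick brown fox > "$tmp/goodwords"
printf '%s\n' 'the quikc brown fox' 'teh fox' > "$tmp/input.txt"

awk -v good="$tmp/goodwords" '
BEGIN {
    # Read the good words into an associative array; only the key matters.
    while ((getline w < good) > 0) list[w] = 1
    close(good)
}
{
    # Unconditional action: test every word on every input line.
    for (i = 1; i <= NF; i++)
        if (!($i in list) && !seen[$i]++) print $i
}' "$tmp/input.txt" > "$tmp/unknown"

cat "$tmp/unknown"
```

The resulting file holds one unrecognised word per line, ready to open in vim and append from with ":.w >> /path/goodfile" as described in the quote.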
In practice, it is not that straightforward. I'm not sure how awk organizes arrays internally, but I used a plain numeric index because I figured it must use an address array to reference the words array, and with a numeric index I could use a binary search to locate the word. I think an associative array must be searched linearly, because awk has no way of knowing whether the array is actually in sequence.
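A minimal sketch of the binary search just described, assuming the word list is already sorted and is loaded into an array indexed 1..NR. The file name, the sample words, and the lookup function are hypothetical. For comparison: awk arrays are associative whatever the index looks like, and as far as I know the common implementations (gawk, mawk) keep them in hash tables, so a plain "word in list" test does not scan the array linearly either.

```shell
# Hypothetical sorted word list; LC_ALL=C keeps sort order and awk's
# string comparison consistent with each other.
tmp=$(mktemp -d)
printf '%s\n' apple banana cherry damson elder | LC_ALL=C sort > "$tmp/sorted"

lookup() {
    LC_ALL=C awk -v target="$1" '
    { word[NR] = $0 }              # numeric index 1..NR, in sorted order
    END {
        lo = 1; hi = NR
        while (lo <= hi) {
            mid = int((lo + hi) / 2)
            w = word[mid] ""       # force a string comparison
            if (w == target) { print "found at " mid; exit }
            if (w < target) lo = mid + 1
            else hi = mid - 1
        }
        print "not found"
    }' "$tmp/sorted"
}

lookup cherry
lookup grape
```

Each lookup here rereads the file, which is only to keep the sketch short; in a real script the array would be loaded once and searched many times.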
And of course, I have a second array of word suffixes to reference if the word of interest is not the root form.
>
> If these are e.g. ordinary English words, is it acceptable to read in
> e.g. /usr/share/dict/british-english into "list", to start with 98,000
> or more good words in the BEGIN action, before reading in your list of
> special words,
Project Gutenberg provides Webster's Dictionary from about 1913. I extracted all the words from the HTML, and it reduced to about 200,000 unique words. I use Arch, and they don't include such refinements as dictionaries in their distro.
>
> Erik
> (Who is doubtless glossing over some undeclared additional requirement. :)
>
> --
> Melbourne Water Use:
> "More water is lost to stormwater each year than we use. On average we use
> about 40 billion litres of water each year, and each year about 500 billion
> litres runs into our drains." Leonie Duncan, Environment Victoria healthy river
> campaigner, quoted on p7 of Journal 21.10.08.
--
--
You received this message from the "vim_use" maillist.
Do not top-post! Type your reply below the text you are replying to.
For more information, visit http://www.vim.org/maillist.php