Sunday, September 30, 2012

Re: Matching/Sorting line terminations

Dear Tim: Thank you very much. I've just been trying it out and I'm
still quite breathless -- you've saved me so much grief with this
brilliant generosity that I hardly know what to say. It works
beautifully on Python 2.6.6 (Debian). Also, the results look even more
informative than I had hoped. And I learned another batch of stuff
about Linux; it is kind of you to include explanations, and with those
I can manage.

Thank you so much -- and thanks to all who joined in,

Julian

On Sun, Sep 30, 2012 at 10:20 PM, Tim Chase <vim@tim.thechases.com> wrote:
> On 09/30/12 11:14, jbl wrote:
>> The problem is this: I have a large file of poetry in alphabetical
>> order sorted on the last term in each line; I post an excerpt in
>> sample1 below. I want to sort it so that lines that share, say, the
>> last two terms (on the right) with the last two terms of any other
>> line are in one group, those lines that share the last three terms in
>> another and so on
>
> Well, a quick little Python script seems to do the grunt-work for me:
>
> ##############################
> import re
> r = re.compile(r'\w+')
> print ''.join(sorted(
>     (line for line in file("raw.txt")),
>     key=lambda s: tuple(reversed(r.findall(s)))
>     ))
> ##############################
>
> That is case-sensitive. It's a small bit more if you want it
> case-insensitive:
>
> ##############################
> import re
> r = re.compile(r'\w+')
> print ''.join(sorted(
>     (line for line in file("raw.txt")),
>     key=lambda s: tuple(w.upper() for w in reversed(r.findall(s)))
>     ))
> ##############################
>
>> and so on up to seven places. But it must be possible to generalize
>> that somehow.
>
> One of the tricky aspects of this is how you treat (or ignore)
> differing punctuation. If you want the same words, but allowing for
> varying punctuation, it's a lot more complex. That said, it sounded
> like a fun afternoon challenge, so I threw together & attached a
> quick program that accommodates all your options and
> case-insensitivity needs :-)
>
> It can be called on a pair of files, or you can pipe stdin and it
> will return on stdout in case you want to call it from Vim with
>
> :%! python revsort.py
>
> to just operate on a sub-range of your file. Alternatively, from a
> command-line, you can use
>
> python revsort.py infile.txt outfile.txt
>
> or, if you only want those where the last N words match, you can do
> things like
>
> python revsort.py -w 3 infile.txt outfile.txt
>
> It was kinda fun, and hopefully the code is easy to follow. This
> does assume that you have Python installed on your machine. It was
> tested on 2.6, but should run on 2.4-2.7, and possibly on 3.x as well.
>
> -tim
>
> --
> You received this message from the "vim_use" maillist.
> Do not top-post! Type your reply below the text you are replying to.
> For more information, visit http://www.vim.org/maillist.php
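
The attached revsort.py is not reproduced in this thread, so the following is only a rough sketch of the idea discussed above, not Tim's script: sort lines by their words read right to left, ignoring case and punctuation, then keep the runs of lines whose last N words agree. It is written for Python 3, and the names reversed_key and group_by_ending, as well as the sample lines, are made up for illustration.

```python
import re
from itertools import groupby

WORD = re.compile(r"\w+")  # same word pattern as the snippets above


def reversed_key(line, n=None):
    """Words of the line, last word first, upper-cased; only the first n if given."""
    words = [w.upper() for w in reversed(WORD.findall(line))]
    return tuple(words if n is None else words[:n])


def group_by_ending(lines, n):
    """Return the groups of two or more lines that share their last n words."""
    # Sorting on the full reversed key puts lines with a common ending side by side.
    ordered = sorted(lines, key=reversed_key)
    groups = []
    for key, members in groupby(ordered, key=lambda s: reversed_key(s, n)):
        members = list(members)
        # Skip lines shorter than n words and endings that occur only once.
        if len(key) == n and len(members) > 1:
            groups.append(members)
    return groups


# Hypothetical sample lines, not taken from the original poetry file.
sample = [
    "the cat sat on the mat",
    "A dog slept on the mat",
    "rain fell on the hill",
    "Sun rose over the hill",
]

for group in group_by_ending(sample, 2):
    print(group)
```

Because the regular expression keeps only word characters, "mat." and "mat" count as the same ending here; that is one simple way to handle the punctuation question raised above, and other choices are possible.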
