Tuesday, June 21, 2011

Re: remove all except pattern

On 06/20/2011 12:22 PM, rameo wrote:
>>> myem...@mydomain.com hello myotherem...@mydomain.com
>>> hello mylatestem...@mydomain.co.uk
>>
>>> Putting that in your command gives an empty output (it removes the
>>> emails).
>>
>> Ah...with that context, I might try to approach the problem
>> differently. If you don't mind newlines being added when more
>> than one email was on the same line, I might do something like
>>
>> :%s/\([A-Z0-9+_.-]\+@\([A-Z0-9-]\+\.\)\+[A-Z]\{2,6}\)/\r&\r/g
>> :v/^\([A-Z0-9+_.-]\+@\([A-Z0-9-]\+\.\)\+[A-Z]\{2,6}\)$/d
>>
>> The first one puts each email address alone on a line, and the
>> second one deletes all lines that don't have an email address on
>> them. Given your example, one might also have to add "\c" to
>> your pattern to make it case-insensitive (your regexp only finds
>> uppercase email addresses).
>>
>> If you need to keep emails on the same line back together,
>> it might take a little more work.
>
> I can't find out how to keep them on the same line.

(reordering to the preferred interleaved-reply format)

My 2-pass approach could be extended into a multi-pass one,
generically written as something like

1) :%s/^.\{-}\(<pattern>\)/\r- \1\r/
2) :v/^- <pattern>/s/<pattern>/\r&\r/g
3) :v/<pattern>/d
4) :v/^- /-j
5) :%s/^- /

where

1) pulls the first match in each line, deletes the stuff before
the match, prefixes it with something unique (using "- " here)

2) on all the non-prefixed lines, does what the first pass of my
earlier 2-pass version did: puts each remaining match on its own
(non-prefixed) line

3) delete the lines that don't match the pattern, leaving just
matching lines (with the first-matches having prefixes)

4) on lines without prefixes, join to the previous line

5) strip the prefix back off
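
To sanity-check the logic of those five passes outside Vim, here is a rough Python sketch that mirrors them one by one on a list of lines. It is only an illustration, not part of the Vim solution: the function name five_passes and the sample addresses are made up, and Python's regex engine is assumed to behave like Vim's for this particular pattern.

```python
import re

# Same email pattern as in the thread; re.IGNORECASE stands in for Vim's \c.
EMAIL = r"[A-Z0-9+_.-]+@(?:[A-Z0-9-]+\.)+[A-Z]{2,6}"
PAT = re.compile(EMAIL, re.IGNORECASE)
PREFIX = "- "  # marker for the first match of each original line


def five_passes(lines):
    """Hypothetical helper mirroring the five ex commands on a list of lines."""
    # 1) drop everything before the first match, prefix the match,
    #    and leave the rest of the line on a line of its own
    buf = []
    for line in lines:
        m = PAT.search(line)
        if m:
            buf += ["", PREFIX + m.group(0), line[m.end():]]
        else:
            buf.append(line)
    # 2) on non-prefixed lines, put every match on its own line
    split = []
    for line in buf:
        if re.match(re.escape(PREFIX) + EMAIL, line, re.IGNORECASE):
            split.append(line)
        else:
            split += PAT.sub(lambda m: "\n" + m.group(0) + "\n", line).split("\n")
    # 3) delete the lines that contain no match
    kept = [line for line in split if PAT.search(line)]
    # 4) join non-prefixed lines onto the previous line
    joined = []
    for line in kept:
        if line.startswith(PREFIX) or not joined:
            joined.append(line)
        else:
            joined[-1] += " " + line
    # 5) strip the prefix back off
    return [l[len(PREFIX):] if l.startswith(PREFIX) else l for l in joined]


if __name__ == "__main__":
    sample = [
        "alice@example.com hello bob@example.com",
        "hello carol@example.co.uk",
        "no address here",
    ]
    for line in five_passes(sample):
        print(line)
```

The prefix has to be something that never starts a line in the real text, which is the same assumption the "- " marker makes in the Vim version.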

For your email example, that would be:

:%s/^.\{-}\([A-Z0-9+_.-]\+@\([A-Z0-9-]\+\.\)\+[A-Z]\{2,6}\)\c/\r- \1\r

:v/^- \([A-Z0-9+_.-]\+@\([A-Z0-9-]\+\.\)\+[A-Z]\{2,6}\)\c/s/\([A-Z0-9+_.-]\+@\([A-Z0-9-]\+\.\)\+[A-Z]\{2,6}\)\c/\r&\r/g

:v/\([A-Z0-9+_.-]\+@\([A-Z0-9-]\+\.\)\+[A-Z]\{2,6}\)\c/d

:v/^- /-j

:%s/^- /
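
If you want something to compare the resulting buffer against, the end state those commands aim for can be described in a few lines of Python: per line, only the addresses survive, separated by single spaces, and lines without any address disappear. This is just a checking aid under the assumption that Python's regex matching agrees with Vim's for this pattern; expected_buffer is a made-up name and the sample addresses are invented.

```python
import re

# The thread's email pattern; re.IGNORECASE plays the role of Vim's \c.
PAT = re.compile(r"[A-Z0-9+_.-]+@(?:[A-Z0-9-]+\.)+[A-Z]{2,6}", re.IGNORECASE)


def expected_buffer(lines):
    """Per line, keep only the addresses (space-separated); drop lines with none."""
    result = []
    for line in lines:
        found = [m.group(0) for m in PAT.finditer(line)]
        if found:
            result.append(" ".join(found))
    return result


if __name__ == "__main__":
    sample = [
        "Alice@Example.COM hello bob@example.com",
        "hello carol@example.co.uk",
        "nothing here",
    ]
    for line in expected_buffer(sample):
        print(line)
```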


There might be some way to do it in fewer steps, but if you need
something right now, that should work.

-tim


--
You received this message from the "vim_use" maillist.
Do not top-post! Type your reply below the text you are replying to.
For more information, visit http://www.vim.org/maillist.php
