Thursday, October 11, 2012

Re: Remove duplicate words or patterns inside lines

On 10/11/12 06:32, tjg wrote:
> I have this type of file (plain text) :
>
> sometext *sometext* @me &project1 *@me*
> &project2 sometext *&project2* @john @me
> something @john &project2
> sometext #1 @me something else *#1*

I presume the "*" were added by your MUA as an attempt to highlight
the duplicates.

> and I would like to remove all inside-a-line duplicates so as to obtain :
>
> sometext @me &project1
> sometext &project2 @john @me
> something @john &project2
> sometext #1 @me something else
>
> btw, the order of items (sometext, #, @ or &) do not matter, as long as they
> are unique per line

To remove the second instance of each pair, you can use this ugly brute:

:%s/\([#@&]\=\<\w\+\>\).\{-}\zs \+[#@&]\@<!\1\>//g

If two pairs of duplicates overlap, such as

sometext @me sometext @me

you'll have to run it a second time, but otherwise it seems to
catch all the edge cases I threw at it:

- when the text such as "sometext" also appears as "&sometext" or
"@sometext" or "#sometext"

- when substrings match such as "mete" and "sometext"
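
If a second pass is a nuisance, the same clean-up can be done outside
the regex by splitting each line on whitespace and keeping only the
first copy of every token. The following is a minimal Python sketch of
that idea, not a translation of the substitute command above; the
names dedupe_line and dedupe_text are made up for illustration. It
assumes items are separated by whitespace, and unlike the regex it
collapses runs of spaces into a single space.

```python
def dedupe_line(line):
    """Drop repeated whitespace-separated tokens, keeping the first of each.

    Tokens are compared whole, so "sometext", "&sometext" and "mete"
    are all treated as different items.
    """
    seen = set()
    kept = []
    for token in line.split():
        if token not in seen:
            seen.add(token)
            kept.append(token)
    # Note: joining with a single space normalizes the original spacing.
    return " ".join(kept)


def dedupe_text(text):
    """Apply dedupe_line to every line of a multi-line string."""
    return "\n".join(dedupe_line(line) for line in text.splitlines())


if __name__ == "__main__":
    sample = (
        "sometext sometext @me &project1 @me\n"
        "sometext @me sometext @me"
    )
    print(dedupe_text(sample))
```

Because every token is checked against everything already seen on the
line, overlapping pairs are handled in one pass.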

Hope this helps,

-tim


