On Wednesday, October 24, 2012 11:24:36 PM UTC-5, Karthick wrote:
> On Wed, Oct 24, 2012 at 11:05 PM, John Slattery wrote:
>
> > Hi,
>
> >
>
> > In a tab delimited data file I want to replace empty fields with \N. I wasn't getting the result I expected and began working with a simple test file that looked as follows with set list:
>
> >
>
> > ^I^I^I^I^I$
> > ^I^I^I^I$
> > ^I^I^I$
> > ^I^I$
> > ^I$
> > $
>
> For simplicity of typing here, I assumed a comma separated file:
>
> ,,,,,$
> ,,,,$
> ,,,$
> ,,$
> ,$
> $
>
>
>
> The '$' is a literal dollar sign, I assume, and is not meant to imply end-of-line.
>
>
>
> > The desired result would be:
>
> >
>
> > \N^I\N^I\N^I\N^I\N^I\N$
> > \N^I\N^I\N^I\N^I\N$
> > \N^I\N^I\N^I\N$
> > \N^I\N^I\N$
> > \N^I\N$
> > \N$
>
>
>
> I also simplified the replacement pattern to just "N". So we are looking for:
>
> N,N,N,N,N,N$
> N,N,N,N,N$
> N,N,N,N$
> N,N,N$
> N,N$
> N$
>
>
>
> >
>
> > What I started with, and what seemed the most obvious way to do it, doesn't work, leaving the final field of multi-field lines unfilled:
>
>
>
> I think the question is "what constitutes an empty field?" There are four patterns:
>
> ,, - a comma followed by a comma
> ,$ - a comma followed by a dollar
> ^, - a start of line followed by a comma
> ^$ - a start of line followed by a dollar
>
>
>
> So an empty field is: "either a start of line (^) or a comma, followed by either a comma or a dollar". So the substitute pattern becomes:
>
> %s;\(^\|,\)\ze\(,\|\$\);\1N;gc
>
>
>
> To now look at the original problem, the pattern is:
>
> %s;\(^\|^I\)\ze\(^I\|\$\);\1\\N;gc
>
>
>
> ^I must be replaced by the literal tab character.
>
>
>
> Note: for some reason, the same substitute command as above, but with only the "g" flag (and not "gc"), does not work for me! Still checking that.
>
>
>
> >
>
> > %s/\(^\|\t\)\@<=\(\t\|$\)\@=/\\N/g
>
> >
>
> > The following works, the only difference being that the lookahead matches the newline character (\n) instead of the zero-width end-of-line ($):
>
> >
>
> > %s/\(^\|\t\)\@<=\(\t\|\n\)\@=/\\N/g
>
> >
>
> > Using zs and ze instead of @<= and @=, the following did not work, strangely leaving the second field of multi-field lines unfilled:
>
> >
>
> > %s/\(^\|\t\)\zs\ze\(\t\|$\)/\\N/g
>
> Yes - I see this issue as well! And if you use "gc", it works correctly.
>
>
>
> >
>
> > Both of the commands above that failed achieve the desired result if executed repeatedly until the command returns E486: Pattern not found: ….
>
> >
>
> > Vim is v7.3.429 as packaged for the Lubuntu 12.04 repositories.
>
> Same here.
The '$' is actually end-of-line. (I mentioned that the test data is displayed as it would be with set list.)
With the pattern modified accordingly, I tried your suggestion and confirmed that it works with the gc flags. When I tried it with just g:
%s/\(^\|,\)\ze\(,\|$\)/\1N/g
it left the second field of multi-field lines unfilled as my command:
%s/\(^\|\t\)\zs\ze\(\t\|$\)/\\N/g
did.
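
For anyone who wants to cross-check the expected output outside Vim, here is a minimal Python sketch of the intended transformation. It does not reproduce or explain the Vim behaviour described above. The function name fill_empty_fields and its sep and filler parameters are made up for illustration, and it assumes each line is passed in without its trailing newline.

```python
def fill_empty_fields(line, sep="\t", filler=r"\N"):
    """Return line with every empty sep-delimited field replaced by filler.

    Hypothetical helper for illustration only; assumes `line` carries no
    trailing newline.
    """
    # str.split with an explicit separator keeps empty strings for empty
    # fields, so "".split("\t") yields [""] and "\t".split("\t") yields ["", ""].
    return sep.join(field if field else filler for field in line.split(sep))


if __name__ == "__main__":
    # The test file from the thread: five tabs down to zero tabs per line.
    for n in range(5, -1, -1):
        # repr() makes the tabs and the backslash visible in the output.
        print(repr(fill_empty_fields("\t" * n)))
```

Passing sep="," and filler="N" gives the simplified comma-separated variant used in the reply above.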