Wednesday, October 24, 2012

Re: Problem Using :substitute to Replace Empty Fields in Tab Delimited File

On Wed, Oct 24, 2012 at 11:05 PM, John Slattery <johntslattery@gmail.com> wrote:
> Hi,
>
> In a tab delimited data file I want to replace empty fields with \N. I wasn't getting the result I expected and began working with a simple test file that looked as follows with set list:
>
> ^I^I^I^I^I$
> ^I^I^I^I$
> ^I^I^I$
> ^I^I$
> ^I$
> $
To keep the typing simple here, I assumed a comma-separated file:
,,,,,$
,,,,$
,,,$
,,$
,$
$

I assume the '$' is a literal dollar sign and is not meant to mark
end-of-line.

> The desired result would be:
>
> \N^I\N^I\N^I\N^I\N^I\N$
> \N^I\N^I\N^I\N^I\N$
> \N^I\N^I\N^I\N$
> \N^I\N^I\N$
> \N^I\N$
> \N$

I also simplified the replacement string to just "N". So we are looking for:
N,N,N,N,N,N$
N,N,N,N,N$
N,N,N,N$
N,N,N$
N,N$
N$

>
> What I started with, and what seemed the most obvious way to do it, doesn't work, leaving the final field of multi-field lines unfilled:

I think the question is: what constitutes an empty field? There are
four patterns:
,, - a comma followed by a comma
,$ - a comma followed by a dollar
^, - start of line followed by a comma
^$ - start of line followed by a dollar

So an empty field is: "either a start of line (^) or comma, followed
by either a comma or a dollar". So the substitute pattern becomes:
%s;\(^\|,\)\ze\(,\|\$\);\1N;gc

To now look at the original problem, the pattern is:
%s;\(^\|^I\)\ze\(^I\|\$\);\1\\N;gc

^I must be replaced by the literal tab character.
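
As a cross-check outside Vim, here is a minimal Python sketch of the same
empty-field rule (start of line or separator before, separator or terminator
after). The function name fill_empty_fields and its parameters are my own,
for illustration only. By default it follows the simplified comma example,
where the terminator is a literal '$'. Passing end="$" treats the real end of
the line as the terminator instead, which is an assumption about what the
data file actually looks like.

```python
import re

def fill_empty_fields(line, sep=",", fill="N", end=r"\$"):
    # An empty field is a zero-width position that is preceded by the
    # start of the line or by sep, and followed by sep or by end.
    # end is a regex fragment: r"\$" is a literal dollar sign,
    # while "$" is the real end of the line.
    s = re.escape(sep)
    # Python look-behinds must have a fixed width, so "start of line or
    # separator" is written as ^ alternated with a one-character look-behind.
    pattern = re.compile(r"(?:^|(?<=" + s + r"))(?=" + s + r"|" + end + r")")
    # A function replacement keeps backslashes in fill (such as \N) literal.
    return pattern.sub(lambda m: fill, line)

# Simplified comma example with a literal '$' terminator.
print(fill_empty_fields(",,$"))   # N,N,N$
print(fill_empty_fields("$"))     # N$

# Tab-delimited variant: real end of line, replacement \N.
print(fill_empty_fields("a\t\tb", sep="\t", fill="\\N", end="$") == "a\t\\N\tb")  # True
```

Every match here is zero-width and nothing is consumed, so a single pass
fills all the empty fields on a line.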

Note: for some reason, the same substitute command as above, but with
only the "g" flag (and not "gc"), does not work for me! I am still
checking that.

>
> %s/\(^\|\t\)\@<=\(\t\|$\)\@=/\\N/g
>
> The following works, the only difference being the match on end-of-line instead of end-of-line at end of pattern with zero width:
>
> %s/\(^\|\t\)\@<=\(\t\|\n\)\@=/\\N/g
>
> Using zs and ze instead of @<= and @=, the following did not work, strangely leaving the second field of multi-field lines unfilled:
>
> %s/\(^\|\t\)\zs\ze\(\t\|$\)/\\N/g
Yes - I see this issue as well! And if you use "gc", it works correctly.

>
> Both of the commands above that failed achieve the desired result if executed repeatedly until the command returns E486: Pattern not found: ….
>
> Vim is v7.3.429 as packaged for the Lubuntu 12.04 repositories.
Same here.

