Monday, July 9, 2012

Re: How to remove empty lines except of one

On Thu, Jul 05, 2012 at 06:54:44AM EDT, Tim Chase wrote:

Hello Tim, not replying to anybody in particular..

> It might be slightly more efficient, as replacing the "2" case with
> \r\r is a NOOP.
>
> As for blank-ish lines (containing just whitespace), it might become
> something like
>
> %s/^\(\s*\n\)\+/\r

I had a bit more time to look into this over the weekend and there's
something I don't understand regarding the way Vim handles the end of
buffer condition.

Here's one particular regex I came up with:

----------------------------------------------------------------------
/\(\%^\|\S\n\)\@<=\(\_^\s*\n\)\{2,\}\(\%$\|\S\)\@=
----------------------------------------------------------------------

This appears to do what I have in mind when used in a _search_ command
(note the initial '/').

What this is supposed to match:

- a zero-length alternative: either start of buffer or any
non-white-space character followed by a new line

- two or more empty lines, each optionally containing white space

- another zero-length alternative: either end of buffer or a
non-white-space character.
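
For anyone who wants to poke at the logic outside Vim, the three pieces above can be approximated with Python's `re` module. This is only a sketch under assumptions: Python's engine is not Vim's, `\s` is narrowed to `[ \t]` because Python's `\s` would also match newlines, and `\A` / `\Z` stand in for `\%^` / `\%$`. The name `BLANK_RUN` and the small test string are mine, not part of the thread.

```python
import re

# Rough Python analogue of the Vim pattern (an approximation, not Vim's engine).
BLANK_RUN = re.compile(
    r"(?:\A|(?<=\S\n))"     # zero-width: start of text, or right after a non-blank char + newline
    r"(?:^[ \t]*\n){2,}"    # two or more lines holding nothing but spaces/tabs
    r"(?=\Z|\S)",           # zero-width: end of text, or a non-white-space character
    re.MULTILINE,
)

# Hypothetical mini-buffer: one run of three blank-ish lines, then a lone empty line.
text = "aaaa\n\n \t\n\nbbbb\n\ncccc\n"
match = BLANK_RUN.search(text)
print(repr(match.group(0)))   # '\n \t\n\n'
```

The run of three blank-ish lines matches as a single unit, while the lone empty line between `bbbb` and `cccc` is left alone.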

What I'm trying to do is to match each group of two or more empty
lines once and once only: if you have a block of ten empty lines it
will match once. When you hit 'n' the cursor jumps to the first line of
the next block.

Naturally, the regex is not meant to be efficient, smart, abstemious,
etc., just a literal translation of my pseudo-code.

Anyway, I tested it on the following file/buffer:

--------------------------- start of file ----------------------------
1 |
2 |
3 |aaaa
4 |
5 |cccc
6 |
7 |
8 |dddd
9 |
10 |
11 |
12 |
13 |
14 |
15 |asdfasdf
16 |
17 |
18 |
19 |asdf asdf
20 | asdf
21 |
22 |asfd
23 |
24 |
25 |
---------------------------- end of file -----------------------------

line 1-2 : '^$' (empty lines)
line 3 : 'aaaa\n' (four a's + eol)
line 4 : '  \n' (two spaces + eol)
line 5 : 'cccc\n' (four c's + eol)
line 6-7 : '^$' (empty lines)
line 8 : 'dddd\n' (four d's + eol)
line 9-10 : '^$' (empty lines)
line 11 : '\t   \n' (one tab, three spaces + eol)
line 12-14 : '^$' (empty lines)
line 15 : 'asdfasdf\n' (asdfasdf + eol)
line 16-18 : '^$' (empty lines)
line 19 : 'asdf  asdf  \n' (asdf, 2 spaces, asdf, 2 spaces + eol)
line 20 : '  asdf\n' (2 spaces, asdf, + eol)
line 22 : 'asfd\n' (asfd + eol)
line 23-25 : '^$' (empty lines)

If I set hlsearch and search for text matching the above regex, five
blocks are (correctly) highlighted: 1-2, 6-7, 9-14, 16-18, and 23-25.

If I repeatedly hit the 'n' key, the cursor jumps to line 1, line 6,
line 9, line 16, line 23... and wraps around back to line 1, line 6,
line 9.. etc.

But when I proceed to _substitute_ all the matched blocks by a single
empty line:

:%s//\r

.. everything works as planned, preserving trailing white space, except
for the last three lines: they are replaced by two empty lines instead
of one. As if the last line in the file/buffer was somehow excluded from
the match.
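
To pin down the result I was aiming for, here is the same experiment sketched with Python's `re` on the 25-line sample. Assumptions, stated loudly: the pattern is a Python approximation of the Vim one (`[ \t]` for `\s`, `\A` / `\Z` for `\%^` / `\%$`), line 21 is not described in the listing and is assumed empty, and the names `BLANK_RUN`, `lines`, `line_of` are made up for the illustration. It says nothing about why Vim treats the last block differently; it only shows the intended outcome.

```python
import re

# Python approximation of the Vim pattern (not Vim's engine).
BLANK_RUN = re.compile(r"(?:\A|(?<=\S\n))(?:^[ \t]*\n){2,}(?=\Z|\S)", re.MULTILINE)

# The 25-line sample buffer; line 21 is assumed to be empty.
lines = [
    "", "", "aaaa", "  ", "cccc", "", "", "dddd", "", "",
    "\t   ", "", "", "", "asdfasdf", "", "", "", "asdf  asdf  ", "  asdf",
    "", "asfd", "", "", "",
]
buffer_text = "".join(line + "\n" for line in lines)

def line_of(offset):
    """Return the 1-based line number containing the given character offset."""
    return buffer_text.count("\n", 0, offset) + 1

# First and last line of every matched block.
blocks = [(line_of(m.start()), line_of(m.end() - 1))
          for m in BLANK_RUN.finditer(buffer_text)]
print(blocks)   # [(1, 2), (6, 7), (9, 14), (16, 18), (23, 25)]

# Replace each block with a single empty line.
collapsed = BLANK_RUN.sub("\n", buffer_text)
print(collapsed.splitlines()[-2:])   # ['asfd', '']
```

In this approximation the five blocks agree with what hlsearch highlights, and the last block collapses to one empty line after 'asfd', which is the result I expected from ':%s//\r' as well.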

So, is my regex not doing what I think it does¹, or is there something
'special' about the last line in a Vim buffer²?

No big deal... but if someone can figure out what's happening, I'd be
curious to hear the explanation.

Thanks,

CJ

¹ Plausible, but if I get rid of my two zero-length matches and use
the simpler '/\(\_^\s*\n\)\{2,\}', I also get my matches via the '/'
command, but the substitute ':%s' command still leaves me with two
empty lines at the end of the file.

² I saw other oddities, for instance when I add a line containing white
space at the end of the sample file, the regex no longer matches. As
a result, a block of empty lines at end of file is left untouched.
I also noticed that Ben Fritz's suggestion earlier in this thread
('%s/^\_s\+\n/\r') has exactly the same 'limitation'.


--
AHH! The neurotic monkeys are after me!

