Monday, December 28, 2015

Re: Apparent regex bug involving `\_.' (and other newline-matching constructs)

Brett Stahlman wrote:

> Given a file containing the following 2 lines...
> 1a3
> 123xyz
>
> ...try the following tests, and note the unexpected results.
>
> Case 1.1:
> call cursor(1, 1)
> echo searchpos('\%(\([a-z]\)\|\_.\)\{-}xyz', 'pcW')
> => [1, 1, 2]
>
> Case 1.2:
> call cursor(1, 2)
> echo searchpos('\%(\([a-z]\)\|\_.\)\{-}xyz', 'pcW')
> => [2, 1, 1]
> Question: Why does the \_. not permit an earlier match at cursor pos (1, 2)?
> Note: Clearly, the submatch should be 2, not 1, but this error is simply a
> consequence of the first error: since the match doesn't begin on the 1st
> line, the "a" at cursor pos can't be captured.

This is because of the 'c' flag in 'cpoptions'. The Vi-compatible way
of searching is to start at the first column and skip over the match.
Then take the first match after the start position.

> Case 1.3:
> call cursor(1, 3)
> echo searchpos('\%(\([a-z]\)\|\_.\)\{-}xyz', 'pcW')
> => [2, 1, 1]
> Note: Why isn't a match found at cursor pos (1, 3)?
>
> Repeat these tests with a \zs in the pattern, and note how the capture
> is matched unconditionally...
>
> Case 2.1:
> call cursor(1, 1)
> echo searchpos('\%(\([a-z]\)\|\_.\)\{-}\zsxyz', 'pcW')
> => [2, 4, 2]
>
> Case 2.2:
> call cursor(1, 2)
> echo searchpos('\%(\([a-z]\)\|\_.\)\{-}\zsxyz', 'pcW')
> => [2, 4, 2]
>
> Case 2.3:
> call cursor(1, 3)
> echo searchpos('\%(\([a-z]\)\|\_.\)\{-}\zsxyz', 'pcW')
> => [2, 4, 2]
> Note: Submatch should be 1, not 2, here. It's as though the \zs forces the
> capture to match unconditionally.
>
> Points to note... Originally, I thought the error had to do with the 'p'
> flag, but that appears not to be the case: the submatch errors are simply a
> consequence of the incorrectly determined start locations. Also, it appears
> the results would have been the same with * as they were with \{-}.
> Finally, the unexpected behavior is not limited to \_., but is seen even
> when (e.g.) explicit \n is used.

After removing 'c' from 'cpoptions', does it work as you expect?
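
One way to check this is to rerun one of the cases with and without the
flag. The following is only a sketch: it assumes the current buffer holds
the two test lines "1a3" and "123xyz", and the variable names save_cpo and
pat are made up for illustration. It repeats Case 1.2, removes 'c' from
'cpoptions', repeats it again, and then restores the original setting.

```vim
" Sketch: rerun Case 1.2 with and without the 'c' flag in 'cpoptions'.
" Assumes the current buffer contains the lines "1a3" and "123xyz".
" save_cpo and pat are illustrative names, not part of the original report.
let save_cpo = &cpoptions
let pat = '\%(\([a-z]\)\|\_.\)\{-}xyz'

" Search from (1, 2) with whatever 'cpoptions' is currently set to.
call cursor(1, 2)
echo 'current cpoptions:' searchpos(pat, 'pcW')

" Remove the 'c' flag and repeat the same search from the same position.
set cpoptions-=c
call cursor(1, 2)
echo 'without c flag:' searchpos(pat, 'pcW')

" Restore the original option value.
let &cpoptions = save_cpo
unlet save_cpo pat
```

The cursor is reset before each search because searchpos() moves it when
called without the 'n' flag.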

--
Veni, Vidi, Video -- I came, I saw, I taped what I saw.

/// Bram Moolenaar -- Bram@Moolenaar.net -- http://www.Moolenaar.net \\\
/// sponsor Vim, vote for features -- http://www.Vim.org/sponsor/ \\\
\\\ an exciting new programming language -- http://www.Zimbu.org ///
\\\ help me help AIDS victims -- http://ICCF-Holland.org ///
