Tuesday, February 19, 2013

Re: trouble with pattern, character collections

diff --git a/runtime/doc/pattern.txt b/runtime/doc/pattern.txt
--- a/runtime/doc/pattern.txt
+++ b/runtime/doc/pattern.txt
@@ -754,6 +754,7 @@

. (with 'nomagic': \.) */.* */\.*
Matches any single character, but not an end-of-line.
+ Note, it will also match CR and LF control chars in the text.

*/\_.*
\_. Matches any single character or end-of-line.
diff --git a/src/regexp.c b/src/regexp.c
--- a/src/regexp.c
+++ b/src/regexp.c
@@ -2283,6 +2283,9 @@
{
char_u *lp;

+ if (*regparse == ']')
+ EMSG_M_RET_NULL(_("E769: Missing ] after %s["),
+ reg_magic > MAGIC_OFF);
/*
* If there is no matching ']', we assume the '[' is a normal
* character. This makes 'incsearch' and ":help [" work.
@@ -2404,6 +2407,8 @@
*ret = ANYOF + ADD_NL;
*flagp |= HASNL;
}
+ else if (*ret == ANYBUT)
+ regc(NL); /* also skip newlines */
/* else: must have had a \n already */
}
regparse++;
diff --git a/src/testdir/test44.in b/src/testdir/test44.in
--- a/src/testdir/test44.in
+++ b/src/testdir/test44.in
@@ -38,6 +38,9 @@
:put =matchstr(\"אבגד\", \"..\", 0, 2) " בג
:put =matchstr(\"אבגד\", \".\", 0, 0) " א
:put =matchstr(\"אבגד\", \".\", 4, -1) " ג
+:put ='\n matches [^\n]: ' . (match(\"\n\",'[^\n]') > -1 ? 'YES' : 'NO')
+:put ='\n matches \"[^\n]\": ' . (match(\"\n\",\"[^\n]\") > -1 ? 'YES' : 'NO')
+:put ='empty collation matches ''[]'': ' . (match('[]', '[]') > -1 ? 'YES' : 'NO')
:w!
:qa!
ENDTEST
diff --git a/src/testdir/test44.ok b/src/testdir/test44.ok
--- a/src/testdir/test44.ok
+++ b/src/testdir/test44.ok
@@ -21,3 +21,6 @@
בג
א
ג
+\n matches [^\n]: NO
+\n matches "[^\n]": NO
+empty collation matches '[]': NO
On Tue, 19 Feb 2013, Christian Brabandt wrote:

> On Mon, 18 Feb 2013, Marc Weber wrote:
> > I don't think that additional threads are going to help.
> > There is an issue, and we should find a way to fix it (IMHO).
> > Let me summarize again - and tell me if you feel differently.
> >
> > Test cases:
> > [1] echo len(matchstr("\n",'\zs[^\n]\ze'))
> > [2] echo len(matchstr("\n","\\zs[^\n]\\ze"))
> >
> > I expect both to do the same. The difference is that the second has chr(10) in [^],
> > while the first has \n (which should be translated to chr(10)).
> >
> > However, I observe that [2] returns 0 as expected, but [1] returns
> > 1, so it matches \n even though I told Vim that I do not want to match
> > it. People told me this was because '.' is equal to [^\n].
> >
> >
> > Current situation: what should at least be fixed
> > 1:
> > No matter whether '.' should behave like [^\n],
> > [1] and [2] should behave the same, right?
> > 2:
> > This should be documented.
> > (Do you all at least agree with these two statements?)
>
> Bram, here is a patch that makes [^\n] not match NL within the text and
> also documents that '.' matches CR and LF within the text.
>
> This makes both [1] and [2] behave the same and seems to better match
> users' expectations.

Attached is an updated patch that also prevents /[] from matching []
(a collection cannot be empty, so I think it should return an error;
other vi clones do, and grep and perl also throw an error).

Included are tests as well.

regards,
Christian
--
Sprachlexikon-Namen: GERRITT - gemütl. Schritt-Tempo b. Pferden

