Thursday, November 12, 2020

Re: regex to find where 'sample text' is not followed by 'sample text' a couple of lines down

On 2020-11-12 18:42, Chris Jones wrote:
> I am proofreading a document where a few words occur on one line
> and the same exact words are replicated two lines down.
>
> Here's a sample:
>
> | ```{=latex}
> | \index{Text that must occur twice}
> | ```
> | **2507. Text that must occur twice.** ... etc.
>
> I found that it's easy to highlight such occurrences using (e.g.):
>
> | /\\index{\(.*\)}\n```\n\*\*\d\+\. \1 " (1)
>
> Now I noticed that once in a while the repeated text is not the
> same as the text inside the curly brackets (i.e. in the \latex{...}
> command).

As best I can tell, this should highlight \index{} entries whose text
does not reappear within the following N lines (3ish here, though I
might have a fenceposting error):

/\\index{\zs\(.*\)\ze}\(\%(\n.*\)\{,3\}\1\)\@!

At least it passed all the tests I threw at it.
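
For anyone who wants to check the same idea outside Vim, here is a rough
Python sketch of that pattern. It is a translation, not the Vim regex
itself: Vim's "\@!" becomes "(?!...)", "\{,3\}" becomes "{0,3}", and the
"\zs"/"\ze" markers are dropped because the captured group is returned
instead. The function name and the sample strings are made up for
illustration.

```python
import re

# Translation of the Vim pattern: capture the \index{...} text, then
# assert that it does NOT reappear on the rest of this line or anywhere
# in the next three lines.
PATTERN = re.compile(r"\\index\{(.*)\}(?!(?:\n.*){0,3}\1)")


def mismatched_index_entries(text):
    """Return the \\index{} texts that are not repeated shortly after."""
    return [m.group(1) for m in PATTERN.finditer(text)]


# Hypothetical samples modelled on the one in the original message.
good = (
    "```{=latex}\n"
    "\\index{Text that must occur twice}\n"
    "```\n"
    "**2507. Text that must occur twice.** ... etc.\n"
)
bad = (
    "```{=latex}\n"
    "\\index{Text that must occur twice}\n"
    "```\n"
    "**2508. Text that should occur twice.** ... etc.\n"
)

print(mismatched_index_entries(good))  # []
print(mismatched_index_entries(bad))   # ['Text that must occur twice']
```

As in the Vim version, the repeated text may appear anywhere in the
following lines, not only right after the entry number.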

> In order to find them I tried:
>
> | /\\index{\(.*\)}\n```\n\*\*\d\+\. \@<!\1 " (2)
>
> The '\@<!' as I understand it means that my search pattern will
> match everything up to and including the space... followed by
> something that differs from the current value of the '\1' back
> reference.

The first issue there is that the "\@<!" applies to the atom *before*
it (a space) rather than the atom *after* it (your \1). However, even
if you group them, it might fail to match if off by even one character.
I'd have to play with it more to see if there are other nuances that
would cause issues.
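
To make the binding problem concrete, here is a small Python sketch of
how I read pattern (2). Python writes its assertions as prefixes rather
than as Vim's postfix "\@<!", so both patterns below are my own
translations, and the two sample strings are invented for the test.

```python
import re

# Pattern (2) as Vim parses it: "\@<!" binds to the preceding atom (the
# space), so it becomes "no space just before this point", and \1 is
# still REQUIRED to match, directly after the period.
AS_WRITTEN = re.compile(r"\\index\{(.*)\}\n```\n\*\*\d+\.(?<! )\1")

# What was presumably intended: consume the space, then assert that \1
# does NOT follow (a negative lookahead, "\@!" in Vim terms).
INTENDED = re.compile(r"\\index\{(.*)\}\n```\n\*\*\d+\. (?!\1)")

# Hypothetical samples: one where the text repeats, one where it differs.
same = "\\index{Some text}\n```\n**2507. Some text.** etc."
differs = "\\index{Some text}\n```\n**2507. Other text.** etc."

print(AS_WRITTEN.search(same))              # None
print(AS_WRITTEN.search(differs))           # None
print(INTENDED.search(same))                # None
print(INTENDED.search(differs) is not None) # True
```

The as-written form finds nothing in either sample because the space
after the period is never consumed, so \1 has nothing to line up with.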

-tim


