Thursday, April 22, 2010

Re: regex for a range that has a given number?

On Apr 22, 11:14 am, Peng Yu <pengyu...@gmail.com> wrote:
> I frequently need to search for citations in papers by citation
> numbers.
>
> For example, something like [7] [2-8] [2,4,8-10] in the maintext
> indicates which articles listed in the bibliography are cited.
> Now, I want to search for where it cites citation number 7 in the
> maintext (in this example, it is cited in three places in the
> maintext).
>
> But I don't see a very general way to search for such a citation. I
> suspect this may not be doable with a regex in every case. But I'm not
> sure. Does anybody have a good solution?
>

You can do it with a regex, but it won't be pretty, and it will
require some scripting if you want to automatically generate the
pattern in the general case.

For the number 7 specifically, I think this pattern will work (if I
correctly deciphered what you want to match):

\v%(\[|,)\zs%(7|[1-6]-%([7-9]|\d{2,}))

This should match anywhere that 7:
* is explicitly listed in the citation
* is contained within a range of numbers in the citation

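It assumes that a 7 immediately following a comma only occurs within a
citation, and that citations contain no whitespace after the bracket
or the commas. Note that it does not check what follows the 7, so it
will also match a citation such as [70].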

Explanation:
\v -- very magic, to simplify the regex
%( -- start a group (without a backreference for minor speed gain) so
we can use '|' on part of the pattern
\[|, -- match either an opening bracket or a comma
) -- close the first group
\zs -- not really needed, but adjusts highlighting to only cover the
numbers instead of also highlighting the opening bracket or comma
%( -- start another group so we can use '|' again
7 -- match a 7 that occurs by itself or as the first number in a range
[1-6]- -- match ranges that start with a number less than 7
%([7-9]|\d{2,}) -- match the final part of a range, which ends in
either a single digit greater than or equal to 7, or a number with
two or more digits (which is always greater than 7)
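A quick way to sanity-check the pattern is to try it against a few
sample strings with the =~# operator. This is only a sketch: the
sample strings are taken from the question plus one made-up range
([1,3-12]), and the variable names are arbitrary.

```vim
" The pattern from above, stored in a single-quoted string so that the
" backslashes are taken literally.
let pat = '\v%(\[|,)\zs%(7|[1-6]-%([7-9]|\d{2,}))'

" Report for each sample citation whether the pattern matches it.
for text in ['[7]', '[2-8]', '[2,4,8-10]', '[1,3-12]']
  echo printf('%-12s %s', text, text =~# pat ? 'matches' : 'no match')
endfor
```

Sourcing this should report a match for [7], [2-8] and [1,3-12].
[2,4,8-10] is not expected to match, because 7 is neither listed nor
inside the range 8-10.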

Obviously this will need some tweaking for any citation number other
than seven. Searching for a 2-digit number will take more effort, but
it should still be possible with a regex (though the pattern could get
very long). Patterns built this way are ugly and easy to get wrong
unless you script their construction.
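As a rough illustration of scripting the construction, here is a
sketch that rebuilds the same pattern for any single-digit citation
number. The function name CitationPattern() is made up for this
example, it assumes citation numbers start at 1, and it does not
attempt the harder multi-digit case.

```vim
" Hypothetical helper: build the pattern for a single-digit number n.
function! CitationPattern(n) abort
  if a:n < 1 || a:n > 9
    throw 'CitationPattern: this sketch only supports 1 through 9'
  endif
  " First alternative: n itself, alone or as the start of a range.
  let alts = [string(a:n)]
  if a:n > 1
    " Second alternative: a range that starts below n and ends at n or
    " above. '%%' in the printf() format produces a literal '%'.
    call add(alts, printf('[1-%d]-%%([%d-9]|\d{2,})', a:n - 1, a:n))
  endif
  return '\v%(\[|,)\zs%(' . join(alts, '|') . ')'
endfunction
```

For 7 this should produce exactly the pattern shown earlier. One way
to use it is ":let @/ = CitationPattern(7)" followed by "n" to jump to
the next match.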

A better approach might be to define a function that finds each
citation with the search() function and a simple regex like
\[[^]]*\]. The function can then use getline() to grab the matching
line and parse the citation to determine whether it contains your
number. If it does not, call search() again until you find a citation
that does, or until you reach the end of the file.
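A minimal sketch of that approach follows. The function names
CitationHasNumber() and FindCitation() are invented for this example,
and it assumes each citation sits on a single line and contains only
numbers, commas, hyphens and optional spaces.

```vim
" Return 1 if the citation body (the text between the brackets, e.g.
" '2,4,8-10') lists a:num explicitly or inside a range, else 0.
function! CitationHasNumber(body, num) abort
  " Drop any whitespace so '2, 4' and '2,4' are treated the same.
  let body = substitute(a:body, '\s\+', '', 'g')
  for item in split(body, ',')
    let bounds = split(item, '-')
    if empty(bounds)
      continue
    endif
    " A single number is treated as a range with equal ends.
    let lo = str2nr(bounds[0])
    let hi = str2nr(bounds[-1])
    if lo <= a:num && a:num <= hi
      return 1
    endif
  endfor
  return 0
endfunction

" Move the cursor to the next citation after the cursor that contains
" a:num. Returns its line number, or 0 (cursor restored) if none is
" found before the end of the file.
function! FindCitation(num) abort
  let start = getpos('.')
  " The 'W' flag stops the search at the end of the file.
  while search('\[[^]]*\]', 'W') > 0
    " The cursor is now on the '['; extract the text up to the ']'.
    let body = matchstr(getline('.'), '\[\zs[^]]*\ze\]', col('.') - 1)
    if CitationHasNumber(body, a:num)
      return line('.')
    endif
  endwhile
  call setpos('.', start)
  return 0
endfunction
```

With both functions sourced, ":call FindCitation(7)" should jump to
the next citation that covers 7. Because the range check is done
numerically rather than in the regex, the same code should work for
multi-digit citation numbers as well.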
