Friday, March 6, 2015

Search pattern while excluding some words

First, a small bit of background. I have created a little
bash script which runs pdftotext on a PDF file (containing
obituaries, with surnames in upper-case), then invokes Vim
commands to massage the resulting text file, basically to
break the file into paragraphs.

I then open the resulting text file in Vim and search for
surnames which may have remained embedded within a
paragraph; I use

/[A-Z]\{4,\}

for this (ignoring the occasional three-letter surname).
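
To make concrete what that search lands on, here is a minimal sketch in Python rather than Vim (Python's `[A-Z]{4,}` is assumed to correspond to the Vim pattern above). The sample line and the surname SMITH are hypothetical and not taken from the real obituary text, and `find_upper_runs` is just an illustrative helper name.

```python
import re

# Python spelling of the Vim pattern /[A-Z]\{4,\}
UPPER_RUN = re.compile(r"[A-Z]{4,}")

# Hypothetical sample line, invented for illustration only.
sample = "A retired RCMP officer, Mr. SMITH asked for donations to the SPCA."


def find_upper_runs(text):
    """Return every run of four or more consecutive uppercase letters."""
    return UPPER_RUN.findall(text)


print(find_upper_runs(sample))  # ['RCMP', 'SMITH', 'SPCA']
```

The pattern cannot tell a surname from an acronym, so every such run is a hit.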

Here's my question: while running this search on 4 or
more uppercase characters, I would like to be able to skip
past (ignore) certain commonly occurring 'words' such as
RCMP, QEII, SPCA and such. I want to jump immediately to
the next occurring surname.

I am not particularly good with regexes, and haven't found
anything which seems at all close to being able to do
this.

Thanks for any advice,
John Cordes

