Wednesday, December 23, 2020

Substitute pattern over multiple lines

I'm seeking help with editing a GEDCOM (genealogy) file, using
Vim 8.2 on Windows. Here is a segment of text from the file (the
wording doesn't make sense because I've deleted some lines inside
the NOTEs that aren't relevant to the question):

=======================
1 EVEN
2 TYPE tngnote
2 NOTE I have included the children William, Charles, Alice, and
with his parents in 1881, and with his widowed mother in 1
3 CONC 891 (e.g. see my online transcription of the 1891 Smiths
with James Moser, son of Henry Moser and Mary Henneberry, and his
wife Margaret Woodin; however
3 CONC , I have not yet taken this step.
1 BIRT
=======================

The two lines beginning with "3 CONC" are continuation lines
(CONC = concatenation): their text is appended directly to the
text of the preceding NOTE line.

I want to surround the text of the NOTE with a 'div' tag, so that
the final result looks like this:

=======================
1 EVEN
2 TYPE tngnote
2 NOTE <div class="xxx">I have included the children William,
Charles, Alice, and with his parents in 1881, and with his widowed
mother in 1891 (e.g. see my online transcription of the 1891
Smiths with James Moser, son of Henry Moser and Mary Henneberry,
and his wife Margaret Woodin; however, I have not yet taken this
step.</div>
1 BIRT
=======================

In the complete GEDCOM file (which may have 850,000 or so lines),
a NOTE tag may be followed by 0, 1, 2, or 3 CONC lines (probably
no more than that).

It is this variable number of continuation lines which I find
most difficult to deal with.

For the NOTE tags where there are no continuation lines I believe
this is working:

:g/^2 TYPE tngnote/+1s/^2 NOTE \(.*\)/2 NOTE <div class="xxx">\1<\/div>/

but when there are 1 or more CONC tags following the NOTE I get stuck.

I tried:

:g/^2 TYPE tngnote/+1s/^2 NOTE\(.*\n\(3 CONC \(.*\)\)*\)/2 NOTE <div class="xxx">\1\3<\/div> /

which 'almost' works if there is just 1 CONC tag (though it
leaves "3 CONC" in place which I don't want). So it's pretty bad!
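
One possible way around the variable number of CONC lines, offered
as a sketch that has not been run against the full file. In the
second attempt, \1 spans the NOTE text together with every matched
CONC line verbatim (which is why "3 CONC" survives), and \3 holds
only the text of the last CONC line, because a repeated group keeps
just its final match. Building the replacement with an expression
(see :help sub-replace-expression) avoids both problems. The command
below assumes that each tngnote NOTE line directly follows its
"2 TYPE tngnote" line, that continuation lines always start with
"3 CONC " (tag plus one space), and that joining everything into a
single NOTE line is acceptable. It uses # as the :s delimiter because
the replacement contains a slash.

```vim
" Join each tngnote NOTE with its CONC lines and wrap the text in a div.
" submatch(1) is the text on the NOTE line itself; submatch(2) is the run
" of following "3 CONC" lines (possibly empty), with line breaks as "\n".
" The inner substitute() strips each line break plus its "3 CONC " prefix.
:g/^2 TYPE tngnote/+1s#^2 NOTE \zs\(.*\)\(\%(\n3 CONC .*\)*\)#\='<div class="xxx">' . submatch(1) . substitute(submatch(2), '\n3 CONC ', '', 'g') . '</div>'#
```

Since the \%(...\)* group may match zero times, the same command
should also cover NOTEs with no CONC lines. The joined NOTE line can
become long, so it would be prudent to try this on a copy of the file
and check a few notes with 0, 1, and several CONC lines.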


I realize this is pretty messy looking but I'm hoping one of the
experts who so generously contribute to this group may be able to
give me a pointer for how to deal with this.

Thanks,
John Cordes

