Thursday, November 23, 2017

using vim to add <a href= ...> links to an epub index file

I am currently in the final stages of putting together an epub version of
Auguste Escoffier's _Le Guide Culinaire_.

Since this is a "cookbook" of sorts, the last step before proofreading
pretty much requires building a working index with html style links to
the text relative to each entry.

In an epub context this can be achieved by wrapping the text of each
entry in something of the form:

<a href="../Text/file.xhtml#p0001">index_entry</a>

where "file.xhtml" is one of the files making up the text of the e-book
and "p0001" is the value of an id attribute (< ... id="p0001">) defined
within that file.
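
To make the shape of that link concrete, here is a minimal shell sketch
that assembles it from its three parts. The helper name make_link is
hypothetical; the "../Text/" prefix is the one shown above.

```shell
# make_link FILE ID TEXT: wrap TEXT in a link to the element ID inside FILE.
# (make_link is a hypothetical helper name, used for illustration only.)
make_link() {
    printf '<a href="../Text/%s#%s">%s</a>\n' "$1" "$2" "$3"
}

make_link file.xhtml p0001 index_entry
```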

There are over 6000 entries in this index, which (loudly) suggests that
in this instance it might be worth spending a few hours concocting some
form of automated solution to add all the <a href="..."> links to the file in
one fell swoop rather than doing it manually.

The index is a repetition of lines with the following structure:


<div class="ind-01"></div>
<div class="ind-02">Abatis</div>
<div class="ind-03">621</div>

<div class="ind-01"></div>
<div class="ind-02">    —     à la Bourguignonne</div>
<div class="ind-03">621</div>

...


After loading the index file in a vim buffer I have found that:

1. I can match all page entries in a non-ambiguous manner by a search
with the following pattern: "/\d\+<"

The match, as highlighted via ":set hlsearch", covers the page number
and the "<" that follows it, nothing else, and the cursor sits on the
first digit of the page number.

2. I can invoke the following one-liner from vim with the page number as
an argument and it returns the generated link:


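#!/bin/bash
# Usage: My_script PAGE      (PAGE is zero-padded, e.g. 0621)
# 1. list every pNNNN id as "file:pNNNN"
# 2. turn each pair into an opening <a href="../Text/file#pNNNN"> tag
# 3. keep only the tag(s) containing the requested page

grep -o 'p0[0-9][0-9][0-9]' *.htm | \
awk 'BEGIN { FS=":" } { print "<a href=\"../Text/" $1 "#" $2 "\">" }' | \
grep "$1"

exit 0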


... like so:

:r ! My_script 0621

generates the link and writes it to the vim buffer:

<a href="../Text/gc0306.htm#p0621">
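
Note that the index shows the page as "621" while the script is called
with "0621". As a rough sketch of bridging that gap from the shell, the
two helpers below pull the page number out of an ind-03 line and pad it
to four digits. The names page_numbers and pad_page are hypothetical, and
the four-digit width is only inferred from the "p0621" id above.

```shell
# page_numbers: print the page number found on each ind-03 line of stdin.
page_numbers() {
    sed -n 's|^<div class="ind-03">\([0-9][0-9]*\)</div>$|\1|p'
}

# pad_page N: left-pad N to four digits (621 -> 0621), as used in the ids.
pad_page() {
    # Strip leading zeros first so that printf does not read 089 as octal.
    n=$(printf '%s' "$1" | sed 's/^0*//')
    printf '%04d\n' "${n:-0}"
}

printf '%s\n' '<div class="ind-02">Abatis</div>' '<div class="ind-03">621</div>' |
    page_numbers |
    while read -r page; do pad_page "$page"; done
```

The padded value is what would then be handed to My_script.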

What I am missing at this point:

1. I need to retrieve the matched string of the current "/\d\+<" search
and place it in some kind of vim variable (?) that I can use to
invoke the script, so that it can be done iteratively without having
to type the page number manually:

:r ! My_script $vim_variable

2. I need to find a way to remove any new-line character(s) so that the
output of "My_script $vim_variable" is placed at the right spot in
the buffer: after I invoke the script using ":r ! My_script"... the
output is inserted in column 0 on a new line immediately after the
matching string:


<div class="ind-01"></div>
<div class="ind-02">Abatis</div>
<div class="ind-03">621</div>
<a href="../Text/gc0306.htm#p0621">


3. A third issue is adding the closing "</a>" tag after the targeted
text, thus completing the wrapping of the entry so that the end
result of one iteration looks exactly like this:


<div class="ind-01"></div>
<div class="ind-02"><a href="../Text/gc0306.htm#p0621"> Abatis</a></div>
<div class="ind-03">621</div>
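
For the second and third points, here is a minimal sketch of one
iteration done from the shell rather than inside vim. Command
substitution, $(...), drops the trailing newline of a command's output,
so the tag can be spliced into the middle of a line. The helper name
wrap_entry is hypothetical, and the sketch assumes the tag contains none
of the characters "&", "\" or "|", which are special to the sed command
used here.

```shell
# wrap_entry OPEN_TAG: read index lines on stdin; on each ind-02 line, put
# OPEN_TAG before the entry text and </a> after it. Other lines pass through.
# Assumption: OPEN_TAG contains no "&", "\" or "|".
wrap_entry() {
    sed "s|^\(<div class=\"ind-02\">\)\(.*\)\(</div>\)\$|\1$1\2</a>\3|"
}

# The printf stands in for $(My_script 0621); $(...) strips its newline.
tag=$(printf '%s\n' '<a href="../Text/gc0306.htm#p0621">')
printf '%s\n' '<div class="ind-02">Abatis</div>' | wrap_entry "$tag"
```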


In other words, I need to put together some kind of front-end,
presumably in vimscript (so that I have the ability to navigate the
lines in the buffer), that does the three things described above:

1. grab the current matched string/page number, pass it to the bash
one-liner to generate the corresponding <a href="..."> and return
the result to vim.

2. move the cursor to the first character of the corresponding index
entry (the text and the page number are vertically aligned so that
hitting "k" on the keyboard does exactly that...) and insert the
generated text before the cursor (in other words, what Shift-P would do)

3. jump to the opening "<" of the closing </div> tag and insert "</a>"
before the cursor.
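
Among the "other options", the three steps above could also be sketched
as a single filter run over the whole index, outside vim. This is only
an illustration under stated assumptions, not a tested solution for the
real file: the name link_index is hypothetical, the map file is assumed
to hold the "file:pNNNN" lines produced by the grep -o step of the script
above, and every ind-02 line is assumed to be followed by its ind-03
line.

```shell
# link_index MAP: read the index on stdin and print it with each entry linked.
# MAP is a file of "file:pNNNN" lines, e.g. the output of
#   grep -o 'p0[0-9][0-9][0-9]' *.htm
# Assumptions: MAP is not empty, and each ind-02 line is followed by its
# ind-03 line, as in the excerpt above.
link_index() {
    awk -F: -v pre='<div class="ind-02">' -v suf='</div>' '
        NR == FNR { file[$2] = $1; next }      # first input: load the map
        index($0, pre) == 1 {                  # entry line: hold it back
            if (held != "") print held
            held = $0
            next
        }
        held != "" {                           # line following an entry
            n = length(held) - length(pre) - length(suf)
            tail = substr(held, length(held) - length(suf) + 1)
            if (n >= 0 && tail == suf && match($0, /[0-9]+</)) {
                # Page number without the "<", padded to the pNNNN form.
                id = sprintf("p%04d", substr($0, RSTART, RLENGTH - 1) + 0)
                if (id in file) {
                    text = substr(held, length(pre) + 1, n)
                    held = pre "<a href=\"../Text/" file[id] "#" id "\">" text "</a>" suf
                }
            }
            print held
            held = ""
        }
        { print }
        END { if (held != "") print held }
    ' "$1" -
}
```

Entries whose page has no matching id in the map are printed unchanged,
so they can be found and fixed by hand afterwards. If the function were
saved as a script, vim's ":%!" filter command could presumably be used to
run it over the buffer.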

Another approach I considered is to record a vim macro that reproduces
the manual actions at the keyboard and to run it iteratively against the
buffer. But I doubt that command-line commands such as ":r ! ..." would
be recorded.

Please let me know whether this is at all feasible in vim (vim might
even offer better means of achieving what I am trying to do) or whether
I should look at other options.

Thanks,

CJ
