Monday, December 19, 2011

Re: sh (bash) syntax for here-document strings: embedding other languages

On 2011-12-19 14:00, Timothy Madden wrote:
> I find this everything-everywhere-must-be-a-simple-RE design (for vim
> syntax highlighting) insufficient to describe the syntax of a language.
>
> More high-level features like followed-by, containing and contained
> regions should be present, but with the ability to add alternatives for
> the following and previous regions, to search only in a region as one
> would search in a file, to add names to any regions in order to search
> or extract data from it later or in other places, or to be used in some
> kind of group_exists('name') macro, and some other high-level
> conditionals, using defines or flags declared when matching regions of
> text like defining them from within the REs used for a match.

For followed-by, containing, and contained, isn't that just nextgroup
and contained/contains/containedin?

You can use nextgroup to do some of the stuff you want, but it's a bit of a
hack (the keepend part of the hdBody line is important):

unlet! b:current_syntax | syn include @awkSyntax syntax/awk.vim
unlet! b:current_syntax | syn include @phpSyntax syntax/php.vim

syn region hdBody matchgroup=shRedir contains=hdStart keepend
\ start="<<-\?["']\?\z(.*\)['"]\?" end="\z1"
syn match hdStart "\%(<<.*\n\)\@<=" contained nextgroup=@hdLang

syn cluster hdLang contains=hdAwk,hdPhp
syn region hdAwk start="\%(.*\n\)\{0,4}.*ft=awk" end="\#$" contained contains=@awkSyntax
syn region hdPhp start="\%(.*\n\)\{0,4}.*ft=php" end="\#$" contained contains=@phpSyntax

From :help syntax, "Lexical highlighting might be a better name, but
since everybody calls it syntax highlighting we'll stick with that."

"Lexical highlighting" kind of limits the scope to simple regexes ;)
There are also some advantages to staying within that scope: you might
not always want a match to fail just because the text isn't technically
valid. syn-region is a good example of this; it's nice to have it
highlight a string as a string even if you haven't typed the end quote
yet. With a stricter parser you might get a "syntax error" and not have
it match at all.
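
As a rough sketch of that point (the group name demoString is made up
for illustration and not taken from any shipped syntax file), a region
like this still highlights a string that has no closing quote yet:

```vim
" Hypothetical group: a double-quoted string with backslash escapes.
" Because 'oneline' is not given, the region starts as soon as the
" opening quote matches, even if no closing quote exists yet, so a
" half-typed string is still highlighted as a string.
syn region demoString start=+"+ skip=+\\\\\|\\"+ end=+"+

" Reuse the colors of the standard String group.
hi def link demoString String
```

Adding the oneline argument would give the stricter behaviour instead:
the region would not start unless the end pattern is found on the same
line.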

That being said, I've wanted some higher-level features for the syntax
matching too, and even more so for helping with indent expressions and
omnicompletion. I started writing a BNF-like parser for vim a while
back, but didn't get very far :(
