Tuesday, September 25, 2012

Re: E670: Mix of help file encodings within a language

On 25/09/12 17:58, Marco wrote:
> 2012-09-25 Tony Mechelynck <antoine.mechelynck@gmail.com>:
>
>> Well, is there a bomb on another helpfile?
>>
>> :vimgrep /\%1l/ ~/.vim/*.txt
>> :setl fenc? bomb? " watch for "utf-8" together with "bomb"
>> " or for anything other than utf-8 or latin1
>> :cn|setl fenc? bomb?
>> :cn|setl fenc? bomb?
>> :cn|setl fenc? bomb?
>
> Vim says they're all utf-8 and nobomb.
>
> I did some further experiments. The result is:
>
> The mentioned error is thrown when "one or more but not all" files
> have one or more non-ASCII characters in the *first line*. If all
> files have at least one non-ASCII character, it works fine.
> Non-ASCII characters elsewhere than the first line are not
> problematic. That seems weird. A bug or a feature?
>
>
> Marco
>
>

I don't know; but the first line is what magically appears under ":help
local-additions" as if it were part of $VIMRUNTIME/doc/help.txt.

It should contain the filename and title, as in matchit.txt:

*matchit.txt* Extended "%" matching

and Vim changes stars to bars around the filename in the "Local
additions" list.
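The star-to-bar change can be pictured with a small Python sketch. This only approximates the visible result; it is not Vim's actual code, and the function name local_additions_entry is made up for illustration.

```python
import re

def local_additions_entry(first_line):
    """Approximate how a help file's first line looks under local-additions."""
    # Swap only the leading *tag* for |tag|; the title text is left untouched.
    return re.sub(r"^\*([^*\s]+)\*", r"|\1|", first_line, count=1)

print(local_additions_entry('*matchit.txt* Extended "%" matching'))
```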

Since help.txt must be entirely in a single encoding (though US-ASCII,
Latin1, UTF-8 and many others all have identical representations for
codepoints U+0000 to U+007F), it stands to reason that the first lines of
all "locally added" help files must be in a "compatible" encoding. For
instance on a z/OS EBCDIC system they would, I suppose, all be in some
EBCDIC encoding, compatible with each other but not with ASCII.
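
To make that reasoning concrete, here is a minimal Python sketch. It assumes the condition Marco observed (some, but not all, first lines containing bytes above 0x7F) is what matters; it is a byte-level check, not Vim's own logic. The function names and the dict of file contents are hypothetical, and cp037 stands in for "some EBCDIC encoding".

```python
# Sketch only: a byte-level check, not how Vim itself decides.

def ascii_range_is_shared(text):
    """True if `text` encodes to the same bytes in ASCII, Latin1 and UTF-8."""
    return text.encode("ascii") == text.encode("latin-1") == text.encode("utf-8")

def first_line_is_ascii(data):
    """True if the first line of a file's raw bytes has no byte above 0x7F."""
    first_line = data.split(b"\n", 1)[0]
    return all(byte < 0x80 for byte in first_line)

def first_lines_are_mixed(files):
    """True if some, but not all, files have a non-ASCII first line.

    `files` maps a (hypothetical) file name to that file's raw bytes.
    """
    kinds = {first_line_is_ascii(data) for data in files.values()}
    return len(kinds) == 2

title = '*matchit.txt* Extended "%" matching'
print(ascii_range_is_shared(title))
# cp037 is one EBCDIC code page; its bytes differ from ASCII for the same text.
print(title.encode("cp037") == title.encode("ascii"))
```

Feeding first_lines_are_mixed() the raw bytes of each file in ~/.vim/doc would point at the situation Marco describes, under the assumption stated above.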

Since $VIMRUNTIME/doc/tags is also regenerated using Vim, the same
criterion applies to it, and this is how eval.txt and map.txt can have a
few Latin1 bytes above 0x7F (but not in the first line), while
options.txt, arabic.txt and hebrew.txt (and maybe others) are in UTF-8
with Greek, Arabic and Hebrew letters (respectively) in UTF-8
representation (but, again, not in the first line). I haven't succeeded
in displaying farsi.txt correctly; all I can say is that it seems to be in
some 8-bit encoding which is not Latin1 (and I tried iso-8859-6 too, and
even ":view ++enc=farsi", but without success).
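
For sorting help files into these groups outside Vim, a rough Python sketch follows. It is a heuristic on raw bytes and an assumption on my part, not how Vim picks 'fileencoding'; the name classify and its labels are made up. It cannot tell Latin1 apart from other 8-bit encodings, so both end up as "other 8-bit".

```python
def classify(data):
    """Roughly label raw file bytes by encoding family."""
    if data.startswith(b"\xef\xbb\xbf"):
        return "utf-8 with BOM"      # what Vim reports as utf-8 plus 'bomb'
    try:
        data.decode("ascii")
        return "ascii"               # only bytes 0x00 to 0x7F
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"               # valid multi-byte sequences present
    except UnicodeDecodeError:
        return "other 8-bit"         # Latin1 or some other single-byte encoding

print(classify(b"caf\xe9"))
```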


Best regards,
Tony.
--
Suddenly, Professor Leibowitz realizes he has come to the seminar
without his duck ...

