Tuesday, January 5, 2010

Re: Printing with utf-8 characters on Windows

On Tue, 5 Jan 2010, Chris Jones wrote:

> On debian stable, I also tested printing utf-8 encoded files containing
> samples of CJK, Devanagari, and a couple other Eastern scripts and I was
> unable to get :hardcopy to print their contents.
>
> Since utf-8 is the default encoding on debian Lenny, I find it hard to
> believe that the Vim to Postscript implementation would not function out
> of the box with utf-8 encoded files, and even less plausible that I was
> unable to find anyone reporting this issue while searching online, apart
> from a few reports where Vim 7.0 or older was involved, and dating back
> 7-8 years ago.

Printing UTF-8 text is hard, since PostScript doesn't support it
natively. I was pretty surprised that 'enscript' never made it into the
Unicode age. 'paps' is the only tool I found that seems to do a
reasonable job. Just now, though (while trying to re-find the page I
came across yesterday), I found a few entries under 'Printing' in a
UTF-8 and Unicode FAQ[1].

[1] http://www.cl.cam.ac.uk/~mgk25/unicode.html

CUPS supposedly handles UTF-8 via the texttops filter, but I was unable
to get anything reasonable out of it (even after fiddling with the
'CHARSET=' and '-o document-format=text/plain;charset=' options). I
eventually gave up and replaced /usr/libexec/cups/filter/texttops with
the following script:

#!/bin/sh
# CUPS passes: $1 job-id, $2 user, $3 title, $4 copies, $5 options, $6 file
paps < "$6" | title="$3" perl -lpwe 's/stdin/$ENV{title}/ if 2==$.'

> Leads me to think that there's more to it than the speculations in my
> earlier post today.
>
> Note, that I tried to implement the following in my .vimrc, also without
> success:
>
> | set printexpr=PrintFile(v:fname_in)
> | function PrintFile(fname)
> | call system('paps --font="unifont 8" --paper letter | lpr ' . a:fname)
> | call delete(a:fname)
> | return v:shell_error
>
> The characters from the 'exotic' scripts were replaced by inverted
> question marks or blanks, and _as far as I can tell_ it looked as if the
> same ASCII or latin1 font was used no matter what font I passed to the
> paps converter.
>
> Can anyone shed some light on this matter?

According to the docs, 'printexpr' only affects how the already-generated
PS temp file gets printed. So, if Vim has already substituted the
characters while writing the PS, it's not going to matter what happens
next.

Testing with :ha > test.ps shows that no matter which 'encoding',
'fileencoding', 'printencoding', or 'printmbencoding' I tried, the text
still shows up as latin1 in the resulting PostScript. Which is weird
considering the various charset handling that appears to be done in
src/hardcopy.c.

The only way I was able to get decent printouts was by just shelling out
to paps:

:!paps < % > test.ps
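
If you'd rather not type the redirection each time, the same idea can be
wrapped in a small shell function. This is only a rough sketch: the name
utf8_to_ps and the default output name (input name plus ".ps") are my
own assumptions, and it relies on nothing from paps beyond reading text
on stdin and writing PostScript to stdout, as in the command above.

```shell
#!/bin/sh
# utf8_to_ps: hypothetical helper, not part of paps or Vim.
# Usage: utf8_to_ps INPUT [OUTPUT]
# Converts a UTF-8 text file to PostScript by feeding it to paps on stdin.
utf8_to_ps() {
    src=$1
    dst=${2:-$src.ps}    # assumed default: append ".ps" to the input name

    # Refuse to continue if the input file is missing or unreadable.
    if [ ! -r "$src" ]; then
        echo "utf8_to_ps: cannot read '$src'" >&2
        return 1
    fi

    # Fail early with a clear message if paps is not available.
    if ! command -v paps >/dev/null 2>&1; then
        echo "utf8_to_ps: paps not found in PATH" >&2
        return 127
    fi

    # Same redirection as the one-liner: text in on stdin, PS out on stdout.
    paps < "$src" > "$dst"
}
```

After sourcing that in a shell, something like "utf8_to_ps notes.txt"
should leave the PostScript in notes.txt.ps, which can then be handed to
lpr.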

Best,
Ben
