Saturday, January 2, 2010

Re: Printing with utf-8 characters on Windows


As for ucs-2: I tried it because utf-8 had failed so many times. :(
On Sat, Jan 2, 2010 at 6:16 PM, Minh Duc Thai <thmduc@gmail.com> wrote:
Hello Chris,

I've tried printing on Linux (Linux Mint 8, with Print to PDF as the printer) and the result is the same as on Windows.

I think this is a bug.

On Wed, Dec 23, 2009 at 3:31 AM, Chris Jones <cjns1989@gmail.com> wrote:
On Sun, Dec 20, 2009 at 11:36:27AM EST, Đức Minh Thái wrote:
> Hello,
> I cannot get utf-8 characters printed correctly. For example:
>
> bột
>
> becomes
>
> bá»™t

U+1ED9   ộ   LATIN SMALL LETTER O WITH CIRCUMFLEX AND DOT BELOW

See:

:help ga

In utf-8, this character is encoded by the following sequence of three
bytes:

0xe1, 0xbb, 0x99

See:

:help g8

This is what a utf-8 encoded file with the three characters 'bột'
actually contains:

00000000  62 e1 bb 99 74 0a                                 |b...t.|
00000006

0x62             b   LATIN SMALL LETTER B
0xe1,0xbb,0x99   ộ   LATIN SMALL LETTER O WITH CIRCUMFLEX AND DOT BELOW
0x74             t   LATIN SMALL LETTER T

The final 0x0a is a line feed control character.
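
The same byte sequence can be checked outside Vim. Below is a minimal
sketch in Python, assuming a Python 3.8+ interpreter is at hand (Python
is not part of the original exchange, and the variable names are only
illustrative):

```python
import unicodedata

# The three characters from the example plus the trailing line feed.
# U+1ED9 is written as an escape so the script does not depend on the
# encoding of the editor it was typed in.
text = "b\u1ed9t\n"

# Encode to UTF-8 and format the bytes like the hex dump above.
utf8_hex = text.encode("utf-8").hex(" ")
print(utf8_hex)

# Look up the Unicode name of the middle character.
char_name = unicodedata.name(text[1])
print(char_name)
```

The printed bytes can be compared directly against the dump and the
table above.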

In Microsoft Windows' cp1252:

0xe1    á
0xbb    »
0x99    ™

 http://en.wikipedia.org/wiki/Windows-1252

You do not give much detail as to where you see what, but I am probably
not far off the mark assuming that 'bột' is what you see when editing a
utf-8 encoded file in vim, and that 'bá»™t' is what you see on your
printout.

Being unfamiliar with Microsoft Windows, I'm speculating a bit, but it
does look like your printing software is processing the file as if it
were cp1252 rather than utf-8.
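
This guess can be tested without a printer. The sketch below, again in
Python and only as an illustration (it says nothing about what the
Windows print path really does internally), encodes the string as utf-8
and then deliberately decodes the bytes as cp1252:

```python
# The string as it appears in the editor.
text = "b\u1ed9t"

# What a utf-8 encoded file holds for this string.
utf8_bytes = text.encode("utf-8")

# Misread those bytes as cp1252, as the printing software might.
misread = utf8_bytes.decode("cp1252")
print(misread)

# Undoing the wrong step gives the original string back, which shows
# that no data was lost, only misinterpreted.
recovered = misread.encode("cp1252").decode("utf-8")
print(recovered == text)
```

If the misread string matches what comes out on paper, the bytes
reaching the printer are most likely correct utf-8 that is being
interpreted with the wrong code page.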

> My printing options are:
>
> set printfont=LMMono10:h10 " This is the LMMono from LaTeX Latin Modern
> set printoptions=number:y
> set printencoding=ucs-2le bomb

If your file is utf-8 encoded, why do you tell Vim that it is ucs-2?

:h penc-option

In particular, this help file states that:

Code page 1252 print character encoding is used by default on Windows
and OS/2 platforms.
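
One thing worth keeping in mind about that default: code page 1252 has
no code for U+1ED9 at all. The short Python sketch below only checks the
code page itself; what Vim does with such a character when printing is
not something it demonstrates:

```python
# The Vietnamese character from the example.
char = "\u1ed9"

# Try to express it in cp1252; the codec raises an error if the code
# page has no slot for the character.
try:
    char.encode("cp1252")
    representable = True
except UnicodeEncodeError:
    representable = False

print(representable)
```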

> Please help. Thank you!

I am not familiar with Microsoft Windows, so I don't really have an
answer to your question but you could try:

:set penc=

or..

:set penc=utf-8

and see if the 'bột' string prints correctly.

My understanding is that, when compiled with the appropriate +features,
Vim should be able to process utf-8 encoded files transparently on any
platform, but you may also want to ask Vim to convert the file.

Take a look at:

:h ++enc
:h ++ff

If that doesn't help, please attach a small sample file so that someone
on the list can come up with something more conclusive.

CJ



--
You received this message from the "vim_use" maillist.
For more information, visit http://www.vim.org/maillist.php



--
Minh Duc Thai - StudentID: 0711040
Faculty of Mathematics and Computer Science
University of Science
Vietnam National University - Ho Chi Minh City
227 Nguyen Van Cu street, District 5, Ho Chi Minh City, Vietnam


