Wednesday, May 27, 2015

Re: How to uppercase the non-English characters using Windows-1250 code page?

2015-05-27 14:27 GMT+03:00 Igor Forca <igor2x@gmail.com>:
> Hi,
> on gVim 7.4 on Windows 7 I have a text for example:
> abcčšž
> and I would like to get uppercase of this word, so final result should be all letters upper-cased:
> ABCČŠŽ
>
>
> TEST 1
> 1. Set code pages: :set encoding=utf-8 fileencoding=utf-8
> 2. Type in text: abcčšž
> 3. Normal mode (go uppercase a word): gUaw
> Result is: ABCČŠŽ
> Working fine.
>
>
> TEST 2
> Note: Clear the text with the dd command before continuing with the next test.
> 1. Set code pages: :set encoding=utf-8 fileencoding=cp1250
> Repeat 2 and 3 from TEST 1.
> Result is: ABCČŠŽ
> Working fine.
>
>
> TEST 3
> Note: Clear the text with the dd command before continuing with the next test.
> 1. Set code pages: :set encoding=cp1250 fileencoding=cp1250

I can confirm with

% echo 'abcčšž' | iconv -t CP1250 > /tmp/enctest
% vim -u NONE -i NONE -N --cmd 'set encoding=cp1250' -c 'e ++enc=cp1250 /tmp/enctest' -c 'normal! gUaw' -c 'wqa!'
% cat /tmp/enctest| iconv -f CP1250
ABCčšž

(Gentoo ~amd64 Linux, vim-7.4.711, Unicode locale, no CP1250 locale
compiled). Note: I highly suggest posting the exact steps. Even though
there is a bug, at first glance this looked like the result of
incorrect usage of the &encoding option: it is not supposed to be set
at runtime.

> Repeat 2 and 3 from TEST 1.
> Result is: ABCčšž
> English letters correctly upper-cased, but non-English letters not converted at all.
>
> What is going on? Why are non-English letters not converted if using both code page settings to cp1250? Is this a bug or something else? Do I need to set some other setting?
>
> P.S. I like to use both encoding settings as cp1250, so when writing buffer to file with :write command I don't get the annoying "[converted]" message in status bar. I use cp1250 in 99% of the cases, but in rare cases I need to open non cp1250 file and I temporarily set encoding to utf-8 and in this rare case the "[converted]" message is very useful, to know some code page conversion was performed.

You can get the same effect by saving through a custom function. Setting
the &encoding option at runtime effectively means that all internal
strings change their value for no visible reason. Some plugins are
not prepared for this (though most will not care, AFAIK).

If you happen to finish editing with the cp1250 encoding you can spoil
the viminfo file, because it is saved using the current &encoding, and the
character set supported by UTF-8 is far larger than the one supported by
CP1250.
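
A minimal vimrc sketch of the alternative, assuming the goal is to keep
&encoding fixed at utf-8 and let only the per-file encoding vary. The
option values below are assumptions chosen for a mostly-cp1250 workflow,
not something tested in this thread, and the "[converted]" message will
still appear when a cp1250 file is written:

```vim
" Sketch for a vimrc; the values are assumptions for a mostly-cp1250 workflow.

" Keep Vim's internal encoding fixed at UTF-8. Set it once at startup,
" never at runtime.
set encoding=utf-8

" Encodings to try, in order, when reading an existing file.
" A file that is not valid UTF-8 falls through to cp1250.
set fileencodings=ucs-bom,utf-8,cp1250

" Default 'fileencoding' for newly created buffers.
setglobal fileencoding=cp1250
```

A file containing only ASCII is valid UTF-8 and will be detected as
such; a single file can still be forced with :e ++enc=cp1250, as in the
reproduction above.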

>
> Note:
> set encoding --> internal Vim encoding for buffers etc.
> set fileencoding --> encoding file will be saved in when :write command is executed
> utf-8 --> is UTF-8 universal code page
> cp1250 --> Windows-1250 Latin 2 code page for Slavic languages
>
> Thanks
>
> --
> --
> You received this message from the "vim_use" maillist.
> Do not top-post! Type your reply below the text you are replying to.
> For more information, visit http://www.vim.org/maillist.php
>
> ---
> You received this message because you are subscribed to the Google Groups "vim_use" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to vim_use+unsubscribe@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

