Monday, March 7, 2016

Re: Is there any way to count all latin characters in utf-8 as 1 byte?

2016-03-07 14:54 GMT+03:00 rameo <raiwil@gmail.com>:
> I use searchpos() to capture the start/end columns of matches.
> Then I use the results in Python code to transform the text.
>
> However, I noticed that Latin characters such as 'èéàòìù' are counted as 1 byte in Python but 2 bytes in Vim, and the output is not as expected.
>
> Is there any way to resolve this problem?

In Python you are not using *byte* counts: a unicode string is indexed
by *unicode codepoints*, while the columns returned by `searchpos()`
are byte offsets. You may convert unicode Python objects to bytes
objects with `string.encode(vim.options['encoding'])` and convert back
with `.decode(vim.options['encoding'])`. bytes objects are indexed by
bytes. You may also count codepoints on the Vim side by using
`strchars()`.
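
A minimal sketch of that conversion, in case it helps. The helper names
`byte_col_to_char_index` and `char_index_to_byte_col` are made up for
illustration, and the encoding is passed in as a plain parameter
(assumed to be "utf-8" here) so the snippet runs outside Vim; inside
Vim you would take it from the 'encoding' option instead. It assumes
Python 3 `str` objects and that the byte column falls on a character
boundary, which should hold for match positions.

```python
def byte_col_to_char_index(line, byte_col, encoding="utf-8"):
    """Map a 1-based byte column (the kind searchpos() reports)
    to a 0-based codepoint index into the Python string `line`."""
    raw = line.encode(encoding)
    # Decode only the bytes before the column and count the codepoints.
    # Raises UnicodeDecodeError if byte_col splits a multibyte character.
    return len(raw[:byte_col - 1].decode(encoding))


def char_index_to_byte_col(line, char_index, encoding="utf-8"):
    """Map a 0-based codepoint index back to a 1-based byte column."""
    return len(line[:char_index].encode(encoding)) + 1


line = "perché no"
# 'é' takes 2 bytes in UTF-8, so 'n' sits at byte column 9
# but at codepoint index 7.
print(byte_col_to_char_index(line, 9))   # -> 7
print(char_index_to_byte_col(line, 7))   # -> 9
```

Doing the arithmetic on encoded prefixes keeps the Python side working
in codepoints while still accepting Vim's byte columns unchanged.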

>
> --
> --
> You received this message from the "vim_use" maillist.
> Do not top-post! Type your reply below the text you are replying to.
> For more information, visit http://www.vim.org/maillist.php
>
> ---
> You received this message because you are subscribed to the Google Groups "vim_use" group.
> To unsubscribe from this group and stop receiving emails from it, send an email to vim_use+unsubscribe@googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.

