Tuesday, July 27, 2021

unicode: UTF / UCS

Dear fellow Vim users!

Until recently, it never occurred to me to think about
the text encoding of the files on my hard disk.

I treated Unicode as something defined in my locale
settings, and assumed that my files are utf-8 encoded.

BUT, after a file crash while playing with an old ext2
filesystem and GNU tar, I ended up with a file header
without a file in my inodes. Like a capacitor without
payload :) AND, out of curiosity, I experimented a bit
with Vim, files and utf-8 (on btrfs this time) on an
up-to-date Arch Linux.

Then I realized that there are three encoding views:
keyboard, display (terminal), and Vim. They act like
decoding pipes into an encoded socket. The encoded
socket, the file itself, behaves partly inconsistently
with Vim, xterm and the Unix tool file.

Setting: I create a file in an xterm console using touch.
Then I open it with Vim.

Vim: enc & fenc = utf-8
BUT file -i: us-ascii

The resulting file uses 2 bytes per character, yet looks
like us-ascii inside a Unicode container. However, I
would like to have real Unicode, not some endianness
variant of us-ascii that uses 2 bytes instead of 1.
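
For comparison, here is what the encodings themselves
predict, as a minimal Python sketch. The sample strings are
made up for illustration. Pure ASCII text encoded as utf-8
is byte-for-byte identical to us-ascii, which is one
plausible reason why file -i reports us-ascii for it.

```python
# Sample strings are hypothetical, chosen only for illustration.
ascii_text = "hello vim"
mixed_text = "h\u00e9llo vim"  # same text with one non-ASCII character (e-acute)

utf8_ascii = ascii_text.encode("utf-8")
utf8_mixed = mixed_text.encode("utf-8")

# ASCII-only text: utf-8 and us-ascii produce the same bytes.
same_bytes = utf8_ascii == ascii_text.encode("ascii")

# One byte per ASCII character; the e-acute takes two bytes in utf-8.
print(same_bytes, len(utf8_ascii), len(utf8_mixed))
```

Only once a non-ASCII character is present do the bytes stop
being valid us-ascii.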

Then, in Vim, I change the encoding to ucs-2 with
:set fenc=ucs-2. I read in the Vim documentation that
ucs-2 and utf-8 are similar on Linux. Now on :write, Vim
tells me [converted] and file (sometimes) tells me utf-8,
as expected. The file size increases to 4 bytes per
character, as I would expect for ucs-4. Then, when I
reread the file in Vim, it shows unreadable content. I
have to force it back to ucs-2 with ++enc=ucs-2. So inside
Vim, ucs-2 and utf-8 seem to be different. And on Linux,
ucs-2 uses file space like ucs-4.
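
The size difference can be checked outside Vim. This is a
minimal Python sketch with a made-up three-character sample.
Python has no codec named ucs-2, so utf-16-be stands in for
it here (an assumption of this sketch); both give the same
bytes for characters in the Basic Multilingual Plane.

```python
text = "vim"  # hypothetical sample, BMP characters only

utf8 = text.encode("utf-8")
ucs2 = text.encode("utf-16-be")  # stand-in for ucs-2 (assumption, see above)
ucs4 = text.encode("utf-32-be")  # same bytes as big-endian ucs-4

bytes_per_char = {
    "utf-8": len(utf8) // len(text),
    "ucs-2": len(ucs2) // len(text),
    "ucs-4": len(ucs4) // len(text),
}
print(bytes_per_char)

# Reading the 2-byte data as if it were utf-8 yields NUL characters
# between the letters, i.e. content that looks unreadable.
misread = ucs2.decode("utf-8")
print(repr(misread))
```

By this arithmetic, ASCII letters take 1, 2 and 4 bytes
respectively, so 4 bytes per character would point to ucs-4
rather than ucs-2.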

My speculative reasoning: my system-wide (or kernel-level)
utf-8 differs from real Unicode utf-8 through some abuse
of endianness, maybe for compatibility reasons...
That would be why the file tool works inconsistently
(it sometimes reports binary data instead of a text
encoding).
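
On the endianness guess: byte order only exists for
encodings with multi-byte code units. A small Python sketch
(the sample character is chosen arbitrarily) shows that
utf-16 has two byte orders, while utf-8 has exactly one byte
sequence per character.

```python
import codecs

ch = "A"  # arbitrary sample character

be = ch.encode("utf-16-be")  # high byte first
le = ch.encode("utf-16-le")  # low byte first
u8 = ch.encode("utf-8")      # single byte, no byte order involved

print(be, le, u8)

# Byte order marks that tools can use to guess the encoding.
print(codecs.BOM_UTF16_BE, codecs.BOM_UTF16_LE, codecs.BOM_UTF8)
```

A 16-bit file without a byte order mark contains many NUL
bytes, which may be why a tool that guesses from content
reports binary data.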

Is there a way to ensure that I am working with true utf-8
or, better, utf-16 files? My aim is to keep source
files in Unicode and exclude the deprecated ASCII...
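
One way to check a file independently of Vim and file is a
strict decode. This is only a sketch: the helper name
is_valid_utf8, the sample text and the temporary file name
are all hypothetical.

```python
import os
import tempfile


def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict utf-8 (hypothetical helper)."""
    try:
        data.decode("utf-8", errors="strict")
        return True
    except UnicodeDecodeError:
        return False


sample = "gr\u00fc\u00dfe, vim"  # hypothetical content with non-ASCII letters

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "sample.txt")
    # Write the text explicitly as utf-8, then read the raw bytes back.
    with open(path, "w", encoding="utf-8", newline="\n") as fh:
        fh.write(sample)
    with open(path, "rb") as fh:
        raw = fh.read()

utf8_ok = is_valid_utf8(raw)
# utf-16 output starts with a byte order mark, which is not valid utf-8.
utf16_ok = is_valid_utf8(sample.encode("utf-16"))
print(utf8_ok, utf16_ok)
```

Note that a pure ASCII file also passes this check, because
every us-ascii file is valid utf-8.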

Sincerely
-kefko

--
Wonderful Vim documentation:
When a mapping triggers itself, it will run forever
WEB www.johannes-koehler.de

--
You received this message from the "vim_use" maillist.