Tuesday, October 27, 2020

Re: the :sort command does not appear to give expected result

On Tue, Oct 27, 2020 at 11:55 PM Chris Jones <cjns1989@gmail.com> wrote:
>
> Here's a (test) file that contains a sample of single characters from
> the French alphabet.
>
> Column 1 contains a <tab> character (0x09) and column 2 contains the
> actual letters.
>
> A
> E
> O
> À
> È
> É
> Ô
> Œ
>
> If I use the sort command provided on linux by the GNU coreutils package
> so as to sort this file at the terminal with the following locale:
>
> LANG=en_US.UTF-8
> LANGUAGE=
> LC_CTYPE="en_US.UTF-8"
> LC_NUMERIC="en_US.UTF-8"
> LC_TIME="en_US.UTF-8"
> LC_COLLATE="en_US.UTF-8"
> LC_MONETARY="en_US.UTF-8"
> LC_MESSAGES="en_US.UTF-8"
> LC_PAPER="en_US.UTF-8"
> LC_NAME="en_US.UTF-8"
> LC_ADDRESS="en_US.UTF-8"
> LC_TELEPHONE="en_US.UTF-8"
> LC_MEASUREMENT="en_US.UTF-8"
> LC_IDENTIFICATION="en_US.UTF-8"
> LC_ALL=
>
> ... without changing locales the resulting output appears to be
> sorted the way it should be:
>
> A
> À
> E
> É
> È
> O
> Ô
> Œ
>
> But when I edit the file in vim and run the ':sort / /' command, where
> the '/ /' pattern contains a tab character (0x09), nothing happens.
>
> In other words... the fancy pants letters (À, È, É, Ô, Œ ) stay where
> they are instead of being moved to the spot where they belong.
>
> So I tried launching vim like so:
>
> $ LANG='fr_FR.UTF-8' vim
>
> I noticed that vim was now talking French to me and when I ran the
> ':language' command I saw that vim's locale-related variables were now
> set to the 'fr_FR' locale:
>
> Langue courante pour :
>
> "LC_CTYPE=fr_FR.UTF-8;
> LC_NUMERIC=C;
> LC_TIME=fr_FR.UTF-8;
> LC_COLLATE=fr_FR.UTF-8;
> LC_MONETARY=fr_FR.UTF-8;
> LC_MESSAGES=fr_FR.UTF-8;
> LC_PAPER=fr_FR.UTF-8;
> LC_NAME=fr_FR.UTF-8;
> LC_ADDRESS=fr_FR.UTF-8;
> LC_TELEPHONE=fr_FR.UTF-8;
> LC_MEASUREMENT=fr_FR.UTF-8;
> LC_IDENTIFICATION=fr_FR.UTF-8"
>
> But when I ran the same ':sort / /' command it didn't make any difference.
>
> Am I doing it wrong?
>
> Thanks,
>
> CJ
>
> P.S. I'm using a bit of vim trickery to translate the LaTeX '\index ...'
> etc. stuff to html tags so as to have a basic index with links to
> anchors in the HTML version of the document. Unfortunately the original
> document happens to be in French... and naturally... correct sorting of
> the 'TABLE ALPHABÉTIQUE' is crucial (I do want eggs/œufs to appear under
> letter 'O'... not relegated to the index's last page).
>
> I've read the ':h :sort' doc something like a dozen times and find parts
> of it a little cryptic. Especially when somewhere near the end it says:
> 'Vim does do a "stable" sort.' :-) What's up with that?

A "stable" sort is a sort which will keep lines with the same sort
keys in the order they were before the sort. (If you sort on whole
lines the difference is not visible, unless there exist different
lines which sort as equal, but if you sort on "pattern" or on "first
number" it may matter.)
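
To make the "stable" part concrete, here is a small Python sketch (not Vim's own code; the sample lines and the sort_key helper are made up for illustration). Python's sorted() is documented to be stable, so lines which share a key come out in the order they went in:

```python
# Each line is "key<TAB>payload"; only the key takes part in the comparison,
# much like sorting on a pattern or on the first number instead of whole lines.
lines = ["b\tfirst", "a\tsecond", "b\tthird", "a\tfourth"]

def sort_key(line):
    # Hypothetical helper: everything before the first tab is the sort key.
    return line.split("\t", 1)[0]

# sorted() is stable: "second" stays ahead of "fourth" and "first" stays
# ahead of "third", because those pairs compare equal on the key.
stable = sorted(lines, key=sort_key)

for line in stable:
    print(line)
```

With an unstable sort, the two "a" lines (or the two "b" lines) could legitimately come out in either order.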

But there are a few more sentences near the end of the help for :sort
which may be relevant:

<quote>
The details about sorting depend on the library function used. There is no
guarantee that sorting obeys the current locale. You will have to try it out.
</quote>

$LC_COLLATE is the part of the locale which says how to sort. If
$LC_ALL is set, it overrides all the others; otherwise $LANG is used as
a fallback for any locale variable which is not set. ":lang" with no
arguments lists all settings after taking care of $LANG and/or
$LC_ALL if present.
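
The precedence just described can be sketched in Python as follows. This only illustrates the lookup order; the function name effective_collate is made up, treating an empty value as unset follows the usual POSIX convention, and the final "C" fallback is an assumption (the default is up to the system when nothing is set):

```python
def effective_collate(env):
    """Return the locale name that would govern collation for this environment.

    Lookup order: LC_ALL, then LC_COLLATE, then LANG.
    An empty string counts as "not set".
    """
    for name in ("LC_ALL", "LC_COLLATE", "LANG"):
        value = env.get(name)
        if value:
            return value
    # Assumption: fall back to the "C" locale when nothing is set.
    return "C"

# The environment from the original message: LC_ALL is empty, LANG is set.
print(effective_collate({"LANG": "en_US.UTF-8", "LC_ALL": ""}))

# LC_COLLATE takes precedence over LANG.
print(effective_collate({"LANG": "en_US.UTF-8", "LC_COLLATE": "fr_FR.UTF-8"}))

# LC_ALL overrides everything else.
print(effective_collate({"LANG": "fr_FR.UTF-8", "LC_COLLATE": "fr_FR.UTF-8", "LC_ALL": "C"}))
```

Passing os.environ instead of a literal dict shows what the current shell would hand to a program started from it.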

Best regards,
Tony.

