I was playing with sort in Ubuntu and saw a strange thing. Sorting this file (sort test.txt):

d
auto_design
tst
auto_tuning
autosport

Gives:

auto_design
autosport
auto_tuning
d
tst

But sorting

d
auto_design
tst
auto_tuning
autoaport

Gives

autoaport
auto_design
auto_tuning
d
tst

If it sorts lexicographically, why are "a", "b", "c" less than "_", while letters after "c" are "bigger" than "_"? In the first case it splits the auto_* words with autosport, which does not contain _, and that seems strange to me.

Thanks in advance.

3 Comments

  • Any answer about sort order requires knowing your current locale, LC_ALL and LC_COLLATE settings. Commented Apr 21, 2015 at 22:52
  • If you were to look up "auto_design" in a dictionary, wouldn't you look between "autobahn" and "autodial"? Commented Apr 21, 2015 at 23:02
  • Ergo, it does not sort lexicographically. Lexicographic sorts have limited utility. Commented Apr 22, 2015 at 0:43

1 Answer

The character order used by sort is provided by your current locale settings.

If you want a minimum of surprises and don't need locale-specific character ordering, set LC_COLLATE=C in your environment (or LC_ALL=C if LC_ALL is already set, since LC_ALL overrides LC_COLLATE). This can be scoped to a single command like so:

LC_COLLATE=C sort test.txt

See the glibc documentation on locales for more information on how locales can be configured.
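
As a rough way to see the difference, the sketch below recreates the first input from the question and sorts it under byte-order collation. It uses LC_ALL=C rather than LC_COLLATE=C so that the result does not depend on whether LC_ALL is already set. The scratch directory and the `c_sorted` variable are illustrative names only, and the second sort runs only if an en_US.UTF-8 locale happens to be installed, which is an assumption about your system.

```shell
# Recreate the first input file from the question in a scratch directory.
tmpdir=$(mktemp -d)
printf '%s\n' d auto_design tst auto_tuning autosport > "$tmpdir/test.txt"

# Byte-order collation: '_' (0x5F) sorts before every lowercase letter
# (0x61-0x7A), so the auto_* lines stay together ahead of autosport.
c_sorted=$(LC_ALL=C sort "$tmpdir/test.txt")
printf '%s\n' "$c_sorted"

# Locale-aware collation for comparison, only if en_US.UTF-8 is installed
# (an assumption; the output depends on the system's locale data).
if locale -a 2>/dev/null | grep -Eqi '^en_US\.utf-?8$'; then
  LC_ALL=en_US.UTF-8 sort "$tmpdir/test.txt"
fi

rm -rf "$tmpdir"
```

With byte-order collation the first listing should read auto_design, auto_tuning, autosport, d, tst.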


To give an example of locale-specific collation order: in Estonian, ö sorts after w, whereas it's more typical for ö to sort next to o, somewhere between n and p. In a pure byte-order sort such as the C locale, all such non-ASCII characters come after the entire a-z range.
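
The byte-order half of that claim can be checked without installing any extra locale. This is a small sketch, assuming UTF-8 encoding, in which ö is the two bytes 0xC3 0xB6 (written as octal escapes so the snippet does not depend on the script file's own encoding). The `bytewise` variable is just an illustrative name.

```shell
# Three input lines: "z", "ö" (UTF-8 bytes 0xC3 0xB6), and "a".
# In the C locale sort compares raw bytes, and 0xC3 is greater than 'z' (0x7A),
# so the ö line ends up after the whole a-z range.
bytewise=$(printf 'z\n\303\266\na\n' | LC_ALL=C sort)
printf '%s\n' "$bytewise"
```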

3 Comments

Thanks. Can you say how it sorts when this env variable is empty? Comparing by what?
LC_ALL is checked first and, if set, overrides LC_COLLATE; LC_COLLATE is only consulted when LC_ALL is empty. If no collation order is set anywhere in the system, the default is C (or its synonym POSIX), but a modern operating system will almost always set it somewhere.
Setting LANG will also impact collation order, if LC_COLLATE and LC_ALL are not set.
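
The lookup order for collation (LC_ALL first, then LC_COLLATE, then LANG, then the C default) can be sketched as a small shell function. `effective_collate` is a hypothetical helper written for illustration, not something provided by coreutils or glibc, and it only inspects the environment variables rather than asking the C library.

```shell
# Hypothetical helper: print the locale name that collation would be taken
# from, following the order LC_ALL > LC_COLLATE > LANG > "C".
effective_collate() {
  if [ -n "${LC_ALL:-}" ]; then
    printf '%s\n' "$LC_ALL"
  elif [ -n "${LC_COLLATE:-}" ]; then
    printf '%s\n' "$LC_COLLATE"
  elif [ -n "${LANG:-}" ]; then
    printf '%s\n' "$LANG"
  else
    printf '%s\n' C
  fi
}

# Show which setting applies in the current shell.
effective_collate
```

Running the standard `locale` command with no arguments prints the effective value of each category, including LC_COLLATE, and is the more reliable check on a real system.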
