tremby

Why does GNU sort sort differently on my OSX machine and my Linux machine?

I have an OSX machine where sort runs GNU sort from coreutils 8.26 (installed from Homebrew), and a Linux machine where sort runs GNU sort from coreutils 8.25.

On the Mac:

mac$ echo -e "{1\n2" | sort
2
{1

While on Linux:

linux$ echo -e "{1\n2" | sort
{1
2

I'm aware that sort depends on the locale. I ran locale on the Linux machine, prepended each line of output with export, and ran the resulting lines on the OSX machine. I then ran the sort command again in the same terminal, and the Mac gave the same output as before (2 before {1).
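
To make that step concrete, here is a rough sketch of how it can be done. The helper name to_exports and the file name linux-locale.sh are made up for illustration and are not part of any tool.

```shell
# to_exports: hypothetical helper that prefixes every input line with "export ".
to_exports() {
    sed 's/^/export /'
}

# On the Linux machine: save the locale settings as export statements.
# (linux-locale.sh is a made-up file name.)
locale | to_exports > linux-locale.sh

# On the Mac, after copying the file over, in the same terminal:
#   . ./linux-locale.sh
#   echo -e "{1\n2" | sort
```

Sourcing the file (rather than executing it) matters here, because the exported variables have to end up in the shell that later runs sort.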

I noticed, however, that running locale on the Mac doesn't show all of the lines which appear on Linux, and I'm not sure if this is related.

The locale on Linux:

linux$ locale
LANG=en_CA.UTF-8
LANGUAGE=en_CA:en
LC_CTYPE="en_CA.UTF-8"
LC_NUMERIC="en_CA.UTF-8"
LC_TIME="en_CA.UTF-8"
LC_COLLATE="en_CA.UTF-8"
LC_MONETARY="en_CA.UTF-8"
LC_MESSAGES="en_CA.UTF-8"
LC_PAPER="en_CA.UTF-8"
LC_NAME="en_CA.UTF-8"
LC_ADDRESS="en_CA.UTF-8"
LC_TELEPHONE="en_CA.UTF-8"
LC_MEASUREMENT="en_CA.UTF-8"
LC_IDENTIFICATION="en_CA.UTF-8"
LC_ALL=en_CA.UTF-8

And locale on OSX:

mac$ locale
LANG="en_CA.UTF-8"
LC_COLLATE="en_CA.UTF-8"
LC_CTYPE="en_CA.UTF-8"
LC_MESSAGES="en_CA.UTF-8"
LC_MONETARY="en_CA.UTF-8"
LC_NUMERIC="en_CA.UTF-8"
LC_TIME="en_CA.UTF-8"
LC_ALL="en_CA.UTF-8"

I've found that if I set LC_ALL=C on both machines, they both sort 2 before {1. But if I set LC_ALL=en_CA.UTF-8 on both machines, I get the differing output shown above. The same happens if I set LC_ALL=en_CA.utf8 on both machines. (locale -a lists en_CA.utf8 on the Linux machine but en_CA.UTF-8 on the OSX machine.)
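
The C-locale baseline can be reproduced on either machine with a one-off environment override, sketched below. It uses printf instead of echo -e only to avoid differences between shells; the variable name out is just for illustration.

```shell
# In the C locale, sort compares raw bytes: '2' is 0x32 and '{' is 0x7B,
# so the line "2" comes before the line "{1".
out=$(printf '{1\n2\n' | LC_ALL=C sort)
printf '%s\n' "$out"
```

Setting LC_ALL on the command line like this affects only that one sort invocation, so it does not disturb the rest of the terminal session.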

Any idea what is going on here?