9

I want to sort csv as follow, what I want is

  1. sort by column 2
  2. if column is the same, sort by column 3(numerically)

here is what I do:

$ sort  -t","  -k2 -nk3  /tmp/test.csv
55b64670abb9c0663e77de84,525e3bfad07b4377dc142a24:9999,0.081032
5510b33ec720d80086865312,525e3bfad07b4377dc142a24:9999,0.081033
55aca6a1d2e33dc888ddeb31,525e3bf7d07b4377d31429d2:2,0.081034
55aca6a1d2e33dc888ddeb31,525e3bf7d07b4377d31429d2:2,0.081034
5514548ec720d80086bfec46,525e3bfad07b4377dc142a24:9999,0.081035
551d4e21c720d80086084f45,525e3bfad07b4377dc142a24:9999,0.081036
557bff5276bd54a8df83268a,525e3bfad07b4377dc142a24:9999,0.081036

this result is strange, it sorts by the column three first, then by column 2

3

3 Answers 3

7

This command seems to yield correct output:

sort -t"," -k2,2 -k3,3n /tmp/test.csv

I use comma to constrain order to that column only, and use the numeric (-n) switch to last character in the third column.

It yields:

55aca6a1d2e33dc888ddeb31,525e3bf7d07b4377d31429d2:2,0.081034
55aca6a1d2e33dc888ddeb31,525e3bf7d07b4377d31429d2:2,0.081034
55b64670abb9c0663e77de84,525e3bfad07b4377dc142a24:9999,0.081032
5510b33ec720d80086865312,525e3bfad07b4377dc142a24:9999,0.081033
5514548ec720d80086bfec46,525e3bfad07b4377dc142a24:9999,0.081035
551d4e21c720d80086084f45,525e3bfad07b4377dc142a24:9999,0.081036
557bff5276bd54a8df83268a,525e3bfad07b4377dc142a24:9999,0.081036
Sign up to request clarification or add additional context in comments.

Comments

1

While the sort command has some hacks to partially handle CSV files, it won't handle all CSV format features. csvsort is a great option:

csvsort -c 2,3 /tmp/test.csv

1 Comment

Nice answer ! This should be the accepted one
0

Sort will work sorting data on csv & txt file , it will print the output on console

-t says columns are delimited by '|' , -k1 -k2 says that-- it will sort te data by column 1 & then by 2

$ sort -t '|' -k1 -k2 <INPUT_FILE>

For storing the result in output file use following command

$ sort -t '|' -k1 -k2 <INPUT_FILE> -o <OUTPUTFILE>

If you wann do it with ignoring header line then use following command

(head -n1 INPUT_FILE && sort <(tail -n+2 INPUT_FILE)) > OUTPUT_FILE

head -n1 INPUT_FILE which will print only the first line of your file i.e. header

& This special tail syntax gets your file from second line up to EOF.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.