Get Rows based on column value from csv

Question

I have a CSV file with the following data:

10.000.00.00,D3,1
10.001.00.00,C4,2
10.002.00.00,C5,2
10.000.88.99,B1,3
10.000.00.00,B2,3
10.000.00.00,C6,3
10.000.99.00,D1,3

I tried the following code:

cat Data.csv | awk -F , '$3 == "3" { print }'

I need to get only the rows whose last value is 3.

Please let me know how to do this.

What's wrong with your code? Your code does exactly what it's supposed to do. Maybe a little awkward.
– Cyrus, Commented Apr 1, 2019 at 17:19
Assuming your posted code doesn't do what you expected - your input file has DOS line-endings. Use cat -v Data.csv to see them and then dos2unix or similar to remove them. See stackoverflow.com/a/45772568/1745001 for details.
– Ed Morton, Commented Apr 1, 2019 at 17:25
@Sandy: To avoid the described problem, you can append the following to your mawk or gawk command to handle DOS and Unix line-endings: -v RS='\n|\r\n'
– Cyrus, Commented Apr 1, 2019 at 17:30
@Cyrus '\n|\r\n' = \r?\n. That approach will fail of course if there truly are supposed to be DOS line endings such as from an Excel export to a CSV where lines end in \r\n but can contain \ns inside quoted fields.
– Ed Morton, Commented Apr 1, 2019 at 17:45
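
To illustrate the line-ending problem described in these comments, here is a minimal sketch. It assumes a small made-up sample written to a temporary file (the variable names are illustrative, not from the question), and it strips the carriage return inside awk with sub() rather than with dos2unix, which is a swapped-in alternative.

```shell
# Build a small sample with DOS (CRLF) line endings in a temporary file.
sample=$(mktemp)
printf '10.000.00.00,D3,1\r\n10.000.88.99,B1,3\r\n10.000.99.00,D1,3\r\n' > "$sample"

# String comparison fails here: the last field is really "3\r", not "3".
before=$(awk -F, '$3 == "3"' "$sample")

# Strip a trailing carriage return from each record first; changing $0
# makes awk split the fields again, so $3 becomes a clean "3".
after=$(awk -F, '{ sub(/\r$/, "") } $3 == "3"' "$sample")

printf 'before: [%s]\nafter:\n%s\n' "$before" "$after"
rm -f "$sample"
```

With CRLF input the first command should select nothing, while the second selects the two rows ending in 3 (and prints them without the carriage return).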

James Brown · Accepted Answer · 2019-04-02 09:06:29Z

6

Using awk to get only the rows having last values as 3:

$ awk -F, '$NF==3' file
10.000.88.99,B1,3
10.000.00.00,B2,3
10.000.00.00,C6,3
10.000.99.00,D1,3

Explained:

awk -F, '  # set the field separator to a comma
$NF==3     # NF is the number of fields, so $NF is the value of the last field
' file     # (see the comments for more; thanks @kvantour)

edited Apr 2, 2019 at 9:06

answered Apr 1, 2019 at 16:51


1 Comment

kvantour Over a year ago

This works because (a) a numeric comparison is enforced, as fields are both numeric and string at the same time, and (b) $NF is converted to a numeric value using strtod, which ignores trailing unrecognized characters (such as \r).
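
The numeric conversion mentioned in this comment can be made explicit. The sketch below is not from the answer: it assumes a made-up CRLF sample in a temporary file and adds 0 to $NF to force the conversion, so the result does not depend on how a particular awk classifies a field that ends in \r.

```shell
# Sample with DOS (CRLF) line endings, so the last field is "1\r", "3\r", and so on.
sample=$(mktemp)
printf '10.000.00.00,D3,1\r\n10.000.88.99,B1,3\r\n10.000.99.00,D1,3\r\n' > "$sample"

# Adding 0 forces $NF to be converted to a number; the conversion reads the
# leading "3" and stops at the carriage return.
matches=$(awk -F, '$NF + 0 == 3 { n++ } END { print n + 0 }' "$sample")

echo "rows whose last field converts to 3: $matches"
rm -f "$sample"
```

Note that this test is looser than a string match: a last field such as 3.0 or 3abc would also convert to 3.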

ctac_ · Accepted Answer · 2019-04-01 18:02:29Z

2

You can try sed:

sed '/,3$/!d' infile

If lines may end with \r, try this instead (GNU sed understands \r; other sed implementations may not):

sed '/,3\r*$/!d' infile
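
As a sketch of how the delete-based command behaves, the following compares it with the equivalent sed -n ... p form on a made-up sample. The last part is an assumption of mine rather than part of the answer: for sed implementations without \r support, the carriage returns are removed with tr before matching, which also removes them from the output.

```shell
# LF-only sample in a temporary file.
sample=$(mktemp)
printf '10.000.00.00,D3,1\n10.000.88.99,B1,3\n10.000.99.00,D1,3\n' > "$sample"

# Delete every line that does NOT end in ",3" (the form used in the answer).
kept=$(sed '/,3$/!d' "$sample")

# Equivalent: suppress automatic printing and print only matching lines.
kept_alt=$(sed -n '/,3$/p' "$sample")

# CRLF sample: drop carriage returns with tr, then match as before.
crlf=$(mktemp)
printf '10.000.88.99,B1,3\r\n10.001.00.00,C4,2\r\n' > "$crlf"
kept_crlf=$(tr -d '\r' < "$crlf" | sed -n '/,3$/p')

printf '%s\n---\n%s\n' "$kept" "$kept_crlf"
rm -f "$sample" "$crlf"
```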

edited Apr 1, 2019 at 18:02

answered Apr 1, 2019 at 17:57


Allan · Accepted Answer · 2019-04-02 07:42:11Z

2

Why do we need awk or sed for this kind of operation in the first place? Isn't that overkill?

The OP is asking how to extract the lines that meet a specific condition from the file, without modifying their format.

grep is THE perfect tool for this.

$ grep ',3$' Data.csv 
10.000.88.99,B1,3
10.000.00.00,B2,3
10.000.00.00,C6,3
10.000.99.00,D1,3

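If the file has Windows EOLs, use grep -E $',3\r?$' Data.csv in bash instead; the $'...' quoting passes a literal carriage return to grep, since grep -E does not itself interpret \r as an escape.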

Also, avoid cat <FILE> | <COMMAND> as much as possible; instead, pass the file directly to the command, or redirect the command's stdin from the file (<COMMAND> < <FILE>).
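
A minimal sketch of the grep approach for files with Windows EOLs, assuming a made-up CRLF sample in a temporary file. Instead of relying on shell-specific quoting, it stores a literal carriage return in a variable (cr is an illustrative name) and passes the file to grep directly rather than through cat.

```shell
# Sample with DOS (CRLF) line endings in a temporary file.
sample=$(mktemp)
printf '10.000.00.00,D3,1\r\n10.000.88.99,B1,3\r\n10.000.99.00,D1,3\r\n' > "$sample"

# Put a literal carriage return in a variable; command substitution only
# strips trailing newlines, so the \r survives.
cr=$(printf '\r')

# The carriage return is optional, so the same pattern works for LF-only
# and CRLF files. -c prints the number of matching lines.
count=$(grep -c -E ",3${cr}?\$" "$sample")

echo "matching rows: $count"
rm -f "$sample"
```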

edited Apr 2, 2019 at 7:42

answered Apr 2, 2019 at 7:34


Nitin Tripathi · Accepted Answer · 2019-04-02 08:58:24Z

0

You can use a built-in awk variable for this.

In our case, that variable is NF, the number of fields in the current record, so $NF is the value of the last field:

awk -F, '{if($NF == 3) {print $0} }' Data.csv
10.000.88.99,B1,3
10.000.00.00,B2,3
10.000.00.00,C6,3
10.000.99.00,D1,3
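
To make the difference between NF and $NF concrete, here is a small sketch on a made-up two-row sample (LF line endings assumed). It also checks that the explicit if/print form above selects the same rows as a bare pattern with no action.

```shell
# Two-row, LF-only sample in a temporary file.
sample=$(mktemp)
printf '10.000.00.00,D3,1\n10.000.88.99,B1,3\n' > "$sample"

# NF is the field count; $NF is the content of the last field.
fields=$(awk -F, '{ print NF, $NF }' "$sample")

# A bare pattern prints every matching record, just like the if/print form.
short=$(awk -F, '$NF == 3' "$sample")
long=$(awk -F, '{ if ($NF == 3) { print $0 } }' "$sample")

printf '%s\n---\n%s\n' "$fields" "$short"
rm -f "$sample"
```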

You can learn more about built-in variables at the following link: Awk Built in Variables

answered Apr 2, 2019 at 8:58

