1

I have a CSV file with the following data:

10.000.00.00,D3,1
10.001.00.00,C4,2
10.002.00.00,C5,2
10.000.88.99,B1,3
10.000.00.00,B2,3
10.000.00.00,C6,3
10.000.99.00,D1,3

I tried the code below:

cat Data.csv | awk -F , '$3 == "3" { print }'

I need to get only the rows whose last value is 3.

Please let me know how to do this.

5
  • What's wrong with your code? Your code does exactly what it's supposed to do. Maybe a little awkward. Commented Apr 1, 2019 at 17:19
  • Assuming your posted code doesn't do what you expected - your input file has DOS line-endings. Use cat -v Data.csv to see them and then dos2unix or similar to remove them. See stackoverflow.com/a/45772568/1745001 for details. Commented Apr 1, 2019 at 17:25
  • @Sandy: To avoid the described problem, you can append the following to your mawk or gawk command to handle DOS and Unix line-endings: -v RS='\n|\r\n' Commented Apr 1, 2019 at 17:30
  • @Cyrus '\n|\r\n' = \r?\n. That approach will fail of course if there truly are supposed to be DOS line endings such as from an Excel export to a CSV where lines end in \r\n but can contain \ns inside quoted fields. Commented Apr 1, 2019 at 17:45
  • @EdMorton: That's correct. Commented Apr 1, 2019 at 17:53
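A minimal sketch of the diagnosis from the comments above, assuming a hypothetical file sample.csv that is written here with CRLF endings. It counts matches for the string comparison from the question before and after stripping the trailing carriage return inside awk (one alternative to running dos2unix first):

```shell
# sample.csv is a hypothetical file, written here with DOS (CRLF) line endings.
printf '10.000.00.00,D3,1\r\n10.000.88.99,B1,3\r\n10.000.00.00,B2,3\r\n' > sample.csv

# String comparison on the raw file: the third field is "3\r", so nothing matches.
before=$(awk -F, '$3 == "3" { n++ } END { print n + 0 }' sample.csv)

# Strip one trailing carriage return per record first; awk then re-splits the fields.
after=$(awk -F, '{ sub(/\r$/, "") } $3 == "3" { n++ } END { print n + 0 }' sample.csv)

echo "matches before stripping: $before, after stripping: $after"
```

Unlike changing RS, this leaves record splitting alone and only removes a carriage return at the very end of each record.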

4 Answers

6

Using awk to get only the rows whose last value is 3:

$ awk -F, '$NF==3' file
10.000.88.99,B1,3
10.000.00.00,B2,3
10.000.00.00,C6,3
10.000.99.00,D1,3

Explained:

awk -F, '  # set the field separator to a comma
$NF==3     # NF is the number of fields, so $NF is the value of the last field
' file     # (see comments for more; thanks @kvantour)

1 Comment

This works because (a) a numeric comparison is performed, since fields read from input are treated as both numeric and string at the same time, and (b) $NF is converted to a numeric value using strtod, which ignores trailing unrecognized characters (such as \r).
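To illustrate point (a) without involving carriage returns, here is a small sketch using a made-up record whose last field is 3.0. Comparing against the number 3 is done numerically, while comparing against the string "3" is done character by character:

```shell
# "3.0" read from input looks like a number, so awk treats it as a numeric string.
result=$(printf 'x,3.0\n' | awk -F, '{ a = ($NF == 3); b = ($NF == "3"); print a, b }')

# First value: numeric comparison (3.0 equals 3).
# Second value: string comparison ("3.0" is not the same text as "3").
echo "$result"
```

The same distinction explains why $3 == "3" in the question is stricter than $NF==3 in this answer.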
2

You can try with sed:

sed '/,3$/!d' infile

If the lines may end with \r, try this instead (it needs a sed that understands \r in patterns, such as GNU sed):

sed '/,3\r*$/!d' infile
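For a sed that does not understand \r, one possible variation (an assumption on my part, not part of the original answer) is to put a literal carriage return into a shell variable and splice it into the pattern. The file name sample.csv is hypothetical, and -n with p is used in place of !d only to keep ! out of a double-quoted string:

```shell
# sample.csv is a hypothetical file with DOS (CRLF) line endings.
printf '10.000.00.00,D3,1\r\n10.000.88.99,B1,3\r\n' > sample.csv

# Hold a literal carriage return in a variable instead of writing \r in the pattern.
cr=$(printf '\r')

# Print only lines ending in ",3" followed by zero or more carriage returns,
# then drop the carriage returns from the output for display.
kept=$(sed -n "/,3${cr}*\$/p" sample.csv | tr -d '\r')
echo "$kept"
```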


2

Why do we need awk or sed for this kind of operation in the first place? Isn't it overkill?

The OP is asking how to extract the lines that meet a specific condition from the file, without even modifying their format.

grep is THE perfect tool for this.

$ grep ',3$' Data.csv 
10.000.88.99,B1,3
10.000.00.00,B2,3
10.000.00.00,C6,3
10.000.99.00,D1,3

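Alternatively, if the file has Windows EOLs, use grep -E $',3\r?$' Data.csv in bash. grep -E itself does not interpret \r as a carriage return, so the $'...' quoting is what turns it into a literal one.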

Also, avoid cat <FILE> | <COMMAND> whenever possible; instead, pass the file directly to the command or redirect the command's stdin from the file (<COMMAND> < <FILE>).
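As a quick sketch of that last point, the three invocations below produce the same lines; only the first spawns an extra cat process and a pipe. The file sample.csv is a hypothetical stand-in for Data.csv:

```shell
# sample.csv is a small hypothetical stand-in for the file from the question.
printf '10.000.00.00,D3,1\n10.000.88.99,B1,3\n' > sample.csv

piped=$(cat sample.csv | grep ',3$')      # extra process and pipe
as_arg=$(grep ',3$' sample.csv)           # file passed as an argument
redirected=$(grep ',3$' < sample.csv)     # stdin redirected from the file

printf '%s\n' "$piped" "$as_arg" "$redirected"
```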


0

You can use a built-in awk variable for this.

In our case:

$NF - NF is the number of fields in the current record, so $NF is the last field

awk -F, '{if($NF == 3) {print $0} }' Data.csv
10.000.88.99,B1,3
10.000.00.00,B2,3
10.000.00.00,C6,3
10.000.99.00,D1,3

You can learn more about built-in variables at the following link: Awk Built in Variables
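A tiny sketch of the difference between NF and $NF, using two made-up records with different field counts:

```shell
# NF is the field count of the current record; $NF is the content of that last field.
summary=$(printf 'a,b,c\nd,e\n' | awk -F, '{ print NF, $NF }')
echo "$summary"
```

Because $NF always refers to the last field, the filter keeps working even if rows have different numbers of columns.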
