how to extract columns from a text file with bash

Question

I have a text file like this.

 res          ABS   sum     
 SER A   1   161.15 138.3  
 CYS A   2    66.65  49.6  
 PRO A   3    21.48  15.8  
 ALA A   4    77.68  72.0  
 ILE A   5    15.70   9.0  
 HIS A   6    10.88   5.9

I would like to extract the names in the first column (res) based on the values in the last column (sum). I have to print resnames if sum > 25 and sum < 25. How can I get the output like this?

Per the conversation in the comments, can you clarify when you actually want resnames printed? No number can be both less than 25 and greater than 25. Do you want resnames printed if sum != 25, or do you want it printed if, for instance, sum < 25 OR ABS > 25?
– Tim Pote, Commented Apr 28, 2012 at 15:28

nullpotent · Accepted Answer · 2012-04-28 13:48:54Z

1

This should do it:

awk 'BEGIN{FS=OFS=" "} NR > 1 {if($5 != 25) print $1}' bla.txt

Tim Pote · Accepted Answer · 2012-04-28 13:47:46Z

1

While you can do this with a while read loop in bash, it's easier, and most likely faster, to use awk:

awk 'NR > 1 && $5 != 25 { print $1 }'

Here NR > 1 skips the header line. Note that your logic, "print resnames if sum >25 and sum<25", read as "greater than or less than 25", is the same as "print if sum != 25".
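As a rough, self-contained sketch of the one-liner in action, the snippet below recreates the sample table from the question in a temporary file and runs the filter on it. The variable names `sample`, `limit`, and `not_equal` are made up for this illustration, and skipping the header with `NR > 1` assumes the file has exactly one header line.

```shell
# Recreate the sample table from the question in a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
 res          ABS   sum
 SER A   1   161.15 138.3
 CYS A   2    66.65  49.6
 PRO A   3    21.48  15.8
 ALA A   4    77.68  72.0
 ILE A   5    15.70   9.0
 HIS A   6    10.88   5.9
EOF

# NR > 1 skips the header row; limit (a hypothetical name) is passed in with -v.
not_equal=$(awk -v limit=25 'NR > 1 && $5 != limit { print $1 }' "$sample")
echo "$not_equal"

rm -f "$sample"
```

Since none of the sample sums is exactly 25, every residue name is printed, which is probably not what the question intended.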


4 Comments

Kevin Over a year ago

Sum > 25 and sum < 25 leaves nothing. Or would.

user unknown Over a year ago

@TimPote: So 25 is meanwhile > 25 and < 25?

Tim Pote Over a year ago

@userunknown Ah, I see: apply actual boolean logic. Sorry, I parsed that as what he meant, not what he actually said. Yes, actual boolean logic would say that he needs to print nothing.

user unknown Over a year ago

Mh, yes, but what he meant might be two functions: one which selects the lower values and one which selects the higher values, but not both at the same time. However, from your sample code it is trivial to make two specialized ones. What about the headline?
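The "two specialized ones" mentioned in the comment could look roughly like this. It is only a sketch: it assumes the sample table from the question with a single header line, and the names `sample`, `limit`, `above`, and `below` are invented for the example.

```shell
# Recreate the sample table from the question in a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
 res          ABS   sum
 SER A   1   161.15 138.3
 CYS A   2    66.65  49.6
 PRO A   3    21.48  15.8
 ALA A   4    77.68  72.0
 ILE A   5    15.70   9.0
 HIS A   6    10.88   5.9
EOF

# One filter for sums above the limit, one for sums below it.
# awk compares the fifth field numerically, so decimals are handled.
above=$(awk -v limit=25 'NR > 1 && $5 > limit { print $1 }' "$sample")
below=$(awk -v limit=25 'NR > 1 && $5 < limit { print $1 }' "$sample")

echo "above 25:"; echo "$above"
echo "below 25:"; echo "$below"

rm -f "$sample"
```

With the sample data, the first filter selects SER, CYS, and ALA, and the second selects PRO, ILE, and HIS.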

Will Demaine · Accepted Answer · 2012-04-28 13:49:44Z

1

Consider using awk. It's a simple tool for processing columns of text (and much more). Here's a simple awk tutorial which will give you an overview. If you want to use it within a bash script, then this tutorial should help.

Run this on the command line to give you an idea of how you could do it:

$ echo "SER A   1   161.15 138.3" | awk '{ if($5 > 25) print $1}'
SER
$ echo "SER A   1   161.15 138.3" | awk '{ if($5 > 140) print $1}'
$
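For use inside a script, one possible approach is to wrap the awk call in a small shell function. The function name `res_above` and its argument order are assumptions made for this sketch, as is the single header line skipped by `NR > 1`.

```shell
# res_above LIMIT FILE
# Print column 1 of every data row in FILE whose column 5 exceeds LIMIT.
# (Hypothetical helper; assumes one header line and whitespace-separated columns.)
res_above() {
    awk -v limit="$1" 'NR > 1 && $5 > limit { print $1 }' "$2"
}

# Usage: build a small input file and loop over the selected names.
sample=$(mktemp)
printf '%s\n' ' res          ABS   sum' \
              ' SER A   1   161.15 138.3' \
              ' PRO A   3    21.48  15.8' > "$sample"

for name in $(res_above 25 "$sample"); do
    echo "residue above limit: $name"
done

rm -f "$sample"
```

Passing the threshold with `-v` keeps the awk program in single quotes, so no shell variable has to be spliced into the awk source.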


user unknown · Accepted Answer · 2012-04-28 14:14:53Z

0

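while read -r line
do
  v=($line)        # split the line into fields on whitespace
  sum=${v[4]}      # fifth field (sum)
  # drop the fractional part, since bash arithmetic is integer-only
  ((${sum/.*/} >= 25)) && echo "${v[0]}"
done < file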

You need to skip the first line.

Since bash doesn't handle floating-point values, the fractional part is dropped before the comparison, so this also prints a sum of exactly 25, which isn't strictly bigger than 25.
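The effect of the truncation can be seen in isolation with a few test values. This sketch uses the POSIX `${1%.*}` expansion rather than the `${sum/.*/}` form from the loop above (both strip everything from the decimal point onward), and the helper name `int_part` is made up for the illustration.

```shell
# int_part VALUE: print VALUE with everything from the decimal point removed.
# (Hypothetical helper used only for this demonstration.)
int_part() {
    printf '%s\n' "${1%.*}"
}

for sum in 24.9 25.0 25.7; do
    whole=$(int_part "$sum")
    if [ "$whole" -ge 25 ]; then
        echo "$sum passes the integer test (compared as $whole)"
    else
        echo "$sum fails the integer test (compared as $whole)"
    fi
done
```

Both 25.0 and 25.7 are compared as 25, so an integer test cannot tell "exactly 25" apart from "a little more than 25".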

This can be handled by calling bc for the arithmetic.

tail -n +2 ser.dat | while read -r line
do
  v=($line)
  sum=${v[4]}
  # bc prints 1 if the comparison holds and 0 otherwise
  (( $(echo "$sum > 25" | bc) )) && echo "${v[0]}"
done

edited Apr 28, 2012 at 14:14

answered Apr 28, 2012 at 14:02


jpmuc · Accepted Answer · 2012-04-29 20:24:13Z

0

What about the good old cut? :)

Say you would like to have the second column:

cat your_file.txt | sed -E 's, +, ,g' | cut -d" " -f 2

What is sed doing in this command? cut expects columns to be separated by a single delimiter character (see documentation), so sed first collapses each run of spaces into one space.
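To see why the runs of spaces matter, here is a small sketch that asks cut for the third field of one sample row, first on the raw line and then after squeezing repeated spaces. It uses `tr -s ' '` for the squeezing step instead of the sed command above, and the variable names `line`, `raw`, and `squeezed` are made up for the example.

```shell
line='SER A   1   161.15 138.3'

# Every single space is a delimiter for cut, so the three spaces after "A"
# produce empty fields and field 3 is empty.
raw=$(printf '%s\n' "$line" | cut -d' ' -f3)

# After squeezing runs of spaces into one, field 3 is the residue number.
squeezed=$(printf '%s\n' "$line" | tr -s ' ' | cut -d' ' -f3)

echo "without squeezing: '$raw'"
echo "with squeezing:    '$squeezed'"
```

Note that a line starting with a space still has an empty first field for cut, even after squeezing.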


1 Comment

Tim Pote Over a year ago

Just to let you know, there are a few problems with this solution. First of all, it's a prime example of a Useless Use of Cat: sed can work on files without the need for the pipe. Second, awk doesn't have the field separator limitations of cut, so you could do the same with a single print $2 in awk without the need for sed. Third, it doesn't do what the OP asks. They wanted to print the first field (res) conditionally. Yours always prints the second field.
