Linux shell script read columns into variable and then add the attribute

Question

I have a file test.txt looking like this:

2092 Mary
103 Tom
1239 Mary
204 Mark
1294 Tom
1092 Mary

I am trying to create a shell script that will:

1. Read each line and put the data from the two columns into the variables var1 and var2.
2. If var2 is the same on several lines, add up the var1 values from those lines.
3. Write the result to a text file.

The result should be unique values in the var2 column. Here's what I have so far:

#!/bin/sh
#!/usr/bin/sh
cat test.txt| while read line;
do
$var1=$(echo $line| awk -F\; '{print $1}')
$var2=$(echo $line| awk -F\; '{print $2}')

How can I reference the variable in each line and then compare them?
The expected output would be:

4423 Mary
1397 Tom 
204  Mark

What about using sort -s -n -k 1 and then -u to get unique values?
– fedorqui, Commented Mar 20, 2013 at 18:40
There's no done? And for $var1=, you probably mean var1=.... Also, there's no need to split the line in such a convoluted way: while read num name; do ... ; done < test.txt. Finally, I agree with fedorqui that sort can do what you need, unless this is just a sketch of your problem. Doing uniqueness in shell code without sort and/or uniq is not trivial. Good luck!
– shellter, Commented Mar 20, 2013 at 18:45
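
To make shellter's suggestion concrete, here is a minimal sketch of reading the two columns straight into variables. It assumes whitespace-separated input like the question's test.txt; the temporary file and the lines and last variables are illustrative only.

```shell
#!/bin/sh
# Recreate the question's sample data in a temporary file (illustrative path).
input=$(mktemp)
printf '%s\n' '2092 Mary' '103 Tom' '1239 Mary' '204 Mark' '1294 Tom' '1092 Mary' > "$input"

lines=0
last=
# read splits each line on whitespace: first field -> num, remainder -> name.
# Redirecting the file into the loop (instead of piping cat into it) keeps
# the loop in the current shell, so lines and last survive after done.
while read -r num name
do
    printf 'num=%s name=%s\n' "$num" "$name"
    lines=$((lines + 1))
    last="$num $name"
done < "$input"

rm -f "$input"
echo "read $lines lines; last was: $last"
```

Note that assignments are written var1=..., without a leading $; the $ is only used when reading a variable's value.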

Jonathan Leffler · Accepted Answer · 2013-03-20 20:50:57Z

Using awk it is easy:

awk '{sum[$2] += $1} END {for (i in sum) printf "%4d %s\n", sum[i], i; }'
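
As a usage sketch, the same awk program can be run against the question's test.txt and written to a file. The temporary directory and the result.txt name are assumptions for illustration. Because awk does not guarantee any order for "for (i in sum)", the output is piped through sort here to make it stable.

```shell
#!/bin/sh
# Recreate the question's test.txt in a temporary directory (illustrative path).
tmpdir=$(mktemp -d)
printf '%s\n' '2092 Mary' '103 Tom' '1239 Mary' '204 Mark' '1294 Tom' '1092 Mary' > "$tmpdir/test.txt"

# Sum column 1 per distinct value of column 2, then sort by sum, descending.
result=$(awk '{sum[$2] += $1} END {for (i in sum) printf "%4d %s\n", sum[i], i}' "$tmpdir/test.txt" |
         sort -k 1,1nr)

# Write the result to a text file, as the question asks, and show it.
printf '%s\n' "$result" > "$tmpdir/result.txt"
cat "$tmpdir/result.txt"

rm -rf "$tmpdir"
```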

If you want to do it in bash, you need version 4.x or later (associative arrays are not available in 3.x):

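declare -A sum                  # associative array: name -> running total
while read -r number name       # -r stops read from interpreting backslashes
do
    ((sum[$name] += $number))
done

for name in "${!sum[@]}"        # iterate over the keys (the names)
do
    echo "${sum[$name]} $name"
done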

The structure here is essentially isomorphic with the awk script, but a little less notationally convenient. It will read from standard input, using the names as indexes into the associative array sum. The ${!sum[@]} notation is described in the Shell Parameter Expansion section of the manual, and not even hinted at in the section on Arrays. The information is there if you know where to look.
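
A small sketch of the two expansions side by side may help. It assumes bash 4.0 or later; the keys, values, and count variables are illustrative names, and the totals are the ones from the question's expected output.

```shell
#!/usr/bin/env bash
# Requires bash 4.0 or later for associative arrays.
declare -A sum
sum[Mary]=4423
sum[Tom]=1397
sum[Mark]=204

# "${!sum[@]}" expands to the keys; "${sum[@]}" expands to the values.
# Neither expansion has a guaranteed order, so sort before relying on it.
keys=$(printf '%s\n' "${!sum[@]}" | sort)
values=$(printf '%s\n' "${sum[@]}" | sort -n)
count=${#sum[@]}                # number of entries in the array

printf 'keys:\n%s\nvalues:\n%s\ncount: %s\n' "$keys" "$values" "$count"
```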

If you want to process an arbitrary number of input files (like the awk script would) then you need to use cat to collect the data:

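cat "$@" |
{
    # Everything inside the braces runs in the same (sub)shell,
    # so the output loop can still see the sums built by the read loop.
    declare -A sum
    while read -r number name
    do
        ((sum[$name] += $number))
    done

    for name in "${!sum[@]}"
    do
        echo "${sum[$name]} $name"
    done
}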

This is not a Useless Use of cat (UUOC): cat "$@" handles no arguments (reading standard input), one argument, or many arguments.
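
As a sketch of that behaviour, the grouped script can be wrapped in a function and called both with file arguments and with data on standard input. The function name sum_by_name and the a.txt and b.txt files are hypothetical, chosen only for this illustration; bash 4.0 or later is assumed.

```shell
#!/usr/bin/env bash
# sum_by_name is a hypothetical wrapper around the grouped script above.
sum_by_name() {
    cat "$@" |
    {
        declare -A sum
        while read -r number name
        do
            ((sum[$name] += $number))
        done
        for name in "${!sum[@]}"
        do
            printf '%s %s\n' "${sum[$name]}" "$name"
        done
    }
}

# Split the question's data across two illustrative files.
tmpdir=$(mktemp -d)
printf '%s\n' '2092 Mary' '103 Tom' '1239 Mary' > "$tmpdir/a.txt"
printf '%s\n' '204 Mark' '1294 Tom' '1092 Mary' > "$tmpdir/b.txt"

# Many arguments: cat concatenates the named files.
from_files=$(sum_by_name "$tmpdir/a.txt" "$tmpdir/b.txt" | sort -k 2,2)
# No arguments: cat "$@" falls back to standard input.
from_stdin=$(cat "$tmpdir/a.txt" "$tmpdir/b.txt" | sum_by_name | sort -k 2,2)

rm -rf "$tmpdir"
printf '%s\n' "$from_files"
```

Both calls should produce the same totals, since the only difference is where cat gets its data.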

For all the scripts, if you want to sort the output in number or name order, apply an appropriate sort to the output of the script:

script file1 file2 file3 | sort -k 1,1n     # By sum increasing order
script file1 file2 file3 | sort -k 1,1nr    # By sum decreasing order
script file1 file2 file3 | sort -k 2,2      # By name increasing order
script file1 file2 file3 | sort -k 2,2r     # By name decreasing order

edited Mar 20, 2013 at 20:50

answered Mar 20, 2013 at 19:20


3 Comments

chepner Over a year ago

You need bash 4.2 and the lastpipe option to let sum be visible after the while loop, though. An outer loop of for f; do while ...; done < "$f"; done should work, though.

chepner Over a year ago

Or perhaps while read number name; do ...; done < <( cat "$@" ), which should also work with zero arguments, unlike my previous comment.
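
The problem chepner describes can be reproduced with a small sketch. It assumes bash, since process substitution is not POSIX sh, and it turns lastpipe off explicitly (its default state) so the contrast is visible; the variable names and the three sample lines are illustrative.

```shell
#!/usr/bin/env bash
shopt -u lastpipe               # the default; stated explicitly for the demo

data=$(printf '%s\n' '2092 Mary' '103 Tom' '1239 Mary')

# Pipeline: the while loop runs in a subshell, so its updates to
# piped_total are discarded when the pipeline finishes.
piped_total=0
printf '%s\n' "$data" | while read -r number name
do
    piped_total=$((piped_total + number))
done

# Process substitution: the loop runs in the current shell,
# so subst_total keeps its value after done.
subst_total=0
while read -r number name
do
    subst_total=$((subst_total + number))
done < <(printf '%s\n' "$data")

echo "piped_total=$piped_total subst_total=$subst_total"
```

Grouping the loops with { ... } on the receiving end of the pipe, as the answer does, sidesteps the issue in a different way: the sums are built and printed inside the same subshell.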

Jonathan Leffler Over a year ago

@chepner: Oh, drat! It's always dangerous not to test 'enhancements', even simple ones. Thanks. I'll fix it, my way, using { ... } to group the operational part of the script into a single unit for redirection of the cat output.
