I have a file test.txt looking like this:

2092 Mary
103 Tom
1239 Mary
204 Mark
1294 Tom
1092 Mary

I am trying to create a shell script that will

  1. Read each line and store its two columns in the variables var1 and var2.
  2. For lines that share the same var2, add up their var1 values.
  3. Write the result to a text file.

The result should list each var2 value only once. Here's what I have so far:

#!/bin/sh
#!/usr/bin/sh
cat test.txt| while read line;
do
$var1=$(echo $line| awk -F\; '{print $1}')
$var2=$(echo $line| awk -F\; '{print $2}')

How can I reference the variable in each line and then compare them?
The expected output would be:

4423 Mary
1397 Tom
204 Mark
  • What about using sort -s -n k 1 and then -u to get unique values? Commented Mar 20, 2013 at 18:40
  • Can you add the expected output ? Thanks Commented Mar 20, 2013 at 18:44
  • No done? And where you wrote $var1=, you probably mean var1=.... Also, there is no need to split the line in such a convoluted way: while read num name; do ... ; done < test.txt. Finally, I agree with fedorqui that sort can do what you need, unless this is just a sketch of your problem. Doing uniqueness in shell code without sort and/or uniq is not trivial. Good luck! Commented Mar 20, 2013 at 18:45

1 Answer

Using awk it is easy:

awk '{sum[$2] += $1} END {for (i in sum) printf "%4d %s\n", sum[i], i; }'
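
As a quick check, the one-liner can be run against the sample data from the question. This is only a sketch: the input is fed through a variable rather than test.txt, and because awk's for (i in sum) order is unspecified, the output is piped through sort to get a stable listing.

```shell
# Sample data copied from the question; held in a variable so no file is needed.
input='2092 Mary
103 Tom
1239 Mary
204 Mark
1294 Tom
1092 Mary'

# Sum column 1 per distinct value of column 2, then sort by sum, largest first.
result=$(printf '%s\n' "$input" |
    awk '{sum[$2] += $1} END {for (i in sum) printf "%4d %s\n", sum[i], i}' |
    sort -k 1,1nr)

printf '%s\n' "$result"
```

The %4d format right-aligns the sums, so 204 is printed with a leading space.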

If you want to do it in bash 4.0 or later (bash 3.x has no associative arrays), then:

declare -A sum                  # associative array: name -> running total
while read -r number name
do
    ((sum[$name] += number))
done

for name in "${!sum[@]}"        # iterate over the keys (the names)
do
    echo "${sum[$name]} $name"
done

The structure here is essentially isomorphic with the awk script, but a little less notationally convenient. It will read from standard input, using the names as indexes into the associative array sum. The ${!sum[@]} notation is described in the Shell Parameter Expansion section of the manual, and not even hinted at in the section on Arrays. The information is there if you know where to look.
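
To make the difference between the two expansions concrete, here is a small sketch with two hand-filled entries (the values are taken from the expected output, not computed). It assumes bash 4.0 or later; the order in which an associative array is traversed is unspecified, so both lists are sorted before printing.

```shell
#!/usr/bin/env bash
# Requires bash 4.0+ for associative arrays.
declare -A sum
sum[Mary]=4423
sum[Tom]=1397

# "${!sum[@]}" expands to the keys, "${sum[@]}" to the values.
keys=$(printf '%s\n' "${!sum[@]}" | sort)
values=$(printf '%s\n' "${sum[@]}" | sort -n)

printf 'keys:\n%s\nvalues:\n%s\n' "$keys" "$values"
```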

If you want to process an arbitrary number of input files (like the awk script would) then you need to use cat to collect the data:

cat "$@" |
{
declare -A sum
while read -r number name
do
    ((sum[$name] += number))
done

for name in "${!sum[@]}"
do
    echo "${sum[$name]} $name"
done
}

This is not a useless use of cat (UUOC): cat "$@" handles no arguments (read standard input), one argument, or many arguments.
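
The braces also matter for correctness. In bash, each stage of a pipeline normally runs in a subshell, so a variable set inside a piped while loop is gone once the loop ends; a comment on this answer raises the same point. The sketch below uses made-up variable names and input to show the difference, assuming bash with default options (no lastpipe).

```shell
#!/usr/bin/env bash
# Illustrative names and input only; not part of the original script.

# Without grouping: the while loop is the last stage of a pipeline, so in bash
# (without the lastpipe option) it runs in a subshell and its "total" is lost.
total=0
printf '%s\n' 1 2 3 | while read -r n; do total=$((total + n)); done
echo "after plain pipeline: $total"

# With { ... }: the loop and the final echo share one subshell,
# so the echo sees the accumulated value.
grouped=$(printf '%s\n' 1 2 3 | {
    total=0
    while read -r n; do total=$((total + n)); done
    echo "$total"
})
echo "inside grouped block: $grouped"
```

Shells that run the last pipeline stage in the current process (for example ksh, or bash with lastpipe enabled) may keep the value in the first case, which is why the grouped form is the more portable choice.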

For all the scripts, if you want to sort the output in number or name order, apply an appropriate sort to the output of the script:

script file1 file2 file3 | sort -k 1,1n     # By sum increasing order
script file1 file2 file3 | sort -k 1,1nr    # By sum decreasing order
script file1 file2 file3 | sort -k 2,2      # By name increasing order
script file1 file2 file3 | sort -k 2,2r     # By name decreasing order

3 Comments

You need bash 4.2 and the lastpipe option to let sum be visible after the while loop, though. An outer loop of for f; do while ...; done < "$f"; done should work.
Or perhaps while read number name; do ...; done < <( cat "$@" ), which should also work with zero arguments, unlike my previous comment.
@chepner: Oh, drat! It's always dangerous not to test 'enhancements', even simple ones. Thanks. I'll fix it, my way, using { ... } to group the operational part of the script into a single unit for redirection of the cat output.
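
For comparison, the process-substitution variant suggested in the comments could be sketched as follows. The function name sum_by_name and the temporary file are hypothetical, introduced only for this illustration; it assumes bash 4.0 or later and is shown with a file argument only.

```shell
#!/usr/bin/env bash
# Hypothetical helper; requires bash 4.0+ (associative arrays, process substitution).
sum_by_name() {
    local -A sum
    local number name
    # The loop reads from a process substitution rather than a pipe,
    # so it runs in the current shell and "sum" survives the loop.
    while read -r number name
    do
        ((sum[$name] += number))
    done < <(cat "$@")

    for name in "${!sum[@]}"
    do
        echo "${sum[$name]} $name"
    done
}

# Sample data from the question, written to a temporary file for the demo.
tmp=$(mktemp)
printf '%s\n' '2092 Mary' '103 Tom' '1239 Mary' '204 Mark' '1294 Tom' '1092 Mary' > "$tmp"
result=$(sum_by_name "$tmp" | sort -k 2,2)   # sort by name for a stable order
rm -f "$tmp"

printf '%s\n' "$result"
```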
