Counting unique strings where there's a single string per line in bash

Question

Given an input file such as:

z
b
a
f
g
a
b
...

I want to output the number of occurrences of each string, for example:

z 1
b 2
a 2
f 1
g 1

How can this be done in a bash script?

johnsyweb · Accepted Answer · 2012-01-28 10:20:25Z

4

You can sort the input and pass to uniq -c:

$ sort input_file | uniq -c
 2 a
 2 b
 1 f
 1 g
 1 z
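The sort step matters because uniq only merges adjacent identical lines. A minimal sketch of the difference, assuming the seven sample lines shown in the question (the sample helper and the two variable names are made up for illustration):

```shell
# Hypothetical helper that emits the sample lines from the question.
sample() { printf '%s\n' z b a f g a b; }

# Without sort, no two equal lines are adjacent, so uniq merges nothing.
unsorted_groups=$(sample | uniq -c | wc -l)

# With sort, equal lines sit next to each other and are counted together.
sorted_groups=$(sample | sort | uniq -c | wc -l)

echo "groups without sort: $((unsorted_groups))"
echo "groups with sort:    $((sorted_groups))"
```

On this sample the first pipeline reports every line as its own group, while the second reports one group per distinct string.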

If you want the numbers on the right, use awk to switch them:

$ sort input_file | uniq -c | awk '{print $2, $1}'
a 2
b 2
f 1
g 1
z 1

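Alternatively, do the whole thing in awk (the order in which for (word in count) visits the keys is unspecified, so the output is not sorted):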

$ awk '
{
    ++count[$1]
}
END {
    for (word in count) {
        print word, count[word]
    }
}
' input_file
f 1
g 1
z 1
a 2
b 2
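The example output in the question lists the strings in the order they first appear, which the awk version above does not guarantee. One possible way to get that order, sketched here with a second array (the function name count_in_order and the order/n variables are assumptions, not part of the original answer):

```shell
# count_in_order: hypothetical wrapper; reads the named files, or stdin if none.
count_in_order() {
    awk '
    !($1 in count) { order[++n] = $1 }   # remember each string the first time it is seen
    { ++count[$1] }                      # tally every occurrence
    END {
        for (i = 1; i <= n; i++) {
            print order[i], count[order[i]]
        }
    }
    ' "$@"
}

# Usage: count_in_order input_file
# For the sample in the question this prints z 1, b 2, a 2, f 1, g 1 (one pair per line).
```

The test `$1 in count` checks for the key without creating it, so it can safely run before the increment.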

edited Jan 28, 2012 at 10:20

answered Jan 28, 2012 at 10:13

johnsyweb


shadyabhi · Accepted Answer · 2012-01-28 10:14:11Z

1

cat text | sort | uniq -c

should do the job, where text is the input file.

answered Jan 28, 2012 at 10:14

shadyabhi


Mithrandir · Accepted Answer · 2012-01-28 10:18:32Z

1

Try:

awk '{ freq[$1]++; } END{ for( c in freq ) { print c, freq[c] } }' test.txt

Where test.txt would be your input file.

answered Jan 28, 2012 at 10:18

Mithrandir


Mat · Accepted Answer · 2012-01-28 14:57:21Z

1

Here's a bash-only version (requires bash version 4), using an associative array.

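#!/bin/bash

declare -A count
# -r keeps backslashes literal; IFS= preserves leading and trailing whitespace
while IFS= read -r val ; do
    [[ -n $val ]] || continue    # skip blank lines: an empty subscript is an error
    count[$val]=$(( ${count[$val]:-0} + 1 ))
done < your_input_file # change this as needed

for key in "${!count[@]}" ; do
    echo "$key ${count[$key]}"
done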

edited Jan 28, 2012 at 14:57

answered Jan 28, 2012 at 10:23

Mat


2 Comments

glenn jackman Over a year ago

requires bash version 4 for associative arrays

Dennis Williamson Over a year ago

Simpler: (( count[$val]++ )). Also, you should almost always use -r with read. Always quote variables: for key in "${!count[@]}" and echo "$key ${count[$key]}".
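To illustrate the point about -r: without it, read treats a backslash as an escape character and drops it, so two different input lines could end up counted under the same key. A small sketch, using a made-up sample line (the variable names are only for illustration):

```shell
line='a\tb'   # literal backslash followed by t; hypothetical input line

# Plain read: the backslash escapes the next character and is removed.
without_r=$(printf '%s\n' "$line" | { read val; printf '%s' "$val"; })

# read -r (with IFS= to keep surrounding whitespace): the line is kept as-is.
with_r=$(printf '%s\n' "$line" | { IFS= read -r val; printf '%s' "$val"; })

printf '%s\n' "$without_r"   # prints atb
printf '%s\n' "$with_r"      # prints a\tb
```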

potong · Accepted Answer · 2012-01-28 21:35:03Z

0

This might work for you:

cat -n file | 
sort -k2,2 | 
uniq -cf1 | 
sort -k2,2n | 
sed 's/^ *\([^ ]*\).*\t\(.*\)/\2 \1/'

This outputs the number of occurrences of each string, in the order in which the strings first appear.
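For readers who want to see what each stage contributes, here is an annotated sketch of the same idea. It is not the author's exact pipeline: the function name is made up, an explicit numeric tie-break on the line number is added to the first sort, and the final sed is swapped for awk so it does not rely on \t support in sed. It assumes each line holds a single string with no embedded whitespace.

```shell
# count_in_input_order: hypothetical name; takes the input file as its argument.
count_in_input_order() {
    cat -n "$1" |            # prefix every line with its line number and a tab
    sort -k2,2 -k1,1n |      # group equal strings; earliest line number first in each group
    uniq -c -f1 |            # count each group, ignoring the line-number field
    sort -k2,2n |            # order groups by the line number of their first occurrence
    awk '{ print $3, $1 }'   # fields are: count, line number, string
}

# Usage: count_in_input_order file
```

Because uniq keeps the first line of every group, the surviving line number is the position of the first occurrence, which is what the second sort uses to restore input order.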

answered Jan 28, 2012 at 21:35

potong


Borealid · Accepted Answer · 2012-01-29 05:34:39Z

0

You can use sort filename | uniq -c.

Have a look at the Wikipedia page on uniq.

edited Jan 29, 2012 at 5:34

Borealid


answered Jan 28, 2012 at 10:14

Balaswamy Vaddeman

