How to sort and count wrt a string

Question

This is my input file.

yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *5555555555 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *2222222222 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *3333333333 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [BBBBB]
yyyy-mm-dd hh:mm:ss string *6666666666 [AAAAA]

Let's say the above input is stored as input.gz. How can I get the count of *9999999999 where the last column is [AAAAA]?

I need a script using SED or AWK or GREP.

Expected output should be:

5

What if the last column in the above input continues onto a new line, like this:

yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *5555555555 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA  
zzzzzzzzzzzz xxxxxxxx yy]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *2222222222 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *3333333333 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [BBBBB]
yyyy-mm-dd hh:mm:ss string *6666666666 [AAAAA]

In the above case, won't it be difficult to use AWK? How to overcome this using SED?

I'm sorry for editing it again. What if the 10-digit number is unknown? That is, if *9999999999 is not known in advance, can we find out how many times each *NNNNNNNNNN occurs with the last column as [AAAAA]?

Regarding "In the above case, won't it be difficult to use AWK? How to overcome this using SED?": you have been misinformed. awk is specifically designed to operate on multi-line records. No exaggeration, sed hasn't been an appropriate tool to use on multi-line text since the mid-1970s when awk was invented. Think hard about the most difficult input cases you need to handle, then edit your question to show those cases plus the expected output for them. Right now the sample input you are providing doesn't seem to reflect the worst cases of what you are describing.
– Ed Morton, Commented May 9, 2016 at 23:54

Drew Varner · Accepted Answer · 2016-05-09 21:33:19Z

2

cat input_file | grep '[*]9999999999 \[AAAAA\]$' | wc -l

F. Knorr · Accepted Answer · 2016-05-10 08:10:48Z

1

Try this:

 awk '$NF ~ /\[A+\]/ && $(NF-1)~/\*9+/' input | wc -l

For the sake of simplicity, I use the wc command to do the counting. Of course, this could be implemented in awk, too:

 awk '$NF ~ /\[A+\]/ && $(NF-1)~/\*9+/{counter++}END{print counter}' input

Update: How to list the number of occurrences for each number

 awk '$NF ~ /\[A+\]/{ar[$(NF-1)]++}END{for(key in ar){print key,ar[key]}}' input

Output:

*2222222222 1
*6666666666 1
*5555555555 1
*3333333333 1
*9999999999 5
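
A possible variation on the per-number listing, sketched under a few assumptions: the last column must be exactly [AAAAA], the number is a * followed by ten digits, and the result should be sorted with the most frequent number first (the title asks about sorting). The names sample_input and count_by_number are made up for this example.

```shell
#!/bin/sh
# Sample data copied from the question (single-line records only).
sample_input() {
    cat <<'EOF'
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *5555555555 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *2222222222 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *3333333333 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [BBBBB]
yyyy-mm-dd hh:mm:ss string *6666666666 [AAAAA]
EOF
}

# count_by_number (hypothetical helper): reads records on stdin and prints
# "number count" pairs, highest count first, ties ordered by number.
count_by_number() {
    awk '
        # Exact match on the last column; the field before it must be
        # "*" plus digits with a total length of 11 (i.e. ten digits).
        $NF == "[AAAAA]" && $(NF-1) ~ /^\*[0-9]+$/ && length($(NF-1)) == 11 {
            n[$(NF-1)]++
        }
        END { for (k in n) print k, n[k] }
    ' | sort -k2,2nr -k1,1
}

result=$(sample_input | count_by_number)
printf '%s\n' "$result"
```

With the question's sample this prints *9999999999 5 first, followed by the four numbers that occur once. The sort step is there because the order of for (k in n) in awk is unspecified.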

edited May 10, 2016 at 8:10

answered May 9, 2016 at 21:31

Walter A · Accepted Answer · 2016-05-09 21:43:42Z

0

Just with one grep:

grep -c "\*9999999999.*\[AAAAA\]$" inputfile

When a record is sometimes split over two lines but [AAAAA is still on the first line, you can try:

grep -c "\*9999999999.*\[AAAAA" inputfile
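
Since the question mentions that the data lives in input.gz, here is a rough sketch of the same count on a compressed file, assuming gzip is available. The scratch directory and the sample_input helper exist only to make the example runnable on its own; in practice you would point gzip -dc at your real input.gz.

```shell
#!/bin/sh
# Sample data copied from the question.
sample_input() {
    cat <<'EOF'
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *5555555555 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *2222222222 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *3333333333 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [BBBBB]
yyyy-mm-dd hh:mm:ss string *6666666666 [AAAAA]
EOF
}

# Build a throwaway input.gz (assumption: stands in for the real file).
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
sample_input | gzip -c > "$workdir/input.gz"

# Decompress to stdout and let grep -c count the matching lines.
# [*] and \[ match a literal * and [; a ] outside a bracket
# expression is already literal, so it needs no backslash.
count=$(gzip -dc "$workdir/input.gz" | grep -c '[*]9999999999 \[AAAAA]$')
echo "$count"
```

On the sample data this prints 5. Where zgrep is installed, zgrep -c with the same pattern should give the same result without the explicit pipe.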

karakfa · Accepted Answer · 2016-05-09 21:52:45Z

0

awk to the rescue!

$ awk -v key='*9999999999' '$NF=="[AAAAA]" && $(NF-1)==key {c++} END{print c}' file
5

If the last field is split across two lines, then by definition it won't be equal to "[AAAAA]", so that record is not counted.
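
If wrapped records should be counted too, one option is to glue the physical lines back together before testing them. The following is only a sketch, under these assumptions: every record ends with ], the last column starts at the first " [" in the record, and a wrapped last column counts as long as it begins with [AAAAA. The names sample_input and count_wrapped are invented for the example.

```shell
#!/bin/sh
# Sample data from the question's second example (one wrapped record).
sample_input() {
    cat <<'EOF'
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *5555555555 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA
zzzzzzzzzzzz xxxxxxxx yy]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *2222222222 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *3333333333 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [BBBBB]
yyyy-mm-dd hh:mm:ss string *6666666666 [AAAAA]
EOF
}

# count_wrapped KEY TAG (hypothetical helper): reads stdin, prints how many
# logical records have KEY right before a last column that starts with TAG.
count_wrapped() {
    awk -v key="$1" -v tag="$2" '
        # Append each physical line to the record being assembled.
        { rec = (rec == "" ? $0 : rec " " $0) }

        # A closing ] at the end of the line completes the record.
        rec ~ /\][ \t]*$/ {
            i = index(rec, " [")              # start of the last column
            if (i > 0) {
                m = split(substr(rec, 1, i - 1), f, " ")
                if (f[m] == key && index(substr(rec, i + 1), tag) == 1)
                    c++
            }
            rec = ""
        }
        END { print c + 0 }                   # print 0 rather than an empty line
    '
}

count=$(sample_input | count_wrapped '*9999999999' '[AAAAA')
echo "$count"
```

On this sample the result is 5: four single-line matches plus the wrapped record. If wrapped records should not count, comparing the last column against the full "[AAAAA]" instead of a prefix would give 4.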
