1

This is my input file.

yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *5555555555 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *2222222222 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *3333333333 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [BBBBB]
yyyy-mm-dd hh:mm:ss string *6666666666 [AAAAA]

Let's consider the above input as input.gz. How do I get the count of lines containing *9999999999 with [AAAAA] as the last column?

I need a script using sed, awk, or grep.

Expected output should be:

5  

What if the last column in the above input wraps onto a new line, like this:

yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *5555555555 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA  
zzzzzzzzzzzz xxxxxxxx yy]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *2222222222 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *3333333333 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]
yyyy-mm-dd hh:mm:ss string *9999999999 [BBBBB]
yyyy-mm-dd hh:mm:ss string *6666666666 [AAAAA]    

In the above case, won't it be difficult to use awk? How can I handle this using sed?

I'm sorry for editing it again. What if the 10-digit number is unknown? For example, if *9999999999 is not known in advance, can we find out how many times each *NNNNNNNNNN occurs with [AAAAA] as the last column?

1
  • Regarding "In the above case, won't it be difficult to use AWK? How to overcome this using SED?": you have been misinformed. awk is specifically designed to operate on multi-line records. No exaggeration, sed hasn't been an appropriate tool for multi-line text since the mid-1970s, when awk was invented. Think about the most difficult input cases you need to handle, and edit your question to show those plus the expected output for that input, because right now your sample input doesn't seem to reflect the worst cases of what you are describing. Commented May 9, 2016 at 23:54

4 Answers

2
grep -c '[*]9999999999 \[AAAAA\]$' input_file
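
The question says the data lives in input.gz, and grep cannot read a compressed file directly. A minimal sketch, assuming the file is gzip-compressed and that `gzip` is installed; the sample file built here is illustrative only and not the asker's real data:

```shell
# Build a small gzip-compressed sample (illustrative data only).
tmpdir=$(mktemp -d)
printf '%s\n' \
  'yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]' \
  'yyyy-mm-dd hh:mm:ss string *5555555555 [AAAAA]' \
  'yyyy-mm-dd hh:mm:ss string *9999999999 [BBBBB]' \
  'yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]' \
  | gzip -c > "$tmpdir/input.gz"

# Decompress to stdout and let grep -c count the matching lines.
# A bare ] is literal in a basic regex, so only [ needs escaping here.
count=$(gzip -dc "$tmpdir/input.gz" | grep -c '[*]9999999999 \[AAAAA]$')
echo "$count"

rm -rf "$tmpdir"
```

Where `zgrep` is available, it may be able to do the decompress-and-search step in one command, but the explicit pipe above makes fewer assumptions about the system.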

1

Try this:

 awk '$NF ~ /\[A+\]/ && $(NF-1) ~ /\*9+/' input | wc -l

For the sake of simplicity, I use the wc command to do the counting. Of course, the counting could be done in awk, too:

 awk '$NF ~ /\[A+\]/ && $(NF-1) ~ /\*9+/ {counter++} END {print counter}' input

Update: How to list the number of occurrences for each number

 awk '$NF ~ /\[A+\]/{ar[$(NF-1)]++}END{for(key in ar){print key,ar[key]}}' input

Output:

*2222222222 1
*6666666666 1
*5555555555 1
*3333333333 1
*9999999999 5
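
The order in which awk's `for (key in ar)` visits keys is unspecified, so the listing above can come out in a different order on another run or another awk. A possible sketch that makes the order deterministic by piping through `sort`, highest count first; the temporary sample file is an assumption for illustration, and the match is tightened to an exact `[AAAAA]` comparison:

```shell
# Small illustrative sample, not the full input from the question.
tmpfile=$(mktemp)
printf '%s\n' \
  'yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]' \
  'yyyy-mm-dd hh:mm:ss string *5555555555 [AAAAA]' \
  'yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]' \
  'yyyy-mm-dd hh:mm:ss string *9999999999 [BBBBB]' \
  'yyyy-mm-dd hh:mm:ss string *2222222222 [AAAAA]' \
  > "$tmpfile"

# Count per number, then sort by count (descending) and by number for ties.
result=$(awk '$NF == "[AAAAA]" { ar[$(NF-1)]++ }
              END { for (key in ar) print key, ar[key] }' "$tmpfile" \
         | sort -k2,2nr -k1,1)
echo "$result"

rm -f "$tmpfile"
```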


0

Just with one grep:

grep -c "\*9999999999.*\[AAAAA\]$" inputfile

If a record is sometimes split over two lines but [AAAAA stays on the first line, you can try:

grep -c "\*9999999999.*\[AAAAA" inputfile


0

awk to the rescue!

$ awk -v key='*9999999999' '$NF=="[AAAAA]" && $(NF-1)==key {c++} END{print c}' file
5

If the last field is split across two lines, then by definition it won't be equal to "[AAAAA]", so the command above does not count such records.
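
If wrapped records should be counted as well, one option is to have awk glue physical lines together until a line ends with the closing bracket. This is only a sketch under two assumptions that the question does not confirm: every record ends with `]`, and a wrapped last column still begins with `[AAAAA` right after the number. The variable names `rec` and `c` are illustrative.

```shell
# Illustrative sample containing one record wrapped over two lines.
tmpfile=$(mktemp)
printf '%s\n' \
  'yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA]' \
  'yyyy-mm-dd hh:mm:ss string *9999999999 [AAAAA  ' \
  'zzzzzzzzzzzz xxxxxxxx yy]' \
  'yyyy-mm-dd hh:mm:ss string *5555555555 [AAAAA]' \
  'yyyy-mm-dd hh:mm:ss string *9999999999 [BBBBB]' \
  > "$tmpfile"

count=$(awk -v key='*9999999999' '
  # Append the current physical line to the pending logical record.
  { rec = (rec == "" ? $0 : rec " " $0) }
  # A line ending in ] (optionally followed by blanks) completes the record.
  /\][ \t]*$/ {
    if (index(rec, key " [AAAAA") > 0) c++   # plain substring test, no regex
    rec = ""
  }
  END { print c + 0 }                        # print 0 rather than an empty line
' "$tmpfile")
echo "$count"

rm -f "$tmpfile"
```

Because the test is a prefix match on `[AAAAA`, a last column such as `[AAAAAB]` would also be counted; tighten the check if that matters for the real data.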
