sorting with bash, how to sort output from awk

Question

I have a log-like file in which each process acknowledges its work, and I need the full history of apple creation. First I need to find every line where an apple is created; then I need to find the matching "created" status lines for those apples, using the process number as the pattern. The result should be sorted by time. stack.log is below:

03:01:29.312    5 process   create apple
05:22:42.211    1 process   create banana
05:22:42.302    1 process   created
06:09:32.083    12 process  create apple
05:12:32.759    5 process   created
07:21:45.112    11 process  create orange
06:09:35.083    12 process  created
03:01:25.714    21 process  create apple
05:12:32.308    7 process   create grape
05:12:32.309    7 process   created
05:12:32.300    21 process  created
07:25:41.000    11 process  created

Here is the sample output for this task:

03:01:25.714    21 process  create apple
03:01:29.312    5 process   create apple
05:12:32.300    21 process  created
05:12:32.759    5 process   created
06:09:32.083    12 process  create apple
06:09:35.083    12 process  created

Here is the code I've tried:

a=($(awk '$5 == "apple" { print $2 }' stack.log))
for i in "${a[@]}"
do
    awk -v search="$i" '$0 ~ search { print $1 }' stack.log
done

@Thor Exactly, I need two loops, because I don't know when the status will appear and the status lines don't mention apples. I need both the creation lines and the "created" status lines for apples, but the "created" lines can only be found by process number. The process numbers for apples vary, so I first need to find all of the process numbers and then all of the times for them. Moreover, I can't do it the other way round (search and sort the processes, then select only the apples), because the file is very big and contains a lot of other items.
– laverka trip, Commented Jun 9, 2017 at 10:48
Okay. How about sorting the source file before you parse it? That would make the task simpler and doable in one pass.
– Thor, Commented Jun 9, 2017 at 10:58
@Thor I thought that sorting a few fields would be easier than sorting the whole log file with its large amount of other data. Even if it is sorted, we still need to parse it twice, because the first time through we don't have enough information about what we are looking for. Am I wrong? The first loop gives us the three "create apple" lines (03:01:25.714 21 process create apple, 03:01:29.312 5 process create apple, 06:09:32.083 12 process create apple) but no information about the "created" status. Sorry for the dumb question.
– laverka trip, Commented Jun 9, 2017 at 11:13
Assuming the "created" lines always come after the "create apple" lines, it is also doable in one pass without sorting. Just remember the process id number until the "created" line comes along.
– Thor, Commented Jun 9, 2017 at 11:22
@Thor As far as I understand, every line is compared either with "apple" or with the stored process numbers, and once a stored process is found we delete it? But what if we have stored 10 process numbers to check? Will each line then be compared with 10 values? For example, suppose something goes wrong and for a long time there is no confirmation that apples were created, but we keep creating them :) Then we will have 10 "create apple" messages and no "created" message, so at that stage we have to store 10 processes and compare every line with each of them.
– laverka trip, Commented Jun 9, 2017 at 11:35

Thor · Accepted Answer · 2017-06-09 16:49:50Z

Assuming the "created" lines always come after the "create apple" lines, it is doable in one pass, e.g.:

awk '/create apple/ { h[$2]; print; next } $2 in h { print; delete h[$2] }' stack.log

Sort the output:

awk ... | sort

Output:

03:01:25.714    21 process  create apple
03:01:29.312    5 process   create apple
05:12:32.300    21 process  created
05:12:32.759    5 process   created
06:09:32.083    12 process  create apple
06:09:35.083    12 process  created
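
To reproduce this result end to end, the following sketch writes the sample log from the question to a scratch file and runs the whole pipeline. The temporary directory and the variable names (tmpdir, log, result) are only illustrative. The explicit sort key and LC_ALL=C are assumptions: they rely on the first field always being a fixed-width HH:MM:SS.mmm timestamp, so that a plain byte-wise sort is also chronological.

```shell
#!/usr/bin/env bash
# Sketch: rebuild the sample stack.log in a scratch directory (illustrative path).
tmpdir=$(mktemp -d)
log="$tmpdir/stack.log"
cat > "$log" <<'EOF'
03:01:29.312    5 process   create apple
05:22:42.211    1 process   create banana
05:22:42.302    1 process   created
06:09:32.083    12 process  create apple
05:12:32.759    5 process   created
07:21:45.112    11 process  create orange
06:09:35.083    12 process  created
03:01:25.714    21 process  create apple
05:12:32.308    7 process   create grape
05:12:32.309    7 process   created
05:12:32.300    21 process  created
07:25:41.000    11 process  created
EOF

# One pass over the log, then sort on the timestamp field only.
# Assumption: timestamps are fixed-width, so a byte-wise sort is chronological.
result=$(awk '/create apple/ { h[$2]; print; next }
              $2 in h        { print; delete h[$2] }' "$log" | LC_ALL=C sort -k1,1)

printf '%s\n' "$result"
rm -rf "$tmpdir"
```

Timestamps that span several days, or that are not zero-padded, would need a different sort key.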

Explanation

The awk script consists of two blocks:

/create apple/ {    # Only run on lines containing the pattern
  h[$2]             # Save process id in hash
  print             # Print the line
  next              # Skip to next line
}

and

$2 in h {           # If this process id was seen before
  print             # Print the line
  delete h[$2]      # and remove the id from the hash
}

The idea is to remember a process id only until its matching "created" line is found. The second block can therefore only fire for a process id that was stored by an earlier "create apple" line and has not been matched yet.
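
If a "created" line can appear in the file before its "create apple" line, the one-pass approach misses it. One possible alternative, sketched here and not part of the approach above, is to read the file twice: the first pass collects the apple process ids, the second prints the matching lines. The function name apple_history is hypothetical, and the sketch assumes a process id is never reused for a different item, because it would otherwise also print that item's "created" line.

```shell
#!/usr/bin/env bash
# Sketch: two passes over the same file, for logs where a "created"
# line may appear before its "create apple" line.
# apple_history is a hypothetical helper name.
apple_history() {
    # $1 is the log file; it is passed twice so that awk reads it twice.
    awk '
        NR == FNR {                      # first pass: collect apple process ids
            if (/create apple/) ids[$2]
            next
        }
        /create apple/ { print; next }   # second pass: the creation lines
        $4 == "created" && ($2 in ids)   # and the status lines of those ids
    ' "$1" "$1" | LC_ALL=C sort -k1,1
}
```

Called as apple_history stack.log, this should give the same six lines for the sample data, at the cost of reading the file twice.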

Note that if your data is inconsistent, you need a lot more error checking.
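
As a rough illustration of what such checking could look like, the sketch below keeps the one-pass logic and adds warnings for two kinds of inconsistency only: a process that logs "create apple" again while its previous request is still unconfirmed, and requests that never get a "created" line. The function name apple_history_checked and the warning texts are made up for this example, and writing to "/dev/stderr" from awk assumes an awk or operating system that supports that path.

```shell
#!/usr/bin/env bash
# Sketch: the one-pass logic plus warnings on stderr for two inconsistencies.
# apple_history_checked is a hypothetical helper name.
apple_history_checked() {
    awk '
        /create apple/ {
            if ($2 in h)                 # id is still waiting for its "created" line
                print "warning: line " NR ": process " $2 " repeats create apple before confirmation" > "/dev/stderr"
            h[$2] = NR                   # remember where the request was seen
            print
            next
        }
        $2 in h { print; delete h[$2] }
        END {
            for (id in h)                # requests that were never confirmed
                print "warning: process " id " (line " h[id] ") has no \"created\" line" > "/dev/stderr"
        }
    ' "$1" | LC_ALL=C sort -k1,1
}
```

The regular output still goes to stdout, so apple_history_checked stack.log 2>warnings.txt keeps the sorted history and the warnings apart.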
