
I have a log-like file in which each process acknowledges its own requests. I need the full history of apple creation: first find every line where an apple is created, then find the matching "created" status line for each of those apples, using the process number as the key. The result should be sorted by time. stack.log is below:

03:01:29.312    5 process   create apple
05:22:42.211    1 process   create banana
05:22:42.302    1 process   created
06:09:32.083    12 process  create apple
05:12:32.759    5 process   created
07:21:45.112    11 process  create orange
06:09:35.083    12 process  created
03:01:25.714    21 process  create apple
05:12:32.308    7 process   create grape
05:12:32.309    7 process   created
05:12:32.300    21 process  created
07:25:41.000    11 process  created

Here is the sample output for this task:

03:01:25.714    21 process  create apple
03:01:29.312    5 process   create apple
05:12:32.300    21 process  created
05:12:32.759    5 process   created
06:09:32.083    12 process  create apple
06:09:35.083    12 process  created

Here is the code I've tried:

a=($(awk '$5 == "apple" { print $2 }' stack.log))
for i in "${a[@]}"
do
    awk -v search="$i" '$0 ~ search { print $1 }' stack.log
done
  • @Thor exactly, I need two loops, because I don't know when the status will appear, and the status line doesn't mention apples. I need both the creation and the created status for apples, but the created status can only be found by process number. The process numbers for apples vary over time, so I first need to find all the process numbers and then all the times for them. Moreover, I can't do this the other way round (search and sort the processes, then select only the apples), because the file is very big and contains a lot of other items. Commented Jun 9, 2017 at 10:48
  • Okay. How about sorting the source file before you parse it? That would make the task simpler and doable in one pass. Commented Jun 9, 2017 at 10:58
  • @Thor I thought that sorting a few fields would be easier than sorting the whole log file with its large amount of other data. Even if it is sorted, we still need to parse it twice, because on the first pass we don't have enough information about what we are looking for. Am I wrong? After the first loop we would get 03:01:25.714 21 process create apple, 03:01:29.312 5 process create apple, 06:09:32.083 12 process create apple, but no information about the created status. Sorry for the dumb question. Commented Jun 9, 2017 at 11:13
  • Assuming the "created" lines always come after the "create apple" lines, it is also doable in one pass without sorting. Just remember the process id number until the "created" line comes along. Commented Jun 9, 2017 at 11:22
  • @Thor As far as I understand, every line is compared either with "apple" or with the stored process numbers? And if a stored process number is found, we have to delete it? But what if we have stored 10 process numbers to check? Will each line be compared with 10 values? For example, suppose something goes wrong and for a long time there is no confirmation that apples were created, but we keep creating them :) Then we will have 10 "create apple" messages and no "created" ones, so at that stage we have to store 10 processes, and every line has to be compared with each of them. Commented Jun 9, 2017 at 11:35

1 Answer

Assuming the "created" lines always come after the "create apple" lines, it is doable in one pass, e.g.:

awk '/create apple/ { h[$2]; print; next } $2 in h { print; delete h[$2] }' stack.log

Sort the output:

awk ... | sort

Output:

03:01:25.714    21 process  create apple
03:01:29.312    5 process   create apple
05:12:32.300    21 process  created
05:12:32.759    5 process   created
06:09:32.083    12 process  create apple
06:09:35.083    12 process  created
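
To reproduce this end to end, here is a small self-contained sketch. It assumes the log from the question; the temporary file stands in for stack.log, and the variable names (log, result) are only for illustration.

```shell
#!/bin/sh
# Temporary stand-in for stack.log, filled with the data from the question.
log=$(mktemp)
cat > "$log" <<'EOF'
03:01:29.312    5 process   create apple
05:22:42.211    1 process   create banana
05:22:42.302    1 process   created
06:09:32.083    12 process  create apple
05:12:32.759    5 process   created
07:21:45.112    11 process  create orange
06:09:35.083    12 process  created
03:01:25.714    21 process  create apple
05:12:32.308    7 process   create grape
05:12:32.309    7 process   created
05:12:32.300    21 process  created
07:25:41.000    11 process  created
EOF

# One pass: remember the id of each apple-creating process and print its
# "create apple" line, then print and forget the next line from that id.
# Plain sort orders by time because every line starts with a fixed-width
# HH:MM:SS.mmm timestamp.
result=$(awk '/create apple/ { h[$2]; print; next } $2 in h { print; delete h[$2] }' "$log" | sort)

printf '%s\n' "$result"
rm -f "$log"
```

Because the timestamps are zero-padded and sit at the start of each line, no sort key options should be needed here; that would change if the time format varied.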

Explanation

The awk script consists of two blocks:

/create apple/ {    # Only run on lines containing the pattern
  h[$2]             # Save process id in hash
  print             # Print the line
  next              # Skip to next line
}

and

$2 in h {           # If this process id was seen before
  print             # Print the line
  delete h[$2]      # and remove the id from the hash
}

The idea is to remember a process id only until its matching "created" line is found. The second block can only fire for a process id that was stored by a previous "create apple" line.

Note that if your data is inconsistent, you need a lot more error checking.
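
As a rough idea of what such checking could look like, here is a minimal sketch, not a complete solution. It assumes the field layout from the question (process id in field 2, action in field 4, item in field 5); the sample data, the array name pending and the warning texts are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample data: process 21 never gets a "created" line,
# and process 5 has a stray second "created" line.
log=$(mktemp)
cat > "$log" <<'EOF'
03:01:29.312    5 process   create apple
05:22:42.211    1 process   create banana
05:22:42.302    1 process   created
05:12:32.759    5 process   created
03:01:25.714    21 process  create apple
09:00:00.000    5 process   created
EOF

result=$(awk '
  # Compare exact fields instead of matching a regex against the whole line.
  $4 == "create" && $5 == "apple" {
    if ($2 in pending)
      print "warning: process " $2 " already pending" > "/dev/stderr"
    pending[$2]
    print
    next
  }
  # Accept only a real "created" line from a process we are waiting for.
  $4 == "created" && ($2 in pending) {
    print
    delete pending[$2]
  }
  # Report ids that never received their "created" line.
  END {
    for (id in pending)
      print "warning: no created line for process " id > "/dev/stderr"
  }
' "$log" | sort)

printf '%s\n' "$result"
rm -f "$log"
```

The warnings go to standard error, so they do not mix with the sorted result. Writing to "/dev/stderr" works in gawk and on systems that provide that device; elsewhere a different redirection may be needed.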
