
I have a hack script that checks for some entries in a very large log file. It is a mix of Perl and bash, and it works fine: the script gets what it needs to get. The only problem is the formatting of "$chdk_subscription". I have tried to delimit the output by the keywords 'STATS' and 'added'. These two words are at the beginning and the end of each line that I want to read.

#!/bin/bash
pid_foo_process=$(pgrep foo_process)
check_primary=$(grep $1 /data/foo_process-0210.$pid_foo_process.log | perl -nle 'print /(primary book \w+:\w+)/ ')
check_primary_symbol=$(grep $1 /data/foo_process-0210.$pid_foo_process.log | perl -nle 'print /primary book (\w+:\w+)/ ')
chdk_subscription=$(grep $1 /data/foo_process-0210.$pid_foo_process.log | perl -nle 'print if /(subscription for \w+:\w+.*)/ ')
echo $check_primary
echo $check_primary_symbol

IFS="STAT"
while read line
echo $line
done < echo $chdk_subscription

Breaking up "$chdk_subscription" with IFS does not seem to work. Neither does this:

echo $chdk_subscription| awk -F"STATS" '{print $0}'

With awk, as with the loop, the output comes out each time as one big line with no newlines:

STATS 10/15 08:03:09.391048 32978  (0)SB: subscription for APA:T added STATS 10/15 08:03:09.391164 32978  (0)SB: subscription for APA:P added STATS 10/15 08:03:09.391226 32978  (0)SB: subscription for APA:Z added STATS 10/15 08:03:09.391537 32978  (0)SB: subscription for APA:n added STATS 10/15 08:03:09.391599 32978  (0)SB: subscription for APA:A added STATS 10/15 08:03:09.391686 32978  (0)SB: subscription for APA:a added STATS 10/15 08:03:09.391756 32978  (0)SB: subscription for APA:K added STATS 10/15 08:03:09.391818 32978  (0)SB: subscription for APA:J added STATS 10/15 09:38:12.826928 32978 (0)SB: subscription for APA:N, XNYSAPA:3 added

I want something like this that I can read.

STATS 10/15 08:03:09.391048 32978  (0)SB: subscription for APA:T added
STATS 10/15 08:03:09.391164 32978  (0)SB: subscription for APA:P added
STATS 10/15 08:03:09.391226 32978  (0)SB: subscription for APA:Z added
STATS 10/15 08:03:09.391537 32978  (0)SB: subscription for APA:n added
STATS 10/15 08:03:09.391599 32978  (0)SB: subscription for APA:A added
STATS 10/15 08:03:09.391686 32978  (0)SB: subscription for APA:a added
STATS 10/15 08:03:09.391756 32978  (0)SB: subscription for APA:K added
STATS 10/15 08:03:09.391818 32978  (0)SB: subscription for APA:J added
STATS 10/15 09:38:12.826928 32978  (0)SB: subscription for APA:N, XNYSAPA:3 added
  • Could you give an example of how it looks BEFORE the filtering, which corresponds to the expected output you already presented. Commented Oct 15, 2014 at 22:15
  • I don't grok the details of what you're trying to do, but one important note: Quote! echo $line IS NOT THE SAME AS echo "$line"; the former will change newlines to spaces, expand glob expressions, and do all kinds of other messing with your formatting you almost certainly don't want. Commented Oct 15, 2014 at 22:25
  • Also, unless you want backslash-escape sequences parsed (for instance, \n changed from two characters into a single newline), use read -r, not bare read. Commented Oct 15, 2014 at 22:26
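
The quoting point made in these comments can be seen in isolation. The following is a minimal sketch, assuming a hypothetical two-line sample in place of the real grep/perl output; the variable names unquoted and quoted are illustrative only and do not appear in the question.

```shell
#!/bin/bash
# Hypothetical two-line sample standing in for the grep/perl output.
chdk_subscription=$(printf '%s\n' \
  'STATS 10/15 08:03:09.391048 32978  (0)SB: subscription for APA:T added' \
  'STATS 10/15 08:03:09.391164 32978  (0)SB: subscription for APA:P added')

# Unquoted: word splitting turns the embedded newline (and the double
# spaces) into single spaces, so everything lands on one line.
unquoted=$(echo $chdk_subscription)

# Quoted: the embedded newline survives, so two lines come out.
quoted=$(echo "$chdk_subscription")

printf 'unquoted: %s line(s)\n' "$(printf '%s\n' "$unquoted" | wc -l)"
printf 'quoted:   %s line(s)\n' "$(printf '%s\n' "$quoted" | wc -l)"
```

If the variable already holds one record per line, as it appears to here, then echo "$chdk_subscription" with the quotes may be all that is needed.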

2 Answers


You're better off using a temporary file rather than reading multiple lines into a bash variable.

temp_file=$(mktemp)

if [ 0 -eq $? ]; then
  trap 'rm -f -- "${temp_file}"' 0
else
  echo "Unable to create temporary file!"
  exit 1
fi

# Fill temporary file.
pid_foo_process=$(pgrep foo_process)
grep "$1" /data/foo_process-0210.${pid_foo_process}.log | perl -nle 'print /(primary book \w+:\w+)/ ' >>"${temp_file}"
grep "$1" /data/foo_process-0210.${pid_foo_process}.log | perl -nle 'print /primary book (\w+:\w+)/ ' >>"${temp_file}"
grep "$1" /data/foo_process-0210.${pid_foo_process}.log | perl -nle 'print if /(subscription for \w+:\w+.*)/ ' >>"${temp_file}"

# Print contents.
while read line
do
  echo "${line}"
done < "${temp_file}"

Note that the trap statement at the top will automatically delete the temporary file when the script is finished.


4 Comments

Significantly improved. I'd suggest using read -r, absent a specific and compelling reason to do otherwise. Also, unless you clear IFS during the read operation, trailing whitespace will be stripped.
...and if you want to be able to reliably print lines that only contain, say, -n or -E, then you're better off using printf '%s\n' "$line" rather than echo at all.
I'd also suggest putting trap 'rm -f -- "$temp_file"' 0 up at the top, instead of having a rm at the end; that way if the script exits from some different point rather than running through to the end, the temporary file still gets deleted.
@CharlesDuffy - The formatting of the while loop reflects the code in the question, however your point about the trap is a good one so I'll add that to my answer. Thanks!
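
Taken together, the suggestions in these comments (IFS= with read -r, printf instead of echo, and a trap for cleanup) could look roughly like the sketch below. It assumes a hypothetical two-line sample file rather than the real log, and the variable out exists only so the result can be inspected.

```shell
#!/bin/bash
# Hypothetical sample file standing in for the filtered log output.
temp_file=$(mktemp) || exit 1
trap 'rm -f -- "$temp_file"' 0

# Two awkward lines: a bare "-n", and one with padding and a backslash.
printf '%s\n' '-n' '  a\tb  ' > "$temp_file"

# IFS= keeps leading and trailing whitespace, -r keeps backslashes literal,
# and printf prints lines such as "-n" that echo could treat as an option.
out=$(while IFS= read -r line; do
  printf '[%s]\n' "$line"
done < "$temp_file")

printf '%s\n' "$out"
```

The brackets in the format string are only there to make the preserved whitespace visible; for real output, printf '%s\n' "$line" would be enough.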
awk 'BEGIN { RS=" ?STATS "} NR > 1 {print "STATS "$0}'

RS=" ?STATS " sets the record separator: it matches any STATS followed by a space and preceded by zero or one space. Because the input starts with STATS, this separator produces an empty first record. NR is the record number, so NR > 1 skips that first record.
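
One caveat: a multi-character RS is treated as a regular expression by gawk, but POSIX leaves the behaviour unspecified, so the one-liner may not work with every awk. As a hedged alternative (a different technique from the one above, not a correction of it), gsub can insert the newlines instead. The sketch below assumes a hypothetical two-record sample string; the names joined and split are illustrative.

```shell
#!/bin/bash
# Hypothetical one-line sample in the same shape as the collapsed output.
joined='STATS 10/15 08:03:09.391048 32978 (0)SB: subscription for APA:T added STATS 10/15 08:03:09.391164 32978 (0)SB: subscription for APA:P added'

# Replace every " STATS " that follows other text with a newline plus
# "STATS ", so each record starts on its own line. The leading STATS has
# no space before it and is left alone.
split=$(printf '%s\n' "$joined" | awk '{ gsub(/ STATS /, "\nSTATS "); print }')

printf '%s\n' "$split"
```

This relies only on gsub and ordinary string escapes, which should be available in any POSIX awk.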
