I have a directory of files that I want to process one by one. The output for each file looks like this:

==== S=721 I=47 D=654 N=2964 WER=47.976% (1422)

Then I want to calculate the average percentage (column 6) by piping the output to AWK. I would prefer to do this all in one script and wrote the following code:

for f in $dir; do
    echo -ne "$f "
    process $f
done | awk '{print $7}' | awk -F "=" '{sum+=$2}END{print sum/NR}'

When I run this several times, I often get different results although in my view nothing really changes. The result is almost always incorrect though.

However, if I only put the for loop in the script and pipe to AWK on the command line, the result is always the same and correct.

What is the difference and how can I change my script to achieve the correct result?

  • Perhaps this is some buffering issue. Just redirect the output to a file and then run awk on it. Or you can use cut and bc to compute the average. Commented Oct 18, 2013 at 9:24
  • Try explaining the problem again with some sample data, as it's not easy to understand what your script is doing. Commented Oct 18, 2013 at 9:45
  • Do any of your filenames contain spaces? Does your script define $dir? Commented Oct 18, 2013 at 10:25
  • It's not clear what the problem is. Can you produce an example of the input, along with the wrong output and the expected output? Commented Oct 18, 2013 at 12:42

2 Answers

1

I'm guessing a little about what you're trying to do; without more details it's hard to say exactly what is going wrong.

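for f in $dir; do
    unset TEMPVAR
    echo -ne "$f "
    # The output of process is parsed on its own here (no file name in front),
    # so given the sample line the WER=... field is field 6, not 7
    TEMPVAR=$(process "$f" | awk '{print $6}')
    ARRAY+=($TEMPVAR)
done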

I would append all your values to an array inside your for loop. After the loop, all the extracted fields are in the array (expand it with "${ARRAY[@]}"; a plain $ARRAY gives only the first element). From there it should be easy to calculate the average value, using whatever tool you like.

This will also help you troubleshoot. If the element count ${#ARRAY[@]} is lower than the number of files, you will know that the loop is terminating early or that some run produced no output.
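
For the averaging step, here is a minimal sketch. It assumes every array entry has the form WER=47.976%, as in the sample line from the question; average_wer is a hypothetical helper name, and the two sample values are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: average the numeric part of entries such as "WER=47.976%".
average_wer() {
    # Print one entry per line, then split each line on "=".
    # awk converts "47.976%" to the number 47.976 (the trailing "%" is ignored).
    printf '%s\n' "$@" |
        awk -F '=' 'NF >= 2 { sum += $2; n++ } END { if (n) printf "%.3f\n", sum / n }'
}

# Made-up sample values standing in for the array filled by the loop.
ARRAY=("WER=47.976%" "WER=52.024%")
average_wer "${ARRAY[@]}"   # prints 50.000
```

Counting the entries in awk (n) instead of relying on NR means an empty or malformed entry does not skew the divisor.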

0
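# Extract the WER percentage from every line that has one
# (assumes the files in the current directory hold the saved output of process)
Percs=$(sed -nr 's/.*WER=([[:digit:].]*).*/\1/p' *)

# The divisor: the number of extracted values
Lines=$(wc -l <<< "$Percs")

# Join the values on one line, separated by spaces
P=$(echo $Percs)

# Replace the spaces with "+" and let bc compute the average.
# Run it once without "| bc" to see the generated expression.
echo "scale=3; (${P// /+})/$Lines" | bc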
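
To see what ends up being handed to bc, here is a small sketch with made-up values. build_bc_expr is a hypothetical helper that repeats the same steps on a string instead of on files; it is only meant to show the generated expression.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the bc expression from newline-separated numbers.
build_bc_expr() {
    local percs=$1
    local lines p
    # Count the values; the arithmetic expansion strips padding that some wc versions add.
    lines=$(( $(wc -l <<< "$percs") ))
    # The unquoted expansion collapses the newlines into single spaces.
    p=$(echo $percs)
    # Turn every space into "+" to get a sum, then divide by the count.
    echo "scale=3; (${p// /+})/$lines"
}

build_bc_expr $'47.976\n52.100\n40.000'
# prints: scale=3; (47.976+52.100+40.000)/3
```

Piping that line into bc gives the average to three decimal places.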
