
I am trying to grep lines matching some pattern and then trying to print those matched lines.

#!/bin/bash

file=/path/to/some/file
pattern=socket
if [ -f $file ]; then
    lines=`grep -i "$pattern" $file`
    # Case 1
    for x in $lines; do   # <--- isn't this an array
        echo "$x"
    done
    # Case 2
    while read -r line_a; do
        echo "$line_a"
    done <<< "$lines"
fi

Output:
Case 1: Instead of complete line, individual words from those lines are printed on each new line.
Case 2: Individual lines are printed.

Question:
Why doesn't case 1 print the whole line on one line instead of printing individual words from that line on each new line? Isn't $lines an array of strings (lines in my case) ?

    No, it is not an array; by using backticks you capture the output of grep as one big string. I am guessing the for loop treats whitespace as the separator, so each word is treated as an element. Commented Apr 5, 2015 at 1:45

2 Answers

3

Isn't $lines an array of strings (lines in my case)?

No; $lines is a scalar string variable that contains the entire output captured from command grep -i "$pattern" $file - in other words: a single string comprising potentially multiple lines.

Why doesn't case 1 print the whole line on one line instead of printing individual words from that line on each new line?

Because you're referencing variable $lines unquoted, which means that it is subject to word splitting (among other so-called shell expansions).

Word splitting means that the expanded value is split into tokens on the characters in $IFS (by default space, tab, and newline), so line breaks are treated like any other whitespace, and each token is passed separately to the for loop.
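
To make the effect concrete, here is a minimal sketch that counts loop iterations with and without quotes. The sample string is a hypothetical stand-in for grep's output, and the script assumes the default $IFS:

```bash
#!/usr/bin/env bash
# Hypothetical sample data standing in for grep's output: two lines, three words in total.
lines=$'open socket\nclosed'

# Unquoted: the value is split on the characters in $IFS (space, tab, newline by default).
unquoted=0
for x in $lines; do
    unquoted=$((unquoted + 1))
done

# Quoted: no splitting takes place, so the loop body runs once with the whole string.
quoted=0
for x in "$lines"; do
    quoted=$((quoted + 1))
done

echo "unquoted: $unquoted iteration(s), quoted: $quoted iteration(s)"
# prints: unquoted: 3 iteration(s), quoted: 1 iteration(s)
```

Neither form iterates once per line: the unquoted loop sees three words, the quoted loop sees one multi-line string.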


With a single input string, even if you set $IFS to $'\n', there is no safe way to iterate over its lines with for, because the lines are still subject to pathname expansion (globbing); i.e., if a line contains a substring that happens to be a valid glob (filename pattern, e.g., *), it will be expanded to the matching filenames.
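
The globbing problem can be reproduced with a small sketch. The scratch directory, the file names a.txt and b.txt, and the sample string are all made up for illustration:

```bash
#!/usr/bin/env bash
# Work in a scratch directory that holds two hypothetical files.
tmpdir=$(mktemp -d)
cd "$tmpdir" || exit 1
touch a.txt b.txt

# Hypothetical captured output: the second line consists solely of the glob character *.
lines=$'first line\n*'

old_ifs=$IFS
IFS=$'\n'              # split on newlines only
items=()
for x in $lines; do    # still unquoted, so each line is still subject to globbing
    items+=("$x")
done
IFS=$old_ifs

printf '%s\n' "${items[@]}"
# The "*" line has been replaced by the file names a.txt and b.txt.

cd / && rm -rf "$tmpdir"
```

The first line survives intact because splitting now happens on newlines only, but the second line no longer contains what grep would have printed.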

Using an array of lines in a for loop does work, but requires that it be populated with unmodified input lines; using lines=($(grep -i "$pattern" "$file")) to populate the array is NOT an option, for the same reasons as stated above.


You have two choices, both of which use a process substitution to capture the grep command's output:

(a) If you truly need to read all lines up front into memory, read them robustly into an array as follows:

IFS=$'\n' read -d '' -ra lines < <(grep -i "$pattern" "$file") 

In bash 4+, you can use readarray -t lines ... instead.

Then process them in a for loop as follows:

 for line in "${lines[@]}"; do # double quotes prevent word splitting and globbing
    echo "$line"
 done 
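
As a rough check of option (a), the following sketch feeds hypothetical sample lines through printf instead of grep. The variable names input and lines_v4 are assumptions made for the demonstration. Note that read returns a non-zero status here because it reaches the end of input without finding the NUL delimiter, so `|| true` is appended in case the script runs under set -e:

```bash
#!/usr/bin/env bash
# Hypothetical input: one line has leading spaces and one is a bare glob character.
input=$'  indented socket\n*\nlast socket line'

# Variant from option (a); the non-zero exit status at end of input is expected.
IFS=$'\n' read -d '' -ra lines < <(printf '%s\n' "$input") || true

printf 'read stored %d lines\n' "${#lines[@]}"
for line in "${lines[@]}"; do    # quoted, so no word splitting and no globbing
    printf '[%s]\n' "$line"
done

# bash 4+ alternative; guarded so that older versions skip it.
if ((BASH_VERSINFO[0] >= 4)); then
    readarray -t lines_v4 < <(printf '%s\n' "$input")
    printf 'readarray stored %d lines\n' "${#lines_v4[@]}"
fi
```

One difference to keep in mind: the read variant drops empty lines, because newline is treated as IFS whitespace, whereas readarray -t keeps them as empty elements.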

(b) Otherwise, use a while loop to directly read grep's output line by line:

while IFS= read -r line; do
    echo "$line"
done < <(grep -i "$pattern" "$file")
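
To see what IFS= and -r contribute in option (b), here is a small sketch comparing a plain read with IFS= read -r on one hypothetical line that has leading spaces and backslashes:

```bash
#!/usr/bin/env bash
# Hypothetical line with three leading spaces and two backslashes.
input='   path\to\socket'

# Plain read: trims leading whitespace and treats each backslash as an escape character.
read plain <<< "$input"

# IFS= read -r: stores the line exactly as it was.
IFS= read -r exact <<< "$input"

printf 'plain: [%s]\n' "$plain"    # prints: plain: [pathtosocket]
printf 'exact: [%s]\n' "$exact"    # prints: exact: [   path\to\socket]
```

Because IFS= is a prefix assignment, it applies only to that single read command and leaves $IFS unchanged for the rest of the script.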

5 Comments

Thanks for the detailed explanation. I recently started working with bash; where can I learn when to use single/double quotes, backticks, curly braces, square brackets, parentheses, etc.?
stackoverflow.com/a/23140961/45375 will give you a quick intro to quoting; the page on shell expansions discusses all expansions (substitutions) that the shell performs. mywiki.wooledge.org/BashGuide is a great Bash resource in general. Also, don't forget man bash, which contains all relevant info, but is dense and not easy to read.
Can you please explain (b): what value is IFS being set to? I am able to read the lines separately, but in those lines every occurrence of the character "n" is being replaced by a " " (space).
IFS= (followed by a space) means that IFS is set to the empty string for that read command only. This deactivates word splitting and the trimming of leading and trailing whitespace, meaning that each input line is read unmodified, as a whole, into $line.
I have no explanation for "n" being replaced by a space. You could create a new question with the code in question and more details.
1

You are currently capturing the output using backticks, which store the entire output as one big string. If you want to capture it as an array, use the following notation:

lines=($(grep -i "$pattern" $file))

However, by default the shell splits on whitespace (the characters in IFS), so each array element will be a single word and not an entire line from the grep output. You can circumvent this by (temporarily) setting IFS to split on newline characters only. The whole solution would look like the following:

IFS=$'\n'
lines=($(grep -i "$pattern" $file))
for x in ${lines[@]}; do
    echo $x
done

Note that you have now changed IFS for the rest of the shell session, and you will probably want to reset it to its old value. As you can see, this approach is very likely not optimal for your problem, but I posted it here to answer your original question.
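
One possible way to keep the IFS change contained is to do the splitting inside a function with a local IFS, and to switch off globbing while the unquoted expansion runs. This is only a sketch: split_lines and split_result are hypothetical names, and the sample string stands in for grep's output.

```bash
#!/usr/bin/env bash
# split_lines is a hypothetical helper, not part of the answer above.
# It fills the global array split_result with the lines of its first argument.
split_lines() {
    local IFS=$'\n'        # newline-only splitting, confined to this function
    local restore_glob=0
    case $- in
        *f*) ;;                          # globbing is already disabled
        *)   restore_glob=1; set -f ;;   # disable globbing for the expansion below
    esac
    split_result=($1)      # intentionally unquoted: split on newlines, no globbing
    if [ "$restore_glob" -eq 1 ]; then
        set +f
    fi
    return 0
}

split_lines $'open socket\n*\nclose socket'
printf '%d elements\n' "${#split_result[@]}"
printf '[%s]\n' "${split_result[@]}"
```

After the call, $IFS in the calling shell is unchanged and the "*" line is stored literally. Empty lines are still dropped with this technique, so reading into an array with read or readarray remains the more robust choice.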

4 Comments

lines=($(grep -i "$pattern" $file)) still doesn't work. It is storing whitespace-separated tokens as elements of the array.
Are you setting IFS to a newline character properly?
Yes, that was the problem. Thanks!
If you don't double-quote ${lines[@]}, it is subject to word splitting again - while this does no harm in this case - given that $IFS is still set to a \n - it is unnecessary. More importantly, however, using lines=($(grep -i "$pattern" $file)) will invariably subject the lines output by the grep command to pathname expansion (globbing), which is typically undesired.
