Iterative Bash Script Bug

Question

Using a bash script, I'm trying to iterate through a text file that only has around 700 words, line-by-line, and run a case-insensitive grep search in the current directory using that word on particular files. To break it down, I'm trying to output the following to a file:

1. Append a newline to a file, then the searched word, then another newline
2. Append the results of the grep command using that search
3. Repeat steps 1 and 2 until all words in the list are exhausted

So for example, if I had this list.txt:

search1
search2

I'd want the results.txt to be:

search1:
grep result here

search2:
grep result here

I've found some answers throughout the stack exchanges on how to do this and have come up with the following implementation:

#!/usr/bin/bash

while IFS = read -r line;
do 
    "\n$line:\n" >> "results.txt";
    grep -i "$line" *.in >> "results.txt";
done < "list.txt"

For some reason, however, this (and the numerous variants I've tried) isn't working. It seems trivial, but it's been frustrating me beyond belief. Any help is appreciated.

Is "grep result here" only the list of file names that contain the pattern searchX? Do you also need the line number?
– Allan, Commented Mar 16, 2018 at 7:17
shellcheck.net is a good way to quickly find issues in a shell script.
– Sundeep, Commented Mar 16, 2018 at 8:11

Ed Morton · Accepted Answer · 2018-03-16 12:21:11Z

1

Your script would work if you changed it to:

while IFS= read -r line; do
    printf '\n%s:\n' "$line"
    grep -i "$line" *.in
done < list.txt > results.txt

but it'd be extremely slow. See https://unix.stackexchange.com/questions/169716/why-is-using-a-shell-loop-to-process-text-considered-bad-practice for why you should think long and hard before writing a shell loop just to manipulate text. The standard UNIX tool for manipulating text is awk:

awk '
NR==FNR { words2matches[$0]; next }
{
    for (word in words2matches) {
        if ( index(tolower($0),tolower(word)) ) {
            words2matches[word] = words2matches[word] $0 ORS
        }
    }
}
END {
    for (word in words2matches) {
        print word ":" ORS words2matches[word]
    }
}
' list.txt *.in > results.txt

The above is untested of course since you didn't provide sample input/output we could test against.
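Since the question came without sample data, one way to try the loop is against a small throwaway data set. The sketch below is only an illustration: the temporary directory, the file names a.in and b.in, and their contents are all made up, not taken from the question.

```shell
#!/usr/bin/env bash
# Build a throwaway data set in a temporary directory
# (all file names and contents here are invented for the demo).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' 'search1' 'search2' > list.txt
printf '%s\n' 'abc' 'Search1 here' 'def' > a.in
printf '%s\n' 'SEARCH2 there' 'search1 again' > b.in

# The loop from the answer: a heading per word, followed by grep's matches.
# Redirecting once after "done" opens results.txt a single time.
while IFS= read -r line; do
    printf '\n%s:\n' "$line"
    grep -i "$line" *.in
done < list.txt > results.txt

cat results.txt
```

Because grep is given more than one file, each match is prefixed with the name of the file it came from, for example a.in:Search1 here.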

edited Mar 16, 2018 at 12:21

answered Mar 16, 2018 at 12:07

Ed Morton


2 Comments

Anubis The Coding Nooby Puppy Over a year ago

One problem I ran into was that the $line variable included a newline, but found a way to format that out. Thanks for the detailed solution!

Ed Morton Over a year ago

It is not possible for the $line variable to include a newline since the shell loop is reading one line at a time into that variable. It might contain a carriage return e.g. if your input file was generated on a Windows machine, and you can remove those by running dos2unix or similar on the file. Glad it helped!
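If converting the file with dos2unix is not an option, a trailing carriage return can also be dropped inside the loop with a parameter expansion. This is a minimal sketch under the assumption that the list has Windows (CRLF) line endings; the temporary file and the cleaned array are hypothetical names used only for the demo.

```shell
#!/usr/bin/env bash
# Simulate a word list saved with Windows (CRLF) line endings.
list=$(mktemp)
printf 'search1\r\nsearch2\r\n' > "$list"

cleaned=()
while IFS= read -r line; do
    line=${line%$'\r'}      # remove one trailing carriage return, if present
    cleaned+=("$line")
done < "$list"

# Brackets make any stray invisible characters easy to spot.
printf '[%s]\n' "${cleaned[@]}"
rm -f "$list"
```

Lines without a carriage return pass through unchanged, so the same loop should also work on files with Unix line endings.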

Onkar Kamatkar · Accepted Answer · 2018-03-16 07:20:43Z

0

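Possible problems:

bash path - use /bin/bash instead of /usr/bin/bash if bash is not installed under /usr/bin on your system
blank spaces - remove the space between IFS and '='
echo - the quoted string needs a command in front of it; use echo with the -e option so escape sequences (here: '\n') are interpreted
semicolons - not required at the end of a line

Try the following script: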

#!/bin/bash

while IFS= read -r line; do
    echo -e "$line:\n" >> "results.txt"
    grep -i "$line" *.in >> "results.txt"
done < "list.txt"
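The effect of the space after IFS can be checked in isolation. In the sketch below, which feeds read from a here-string instead of list.txt, bash treats "IFS = read ..." as a call to a command named IFS, which normally does not exist, so the condition fails before read ever runs. The variable status_with_space is a made-up name for the demo.

```shell
#!/usr/bin/env bash
status_with_space=0

# With a space before '=', bash looks for a command literally named "IFS".
if IFS = read -r line <<< 'search1' 2>/dev/null; then
    echo "with space: read succeeded"
else
    status_with_space=$?
    echo "with space: failed with status $status_with_space"
fi

# Without the space, IFS= is a temporary assignment that applies to read only.
if IFS= read -r line <<< 'search1'; then
    echo "without space: read '$line'"
fi
```

In a while loop the failing condition means the body never executes, which would explain an empty results.txt with the original script.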

answered Mar 16, 2018 at 7:20

Onkar Kamatkar


Allan · Accepted Answer · 2018-03-16 08:52:55Z

0

You do not even need to write a bash script for this purpose:

INPUT FILES:

$ more file?.in
::::::::::::::
file1.in
::::::::::::::
abc
search1
def
search3
::::::::::::::
file2.in
::::::::::::::
search2
search1
abc
def
::::::::::::::
file3.in
::::::::::::::
abc
search1
search2
def
search3

PATTERN FILE:

$ more patterns 
search1
search2
search3

CMD:

$ grep -inf patterns file*.in | sort -t':' -k3 | awk -F':' 'BEGIN{OFS=FS}{if($3==buffer){print $1,$2}else{print $3; print $1,$2}buffer=$3}'

OUTPUT:

search1
file1.in:2
file2.in:2
file3.in:2
search2
file2.in:1
file3.in:3
search3
file1.in:4
file3.in:5

EXPLANATIONS:

grep -inf patterns file*.in searches all the file*.in files for every pattern listed in the patterns file (the -f option); -i makes the match case-insensitive and -n adds the line numbers
sort -t':' -k3 sorts the output on the 3rd column so that lines for the same pattern are grouped together
awk -F':' 'BEGIN{OFS=FS}{if($3==buffer){print $1,$2}else{print $3; print $1,$2}buffer=$3}' then prints the layout you want, using : as both the field separator and the output field separator; a buffer variable holds the current pattern (3rd field), and the pattern is printed whenever it changes ($3!=buffer)

answered Mar 16, 2018 at 8:52

Allan


1 Comment

Ed Morton Over a year ago

The problem with that is that grep isn't outputting the string that it was searching for; it's outputting the line that matched the string it was searching for. They just happen to be the same thing in your sample input, but try changing search1 to search1stuff in one of your input files to see what I mean.
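The difference described in this comment can be reproduced with a couple of made-up files. The sketch below also shows one possible workaround that is not part of the original answer: grep's -o option prints only the part of the line that matched. The temporary directory, file contents, and output file names are assumptions for the demo.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' 'search1' 'search2' > patterns
printf '%s\n' 'abc' 'search1stuff' > file1.in
printf '%s\n' 'search2' 'def' > file2.in

# Without -o, the third field is the whole matching line (search1stuff).
grep -i -n -f patterns file*.in > whole_lines.txt

# With -o, the third field is only the text that matched a pattern (search1).
grep -i -n -o -f patterns file*.in > matched_parts.txt

cat whole_lines.txt matched_parts.txt
```

Note that with -i the printed match keeps the capitalization found in the file, so grouping on the third field may still need the text lowercased first.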
