
Using a bash script, I'm trying to iterate through a text file that only has around 700 words, line-by-line, and run a case-insensitive grep search in the current directory using that word on particular files. To break it down, I'm trying to output the following to a file:

  1. Append a newline to a file, then the searched word, then another newline
  2. Append the results of the grep command using that search
  3. Repeat steps 1 and 2 until all words in the list are exhausted

So for example, if I had this list.txt:

search1
search2

I'd want the results.txt to be:

search1:
grep result here

search2:
grep result here

I've found some answers throughout the stack exchanges on how to do this and have come up with the following implementation:

#!/usr/bin/bash

while IFS = read -r line;
do 
    "\n$line:\n" >> "results.txt";
    grep -i "$line" *.in >> "results.txt";
done < "list.txt"

For some reason, however, this (and the numerous variants I've tried) isn't working. It seems trivial, but it's been frustrating me beyond belief. Any help is appreciated.

  • is grep result here only the list of file names that contain the pattern searchX? Do you also need the line number? Commented Mar 16, 2018 at 7:17
  • shellcheck.net is a good option for quickly finding issues in shell scripts. Commented Mar 16, 2018 at 8:11
  • echo -e to interpret the \n newline :) Commented Mar 16, 2018 at 8:21

3 Answers


Your script would work if you changed it to:

while IFS= read -r line; do
    printf '\n%s:\n' "$line"
    grep -i "$line" *.in
done < list.txt > results.txt

but it'd be extremely slow. See https://unix.stackexchange.com/questions/169716/why-is-using-a-shell-loop-to-process-text-considered-bad-practice for why you should think long and hard before writing a shell loop just to manipulate text. The standard UNIX tool for manipulating text is awk:

awk '
NR==FNR { words2matches[$0]; next }
{
    for (word in words2matches) {
        if ( index(tolower($0),tolower(word)) ) {
            words2matches[word] = words2matches[word] $0 ORS
        }
    }
}
END {
    for (word in words2matches) {
        print word ":" ORS words2matches[word]
    }
}
' list.txt *.in > results.txt

The above is untested of course since you didn't provide sample input/output we could test against.
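
One thing to be aware of with the awk script above is that a `for (word in ...)` loop visits the words in an unspecified order, so the sections in results.txt may not follow the order of list.txt. The following is a minimal sketch of a variant that keeps list order by storing the words in a numerically indexed array. It is a variation for illustration, not the answer's exact method: the array names (words, lower, found), the FILENAME prefix that mimics grep's multi-file output, and the sample files are all assumptions. Like the original, it matches literal substrings rather than regular expressions.

```shell
#!/bin/sh
# Sketch: list-order-preserving variant of the awk approach.
# All file names and contents here are made up for the demo.
tmp=$(mktemp -d) && cd "$tmp" || exit 1

printf 'search1\nsearch2\n' > list.txt
printf 'abc\nSEARCH1 here\n' > a.in
printf 'search2 there\nsearch1 again\n' > b.in

awk '
NR == FNR {                       # first file: the word list
    if ($0 != "") { words[++n] = $0; lower[n] = tolower($0) }
    next
}
{                                 # remaining files: the text to search
    text = tolower($0)
    for (i = 1; i <= n; i++)
        if (index(text, lower[i]))            # literal, case-insensitive match
            found[i] = found[i] FILENAME ":" $0 ORS
}
END {
    for (i = 1; i <= n; i++)                  # numeric loop keeps list order
        printf "\n%s:\n%s", words[i], found[i]
}
' list.txt *.in > results.txt

cat results.txt
```

Words that match nothing still get a heading with an empty body, which is also what the shell loop produces.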

2 Comments

One problem I ran into was that the $line variable included a newline, but found a way to format that out. Thanks for the detailed solution!
It is not possible for the $line variable to include a newline since the shell loop is reading one line at a time into that variable. It might contain a carriage return e.g. if your input file was generated on a Windows machine, and you can remove those by running dos2unix or similar on the file. Glad it helped!
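
If the word list does have Windows line endings and dos2unix is not at hand, the carriage return can also be stripped inside the loop. A minimal sketch, assuming a POSIX shell; the cr variable, the empty-line check, and the sample files are additions for illustration and are not part of the original script:

```shell
#!/bin/sh
# Sketch: strip a trailing carriage return from each word before searching.
# The sample files below are hypothetical, created only for the demo.
tmp=$(mktemp -d) && cd "$tmp" || exit 1

printf 'search1\r\nsearch2\r\n' > list.txt     # CRLF line endings
printf 'abc\nSEARCH1 here\n' > a.in
printf 'search2 there\n' > b.in

cr=$(printf '\r')                  # a literal carriage return

while IFS= read -r line; do
    line=${line%"$cr"}             # drop one trailing CR, if present
    [ -n "$line" ] || continue     # skip empty lines (an empty pattern matches every line)
    printf '\n%s:\n' "$line"
    grep -i -- "$line" *.in
done < list.txt > results.txt

cat results.txt
```

Without the stripping step, grep would look for "search1" followed by a carriage return and find nothing in files that use Unix line endings.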

Possible problems:

  1. bash path - use /bin/bash path instead of /usr/bin/bash
  2. blank spaces - remove ' ' after IFS
  3. echo - use -e option for handling escape characters (here: '\n')
  4. semicolons - not required at end of line

Try following script:

#!/bin/bash

while IFS= read -r line; do
    echo -e "\n$line:" >> "results.txt"
    grep -i "$line" *.in >> "results.txt"
done < "list.txt"
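
The effect of the stray spaces around IFS can be reproduced in isolation. The sketch below is only a demonstration, with a made-up list.txt and hypothetical variable names (status_spaces, status_fixed); it records the exit status of both spellings:

```shell
#!/bin/sh
# Sketch: compare "IFS = read" with "IFS= read".
tmp=$(mktemp -d) && cd "$tmp" || exit 1
printf 'search1\n' > list.txt

# With spaces, the shell runs a command named "IFS" with the arguments
# "=", "read", "-r", "line"; that normally fails as "command not found".
status_spaces=0
{ IFS = read -r line < list.txt; } 2>/dev/null || status_spaces=$?

# Without spaces, IFS is emptied only for the duration of read.
status_fixed=0
IFS= read -r line < list.txt || status_fixed=$?

printf 'with spaces: exit %s\nfixed: exit %s, line=%s\n' \
    "$status_spaces" "$status_fixed" "$line"
```

In the original loop the failing command is the while condition itself, so the body never runs and results.txt stays empty.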

You do not even need to write a bash script for this purpose:

INPUT FILES:

$ more file?.in
::::::::::::::
file1.in
::::::::::::::
abc
search1
def
search3
::::::::::::::
file2.in
::::::::::::::
search2
search1
abc
def
::::::::::::::
file3.in
::::::::::::::
abc
search1
search2
def
search3

PATTERN FILE:

$ more patterns 
search1
search2
search3

CMD:

$ grep -inf patterns file*.in | sort -t':' -k3 | awk -F':' 'BEGIN{OFS=FS}{if($3==buffer){print $1,$2}else{print $3; print $1,$2}buffer=$3}'

OUTPUT:

search1
file1.in:2
file2.in:2
file3.in:2
search2
file2.in:1
file3.in:3
search3
file1.in:4
file3.in:5

EXPLANATIONS:

  • grep -inf patterns file*.in searches every file*.in for all the patterns listed in the patterns file (the -f option); -i makes the match case-insensitive and -n adds the line numbers
  • sort -t':' -k3 sorts the output on the third colon-separated field so that lines for the same pattern are grouped together
  • awk -F':' 'BEGIN{OFS=FS}{if($3==buffer){print $1,$2}else{print $3; print $1,$2}buffer=$3}' then prints the layout you want, using : as both the field separator and the output field separator; the buffer variable holds the previous pattern (3rd field), and the pattern is printed whenever it changes ($3!=buffer)

1 Comment

The problem with that is that grep isn't outputting the string that it was searching for; it's outputting the line that matched the string it was searching for. They just happen to be the same thing in your sample input, but try changing search1 to search1stuff in one of your input files to see what I mean.
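
One possible way around the issue raised in this comment is to have grep print only the matched text with -o and to group on that instead of on the whole line. This is a rough sketch, not the answer author's command: it assumes a grep that supports -o (GNU and BSD grep do), treats the patterns as fixed strings via -F, and assumes that neither file names nor patterns contain a colon. The sample files are made up, with search1stuff included to show the difference.

```shell
#!/bin/sh
# Sketch: group by the matched pattern text instead of the whole matching line.
# The sample files are hypothetical, created only for the demo.
tmp=$(mktemp -d) && cd "$tmp" || exit 1

printf 'search1\nsearch2\nsearch3\n' > patterns
printf 'abc\nsearch1stuff\ndef\nsearch3\n' > file1.in
printf 'Search2\nsearch1\n' > file2.in

# 1. grep emits file:line:matched-text (only the matched part, thanks to -o)
# 2. awk lowercases the match and moves it to the front: pattern:file:line
# 3. sort groups by pattern, then file name, then line number
# 4. awk prints each pattern once, followed by its file:line hits
output=$(
    grep -i -n -o -F -f patterns file*.in |
    awk -F: '{ print tolower($3) ":" $1 ":" $2 }' |
    LC_ALL=C sort -t: -k1,1 -k2,2 -k3,3n |
    awk -F: '$1 != prev { print $1; prev = $1 } { print $2 ":" $3 }'
)
printf '%s\n' "$output"
```

Lowercasing the match is needed because, with -i, grep prints the text as it appears in the file, so Search2 and search2 would otherwise end up in separate groups.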
