0

I have a text file with the following structure:

text1;text2;text3;text4
...

I need to write a script that gets 2 arguments: the column we want to search in and the content we want to find.

So the script should output only the lines (WHOLE LINES!) where the content (arg2) is found in column x (arg1).

I tried with egrep and sed, but I'm not experienced enough to finish it. I would appreciate some guidance...

5 Answers

4

Given your added information of needing to output the entire line, awk is easiest:

awk -F';' -v col="$col" -v pat="$val" '$col ~ pat' "$input"

Explaining the above: the -v options set awk variables, so the shell values never have to be spliced into the body of the awk script and you don't need to worry about quoting issues there. Pre-POSIX versions of awk won't understand the -v option, but will recognize the same variable assignments without it when they are listed as arguments after the program. The -F option sets the field separator. In the body, we are using a pattern with the default action (which is print, so the whole matching line is output). The pattern uses the variables we set with -v for both the column ($ there is awk's "field index" operator, not a shell variable) and the pattern (pat is used as a regular expression, so it can indeed hold an awk-style regex).
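
The question asks for a script that takes the column and the content as two arguments. A minimal sketch of such a wrapper around the awk command follows. The function name find_in_column, the third argument for the file name, and the check on the column number are illustrative assumptions, not part of the original answer.

```shell
#!/bin/sh
# find_in_column COLUMN PATTERN FILE
# Hypothetical helper: prints every whole line of FILE whose COLUMN-th
# semicolon-separated field matches the awk regex PATTERN.
find_in_column() {
    col=$1
    pat=$2
    file=$3

    # Assumption: reject anything that is not a plain number, and a bare 0
    # (in awk, $0 is the whole line, not a column).
    case $col in
        ''|*[!0-9]*|0) echo "column must be a positive integer" >&2; return 2 ;;
    esac

    # -v passes the shell values in as awk variables, so they are never
    # spliced into the awk program text itself.
    awk -F';' -v col="$col" -v pat="$pat" '$col ~ pat' "$file"
}
```

Called as find_in_column 2 '^text2$' file.txt, this should print the lines whose second field is exactly text2. Without the anchors the pattern matches anywhere inside the field. For a literal comparison instead of a regex match, the awk body can be changed to '$col == pat'.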

2 Comments

mawk is one variant of awk, yes. It should be compatible with what I wrote; some awks won't understand -v, but you can just list the variable settings without the -v prefix in that case (pre-POSIX syntax for command line variable settings).
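To use a shell variable (by value) in awk I often do: awk -F';' '$'"$COL"' ~ /'"$PAT"'/' "$INPUT" . Note that $COL and $PAT are shell variables whose values are spliced into the awk program text, so awk sees them as literal values. The pattern has to sit between slashes, otherwise awk parses it as program text rather than as a regex. This method is less readable, and it breaks if the pattern contains a slash, but it is more universal across many awk implementations.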
1
cut -d';' -f column_num text_file.txt | grep pattern

It prints only the column that is matched and not the entire line. Let me think about whether there is a simple solution for that.

1 Comment

But I need to output the whole line, not just the found column.
1

Python

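 #!/usr/bin/env python3
 column = 1  # zero-based index of the column to search (1 is the second column)
 value = "the data you're looking for"
 with open("your file", "r") as source:
     for line in source:
         fields = line.strip().split(';')
         # skip lines that are too short to have this column
         if len(fields) > column and fields[column] == value:
             print(line, end="")  # line already ends with a newline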

0

There's also a solution with egrep (the same as grep -E). It's not a very beautiful one, and it assumes exactly four columns per line, but it works:

egrep "^([^;]*;){$((col - 1))}$value(;[^;]*){$((4 - col))}$" filename

or even shorter:

egrep "^([^;]*;){$((col - 1))}$value(;|$)" filename
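
To make the construction of the shorter expression easier to follow, here is a small sketch that wraps it in a function. The name grep_column and its argument order are assumptions made for illustration. It uses [^;]* so that empty fields before the target column are allowed, and the value is still interpreted as a regex, so characters such as . or * in it are not literal.

```shell
#!/bin/sh
# grep_column COLUMN VALUE FILE
# Hypothetical helper: prints the lines of FILE whose COLUMN-th
# semicolon-separated field is VALUE (VALUE is treated as a regex).
grep_column() {
    col=$1
    value=$2
    file=$3

    # ^([^;]*;){col-1}  skips the col-1 fields before the target column
    # VALUE             must then fill the whole target field, because
    # (;|$)             requires a separator or the end of the line next
    grep -E "^([^;]*;){$((col - 1))}${value}(;|\$)" "$file"
}
```

For example, grep_column 2 text2 file.txt builds the expression ^([^;]*;){1}text2(;|$), which skips one field and then requires text2 to be the entire second field.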

0
grep -B1 -i "string from previous line" |grep -iv 'check string from previous line' |awk -F" " '{print $1}'

This will print your line.
