I'm trying to use awk to print the lines of file2 whose number in column 1 or column 5 appears in file1, but I'm getting a syntax error that I don't understand.

My input is:

file1.dat

1
3
4
6
8
13
14
25

etc...

file2.dat

2 GLU 1 - 3 ARG 2
24 ASP 2 - 12 LYS 1
3 ASP 1 - 25 ARG 2
7 LYS 2 - 17 GLU 2
18 ARG 1 - 13 GLU 2

etc...

In this case I want the output

2 GLU 1 - 3 ARG 2
3 ASP 1 - 25 ARG 2
18 ARG 1 - 13 GLU 2

I tried to do this with the following awk command:

awk -F 'NR==FNR{a[$1||$5]++;next} (a[$1||$5])' file1.dat file2.dat

but I get the error

awk: file1.dat
awk:      ^syntax error

Does anyone know what is causing this error? I have tried to put the file names into variables but that produces the same error.

    I am not sure how the script (or algorithm) is meant to work, but the syntax error occurs because "file1.dat" is being treated as awk code: the actual awk code was consumed by -F as the input field separator. If you remove the -F, this specific syntax error should go away, though I am not sure you will get the required output. Commented May 9, 2015 at 18:36

1 Answer

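You have supplied the -F flag, which sets the field separator and expects the separator as its next argument. Because you didn't supply one, awk takes your script as the field separator and then treats the following argument, file1.dat, as the script itself. That is why the syntax error points at file1.dat.

Either drop the -F (the default already splits on runs of spaces and tabs) or give it an explicit separator, e.g. awk -F '[ \t]+' '...' file1 file2.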
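As a minimal sketch of the command with -F dropped: removing -F clears the syntax error, but `a[$1||$5]` would still not give the requested output, because `$1||$5` is a logical OR that evaluates to 1 or 0 rather than to either field's value. The version below instead stores each number from file1.dat as an array key and tests columns 1 and 5 of file2.dat separately. The temporary directory and the `result` variable are assumptions made only so the example is self-contained; the data is the sample from the question.

```shell
# Scratch copies of the sample data from the question (paths are illustrative).
dir=$(mktemp -d)
printf '%s\n' 1 3 4 6 8 13 14 25 > "$dir/file1.dat"
printf '%s\n' \
  '2 GLU 1 - 3 ARG 2' \
  '24 ASP 2 - 12 LYS 1' \
  '3 ASP 1 - 25 ARG 2' \
  '7 LYS 2 - 17 GLU 2' \
  '18 ARG 1 - 13 GLU 2' > "$dir/file2.dat"

# No -F: the first argument is the program, and the default field
# separator already splits on whitespace.
# First file (NR==FNR): remember each number as a key of array a.
# Second file: print the line if column 1 or column 5 is one of those keys.
result=$(awk 'NR==FNR { a[$1]; next } ($1 in a) || ($5 in a)' \
  "$dir/file1.dat" "$dir/file2.dat")

printf '%s\n' "$result"
rm -r "$dir"
```

Using `$1 in a` rather than `a[$1]` in the second pattern checks membership without creating new array entries as a side effect.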
