I have a list of IDs like so:

11002
10995
48981

And a tab delimited file like so:

11002   Bacteria;
10995   Metazoa

I am trying to delete every line in the tab-delimited file that contains one of the IDs from the ID list file. For some reason the following command does not work: it returns the complete tab-delimited file with no lines removed:

 grep -v -f ID_file.txt tabdelimited_file.txt > New_tabdelimited_file.txt

I also tried numerous other combinations with grep, but I am drawing a blank.

Any idea why this is failing?

Any help would be greatly appreciated.

1 Answer

Since you tagged this with awk, here is one way of doing it:

awk 'BEGIN{FS=OFS="\t"}NR==FNR{ids[$1]++;next}!($1 in ids)' idFile tabFile > new_tabFile

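BTW, your grep command is correct. Just double-check that your files do not have Windows (CRLF) line endings. If the ID file does, every pattern ends in an invisible carriage return, so none of them matches a line in the other file.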
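To illustrate the line-ending point, here is a minimal sketch that reproduces the symptom and shows an awk variant that tolerates CRLF input. The scratch file names and the extra `20001` row are made up for the demo, and it assumes a Linux-style grep that treats the carriage return as an ordinary character.

```shell
#!/bin/sh
# Hypothetical scratch files standing in for the ID list and the tab-delimited file.
tmpdir=$(mktemp -d)

# ID list with Windows (CRLF) line endings, as if it had been transferred from Windows.
printf '11002\r\n10995\r\n48981\r\n' > "$tmpdir/ids.txt"

# Tab-delimited data with Unix (LF) line endings; the 20001 row is invented for the demo.
printf '11002\tBacteria;\n10995\tMetazoa\n20001\tFungi\n' > "$tmpdir/table.txt"

# Plain grep: each pattern is really "11002<CR>", which occurs nowhere in table.txt,
# so -v keeps every line.
grep_out=$(grep -v -f "$tmpdir/ids.txt" "$tmpdir/table.txt")

# awk: strip a trailing carriage return from every record of both files
# before storing or looking up the first field.
awk_out=$(awk 'BEGIN{FS=OFS="\t"} {sub(/\r$/,"")} NR==FNR{ids[$1];next} !($1 in ids)' \
    "$tmpdir/ids.txt" "$tmpdir/table.txt")

printf 'grep kept:\n%s\n\nawk kept:\n%s\n' "$grep_out" "$awk_out"

rm -rf "$tmpdir"
```

With this input, grep should keep all three rows while the awk variant keeps only the `20001` row. Stripping the carriage return inside awk is just one option; converting the files once up front works equally well.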

5 Comments

Thanks for your answer, but this doesn't seem to work either: same result (nothing gets removed from the tab-delimited text file). I'm not sure how to check whether my files are correctly formatted, though.
@user2600287 Run cat -vet on your file. If you see ^M at the end of each line, you need to convert the file to Unix format. You can use dos2unix for that.
Thank you for the clarification. Making a new text file in Linux rather than using the one transferred from Windows also solved the problem, so you were right.
You could have fixed the files with dos2unix, sed -i 's/\r$//', or tr -d '\r'.
I like your awk solution; it worked on a very large file, whereas the grep -v solution did not.
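The check-and-convert steps from the comments can be sketched as a small script. The scratch file names and the `110021` row are hypothetical. The `-F` and `-w` options are an addition that is not part of the original command, and `-w` is a GNU/BSD grep extension rather than POSIX.

```shell
#!/bin/sh
# Hypothetical scratch files; the 110021 row is invented to show partial-ID matches.
tmpdir=$(mktemp -d)
printf '11002\r\n10995\r\n48981\r\n' > "$tmpdir/ids.txt"
printf '11002\tBacteria;\n10995\tMetazoa\n110021\tFungi\n' > "$tmpdir/table.txt"

# Count the lines that contain a carriage return; anything above 0 suggests CRLF endings.
cr=$(printf '\r')
crlf_lines=$(grep -c "$cr" "$tmpdir/ids.txt")
echo "lines with a carriage return: $crlf_lines"

# Write a converted copy instead of editing in place, so the original stays untouched.
tr -d '\r' < "$tmpdir/ids.txt" > "$tmpdir/ids.unix.txt"

# -F treats each ID as a fixed string and -w requires a whole-word match,
# so the ID 11002 does not also remove the 110021 row.
filtered=$(grep -v -w -F -f "$tmpdir/ids.unix.txt" "$tmpdir/table.txt")
printf '%s\n' "$filtered"

rm -rf "$tmpdir"
```

Even with `-w`, grep matches an ID anywhere on the line, so a row whose second column happens to contain an ID would still be dropped. The awk approach compares only the first column.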
