Delete specific rows based on specific word in column

Question

I have very large tab-separated files, and I need to delete all rows where the word "TelePacific" appears in a specific column. In this case, that means all rows where TelePacific occurs in the 4th column. Here is an example input file:

7/18/13 10:06   0:00:09 TelePacific random person DEREK         9256408665  random company
7/18/13 10:07   0:00:21 TelePacific random person DEREK         9256408665  random company
7/18/13 10:10   0:19:21 TelePacific random person DEREK         9256408665  random company
7/18/13 10:39   0:01:07 random person       107  
7/18/13 11:02   0:01:41 random person Gilbert       107 TelePacific
7/18/13 12:17   0:00:42 random person Gilbert       107 TelePacific
7/18/13 13:35   0:00:41 random person Gilbert       107 TelePacific
7/18/13 13:44   0:12:30 TelePacific ADKNOWLEDGE     8169311771  random company
7/18/13 14:46   0:19:48 TelePacific TOLL FREE CALL  8772933939  random company
7/15/13 10:09   0:01:27 random person Esquivel      272 TelePacific
7/15/13 10:16   0:00:55 random person Esquivel      272 TelePacific
7/15/13 10:59   0:00:51 random person Esquivel      272 TelePacific
7/15/13 11:01   0:01:09 random person Esquivel      272 TelePacific

anubhava · Accepted Answer · 2013-07-24 19:23:28Z

5

Using grep -v:

grep -v "\bTelePacific\b" file > output && mv output file

Or using awk:

awk '$4 != "TelePacific"' file > output && mv output file
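Since the files are tab-separated, it may be worth setting the field separator explicitly. By default awk splits on any run of blanks, so an empty column or a value containing a space shifts the field numbers. The following is only a sketch: the file names calls.tsv and filtered.tsv and the five-column sample rows are made up for illustration, and it assumes the date and the time sit in separate columns.

```shell
# Build a small tab-separated sample (hypothetical data, five columns).
printf '7/18/13\t10:06\t0:00:09\tTelePacific\tDEREK\n'    > calls.tsv
printf '7/18/13\t10:39\t\tGilbert\tTelePacific\n'        >> calls.tsv   # empty 3rd column
printf '7/18/13\t11:02\t0:01:41\tGilbert\tTelePacific\n' >> calls.tsv

# -F '\t' splits on single tabs, so an empty column keeps its position
# and $4 really is the fourth tab-separated column.
awk -F '\t' '$4 != "TelePacific"' calls.tsv > filtered.tsv
```

With default splitting, the second sample row would be dropped as well, because the empty third column collapses and "TelePacific" becomes $4.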

answered Jul 24, 2013 at 19:23

anubhava



1 Comment

Jeff Bowman Over a year ago

+1 for \b ("match word boundary"), so you only match the word "TelePacific" instead of "FooTelePacific" or "TelePacificFoo".

Jeff Bowman · Answer · 2013-07-24 19:45:18Z

1

fgrep -v will do this.

fgrep is equivalent to grep -F and prevents grep from interpreting special characters in your pattern as regex control characters. The -v parameter causes fgrep to output all lines that don't match the pattern, in contrast to outputting the lines that do (which is the default).

fgrep -v TelePacific inputfile.tsv > outputfile.tsv

As anubhava noted above, you may choose grep -v "\bTelePacific\b" instead to ensure that you don't accidentally match "TelePacificFoo" or "FooTelePacific".
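The fixed-string and whole-word ideas can also be combined, since grep's -w option (available in GNU and BSD grep) works together with -F. A minimal sketch, using a made-up sample file sample.tsv; note that it still matches the word in any column, not only the fourth:

```shell
# Hypothetical three-line sample.
printf 'a\tTelePacific\tb\n'     > sample.tsv
printf 'a\tFooTelePacific\tb\n' >> sample.tsv
printf 'a\tother\tb\n'          >> sample.tsv

# -F: treat the pattern as a fixed string
# -w: accept only matches that form a whole word
# -v: print the lines that do NOT match
grep -F -w -v TelePacific sample.tsv > kept.tsv
```

Here the "FooTelePacific" row is kept, because the match is not a whole word.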

edited Jul 24, 2013 at 19:45

answered Jul 24, 2013 at 19:23

Jeff Bowman


2 Comments

Fr0ntSight Over a year ago

Is there any way to do it so that it only searches for instances of TelePacific in the 4th column?

Jeff Bowman Over a year ago

@Fr0ntSight That's the point where grep-related tools stop being very helpful. You could write a really nasty regular expression to parse tabs, or make a clever loop in a shell script, but awk is actually designed for whitespace-delimited fields, and that makes anubhava's awk solution the right tool for the job.
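For comparison, a regular expression of the kind mentioned here might look like the sketch below. It assumes every column is separated by exactly one tab; the file names calls4.tsv and grep_filtered.tsv and the sample rows are hypothetical.

```shell
tab=$(printf '\t')   # a literal tab character, portable across POSIX shells

printf '7/18/13\t10:06\t0:00:09\tTelePacific\tDEREK\n'    > calls4.tsv
printf '7/18/13\t11:02\t0:01:41\tGilbert\tTelePacific\n' >> calls4.tsv

# Skip exactly three tab-terminated fields, then require the fourth
# field to be exactly TelePacific (followed by a tab or end of line).
grep -E -v "^([^${tab}]*${tab}){3}TelePacific(${tab}|\$)" calls4.tsv > grep_filtered.tsv
```

It works, but the awk comparison on $4 says the same thing far more readably.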

Chris Seymour · Answer · 2013-07-25 19:58:19Z

1

This should do the trick:

$ sed '/TelePacific/d' file

If you are happy with the output, use the -i option to write the changes back to the file (this form is for GNU sed; BSD sed expects -i ''):

$ sed -i '/TelePacific/d' file

EDIT:

To only return results for TelePacific in the fourth column:

$ awk '$4=="TelePacific"' file

Or the inverse:

$ awk '$4!="TelePacific"' file

edited Jul 25, 2013 at 19:58

answered Jul 24, 2013 at 19:22

Chris Seymour


4 Comments

anubhava Over a year ago

Won't this also delete lines with text FooTelePacific?

ahilsend Over a year ago

Sure it would, but the question wasn't that specific.

anubhava Over a year ago

@ahilsend: The example file and the statement "I have very large tab separated files" indicate that it is a separate word.

Fr0ntSight Over a year ago

Is there any way to do it so that it only searches for instances of TelePacific in the 4th column?

ds_ · Answer · 2013-07-24 19:22:11Z

0

Here is a solution with sed:

#!/bin/bash

sed '/TelePacific/d' your_file.txt > file_without_telepacific.txt

answered Jul 24, 2013 at 19:22

ds_


ahilsend · Answer · 2013-07-24 19:23:31Z

0

Try this:

grep -v TelePacific in-file > out-file

The -v option inverts the search, so grep prints all lines that don't match the search pattern.

This won't work if in-file and out-file are the same. To achieve that, you have to use a temp file like this:

grep -v TelePacific in-file > in-file.tmp && mv in-file.tmp in-file
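One subtlety with the `&&` form: grep exits with status 1 when it prints no lines, so if every row is filtered out, the mv never runs. A slightly more defensive sketch follows, assuming mktemp is available (it is not part of POSIX, but GNU and BSD systems ship it); the sample content is made up.

```shell
# Hypothetical two-line input file.
printf 'keep me\nTelePacific row\n' > in-file

tmp=$(mktemp in-file.XXXXXX)     # unique temp file next to the original

grep -v TelePacific in-file > "$tmp"
status=$?

# Exit status 0 = lines printed, 1 = nothing printed, 2 or more = real error.
if [ "$status" -le 1 ]; then
    mv "$tmp" in-file            # replace the original with the filtered copy
else
    rm -f "$tmp"                 # leave the original untouched on error
fi
```

Files created by mktemp are readable only by their owner, so the replaced file may end up with different permissions than the original.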

answered Jul 24, 2013 at 19:23

ahilsend

