I have a data set that looks something like this:
Input
Cat 2 1 aa
Dog 1 0 aa
Dog 1 2 aa
Cat 2 7 aa
Mouse 0 0 aa
Cat 1 5
Dog 4 3
. . .
. . .
. . .
Cat 1 5
Dog 4 3
Cat 6 9 bb
Dog 3 1 bb
Dog 3 6 bb
Cat 6 4 bb
Mouse 0 0 bb
With this dataset I want to do the following:
- If column 4 is blank, print the line.
- If column 4 is not blank, print only the first record for each combination of column 1 and column 4.
Output
Cat 2 1 aa
Dog 1 0 aa
Mouse 0 0 aa
Cat 1 5
Dog 4 3
. . .
. . .
. . .
Cat 1 5
Dog 4 3
Cat 6 9 bb
Dog 3 1 bb
Mouse 0 0 bb
Note that "Cat 2 1 aa" is the first record with column 1 = Cat and column 4 = aa, so it is printed. "Cat 2 7 aa" is not printed, since a record with column 1 = Cat and column 4 = aa has already been printed.
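
To make the rule above concrete, here is a minimal Python sketch of the logic I am after. It assumes the fields are separated by whitespace, so a "blank" column 4 simply means the line has fewer than four fields. The function name filter_records is only an illustrative choice, not part of any existing tool.

```python
def filter_records(lines):
    """Yield the lines to keep.

    Assumption: fields are whitespace-separated, so a line with a blank
    column 4 has fewer than four fields.
    """
    seen = set()  # (column 1, column 4) pairs already printed
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            # Column 4 is blank: always keep the line.
            yield line
            continue
        key = (fields[0], fields[3])
        if key not in seen:
            # First record with this column 1 / column 4 combination.
            seen.add(key)
            yield line


if __name__ == "__main__":
    # Small excerpt of the sample data, used only for demonstration.
    sample = [
        "Cat 2 1 aa",
        "Dog 1 0 aa",
        "Dog 1 2 aa",
        "Cat 2 7 aa",
        "Cat 1 5",
    ]
    for kept in filter_records(sample):
        print(kept)
```

The set keeps memory proportional to the number of distinct column 1 / column 4 pairs rather than the number of lines, and the input order is preserved because the lines are processed in a single pass.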