I have a text file like below.

1 1223 abc
2 4234 weroi
0 3234 omsder
1 1111 abc 
2 6666 weroi

I want the values in column 3 to be unique, keeping only the first line in which each value appears. So the resulting file should be:

1 1223 abc
2 4234 weroi
0 3234 omsder

Can I do this with basic Linux commands, without using Java or something similar?

1 Answer

You could do this with some awk scripting. Here is a piece of code I came up with to address your problem:

awk 'BEGIN {col=3; sep=" "; forbidden=sep} {if (match(forbidden, sep $col sep) == 0) {forbidden=forbidden $col sep; print $0}}' input.file

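The BEGIN block initializes the forbidden string, which accumulates the 3rd-column values seen so far. Then, for each line, the match function checks whether the forbidden string already contains the 3rd-column value of the current line, wrapped in separators. If not, the script appends the value to the forbidden string and prints the whole line. Note that match treats its second argument as a regular expression, so a value containing regex metacharacters (such as . or *) may not be compared literally.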

Here, sep=" " sets the separator. We put sep between the forbidden values so that values stored next to one another cannot run together and form other words. For instance:

1 1111 ta
2 2222 to
3 3333 t
4 4444 tato

In this case, without a separator, ta and to would be stored as tato, so both t and tato would wrongly be treated as forbidden values. We use " " as the separator because awk splits columns on whitespace by default, so a column value can never contain a space.
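To see the effect of the separator, here is a small sketch that runs the four sample lines through both variants. It is an illustration rather than the exact script above: it uses awk's index function for a literal substring search instead of match, and the variable names sample, seen, without_sep and with_sep are made up for this example.

```shell
# The four example lines from above.
sample='1 1111 ta
2 2222 to
3 3333 t
4 4444 tato'

# No separator: values are glued together ("ta" then "to" gives "tato"),
# so "t" and "tato" are found inside the string and their lines are dropped.
without_sep=$(printf '%s\n' "$sample" |
    awk '{ if (index(seen, $3) == 0) { seen = seen $3; print } }')

# With separator: every stored value has a space on both sides,
# so only a whole value can match.
with_sep=$(printf '%s\n' "$sample" |
    awk 'BEGIN { seen = " " }
         { if (index(seen, " " $3 " ") == 0) { seen = seen $3 " "; print } }')

printf 'without separator:\n%s\n\nwith separator:\n%s\n' "$without_sep" "$with_sep"
```

Without the separator only the ta and to lines survive; with it, all four lines are printed, since every value in column 3 is distinct.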

Note that if you want to remove duplicates based on a different column, just replace col=3 with the number of the column you need (0 for the whole line, 1 for the first column, 2 for the second, ...).
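As an alternative sketch, not the script from this answer, the same filtering is often written with an awk associative array, which avoids the separator bookkeeping and compares values literally. The function name dedup_by_col and the array name seen are hypothetical, chosen only for this example.

```shell
# Hypothetical helper: read lines on stdin and keep only the first line
# for each distinct value of the given column (0 means the whole line).
dedup_by_col() {
    # seen[$col]++ is 0 the first time a value appears, so !seen[$col]++
    # is true exactly once per value, and awk prints the line by default.
    awk -v col="$1" '!seen[$col]++'
}

# Sample data from the question, deduplicated on column 3.
printf '%s\n' '1 1223 abc' '2 4234 weroi' '0 3234 omsder' \
              '1 1111 abc ' '2 6666 weroi' | dedup_by_col 3
```

For a file, the same helper can be fed with a redirect, for example dedup_by_col 3 < input.file. Memory use grows with the number of distinct values, which is rarely a concern for files of this kind.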

2 Comments

Thanks for the input, but for my file above it doesn't remove all the duplicates. I get the output below: '1 1223 abc, 2 4234 weroi, 0 3234 omsder, 1 1111 abc'. Here '1 1111 abc' should also be removed.
@KillBill I believe you made a mistake copying my code :) You have to write BEGIN {forbidden=" "}: forbidden must be initialized with your separator (in this case, a single space), otherwise the first line will never be matched (as we look for the " "$3" " pattern in our forbidden string).
