21

I am trying to remove duplicated rows from an R matrix based on a single column (e.g. the 1st column). How can I extract the set of rows that is unique by that column? I've used

x_1 <- x[unique(x[,1]),]

While the size is correct, all of the values are NA. So instead, I tried

x_1 <- x[-duplicated(x[,1]),]

But the dimensions were incorrect.

2 Answers

29

I think you're confused about how subsetting works in R. unique(x[,1]) returns the set of unique values in the first column, not the positions of the rows that hold them. If you then subset with those values, R treats them as row numbers. So you're likely getting NAs because the values refer to rows that don't exist in the matrix.
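To make that concrete, here is a minimal sketch with a made-up 4 x 2 matrix (the values and the names x, vals, by_value, df and na_row are purely illustrative). It shows the unique values being reused as row positions, and, assuming the object is a data frame rather than a matrix, one way rows full of NA can appear.

```r
# Hypothetical data: the first column holds IDs, with 1 repeated.
x <- matrix(c(1, 1, 3, 2,
              10, 20, 30, 40), ncol = 2)

vals <- unique(x[, 1])   # the distinct values 1, 3, 2 (not row positions)
by_value <- x[vals, ]    # R reads them as "rows 1, 3 and 2"

# The result has the expected number of rows, but the wrong content:
# ID 1 appears twice and ID 2 is missing.
by_value[, 1]

# With a data frame, a row number past the end yields a row of NAs.
df <- as.data.frame(x)
na_row <- df[10, ]
```

So the row count can look right while the rows themselves are not the ones you wanted.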

Your other attempt runs afoul of the fact that duplicated returns a logical (boolean) vector, not a vector of indices. Putting a minus sign in front of it converts it to a vector of 0's and -1's, which R again interprets as row numbers: the 0's are ignored and the -1's drop only the first row, which is why the dimensions come out wrong.
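A small sketch of the difference between the two operators, using a made-up 5 x 2 matrix (the data and the names x, dup, neg, wrong and right are illustrative only):

```r
# Hypothetical data: ID 1 occurs three times in the first column.
x <- matrix(c(1, 1, 3, 1, 2,
              10, 20, 30, 40, 50), ncol = 2)

dup <- duplicated(x[, 1])   # logical: TRUE marks a repeat of an earlier value
neg <- -dup                 # arithmetic negation: TRUE becomes -1, FALSE becomes 0

wrong <- x[neg, ]           # zeros are ignored, -1 removes row 1 only
right <- x[!dup, ]          # logical negation keeps the first row of each ID

dim(wrong)
dim(right)
```

Here wrong still contains the repeated IDs and has lost the first row, while right keeps exactly one row per distinct value in column 1.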

Try replacing the '-' in front of duplicated with a '!', which is the logical negation operator. Something like this:

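m <- matrix(runif(100), 10, 10)   # 10 x 10 matrix of random values
m[c(2, 5, 9), 1] <- 1             # give rows 2, 5 and 9 the same first-column value
m[!duplicated(m[, 1]), ]          # keep only the first row for each first-column value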

13

Since you need the indices of the unique rows, use duplicated as you tried. The problem was using - instead of !, so try:

x[!duplicated(x[,1]),]
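Two variations that may be worth knowing, sketched on a made-up 3 x 2 matrix (the data and variable names are illustrative). If only one row survives, plain subsetting returns a vector rather than a matrix, and drop = FALSE avoids that. The fromLast argument of duplicated keeps the last occurrence of each value instead of the first.

```r
# Hypothetical data: every row shares the same first-column value.
x <- matrix(c(7, 7, 7,
              10, 20, 30), ncol = 2)

as_vector  <- x[!duplicated(x[, 1]), ]                 # one row left: collapses to a vector
keep_first <- x[!duplicated(x[, 1]), , drop = FALSE]   # stays a 1 x 2 matrix

# Keep the last row for each value instead of the first.
keep_last <- x[!duplicated(x[, 1], fromLast = TRUE), , drop = FALSE]
```

Whether the first or last occurrence is the right one to keep depends on your data; the original answers keep the first.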
