How to remove duplicated rows by a column in an R matrix

Question

I am trying to remove duplicated rows by one column (e.g., the first column) in an R matrix. How can I extract the set of rows that is unique by one column? I tried

x_1 <- x[unique(x[,1]),]

While the size is correct, all of the values are NA. So instead, I tried

x_1 <- x[-duplicated(x[,1]),]

But the dimensions were incorrect.

joran · Accepted Answer · 2011-07-26 20:04:37Z

29

I think you're confused about how subsetting works in R. unique(x[,1]) returns the set of unique values in the first column. If you then subset using those values, R treats them as row numbers. So you're likely getting NAs because the values refer to rows that don't exist (that is the behavior if x is actually a data frame; a true matrix would raise a "subscript out of bounds" error instead).

Your other attempt runs afoul of the fact that duplicated returns a logical vector, not a vector of indices. Putting a minus sign in front of it converts it to a vector of 0s and -1s, which R again interprets as row indices: the 0s are ignored and the -1s drop only the first row.
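To make both failure modes concrete, here is a minimal sketch on a small made-up matrix (the values and the names vals, neg and wrong are illustrative assumptions, not from the question). It also shows the data frame case, since that is where out-of-range row numbers produce NA rows.

```r
# Hypothetical 5 x 2 matrix; column 1 repeats the values 10 and 20.
x <- matrix(c(10, 20, 10, 30, 20,
               1,  2,  3,  4,  5), ncol = 2)

# Attempt 1: the unique values 10, 20, 30 are used as row numbers.
vals <- unique(x[, 1])
# On a plain matrix, out-of-range row numbers raise an error ...
matrix_result <- tryCatch(x[vals, ], error = function(e) "error")
# ... while on a data frame they return rows filled with NA.
df_result <- as.data.frame(x)[vals, ]

# Attempt 2: negating the logical vector yields 0s and -1s.
neg <- -duplicated(x[, 1])
# Zeros are ignored and -1 drops row 1, so four rows remain instead of three.
wrong <- x[neg, ]
```

Here df_result has the expected number of rows but only NA values, and wrong has the wrong number of rows, which matches the two symptoms described in the question.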

Try replacing the '-' in front of duplicated with a '!', which is the logical negation operator. Something like this:

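m <- matrix(runif(100),10,10)   # 10 x 10 matrix of random values
m[c(2,5,9),1] <- 1              # force duplicates in column 1
m[!duplicated(m[,1]),]          # keep the first row for each value in column 1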
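Because the example above uses random values, a deterministic variant may be easier to check by eye. The sketch below uses made-up matrices (x and y are illustrative assumptions) and adds drop = FALSE, which is not part of the original answer but guards against the result collapsing to a vector when only one row survives.

```r
# Hypothetical 5 x 2 matrix; column 1 repeats the values 10 and 20.
x <- matrix(c(10, 20, 10, 30, 20,
               1,  2,  3,  4,  5), ncol = 2)

# TRUE marks the first occurrence of each value in column 1.
keep <- !duplicated(x[, 1])
x_1 <- x[keep, ]   # rows 1, 2 and 4 are kept

# When every row shares the same key, only one row survives.
y <- matrix(c(7, 7, 7,
              1, 2, 3), ncol = 2)
# drop = FALSE keeps the result a 1 x 2 matrix instead of a plain vector.
y_1 <- y[!duplicated(y[, 1]), , drop = FALSE]
```

Note that duplicated marks later occurrences, so the first row for each value is the one that is kept.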

daroczig · Accepted Answer · 2011-07-26 20:05:13Z

13

Since you need the indices of the unique rows, use duplicated as you tried. The problem was using - instead of !, so try:

x[!duplicated(x[,1]),]
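If explicit numeric indices are wanted, which() can turn the logical vector into row numbers. The sketch below (on made-up matrices x and z, which are assumptions for illustration) also shows why negating which(duplicated(...)) is riskier than the logical form: when there are no duplicates, it selects zero rows.

```r
# Hypothetical 5 x 2 matrix; column 1 repeats the values 10 and 20.
x <- matrix(c(10, 20, 10, 30, 20,
               1,  2,  3,  4,  5), ncol = 2)

# Row numbers of the first occurrence of each value in column 1.
idx <- which(!duplicated(x[, 1]))
# Indexing by these row numbers matches indexing by the logical vector.
same <- identical(x[idx, ], x[!duplicated(x[, 1]), ])

# A matrix with no duplicates in column 1.
z <- matrix(1:6, ncol = 2)
# which() returns integer(0) here, and negating it still selects nothing.
empty <- z[-which(duplicated(z[, 1])), , drop = FALSE]
# The logical form leaves the matrix unchanged, as intended.
intact <- z[!duplicated(z[, 1]), , drop = FALSE]
```

For that reason the logical form with ! is usually the safer default.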
