R: remove duplicated values across rows and columns

Question

I've found many pages about finding duplicated elements in a list or duplicated rows in a data frame. However, I want to search for duplicated elements throughout the entire data frame. Take this as an example:

df
     coupon1    coupon2    coupon3
1         10         11         12
2         13         16         15
3         16         17         18
4         19         20         21
5         22         23         24
6         25         26         27

You'll notice that df[2,2] and df[3,1] have the same element (16). When I run

duplicated(df)

It returns six FALSE values because no entire row is duplicated, only a single element. How can I check for duplicated values anywhere in the data frame? I would like to know both that a duplicate exists and what its value is (and likewise if there are multiple duplicates).

Is it enough for your purposes to map to a vector: duplicated(matrix(df, ncol=1))?
– mts, Commented Jul 7, 2015 at 18:32
The only thing is this matrix can be thousands of lines long, so I'm looking for a solution that deals with it as a data frame.
– Kira Tebbe, Commented Jul 7, 2015 at 18:40

Pierre L · Accepted Answer · 2015-07-07 18:38:50Z

2

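This finds duplicates across the whole data frame, but it scans column by column and flags only the second and later occurrences of a value. Cell (3,1) is therefore still FALSE, because it holds the first 16 encountered in column order; cell (2,2) is the one marked TRUE.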

m <- matrix(duplicated(unlist(df)), ncol=ncol(df))
#      [,1]  [,2]  [,3]
#[1,] FALSE FALSE FALSE
#[2,] FALSE  TRUE FALSE
#[3,] FALSE FALSE FALSE
#[4,] FALSE FALSE FALSE
#[5,] FALSE FALSE FALSE
#[6,] FALSE FALSE FALSE

You can then use the logical matrix m however you'd like, for example to pull the duplicated values out of the data frame:

df[m]
#[1] 16
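Since duplicated() leaves the first occurrence unflagged, here is a minimal sketch of one way to mark every occurrence and report where each one sits. It assumes the question's df is rebuilt by hand from the printout, and the names vals, all_dupes and m_all are made up for illustration; it is not part of the original answer.

```r
# Rebuild the example data frame (values assumed from the question's printout)
df <- data.frame(
  coupon1 = c(10, 13, 16, 19, 22, 25),
  coupon2 = c(11, 16, 17, 20, 23, 26),
  coupon3 = c(12, 15, 18, 21, 24, 27)
)

# Column-major vector of every cell in the data frame
vals <- unlist(df, use.names = FALSE)

# Forward pass flags later occurrences, backward pass flags earlier ones;
# together they flag every cell whose value appears more than once
all_dupes <- duplicated(vals) | duplicated(vals, fromLast = TRUE)
m_all <- matrix(all_dupes, ncol = ncol(df))

# Row/column positions of every duplicated cell
which(m_all, arr.ind = TRUE)

# The distinct duplicated values
unique(vals[all_dupes])
# [1] 16
```

Here both (3,1) and (2,2) come back as duplicated cells. Everything stays vectorised, so a data frame with thousands of rows should not be a problem.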

edited Jul 7, 2015 at 18:38

answered Jul 7, 2015 at 18:33

Pierre L


user227710 · Accepted Answer · 2015-07-07 18:34:02Z

1

vals <- stack(df)[, 1]          # stack all columns into one vector of values
idx <- which(duplicated(vals))  # positions of repeated values, in column order
idx
# [1] 8
vals[idx]
# [1] 16
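If it also helps to know how many times each value occurs, a frequency table over the unlisted values is one alternative. This is only a sketch under the same assumption that df matches the question's printout; counts and dupe_values are illustrative names.

```r
# Rebuild the example data frame (values assumed from the question's printout)
df <- data.frame(
  coupon1 = c(10, 13, 16, 19, 22, 25),
  coupon2 = c(11, 16, 17, 20, 23, 26),
  coupon3 = c(12, 15, 18, 21, 24, 27)
)

# Count how often each value appears anywhere in the data frame
counts <- table(unlist(df))

# Keep only the values that appear more than once, with their counts
counts[counts > 1]

# table() stores the values as character names, so convert back to numbers
dupe_values <- as.numeric(names(counts[counts > 1]))
dupe_values
# [1] 16
```

An empty result from counts[counts > 1] means the data frame has no duplicated values at all.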

answered Jul 7, 2015 at 18:34

user227710
