Create a new dataframe with same values of a column

Question

I have two dataframes like this:

# first data frame
df_1 <- data.frame(x = c('x4','x4','x5','x5','x5','x6','x6'),
                   y = c(0,0,1,1,1,0,0))
# second data frame
df_2 <- data.frame(x = c('x4','x4','x5','x5','x5','x7','x7'),
                   z = c(1,1,1,1,1,0,0))

I would like to merge them on column x, but keep in the new data frame only the rows whose x value appears in both data frames. Example output:

I tried this

merge(x = df_1, y = df_2, by = "x", all = TRUE)

but it doesn't give the result I want. What can I do?

Result of a plain merge:

merge(df_1, df_2)
    x y z
1  x4 0 1
2  x4 0 1
3  x4 0 1
4  x4 0 1
5  x5 1 1
6  x5 1 1
7  x5 1 1
8  x5 1 1
9  x5 1 1
10 x5 1 1
11 x5 1 1
12 x5 1 1
13 x5 1 1

Using this:

intersect(df_1$x, df_2$x)
[1] "x4" "x5"

I can see which values are common to both data frames. Is it possible to use this as the rule for merging only the rows with common values?

@jogo thank you. That is the question I tried, but it didn't work for me. Column x contains repeated names, and I would like to merge by them and keep them. Please see the update to my question with the plain merge; it is not the output I expected.
– user8831872, Commented Jan 31, 2018 at 12:52
It is an m:n join, i.e. each row from df_1 with "x4" is crossed with each row from df_2 with "x4", so you get 2*2=4 rows in the result. Please define the logic for reducing the result!
– jogo, Commented Jan 31, 2018 at 12:57
@jogo thank you. I don't think merge is the right solution. Please see my expected result. The only thing the two data frames have in common is column x. From column x I know which rows share the same value. I would like to create a new data frame based on this, carrying over the other columns from the original data frames. That's why my expected output has both y and z.
– user8831872, Commented Jan 31, 2018 at 13:03
Are you looking for cbind(df_1[df_1$x %in% df_2$x,], z=df_2[df_2$x %in% df_1$x, "z"])?
– jogo, Commented Jan 31, 2018 at 13:05
@jogo yes, this is a solution, but it is a little hard for me to apply to my real dataset because I have many more columns. I am only trying to find a way to merge two data frames into a new one based on a column, keeping only the rows whose value appears in both data frames.
– user8831872, Commented Jan 31, 2018 at 13:14
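
The m:n behaviour jogo describes can be checked by counting rows per key. The following is only a sketch using the two example data frames from the question; the helper names n_1, n_2, common and rows_per_key are made up for illustration.

```r
df_1 <- data.frame(x = c('x4','x4','x5','x5','x5','x6','x6'),
                   y = c(0,0,1,1,1,0,0), stringsAsFactors = FALSE)
df_2 <- data.frame(x = c('x4','x4','x5','x5','x5','x7','x7'),
                   z = c(1,1,1,1,1,0,0), stringsAsFactors = FALSE)

# Number of rows per key in each data frame
n_1 <- table(df_1$x)
n_2 <- table(df_2$x)

# Keys that occur in both data frames
common <- intersect(names(n_1), names(n_2))

# An inner merge produces (rows in df_1) * (rows in df_2) for every common key
rows_per_key <- setNames(as.vector(n_1[common]) * as.vector(n_2[common]), common)
rows_per_key

# The per-key products add up to the row count of the plain merge
sum(rows_per_key) == nrow(merge(df_1, df_2))
```

For this data the products are 2*2 for "x4" and 3*3 for "x5", which accounts for the 13 rows shown in the question.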

Nicolás Velasquez · Accepted Answer · 2018-01-31 16:48:05Z

With base, as jogo points out, simply run

merge(df_1, unique(df_2))

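With tidyverse, use an inner join so that the x6 rows of df_1, which have no match in df_2, are dropped (a left_join would keep them with z = NA):

library(tidyverse)

inner_join(df_1, unique(df_2), by = "x")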
      x y z
   1 x4 0 1
   2 x4 0 1
   3 x5 1 1
   4 x5 1 1
   5 x5 1 1
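
The question also asks whether intersect() can serve as the merge rule. One way this might look in base R is sketched below, assuming the rows of df_2 that share an x value are exact duplicates (as in the example data); df_1_common, df_2_common and result are illustrative names, not part of the original answer.

```r
df_1 <- data.frame(x = c('x4','x4','x5','x5','x5','x6','x6'),
                   y = c(0,0,1,1,1,0,0), stringsAsFactors = FALSE)
df_2 <- data.frame(x = c('x4','x4','x5','x5','x5','x7','x7'),
                   z = c(1,1,1,1,1,0,0), stringsAsFactors = FALSE)

# Keys present in both data frames
common <- intersect(df_1$x, df_2$x)

# Keep only rows with a common key; collapse df_2 to its distinct rows
df_1_common <- df_1[df_1$x %in% common, ]
df_2_common <- unique(df_2[df_2$x %in% common, ])

# merge() carries along every non-key column, however many there are
result <- merge(df_1_common, df_2_common, by = "x")
result
```

Because merge() keeps all columns from both inputs, the same pattern should carry over to data frames with many more columns, provided the duplicate-row assumption still holds.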

edited Jan 31, 2018 at 16:48

answered Jan 31, 2018 at 13:09


2 Comments

user8831872 Over a year ago

Thank you, but notice in the output that, for example, x4 should appear 2 times and it appears 4 times. It seems the join duplicates the rows.

Nicolás Velasquez Over a year ago

All right, I edited the answer. You'd need to reduce df_2 to its unique values. The function unique() does the trick for data.frame.
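
A small sketch of what unique() does here, and of one case where it would not be enough. The data frame df_2b and the choice to keep the first row per key are hypothetical, added only to illustrate the caveat.

```r
df_2 <- data.frame(x = c('x4','x4','x5','x5','x5','x7','x7'),
                   z = c(1,1,1,1,1,0,0), stringsAsFactors = FALSE)

# Rows sharing an x value are identical, so unique() leaves one row per key
df_2_unique <- unique(df_2)
df_2_unique

# Hypothetical case: z differs within the key "x4"
df_2b <- data.frame(x = c('x4','x4','x5'),
                    z = c(1, 2, 1), stringsAsFactors = FALSE)

# No row is an exact duplicate, so unique() removes nothing
nrow(unique(df_2b))

# One possible rule (an assumption, not from the answer): keep the first row per key
df_2b_first <- df_2b[!duplicated(df_2b$x), ]
df_2b_first
```

If the non-key columns vary within a key, some explicit rule like the last step is needed before merging; otherwise the join multiplies rows again.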
