I have two data frames:

df1 <- data.frame(assoc = c(2, 3.4, 4.6, -2.3, -1, 0.48, -0.4), 
                    con = c("A","B","C","D","E","F","T"))
df2 <- data.frame(pos = c("-3", "-2", "-1", "0", "1", "2", "3"),
                  col1 = c("A", "B", "B", "T", "T", "D", "E"),
                  col2 = c("B", "T", "D", "A", "E", "C","F"))
View(df1)

con    assoc 
 A     2  
 B     3.4
 C     4.6
 D    -2.3  
 E    -1
 F     0.48
 T    -0.4

I would like to create a function to match the data frames so that the assigned values from df1 would appear as new columns on df2. The desired output would look like this:

    pos   col1  con1  col2    con2 
    -3     A      2     B      3.4
    -2     B      3.4   T     -0.4
    -1     B      3.4   D     -2.3
     0     T     -0.4   A      2
     1     T     -0.4   E     -1 
     2     D     -2.3   C      4.6
     3     E     -1     F      0.48

I've tried to use:

res <- merge(df1, df2)
View(res)

Unfortunately, it worked just for one example. When I added a new column, it didn't seem to work.

Any help would be highly appreciated!

Comment: How does df1 link to df2? I can't see any matching columns. (Commented Sep 29, 2020 at 12:01)

4 Answers

2

It looks like you are trying to left join df1 onto df2 twice, by different variables:

library(dplyr)  # left_join() and %>% come from dplyr

# Start from df2 and look up assoc once per key column
df2 %>%
  left_join(df1, by = c(col1 = "con")) %>% 
  left_join(df1, by = c(col2 = "con"))

#>   pos col1 col2 assoc.x assoc.y
#> 1  -3    A    B     2.0    3.40
#> 2  -2    B    T     3.4   -0.40
#> 3  -1    B    D     3.4   -2.30
#> 4   0    T    A    -0.4    2.00
#> 5   1    T    E    -0.4   -1.00
#> 6   2    D    C    -2.3    4.60
#> 7   3    E    F    -1.0    0.48

Or a double merge:

merge(merge(df1, df2, by.x = "con", by.y = "col1"), 
      df2, by.x = "con", by.y = "col2")
#>   con assoc pos.x col2 pos.y col1
#> 1   A   2.0    -3    B     0    T
#> 2   B   3.4    -2    T    -3    A
#> 3   B   3.4    -1    D    -3    A
#> 4   D  -2.3     2    C    -1    B
#> 5   E  -1.0     3    F     1    T
#> 6   T  -0.4     0    A    -2    B
#> 7   T  -0.4     1    E    -2    B

2

You can use match on the two columns, i.e.

sapply(df2[-1], function(i)df1$assoc[match(i, df1$con)])

     col1  col2
[1,]  2.0  3.40
[2,]  3.4 -0.40
[3,]  3.4 -2.30
[4,] -0.4  2.00
[5,] -0.4 -1.00
[6,] -2.3  4.60
[7,] -1.0  0.48
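
To get these values into df2 itself, as the question asks, the same match lookup could be wrapped in a small function. This is only a minimal sketch and not part of the original answer: the helper name `add_assoc` and the rule of naming the new columns by swapping the `col` prefix for `con` are assumptions chosen to mirror the desired output.

```r
df1 <- data.frame(assoc = c(2, 3.4, 4.6, -2.3, -1, 0.48, -0.4),
                  con = c("A", "B", "C", "D", "E", "F", "T"))
df2 <- data.frame(pos = c("-3", "-2", "-1", "0", "1", "2", "3"),
                  col1 = c("A", "B", "B", "T", "T", "D", "E"),
                  col2 = c("B", "T", "D", "A", "E", "C", "F"))

# Hypothetical helper: for every column of `data` whose name starts with
# "col", look its values up in lookup$con and store the matching
# lookup$assoc in a new column named "con<suffix>".
add_assoc <- function(data, lookup) {
  key_cols <- grep("^col", names(data), value = TRUE)
  new_cols <- sub("^col", "con", key_cols)
  for (i in seq_along(key_cols)) {
    data[[new_cols[i]]] <- lookup$assoc[match(data[[key_cols[i]]], lookup$con)]
  }
  # Reorder so each key column is followed directly by its looked-up values
  other <- setdiff(names(data), c(key_cols, new_cols))
  data[c(other, as.vector(rbind(key_cols, new_cols)))]
}

res <- add_assoc(df2, df1)
names(res)
```

Because match keeps the row order of df2 and returns NA for keys that are missing from df1, this behaves like a left join without reordering rows.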

1

Do you mean merge like this?

Reduce(
  function(x, y) merge(x, y, by = names(df1)),
  lapply(
    grep("col", names(df2), value = TRUE),
    function(y) merge(df1, df2, by.x = "con", by.y = y)
  )
)

which enables you to merge if you have more than just col1 and col2 in df2, giving you

  assoc con pos.x col2 pos.y col1
1  -0.4   T     0    A    -2    B
2  -0.4   T     1    E    -2    B
3  -1.0   E     3    F     1    T
4  -2.3   D     2    C    -1    B
5   2.0   A    -3    B     0    T
6   3.4   B    -2    T    -3    A
7   3.4   B    -1    D    -3    A

0

Using a for loop

out <- df2
for(cn in c("col1", "col2")) out <- merge(out, df1, by.x = cn, by.y = 'con', all.x = TRUE)
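
Note that merge sorts its result by the key column and leaves the looked-up columns named assoc.x and assoc.y. A possible variant of the loop that keeps the original row order and uses the column names from the question is sketched below; the `row_id` helper column and the `con1`/`con2` renaming are assumptions added for illustration, not part of the original answer.

```r
df1 <- data.frame(assoc = c(2, 3.4, 4.6, -2.3, -1, 0.48, -0.4),
                  con = c("A", "B", "C", "D", "E", "F", "T"))
df2 <- data.frame(pos = c("-3", "-2", "-1", "0", "1", "2", "3"),
                  col1 = c("A", "B", "B", "T", "T", "D", "E"),
                  col2 = c("B", "T", "D", "A", "E", "C", "F"))

out <- df2
out$row_id <- seq_len(nrow(out))  # hypothetical column remembering the original order

for (cn in c("col1", "col2")) {
  out <- merge(out, df1, by.x = cn, by.y = "con", all.x = TRUE)
  # Rename straight away so the next merge does not produce assoc.x / assoc.y
  names(out)[names(out) == "assoc"] <- sub("col", "con", cn)
}

# Restore the original row order and the column layout from the question
out <- out[order(out$row_id), c("pos", "col1", "con1", "col2", "con2")]
rownames(out) <- NULL
out
```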
