0

I have tried searching for something but cannot find it. I have found similar threads but still they don't get what I want. I know there should be an easy way to do this without writing a loop function. Here it goes

I have two data frame df1 and df2

df1 <- data.frame(ID = c("a", "b", "c", "d", "e", "f"), y = 1:6 )
df2 <- data.frame(x = c("a", "c", "g", "f"), f=c("M","T","T","M"), obj=c("F70", "F60", "F71", "F82"))
df2$f <- as.factor(df2$f)

now I want to match df1 and df2 "ID" and "x" column with each other. But I want to add new columns to the df1 data frame that matches "ID" and "x" from df2 as well. The final output of df1 should look like this

  ID y obj f1 f2
   a 1 F70 M  NA
   b 2 NA  NA NA
   c 3 F60 NA T
   d 4 NA  NA NA
   e 5 NA  NA NA
   f 6 F82 M  NA
2
  • Why T is in f2 column? What do columns f1 and f2 represent ? Commented Mar 3, 2017 at 11:24
  • they represent the f column in df2 Commented Mar 3, 2017 at 11:44

2 Answers 2

2

We can do this with tidyverse after joining the two datasets and spread the 'f' column

library(tidyverse)
left_join(df1, df2, by = c(ID = "x")) %>% 
               group_by(f) %>% 
               spread(f, f) %>%
               select(-6) %>%
               rename(f1 = M, f2 = T)
# A tibble: 6 × 5
#    ID     y    obj     f1     f2
#* <chr> <int> <fctr> <fctr> <fctr>
#1     a     1    F70      M     NA
#2     b     2     NA     NA     NA
#3     c     3    F60     NA      T
#4     d     4     NA     NA     NA
#5     e     5     NA     NA     NA
#6     f     6    F82      M     NA

Or a similar approach with data.table

library(data.table)
dcast(setDT(df2)[df1, on = .(x = ID)], x+obj + y ~ f, value.var = 'f')[, -6, with = FALSE]
Sign up to request clarification or add additional context in comments.

Comments

1

Here is a base R process.

# combine the data.frames
dfNew <- merge(df1, df2, by.x="ID", by.y="x", all.x=TRUE)

# add f1 and f2 variables
dfNew[c("f1", "f2")] <- lapply(c("M", "T"),
                               function(i) factor(ifelse(as.character(dfNew$f) == i, i, NA)))

# remove original factor variable
dfNew <- dfNew[-3]
  ID y  obj   f1   f2
1  a 1  F70    M <NA>
2  b 2 <NA> <NA> <NA>
3  c 3  F60 <NA>    T
4  d 4 <NA> <NA> <NA>
5  e 5 <NA> <NA> <NA>
6  f 6  F82    M <NA>

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.