0

I have two data frames.
df1:

DAT1 DAT3     DAT4    ...
 1   this is  this is
 2   this is  this is
 3   this is  this is

df2:

DAT1 DAT3       DAT4      ... 
 1   a comment  a comment
 2   a comment  a comment
 3   a comment  a comment

I want to append the values in the second data frame's columns to the corresponding columns of the first (I know both the names and the positions of the columns I need to append), producing an updated version of the first data frame:
df3:

DAT1 DAT3               DAT4               ... 
 1   this is a comment  this is a comment  
 2   this is a comment  this is a comment
 3   this is a comment  this is a comment

The real data frames have many rows and columns, so I am worried that a for() loop would be really inefficient.

2
  • Are the rows ordered? As in row1 with row1? Commented Jul 10, 2019 at 9:42
  • @LyzandeR in theory yes. (Have not seen a case where that doesn't happen) Commented Jul 10, 2019 at 9:47

4 Answers

2

We can use Map to paste the matching columns of the two data frames pairwise:

cols <- c("DAT3", "DAT4")
df3 <- df1
df3[cols] <- Map(paste, df1[cols], df2[cols])

df3
#  DAT1              DAT3              DAT4
#1    1 this is a comment this is a comment
#2    2 this is a comment this is a comment
#3    3 this is a comment this is a comment
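
Since the question mentions that the column positions are known as well, the same Map call can probably be driven by positions, and a separator can be passed through to paste. This is only a minimal sketch on toy data mirroring the question; pos and the "-" separator are illustrative assumptions, not part of the original answer.

```r
# Toy data mirroring the question (assumed; the real frames are larger)
df1 <- data.frame(DAT1 = 1:3, DAT3 = rep("this is", 3), DAT4 = rep("this is", 3),
                  stringsAsFactors = FALSE)
df2 <- data.frame(DAT1 = 1:3, DAT3 = rep("a comment", 3), DAT4 = rep("a comment", 3),
                  stringsAsFactors = FALSE)

pos <- c(2, 3)  # hypothetical positions of DAT3 and DAT4

df3 <- df1
# Map() walks the two column lists in parallel; MoreArgs is forwarded to
# every paste() call, here only to change the separator
df3[pos] <- Map(paste, df1[pos], df2[pos], MoreArgs = list(sep = "-"))
df3
```

Dropping MoreArgs gives the default single-space separator used in the answer above.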

2

We can use base R without looping

cols <- c("DAT3", "DAT4")
df3 <- df1
# paste() works on the flattened matrices; matrix() restores the original shape
df3[cols] <- matrix(paste(as.matrix(df1[cols]), as.matrix(df2[cols])), nrow = nrow(df1))
df3
#  DAT1              DAT3              DAT4
#1    1 this is a comment this is a comment
#2    2 this is a comment this is a comment
#3    3 this is a comment this is a comment
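
The matrix approach relies on R storing matrices column by column: paste sees two flat vectors in the same order, and matrix refills the result in that same order. A small sketch of that behaviour, using made-up matrices m and n that are not part of the answer:

```r
# Two hypothetical 2 x 2 character matrices
m <- matrix(c("a", "b", "c", "d"), nrow = 2)
n <- matrix(c("w", "x", "y", "z"), nrow = 2)

# paste() drops the dimensions and pairs elements in column-major order
flat <- paste(m, n)

# matrix() fills column by column, so every cell lands back in its place
res <- matrix(flat, nrow = nrow(m))
res
```

Both inputs need the same dimensions for the cells to line up, which is why both data frames are subset with the same columns.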

Data

df1 <- structure(list(DAT1 = 1:3,
                      DAT3 = c("this is", "this is", "this is"),
                      DAT4 = c("this is", "this is", "this is")),
                 class = "data.frame", row.names = c(NA, -3L))

df2 <- structure(list(DAT1 = 1:3,
                      DAT3 = c("a comment", "a comment", "a comment"),
                      DAT4 = c("a comment", "a comment", "a comment")),
                 class = "data.frame", row.names = c(NA, -3L))

1

If your data is ordered, I would do something like this:

#initiate the data.frame with the id
df3 <- data.frame(DAT1 = df1$DAT1)

#then run a for-loop with the names you know you need to concatenate
for (i in c('DAT3', 'DAT4')) {
  df3[[i]] <- paste(df1[[i]], df2[[i]])
}

The for-loop iterates over the column names only. The real work is done by paste, which is vectorised and fast, so you won't face any speed issues.

df3
#  DAT1              DAT3              DAT4
#1    1 this is a comment this is a comment
#2    2 this is a comment this is a comment
#3    3 this is a comment this is a comment
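
If the rows turn out not to be in the same order, one option is to align the second data frame on the id column before pasting. A rough sketch, assuming DAT1 is a unique key present in both frames; the toy values and the df2_aligned name are made up for illustration:

```r
# Toy data where df2 is deliberately shuffled (assumed for illustration)
df1 <- data.frame(DAT1 = 1:3, DAT3 = c("r1", "r2", "r3"),
                  stringsAsFactors = FALSE)
df2 <- data.frame(DAT1 = c(3L, 1L, 2L), DAT3 = c("c3", "c1", "c2"),
                  stringsAsFactors = FALSE)

# For each id in df1, find the row of df2 holding the same id
idx <- match(df1$DAT1, df2$DAT1)
if (anyNA(idx)) stop("some ids in df1 are missing from df2")

df2_aligned <- df2[idx, , drop = FALSE]

df3 <- data.frame(DAT1 = df1$DAT1)
for (i in c("DAT3")) {
  # paste() is still vectorised over all rows of the column
  df3[[i]] <- paste(df1[[i]], df2_aligned[[i]])
}
df3
```

match returns only the first hit per id, so duplicated ids in df2 would need separate handling.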

0

A dplyr version. paste is already vectorised, so rowwise() and collapse are not needed:

library(dplyr)

df1 %>%
  inner_join(df2, by = "DAT1") %>%
  # the join suffixes the clashing columns with .x (df1) and .y (df2)
  mutate(DAT3 = paste(DAT3.x, DAT3.y),
         DAT4 = paste(DAT4.x, DAT4.y)) %>%
  # drop the suffixed helper columns
  select(-contains("."))

Output

  DAT1              DAT3              DAT4
1    1 this is a comment this is a comment
2    2 this is a comment this is a comment
3    3 this is a comment this is a comment
