I have multiple CSV files. They share some columns, but each file also has columns the others lack. For example,

#1st.csv
col1,col2 
1,2

#2nd.csv
col1,col3,col4
1,2,3

#3rd.csv
col1,col2,col3,col5
1,2,3,4

I want to combine these files by stacking their rows. Columns with the same name should line up, and every column that appears in any file should be kept, with NA filled in for rows that come from a file without that column.

So I expect to see:

col1,col2,col3,col4,col5
1,2,NA,NA,NA            #this is 1st.csv
1,NA,2,3,NA             #this is 2nd.csv
1,2,3,NA,4              #this is 3rd.csv

Here is the R code I tried, but it returns an error message:

> Combine_data <- smartbind(1st,2nd,3rd)

Error in `[<-.data.frame`(`*tmp*`, , value = list(ID = c(1001, 1001,  : 
  replacement element 1 has 143460 rows, need 143462

Does anyone know any alternative or elegant way to get the expected result?

The R version is 3.3.2.

  • Try Combine_data <- plyr::rbind.fill(1st,2nd,3rd). That assumes you've already imported the data from those CSV files. Commented Dec 8, 2016 at 14:51
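
The suggestion in the comment above can be sketched roughly as follows, assuming the plyr package is installed. The names df1, df2 and df3 are hypothetical stand-ins for the imported files: a name such as 1st starts with a digit, so it is not a syntactically valid R name unless it is wrapped in backticks.

```r
# Hypothetical stand-ins for the three imported CSV files
df1 <- data.frame(col1 = 1, col2 = 2)
df2 <- data.frame(col1 = 1, col3 = 2, col4 = 3)
df3 <- data.frame(col1 = 1, col2 = 2, col3 = 3, col5 = 4)

# rbind.fill() stacks the rows and fills any column
# that is missing from an input with NA
combined <- plyr::rbind.fill(df1, df2, df3)
print(combined)
```

The result should have one row per input and the union of all column names.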

1 Answer

You should be able to accomplish this with the bind_rows function from dplyr, which matches columns by name and fills any column missing from an input with NA:

library(dplyr)

# Recreate the three example files as data frames
df1 <- read.csv(text = "col1,col2
1,2", header = TRUE)

df2 <- read.csv(text = "col1,col3,col4
1,2,3", header = TRUE)

df3 <- read.csv(text = "col1,col2,col3,col5
1,2,3,4", header = TRUE)

# Stack the rows; columns are matched by name
res <- bind_rows(df1, df2, df3)
> res
  col1 col2 col3 col4 col5
1    1    2   NA   NA   NA
2    1   NA    2    3   NA
3    1    2    3   NA    4
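
When the data lives in actual files rather than inline text, the same idea might look like the sketch below. It is only an illustration under stated assumptions: the temporary directory, the file names and the "source" column name are made up for the example, and the files are written first so the snippet runs on its own.

```r
library(dplyr)

# Write the three example files to a temporary directory
# (hypothetical location, only so the sketch is self-contained)
demo_dir <- tempfile("csv_demo")
dir.create(demo_dir)
writeLines(c("col1,col2", "1,2"), file.path(demo_dir, "1st.csv"))
writeLines(c("col1,col3,col4", "1,2,3"), file.path(demo_dir, "2nd.csv"))
writeLines(c("col1,col2,col3,col5", "1,2,3,4"), file.path(demo_dir, "3rd.csv"))

# Read every CSV in the directory into a list of data frames,
# named after the file each one came from
files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)
frames <- lapply(files, read.csv)
names(frames) <- basename(files)

# bind_rows() accepts a list of data frames; .id adds a column
# recording the list name (here the file name) for each row
res <- bind_rows(frames, .id = "source")
print(res)
```

Passing a list means the number of files does not have to be known in advance, and the extra "source" column keeps track of which file each row came from.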