I would like to merge specific columns from two CSV files and use each filename as the column header. In this example, I want to merge the third column from each file into a single data frame. The CSV files have the same number of rows and columns.

Sample data sets:

File1.csv

V1,V2,V3,V4
1,1986,64,61

File2.csv

V1,V2,V3,V4
1,1990,100,61

Expected Result:

"File1","File2"
64,100

Here's my script:

my.file.list <- list.files(pattern = "*.csv")
my.list <- lapply(X = my.file.list, FUN = function(x) {
    read.csv(x, header = TRUE, colClasses = c("NULL", "NULL", "numeric", "NULL"), sep = ",")[, 1]
})
my.df <- do.call("cbind", my.list)

How do I add column headers based on the file names?

I tried this:

sub('.csv','',basename(my.file.list),fixed=TRUE)

but I don't know how to add this as headers.

I'd appreciate any help.

  • Please show us your expected output. What does use the filename as the column header mean? Commented Apr 21, 2017 at 3:18
  • Thanks for the comment. I edited my question. Commented Apr 21, 2017 at 3:23
  • What happens if the two files don't have the same number of rows? Commented Apr 21, 2017 at 3:23
  • Both files have the same number of rows and columns. I tried using basename(), but I don't know how to add the results as headers in the output file. Commented Apr 21, 2017 at 3:26

2 Answers

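# pattern is a regular expression, not a glob: match names ending in ".csv"
my.file.list <- list.files(pattern = "\\.csv$")
my.list <- list()
for (i in seq_along(my.file.list)) {
    # Keep only column V3, as a one-column data frame
    df <- read.csv(my.file.list[[i]], header=TRUE, sep=",")["V3"]
    names(df) <- paste0("FILE", i)
    my.list[[i]] <- df
}
my.df <- do.call("cbind", my.list)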

2 Comments

Sorry. The column names should be "File1","File2".
Thank you so much. I got the idea now!

@Tim Biegeleisen Many thanks for the help. I got the idea now. Here's an improved version of your answer that I can use for files with different filenames.

my.file.list <- list.files(pattern = "\\.csv$")
my.list <- list()
for (i in seq_along(my.file.list)) {
    df <- read.csv(my.file.list[[i]], header=TRUE, sep=",")["V3"]
    # Use the file name without ".csv" as the header, e.g. "File1"
    # (appending i here would give "File11", "File22")
    names(df) <- sub('.csv','',basename(my.file.list[i]),fixed=TRUE)
    my.list[[i]] <- df
}
my.df <- do.call("cbind", my.list)
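
For what it's worth, the same idea can probably be written without an explicit loop. The following is only a sketch under a few assumptions: merge_column is a hypothetical helper name (not from the posts above), every file has a V3 column, and all files have the same number of rows, as stated in the question. It recreates the two sample files in a temporary directory so it can be run as is.

```r
# Hypothetical helper: read one column from every CSV in a directory and
# name each resulting column after the file it came from.
merge_column <- function(dir = ".", column = "V3") {
  # pattern is a regular expression: match names ending in ".csv"
  files <- list.files(dir, pattern = "\\.csv$", full.names = TRUE)
  # Pull the requested column out of each file as a plain vector
  cols <- lapply(files, function(f) read.csv(f, header = TRUE)[[column]])
  # Drop the directory and the extension to get the headers
  names(cols) <- tools::file_path_sans_ext(basename(files))
  # Assumes equal row counts; check.names = FALSE keeps the names unchanged
  data.frame(cols, check.names = FALSE)
}

# Recreate the sample data from the question in a temporary directory
tmp <- tempfile("csvdemo")
dir.create(tmp)
writeLines(c("V1,V2,V3,V4", "1,1986,64,61"), file.path(tmp, "File1.csv"))
writeLines(c("V1,V2,V3,V4", "1,1990,100,61"), file.path(tmp, "File2.csv"))

my.df <- merge_column(tmp)
print(my.df)
```

Naming the list elements before building the data frame means the headers never have to be patched afterwards. If the files could differ in length, data.frame() would stop with an error instead of silently recycling, which seems like the safer behaviour here.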
