I have 80 separate .csv files with the same columns and headers. I was able to import them and rbind them into one data frame using the following commands:

 file_names <- dir("~/Desktop/data") 
 df <- do.call(rbind,lapply(file_names,read.csv))

But I would like to add a new variable ("name") that identifies which .csv file each observation came from. For example, "name" would be "NY" for all observations from the 'NY.csv' file, "DC" for all observations from the 'DC.csv' file, and so on. Is there any way to do this without adding the new column manually to each .csv? Thanks!

3 Answers

This should do it:

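# full.names = TRUE returns paths that read.csv() can open from any working directory
file_names <- dir("~/Desktop/data", full.names = TRUE)
# basename() drops the directory so the name is taken from the file name only
df <- do.call(rbind, lapply(file_names, function(x) cbind(read.csv(x), name = strsplit(basename(x), '\\.')[[1]][1])))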
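A self-contained sketch of the same idea in base R, for anyone who wants to try it without touching their own data. The temporary directory, the two tiny files, and the helper name `read_with_name` are made up for illustration; only `NY.csv` and `DC.csv` echo the question.

```r
# Build a throwaway directory with two small CSVs (hypothetical data).
data_dir <- file.path(tempdir(), "csv_name_demo")
dir.create(data_dir, showWarnings = FALSE)
write.csv(data.frame(x = 1:2, y = c("a", "b")),
          file.path(data_dir, "NY.csv"), row.names = FALSE)
write.csv(data.frame(x = 3:5, y = c("c", "d", "e")),
          file.path(data_dir, "DC.csv"), row.names = FALSE)

# full.names = TRUE returns paths that read.csv() can open from any working directory.
paths <- list.files(data_dir, pattern = "\\.csv$", full.names = TRUE)

# Hypothetical helper: read one file and tag every row with its source.
read_with_name <- function(path) {
  out <- read.csv(path)
  # "NY.csv" -> "NY": drop the directory, then the extension.
  out$name <- tools::file_path_sans_ext(basename(path))
  out
}

df <- do.call(rbind, lapply(paths, read_with_name))
table(df$name)
```

Assigning the column inside a small function keeps the `lapply()` call readable, and `tools::file_path_sans_ext()` ships with base R, so nothing extra needs installing.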

3 Comments

Hey @mpjdem I like where you are going with this, but I get the following error. Any idea why/how to solve it? Thanks! "Error in !header : invalid argument type"
Does it help to explicitly set header=TRUE in the read.csv() call? (Supposing you do have a header; otherwise header=FALSE.)
No, I still get the same error. I have a header; can't figure it out. @mpjdem

With readr >= 2.0 just add the id option:

library(readr)
read_csv(file_names, id = "name")

If you would like to remove the csv at the end:

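library(dplyr)    # for %>% and mutate()
library(stringr)  # for str_remove()
read_csv(file_names, id = "name") %>%
   mutate(name = str_remove(name, "\\.csv$"))  # escape the dot and anchor at the end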
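One thing to watch when stripping the extension: in a regular expression an unescaped `.` matches any character, and the `id` column holds the path as it was passed to `read_csv()`. The sketch below uses base R string functions on three made-up paths (the third is a deliberately awkward, hypothetical file name) to compare an unescaped pattern with `tools::file_path_sans_ext()`.

```r
# Hypothetical paths; the last one contains "csv" in the middle of its name.
paths <- c("~/Desktop/data/NY.csv",
           "~/Desktop/data/DC.csv",
           "~/Desktop/data/xcsv_old.csv")

# Unescaped pattern: "." matches any character, so "xcsv" is removed as well.
greedy <- gsub(".csv", "", paths)

# Escaped and anchored pattern: only a literal ".csv" at the end is removed.
anchored <- sub("\\.csv$", "", paths)

# Drop both the directory and the extension to get the short label.
clean <- tools::file_path_sans_ext(basename(paths))

print(greedy)
print(anchored)
print(clean)
```

The same `"\\.csv$"` pattern should work with `stringr::str_remove()` inside `mutate()` if you prefer to stay in the tidyverse.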

See this thread for more options.

Use the idcol argument from data.table's rbindlist() function:

# get a vector of all file names, with full paths so read.csv() can find them
myfiles <- list.files("path/to/directory/", full.names = TRUE)

# loop over file names, reading in and saving each data.frame as an element in a list
n <- length(myfiles)
datalist <- vector(mode = "list", length = n)
for (i in seq_len(n)) {
    cat("importing file", i, ":", myfiles[i], "\n")
    datalist[[i]] <- read.csv(myfiles[i])
}

# name the list elements after their source files
names(datalist) <- basename(myfiles)

# combine all data.frames in datalist; idcol = TRUE stores each element's name in a column called ".id"
all_data <- data.table::rbindlist(datalist, idcol = TRUE)
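A more compact variant of the same approach, sketched under a few assumptions: the data.table package is installed, the files live in a throwaway temporary directory created here for illustration, and the id column should be called `name` (passing a string to `idcol` sets the column name). The file contents are invented.

```r
# Create two small CSVs in a temporary directory (hypothetical data).
data_dir <- file.path(tempdir(), "rbindlist_demo")
dir.create(data_dir, showWarnings = FALSE)
write.csv(data.frame(x = 1:2), file.path(data_dir, "NY.csv"), row.names = FALSE)
write.csv(data.frame(x = 3:5), file.path(data_dir, "DC.csv"), row.names = FALSE)

# Full paths, named by file name without the extension ("NY", "DC").
paths <- list.files(data_dir, pattern = "\\.csv$", full.names = TRUE)
names(paths) <- tools::file_path_sans_ext(basename(paths))

# lapply() keeps the names of its input, so the list is already labelled.
datalist <- lapply(paths, read.csv)

# idcol = "name" writes each list element's name into a column called "name".
all_data <- data.table::rbindlist(datalist, idcol = "name")
print(all_data)
```

Naming the path vector before reading avoids the explicit loop and the separate `names(datalist) <- ...` step.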
