I have several .txt files. Each file contains columns of data separated by commas, and each file has its own name.

So far I have combined these files into one big data frame, using the following code:

files = list.files()
data2=lapply(files, read.table, header=FALSE, sep=",")
data_rbind <- do.call("rbind", data2) 
colnames(data_rbind)[c(1,2,3)]<-c("name", "sex", "amount")

This returns:

name sex amount

Anna F 24567

Emma F 23210

Isabelle F 31212

Amanda F 22631

I would like to add a fourth column that gives, for each row, the name of the file that row came from.

So, for example, if the first file 'example1.txt' contained the following:

Anna, F, 24567

Emma, F, 23210

Isabelle, F, 31212

And the second file 'example2.txt' contained the following:

Amanda, F, 22631

Sara, F, 41355

Katie, F, 2387

I would like to get the following:

Name Sex Amount Year

Anna F 24567 example1.txt

Emma F 23210 example1.txt

Isabelle F 31212 example1.txt

Amanda F 22631 example2.txt

Sara F 41355 example2.txt

Katie F 2387 example2.txt

Is this possible?

3 Answers

Try:

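# Assumes the working directory contains only the data files
files <- list.files()
data2 <- lapply(files, read.table, header=FALSE, sep=",")
# Append the source file name as a fourth column of each data frame
for (i in seq_along(data2)) {
  data2[[i]] <- cbind(data2[[i]], files[i])
}
data_rbind <- do.call("rbind", data2)
colnames(data_rbind) <- c("name", "sex", "amount", "year")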
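
To check this approach end to end, here is a minimal sketch that recreates the two example files from the question in a temporary directory and then tags each data frame with its file name. The directory name `source_column_demo`, the `strip.white` setting, and the `colClasses` vector are assumptions added for the demo, not part of the original answer. The explicit `colClasses` is there because, once whitespace is stripped, a column holding only "F" may otherwise be converted to logical.

```r
# Hypothetical setup: recreate the question's example files in a temp directory.
dir <- file.path(tempdir(), "source_column_demo")
dir.create(dir, showWarnings = FALSE)
writeLines(c("Anna, F, 24567", "Emma, F, 23210", "Isabelle, F, 31212"),
           file.path(dir, "example1.txt"))
writeLines(c("Amanda, F, 22631", "Sara, F, 41355", "Katie, F, 2387"),
           file.path(dir, "example2.txt"))

# full.names = TRUE returns paths that can be read from any working directory.
files <- list.files(dir, pattern = "^example.*\\.txt$", full.names = TRUE)

# Read every file; colClasses keeps the "F" values as character strings.
data2 <- lapply(files, read.table, header = FALSE, sep = ",",
                strip.white = TRUE,
                colClasses = c("character", "character", "integer"))

# Tag each data frame with the base name of the file it came from.
for (i in seq_along(data2)) {
  data2[[i]]$year <- basename(files[i])
}

data_rbind <- do.call("rbind", data2)
colnames(data_rbind) <- c("name", "sex", "amount", "year")
print(data_rbind)
```

Using `basename()` keeps only the file name in the new column even though the files are read via their full paths.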

You could also use:

nm1 <- c("Name", "Sex", "Amount", "Year")
files <- list.files(pattern="^example")
files
#[1] "example1.txt" "example2.txt"

setNames(do.call(rbind, Map(`cbind`,
         lapply(files, read.table, sep=","), V4=files)), nm1)

#      Name Sex Amount         Year
#1     Anna   F  24567 example1.txt
#2     Emma   F  23210 example1.txt
#3 Isabelle   F  31212 example1.txt
#4   Amanda   F  22631 example2.txt
#5     Sara   F  41355 example2.txt
#6    Katie   F   2387 example2.txt

Or use rbindlist from data.table

library(data.table)
setnames(rbindlist(Map(`cbind`, lapply(files, fread), files)), nm1)[]
#       Name Sex Amount         Year
#1:     Anna   F  24567 example1.txt
#2:     Emma   F  23210 example1.txt
#3: Isabelle   F  31212 example1.txt
#4:   Amanda   F  22631 example2.txt
#5:     Sara   F  41355 example2.txt
#6:    Katie   F   2387 example2.txt


You can try something like:

files = list.files()
data2 = lapply(files, function(x) {
    # Read one file, then record which file its rows came from
    res <- read.table(x, header=FALSE, sep=",")
    res$year <- x
    res
})

data_rbind <- do.call("rbind", data2) 
colnames(data_rbind) <- c("name", "sex", "amount", "year")
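
A variation on the same idea, offered only as a sketch: the per-file function can name the columns while reading, so nothing needs renaming after the `rbind`. The helper name `read_with_source`, the directory `read_with_source_demo`, and the shortened demo files are hypothetical and chosen for illustration.

```r
# Hypothetical helper: read one comma-separated file and record its source.
read_with_source <- function(path) {
  res <- read.table(path, header = FALSE, sep = ",", strip.white = TRUE,
                    col.names = c("name", "sex", "amount"),
                    colClasses = c("character", "character", "integer"))
  res$year <- basename(path)  # file name only, without the directory
  res
}

# Demo inputs written to a temporary directory (shortened example data).
demo_dir <- file.path(tempdir(), "read_with_source_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines(c("Anna, F, 24567", "Emma, F, 23210"),
           file.path(demo_dir, "example1.txt"))
writeLines("Amanda, F, 22631", file.path(demo_dir, "example2.txt"))

# Read each file with the helper and stack the results.
paths <- list.files(demo_dir, pattern = "\\.txt$", full.names = TRUE)
combined <- do.call("rbind", lapply(paths, read_with_source))
print(combined)
```

Keeping the reading and tagging in one function means each file is handled in a single place, which may be easier to adjust if the file layout changes.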

2 Comments

It works great up until the last line of code where it gives the error: "Error in C(1, 2, 3) <- c("name", "sex", "amount", "year") : target of assignment expands to non-language object" ?
Yea, I just copied this part and didn't check it, it should work now.
