
I want to add the file name to my table, but it is not working as expected.
I iterate over a list of file names, read each file, and append its data to one data frame. For each appended file I also want to add its file name as a column, repeated on every row, so that later I can tell which file a given row came from.

data <- data.frame()
for (file in files){

  name = strsplit(file, split = "\\.")[[1]][1]

  data <- data %>% bind_rows(read_delim(file = file, delim = ";", col_types = cols(
    a = col_double(),
    b = col_double(),
    )) %>% mutate(name = name))
}

I expected mutate to do the trick, but in the end every row has the same value.

  • @akrun I left it out by mistake. I am splitting the file name, and I have edited the code now. Example: albert.csv -> name = albert. Commented Apr 5, 2020 at 20:35

1 Answer


As we are using tidyverse, an option is

library(readr)
library(purrr)
files_no_ext <- tools::file_path_sans_ext(files)
# Name each path by its extension-free file name; .id stores those names in the 'name' column
out <- imap_dfr(set_names(files, files_no_ext), ~
      read_delim(.x, delim = ";",
         col_types = cols(a = col_double(), b = col_double())),
              .id = 'name')

Or using data.table

library(data.table)
out <- rbindlist(lapply(setNames(files, files_no_ext), fread), idcol = 'name')

In the OP's for loop

library(dplyr)  # provides %>% and mutate()
dat <- data.frame()
for (file in files) {
  # file name without its extension, e.g. albert.csv -> albert
  name <- tools::file_path_sans_ext(file)
  tmpdat <- read_delim(file = file, delim = ";",
                       col_types = cols(
                         a = col_double(),
                         b = col_double()
                       )) %>%
    mutate(name = name)
  dat <- rbind(dat, tmpdat)
}
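
For anyone who wants to check the idea without the tidyverse, here is a minimal base R sketch of the same approach. The two demo files (albert.csv, berta.csv) and their contents are made up for illustration and are written to a temporary directory. It reads each file into a list element, tags the rows with the file name, and binds everything once at the end rather than growing the data frame inside the loop.

```r
# Hypothetical demo files, written to a temporary directory for illustration only
tmp <- tempfile("demo_")
dir.create(tmp)
writeLines(c("a;b", "1;2", "3;4"), file.path(tmp, "albert.csv"))
writeLines(c("a;b", "5;6"), file.path(tmp, "berta.csv"))

# list.files() returns the paths sorted alphabetically
files <- list.files(tmp, pattern = "\\.csv$", full.names = TRUE)

# Read each file and tag every row with the extension-free file name
pieces <- lapply(files, function(f) {
  d <- read.delim(f, sep = ";", colClasses = c(a = "numeric", b = "numeric"))
  # basename() drops the directory; file_path_sans_ext() alone would keep it
  d$name <- tools::file_path_sans_ext(basename(f))
  d
})

# Bind once at the end instead of appending inside the loop
out <- do.call(rbind, pieces)
print(out)
```

Note that tools::file_path_sans_ext() only removes the extension, so if the entries of files contain directories, wrapping the path in basename() as above keeps the name column short.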

2 Comments

I adjusted my code and it works. Do you think the first solution is faster or more efficient than the second one?
@CroatiaHR I think the first solution is more readable (subjective), and it automatically creates the column based on the .id. As for the faster option, you may have to try fread from data.table together with rbindlist.
