Create data tables for export based on values in a column

Question

Using this SO question as a starting point, if my data appears like this:

index   state  date              Amount
2       FL     2010-06-08         0
21      FL     2010-10-08        10
6       FL     2010-08-16        30
5       GA     2010-11-25        20
9       GA     2010-01-01         0
8       CA     2011-03-06        10
12      CA     2012-03-12        10
11      CA     2012-06-21        10
15      NY     2010-01-01        30
13      NY     2010-04-06        20

How do I use the loop example from that question's highest voted answer to create data tables for export that are named based on the state value? My goal is to export each state-specific data table to csv for separate analyses. These are large datasets so prefer using data.table package.

Below is the loop from the question linked above using the iris dataset.

iris_split <- split(iris, iris$Species)
new_names <- c("one", "two", "three")
for (i in 1:length(iris_split)) {
  assign(new_names[i], iris_split[[i]])
}

You can use the same code. just change the assign part to write.csv i.e. write.csv(iris_split[[i]], paste0(new_names[i], ".csv"), quote = FALSE, row.names = FALSE) — akrun
– akrun, Commented Nov 28, 2022 at 17:11
sorry, but in the example new_names is hard-coded rather than automatically pulling from the state value so this is not the full solution. — mikeLdub
– mikeLdub, Commented Nov 28, 2022 at 17:17

akrun · Accepted Answer · 2022-11-28 17:41:59Z

3

We could either split as in the solution posted in the link and then loop over the names of the list (split creates a named list), and do either write.csv or fwrite (from data.table as it would be fast for larger datasets)

library(data.table)
my_list <- split(df1, df1$state)
for(nm in names(my_list)) 
   {
     fwrite(my_list[[nm]], paste0(nm, ".csv"))
   }

Or another option would be a standard data.table group by approach. Here, we grouped by 'state' (after converting the data.frame to data.table - setDT), then fwrite the subset of data.table (.SD - doesn't include the group column, and write to a file created with grouping column values (.BY). For large datasets, removing a column can save some space as it is the same value that got repeated. Instead, we can identify with the file name

setDT(df1)[, fwrite(.SD, file.path(getwd(), paste0(.BY, ".csv"))), by = state]

edited Nov 28, 2022 at 17:41

answered Nov 28, 2022 at 17:19

akrun

891k38 gold badges590 silver badges700 bronze badges

Sign up to request clarification or add additional context in comments.

Comments

Jilber Urbina · Accepted Answer · 2022-11-28 21:43:30Z

3

Using fwrite from data.table

library(data.table)
my_list <- split(df, df$state)
lapply(seq_along(my_list), function(i){
   fwrite(my_list[[i]], paste0(names(my_list)[i], ".csv"))
})

edited Nov 28, 2022 at 21:43

answered Nov 28, 2022 at 17:17

Jilber Urbina

61.4k10 gold badges116 silver badges141 bronze badges

2 Comments

r2evans Over a year ago

Perhaps seq_len(length(my_list)) -> seq_along(my_list)?

Jilber Urbina Over a year ago

Yes, that`s much better @r2evans

Collectives™ on Stack Overflow

Create data tables for export based on values in a column

2 Answers 2

Comments

2 Comments

Your Answer

Linked

Hot Network Questions

Collectives™ on Stack Overflow

2 Answers 2

Comments

2 Comments

Your Answer

Sign up or log in

Post as a guest

Linked

Related