I have a folder of Excel files, all identical in format. Here's a simplified version of what I'm working with:

country   count  year
USA       23232  2019
USA        3993  2019
RUSSIA    67574  2019
JAPAN        31  2019
JAPAN       535  2019

I would like to do the following to every file in my folder:

df %>% 
  group_by(country, year) %>% 
  summarize(count = sum(count))

In one file this will look like:

  country     year count
1 JAPAN       2019   566
2 RUSSIA      2019 67574
3 USA         2019 27225

How can I do this for every file in my folder? Again, the files are identical in format. The end goal is one data frame containing the count data from all the files. Tidyverse preferred.

  • can you give an example of your directory structure in your question? Commented Nov 19, 2020 at 15:59

2 Answers

One approach: write a function that reads an Excel file (I used sheet = 1, but you can change it), summarizes it, and adds a key column holding the file name. Then apply that function to every file with lapply() and combine the resulting list with bind_rows(). Here is the code:

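library(readxl)
library(dplyr)
# List the Excel files; full.names = TRUE keeps the folder in each path,
# so the files can be read from any working directory
vec <- list.files(path = 'Your/Path/Here', pattern = '\\.xlsx$', full.names = TRUE)
# Read one file, summarize it, and tag the rows with the source file name
readprocess <- function(x)
{
  y <- read_excel(x, 1)
  z <- y %>% 
    group_by(country, year) %>% 
    summarize(count = sum(count), .groups = 'drop') %>%
    mutate(Filename = basename(x))
  return(z)
}
# Apply the function to every file
List <- lapply(vec, readprocess)
# Bind the per-file summaries into one data frame
df <- bind_rows(List)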
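
Note that each file is summarized on its own, so the bound result can contain the same country/year pair once per file. If a single total across all files is what you want (an assumption, since the question does not say), a second aggregation is one possible sketch. The two small data frames and their file names below are made up and only stand in for the elements of `List`:

```r
library(dplyr)

# Hypothetical per-file summaries, standing in for the elements of `List`
per_file <- list(
  data.frame(country = c("JAPAN", "USA"), year = c(2019, 2019),
             count = c(566, 27225), Filename = "a.xlsx"),
  data.frame(country = c("JAPAN", "RUSSIA"), year = c(2019, 2019),
             count = c(10, 67574), Filename = "b.xlsx")
)

# Stack the per-file summaries (one row per country/year/file)
combined <- bind_rows(per_file)

# Collapse across files: one row per country/year
totals <- combined %>%
  group_by(country, year) %>%
  summarize(count = sum(count), .groups = "drop")
```

Keep `combined` if you need to know which file each count came from, and use `totals` if you only need the overall sums.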

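# Alternative: loop over the files with lapply() and an anonymous function
library(readxl)
library(dplyr)

folder_path <- "insert_path_here"
# full.names = TRUE returns full paths, so no paste0() is needed
# (the pattern assumes the files have an .xlsx extension)
files <- list.files(path = folder_path, pattern = "\\.xlsx$", full.names = TRUE)

results <- lapply(files, function(x) {
    # The files are Excel workbooks, so use read_excel() rather than read.csv()
    read_excel(x) %>%
        group_by(country, year) %>%
        summarize(count = sum(count), .groups = "drop")
})
df_results <- bind_rows(results)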

Alternatively, you could list the files manually in a vector. Either way, the script will:

  • Run through each file in that list, and for each file:
      • read in the spreadsheet
      • perform the data manipulation
  • Bind all the results into one data frame
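
Since the question asks for a tidyverse solution, the same steps can also be sketched with purrr. This is a minimal sketch and not the only way to do it: the helper name `summarise_folder`, the `file` column, and the `.xlsx` pattern are my own assumptions, and the column names are taken from the example in the question.

```r
library(readxl)
library(dplyr)
library(purrr)

# Hypothetical helper: summarize every .xlsx file in `folder_path`
summarise_folder <- function(folder_path) {
  files <- list.files(folder_path, pattern = "\\.xlsx$", full.names = TRUE)
  # Name each path by its file name so bind_rows() can record the source file
  files <- set_names(files, basename(files))

  files %>%
    map(function(f) {
      # Read the first sheet and sum the counts per country and year
      read_excel(f, sheet = 1) %>%
        group_by(country, year) %>%
        summarize(count = sum(count), .groups = "drop")
    }) %>%
    # Stack the per-file summaries; the list names go into the `file` column
    bind_rows(.id = "file")
}
```

Calling `summarise_folder("insert_path_here")` should return one data frame with a row per file, country and year. Drop the `.id` argument if you do not need to know which file a row came from.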
