What I am looking to do is loop through all Excel files in a directory and, where the value in column 'Account.ID' matches an entry in my 'IDList', change the 'Type' column value to 'REDACTED'. The data frame then needs to be written back to the original Excel file, saved and closed before moving on to the next file.

The code below works for obtaining the list of files in the directory, opening the first file and making the necessary amendment.

Can anyone please help with writing the data back to the original Excel file and ensuring the loop runs to completion so that all files are amended?

#ID list contains thousands, simplified for ease of reading
IDList <- c("7c850aaa", "07311bbb")

MYFILEPATH <- "\\\\dcf.network\\data\\\\R\\Test Folder"

# get a vector of all filenames
files <- list.files(
  path=MYFILEPATH,
  full.names = TRUE,
  recursive = TRUE
)

for (i in seq_along(files)) {
  data <- read.xlsx(files[i])

  # set Type to "REDACTED" on rows whose Account.ID is in IDList
  cols <- c("Type")
  data[data$Account.ID %in% IDList, cols] <- "REDACTED"
}
  • Try putting xlsx::write.xlsx(data, files[i]) as the loop's last instruction. Commented Nov 24, 2021 at 12:12
  • Thanks, that leads to this error I'm afraid: Error in .jcall(row[[ir]], "Lorg/apache/poi/ss/usermodel/Cell;", "createCell", : Java Exception <no description because toString() failed> Commented Nov 24, 2021 at 12:30

1 Answer

Give the openxlsx package a try. Unlike the xlsx package, it does not depend on Java, which is where the .jcall error in the comments comes from. I'm assuming there's only one sheet per Excel file; openxlsx can also handle multiple sheets, provided each sheet holds a plain data table. For the one-sheet case, it takes just one additional line in your loop to overwrite the original file with the updated data. You can use openxlsx to read the data in as well.

install.packages("openxlsx", dependencies = TRUE)

#ID list contains thousands, simplified for ease of reading
IDList <- c("7c850aaa", "07311bbb")

MYFILEPATH <- "\\\\dcf.network\\data\\\\R\\Test Folder"

# get a vector of all filenames
files <- list.files(
  path=MYFILEPATH,
  full.names = TRUE,
  recursive = TRUE
)

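for (i in seq_along(files)) {
  # read the first sheet of the workbook into a data frame
  data <- openxlsx::read.xlsx(files[i])

  # set Type to "REDACTED" on rows whose Account.ID is in IDList
  cols <- c("Type")
  data[data$Account.ID %in% IDList, cols] <- "REDACTED"

  # overwrite the original file with the amended data
  openxlsx::write.xlsx(data, file = files[i], overwrite = TRUE)
}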
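
If some workbooks hold more than one sheet, or a single unreadable file should not stop the whole run, something along these lines might work. It is only a sketch: the helper names redact_types, redact_workbook and redact_all are made up for illustration, it assumes openxlsx is installed, and it assumes every sheet is a plain data table. Writing data frames back this way does not preserve the original cell formatting or formulas.

```r
# Hypothetical helper: set Type to "REDACTED" on rows whose Account.ID is in `ids`.
# Anything without both columns (including an empty sheet) is returned unchanged.
redact_types <- function(data, ids) {
  if (is.data.frame(data) && all(c("Account.ID", "Type") %in% names(data))) {
    data[data$Account.ID %in% ids, "Type"] <- "REDACTED"
  }
  data
}

# Hypothetical helper: read every sheet of one workbook, redact it,
# and overwrite the file with a named list of data frames (one per sheet).
redact_workbook <- function(path, ids) {
  sheet_names <- openxlsx::getSheetNames(path)
  sheets <- lapply(sheet_names, function(s) {
    redact_types(openxlsx::read.xlsx(path, sheet = s), ids)
  })
  names(sheets) <- sheet_names
  openxlsx::write.xlsx(sheets, file = path, overwrite = TRUE)
  invisible(path)
}

# Hypothetical helper: process all files; an error in one file is reported
# and the remaining files are still processed.
redact_all <- function(files, ids) {
  ok <- vapply(files, function(f) {
    tryCatch({
      redact_workbook(f, ids)
      TRUE
    }, error = function(e) {
      message("Skipping ", f, ": ", conditionMessage(e))
      FALSE
    })
  }, logical(1))
  invisible(ok)
}
```

Calling redact_all(files, IDList) returns a named logical vector, so names(which(!result)) lists the files that were skipped. It would be sensible to try this on a copy of the folder first.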

As an aside, it might be useful to pass a pattern to list.files so that only Excel files are read in, depending on what is in the folders and how precise you need to be. The pattern is a regular expression, so "\\.xlsx$" (a literal dot, anchored at the end of the name) is stricter than ".xlsx". For example:

files <- list.files(
  path=MYFILEPATH,
  full.names = TRUE,
  recursive = TRUE,
  pattern = "\\.xlsx$"
)
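
Because list.files treats pattern as a regular expression, the effect of a loose versus an anchored pattern can be checked with grepl on a few made-up file names (the names below are purely illustrative). The last one mimics the ~$ lock file that Excel leaves next to an open workbook, which is probably worth excluding as well:

```r
# Hypothetical file names, for illustration only
candidates <- c("report.xlsx", "notes_xlsx_draft.txt", "old.xlsx.bak", "~$report.xlsx")

# "." matches any character and nothing anchors the match to the end of the name
loose <- grepl(".xlsx", candidates)

# literal dot, anchored at the end of the name, with Excel lock files dropped
strict <- grepl("\\.xlsx$", candidates) & !startsWith(candidates, "~$")

candidates[loose]
candidates[strict]
```

Here the loose pattern keeps all four names, while the strict filter keeps only report.xlsx. When list.files is called with full.names = TRUE, the startsWith check would need to be applied to basename(files) rather than to the full paths.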