I am trying to loop through a list of data frames in the global environment. I want to extract the variable name, substring the variable name, filter (tidyverse) each dataframe, and then save each filtered dataframe. However, I'm having quite a bit of trouble:

query_loop <- function(df){

    name <- deparse(substitute(df));
    cpt <- paste("cpt_","20", substring(name, 14, 15), sep = "");
    assign(cpt, filter(df, CPT == "12345"));
    write.table(cpt, file = paste(deparse(substitute(cpt)), ".txt", sep =""), row.names = F, sep = "\t");
}

dfs1 <- lapply(dfs, query_loop)

The code fails at the first step of my function. When I try to print(deparse(substitute(df))), I get a list of X[[i]], which I understand is because the dataframes are not named when I pass them to lapply. However, I don't know what the correct solution is.

Any help would be greatly appreciated. Thanks!

  • names(dfs) is a character vector, so df in your function is a length-1 character vector with the name of the current data frame. Normally one uses deparse(substitute()) to get a string--you already have a string. Commented Jul 28, 2022 at 16:19
  • The correct solution is not to have all your data frames in the global environment, but in a list. How does names(df) return the names of data frames in your global environment? If you already have the names of all the data frames, you can use mget to obtain all those data frames in a list Commented Jul 28, 2022 at 16:19
  • Sorry, I posted my code wrong. I am passing in dfs into lapply, which is a list of the dataframes. The rest of my question is correct, i.e. when I try to print(deparse(substitute(df))), it prints out X[[i]] 6 times (the number of dataframes in the list) Commented Jul 28, 2022 at 16:23
  • So... is your list named? If so, use the names of the list as your code shows. If not, name the list and use the names of the list as your code shows. Commented Jul 28, 2022 at 16:31
  • Otherwise there's this workaround (and I would close your question as a duplicate of that one), but working with the names seems much easier. Commented Jul 28, 2022 at 16:32
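
The name-based approach suggested in the comments can be sketched as follows. This is a minimal illustration, not the asker's actual data: the list `dfs`, its element names (`claims_sample18`, `claims_sample19`), and the helper `filter_by_name` are hypothetical, with the names chosen so that characters 14-15 hold a two-digit year. Base-R subsetting stands in for `dplyr::filter()` so the sketch runs without extra packages.

```r
# Toy stand-ins for the real data frames (hypothetical names and values).
dfs <- list(
  claims_sample18 = data.frame(CPT = c("12345", "99999"), n = 1:2),
  claims_sample19 = data.frame(CPT = c("12345", "12345"), n = 3:4)
)

# Inside lapply(), substitute() only sees lapply's own internal expression,
# so the original variable name is not recoverable this way.
seen <- unlist(lapply(dfs, function(df) deparse(substitute(df))))
print(unname(seen))

# Iterating over names(dfs) hands the function the name as a plain string.
filter_by_name <- function(nm) {
  df <- dfs[[nm]]
  # base-R equivalent of filter(df, CPT == "12345")
  df[df$CPT == "12345", , drop = FALSE]
}

filtered <- lapply(names(dfs), filter_by_name)
names(filtered) <- paste0("cpt_20", substring(names(dfs), 14, 15))
print(names(filtered))
```

Because the name arrives as a string, there is no need for `deparse(substitute())` or `assign()`; the results stay together in one named list.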

1 Answer

Suggested simplification (untested, obviously, as there's no data to test on).

## assumption: `dfs` is a named list of data frames

# create a list of filtered data frames with appropriate names
filtered_list = lapply(dfs, filter, CPT == "12345")
names(filtered_list) = paste0("cpt_","20", substring(names(dfs), 14, 15))

# write them to files
lapply(names(filtered_list), function(nm) {
  write.table(
    x = filtered_list[[nm]],
    file = paste0(nm, ".txt"),
    row.names = FALSE, sep = "\t")
})
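
Run at the console, the writing step above prints a list of NULLs, because `lapply()` collects the return value of every `write.table()` call. A possible variation, sketched here with hypothetical toy data and a temporary output directory (both are assumptions, not part of the original answer), wraps the whole call in `invisible()`:

```r
# Hypothetical filtered data frames standing in for the real filtered_list.
filtered_list <- list(
  cpt_2018 = data.frame(CPT = "12345", n = 1L),
  cpt_2019 = data.frame(CPT = c("12345", "12345"), n = 3:4)
)

# Write into a temporary directory so the sketch leaves the working
# directory untouched (an assumption; the answer writes to the cwd).
out_dir <- tempdir()

# invisible() around the whole lapply() suppresses the printed list of NULLs.
invisible(lapply(names(filtered_list), function(nm) {
  write.table(
    x = filtered_list[[nm]],
    file = file.path(out_dir, paste0(nm, ".txt")),
    row.names = FALSE, sep = "\t"
  )
}))

written <- file.path(out_dir, paste0(names(filtered_list), ".txt"))
print(file.exists(written))
```

A plain `for` loop over `names(filtered_list)` would avoid the issue as well, since a loop does not collect return values.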

7 Comments

Thank you - this worked. However, is there a way to programmatically name all the data frame elements of the dfs list? I first loaded several data frames into the global environment, and I would like to name the elements of the list after the global environment variable names. I built the list using Filter(function(x) is(x, "data.frame"), mget(ls())). Also, after I run the code you provided above, the console prints [[1]] NULL, [[2]] NULL, and so on through [[6]] NULL. Is there a reason why?
The NULLs are there because write.table returns NULL, and lapply collects those return values into a list that gets printed. You can stop the printing by wrapping the whole lapply() call in invisible(). As for the naming, I'd strongly suggest loading the data frames directly into a list, not the global environment. See my answer at How to make a list of data frames? for examples and discussion.
But I'm also confused that you don't have names, because when I run your code, x = Filter(function(x) is(x, "data.frame"), mget(ls())), and I look at names(x), x does have the names of the global environment variables.
That's weird. The way I have been loading the dataframes is by doing: temp = list.files(pattern="*.txt") , then for (i in 1:length(temp)) assign(temp[i], read.delim(temp[i])), then dfs <- Filter(function(x) is(x, "data.frame"), mget(ls())). I will try to load the files into a named list next time.
You should do temp = list.files(pattern="*.txt"); data_list = lapply(temp, read.delim); names(data_list) = temp
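
The loading approach from the last comment can be sketched end to end. The file names and contents below are hypothetical, and the sketch works in a temporary directory so that it is self-contained. It also uses the regular expression `"\\.txt$"` rather than the glob-style `"*.txt"`, and strips the extension from the names, both of which are choices made here rather than something the comment prescribes.

```r
# Create two small tab-delimited files to read back (hypothetical data).
in_dir <- file.path(tempdir(), "named_list_demo")
dir.create(in_dir, showWarnings = FALSE)
write.table(data.frame(CPT = c("12345", "99999"), n = 1:2),
            file.path(in_dir, "claims_sample18.txt"),
            sep = "\t", row.names = FALSE)
write.table(data.frame(CPT = c("12345", "12345"), n = 3:4),
            file.path(in_dir, "claims_sample19.txt"),
            sep = "\t", row.names = FALSE)

# Read every .txt file straight into a list; nothing is assigned
# into the global environment.
files <- list.files(in_dir, pattern = "\\.txt$", full.names = TRUE)
data_list <- lapply(files, read.delim)

# Name each element after its file, without directory or extension.
names(data_list) <- tools::file_path_sans_ext(basename(files))
print(names(data_list))
```

With the names set this way, the filtering and writing steps from the answer can use `names(data_list)` directly.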