I'm having trouble outputting the results of a for loop to a list/vector in R. The loop is running over a df structured as below, where each unique ID is represented by 1 to n rows:
id <- c(1, 2, 2, 2, 3, 4, 5, 6)
string <- c("apple", "grape", "orange", "blueberry", "plum", "tomato", "pear", "plum")
df <- data.frame(id, string)
For each unique ID, I want to build a list entry that collapses its n rows into a single concatenated character string taken from column "string". So I have:
# Concatenate strings, where d = data frame, n = column name, and s = separator character
concat <- function(d, n, s) {
  list_value <- paste0(d[[n]], sep = s)
  return(list_value)
}
#create two empty lists
string_list <- list()
item_list <- list()
library(dplyr)  # filter() below is dplyr's
# Loop the concatenate function over each unique id in the df
for (i in unique(df$id)) {
  item <- filter(df, id == i)
  print(item)
  item_list[i] <- item
  strings <- concat(item, "string", ";")
  print(strings)
  string_list[i] <- strings
}
I can see from the print statements that the loop is running "correctly" (the output I want is printed to the console), but I get warnings that "number of items to replace is not a multiple of replacement length". In addition, string_list and item_list end up as impossibly large objects (a df of ~2000 rows becomes a list of ~10M elements).
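A minimal sketch of what that warning seems to point at, assuming the cause is single-bracket assignment: x[i] <- value fills exactly one list slot, so a value with more than one element (a data frame is a list of columns) is truncated with that warning, whereas x[[i]] <- value stores the whole object in one slot. It may also explain the huge lists if the real ids are large numbers, since assigning at position i pads every position before it. The names single, double and padded are made up for illustration.

```r
# Toy data, same shape as the example in the question
df <- data.frame(id = c(1, 2, 2, 2, 3),
                 string = c("apple", "grape", "orange", "blueberry", "plum"),
                 stringsAsFactors = FALSE)
item <- df[df$id == 2, ]   # a 3-row data frame, i.e. a list of 2 columns

single <- list()
# '[' fills one slot, but the data frame has length 2 (its columns),
# so R warns and keeps only the first column
single[1] <- item          # warning: number of items to replace is not a multiple ...

double <- list()
double[[1]] <- item        # '[[' stores the whole data frame in one slot

# Assigning at a large position pads the list with NULLs up to that position
padded <- list()
padded[[5000]] <- "x"
length(padded)             # 5000
```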
If at the beginning of the loop I instead say:
for (i in 1:length(df$id))
I get a list that is the same length as the number of rows in the original df, but it's empty (it returns integer [0] or character [1] for all). There are no NAs in the original df (checked with table(is.na(df$col_name)) for all columns). I get the same warnings.
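For comparison, a sketch of looping over the positions of the unique ids rather than over row numbers or raw id values, so the list gets exactly one slot per id. It uses base R subsetting in place of filter() to stay self-contained, and ids and rows_by_id are hypothetical names.

```r
id <- c(1, 2, 2, 2, 3, 4, 5, 6)
string <- c("apple", "grape", "orange", "blueberry", "plum", "tomato", "pear", "plum")
df <- data.frame(id, string, stringsAsFactors = FALSE)

ids <- unique(df$id)                       # one entry per distinct id
rows_by_id <- vector("list", length(ids))  # preallocate one slot per id

for (k in seq_along(ids)) {
  # k is the position (1, 2, ...); ids[k] is the id value itself
  rows_by_id[[k]] <- df[df$id == ids[k], ]
}
names(rows_by_id) <- ids                   # record which slot belongs to which id

length(rows_by_id)       # 6, one per unique id
nrow(rows_by_id[["2"]])  # 3
```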
Using string_list <- c() instead of string_list <- list() does not seem to help.
I'm missing something simple. What is it? Thanks
EDIT: I think I see part of the problem. The object "item" is a (small) df, and appending a series of dfs to a list would result in a large object. But replacing item_list <- list() with
item_data <- data.frame(Col1 = integer(), Col2 = character(), stringsAsFactors = FALSE)
gives the error "new columns would leave holes after existing columns".
Change the item_list[i] and string_list[i] assignments to item_list[[i]] and string_list[[i]].
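A sketch of one way the loop might look with [[ assignments, assuming the goal is one string per id joined with ";". That needs paste(..., collapse = s); paste0() has no sep argument, so sep = s is simply pasted onto the end of every element. concat_collapse and result are hypothetical names, and the aggregate() call is just one loop-free alternative.

```r
id <- c(1, 2, 2, 2, 3, 4, 5, 6)
string <- c("apple", "grape", "orange", "blueberry", "plum", "tomato", "pear", "plum")
df <- data.frame(id, string, stringsAsFactors = FALSE)

# collapse = s joins all elements of the column into a single string
concat_collapse <- function(d, n, s) {
  paste(d[[n]], collapse = s)
}

ids <- unique(df$id)
string_list <- vector("list", length(ids))
for (k in seq_along(ids)) {
  item <- df[df$id == ids[k], ]
  string_list[[k]] <- concat_collapse(item, "string", ";")
}

# Loop-free alternative: one row per id with the strings collapsed
result <- aggregate(string ~ id, data = df, FUN = paste, collapse = ";")
result$string[result$id == 2]  # "grape;orange;blueberry"
```

The loop keeps the original structure; the aggregate() version returns a data frame with one row per id, which may be closer to what the item_data attempt was after.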