I have a loop that recodes values of a column and breaks when a condition is met. I would like to use this loop, or its basic concept, on a list of data frames with the same format.
Sample data:
Id <- as.factor(c(rep("01001", 11), rep("01043", 11), rep("01065", 11), rep("01069", 11)))
YearCode <- as.numeric(rep(1:11, 4))
Type <- c(NA,NA,NA,NA,NA,NA,NA,2,NA,NA,NA,NA,NA,NA,
NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,
NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,2,NA)
test <- NA
sample_df <- data.frame(Id, YearCode, Type, test)
# A part of sample_df
one_df <- subset(sample_df, Id == "01069")
This for loop works fine for one data frame:
# example for loop using example data frame "one_df"
for (i in seq_along(one_df$Id)) {
  if (is.na(one_df$Type[i])) {  # if Type is NA, recode test to 0
    one_df$test[i] <- 0
  } else {  # stop at the first non-NA Type; rows after it keep test = NA
    break
  }
}
However, I have many data frames with this same format in a list. I would like to keep them in the list and apply this loop over the whole list.
# example list : split data frame into list by Id
sample_list <- split(sample_df, sample_df$Id, drop = TRUE)
I've looked around other posts such as this one, but I get stuck when trying to loop over each data frame in the list or write a similar function using lapply. How can I modify this loop to work on the list (sample_list), using either a for loop, lapply, or something else?
Any tips would be greatly appreciated, let me know if I need to clarify anything. Thanks!
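
For illustration, here is a minimal sketch of one possible approach, not a definitive solution: wrap the loop in a function that takes a data frame and returns the modified copy, then call it on every element with lapply. The helper names recode_leading_na and recode_leading_na_vec are hypothetical and do not come from any package. The second helper is an assumed loop-free equivalent based on cumsum, which should give the same result whenever the first Type value in a data frame is NA, as it is in the sample data.

```r
# Rebuild the sample data from the question
Id <- as.factor(c(rep("01001", 11), rep("01043", 11), rep("01065", 11), rep("01069", 11)))
YearCode <- as.numeric(rep(1:11, 4))
Type <- c(NA,NA,NA,NA,NA,NA,NA,2,NA,NA,NA,NA,NA,NA,
          NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,
          NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,2,NA)
test <- NA
sample_df <- data.frame(Id, YearCode, Type, test)
sample_list <- split(sample_df, sample_df$Id, drop = TRUE)

# Hypothetical helper: the original loop, wrapped in a function.
# It works on a local copy of the data frame and returns that copy.
recode_leading_na <- function(df) {
  for (i in seq_along(df$Type)) {
    if (is.na(df$Type[i])) {
      df$test[i] <- 0   # Type is NA: recode test to 0
    } else {
      break             # first non-NA Type: stop, later rows keep test = NA
    }
  }
  df
}

# Hypothetical loop-free variant: cumsum() counts the non-NA Type values
# seen so far, so it is 0 exactly on the rows before the first non-NA value.
recode_leading_na_vec <- function(df) {
  df$test[cumsum(!is.na(df$Type)) == 0] <- 0
  df
}

# Apply the helper to every data frame; the result is still a named list
result_list <- lapply(sample_list, recode_leading_na)
result_list_vec <- lapply(sample_list, recode_leading_na_vec)

result_list[["01069"]]$test
```

Because R passes the data frame into the function as a copy, the function has to return df, and the output of lapply has to be assigned (here to result_list) for the change to be kept. The list names are the Id levels, so an individual data frame can still be pulled out with result_list[["01069"]].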