I have a list of thousands of data frames.

Each one has the following structure:

structure(list(frame = c(222, 223, 224, 225, 226, 227, 228, 229, 
230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 
243, 244, 245, 246, 247, 248, 249, 250, 251, 252), room = c("B6", 
NA, NA, NA, NA, "B6", NA, NA, "B6", NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, "B6", NA, NA, NA, NA, NA, NA, "B6"
), id = c(2, NA, NA, NA, NA, 85, NA, NA, 2, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, 32, NA, NA, NA, NA, NA, NA, 
1), id_prob = c(0.710559149006359, NA, NA, NA, NA, 0.676624962451645, 
NA, NA, 0.650006199807849, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, 0.668218888964693, NA, NA, NA, NA, NA, NA, 
0.786722974412071), x = c(1606, NA, NA, NA, NA, 1319, NA, NA, 
1636, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
1316.75, NA, NA, NA, NA, NA, NA, 656.5), y = c(-472.25, NA, NA, 
NA, NA, -516.5, NA, NA, -463.5, NA, NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, -520, NA, NA, NA, NA, NA, NA, -941), 
    orientation = c(84.5596680381038, NA, NA, NA, NA, 51.3401926511951, 
    NA, NA, 71.565048727047, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, 63.4349516145757, NA, NA, NA, NA, 
    NA, NA, 120.963756691571), area = c(-133, NA, NA, NA, NA, 
    -98, NA, NA, -140, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, -130, NA, NA, NA, NA, NA, NA, -166)), row.names = c(NA, 
-31L), class = c("tbl_df", "tbl", "data.frame"))

I have the following code, which uses na.locf() from the zoo package to fill in runs of NA values as long as a run is at most 20 rows (maxgap = 20). Longer runs are left as NA.

df[c('id','x','y')] <- na.locf(df[c('id','x','y')], na.rm = FALSE, maxgap = 20)

This works completely fine on single data frames and results in the following output.

structure(list(frame = c(222, 223, 224, 225, 226, 227, 228, 229, 
230, 231, 232, 233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 
243, 244, 245, 246, 247, 248, 249, 250, 251, 252), room = c("B6", 
NA, NA, NA, NA, "B6", NA, NA, "B6", NA, NA, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, NA, NA, NA, "B6", NA, NA, NA, NA, NA, NA, "B6"
), id = c(2, 2, 2, 2, 2, 85, 85, 85, 2, 2, 2, 2, 2, 2, 2, 2, 
2, 2, 2, 2, 2, 2, 2, 32, 32, 32, 32, 32, 32, 32, 1), id_prob = c(0.710559149006359, 
NA, NA, NA, NA, 0.676624962451645, NA, NA, 0.650006199807849, 
NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 0.668218888964693, 
NA, NA, NA, NA, NA, NA, 0.786722974412071), x = c(1606, 1606, 
1606, 1606, 1606, 1319, 1319, 1319, 1636, 1636, 1636, 1636, 1636, 
1636, 1636, 1636, 1636, 1636, 1636, 1636, 1636, 1636, 1636, 1316.75, 
1316.75, 1316.75, 1316.75, 1316.75, 1316.75, 1316.75, 656.5), 
    y = c(-472.25, -472.25, -472.25, -472.25, -472.25, -516.5, 
    -516.5, -516.5, -463.5, -463.5, -463.5, -463.5, -463.5, -463.5, 
    -463.5, -463.5, -463.5, -463.5, -463.5, -463.5, -463.5, -463.5, 
    -463.5, -520, -520, -520, -520, -520, -520, -520, -941), 
    orientation = c(84.5596680381038, NA, NA, NA, NA, 51.3401926511951, 
    NA, NA, 71.565048727047, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, NA, NA, 63.4349516145757, NA, NA, NA, NA, 
    NA, NA, 120.963756691571), area = c(-133, NA, NA, NA, NA, 
    -98, NA, NA, -140, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 
    NA, NA, NA, NA, -130, NA, NA, NA, NA, NA, NA, -166)), row.names = c(NA, 
-31L), class = c("tbl_df", "tbl", "data.frame"))
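
As a small illustration of how maxgap behaves, here is a minimal sketch on a made-up vector (the vector v and the value maxgap = 2 are assumptions chosen for the example, not part of the data above). A run of NA values is filled only when it is no longer than maxgap.

```r
library(zoo)  # provides na.locf()

# Toy vector, made up for illustration: one NA run of length 2, one of length 3.
v <- c(1, NA, NA, 2, NA, NA, NA, 3)

# Fill runs of at most 2 consecutive NAs; longer runs stay NA.
filled_v <- na.locf(v, na.rm = FALSE, maxgap = 2)
print(filled_v)
# The length-2 run is filled with 1; the length-3 run is left untouched:
# 1 1 1 2 NA NA NA 3
```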

To keep track of which rows were filled in and which were already present in the raw data, I only want to apply this to specific columns. It is critical that only the NA values in the 3 specified columns (id, x and y) get filled in. All other columns should keep their NA values.

When I try to run this on every data frame in the list, I use:

test <- lapply(list, function(x) na.locf(x[c('id','x','y')], na.rm = FALSE, maxgap = 20))

Unfortunately, this returns only those 3 columns and drops all the others from each data frame. The following alternative keeps all columns, but fills in the gaps in every one of them:

test <- lapply(list, function(x) na.locf(x, na.rm = FALSE, maxgap = 20))

Is there a way to apply my original code to the entire list of dataframes?

Thanks!

1 Answer

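You can use the same code that you used for a single data frame. Do the assignment inside the function and return the whole data frame x as the last expression, so that lapply keeps every column:

test <- lapply(list, function(x) {
  # Fill only the three selected columns; all other columns are left untouched
  x[c('id','x','y')] <- na.locf(x[c('id','x','y')], na.rm = FALSE, maxgap = 20)
  # Return the full data frame, not just the filled columns
  x
})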
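
For a quick check that only the chosen columns change, the same idea can be sketched on a tiny made-up list. The helper name fill_cols, the list dfs and its values are hypothetical and exist only for this illustration. The sketch assumes the zoo package is installed.

```r
library(zoo)  # provides na.locf()

# Hypothetical helper: fill only the named columns, then return the whole data frame.
fill_cols <- function(df, cols = c("id", "x", "y"), maxgap = 20) {
  df[cols] <- na.locf(df[cols], na.rm = FALSE, maxgap = maxgap)
  df
}

# Toy data, made up for illustration.
dfs <- list(
  a = data.frame(frame = 1:4, id = c(2, NA, NA, 85), x = c(10, NA, NA, 20),
                 y = c(-1, NA, NA, -2), area = c(-133, NA, NA, -98)),
  b = data.frame(frame = 1:3, id = c(1, NA, NA), x = c(5, NA, NA),
                 y = c(-3, NA, NA), area = c(-50, NA, NA))
)

# Apply the helper to every data frame in the list.
filled <- lapply(dfs, fill_cols)

# The area column still contains its original NA values,
# while id, x and y have been carried forward.
print(filled$a)
```

Naming the list something other than list (here dfs) also avoids masking the base R function of the same name.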