I have a data frame 'df1' of dim 17000 x 3 of walking data. The interval column holds the 5-minute intervals within each 24-hour period, the date column is the date, and the steps column is the number of steps taken in that 5-minute interval on that date. NAs are present.
> head(df1)
  steps       date interval
1    NA 2012-10-01        0
2    NA 2012-10-01        5
3    NA 2012-10-01       10
4    NA 2012-10-01       15
5    NA 2012-10-01       20
6    NA 2012-10-01       25
I've used dplyr to group my df by date and summarised it into a new df 'df.1' with avg = mean(steps, na.rm = TRUE). This gives me a nice little df of the mean number of steps on each date:
        date      avg
1 2012-10-01      NaN
2 2012-10-02  0.43750
3 2012-10-03 39.41667
4 2012-10-04 42.06944
5 2012-10-05 46.15972
6 2012-10-06 53.54167
What I would like to do is update my original df's NA-values with the mean value from each date.
So wherever steps is NA for 2012-10-02 in the first table, I'd like to replace every such NA with the value 0.43750. I've tried using indices, which, %in%, and the apply family, and just can't find anything that sticks.
Any help would be greatly appreciated.
You could use merge. Also, since you have used dplyr, mutate would be an option: it adds the column to the original dataset instead of collapsing it the way summarise does:
library(dplyr); df1 %>% group_by(date) %>% mutate(avg = mean(steps, na.rm = TRUE))
If you need to do it with merge, then
merge(df1, df.1, by = 'date', all = TRUE)
and then replace the NA values in steps with the values from the new avg column.
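To make the last step concrete, here is a minimal sketch of both routes on a small made-up data frame. The toy values and the names filled and merged are assumptions for illustration, not from the question. It fills each NA in steps with that date's mean, first with group_by/mutate and then with merge.

```r
library(dplyr)

# Toy data in the same shape as the question (values are made up)
df1 <- data.frame(
  steps    = c(NA, NA, 0, NA, 7, 10, NA, 4),
  date     = rep(c("2012-10-01", "2012-10-02", "2012-10-03"), times = c(2, 3, 3)),
  interval = c(0, 5, 0, 5, 10, 0, 5, 10)
)

# Route 1: group_by + mutate, replacing NA in place with the per-date mean
filled <- df1 %>%
  group_by(date) %>%
  mutate(steps = ifelse(is.na(steps), mean(steps, na.rm = TRUE), steps)) %>%
  ungroup()

# Route 2: summarise to a lookup table, merge it back, then fill from avg
df.1 <- df1 %>%
  group_by(date) %>%
  summarise(avg = mean(steps, na.rm = TRUE))

merged <- merge(df1, df.1, by = "date", all = TRUE)
merged$steps <- ifelse(is.na(merged$steps), merged$avg, merged$steps)
merged$avg <- NULL  # drop the helper column once the NAs are filled
```

Note that a date whose steps are all NA (2012-10-01 here, as in the question) has a mean of NaN, so those rows stay missing with either route; they would need some other fill rule, for example the mean for that interval across days.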