I have two panel data sets that can be merged. However, I first have to bin time, because both data sets are highly unbalanced and their measurement occasions vary widely.
One approach is the following:
library(dplyr)
library(lubridate)

panel_df1_monthly <- panel_df1 %>%
  mutate(TimeMetric_Monthly = floor_date(TimeVariable, unit = "month")) %>%
  group_by(ID, TimeMetric_Monthly) %>%
  summarize(Monthly_Avg_Value1 = mean(Value1, na.rm = TRUE))

panel_df2_monthly <- panel_df2 %>%
  mutate(TimeMetric_Monthly = floor_date(TimeVariable, unit = "month")) %>%
  group_by(ID, TimeMetric_Monthly) %>%
  summarize(Monthly_Avg_Value2 = mean(Value2, na.rm = TRUE))
However, regardless of the time metric chosen (e.g., month, quarter), the aggregated statistic is the same. The value should vary with the time metric, and the new data sets (e.g., panel_df1_monthly, panel_df1_quarterly) should look different: each should have one column for the new time metric and another column with the corresponding aggregated value. Instead, all the new panel data sets look alike.
I have also tried other approaches, but these do not produce harmonizable panel data sets, because the new time metrics do not align across them.
cols <- c("ID", "TimeVariable", "Value1") and please edit the question with the output of dput(df[cols]). Or, if it is too big, with the output of dput(head(df[cols], 20)).
plyr package, which has a summarize function which does not see groupings.
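One way to test the plyr explanation from the comment is to call the dplyr verbs with an explicit namespace, so a masking summarize from another package cannot be picked up. The following is only a minimal sketch under assumptions: the toy data, the helper name bin_panel, and the column names TimeBin and Avg_Value1 are made up for illustration and are not from the question.

```r
library(dplyr)
library(lubridate)

# Hypothetical toy data: two IDs with irregular measurement dates
panel_df1 <- data.frame(
  ID = c(1, 1, 1, 2, 2),
  TimeVariable = as.Date(c("2023-01-05", "2023-01-20", "2023-02-10",
                           "2023-01-15", "2023-04-02")),
  Value1 = c(10, 20, 30, 40, 50)
)

# Hypothetical helper: floor dates to `unit`, then average within ID x bin.
# The dplyr:: prefixes ensure the dplyr versions of the verbs are used,
# even if plyr was attached afterwards.
bin_panel <- function(df, unit) {
  df %>%
    dplyr::mutate(TimeBin = floor_date(TimeVariable, unit = unit)) %>%
    dplyr::group_by(ID, TimeBin) %>%
    dplyr::summarise(Avg_Value1 = mean(Value1, na.rm = TRUE), .groups = "drop")
}

panel_df1_monthly   <- bin_panel(panel_df1, "month")
panel_df1_quarterly <- bin_panel(panel_df1, "quarter")

print(panel_df1_monthly)
print(panel_df1_quarterly)
```

If the grouping is respected, the monthly and quarterly results should have different numbers of rows and different averages. If they still come out as a single identical row, the problem probably lies somewhere other than function masking.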