Fill in missing rows

Question

I have a data frame of county executives and the year they were inaugurated.

I am running a panel study with county-year as the unit of analysis. The date range is 2000 to 2004.

I would like to expand the df so that it lists who was the county executive in each year from 2000 to 2004.

For instance, I would like this df

df <- data.frame(year = c(2000, 2001, 2003, 2000, 2002, 2004),
                 executive.name = c("Johnson", "Smith", "Alleghany", "Roberts", "Clarke", "Tollson"),
                 party = c("PartyRed", "PartyYellow", "PartyGreen", "PartyYellow", "PartyOrange", "PartyRed"),
                 district = rep(c(1001, 1002), each = 3))

to look like this

df.neat <- data.frame(year = c(2000, 2001, 2002, 2003, 2004, 2000, 2001, 2002, 2003, 2004),
                      executive.name = c("Johnson", "Smith", "Smith", "Alleghany", "Alleghany", "Roberts", "Roberts", "Clarke", "Clarke", "Tollson"),
                      party = c("PartyRed", "PartyYellow", "PartyYellow", "PartyGreen", "PartyGreen", "PartyYellow", "PartyYellow", "PartyOrange", "PartyOrange", "PartyRed"),
                      district = rep(c(1001, 1002), each = 5))

Jon Spring · Accepted Answer · 2024-07-17 00:36:11Z

df |>
  tidyr::complete(district, year) |>
  dplyr::group_by(district) |>
  tidyr::fill(executive.name, party) |>
  dplyr::ungroup()

Result

# A tibble: 10 × 4
   district  year executive.name party      
      <dbl> <dbl> <chr>          <chr>      
 1     1001  2000 Johnson        PartyRed   
 2     1001  2001 Smith          PartyYellow
 3     1001  2002 Smith          PartyYellow
 4     1001  2003 Alleghany      PartyGreen 
 5     1001  2004 Alleghany      PartyGreen 
 6     1002  2000 Roberts        PartyYellow
 7     1002  2001 Roberts        PartyYellow
 8     1002  2002 Clarke         PartyOrange
 9     1002  2003 Clarke         PartyOrange
10     1002  2004 Tollson        PartyRed
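
One caveat worth sketching: complete(district, year) only builds combinations of year values that already appear somewhere in the data. The example df happens to contain every year from 2000 to 2004, but if a year were missing from all districts, it could be supplied explicitly. The data frame df_gap below is hypothetical and only illustrates that case.

library(dplyr)
library(tidyr)

# Hypothetical data: no district has a row for 2002 or 2004.
df_gap <- data.frame(
  year = c(2000, 2001, 2003, 2000, 2003),
  executive.name = c("Johnson", "Smith", "Alleghany", "Roberts", "Clarke"),
  district = c(1001, 1001, 1001, 1002, 1002)
)

df_full <- df_gap |>
  # Spell out the full study period instead of relying on observed years.
  complete(district, year = as.numeric(2000:2004)) |>
  group_by(district) |>
  # Carry the most recent executive forward within each district.
  fill(executive.name) |>
  ungroup()

With this, every district gets one row per year from 2000 to 2004, whether or not that year was observed anywhere in the input.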

edited Jul 17, 2024 at 0:36

answered Jul 16, 2024 at 17:08


2 Comments

YouLocalRUser Over a year ago

This is very helpful and worked for the most part. The problem is that some of the counties in my full df were created during the time period. complete() treats the years before a county existed as implicit NAs, and fill() then drags the last row of the previous district down into the new district. Is there any way to run this code separately for each group? I re-asked the question with the new parameters here: stackoverflow.com/questions/78756985/… Thank you!

Jon Spring Over a year ago

Aha, I should have anticipated that. I would add group_by(district) before the fill and ungroup() after.
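
A minimal sketch of the issue discussed in these comments, assuming a hypothetical district 1003 that only appears from 2002 onward (df2 and its values are made up for illustration). It compares fill() without grouping against fill() inside group_by(district).

library(dplyr)
library(tidyr)

# Hypothetical data: district 1003 has no executive before 2002.
df2 <- data.frame(
  year = c(2000, 2001, 2003, 2002, 2004),
  executive.name = c("Johnson", "Smith", "Alleghany", "Clarke", "Tollson"),
  party = c("PartyRed", "PartyYellow", "PartyGreen", "PartyOrange", "PartyRed"),
  district = c(1001, 1001, 1001, 1003, 1003)
)

# Without grouping, the last executive of district 1001 is carried
# down into the 2000 and 2001 rows of district 1003.
ungrouped <- df2 |>
  complete(district, year) |>
  fill(executive.name, party)

# With grouping, fill() stops at district boundaries, so the rows
# before district 1003 existed stay NA.
grouped <- df2 |>
  complete(district, year) |>
  group_by(district) |>
  fill(executive.name, party) |>
  ungroup()

In the grouped version the 2000 and 2001 rows for district 1003 remain NA; if those rows are unwanted, they could be dropped afterwards with something like dplyr::filter(!is.na(executive.name)).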
