I am new to dplyr, and I would be curious about a fast way to get from this data:
| ID | Age | YearDied |
|---|---|---|
| 100 | 2 | 2005 |
| 102 | 4 | NA |
| 103 | 1 | NA |
| 106 | 5 | 2002 |
| 108 | 1 | NA |
| 109 | 1 | NA |
| 110 | 4 | NA |
| 112 | 3 | NA |
To this data (counting every survivor per Age, meaning if an ID is 5 years old, it passed ages 1,2,3,4,5, whereas an ID of 2 just passes 2 ages (1,2) (does this make sense?):
| Age | SurvivorNumber |
|---|---|
| 1 | 8 |
| 2 | 5 |
| 3 | 4 |
| 4 | 3 |
| 5 | 1 |
Is it also possible to combine the previous result with these ones (number of IDs in the category)?:
| Age | Current Number of IDs |
|---|---|
| 1 | 3 |
| 2 | 1 |
| 3 | 1 |
| 4 | 2 |
| 5 | 1 |
This was my starting code for the last case:
groupedDf <- inputDf %>%
count(Age) %>%
group_by(Age = case_when(Age == 1 ~ '1',
TRUE ~ as.character(Age))) %>%
group_by(Age = case_when(Age == 2 ~ '2',
TRUE ~ as.character(Age))) %>%
group_by(Age = case_when(Age == 3 ~ '3',
TRUE ~ as.character(Age))) %>%
group_by(Age = case_when(Age == 4 ~ '4',
TRUE ~ as.character(Age))) %>%
group_by(Age = case_when(Age == 5 ~ '5',
TRUE ~ as.character(Age))) %>%
summarise(n = sum(n)) %>%
arrange(nchar(Age), Age)