R dplyr count occurrences that are in multiple conditions [closed]

Question

Closed. This question needs debugging details. It is not currently accepting answers.

Closed 8 months ago.

I am new to dplyr and am looking for a fast way to get from this data:

ID	Age	YearDied
100	2	2005
102	4	NA
103	1	NA
106	5	2002
108	1	NA
109	1	NA
110	4	NA
112	3	NA

To this data, which counts every survivor per Age: an ID that is 5 years old has passed ages 1, 2, 3, 4 and 5, whereas an ID that is 2 years old has passed only two ages (1 and 2). Does this make sense?

Age	SurvivorNumber
1	8
2	5
3	4
4	3
5	1

Is it also possible to combine the previous result with this one (the number of IDs in each Age category)?

Age	Current Number of IDs
1	3
2	1
3	1
4	2
5	1

This was my starting code for the last case:

groupedDf <- inputDf %>%
               count(Age)  %>%
               group_by(Age = case_when(Age == 1 ~ '1',
                                          TRUE ~ as.character(Age))) %>%
               group_by(Age = case_when(Age == 2 ~ '2',
                                          TRUE ~ as.character(Age))) %>%
               group_by(Age = case_when(Age == 3 ~ '3',
                                          TRUE ~ as.character(Age))) %>%
               group_by(Age = case_when(Age == 4 ~ '4',
                                          TRUE ~ as.character(Age))) %>%
               group_by(Age = case_when(Age == 5 ~ '5',
                                          TRUE ~ as.character(Age))) %>%
               summarise(n = sum(n))  %>%
               arrange(nchar(Age), Age)

Please provide reproducible input data and expected output based on that.
– s_baldur, Commented Apr 15 at 14:10

one · Accepted Answer · 2025-04-15 14:27:25Z

4

It is easier to get the count table first and then generate SurvivorNumber through a "reverse cumulative sum", since the survivors at a given Age are the IDs at that Age plus the IDs at every higher Age:

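library(dplyr)  # reframe() requires dplyr >= 1.1.0

# df is the input data from the question (columns ID, Age, YearDied)
df %>%
  group_by(Age) %>%
  reframe(`Current Number of IDs`=n()) %>%               # IDs per Age
  mutate(SurvivorNumber=rev(cumsum(rev(`Current Number of IDs`))))  # sum from the highest Age down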

# A tibble: 5 × 3
    Age `Current Number of IDs` SurvivorNumber
  <int>                   <int>          <int>
1     1                       3              8
2     2                       1              5
3     3                       1              4
4     4                       2              3
5     5                       1              1
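
As a sanity check, the reverse cumulative sum above can be compared with the definition from the question (survivors at age a = number of IDs whose Age is at least a). The following is only a sketch: it assumes the question's data is stored in a data frame named `df`, uses `count()` in place of `group_by()` + `reframe()`, and the column name `CurrentN` is a hypothetical shorthand, not a name from the original posts.

```r
library(dplyr)

# Sample data copied from the question (the name df is an assumption)
df <- data.frame(
  ID = c(100, 102, 103, 106, 108, 109, 110, 112),
  Age = c(2, 4, 1, 5, 1, 1, 4, 3),
  YearDied = c(2005, NA, NA, 2002, NA, NA, NA, NA)
)

# Count IDs per Age (count() sorts by Age), then take the reverse cumulative sum
by_cumsum <- df %>%
  count(Age, name = "CurrentN") %>%   # CurrentN is a hypothetical column name
  mutate(SurvivorNumber = rev(cumsum(rev(CurrentN))))

# Direct definition: for each age a, count the IDs with Age >= a
by_definition <- sapply(by_cumsum$Age, function(a) sum(df$Age >= a))

# Both approaches should agree on every row
stopifnot(all(by_cumsum$SurvivorNumber == by_definition))
```

The direct definition compares every age against every row, so the cumulative-sum version is likely the better choice on large data; the comparison here only confirms that the two agree on the sample.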

edited Apr 15 at 14:27

answered Apr 15 at 14:16

one

Friede · Accepted Answer · 2025-04-16 09:15:00Z

1

Base R

local({
  i = tapply(X$Age, X$Age, FUN=length)
  data.frame(age = names(i), count = i, survivor = rev(cumsum(rev(i))))
})

  age count survivor
1   1     3        8
2   2     1        5
3   3     1        4
4   4     2        3
5   5     1        1

Data

X = data.frame(
  ID = c(100, 102, 103, 106, 108, 109, 110, 112),
  Age = c(2, 4, 1, 5, 1, 1, 4, 3),
  YearDied = c(2005, NA, NA, 2002, NA, NA, NA, NA))

edited Apr 16 at 9:15

answered Apr 15 at 17:48

Friede

2 Comments

Edward Apr 16 at 3:24

i <- tabulate(X$Age) may be more concise.

Friede Apr 16 at 9:16

@Edward That would fill zeros at gaps and require an additional matching step. Replaced seq() with names()!
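
The difference raised in these comments can be illustrated with a small sketch. The vector `ages` below is made-up data with a gap (no one aged 2); it is not from the question, whose sample data has no gaps.

```r
# Hypothetical ages with a gap: nobody is 2 years old
ages <- c(1, 1, 3)

# tabulate() counts every integer from 1 to max(ages), so the gap becomes a 0
# and the result carries no names
tab <- tabulate(ages)

# tapply() returns only the groups that actually occur, named by their age
tap <- tapply(ages, ages, FUN = length)

print(tab)         # counts: 2 0 1
print(names(tap))  # ages present: "1" "3"
```

With `tabulate()`, the position in the result is the age, so zero-count ages must be dropped or matched back to the observed ages before building the final table; with `tapply()`, `names()` already gives the observed ages.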
