I have the following data frame:
ID Date Status Category
TR-1 2018-01-10 Passed A
TR-2 2018-01-09 Passed B
TR-3 2018-01-09 Failed C
TR-3 2018-01-09 Failed A
TR-4 2018-01-08 Failed B
TR-5 2018-01-08 Passed C
TR-5 2018-01-08 Failed A
TR-6 2018-01-07 Passed A
Using the data frame above, I want output in the format shown below.
The dates should be in descending order, and the categories should appear in the order C, A, B.
Date count distinct_count Passed Failed
2018-01-10 1 1 1 0
A 1 1 1 0
B 0 0 0 0
C 0 0 0 0
2018-01-09 3 2 1 2
A 1 1 1 0
B 1 1 1 0
C 1 1 1 0
To produce this output I tried the code below, but it does not work and I cannot get the expected result.
Output<-DF %>%
group_by(Date=Date,A,B,C) %>%
summarise(`Count` = n(),
`Distinct_count` = n_distinct(ID),
Passed=sum(Status=='Passed'),
A=count(category='A'),
B=count(category='B'),
C=count(category='C'),
Failed=sum(Status=='Failed'))
Output of dput(DF):
structure(list(ID = structure(c(1L, 2L, 3L, 3L, 4L, 5L, 5L, 6L
), .Label = c("TR-1", "TR-2", "TR-3", "TR-4", "TR-5", "TR-6"), class = "factor"),
Date = structure(c(4L, 3L, 3L, 3L, 2L, 2L, 2L, 1L), .Label = c("07/01/2018",
"08/01/2018", "09/01/2018", "10/01/2018"), class = "factor"),
Status = structure(c(2L, 2L, 1L, 1L, 1L, 2L, 1L, 2L), .Label = c("Failed",
"Passed"), class = "factor"), Category = structure(c(1L,
2L, 3L, 1L, 2L, 3L, 1L, 1L), .Label = c("A", "B", "C"), class = "factor")), .Names = c("ID",
"Date", "Status", "Category"), class = "data.frame", row.names = c(NA,
-8L))
dput() of your initial data frame would be useful for recreating the problem
group_by(Date, Category) %>%? You could then summarize a second table grouped only by Date in order to get the count of Dates, and left_join it to the first one to get an extra column that indicates the count of dates
...summarise... I think won't work as you propose. Please add the output of your initial data as suggested by Nutle.
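The approach suggested in the comments (one summary grouped by Date and Category, plus a second summary grouped by Date only) could be sketched roughly as follows. This is a minimal sketch, not a definitive answer. It assumes dplyr (1.0.0 or later, for the `.groups` argument) and tidyr are available, that the dates in the dput output are in day/month/year form, and that stacking the two summaries with `bind_rows` is acceptable in place of the `left_join` mentioned in the comments. The names `by_category`, `by_date`, `result` and the row label "Total" are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Sample data rebuilt from the dput() output in the question
DF <- data.frame(
  ID = c("TR-1", "TR-2", "TR-3", "TR-3", "TR-4", "TR-5", "TR-5", "TR-6"),
  Date = c("10/01/2018", "09/01/2018", "09/01/2018", "09/01/2018",
           "08/01/2018", "08/01/2018", "08/01/2018", "07/01/2018"),
  Status = c("Passed", "Passed", "Failed", "Failed",
             "Failed", "Passed", "Failed", "Passed"),
  Category = c("A", "B", "C", "A", "B", "C", "A", "A"),
  stringsAsFactors = FALSE
)

# Assumption: dates are day/month/year; categories are wanted in the order C, A, B
DF <- DF %>%
  mutate(Date = as.Date(Date, format = "%d/%m/%Y"),
         Category = factor(Category, levels = c("C", "A", "B")))

# Summary per Date and Category; complete() adds zero rows for missing combinations
by_category <- DF %>%
  group_by(Date, Category) %>%
  summarise(count = n(),
            distinct_count = n_distinct(ID),
            Passed = sum(Status == "Passed"),
            Failed = sum(Status == "Failed"),
            .groups = "drop") %>%
  complete(Date, Category,
           fill = list(count = 0L, distinct_count = 0L,
                       Passed = 0L, Failed = 0L))

# Summary per Date only (the header row for each date)
by_date <- DF %>%
  group_by(Date) %>%
  summarise(count = n(),
            distinct_count = n_distinct(ID),
            Passed = sum(Status == "Passed"),
            Failed = sum(Status == "Failed"),
            .groups = "drop")

# Stack both summaries; "Total" is a hypothetical label for the per-date row
result <- bind_rows(
  by_date %>% mutate(Category = "Total"),
  by_category %>% mutate(Category = as.character(Category))
) %>%
  mutate(Category = factor(Category, levels = c("Total", "C", "A", "B"))) %>%
  arrange(desc(Date), Category) %>%
  select(Date, Category, everything())

print(result)
```

The per-date rows are kept in a Category column rather than mixed into the Date column, so every column keeps a single type. If the exact layout from the question is needed, the Date and Category columns could be merged into one character column as a final formatting step.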