How to divide between groups of rows using dplyr?

Question

I have this dataframe:

x <- data.frame(
    name = rep(letters[1:4], each = 2),
    condition = rep(c("A", "B"), times = 4),
    value = c(2,10,4,20,8,40,20,100)
) 
#   name condition value
# 1    a         A     2
# 2    a         B    10
# 3    b         A     4
# 4    b         B    20
# 5    c         A     8
# 6    c         B    40
# 7    d         A    20
# 8    d         B   100

I want to group by name and divide the value of the rows with condition == "B" by the value of those with condition == "A", to get this:

data.frame(
    name = letters[1:4],
    value = c(5,5,5,5)
)
#   name value
# 1    a     5
# 2    b     5
# 3    c     5
# 4    d     5

I know something like this can get me pretty close:

x$value[which(x$condition == "B")]/x$value[which(x$condition == "A")]

but I was wondering if there is an easy way to do this with dplyr. (My dataframe is a toy example; I got to it by chaining multiple group_by and summarise calls.)

Steven Beaupré · Accepted Answer · 2016-05-25 21:37:16Z

18

Try:

x %>% 
  group_by(name) %>%
  summarise(value = value[condition == "B"] / value[condition == "A"])

Which gives:

#Source: local data frame [4 x 2]
#
#    name value
#  (fctr) (dbl)
#1      a     5
#2      b     5
#3      c     5
#4      d     5

2 Comments

vicky Over a year ago

I have the same data as above, the only difference is that sometimes column "condition" does not have "A" or "B" all the time, so there's no denominator or numerator sometimes. I want to remove such rows and continue the division. Do you have any idea?

Steven Beaupré Over a year ago

@vicky just filter them up front? x %>% filter(condition %in% c("A", "B"))
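
Note that filtering on condition %in% c("A", "B") only removes rows with other condition values. A name that has only an "A" row or only a "B" row would still reach summarise with an empty numerator or denominator. Here is a minimal sketch, assuming the goal is to drop such incomplete names entirely. The data frame y is made up for illustration and is not from the question:

```r
library(dplyr)

# Hypothetical data: name "b" has no "B" row, name "c" has an extra "C" row
y <- data.frame(
  name = c("a", "a", "b", "c", "c", "c"),
  condition = c("A", "B", "A", "A", "B", "C"),
  value = c(2, 10, 4, 8, 40, 1)
)

result <- y %>%
  filter(condition %in% c("A", "B")) %>%        # drop rows with other conditions
  group_by(name) %>%
  filter(all(c("A", "B") %in% condition)) %>%   # keep only names that have both A and B
  summarise(value = value[condition == "B"] / value[condition == "A"])
```

The second filter is evaluated per group, so any name missing one of the two conditions is removed before the division. This still assumes at most one "A" row and one "B" row per name.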

Mhairi McNeill · Accepted Answer · 2016-05-25 21:41:37Z

6

I'd use spread from tidyr.

library(dplyr)
library(tidyr)

x %>%
  spread(condition, value) %>%
  mutate(value = B/A)

  name  A   B value
1    a  2  10     5
2    b  4  20     5
3    c  8  40     5
4    d 20 100     5

You could then do select(-A, -B) to drop the extra columns.
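
In tidyr 1.0.0 and later, spread() is superseded by pivot_wider(). A sketch of the same reshape-then-divide approach under that assumption, using the x from the question (the name wide is just for illustration):

```r
library(dplyr)
library(tidyr)

# Same toy data as in the question
x <- data.frame(
  name = rep(letters[1:4], each = 2),
  condition = rep(c("A", "B"), times = 4),
  value = c(2, 10, 4, 20, 8, 40, 20, 100)
)

wide <- x %>%
  pivot_wider(names_from = condition, values_from = value) %>%  # one column per condition
  mutate(value = B / A) %>%                                     # ratio B / A per name
  select(name, value)                                           # drop the A and B columns
```

With the default settings, a name that lacks one of the conditions gets NA in that column, so its ratio comes out as NA rather than raising an error.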

akrun · Accepted Answer · 2016-05-26 02:23:03Z

4

Using data.table, we convert the 'data.frame' to a 'data.table' (setDT(x)) and, grouped by 'name', divide the 'value' that corresponds to condition 'B' by the one that corresponds to condition 'A'.

library(data.table)
setDT(x)[,.(value = value[condition=="B"]/value[condition=="A"]) , name]
#    name value
#1:    a     5
#2:    b     5
#3:    c     5
#4:    d     5

Or reshape from 'long' to 'wide' and divide the 'B' column by 'A'.

dcast(setDT(x), name~condition, value.var='value')[, .(name, value = B/A)]

3 Comments

akrun Over a year ago

@user5249203 Perhaps you meant Map, or maybe you want to divide by something like x[-1]/x[-length(x)]

akrun Over a year ago

@user5249203 your comment is not clear to me in the context of this solution, as here we are doing the same condition check for each row. Did you mean condition == 'a', condition == 'b', condition == 'a', and so on? In that case, Map is needed, i.e. Map(function(x, y) value[condition == x]/value[condition == y], yourfirstvec_orcol, yoursecondvec)

user5249203 Over a year ago

will try to post a Q.
