15

Suppose I have data in an R table which looks like this:

Id  Name Price sales Profit Month Category Mode
1   A     2     5     8       1     X       K
1   A     2     6     9       2     X       K
1   A     2     5     8       3     X       K
1   B     2     4     6       1     Y       L
1   B     2     3     4       2     Y       L
1   B     2     5     7       3     Y       L
2   C     2     5    11       1     X       M
2   C     2     5    11       2     X       L
2   C     2     5    11       3     X       K
2   D     2     8    10       1     Y       M
2   D     2     8    10       2     Y       K
2   D     2     5    7        3     Y       K
3   E     2     5    9        1     Y       M
3   E     2     5    9        2     Y       L
3   E     2     5    9        3     Y       M
3   F     2     4    7        1     Z       M
3   F     2     5    8        2     Z       L
3   F     2     5    8        3     Z       M

If I use the table function on this data like:

table(df$Category, df$Mode)

It will show me under each mode which category has how many observations. It's like counting the number of items in each category under each mode.

But what if I want the table to show under each Category which Mode earned how much Profit (sum or mean) and not the total count?

Is there any way to do this with the table function or another function in R?

2
  • You can sum and count this way: tmp = aggregate(df$Category, by=list(Category=df$Mode), FUN=sum) or tmp = aggregate(df$Category, by=list(Category=df$Mode), FUN=NROW) (notice "sum" is lowercase and "NROW" is all caps). Commented Feb 10, 2018 at 2:16
  • The two-dimensional nature of this question and the answers make them more useful than the more general linked question. Commented May 13 at 0:58

3 Answers 3

20

We can use xtabs from base R. By default, the xtabs gets the sum

xtabs(Profit~Category+Mode, df)
#           Mode
#Category  K  L  M
#       X 36 11 11
#       Y 17 26 28
#       Z  0  8 15

Or another base R option that is more flexible to apply different FUN is tapply.

with(df, tapply(Profit, list(Category, Mode), FUN=sum))
#  K  L  M
#X 36 11 11
#Y 17 26 28
#Z NA  8 15

Or we can use dcast to convert from 'long' to 'wide' format. It is more flexible as we can specify the fun.aggregate to sum, mean, median etc.

library(reshape2)
dcast(df, Category~Mode, value.var='Profit', sum)
# Category  K  L  M
#1        X 36 11 11
#2        Y 17 26 28
#3        Z  0  8 15

If you need it in the 'long' format, here is one option with data.table. We convert the 'data.frame' to 'data.table' (setDT(df)), grouped by 'Category' and 'Mode', we get the sum of 'Profit'.

library(data.table)
setDT(df)[, list(Profit= sum(Profit)) , by = .(Category, Mode)]
Sign up to request clarification or add additional context in comments.

Comments

4

Another possibility consists in using the aggregate() function:

profit_dat <- aggregate(Profit ~ Category + Mode, data=df, sum)
#> profit_dat
#  Category Mode Profit
#1        X    K     36
#2        Y    K     17
#3        X    L     11
#4        Y    L     26
#5        Z    L      8
#6        X    M     11
#7        Y    M     28
#8        Z    M     15

Comments

3

I prefer using dplyr for most data analysis:

library(dplyr)

group_by(df, Category, Mode) %>%
  summarise(sum = sum(Profit), count=n())

https://cran.rstudio.com/web/packages/dplyr/vignettes/introduction.html

Comments

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.