converting summary created using 'by' to data.frame

Question

df1=data.frame(c(2,1,2),c(1,2,3,4,5,6),seq(141,170)) #create data.frame
names(df1) = c("gender","age","height") #column names
df1$gender <- factor(df1$gender,
levels=c(1,2),
labels=c("female","male")) #gives levels and labels to gender
df1$age <- factor(df1$age,
levels=c(1,2,3,4,5,6),
labels=c("16-24","25-34","35-44","45-54","55-64","65+")) # gives levels and labels to age groups

I am looking to produce a summary of the height values subsetted by gender and then age.

Using the subset and by functions as follows gives the output I want:

females<-subset(df1,gender=="female") #subsetting by gender (already a factor here, so compare against its labels)
males<-subset(df1,gender=="male")

foutput=by(females$height,females$age,summary) #producing summary subsetted by age
moutput=by(males$height,males$age,summary)

However, I need it as a data.frame so that I can export these results alongside frequency tables using XLConnect.

Is there a way to convert the output to a data.frame, or an elegant alternative, possibly using plyr?

Chase · Accepted Answer · 2012-02-23 12:31:56Z

4

Here's one approach using plyr:

> ddply(df1, c("gender", "age"), function(x) summary(x$height))
  gender   age Min. 1st Qu. Median Mean 3rd Qu. Max.
1 female 25-34  142     148    154  154     160  166
2 female 55-64  145     151    157  157     163  169
3   male 16-24  141     147    153  153     159  165
4   male 35-44  143     149    155  155     161  167
5   male 45-54  144     150    156  156     162  168
6   male   65+  146     152    158  158     164  170

5 Comments

BuckyOH Over a year ago

That looks ideal. I thought plyr might be the solution!

Chase Over a year ago

@BuckyO - I find it hard to beat plyr for ease of use and consistency between different tasks. You may run into performance issues with large data and/or many groups, but for most "mortal" tasks - I find it quite nice. Good luck!

BuckyOH Over a year ago

Thanks for that. I'll keep in mind the performance issues you have mentioned. Approved this answer as I have tried it with more subsets and other functions and it has worked.

BuckyOH Over a year ago

Could you explain function(x) to me? I looked again at where I used this function with count and another subset; in that case it adds a column called x.

Chase Over a year ago

@BuckyO - function(x) is an "anonymous function". df1 is broken up into "chunks" by the combinations of age and gender, and each chunk is passed to function(x); we can then reference that chunk as x in the call to summary. Here's a bit more background on anonymous functions, and some specific insight into R: en.wikipedia.org/wiki/Anonymous_function#R
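
To make the "chunk" idea concrete, here is a minimal base R sketch of the same split-and-apply steps done by hand. It is not how plyr is implemented internally; the names chunks and per_chunk are made up for illustration, and df1 is rebuilt from the question so the snippet runs on its own.

```r
# Rebuild the question's data frame with labelled factors.
df1 <- data.frame(
  gender = factor(rep(c(2, 1, 2), length.out = 30), levels = 1:2,
                  labels = c("female", "male")),
  age    = factor(rep(1:6, length.out = 30), levels = 1:6,
                  labels = c("16-24", "25-34", "35-44", "45-54", "55-64", "65+")),
  height = 141:170
)

# Split: one small data frame ("chunk") per gender/age combination
# that actually occurs in the data (drop = TRUE skips empty ones).
chunks <- split(df1, list(df1$gender, df1$age), drop = TRUE)

# Apply: the anonymous function is called once per chunk,
# and inside it the chunk is available as x.
per_chunk <- lapply(chunks, function(x) summary(x$height))

names(chunks)                 # which chunks exist
per_chunk[["female.25-34"]]   # summary of height for one chunk
```

With this data, six of the twelve possible gender/age combinations occur, which lines up with the six rows in the ddply output above. ddply additionally combines the per-chunk results into one data.frame and adds the grouping columns.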

James · Answer · 2012-02-23 12:59:25Z

2

The output from by is really a list, but it looks different because of the print.by method.

So you can use do.call to rbind the elements into a matrix and then call data.frame on that:

data.frame(do.call(rbind,by(mtcars$hp,mtcars$cyl,summary)),check.names=FALSE)
  Min. 1st Qu. Median   Mean 3rd Qu. Max.
4   52    65.5   91.0  82.64    96.0  113
6  105   110.0  110.0 122.30   123.0  175
8  150   176.2  192.5 209.20   241.2  335

Note the use of check.names=FALSE, which stops data.frame from sanitising the column names (otherwise "1st Qu." would become "X1st.Qu.").
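
For reference, the same idea applied to the height data from the question might look like the sketch below. It assumes the females subset is taken by factor label, and fsummary is a name introduced here for illustration only.

```r
# Rebuild the question's data frame with labelled factors.
df1 <- data.frame(
  gender = factor(rep(c(2, 1, 2), length.out = 30), levels = 1:2,
                  labels = c("female", "male")),
  age    = factor(rep(1:6, length.out = 30), levels = 1:6,
                  labels = c("16-24", "25-34", "35-44", "45-54", "55-64", "65+")),
  height = 141:170
)

females <- subset(df1, gender == "female")
foutput <- by(females$height, females$age, summary)

# The by object is a list underneath; print.by only changes how it is shown.
is.list(foutput)

# Age groups with no females are NULL entries, which rbind() skips,
# so only the groups that occur end up as rows.
fsummary <- data.frame(do.call(rbind, foutput), check.names = FALSE)
fsummary
```

The row names of fsummary are the age-group labels, so they may need to be copied into a regular column before exporting. The males subset can be handled the same way.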

3 Comments

BuckyOH Over a year ago

Thanks for your answer, and especially for the note about print.by. The minimum values here are lower than the minimum height value; is this an example from another data set?

James Over a year ago

@BuckyO Yes, this is from the built-in mtcars data set. I'm using IE7 and have difficulty copying multiline data examples on here.

BuckyOH Over a year ago

I've approved Chase's answer, but I'll try yours too; it was also very useful. Thanks.
