Let's call summary(my_data):
year          quarter        employed        newhires        separations     jobscreated     jobsdestroyed
Min.   :1990  Min.   :1.000  Min.   :  6976  Min.   :  2321  Min.   :  1922  Min.   :  1091  Min.   :  520
1st Qu.:2000  1st Qu.:2.000  1st Qu.: 28049  1st Qu.: 16858  1st Qu.: 13912  1st Qu.:  6595  1st Qu.: 3862
Median :2003  Median :3.000  Median : 64836  Median : 39188  Median : 32018  Median : 14148  Median : 7727
Mean   :2003  Mean   :2.509  Mean   : 94468  Mean   : 59336  Mean   : 48973  Mean   : 22036  Mean   :11843
3rd Qu.:2007  3rd Qu.:4.000  3rd Qu.:121905  3rd Qu.: 75960  3rd Qu.: 61976  3rd Qu.: 26829  3rd Qu.:14993
Max.   :2010  Max.   :4.000  Max.   :571419  Max.   :448423  Max.   :391454  Max.   :166022  Max.   :80338
                                             NA's   :49                      NA's   :49      NA's   :49
I want to convert this output into a data.table formatted as follows, where all entries (omitted in this depiction) are the raw values of the minimum, 1st quartile, etc.:
year quarter employed newhires separations jobscreated jobsdestroyed
Min.
1st Qu.
Median
Mean
3rd Qu.
Max.
NA's
The following almost achieves this result, except that the labels Min., 1st Qu., Median, Mean, 3rd Qu., Max., and NA's carry over into each entry. I want only the raw numbers.
data.frame(unclass(summary(my_data)), check.names = FALSE, stringsAsFactors = FALSE)
     year          quarter        employed        newhires        separations     jobscreated     jobsdestroyed
X    Min.   :1990  Min.   :1.000  Min.   :  6976  Min.   :  2321  Min.   :  1922  Min.   :  1091  Min.   :  520
X.1  1st Qu.:2000  1st Qu.:2.000  1st Qu.: 28049  1st Qu.: 16858  1st Qu.: 13912  1st Qu.:  6595  1st Qu.: 3862
X.2  Median :2003  Median :3.000  Median : 64836  Median : 39188  Median : 32018  Median : 14148  Median : 7727
X.3  Mean   :2003  Mean   :2.509  Mean   : 94468  Mean   : 59336  Mean   : 48973  Mean   : 22036  Mean   :11843
X.4  3rd Qu.:2007  3rd Qu.:4.000  3rd Qu.:121905  3rd Qu.: 75960  3rd Qu.: 61976  3rd Qu.: 26829  3rd Qu.:14993
X.5  Max.   :2010  Max.   :4.000  Max.   :571419  Max.   :448423  Max.   :391454  Max.   :166022  Max.   :80338
X.6  <NA>          <NA>           <NA>            NA's   :49      <NA>            NA's   :49      NA's   :49
Potential solutions include (1) deriving the table directly from summary(), or (2) taking the output above and finding a way to remove the Min., 1st Qu., Median, Mean, 3rd Qu., Max., and NA's labels from each entry and list them as row names instead. Your help is much appreciated!
do.call(cbind, lapply(mydf, summary)) works fine, at least with the mtcars dataset. However, I can't tell whether it is OK with NaN values.

It is not reliable with NA/NaN, since summary() adds the NA's entry only when a column contains missing values (in effect a hard-coded "ifany"). So unless every column has at least one (or none do), the per-column summaries have different lengths and cbind() cannot line them up. An alternative is to use fixed_summary <- function(object, ...) { o <- summary(c(object, NA), ...); o["NA's"] <- o["NA's"] - 1L; o } and then as.data.frame(sapply(mtcars, fixed_summary)) (tested with mtcars[2,2] <- NA; mtcars[3,2] <- NaN).
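To make the fixed_summary idea easier to try out, here is a minimal sketch on a small made-up data frame. The my_data contents below are hypothetical stand-ins, not the asker's data; only fixed_summary and the sapply() call come from the suggestion above.

```r
# Hypothetical stand-in for my_data: one complete column, one with NA and NaN.
my_data <- data.frame(
  employed = c(10, 20, 30, 40, 50),
  newhires = c(1, NA, 3, NaN, 5)
)

# Append one NA so summary() always reports an "NA's" entry,
# then subtract that artificial NA from the count again.
fixed_summary <- function(object, ...) {
  o <- summary(c(object, NA), ...)
  o["NA's"] <- o["NA's"] - 1L
  o
}

# Every column now yields a vector of length 7, so sapply() simplifies
# to a matrix whose row names are the statistic labels.
result <- as.data.frame(sapply(my_data, fixed_summary))
print(result)
```

The statistic labels end up as row names and every cell is a plain number. If a data.table is needed, passing the result to data.table::as.data.table() with keep.rownames = TRUE should carry the labels over into a regular column, assuming the data.table package is installed.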