I'm trying to output summary statistics using summarize from the rockchalk package, and I want the stats rounded to 2 decimals. I get an error message when I call round on the output of summarize.

library(rockchalk)
M1 <- structure(c(0.18, 0.2, 0.24, 0.35, -0.22, -0.17, 0.28, -0.28, -0.14, 0.03, 0.87, -0.2, 0.06, -0.1, -0.72, 0.18, 0.01, 0.31, -0.36, 0.61, -0.16, -0.07, -0.13, 0.01, -0.09, 0.26, -0.14, 0.08, -0.62, -0.2, 0.3, -0.21, -0.11, 0.05, 0.06, -0.28, -0.27, 0.17, 0.42, -0.05, -0.15, 0.05, -0.07, -0.22, -0.34, 0.16, 0.34, 0.1, -0.12, 0.24, 0.45, 0.37, 0.61, 0.9, -0.25, 0.02), .Dim = c(56L, 1L))

#This works
round(apply(M1, 2, mean),2)

#This works
summaryround <- function(x) {round(summary(x),2)} 
apply(M1, 2, summaryround)

#This gives error "non-numeric argument"
round(apply(M1, 2, summarize),2)

#Thought this would work but also gives error "non-numeric argument"
summarizeround <- function(x) {round(summarize(x),2)} 
apply(M1, 2, summarizeround)

Any ideas? I can round the output of summary, but I'd like to use summarize if possible because it gives kurtosis and skewness in the same printout. (I could of course write my own function combining summary, kurtosis, and whatever else I want, but I'd rather not if avoidable.)


EDIT: I should have mentioned that I'm actually running this on a large data frame; I turned it into a 1-column matrix because I thought it would make the reproducible example simpler.

2 Answers

You just need to extract the numerics element from the summarize result. summarize returns a list, and round cannot be applied to a list. Besides, I would prefer to use lapply to keep the row names of the results, and do.call(cbind, ...) if you have multiple columns to summarize.

summarizeround <- function(x) {round(summarize(x)$numerics,2)} 
summaryDf <- do.call(cbind, lapply(as.data.frame(M1), summarizeround))

             x
0%       -0.72
25%      -0.16
50%       0.02
75%       0.24
100%      0.90
mean      0.04
sd        0.32
var       0.10
skewness  0.45
kurtosis  0.56
NA's      0.00
N        56.00
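
To illustrate why extracting the numerics element helps, here is a minimal base R sketch. The list res is a hypothetical stand-in that only mimics the shape of a summarize result (a numeric table under numerics and NULL under factors). It is not produced by rockchalk, and the three values are copied from the printed output in this thread purely for illustration.

```r
# Hypothetical stand-in for the list returned by summarize():
# a numeric table under $numerics and NULL under $factors.
res <- list(
  numerics = data.frame(x = c(mean = 0.0400, sd = 0.3152, skewness = 0.4485)),
  factors  = NULL
)

# round() on the whole list fails because a list is not numeric.
# tryCatch() captures the error message instead of stopping the script.
whole <- tryCatch(round(res, 2), error = function(e) conditionMessage(e))
print(whole)

# Rounding only the numeric component works and keeps the row names.
rounded <- round(res$numerics, 2)
print(rounded)
```

The same reasoning would apply to the real object: round needs the numeric table inside the list, not the list itself.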

2 Comments

Psidom, would you know how to keep the column names rather than getting an "x"? Your code works great, but I'm running this on a data frame with a few hundred columns and need those column names. Thanks!
You can split the summary into two steps with summaryList <- lapply(as.data.frame(M1), summarizeround); summaryDf <- as.data.frame(do.call(cbind, summaryList)) and then assign the names to the resulting data frame: names(summaryDf) <- names(summaryList)
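
The two-step approach from the comment above can be sketched in base R as follows. Here col_stats is a hypothetical stand-in for summarizeround: it returns a one-column data frame named "x" with the statistics as row names, which is the shape shown in the answer's output. The data frame df and its values are made up for the example.

```r
# Hypothetical stand-in for summarizeround(): returns a one-column
# data frame named "x" whose row names are the statistic labels.
col_stats <- function(x) {
  data.frame(x = round(c(mean = mean(x), sd = sd(x), N = length(x)), 2))
}

# Small made-up data frame with two named columns.
df <- data.frame(height = c(1.523, 1.614, 1.708),
                 weight = c(60.12, 72.48, 81.33))

# Step 1: summarize each column; lapply() names the list after the columns.
summaryList <- lapply(df, col_stats)

# Step 2: bind the per-column results side by side, then restore the
# original column names, which would otherwise all be "x".
summaryDf <- as.data.frame(do.call(cbind, summaryList))
names(summaryDf) <- names(summaryList)
print(summaryDf)
```

With a few hundred columns the same two steps should apply unchanged, since the names are taken from the list rather than typed by hand.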

?rockchalk::summarize says the argument should be a data frame. So, make M1 a data frame:

M1<-as.data.frame(M1)
summarize(M1)

$numerics
              V1
0%       -0.7200
25%      -0.1625
50%       0.0150
75%       0.2400
100%      0.9000
mean      0.0400
sd        0.3152
var       0.0993
skewness  0.4485
kurtosis  0.5626
NA's      0.0000
N        56.0000

$factors
NULL

And to get the rounding

> round(summarize(M1)[[1]],2)
            V1
0%       -0.72
25%      -0.16
50%       0.02
75%       0.24
100%      0.90
mean      0.04
sd        0.32
var       0.10
skewness  0.45
kurtosis  0.56
NA's      0.00
N        56.00

2 Comments

What is the rockchalk failure message?
Running round(summarize(M1),2) gave the same error. The code is intended for a large number of columns, so selecting column 1 would not do. Thanks very much anyway!
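
One point that may help here: in the answer, [[1]] indexes the list returned by summarize, not the first column of the data. A minimal base R sketch of that distinction follows. The list res is a hypothetical stand-in with the same shape as the printed result (numerics first, factors second). The column V2 and its values are invented for illustration.

```r
# Hypothetical stand-in for a summarize() result on a two-column data frame.
res <- list(
  numerics = data.frame(V1 = c(mean = 0.0400, sd = 0.3152),
                        V2 = c(mean = 1.2345, sd = 2.7182)),
  factors  = NULL
)

# [[1]] picks the first list element, which is the whole numerics table.
first <- res[[1]]
same_as_numerics <- identical(first, res$numerics)
print(same_as_numerics)

# Rounding that table therefore covers every numeric column at once.
rounded <- round(first, 2)
print(rounded)
```

Under that reading, round(summarize(M1)[[1]], 2) would not be limited to a single column when M1 has many.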
