My question is related to this one:

Turn vector output into columns in data.table?

But my situation is a bit more complicated. I am not only returning the vector as columns but also calculating other columns at the same time. For example:

DT = data.table(X = 1:10, Y = 11:20, Z = 21:30, group = rep(1:10, each = 3))

featuresDT <- quote(list(x = mean(X),
                         y = mean(Y),
                         z = mean(Z),
                         as.list(quantile(X))))

DT[, eval(featuresDT), by = "group"]

where quantile returns a length-5 vector. Instead of getting a data.table with 8 columns, I get one with 4 columns: the quantile results appear as extra rows, and x, y and z are duplicated 5 times. I also tried dist = as.list(quantile(X)), but that gives the same result with a different column name.

    Can you please provide a reproducible example? Commented Aug 12, 2013 at 11:36

2 Answers

You should just change list to c. Calling c with any argument of type list automatically returns a list:

featuresDT <- quote(c(x = mean(X),
                      y = mean(Y),
                      z = mean(Z),
                      as.list(quantile(X))))
DT[, eval(featuresDT), by = "group"]

    group        x        y        z 0% 25% 50% 75% 100%
 1:     1 2.000000 12.00000 22.00000  1 1.5   2 2.5    3
 2:     2 5.000000 15.00000 25.00000  4 4.5   5 5.5    6
 3:     3 8.000000 18.00000 28.00000  7 7.5   8 8.5    9
 4:     4 4.333333 14.33333 24.33333  1 1.5   2 6.0   10
 5:     5 4.000000 14.00000 24.00000  3 3.5   4 4.5    5
 6:     6 7.000000 17.00000 27.00000  6 6.5   7 7.5    8
 7:     7 6.666667 16.66667 26.66667  1 5.0   9 9.5   10
 8:     8 3.000000 13.00000 23.00000  2 2.5   3 3.5    4
 9:     9 6.000000 16.00000 26.00000  5 5.5   6 6.5    7
10:    10 9.000000 19.00000 29.00000  8 8.5   9 9.5   10
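
Why c works where list does not can be checked in base R alone. The following is a minimal sketch, not part of the original answer; the values 2 and 12 are made-up stand-ins for the group means, and the names q, combined and nested are chosen here for illustration:

```r
# Quantiles of a small vector, converted to a named list of length 5.
q <- as.list(quantile(1:3))

# c() splices the list elements in: one flat list with 7 elements,
# which data.table can turn into 7 columns.
combined <- c(x = 2, y = 12, q)
is.list(combined)
names(combined)

# list() keeps q as a single nested element: only 3 elements,
# the last of which is itself a list of length 5.
nested <- list(x = 2, y = 12, q)
length(nested)
```

With the nested form, the third element has length 5 while the others have length 1, which would be consistent with the duplicated rows described in the question.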

1 Comment

The quantile function gives each output a name (e.g. "0%", "25%"), which is nice. What about functions that return an unnamed vector (e.g. range())? Could I specify the names?
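
One possible way to name an unnamed result such as range() is to wrap it in setNames before splicing it in with c. This is only a sketch under assumptions: the column names X.min and X.max are made up here, and the data set is a reduced version of the one in the question with the recycling of X written out explicitly:

```r
library(data.table)

# Reduced example data; X is repeated explicitly rather than recycled.
DT <- data.table(X = rep(1:10, 3), group = rep(1:10, each = 3))

# setNames() attaches the (hypothetical) names X.min and X.max to the
# two values returned by range() before c() splices them into the list.
featuresDT <- quote(c(list(x = mean(X)),
                      setNames(as.list(range(X)), c("X.min", "X.max"))))

res <- DT[, eval(featuresDT), by = "group"]
names(res)
```

The result should have one row per group, with the columns group, x, X.min and X.max.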

Try this:

featuresDT <- quote(cbind(list(x = mean(X),
                         y = mean(Y),
                         z = mean(Z)),
                         as.data.table(t(quantile(X)))))

3 Comments

That works - thank you! Is there a way for me to change the column names? I am computing quantiles for X, Y and Z, and the present solution gives me duplicated column names.
@mchangun Just write X = as.data.table(... and an "X." prefix will be added to the current column names.
@Roland I modified your code slightly. I am doing: featuresDT <- quote(c(list(x = mean(X), y = mean(Y), z = mean(Z)), X = as.list(quantile(X)))) Is there any performance difference between the two solutions? Keeping everything as lists seems to be more in line with the "spirit" of data.table.
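
The prefixing behaviour discussed in these comments can be sketched for two variables at once. This is an illustrative example rather than the commenters' exact code; it assumes a reduced version of the question's data with the recycling written out explicitly:

```r
library(data.table)

# Reduced example data with two measured variables.
DT <- data.table(X = rep(1:10, 3),
                 Y = rep(11:20, 3),
                 group = rep(1:10, each = 3))

# Naming a list argument inside c() prefixes each of its element names,
# so the quantile columns become "X.0%", ..., "Y.100%" and stay unique.
featuresDT <- quote(c(list(x = mean(X), y = mean(Y)),
                      X = as.list(quantile(X)),
                      Y = as.list(quantile(Y))))

res <- DT[, eval(featuresDT), by = "group"]
names(res)
```

Because the prefix comes from the argument name given to c, the same pattern extends to Z or any further variable without producing duplicated column names.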
