How to pass variable arguments to the Cube function in spark sql?

Question

How do I pass variable arguments to the cube function in Spark SQL, and also to the agg function applied to the cube?

I have a list of columns, and I want to apply the cube function to those columns together with aggregation functions.

For example:

val columnsInsideCube = List("data", "product","country")
val aggColumns = List("revenue")

I want something like this:

dataFrame.cube(columns:String*).agg(aggcolumns:String*)

This is not the same as passing a Scala array to cube. cube is a predefined method on the DataFrame, so we have to pass the arguments in the proper manner.

I formatted your text and fixed your grammar, because I love you. Next time, please do it yourself, thank you. And don't forget: "I" is always capitalized in English!
– peterh, Commented Jun 14, 2016 at 15:12
Possible duplicate of How pass scala Array into scala vararg method?
– zero323, Commented Jun 15, 2016 at 1:30

InLaw · Accepted Answer · 2019-01-27 08:27:04Z

You could use

Spark's DataFrame.cube (new in version 1.4), shown here in PySpark

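# cube is a method on DataFrame, so it needs no import of its own;
# df is assumed to be an existing DataFrame with "name" and "age" columns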
df.cube("name", df.age).count().orderBy("name", "age").show()
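Applied to the lists from the question, the same call can take its columns from a Python list by unpacking it with `*`. The following is only a sketch: the helper name `cube_and_aggregate` is hypothetical, and `sum` is an assumed aggregate, since the question does not say which one is wanted. It relies on `GroupedData.agg` accepting a dict that maps column names to aggregate function names.

```python
def cube_and_aggregate(df, cube_columns, agg_columns, agg_func="sum"):
    """Cube `df` over `cube_columns` and apply `agg_func` to each of `agg_columns`.

    Hypothetical helper; `agg_func="sum"` is an assumption, not from the question.
    """
    # * unpacks the list into positional arguments,
    # e.g. df.cube("data", "product", "country")
    grouped = df.cube(*cube_columns)
    # agg is given a dict of {column name: aggregate function name}
    return grouped.agg({column: agg_func for column in agg_columns})


columns_inside_cube = ["data", "product", "country"]
agg_columns = ["revenue"]

# Usage, assuming `data_frame` is an existing Spark DataFrame:
# cube_and_aggregate(data_frame, columns_inside_cube, agg_columns).show()
```

In Scala, the corresponding idiom for expanding a sequence into varargs is `: _*`, which is what the linked duplicate covers.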

or HiveSQL

GROUP BY a, b, c WITH CUBE

which is equivalent to

GROUP BY a, b, c GROUPING SETS ( (a, b, c), (a, b), (b, c), (a, c), (a), (b), (c), ( ))
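The grouping sets listed above are simply every subset of the grouped columns, which is why a cube over n columns yields 2^n groupings. A small sketch in plain Python (no Spark needed; the function name `cube_grouping_sets` is made up for illustration) enumerates them:

```python
from itertools import combinations


def cube_grouping_sets(columns):
    """Return every subset of `columns`, largest first (illustrative helper)."""
    grouping_sets = []
    # Walk from the full column list down to the empty grouping (grand total)
    for size in range(len(columns), -1, -1):
        grouping_sets.extend(combinations(columns, size))
    return grouping_sets


grouping_sets = cube_grouping_sets(["a", "b", "c"])
for grouping in grouping_sets:
    print(grouping)
```

The empty tuple at the end corresponds to the `( )` grouping set, that is, the aggregate over all rows.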

or you could use other libraries like

import com.activeviam.sparkube._
