I have a DataFrame with a single column of integer type.
I want to add a new column holding an array of n elements, where n is the value in the first column.
For example:
from pyspark.sql.types import StructType, StructField, IntegerType
x = spark.createDataFrame([(1,), (2,), (3,)], StructType([StructField("myInt", IntegerType(), True)]))
+-----+
|myInt|
+-----+
|    1|
|    2|
|    3|
+-----+
I need the resulting data frame to look like this:
+-----+---------+
|myInt|    myArr|
+-----+---------+
|    1|      [1]|
|    2|   [2, 2]|
|    3|[3, 3, 3]|
+-----+---------+
Note: it doesn't actually matter what the values inside the arrays are; only the count matters.
It'd be fine if the resulting data frame looked like this:
+-----+------------------+
|myInt|             myArr|
+-----+------------------+
|    1|            [item]|
|    2|      [item, item]|
|    3|[item, item, item]|
+-----+------------------+
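To pin down the intended transformation, here is a minimal plain-Python sketch of the mapping shown in the two tables above. It is not Spark code and does not touch the DataFrame; the function name `expected_rows` and its `filler` parameter are hypothetical and exist only for illustration.

```python
def expected_rows(values, filler=None):
    """Pair each integer n with a list of exactly n elements.

    If filler is None, the list repeats n itself; otherwise it repeats filler.
    """
    rows = []
    for n in values:
        fill = n if filler is None else filler
        rows.append((n, [fill] * n))
    return rows


# First desired output: the array repeats the integer itself.
print(expected_rows([1, 2, 3]))

# Second acceptable output: the array repeats an arbitrary placeholder.
print(expected_rows([1, 2, 3], filler="item"))
```

Either variant is acceptable, since only the array length has to match myInt.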