I have a dataframe -
values = [('A',8),('B',7)]
df = sqlContext.createDataFrame(values,['col1','col2'])
df.show()
+----+----+
|col1|col2|
+----+----+
|   A|   8|
|   B|   7|
+----+----+
I want a list of the even numbers from 0 up to the value in col2 (inclusive).
from pyspark.sql.functions import udf, col

# Returns even numbers
def make_list(col):
    return list(map(int,[x for x in range(col+1) if x % 2 == 0]))
make_list = udf(make_list)
df = df.withColumn('list',make_list(col('col2')))
df.show()
+----+----+---------------+
|col1|col2|           list|
+----+----+---------------+
|   A|   8|[0, 2, 4, 6, 8]|
|   B|   7|   [0, 2, 4, 6]|
+----+----+---------------+
df.printSchema()
root
|-- col1: string (nullable = true)
|-- col2: long (nullable = true)
|-- list: string (nullable = true)
I get the list I want, but the list column is of string type rather than an array of integers, as you can see in the printSchema() output above.
How can I get the list of int type? Without int type, I cannot explode this dataframe.
Any ideas as to how I can get a list of integers?
You need to specify the return type of the udf; otherwise, it will default to StringType. If you want to explode the list, you can also try a variation of the code from this question.
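As a minimal sketch of declaring the return type, the udf can be registered with ArrayType(IntegerType()) so that the new column is an array of integers rather than a string. The helper name add_even_list, the variable make_list_udf, and the output column name 'value' are made up for illustration; df is assumed to be the DataFrame from the question, with a non-null integer col2.

```python
def make_list(n):
    """Return the even integers from 0 through n (inclusive)."""
    return list(range(0, n + 1, 2))


def add_even_list(df):
    """Hypothetical helper: add an array<int> column 'list', then explode it."""
    # Imported here so make_list can be checked without a Spark installation.
    from pyspark.sql.functions import col, explode, udf
    from pyspark.sql.types import ArrayType, IntegerType

    # Declaring the return type is what makes the column array<int>
    # instead of the default string.
    make_list_udf = udf(make_list, ArrayType(IntegerType()))
    with_list = df.withColumn('list', make_list_udf(col('col2')))

    # explode works here because 'list' is now an array column;
    # 'value' is an assumed name for the exploded column.
    exploded = with_list.withColumn('value', explode(col('list')))
    return with_list, exploded
```

Calling with_list, exploded = add_even_list(df) and then with_list.printSchema() should report list as an array with integer elements, and exploded has one row per list element. Rows where col2 is null would need separate handling, since make_list assumes an integer argument.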