I am practising adding a list to a DataFrame as a new column. I know I can define a UDF, register it, and apply it to the DataFrame, but I want to try a different approach: extract a list from a DataFrame column, map over it, and then add the result back to the original DataFrame as a new column.
val df = spark.createDataFrame(Seq(("A",1),("B",2),("C",3))).toDF("Str", "Num")
+---+---+
|Str|Num|
+---+---+
|  A|  1|
|  B|  2|
|  C|  3|
+---+---+
list collected:
scala> var ls : List[String] = df.select("Str").collect().map(f=>f.getString(0)).toList
var ls: List[String] = List(A, B, C)
Transformation:
def f(x: String): String = {
  if (x == "A") { x + "100" }
  else { x + x.length.toString }
}
apply transformation:
scala> ls.map(x => f(x))
val res95: List[String] = List(A100, B1, C1)
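One thing worth noting about the step above: `map` returns a new list and leaves `ls` unchanged, so the result has to be bound to a name before it can be used later. A minimal sketch in plain Scala (the names `MapList` and `transformed` are only illustrative, they are not in my original session):

```scala
object MapList {
  // Same transformation as in the question.
  def f(x: String): String =
    if (x == "A") x + "100" else x + x.length.toString

  // Stand-in for the list collected from the "Str" column.
  val ls: List[String] = List("A", "B", "C")

  // map builds a new list; ls itself still holds the original values.
  val transformed: List[String] = ls.map(f)

  def main(args: Array[String]): Unit = {
    println(ls)
    println(transformed)
  }
}
```

With `var ls`, reassigning via `ls = ls.map(f)` would have the same effect, at the cost of losing the original values.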
add column from the list: ERROR
import org.apache.spark.sql.functions.{lit,col}
df.withColumn("new", lit(ls)).show()
error: org.apache.spark.SparkRuntimeException: The feature is not supported: literal for 'List(A100, B1, C1)' of class scala.collection.immutable.$colon$colon.
How can I add the transformed list to the DataFrame as a new column, one list element per row?
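For context on the error: `lit` builds a single constant value that is repeated on every row, and it does not accept a Scala `List`. As far as I understand, `typedLit(ls)` would be accepted, but it would put the whole list into every row as an array, which is not the goal here. One possible way to get one element per row, sketched below, is to turn the list into a small DataFrame of (original, transformed) pairs and join it back on `Str`. This assumes the values in `Str` are unique; the object name `AddListAsColumn`, the local master setting, and the names `mapped` and `result` are only illustrative.

```scala
import org.apache.spark.sql.SparkSession

object AddListAsColumn {
  // Same transformation as in the question.
  def f(x: String): String =
    if (x == "A") x + "100" else x + x.length.toString

  def main(args: Array[String]): Unit = {
    // Assumption: a local session for illustration; in spark-shell `spark` already exists.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("add-list-column")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(("A", 1), ("B", 2), ("C", 3)).toDF("Str", "Num")

    // Extract the column to the driver as a plain Scala list.
    val ls: List[String] = df.select("Str").collect().map(_.getString(0)).toList

    // Pair each original value with its transformed value and make a DataFrame of the pairs.
    val mapped = ls.zip(ls.map(f)).toDF("Str", "new")

    // Join on the key column; this only lines up correctly if "Str" values are unique.
    val result = df.join(mapped, Seq("Str"))
    result.orderBy("Str").show()

    spark.stop()
  }
}
```

Joining on a key avoids relying on row order, which Spark does not guarantee after `collect()` unless the DataFrame is explicitly sorted. Since the list lives on the driver, this only makes sense for small data; for anything larger, the UDF route (or built-in column functions) is likely the better fit.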