I used this code to aggregate the grouped data:
import org.apache.spark.sql.functions.{collect_set, desc}
val result = union_df.orderBy(desc("timestamp")).groupBy("id").agg(collect_set("region") as "region")
The result has this type:
org.apache.spark.sql.DataFrame = [id: string, region: array<string>]
What is the difference between array<string> and Array<String>? How do I iterate over an array<string> column in a map function (there is no getArray method on Row)?
array<string> and string are not Scala types; they are just the results of a toString call on a DataType, which is called by toString on the DataFrame.
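As for iterating: on the Scala side, an array<string> column is exposed as a Seq[String], so it can be read from a Row with getAs[Seq[String]] or getSeq[String]. Below is a minimal sketch, assuming Spark 2.x or later running in local mode. The unionDf sample data and the object name are hypothetical stand-ins for the asker's union_df, and the timestamp ordering is left out because it does not affect how the array is read.

```scala
import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.functions.collect_set

object RegionArraySketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("region-array-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample data standing in for union_df.
    val unionDf = Seq(("a", "eu"), ("a", "us"), ("b", "eu")).toDF("id", "region")

    val result = unionDf.groupBy("id").agg(collect_set("region") as "region")

    // Option 1: read the array column by name as a Seq[String].
    val joined = result.rdd.map { row =>
      val id = row.getAs[String]("id")
      val regions = row.getAs[Seq[String]]("region")
      (id, regions.sorted.mkString(","))
    }
    joined.collect().foreach(println)

    // Option 2: read the array column by position with getSeq.
    val sizes = result.rdd.map(row => (row.getString(0), row.getSeq[String](1).size))
    sizes.collect().foreach(println)

    // Option 3: pattern match on Row; the array arrives as some Seq.
    val heads = result.rdd.map {
      case Row(id: String, regions: Seq[_]) => (id, regions.headOption)
    }
    heads.collect().foreach(println)

    spark.stop()
  }
}
```

Going through result.rdd keeps the sketch independent of encoders. Calling map directly on the DataFrame also works in Spark 2.x and later, provided spark.implicits._ is in scope for the result type.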