Could you please help me understand the following method?
def extractGlobalID(custDimIndex: Int)(gaData: DataFrame): DataFrame = {
  val getGlobId = udf[String, Seq[GenericRowWithSchema]](genArr => {
    val globId: List[String] =
      genArr.toList
        .filter(_(0) == custDimIndex)
        .map(custDim => custDim(1).toString)
    globId match {
      case Nil    => ""
      case x :: _ => x
    }
  })
  gaData.withColumn("globalId", getGlobId('customDimensions))
}
collectFirst followed by fold would be cleaner and faster?
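To make that suggestion concrete, here is a minimal sketch of the same lookup written with collectFirst and fold. It runs on plain Scala collections rather than Spark: `FakeRow`, `GlobalIdSketch` and `firstGlobalId` are hypothetical names introduced only for illustration, and each row is assumed to hold the dimension index at position 0 and the value at position 1, as in the question's UDF body.

```scala
object GlobalIdSketch {
  // Hypothetical stand-in for a Spark Row: position 0 holds the index, position 1 the value.
  type FakeRow = Seq[Any]

  // Returns the value of the first row whose index matches, or "" if none does.
  // collectFirst stops at the first match instead of filtering and mapping the whole list.
  def firstGlobalId(custDimIndex: Int)(rows: Seq[FakeRow]): String =
    rows
      .collectFirst { case row if row(0) == custDimIndex => row(1).toString }
      .fold("")(identity) // empty Option becomes "", otherwise keep the value

  def main(args: Array[String]): Unit = {
    val rows: Seq[FakeRow] = Seq(Seq(1, "a"), Seq(4, "GID-123"), Seq(4, "GID-456"))
    println(firstGlobalId(4)(rows))         // prints GID-123
    println(firstGlobalId(9)(rows).isEmpty) // prints true
  }
}
```

The result should match the filter/map/match version for the same input. The difference is that no intermediate list is built and the traversal ends at the first matching row. Inside the UDF, the same two calls would presumably replace the `globId` value and the `match` block, though that has not been tested against a real DataFrame here.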