Scala Spark udf java.lang.UnsupportedOperationException

Question

I have created this currying function to check for null values for endDateStr inside an udf, the code is as follows:(Type of col x is ArrayType[TimestampType]):

    def _getCountAll(dates: Seq[Timestamp]) = Option(dates).map(_.length)
    def _getCountFiltered(endDate: Timestamp)(dates: Seq[Timestamp]) = Option(dates).map(_.count(!_.after(endDate)))

    val getCountUDF = udf((endDateStr: Option[String]) => {
      endDateStr match {
        case None => _getCountAll _
        case Some(value) => _getCountFiltered(Timestamp.valueOf(value + " 23:59:59")) _
      }
    })
    df.withColumn("distinct_dx_count", getCountUDF(lit("2009-09-10"))(col("x")))

But I am getting this exception while executing:

java.lang.UnsupportedOperationException: Schema for type Seq[java.sql.Timestamp] => Option[Int] is not supported

Can anyone please help me to figure out my mistake?

Alper t. Turker · Accepted Answer · 2018-06-12 16:30:34Z

1

You cannot curry udf like this. If you want curry-like behavior you should return udf from the outer function:

def getCountUDF(endDateStr: Option[String]) = udf {
  endDateStr match {
    case None => _getCountAll _
    case Some(value) => 
      _getCountFiltered(Timestamp.valueOf(value + " 23:59:59")) _
  }
}

df.withColumn("distinct_dx_count", getCountUDF(Some("2009-09-10"))(col("x")))

otherwise just drop currying and provide both arguments at the same time:

val getCountUDF = udf((endDateStr: String, dates: Seq[Timestamp]) => 
  endDateStr match {
    case null => _getCountAll(dates)
    case _ => 
      _getCountFiltered(Timestamp.valueOf(endDateStr + " 23:59:59"))(dates)
  }
)

df.withColumn("distinct_dx_count", getCountUDF(lit("2009-09-10"), col("x")))

edited Jun 12, 2018 at 16:30

answered Jun 12, 2018 at 16:25

Alper t. Turker

35.3k9 gold badges89 silver badges118 bronze badges

Sign up to request clarification or add additional context in comments.

Collectives™ on Stack Overflow

Scala Spark udf java.lang.UnsupportedOperationException

1 Answer 1

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Comments

Your Answer

Sign up or log in

Post as a guest

Related