
I am trying to use Scala UDFs from PySpark and am running into this error: 'pyspark.sql.utils.AnalysisException: UDF class doesn't implement any UDF interface'.

My Scala code looks something like this:

package com.spark.udfexample

import org.apache.spark.sql.Row
import org.apache.spark.sql.expressions.UserDefinedFunction
import org.apache.spark.sql.functions.udf
import org.apache.spark.sql.types.{StringType, StructType}

object test {
  case class Record(a: Option[String], b: Option[String])

  val foo: StructType = new StructType().add("a", StringType).add("b", StringType)

  def udf1: UserDefinedFunction = udf((param1: String, param2: Seq[Row]) => {
    ...
    ...
  })
}

I am packaging this object into a jar and passing it to the PySpark context.

spark.udf.registerJavaFunction("udf1", "com.spark.udfexample.test") — this is the call that raises the error. Can someone help? Thanks.

1 Answer


I'd suggest looking at this blog post, as it gives a rundown of why you are getting that error.

Your pseudocode looks like the setup of a regular Scala UDF, potentially even one using class state, whereas the Python interface (`registerJavaFunction`) is looking for a public class with a no-argument constructor that implements one of the `UDFx` interfaces.
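For illustration, here is a minimal sketch of the kind of class `registerJavaFunction` can load (the class name and concatenation logic are made up; only the package matches your example):

```scala
package com.spark.udfexample

import org.apache.spark.sql.api.java.UDF2

// A public class with a no-arg constructor implementing the UDF2 interface.
// This is the shape PySpark's registerJavaFunction expects, unlike a
// Scala-side UserDefinedFunction built with functions.udf.
class ConcatUDF extends UDF2[String, String, String] {
  override def call(a: String, b: String): String = a + ":" + b
}
```

On the Python side you would then register it with the class name, e.g. `spark.udf.registerJavaFunction("concat_udf", "com.spark.udfexample.ConcatUDF", StringType())`.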

If you can successfully register and use your actual Scala UDF within Scala Spark, then an alternative is to expose a registration function in a Scala object and call that from your Python code.
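That approach can be sketched as follows (the object name `RegisterUDFs` and the uppercase logic are hypothetical placeholders for your real UDF):

```scala
package com.spark.udfexample

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.udf

// Hypothetical helper: registers a Scala UDF on the given SparkSession,
// so Python only needs to invoke this method once via the JVM gateway.
object RegisterUDFs {
  def register(spark: SparkSession): Unit = {
    spark.udf.register("udf1", udf((s: String) => s.toUpperCase))
  }
}
```

From Python you would call it through py4j, e.g. `spark._jvm.com.spark.udfexample.RegisterUDFs.register(spark._jsparkSession)`, after which `udf1` is usable in `spark.sql` queries.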
