I receive an error when calling a UDF from within withColumn in Spark using Scala. The error occurs at compile time, while building with SBT.

val hiveRDD = sqlContext.sql("select * from iac_trinity.ctg_us_clickstream")
hiveRDD.persist()

val trnEventDf = hiveRDD
  .withColumn("system_generated_id", getAuthId(hiveRDD("session_user_id")))
  .withColumn("application_assigned_event_id", hiveRDD("event_event_id"))


val getAuthId = udf((session_user_id: String) => {
  if (session_user_id != None) {
    if (session_user_id != "NULL") {
      if (session_user_id != "null") {
        session_user_id
      } else "-1"
    } else "-1"
  } else "-1"
})

The error is:

scala:58: No TypeTag available for String
val getAuthId = udf((session_user_id:String) => {

It compiles when I use (session_user_id:Any) instead of (session_user_id:String), but it then fails at runtime because Spark does not support Any as a UDF argument type. Please let me know how to handle this.

  • Error or not, this doesn't make sense. An object of class String cannot be None! Commented Jul 4, 2016 at 8:14
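
To illustrate the point in the comment above: a String is never equal to None, so the outer check in the question never filters anything, and a missing value arrives as null instead. A minimal sketch of the intended logic in plain Scala follows, assuming the goal is to map null, "NULL", and "null" to "-1". The names UserIds and normalizeUserId are hypothetical and do not appear in the original post.

```scala
// Hypothetical helper, not from the original post.
object UserIds {
  // Returns the id unchanged, or "-1" for null and the textual markers "NULL"/"null".
  def normalizeUserId(sessionUserId: String): String =
    Option(sessionUserId)                            // Option(null) evaluates to None
      .filterNot(id => id == "NULL" || id == "null") // drop the textual null markers
      .getOrElse("-1")
}

// Assumption: with spark-sql on the classpath, the function could be wrapped as a UDF:
//   import org.apache.spark.sql.functions.udf
//   val getAuthId = udf(UserIds.normalizeUserId _)
```

Keeping the logic in an ordinary function makes it testable without a Spark session. It does not by itself address the TypeTag compile error.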

1 Answer

Have you tried being explicit with your types?

udf[String, String]((session_user_id:String)...

2 Comments

Yes, I have tried being explicit: val getAuthId = udf[String,String]((session_user_id:String) => if (session_user_id == None) .... The error is the same: scala:57: No TypeTag available for String [error] val getAuthId = udf[String,String]((session_user_id:String) => if (session_user_id == None)"-1"
@preitamojha, are you sure you are running the same code you are showing us? It seems unlikely that this doesn't work; I can't reproduce the error.
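
Since the error cannot be reproduced from the code alone, the build configuration may be worth checking. One possible cause, which is an assumption and not confirmed anywhere in this thread, is a mismatch between the Scala version SBT compiles with and the Scala version of the Spark artifacts, or a scala-reflect jar that does not match the compiler. A hedged build.sbt sketch follows. The version numbers are illustrative only and should be replaced with the ones that match the cluster.

```scala
// build.sbt (sketch; versions are illustrative assumptions, not taken from the post)
scalaVersion := "2.10.6"

libraryDependencies ++= Seq(
  // %% appends the Scala binary version (_2.10 here), so the artifacts follow scalaVersion
  "org.apache.spark" %% "spark-sql"  % "1.6.2" % "provided",
  "org.apache.spark" %% "spark-hive" % "1.6.2" % "provided",
  // TypeTag is defined in scala-reflect; keep it on the same version as the compiler
  "org.scala-lang" % "scala-reflect" % scalaVersion.value
)
```

If the build already looks like this, the cause is likely elsewhere.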
