I have a Spark SQL DataFrame on which I am trying to call a UDF that I created with the Spark SQL udf function.

import org.apache.spark.sql.functions.{col, udf}
val udfName = udf(somemethodName)
val newDF = df.withColumn("columnnew", udfName(col("anotherDFColumn")))

I'm trying to use another DataFrame, stored as a val, inside somemethodName, but that DataFrame comes through as null.

This happens only when I apply a where clause to newDF.

Am I missing something? Is it not possible to use another variable / method inside UDF method?

Or do I have to do something with broadcast? I am currently running this locally, not on a cluster.

1 Answer

Is it not possible to use another variable / method inside UDF method

It is possible if and only if that variable / method can be serialized - a UDF is a closure that must be serialized and distributed to executors.

A DataFrame cannot be serialized (it's a pointer to other distributed data, so there's no logical way to serialize it without collecting it into driver memory), so it appears as null when you try to use it inside the UDF.

You're probably going to need to join the two dataframes on some key, and then use a UDF (or a standard transformation) that takes columns from the joined Dataframe.
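
The join does not have to be on equal keys; a range condition works too. Here is a minimal sketch, assuming the second DataFrame is a small lookup with hypothetical columns low, high and label, and the main DataFrame has a hypothetical numeric column value. None of these names come from the question, so adjust them to your schema.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{broadcast, col}

object RangeJoinExample {
  // Left-join each row to the lookup row whose [low, high] range contains `value`.
  // Column names (value, low, high, label) are assumptions for illustration.
  def withLabel(df: DataFrame, lookup: DataFrame): DataFrame =
    df.join(
      broadcast(lookup), // hint that the lookup side is small enough to broadcast
      col("value").between(col("low"), col("high")),
      "left"
    )

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("range-join")
      .getOrCreate()
    import spark.implicits._

    val df = Seq(10, 75, 500).toDF("value")
    val lookup = Seq((0, 60, "low"), (61, 120, "high")).toDF("low", "high", "label")

    // Rows with no matching range keep a null label because of the left join.
    withLabel(df, lookup).select("value", "label").show()
    spark.stop()
  }
}
```

Because this stays a plain DataFrame operation, no DataFrame is ever referenced from inside a UDF, and a later where clause on the result behaves like any other filter.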

4 Comments

In fact, a Dataset / DataFrame can be serialized. It just cannot be used in the UDF closure.
@Tzach Zohar: There is no common key between the DataFrames, so I cannot join them. The other DataFrame is just a lookup file containing range values such as 0 to 60. How can I use the lookup inside the UDF? I suppose I could read the lookup file inside the UDF method, but then the entire file would have to be loaded for each and every record.
@LostInOverflow: Can I use a List that is defined outside the UDF method? I'm planning to read the ranges from the file into a list and then use that list inside the UDF method. Will that work?
Yes, you can use any non-distributed data structure.
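
A rough sketch of that approach, assuming the ranges are loaded once on the driver into a plain List. RangeEntry, labelFor and the sample ranges are hypothetical names and values for illustration; only columnnew and anotherDFColumn come from the question.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, udf}

// Hypothetical lookup entry: an inclusive range and the label it maps to.
case class RangeEntry(low: Int, high: Int, label: String)

object RangeLookupExample {
  // Pure function: find the label whose range contains the value, if any.
  def labelFor(ranges: Seq[RangeEntry], value: Int): Option[String] =
    ranges.find(r => value >= r.low && value <= r.high).map(_.label)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("udf-lookup")
      .getOrCreate()
    import spark.implicits._

    // In practice this would be read from the lookup file once, on the driver.
    // Hard-coded here for illustration.
    val ranges: List[RangeEntry] = List(
      RangeEntry(0, 60, "low"),
      RangeEntry(61, 120, "high")
    )

    // The UDF closes over `ranges`, a local serializable collection,
    // so it is shipped to the executors together with the function.
    val labelUdf = udf((v: Int) => labelFor(ranges, v))

    val df = Seq(10, 75, 500).toDF("anotherDFColumn")
    val newDF = df.withColumn("columnnew", labelUdf(col("anotherDFColumn")))

    // Values outside every range produce None, which Spark stores as null.
    newDF.where(col("columnnew").isNotNull).show()
    spark.stop()
  }
}
```

For a larger lookup, wrapping the list with spark.sparkContext.broadcast(ranges) and reading .value inside the UDF avoids re-sending it with every task. For a handful of ranges, capturing the list directly as above should be enough.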
