
I need to replace null values in a column of a Spark DataFrame. Below is the code I tried:

df=df.na.fill(0,Seq('c_amount')).show()

But it throws an error: NameError: name 'Seq' is not defined

Below is my table

   +------------+--------+
   |c_account_id|c_amount|
   +------------+--------+
   |           1|    null|
   |           2|     123|
   |           3|    null|
   +------------+--------+

Expected output

   +------------+--------+
   |c_account_id|c_amount|
   +------------+--------+
   |           1|       0|
   |           2|     123|
   |           3|       0|
   +------------+--------+

1 Answer


You need to use fillna, passing a plain Python list for subset; Seq(...) is Scala syntax, which is why PySpark raises the NameError:

df = df.fillna("<BLANK>", subset=['col_name'])

6 Comments

I am getting <BLANK> in the null place. Should I use "0" there? If I use that, won't it be considered a string?
Yes, please use 0. Could you also please accept the answer?
Sir, after applying that code I am able to get the correct answer. But after that, if I try printSchema() it says 'NoneType' object has no attribute 'printSchema'. Can you elaborate on the problem, Sir?
NoneType means that instead of an instance of whatever Class or Object you think you're working with, you've actually got None. That usually means that an assignment or function call up above failed or returned an unexpected result.
But why do I get this error after using this function? Is there a way to solve it?
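The NoneType error discussed in the comments comes from the original line df=df.na.fill(0,...).show(): show() prints the DataFrame and returns None, so the assignment replaces df with None. A minimal illustration using a hypothetical stand-in class (FakeDF is made up for the demo; no Spark is required):

```python
class FakeDF:
    """Stand-in for a Spark DataFrame, just to demonstrate the pitfall."""
    def show(self):
        print("...table...")  # prints something, implicitly returns None

df = FakeDF()
df = df.show()        # the assignment captures show()'s return value: None
print(df is None)     # True -- any later df.printSchema() now fails

# Fix: keep the transformation and the display as separate statements:
# df = df.fillna(0, subset=['c_amount'])
# df.show()
```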
