I have a Spark DataFrame like the one below:

Id value
1   \N
2   \N
3    a
4    b
5   \N

I want to remove the \N records, which are null, from the DataFrame. How can I do this?

  • You can just filter them out: filter(df.value != r'\N'). Commented Dec 2, 2022 at 7:30

1 Answer

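A simple filter should work. The r prefix matters: without it, Python tries to read \N as the start of a named Unicode escape and raises a syntax error.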

data_sdf.filter(data_sdf.value != r'\N').show()

# +---+-----+
# | id|value|
# +---+-----+
# |  3|    a|
# |  4|    b|
# +---+-----+

1 Comment

@NamithaJanardhanan - did you use the r before the quoted character? It shouldn't give you the error if used correctly.
