0

Notice: this is for Spark version 2.1.1.2.6.1.0-129

I have a spark dataframe (Python). I would like to replace all instances of 0 across the entirety of the dataframe (without specifying particular column names), with NULL values.

The following is the code that I have written:

my_df = my_df.na.replace(0, None)

The following is the error that I receive:

  File "<stdin>", line 1, in <module>
  File "/usr/hdp/current/spark2-client/python/pyspark/sql/dataframe.py", line 1634, in replace
    return self.df.replace(to_replace, value, subset)
  File "/usr/hdp/current/spark2-client/python/pyspark/sql/dataframe.py", line 1323, in replace
    raise ValueError("value should be a float, int, long, string, list, or tuple")
ValueError: value should be a float, int, long, string, list, or tuple 
2
  • my_df.na.replace(0, None) works fine for me (Spark 3.x) What's your Spark version? Commented Oct 24, 2021 at 20:16
  • I have spark version 2.1.1.2.6.1.0-129. I have also updated the question to include the version Commented Oct 24, 2021 at 20:18

1 Answer 1

1

Apparently in Spark 2.1.1, df.na.replace does not support None. None option is only available since 2.3.0, which is not applicable in your case.

To replace values dynamically (i.e without typing columns name manually), you can use either df.columns or df.dtypes. The latter will give you the option to compare datatype as well.

from pyspark.sql import functions as F

for c in df.dtypes:
    if c[1] == 'bigint':
        df = df.withColumn(c[0], F.when(F.col(c[0]) == 0, F.lit(None)).otherwise(F.col(c[0])))

# Input
# +---+---+
# | id|val|
# +---+---+
# |  0|  a|
# |  1|  b|
# |  2|  c|
# +---+---+

# Output
# +----+---+
# |  id|val|
# +----+---+
# |null|  a|
# |   1|  b|
# |   2|  c|
# +----+---+
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.