0

I have a dataframe df, but because 3 of its columns that should normally be "double" have values like "NULL", the automatic type is cast as string.

df =

col_0      col_1      col_2            col_3
Word       73452     3859.0     4734.0000000
Word1      23452     3859.0             NULL
Word2      73452       NULL     4758.0000000
Word1      73454       NULL     4756.0000000
Word2      73452     3456.0     4758.0000000

I want to change this and my attempt was:

from pyspark.sql.types import DoubleType

def my_function(df):
    df_1 = df.withColumn("col_1", df["col_1"].cast(DoubleType()))
    df_2 = df_1.withColumn("col_2", df_1["col_2"].cast(DoubleType()))
    df_3 = df_2.withColumn("col_3", df_2["col_3"].cast(DoubleType()))
    return df_3

But I don't know how to replace the string "NULL" with an empty kind of null. And even so, would that be enough?

2
  • 1
    What is NULL above? Is it text? Commented Nov 4, 2022 at 14:14
  • I tested it - .cast(DoubleType()) replaces every non-double convertible entry to empty kind of null you are expecting. Commented Nov 4, 2022 at 14:20

1 Answer 1

1

You can try and replacing the strings with value NULL with the Python's None type and then casting to correct types, like this:

df = spark.createDataFrame([("Word1", 23452, 3859.0, "NULL"), ("Word2", 73452, "NULL", 4758.0000000)], "col_0: string, col_1: int, col_2: string, col_3: string")
print(df.dtypes)

# [('col_0', 'string'), ('col_1', 'int'), ('col_2', 'string'), ('col_3', 'string')]

from pyspark.sql.functions import col

print(df.na.replace("NULL", None).select(col("col_0"), col("col_1").cast("int"), col("col_2").cast("double"), col("col_3").cast("double")).dtypes)

# [('col_0', 'string'), ('col_1', 'int'), ('col_2', 'double'), ('col_3', 'double')]
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.