Suppose we have a simple DataFrame:
from pyspark.sql import SparkSession
from pyspark.sql.types import StructType, StructField, LongType, StringType

spark = SparkSession.builder.getOrCreate()

schema = StructType([
    StructField('id', LongType(), False),
    StructField('name', StringType(), False),
    StructField('count', LongType(), True),
])
df = spark.createDataFrame([(1, 'Alice', None), (2, 'Bob', 1)], schema)
The question is: how do I detect null values in the count column? I tried the following:
df.where(df.count == None).show()
df.where(df.count is 'null').show()
df.where(df.count == 'null').show()
Each attempt fails with an error:

condition should be string or Column
I know the following works:

df.where("count is null").show()

But is there a way to achieve this without passing the whole condition as a string, i.e. using attribute-style column access like df.count?