I have a DataFrame whose columns have nullable set to True, and I want to change them to nullable = False in PySpark.
I can do it as shown below, but I don't want to convert to an RDD: I'm reading the data with Structured Streaming, and converting to an RDD is not recommended there. Is there a way to change the nullability without going through the RDD?
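    def set_df_columns_nullable(self, spark, df, column_list, nullable=True):
        # Update the nullable flag on the selected fields of the existing schema.
        for struct_field in df.schema:
            if struct_field.name in column_list:
                struct_field.nullable = nullable
        # Rebuild the DataFrame from its RDD so the modified schema is applied.
        df_mod = spark.createDataFrame(df.rdd, df.schema)
        return df_mod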
Thanks in advance.