I want to use regexp_replace() in PySpark to convert all question marks and backslashes in my DataFrame to null values. This is the code I used:
from pyspark.sql.functions import regexp_replace

question = "?"
empty_str = "\\\"\\\""
for column in df.columns:
    df = df.withColumn(column, regexp_replace(column, question, None))
    df = df.withColumn(column, regexp_replace(column, empty_str, None))
However, when I run this code, every value in my DataFrame becomes null, not just the question marks and backslashes. How can I change my code to fix this?
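
For reference, here is a quick check of the two patterns outside Spark, using Python's built-in re module. This is only a sketch: it assumes these simple patterns behave the same way in Python's re as in the Java regex engine that Spark uses. The sample values and the to_none helper are hypothetical, not part of my real code. It suggests that a bare "?" is not a valid pattern on its own, and that empty_str matches two double-quote characters rather than a backslash.

```python
import re

question = r"\?"          # escaped so it matches a literal question mark
empty_str = "\\\"\\\""    # same string as in the question: the regex \"\"

# A bare "?" is a quantifier with nothing before it to repeat,
# so compiling it as a pattern raises an error.
try:
    re.compile("?")
    bare_question_is_valid = True
except re.error:
    bare_question_is_valid = False


def to_none(value, patterns):
    """Return None if the whole value matches any pattern, else the value unchanged."""
    if value is not None and any(re.fullmatch(p, value) for p in patterns):
        return None
    return value


# Hypothetical sample values standing in for one column of the DataFrame.
sample = ["?", '""', "\\", "what?", "ok"]
cleaned = [to_none(v, [question, empty_str]) for v in sample]

print(bare_question_is_valid)
print(cleaned)
```

The helper only models the behaviour I am after: a cell that consists entirely of a marker becomes null and everything else is left alone. In PySpark the closest equivalent is probably a conditional such as when(...).otherwise(...) from pyspark.sql.functions rather than regexp_replace with a None replacement, but I have not confirmed that.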