I am trying to fill the null values in ColY with the corresponding values from ColX, storing the output as a new column, Col_new, in my DataFrame. I am using PySpark in Databricks, but I am fairly new to it.
Sample data is as follows:
ColX        ColY
apple       orange
pear        null
grapefruit  pear
apple       null
The desired output would look like the following:
ColX        ColY    Col_new
apple       orange  orange
pear        null    pear
grapefruit  pear    pear
apple       null    apple
I have tried several lines of code to no avail. My latest attempt was as follows:
.withColumn("Col_new", col('ColX').select(coalesce('ColY')))
Any help would be greatly appreciated. Many Thanks.