Replace null values in 1 column with data from another column

Question

I am trying to replace all null data in column count_1 with data that may be in column count_2. Below is the expected output with a given input. How can I do this in Spark Scala?

Input Dataframe

name   count_1 count_2
Java   10000   null
Python null    20000
Scala  30000   null
R      null    null
Swift  50000   65000

Output Dataframe

name   merged
Java   10000
Python 20000
Scala  30000
R      null
Swift  50000

Vikas Saxena · Accepted Answer · 2021-09-06 02:10:59Z

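You can use coalesce on those columns. For each row it returns the first non-null value among its arguments, or null if all of them are null.

This is what I would do (it also works if you have more columns, such as count_4):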

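import org.apache.spark.sql.functions.{coalesce, col}
import spark.implicits._ // enables the $"name" syntax; assumes a SparkSession named spark

// find the columns to coalesce (every column whose name starts with "count")
val cols = df.columns.filter(_.startsWith("count")).map(col(_))

// do the actual coalesce: first non-null value per row, in column order
df.select($"name", coalesce(cols: _*).as("merged"))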
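
To try this end to end, here is a minimal sketch that rebuilds the input DataFrame from the question and applies the same coalesce. The object name CoalesceExample, the helper mergeCounts, the app name, and the local master setting are all assumptions made for illustration; they are not part of the question or of Spark itself.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{coalesce, col}

object CoalesceExample {

  // Hypothetical helper: keeps "name" and merges every column whose name
  // starts with "count" into a single "merged" column.
  // Assumes at least one such column exists.
  def mergeCounts(df: DataFrame): DataFrame = {
    val cols = df.columns.filter(_.startsWith("count")).map(col(_))
    df.select(col("name"), coalesce(cols: _*).as("merged"))
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumed settings).
    val spark = SparkSession.builder()
      .appName("coalesce-example")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Option[Int] maps to a nullable integer column; None is stored as null.
    val df = Seq[(String, Option[Int], Option[Int])](
      ("Java",   Some(10000), None),
      ("Python", None,        Some(20000)),
      ("Scala",  Some(30000), None),
      ("R",      None,        None),
      ("Swift",  Some(50000), Some(65000))
    ).toDF("name", "count_1", "count_2")

    // Should match the output DataFrame shown in the question.
    mergeCounts(df).show()

    spark.stop()
  }
}
```

Note that the priority follows the order of df.columns: count_1 comes before count_2, so Swift keeps 50000 rather than 65000. If you need a different priority, reorder cols before passing it to coalesce.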
