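I want to fill null values in a Spark DataFrame based on the `id` column: a null in `animal` or `name` should take the non-null value found in other rows with the same `id`.

PySpark DataFrame:
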
| index | id | animal | name |
|---|---|---|---|
| 1 | 001 | cat | doug |
| 2 | 002 | dog | null |
| 3 | 001 | cat | null |
| 4 | 003 | null | null |
| 5 | 001 | null | doug |
| 6 | 002 | null | bob |
| 7 | 003 | bird | larry |

Expected result:

| index | id | animal | name |
|---|---|---|---|
| 1 | 001 | cat | doug |
| 2 | 002 | dog | bob |
| 3 | 001 | cat | doug |
| 4 | 003 | bird | larry |
| 5 | 001 | cat | doug |
| 6 | 002 | dog | bob |
| 7 | 003 | bird | larry |