How to combine duplicate rows in pandas, filling in missing values?
In the example below, some rows are missing a value in the c1 column, but each c2 value also appears in another row, so c2 can serve as a key to look up and fill in the missing c1 values.
The input data looks like this:
      c1 c2
id
0   10.0  a
1    NaN  b
2   30.0  c
3   10.0  a
4   20.0  b
5    NaN  c
Desired output:
   c1 c2
0  10  a
1  20  b
2  30  c
How can I do this?
Here is the code to generate the example data:
import pandas as pd
df = pd.DataFrame({
    'c1': [10, float('nan'), 30, 10, 20, float('nan')],
    # c2 is the duplicated key column, matching the table above
    'c2': ['a', 'b', 'c', 'a', 'b', 'c'],
})
df.index.name = 'id'
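
One possible approach, offered as a minimal sketch rather than a definitive answer: group by c2 and take the first non-missing c1 value in each group. This assumes every c2 key has at least one non-NaN c1 value and that duplicate keys never carry conflicting c1 values; the name `result` is only illustrative.

```python
import pandas as pd

# Same example data as in the question (c2 holds the duplicated keys).
df = pd.DataFrame({
    'c1': [10, float('nan'), 30, 10, 20, float('nan')],
    'c2': ['a', 'b', 'c', 'a', 'b', 'c'],
})

# GroupBy.first() returns the first non-null value per group,
# so the NaN rows are skipped as long as another row supplies a value.
result = df.groupby('c2')['c1'].first().reset_index()

# Restore the original column order and cast c1 back to integers
# (safe here only because no NaN remains after grouping).
result = result[['c1', 'c2']].astype({'c1': int})

print(result)
```

If a key had only NaN values in c1, the integer cast would fail, so that step would need to be dropped or replaced with a nullable integer dtype.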