I'm trying to apply a function to every column of a pandas DataFrame. The function divides each column (a pandas Series) by a parameter stored in another DataFrame (df_reference), which I look up by the column name (Series.name).
However, the operation does not work and the resulting DataFrame is full of NaN values. I think the problem is in the way I infer the column name on each iteration.
Here is the code:
import numpy as np
import pandas as pd

# This is an example of the df I'd like to operate on:
df = pd.DataFrame({'P01': np.random.random(50),
                   'P02': np.random.random(50)},
                  index=pd.period_range(start='2015-03-09', periods=50))
>>> df
P01 P02
2015-03-09 0.575955 0.735709
2015-03-10 0.290656 0.989249
2015-03-11 0.859850 0.387678
2015-03-12 0.939810 0.085914
2015-03-13 0.278855 0.031567
... ... ...
# This is an example of the reference df I'd like to look values up in:
df_reference = pd.DataFrame({'ID': ['P01', 'P02'], 'Lat': [37.261, 37.258],
                             'Lon': [-6.431, -6.433], 'Z': [-0.63, -0.825]})
>>> df_reference
ID Lat Lon Z
0 P01 37.261 -6.431 -0.630
1 P02 37.258 -6.433 -0.825
Apply operation:
df.apply(lambda x: x/df_reference.loc[df_reference['ID']==x.name]['Z'], axis=1)
Result:
P01 P02
2015-03-09 NaN NaN
2015-03-10 NaN NaN
2015-03-11 NaN NaN
2015-03-12 NaN NaN
... ... ...
Any idea what could be happening?
x.name does not contain the column name but the index label, since you use axis=1.
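A minimal sketch of one way to fix it, assuming each ID appears exactly once in df_reference; the names z_by_id, result_apply and result_div are illustrative and not from the original post. Note that switching to the default axis=0 is probably not enough on its own: the original lookup returns a one-element Series indexed by 0, which would not align with the period index of the column, so the sketch looks up a scalar instead.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"P01": np.random.random(50), "P02": np.random.random(50)},
    index=pd.period_range(start="2015-03-09", periods=50),
)
df_reference = pd.DataFrame(
    {"ID": ["P01", "P02"], "Lat": [37.261, 37.258],
     "Lon": [-6.431, -6.433], "Z": [-0.63, -0.825]}
)

# Build a Series that maps each column name (ID) to its Z value.
z_by_id = df_reference.set_index("ID")["Z"]

# Option 1: apply column by column (axis=0 is the default), so col.name
# is the column label and z_by_id[col.name] is a plain scalar.
result_apply = df.apply(lambda col: col / z_by_id[col.name])

# Option 2: vectorized division; the index of z_by_id is matched
# against the column labels of df.
result_div = df.div(z_by_id, axis="columns")
```

Both options should give the same values; the second avoids the Python-level lambda and is usually the more idiomatic choice when the whole frame is divided by per-column constants.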