I'm trying to apply a function to every column of a pandas DataFrame. The function divides each column (treated as a pandas Series) by a parameter stored in another DataFrame (df_reference), which I look up by the column name (Series.name).

However, the operation does not work and the resulting DataFrame is full of NaN values. I think the problem is in the way I infer the column name on each iteration.

Here I show the code:

import numpy as np
import pandas as pd

# This is an example of the df I'd like to operate on:

df = pd.DataFrame({'P01': np.random.random(50),
                   'P02': np.random.random(50)},
                  index=pd.period_range(start='2015-03-09', periods=50))

>>> df

              P01          P02
2015-03-09  0.575955    0.735709
2015-03-10  0.290656    0.989249
2015-03-11  0.859850    0.387678
2015-03-12  0.939810    0.085914
2015-03-13  0.278855    0.031567
   ...        ...         ...

# This is an example of the reference df I'd like to look values up in:

df_reference = pd.DataFrame({'ID':['P01', 'P02'], 'Lat':[37.261, 37.258],
                             'Lon':[-6.431, -6.433], 'Z':[-0.63, -0.825]})

>>> df_reference

    ID    Lat     Lon      Z
0   P01 37.261  -6.431  -0.630
1   P02 37.258  -6.433  -0.825

Apply operation:

df.apply(lambda x: x/df_reference.loc[df_reference['ID']==x.name]['Z'], axis=1)

Result:

            P01 P02
2015-03-09  NaN NaN
2015-03-10  NaN NaN
2015-03-11  NaN NaN
2015-03-12  NaN NaN
   ...      ... ...

Any clue as to what could be happening?

  • x.name does not contain the column name but the index label, since you use axis=1. Commented Feb 5, 2022 at 12:51
  • How could I infer the column name then? Commented Feb 5, 2022 at 12:53
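
The point made in the first comment can be checked with a small sketch. It assumes a shortened version of the question's df (5 rows instead of 50); the variable names col_names and row_names are only illustrative. With the default axis=0 the function receives one column at a time, so x.name is the column label; with axis=1 it receives one row at a time, so x.name is the row's index label (a Period here).

```python
import numpy as np
import pandas as pd

# Shortened version of the question's df (5 rows, assumed for brevity).
df = pd.DataFrame({'P01': np.random.random(5),
                   'P02': np.random.random(5)},
                  index=pd.period_range(start='2015-03-09', periods=5))

# axis=0 (the default): the function is called once per column,
# so x.name is the column label ('P01', then 'P02').
col_names = df.apply(lambda x: x.name, axis=0)

# axis=1: the function is called once per row,
# so x.name is the row's index label (a Period such as 2015-03-09).
row_names = df.apply(lambda x: x.name, axis=1)

print(list(col_names))
print(list(row_names))
```

So dropping axis=1 is one way to make x.name refer to the column, which is what the lookup in df_reference needs.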

1 Answer

Try:

>>> df / df_reference.set_index('ID')['Z']

# OR

>>> df.apply(lambda x: x/(df_reference.set_index('ID').loc[x.name].Z))

                 P01       P02
2015-03-09 -1.130257 -0.633978
2015-03-10 -0.367410 -0.655255
2015-03-11 -1.358091 -0.405920
2015-03-12 -0.085972 -0.637737
2015-03-13 -0.031896 -0.306626
2015-03-14 -0.934217 -0.257150
2015-03-15 -0.081206 -0.461807
2015-03-16 -1.100641 -1.202574
2015-03-17 -0.523478 -0.354512
2015-03-18 -0.303866 -1.030580
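
As a rough sketch of why both variants give the same result (using a 5-row version of the question's data; the names z, vectorized, applied and filtered are only illustrative): dividing a DataFrame by a Series matches the Series index against the DataFrame's column labels, so indexing Z by ID lines each divisor up with its column. The apply variant works because .loc[x.name] on that indexed Series returns a single number rather than a Series.

```python
import numpy as np
import pandas as pd

# Shortened version of the question's data (5 rows, assumed for brevity).
df = pd.DataFrame({'P01': np.random.random(5),
                   'P02': np.random.random(5)},
                  index=pd.period_range(start='2015-03-09', periods=5))
df_reference = pd.DataFrame({'ID': ['P01', 'P02'], 'Lat': [37.261, 37.258],
                             'Lon': [-6.431, -6.433], 'Z': [-0.63, -0.825]})

# Divisors indexed by ID: P01 -> -0.63, P02 -> -0.825.
z = df_reference.set_index('ID')['Z']

# DataFrame / Series aligns the Series index with the column labels.
vectorized = df / z

# Column-wise apply (default axis=0); z.loc[x.name] is a scalar here.
applied = df.apply(lambda x: x / z.loc[x.name])

# For comparison, the boolean filter used in the question returns a
# one-element Series labelled 0, not a scalar, and that label does not
# appear in df's PeriodIndex, so there is nothing to align on.
filtered = df_reference.loc[df_reference['ID'] == 'P01', 'Z']
print(filtered.index.tolist())
```

The vectorized form is usually preferable because it avoids calling a Python function per column.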

6 Comments

Is it what you expect?
Nope... I'd need to link the ID element to each column (P01 or P02)
Can you update your post with the expected output from 2015-03-09 to 2015-03-12 please?
Can you explain to me how you got this result, please?
I forgot they were random arrays... Of course the result will vary on each execution. It's working now for me just by setting the index as you say! Thanks a lot!