Make interpolation method conditional in pandas in python

Question

import numpy as np, pandas as pd
data = np.array([[[3, 2, 1, np.nan, np.nan],
              [22, 1, 1, 4, 4],
              [4, 2, 3, 3, 4],
              [1, 1, 4, 1, 5],
              [2, 4, 5, 2, 1]],

             [[6, 7, 10, 6, np.nan],
              [np.nan, 7, 8, 6, 9],
              [6, 10, 9, 8, 10],
              [6, 8, 7, 10, 8],
              [10, 9, 9, 10, 8]],

             [[6, 7, 10, np.nan, np.nan],
              [19, 19, 8, 6, 9],
              [6, 10, 9, 8, 10],
              [6, 8, 7, 10, 8],
              [10, 9, 9, 10, 8]],

             [[6, 7, 10, 6, np.nan],
              [19, 21, 8, 6, 9],
              [6, 10, 9, 8, 10],
              [6, 8, 7, 10, 8],
              [10, 9, 9, 10, 8]],

             [[12, 14, 12, 15, np.nan],
              [19, 11, 14, 14, 11],
              [13, 13, 16, 15, 11],
              [14, 15, 14, 16, 14],
              [13, 15, 11, 11, 14]]])

new_data = data.reshape(5,25)
df = pd.DataFrame(new_data)
result = df.interpolate(axis=0,method='cubic').values.reshape(data.shape)

print(result)

Though some locations have 4 non-NaN values, the whole process stops with an error saying that the 'cubic' method requires at least 4 non-NaN values. How can I make it conditional, so that the 'cubic' method is applied only to those locations where it can run?

I could get good results in this case with new_data = data.reshape(-1, 5); pd.DataFrame(new_data).interpolate(axis=1, method='linear')
– Saullo G. P. Castro, Commented May 3, 2014 at 8:03
Yes, but I want to try a spline method. It seems pandas does not have the required functionality.
– user3235542, Commented May 3, 2014 at 8:22
You can "filter" the np.ndarray with cond = new_data.shape[1] - np.sum(np.isnan(new_data), axis=1) >= 4 and then create the DataFrame with pd.DataFrame(new_data[cond, :]). This will accept the cubic interpolation, but in this case some of the NaNs still do not vanish...
– Saullo G. P. Castro, Commented May 3, 2014 at 8:25
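The filtering idea from the comment above can also be applied per column, so that the frame keeps its shape. The following is only a minimal sketch, not something taken from the thread: the helper name interpolate_where_possible and its min_valid parameter are hypothetical, and it assumes SciPy is installed, since pandas delegates method='cubic' to it. Columns with fewer than min_valid non-NaN entries are left untouched.

```python
import numpy as np
import pandas as pd


def interpolate_where_possible(df, method="cubic", min_valid=4):
    """Interpolate only the columns that have enough non-NaN values.

    Hypothetical helper: columns with fewer than `min_valid` valid
    entries are copied through unchanged instead of raising an error.
    """
    valid_counts = df.notna().sum(axis=0)          # non-NaN count per column
    eligible = list(valid_counts[valid_counts >= min_valid].index)
    out = df.copy()
    if eligible:
        out[eligible] = df[eligible].interpolate(method=method, axis=0)
    return out


# Small illustrative frame (made-up values, not the data from the question).
df = pd.DataFrame({
    "a": [0.0, 1.0, np.nan, 27.0, 64.0],       # 4 valid values: interpolated
    "b": [1.0, np.nan, np.nan, 4.0, np.nan],   # 2 valid values: left as-is
    "c": [np.nan] * 5,                         # all NaN: left as-is
})
out = interpolate_where_possible(df)
print(out)
```

Leading or trailing NaNs in an eligible column may still remain, because the gap has to lie between valid points for the spline to fill it.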

TomAugspurger · Accepted Answer · 2014-05-03 11:52:48Z

This should work for you (assuming you have at least 4 valid entries in each column that isn't all NaN):

df.dropna(how='all', axis=1).interpolate(method='cubic')

This will drop the columns that are all NaN.

If you need to recover the original shape, I'd suggest storing the columns:

cols = df.columns

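Then do the interpolation, storing the output as result, and follow that up with a reindex (reindex_axis has since been removed from pandas, so reindex(columns=...) is used here):

result.reindex(columns=cols)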
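Put together, the steps above might look like the sketch below. The small frame is made up for illustration and is not the data from the question. It assumes SciPy is available for method='cubic' and that every column that is not entirely NaN has at least 4 valid entries, as the answer states.

```python
import numpy as np
import pandas as pd

# Illustrative frame: column "b" is entirely NaN, the others have 4 valid values.
df = pd.DataFrame({
    "a": [0.0, 1.0, np.nan, 27.0, 64.0],
    "b": [np.nan] * 5,
    "c": [1.0, 2.0, 3.0, np.nan, 5.0],
})

cols = df.columns                                  # remember the original columns

# Drop the all-NaN columns, then interpolate down each remaining column.
result = df.dropna(how="all", axis=1).interpolate(method="cubic")

# Restore the dropped columns (filled with NaN) in their original positions.
result = result.reindex(columns=cols)
print(result)
```

For the question's array, the restored frame can then be passed through .values.reshape(data.shape) to get back the 3-D layout.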
