import numpy as np, pandas as pd
data = np.array([[[3, 2, 1, np.nan, np.nan],
              [22, 1, 1, 4, 4],
              [4, 2, 3, 3, 4],
              [1, 1, 4, 1, 5],
              [2, 4, 5, 2, 1]],

             [[6, 7, 10, 6, np.nan],
              [np.nan, 7, 8, 6, 9],
              [6, 10, 9, 8, 10],
              [6, 8, 7, 10, 8],
              [10, 9, 9, 10, 8]],

             [[6, 7, 10, np.nan, np.nan],
              [19, 19, 8, 6, 9],
              [6, 10, 9, 8, 10],
              [6, 8, 7, 10, 8],
              [10, 9, 9, 10, 8]],

             [[6, 7, 10, 6, np.nan],
              [19, 21, 8, 6, 9],
              [6, 10, 9, 8, 10],
              [6, 8, 7, 10, 8],
              [10, 9, 9, 10, 8]],

             [[12, 14, 12, 15, np.nan],
              [19, 11, 14, 14, 11],
              [13, 13, 16, 15, 11],
              [14, 15, 14, 16, 14],
              [13, 15, 11, 11, 14]]])

# Flatten so that each of the 25 columns holds one location's 5 values along axis 0
new_data = data.reshape(5, 25)
df = pd.DataFrame(new_data)
result = df.interpolate(axis=0, method='cubic').values.reshape(data.shape)

print(result)

Although some locations have at least 4 non-NaN values, the whole process stops with an error saying that the 'cubic' method requires at least 4 non-NaN values. How can I apply the 'cubic' method conditionally, so that it only changes values at the locations where it can actually run?

  • I could get good results in this case with new_data = data.reshape(-1, 5); pd.DataFrame(new_data).interpolate(axis=1, method='linear') Commented May 3, 2014 at 8:03
  • Yes, but I want to try the spline method. It seems pandas does not have the required functionality. Commented May 3, 2014 at 8:22
  • You can "filter" the np.ndarray with cond = new_data.shape[1] - np.sum(np.isnan(new_data), axis=1) >= 4 and then create the DataFrame with pd.DataFrame(new_data[cond, :]). This will accept the cubic interpolation, but in this case some of the NaNs still do not vanish... Commented May 3, 2014 at 8:25

1 Answer

This should work for you (assuming you have at least 4 valid entries in each column that isn't all NaN):

df.dropna(how='all', axis=1).interpolate(method='cubic')

This drops the columns that are entirely NaN before interpolating.

If you need to recover the original shape, I'd suggest storing the columns:

cols = df.columns

Then do the interpolation, store the output as result, and follow that up with a reindex, which adds the dropped columns back as NaN:

result.reindex(columns=cols)  # reindex_axis was removed in pandas 1.0
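
Note that in the question's data the assumption above does not hold everywhere: after reshape(5, 25), column 3 has only three valid values (NaN, 6, NaN, 6, 15), so dropping the all-NaN columns alone may not be enough. A minimal sketch of a conditional approach follows. It interpolates a column only when that column has enough valid entries and leaves every other column untouched. The helper name interpolate_where_possible and its min_valid parameter are hypothetical, not part of pandas, and method='cubic' needs SciPy installed.

```python
import numpy as np
import pandas as pd


def interpolate_where_possible(df, method="cubic", min_valid=4):
    """Interpolate down each column that has at least `min_valid` non-NaN values.

    Hypothetical helper: columns with fewer valid values are returned unchanged.
    """
    out = df.copy()
    for col in df.columns:
        if df[col].notna().sum() >= min_valid:
            out[col] = df[col].interpolate(method=method)
    return out


# Small demo using three columns taken from the question's reshaped data
demo = pd.DataFrame({
    "a": [22.0, np.nan, 19.0, 19.0, 19.0],  # 4 valid values: interpolated
    "b": [np.nan, 6.0, np.nan, 6.0, 15.0],  # 3 valid values: left as is
    "c": [np.nan] * 5,                      # all NaN: left as is
})
filled = interpolate_where_possible(demo)
print(filled)
```

With the question's array this could be used as interpolate_where_possible(pd.DataFrame(data.reshape(5, 25))).values.reshape(data.shape). Because no columns are dropped, the original shape is kept and no reindex is needed. Leaving the skipped columns as NaN is a design choice here; falling back to method='linear' for them would be another option.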