Pandas/Python: interpolation of multiple columns based on values specified for one reference column

Question

df
Out[1]: 
             PRES   HGHT  TEMP  DWPT  RELH   MIXR  DRCT  SKNT   THTA   THTE   THTV
        0   978.0    345  17.0  16.5    97  12.22     0     0  292.0  326.8  294.1
        1   977.0    354  17.8  16.7    93  12.39     1     0  292.9  328.3  295.1
        2   970.0    416  23.4  15.4    61  11.47     4     2  299.1  332.9  301.2
        3   963.0    479  24.0  14.0    54  10.54     8     3  300.4  331.6  302.3
        4   948.7    610  23.0  13.4    55  10.28    15     6  300.7  331.2  302.5
        5   925.0    830  21.4  12.4    56   9.87    20     5  301.2  330.6  303.0
        6   916.0    914  20.7  11.7    56   9.51    20     4  301.3  329.7  303.0
        7   884.0   1219  18.2   9.2    56   8.31    60     4  301.8  326.7  303.3
        8   853.1   1524  15.7   6.7    55   7.24    35     3  302.2  324.1  303.5
        9   850.0   1555  15.4   6.4    55   7.14    20     2  302.3  323.9  303.6
        10  822.8   1829  13.3   5.6    60   6.98   300     4  302.9  324.0  304.1

How do I interpolate the values of all the columns on specified PRES (pressure) values at say PRES=[950, 900, 875]? Is there an elegant pandas type of way to do this?

The only way I can think of doing this is to first start with making empty NaN values for the entire row for each specified PRES values in a loop, then set PRES as index and then use the pandas native interpolate option:

df.interpolate(method='index', inplace=True)

Is there a more elegant solution?

jezrael · Accepted Answer · 2019-01-23 07:08:51Z

Use your solution with no loop - reindex by union original index values with PRES list, but working only if all values are unique:

PRES=[950, 900, 875]
df = df.set_index('PRES')
df = df.reindex(df.index.union(PRES)).sort_index(ascending=False).interpolate(method='index')
print (df)
         HGHT  TEMP  DWPT  RELH   MIXR   DRCT  SKNT   THTA   THTE   THTV
978.0   345.0  17.0  16.5  97.0  12.22    0.0   0.0  292.0  326.8  294.1
977.0   354.0  17.8  16.7  93.0  12.39    1.0   0.0  292.9  328.3  295.1
970.0   416.0  23.4  15.4  61.0  11.47    4.0   2.0  299.1  332.9  301.2
963.0   479.0  24.0  14.0  54.0  10.54    8.0   3.0  300.4  331.6  302.3
950.0  1829.0  13.3   5.6  60.0   6.98  300.0   4.0  302.9  324.0  304.1
948.7   610.0  23.0  13.4  55.0  10.28   15.0   6.0  300.7  331.2  302.5
925.0   830.0  21.4  12.4  56.0   9.87   20.0   5.0  301.2  330.6  303.0
916.0   914.0  20.7  11.7  56.0   9.51   20.0   4.0  301.3  329.7  303.0
900.0  1829.0  13.3   5.6  60.0   6.98  300.0   4.0  302.9  324.0  304.1
884.0  1219.0  18.2   9.2  56.0   8.31   60.0   4.0  301.8  326.7  303.3
875.0  1829.0  13.3   5.6  60.0   6.98  300.0   4.0  302.9  324.0  304.1
853.1  1524.0  15.7   6.7  55.0   7.24   35.0   3.0  302.2  324.1  303.5
850.0  1555.0  15.4   6.4  55.0   7.14   20.0   2.0  302.3  323.9  303.6
822.8  1829.0  13.3   5.6  60.0   6.98  300.0   4.0  302.9  324.0  304.1

If possible not unique values in PRES column, then use concat with sort_index:

PRES=[950, 900, 875]
df = df.set_index('PRES')

df = (pd.concat([df, pd.DataFrame(index=PRES)])
        .sort_index(ascending=False)
        .interpolate(method='index'))

Collectives™ on Stack Overflow

Pandas/Python: interpolation of multiple columns based on values specified for one reference column

1 Answer 1

1 Comment

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

1 Comment

Your Answer

Sign up or log in

Post as a guest

Related