Initializing pandas dataframes with and without index,columns yields different results

Question

If I use the following methodology to construct a pandas.DataFrame, I get an output that (I think) is peculiar:

import pandas, numpy

df = pandas.DataFrame(
    numpy.random.rand(100,2), index = numpy.arange(100), columns = ['s1','s2'])
smoothed = pandas.DataFrame(
    pandas.ewma(df, span = 21), index = df.index, columns = ['smooth1','smooth2'])

When I go to look at the smoothed values, I get:

>>> smoothed.tail()
    smooth1  smooth2
95      NaN      NaN
96      NaN      NaN
97      NaN      NaN
98      NaN      NaN
99      NaN      NaN

This looks like it should be equivalent to the following separate calls, yet they yield different results:

smoothed2 = pandas.DataFrame(pandas.ewma(df, span = 21))
smoothed2.index = df.index
smoothed2.columns = ['smooth1','smooth2']

Again using the DataFrame.tail() invocation I get:

>>> smoothed2.tail()
     smooth1   smooth2
95  0.496021  0.501153
96  0.506118  0.507541
97  0.516655  0.544621
98  0.520212  0.543751
99  0.518170  0.572429

Can anyone explain why these two ways of constructing a DataFrame give different results?

Wes McKinney · Accepted Answer · 2012-02-23 21:25:24Z

The result of ewma(df, span=21) is already a DataFrame, so when you pass it to the DataFrame constructor along with a list of columns, the constructor "selects" the columns you passed by label. The result has no columns named 'smooth1' or 'smooth2' (its columns are still 's1' and 's2'), so every value comes back as NaN. It's difficult in this particular case to break the link between label and data. If you had instead done:

In [23]: smoothed = DataFrame(ewma(df, span = 21).values, index=df.index, columns = ['smooth1','smooth2'])
In [24]: smoothed.head()
Out[24]: 
    smooth1   smooth2
0  0.218350  0.877693
1  0.400214  0.813499
2  0.308564  0.739426
3  0.433341  0.641891
4  0.525260  0.620541

that works without a problem. Of course,

smoothed = ewma(df, span=21)
smoothed.columns = ['smooth1', 'smooth2']

is perfectly fine too.
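
For readers on a recent pandas release, where the top-level ewma function is no longer available, the same behaviour can be reproduced with a minimal sketch like the one below. It assumes df.ewm(span=21).mean() as the replacement for ewma(df, span=21); the seeded random generator and the variable names selected, relabeled, and renamed are only illustrative and are not part of the original answer.

```python
import numpy as np
import pandas as pd

# Illustrative data: 100 rows, two columns labelled s1 and s2.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((100, 2)), index=np.arange(100), columns=["s1", "s2"])

# Assumed modern equivalent of ewma(df, span=21); the result keeps columns s1 and s2.
ewm = df.ewm(span=21).mean()

# Passing a DataFrame plus column labels selects by label.
# smooth1/smooth2 do not exist in ewm, so both columns are filled with NaN.
selected = pd.DataFrame(ewm, index=df.index, columns=["smooth1", "smooth2"])

# Passing the underlying array drops the old labels, so the new ones are simply attached.
relabeled = pd.DataFrame(ewm.to_numpy(), index=df.index, columns=["smooth1", "smooth2"])

# Renaming keeps the data and swaps the labels in one step.
renamed = ewm.rename(columns={"s1": "smooth1", "s2": "smooth2"})

print(selected.tail())
print(relabeled.tail())
print(renamed.tail())
```

Which of the two working forms to use is mostly a matter of taste; rename makes the old-to-new label mapping explicit, while the array form ignores whatever labels the input had.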

1 Comment

benjaminmgross Over a year ago

Wes, you're amazing. Thanks for building such an amazing piece of abstraction and thanks for such a prompt response!
