I have a small dataframe, say this one:

    Mass32      Mass44  
12  0.576703    0.496159
13  0.576658    0.495832
14  0.576703    0.495398    
15  0.576587    0.494786
16  0.576616    0.494473
...

I would like to have a rolling mean of column Mass32, so I do this:

x['Mass32s'] = pandas.rolling_mean(x.Mass32, 5).shift(-2)

It works, in that I get a new column named Mass32s containing what I expect, but I also get this warning message:

A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead

See the the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy

I'm wondering if there's a better way to do it, notably to avoid getting this warning message.

  • I don't get the warning message when I run your sample code. Can you check whether earlier in your code you've set x as a copy of a data frame, something like x = x[x.Mass32.notnull()]? Commented Jul 17, 2015 at 6:30
  • There were a couple of NaNs in the dataframe that were apparently tripping me up here. Fixing them with fillna(0) and .loc solved it. Thanks. Commented Jul 20, 2015 at 5:01

1 Answer

This warning appears because your dataframe x is a copy of a slice. It is not always easy to tell why, but it comes from how x was created earlier in your code, for example by filtering or slicing another dataframe.
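As a rough illustration of how x can end up as a copy of a slice, the sketch below filters a parent frame the way the comment on the question suggests. The name df and the missing value at index 14 are assumptions made for the example; the other numbers come from the question.

```python
import pandas as pd

# Mass32 values from the question; the value at index 14 is replaced
# by a hypothetical missing value so the filter has something to drop.
df = pd.DataFrame(
    {"Mass32": [0.576703, 0.576658, None, 0.576587, 0.576616]},
    index=[12, 13, 14, 15, 16],
)

# Boolean filtering returns a new object derived from df, not df itself.
x = df[df["Mass32"].notnull()]

print(x is df)        # False
print(list(x.index))  # [12, 13, 15, 16]
```

Assigning a new column to an x obtained this way is the kind of operation the warning is about, since pandas cannot tell whether you meant to modify df as well.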

You can either create a proper dataframe out of x by doing

x = x.copy()

This will remove the warning, but it is not the proper way.
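A minimal sketch of the copy approach, assuming x was obtained by filtering a parent frame (the name df and the filter condition are hypothetical). Newer pandas versions no longer provide pandas.rolling_mean, so the sketch uses the Series.rolling(...).mean() form instead.

```python
import pandas as pd

# Sample values from the question.
df = pd.DataFrame(
    {
        "Mass32": [0.576703, 0.576658, 0.576703, 0.576587, 0.576616],
        "Mass44": [0.496159, 0.495832, 0.495398, 0.494786, 0.494473],
    },
    index=range(12, 17),
)

# Hypothetical filter; .copy() makes x an independent dataframe.
x = df[df["Mass32"] > 0.5].copy()

# Rolling mean over 5 rows, shifted up by 2 so the result sits mid-window.
x["Mass32s"] = x["Mass32"].rolling(5).mean().shift(-2)

# The parent frame is left untouched.
print("Mass32s" in df.columns)  # False
```

With only five rows, the single complete window lands on index 14 after the shift; every other row of Mass32s is NaN.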

You should be using the DataFrame.loc method, as the warning suggests, like this:

x.loc[:,'Mass32s'] = pandas.rolling_mean(x.Mass32, 5).shift(-2)
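For newer pandas versions, where pandas.rolling_mean is no longer available, the same .loc assignment can be sketched with Series.rolling. This is an illustration on the question's sample values, not a guarantee about which pandas versions emit the warning. The center=True variant is shown as an assumed equivalent of a 5-row window followed by shift(-2).

```python
import pandas as pd

# Mass32 values from the question.
x = pd.DataFrame(
    {"Mass32": [0.576703, 0.576658, 0.576703, 0.576587, 0.576616]},
    index=range(12, 17),
)

# Same assignment as above, written with the rolling() method.
x.loc[:, "Mass32s"] = x["Mass32"].rolling(5).mean().shift(-2)

# A centered window of 5 places each mean on the middle row directly,
# which matches the trailing window shifted up by 2.
centered = x["Mass32"].rolling(5, center=True).mean()

print(x["Mass32s"].notna().sum())  # 1
```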

5 Comments

There were a couple of NaNs in the dataframe that were apparently tripping me up here. Fixing them with fillna(0) and .loc solved it. Thanks.
The first method sometimes works... the second method gets me the following error: indexer = self._get_setitem_indexer(key) , in _get_setitem_indexer raise IndexingError(key) IndexingError: (slice(None, None, None), 'Mass32s')
@Lcat Sounds like you have None values in the index of your dataframe. Can you post a new question with some example data and pass it on to me?
Why is this method better?
@mxbi Copying the dataframe makes a copy and thus doubles the memory used. Even if you overwrite the variable, as in my example x = x.copy(), there will be a spike in memory usage.
