I have a pandas dataframe with two columns:

ddf.head()

    a    b
0   3136 13280
1   3072 13312
2   3152 13296
3   3120 13248
4   3120 13200

I would like to calculate the difference between consecutive elements within each column. Doing it one column at a time (ddf['a'].diff()) works as I expect, but calling ddf.diff() on the whole frame raises:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-68-6ff864856571> in <module>()
----> 1 ddf.diff()

/home/app/anaconda/lib/python2.7/site-packages/pandas/core/frame.pyc in diff(self, periods)
   4285         diffed : DataFrame
   4286         """
-> 4287         new_data = self._data.diff(periods)
   4288         return self._constructor(new_data)
   4289 

/home/app/anaconda/lib/python2.7/site-packages/pandas/core/internals.pyc in diff(self, *args, **kwargs)
   1287 
   1288     def diff(self, *args, **kwargs):
-> 1289         return self.apply('diff', *args, **kwargs)
   1290 
   1291     def interpolate(self, *args, **kwargs):

/home/app/anaconda/lib/python2.7/site-packages/pandas/core/internals.pyc in apply(self, f, *args, **kwargs)
   1267                 applied = f(blk, *args, **kwargs)
   1268             else:
-> 1269                 applied = getattr(blk,f)(*args, **kwargs)
   1270 
   1271             if isinstance(applied,list):

/home/app/anaconda/lib/python2.7/site-packages/pandas/core/internals.pyc in diff(self, n)
    423     def diff(self, n):
    424         """ return block for the diff of the values """
--> 425         new_values = com.diff(self.values, n, axis=1)
    426         return make_block(new_values, self.items, self.ref_items, fastpath=True)
    427 

/home/app/anaconda/lib/python2.7/site-packages/pandas/core/common.pyc in diff(arr, n, axis)
    643     if arr.ndim == 2 and arr.dtype.name in _diff_special:
    644         f = _diff_special[arr.dtype.name]
--> 645         f(arr, out_arr, n, axis)
    646     else:
    647         res_indexer = [slice(None)] * arr.ndim

/home/app/anaconda/lib/python2.7/site-packages/pandas/algos.so in pandas.algos.diff_2d_int16 (pandas/algos.c:91446)()

ValueError: Buffer dtype mismatch, expected 'float32_t' but got 'double'
  • How about df.apply(np.diff)? Commented Nov 12, 2013 at 21:15
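
For reference, a minimal sketch of the np.diff idea from the comment above. It calls np.diff on the underlying array instead of going through apply, and the sample values are copied from the question. Note that np.diff returns one row fewer than the input rather than a leading NaN row, so the result is re-indexed here by hand (that re-indexing is my own choice, not something from the original post).

```python
import numpy as np
import pandas as pd

# Sample values copied from the ddf.head() output in the question.
df = pd.DataFrame({
    "a": [3136, 3072, 3152, 3120, 3120],
    "b": [13280, 13312, 13296, 13248, 13200],
})

# axis=0 subtracts each row from the one below it, column by column.
# The result has len(df) - 1 rows; there is no NaN placeholder row.
deltas = np.diff(df.to_numpy(), axis=0)

# Re-attach labels, dropping the first index entry to match the shorter result.
result = pd.DataFrame(deltas, columns=df.columns, index=df.index[1:])
print(result)
```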

1 Answer

You can use this:

>>> df - df.shift(1)
    a   b
0 NaN NaN
1 -64  32
2  80 -16
3 -32 -48
4   0 -48

But actually, on my machine, df.diff() works fine:

>>> df.diff()
    a   b
0 NaN NaN
1 -64  32
2  80 -16
3 -32 -48
4   0 -48
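
To reproduce both approaches side by side, here is a small self-contained sketch. The frame is rebuilt from the five rows shown in the question; everything else uses only shift and diff as above.

```python
import pandas as pd

# Sample values copied from the ddf.head() output in the question.
df = pd.DataFrame({
    "a": [3136, 3072, 3152, 3120, 3120],
    "b": [13280, 13312, 13296, 13248, 13200],
})

# Manual version: subtract the frame shifted down by one row.
# Row 0 has no predecessor, so it becomes NaN.
by_shift = df - df.shift(1)

# Built-in version with the default periods=1.
by_diff = df.diff()

print(by_shift)
print(by_diff)
```

The shift form is handy when diff() misbehaves for a particular dtype, since it only relies on ordinary element-wise subtraction.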

1 Comment

You are right. I did df.astype(float32).diff() and it worked. Thanks
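
A sketch of the cast-then-diff workaround from this comment. The traceback in the question ends in diff_2d_int16, which suggests the columns were int16, but that is an inference on my part, so the dtype below is an assumption. np.float32 is spelled out explicitly because a bare float32 only works if it has been imported into the namespace.

```python
import numpy as np
import pandas as pd

# Assumption: the original columns were int16 (inferred from the
# diff_2d_int16 frame in the traceback, not stated in the question).
ddf = pd.DataFrame(
    {
        "a": [3136, 3072, 3152, 3120, 3120],
        "b": [13280, 13312, 13296, 13248, 13200],
    },
    dtype="int16",
)

# Cast to a float dtype first, then take consecutive differences.
diffed = ddf.astype(np.float32).diff()
print(diffed)
```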
