@mikelkelle

The ObjectVector class resizes its array without resetting its capacity count, so subsequent appends write to invalid memory.

Mac OS 10.9, Python 2.7.6, numpy 1.9.0.dev-ee49411.

==57654== Invalid write of size 8
==57654==    at 0x137F856: __pyx_f_6pandas_9hashtable_12ObjectVector_append (in /Users/mtk/Projects/pandas/pandas/hashtable.so)
==57654==    by 0x139B16F: __pyx_pw_6pandas_9hashtable_17PyObjectHashTable_25get_labels (in /Users/mtk/Projects/pandas/pandas/hashtable.so)
==57654==    by 0x138CA9E: __pyx_pw_6pandas_9hashtable_10Factorizer_5factorize (in /Users/mtk/Projects/pandas/pandas/hashtable.so)
==57654==    by 0xD227E: PyEval_EvalFrameEx (in /usr/local/anaconda/lib/libpython2.7.dylib)

==57654==  Address 0x10095efd0 is 16 bytes inside a block of size 256 free'd
==57654==    at 0x7858: realloc (in /usr/local/Cellar/valgrind/3.9.0/lib/valgrind/vgpreload_memcheck-amd64-darwin.so)
==57654==    by 0x13C0F55: PyDataMem_RENEW (in /usr/local/anaconda/lib/python2.7/site-packages/numpy/core/multiarray.so)
==57654==    by 0x1488DE7: PyArray_Resize (in /usr/local/anaconda/lib/python2.7/site-packages/numpy/core/multiarray.so)
==57654==    by 0x14647A0: array_resize (in /usr/local/anaconda/lib/python2.7/site-packages/numpy/core/multiarray.so)
==57654==    by 0x1396A12: __pyx_pw_6pandas_9hashtable_12ObjectVector_5to_array (in /Users/mtk/Projects/pandas/pandas/hashtable.so)
==57654==    by 0x138CFEE: __pyx_pw_6pandas_9hashtable_10Factorizer_5factorize (in /Users/mtk/Projects/pandas/pandas/hashtable.so)
==57654==    by 0xD227E: PyEval_EvalFrameEx (in /usr/local/anaconda/lib/libpython2.7.dylib)
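
For readers who want the failure mode without reading the Cython, here is a minimal pure-Python sketch. `GrowableVector`, `INIT_CAP`, and the `reset_capacity` flag are hypothetical names for illustration, not the pandas implementation; an `IndexError` stands in for the invalid write reported above.

```python
class GrowableVector:
    """Toy stand-in for the vector described above (hypothetical, not pandas code)."""

    INIT_CAP = 4  # assumed initial capacity, chosen only for illustration

    def __init__(self, reset_capacity):
        self.buf = [None] * self.INIT_CAP  # backing storage
        self.n = 0                         # number of slots in use
        self.m = self.INIT_CAP             # recorded capacity
        self.reset_capacity = reset_capacity

    def append(self, value):
        if self.n == self.m:
            # Grow only when the recorded capacity says the buffer is full.
            self.m *= 2
            self.buf.extend([None] * (self.m - len(self.buf)))
        # If self.m is stale, this index is past the end of the buffer.
        self.buf[self.n] = value
        self.n += 1

    def to_array(self):
        # Shrink the backing storage to the used length, as a resize would.
        del self.buf[self.n:]
        if self.reset_capacity:
            self.m = self.n  # keep the recorded capacity in sync with the buffer
        return self.buf


fixed = GrowableVector(reset_capacity=True)
fixed.append("a")
fixed.to_array()
fixed.append("b")        # capacity was reset, so the buffer grows first

stale = GrowableVector(reset_capacity=False)
stale.append("a")
stale.to_array()
try:
    stale.append("b")    # capacity still reads 4, buffer holds 1 slot
except IndexError:
    print("append after to_array wrote out of bounds")
```

In C the out-of-bounds store does not raise; it lands in memory that `realloc` may already have released, which is what the trace above shows.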

@jreback
Contributor

jreback commented May 17, 2014

Can you construct a test that fails without the fix?

This doesn't fail with numpy 1.9 AFAICT (on 64-bit Linux),

as we test this in Travis.

@mikelkelle
Author

Added a test case that also crashed on Linux.

@jreback
Contributor

jreback commented May 17, 2014

Great.

Can you add more dtypes to your test,

e.g. loop through object, int64, float64,

and add a release note (0.14.0)?

thanks

@jreback added this to the 0.14.0 milestone May 17, 2014
@mikelkelle
Author

Done. Also needed to handle the special case where a vector is resized to 0.
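
One way the zero-size case can bite, sketched in Python under the assumption that the vector grows by doubling: a capacity of 0 doubles to 0, so the vector never grows again. `next_capacity` and `INIT_CAP` are hypothetical names, and the value 32 is arbitrary; the actual change in this PR may differ.

```python
INIT_CAP = 32  # hypothetical floor for the capacity, not taken from the PR


def next_capacity(current, init_cap=INIT_CAP):
    """Capacity to grow to when a vector with capacity `current` is full."""
    # Plain doubling is stuck at 0 forever, so clamp to a minimum capacity.
    return max(current * 2, init_cap)


print(0 * 2)              # naive doubling from an empty vector: still 0
print(next_capacity(0))   # clamped growth recovers from an empty vector
print(next_capacity(32))  # ordinary doubling once above the floor
```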

@jreback
Contributor

jreback commented May 18, 2014

looks good
ping when green

@jreback merged commit eb97319 into pandas-dev:master May 18, 2014
@jreback
Contributor

jreback commented May 18, 2014

thanks!

Labels

Algos, Bug