There are numerous posts about numpy memory errors in google land, but I can't find one that resolves my issue. I'm running someone else's software using a high-end server with 256GB of RAM, 64-bit opensuse 13.1, 64-bit python, and 64-bit numpy (as far as I can tell). See below.
The original author is not available for help requests, so I did my best to determine the memory size for the object numpy is attempting to create. First, here is the stack trace:
File "/home/<me>/cmsRelease/trunk/Classes/DotData.py", line 193, in __new__
DataObj = numpy.rec.fromarrays(Columns,names = names)
File "/usr/lib64/python2.7/site-packages/numpy/core/records.py", line 562, in fromarrays
_array = recarray(shape, descr)
File "/usr/lib64/python2.7/site-packages/numpy/core/records.py", line 400, in __new__
self = ndarray.__new__(subtype, shape, (record, descr), order=order)
MemoryError
I used the following for loop to estimate the object size as best I know how:
size = 0
for i in Columns: # Columns is the list passed into numpy.rec.fromarrays
size += sys.getsizeof(i)
print "Columns size: " + str(size)
The result is Columns size: 12051648. Unless I'm mistaken, that's only about 12MB; either way, it's a far cry from 256GB.
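For numpy arrays, ndarray.nbytes reports the size of the underlying data buffer directly, which is usually a better estimate than sys.getsizeof (which, depending on the object, may only count the Python container overhead). A minimal sketch, using made-up arrays in place of the real Columns list from DotData.py:

```python
import numpy as np

# Hypothetical stand-in columns; the real Columns list is built by DotData.py.
columns = [np.zeros(1000, dtype=np.float64),
           np.zeros(1000, dtype=np.int32)]

# ndarray.nbytes reports the size of each array's data buffer:
# 1000 * 8 bytes + 1000 * 4 bytes = 12000 bytes in total here.
total_bytes = sum(col.nbytes for col in columns)
print("Columns size: " + str(total_bytes))
```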
Based on this information, I suspect there is a system limit (ulimit) preventing python from accessing the memory. Running ulimit -a reports the following (I set ulimit -s 256000000 before I run the program):
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 2065541
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 10000
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 256000000
cpu time (seconds, -t) unlimited
max user processes (-u) 2065541
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
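To rule out a per-process limit without relying on the shell, the limits can also be queried from inside the interpreter with the stdlib resource module (this just reads the same values ulimit shows; it does not change them):

```python
import resource

# Query per-process limits (Linux). RLIMIT_AS caps the total virtual
# address space, RLIMIT_DATA the data segment, RLIMIT_STACK the stack.
# RLIM_INFINITY means "unlimited". An address-space or data-segment cap
# can trigger MemoryError even when plenty of physical RAM is free.
for name in ("RLIMIT_AS", "RLIMIT_DATA", "RLIMIT_STACK"):
    soft, hard = resource.getrlimit(getattr(resource, name))
    print(name, "unlimited" if soft == resource.RLIM_INFINITY else soft)
```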
Questions:
- What am I missing?
- Did I not measure the Columns list object size correctly?
- Is there another system property I need to set?
I wish the MemoryError were more specific. I appreciate your help.
Supporting system information:
System memory:
> free -h
total used free shared buffers cached
Mem: 252G 1.6G 250G 4.2M 12M 98M
-/+ buffers/cache: 1.5G 250G
Swap: 2.0G 98M 1.9G
OS version:
> cat /etc/os-release
NAME=openSUSE
VERSION="13.1 (Bottle)"
VERSION_ID="13.1"
PRETTY_NAME="openSUSE 13.1 (Bottle) (x86_64)"
ID=opensuse
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:opensuse:opensuse:13.1"
BUG_REPORT_URL="https://bugs.opensuse.org"
HOME_URL="https://opensuse.org/"
ID_LIKE="suse"
Python version:
Python 2.7.6 (default, Nov 21 2013, 15:55:38) [GCC] on linux2
>>> import platform; platform.architecture()
('64bit', 'ELF')
Numpy version:
>>> numpy.version
<module 'numpy.version' from '/usr/lib64/python2.7/site-packages/numpy/version.pyc'>
>>> numpy.version.version
'1.7.1'
Comments:
- What is the shape argument passed to the __new__ call? sys.getsizeof only returns the size of the container, not the size of its contents, so it's hard to say where the overload is without knowing more about the nature of the data. If Columns is a list of numpy ndarrays, use Columns[n].nbytes to get the size of the nth column in bytes.
- (OP) I don't know the shape since it's part of numpy. Can you clarify? Debugging this software is not realistic. It uses four major languages (python, perl, java, and C). Yeah, really.
- numpy.ndarray and numpy.matrix objects have a .shape attribute, which is a tuple containing the number of elements in each dimension of the array/matrix. sys.getsizeof(<instance>) would be the correct way to get the size of a built-in scalar (int, float, str, etc.). I also asked him for the class of the scalar elements, since the corresponding numpy array itemsize will of course be smaller than the whole Python container.
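To sanity-check the allocation that numpy.rec.fromarrays actually attempts, the record dtype's itemsize times the number of rows gives the packed size of the resulting recarray. A small sketch with hypothetical columns (not the real Columns data):

```python
import numpy as np

# Hypothetical columns standing in for the real Columns list.
cols = [np.arange(5, dtype=np.float64), np.arange(5, dtype=np.int32)]
rec = np.rec.fromarrays(cols, names="a,b")

# fromarrays packs one record per row: itemsize = 8 (float64) + 4 (int32)
# = 12 bytes, so the whole recarray needs 5 * 12 = 60 bytes of data.
print(rec.dtype.itemsize, rec.nbytes)
```

Comparing this predicted size against the column shapes and itemsizes would show whether the recarray numpy tries to build is really as small as the 12MB estimate suggests.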