
There are numerous posts about numpy memory errors in Google land, but I can't find one that resolves my issue. I'm running someone else's software on a high-end server with 256 GB of RAM, 64-bit openSUSE 13.1, 64-bit Python, and 64-bit numpy (as far as I can tell). See below.

The original author is not available for help requests, so I did my best to determine the memory size of the object numpy is attempting to create. First, here is the stack trace:

File "/home/<me>/cmsRelease/trunk/Classes/DotData.py", line 193, in __new__
  DataObj = numpy.rec.fromarrays(Columns,names = names)
File "/usr/lib64/python2.7/site-packages/numpy/core/records.py", line 562, in fromarrays
  _array = recarray(shape, descr)
File "/usr/lib64/python2.7/site-packages/numpy/core/records.py", line 400, in __new__
  self = ndarray.__new__(subtype, shape, (record, descr), order=order)
MemoryError
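
The failing line is the recarray allocation itself, which creates one contiguous block of roughly n_rows × record-itemsize bytes, the record itemsize being the sum of the column itemsizes. That block can be far larger than the Python containers the columns came from, for example when a string column is promoted to a wide fixed-width dtype. A minimal sketch of the effect, using made-up toy columns:

import numpy as np

# Toy illustration: the record array allocates n_rows * sum(column itemsizes)
# bytes, regardless of how small the source Python containers looked.
a = np.zeros(3, dtype='i8')           # 8 bytes per row
b = np.array(['x', 'y', 'a' * 100])   # promoted to a 100-character fixed-width dtype
rec = np.rec.fromarrays([a, b], names='a,b')
print rec.dtype.itemsize              # 108: 8 + 100 bytes per row
print rec.nbytes                      # 324: 3 rows * 108 bytes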

I used the following for loop to estimate the object size as best I know how:

import sys

size = 0
for i in Columns:  # Columns is the list passed into numpy.rec.fromarrays
    size += sys.getsizeof(i)
print "Columns size: " + str(size)

The result is Columns size: 12051648. Unless I'm mistaken, that's only about 12 MB; either way, it's a far cry from 256 GB.
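
For reference, a more direct estimate would convert each column the way fromarrays does and sum nbytes; a sketch, assuming every column converts cleanly with np.asarray:

import numpy as np

size = 0
for col in Columns:            # the same list passed into numpy.rec.fromarrays
    arr = np.asarray(col)      # fromarrays converts each column to an ndarray
    size += arr.nbytes         # bytes of actual array data, not just the container
print "Columns size (as arrays): " + str(size)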

Based on this information, I suspect there is a system limit (ulimit) preventing Python from accessing the memory. Running ulimit -a reports the following (I set ulimit -s 256000000 before I run the program); a quick way to double-check these limits from inside Python is sketched after the listing:

core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 2065541
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 10000
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 256000000
cpu time               (seconds, -t) unlimited
max user processes              (-u) 2065541
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
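
Here is that double-check: a minimal sketch using the standard resource module to read the limits the running Python process actually sees (-1 means unlimited):

import resource

# Print the limits as seen by the running Python process; -1 means unlimited.
for name in ('RLIMIT_AS', 'RLIMIT_DATA', 'RLIMIT_STACK'):
    soft, hard = resource.getrlimit(getattr(resource, name))
    print "%s soft/hard: %s / %s" % (name, soft, hard)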

Questions:

  1. What am I missing?
  2. Did I not measure the Columns list object size correctly?
  3. Is there another system property I need to set?

I wish the MemoryError were more specific. I appreciate your help.

Supporting system information:

System memory:

> free -h
             total       used       free     shared    buffers     cached
Mem:          252G       1.6G       250G       4.2M        12M        98M
-/+ buffers/cache:       1.5G       250G
Swap:         2.0G        98M       1.9G

OS version:

> cat /etc/os-release
NAME=openSUSE
VERSION="13.1 (Bottle)"
VERSION_ID="13.1"
PRETTY_NAME="openSUSE 13.1 (Bottle) (x86_64)"
ID=opensuse
ANSI_COLOR="0;32"
CPE_NAME="cpe:/o:opensuse:opensuse:13.1"
BUG_REPORT_URL="https://bugs.opensuse.org"
HOME_URL="https://opensuse.org/"
ID_LIKE="suse"

Python version:

Python 2.7.6 (default, Nov 21 2013, 15:55:38) [GCC] on linux2
>>> import platform; platform.architecture()
('64bit', 'ELF')

Numpy version:

>>> numpy.version
<module 'numpy.version' from '/usr/lib64/python2.7/site-packages/numpy/version.pyc'>
>>> numpy.version.version
'1.7.1'
  • What is the value of the shape argument passed to the __new__ call? sys.getsizeof only returns the size of the container, not the size of its contents, so it's hard to say where the overhead is without knowing more about the nature of the data. Commented Aug 2, 2014 at 19:32
  • Assuming that Columns is a list of numpy ndarrays, use Columns[n].nbytes to get the size of the nth column in bytes. Commented Aug 2, 2014 at 21:10
  • I apologize, but I don't know how to get the value of shape since it's part of numpy. Can you clarify? Debugging this software is not realistic. It uses four major languages (python, perl, java, and C). Yeah, really. Commented Aug 2, 2014 at 21:11
  • @zugzug numpy.ndarray and numpy.matrix objects have a .shape attribute, which is a tuple containing the number of elements in each dimension of the array/matrix. Commented Aug 2, 2014 at 21:12
  • @JoeKington Based on the OP's description, it seems that he has nested lists containing scalar values, not numpy arrays. As far as I'm aware, sys.getsizeof(<instance>) would be the correct way to get the size of a built-in scalar (int, float, double, str, etc.). I also asked him for the class of the scalar elements, since the corresponding numpy array itemsize will of course be smaller than the whole Python container. Commented Aug 3, 2014 at 16:37
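
To make the first commenter's point concrete, a small sketch showing that sys.getsizeof on a list reports only the pointer array, not the elements it references:

import sys

lst = range(1000000)                      # Python 2: a real list of one million ints
print sys.getsizeof(lst)                  # ~8 MB: just the internal pointer array
print sum(sys.getsizeof(x) for x in lst)  # ~24 MB more held in the int objects themselves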

1 Answer


Really embarrassing. I really was running out of memory. I started top and watched all 256 GB get consumed. Why I never checked that during all my investigation is a mystery even to me. My apologies for overlooking the obvious.
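
For anyone else who lands here: watching the process from inside the script would have caught this right away. A Linux-only sketch that reads the resident set size from /proc (the helper name is mine):

def rss_mb():
    # Linux-only: resident set size of the current process, in megabytes
    with open('/proc/self/status') as f:
        for line in f:
            if line.startswith('VmRSS:'):
                return int(line.split()[1]) / 1024.0  # VmRSS is reported in kB
    return None

print "RSS: %.1f MB" % rss_mb()

Printing that before and after the failing call shows exactly where the memory goes.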


1 Comment

Running out of memory, but why?
