I have an object d connected to an HDF5 (h5) dataset:

>>> data = d[:, :, 0].astype(np.float32)
>>> data.shape
(17201, 10801)
>>> data[data==-32768] = data[data>0].min()
Traceback (most recent call last):
  File "<interactive input>", line 1, in <module>
MemoryError

Is there some other slicing trick I can use to avoid this error?

  • What are you trying to achieve with the last line? The left hand side evaluates to an array of negative values -32768 while the right hand side is the smallest positive number in the array. The result thus will be an array of False values (the length will be the number of -32768 values in the array due to numpy broadcasting). Maybe there is another way of achieving your goal? Commented Dec 17, 2012 at 14:37
  • @David: I'm normalizing all values equal to -32768 to the minimal value of the array greater than 0. If it's not obvious... Commented Dec 17, 2012 at 14:44
  • I'm sorry, I misread the = in the last line as a ==. Your code is perfectly clear and should not need more explanation. My bad! Apparently, you're running into memory bounds. Can you evaluate the expression data[data>0].min() individually to determine the minimal number? Commented Dec 17, 2012 at 14:50
  • How big is this data file? From my quick calculation, you're taking (17201*10801*32)/8/1024./1024./1024. ≈ 0.69 GiB of memory just to hold this array section as float32. Commented Dec 17, 2012 at 14:59
  • Also, just a word of caution here. You're using floating point numbers (np.float32) but checking equality with an integer: data==-32768. This can work, but you need to be extremely careful due to inaccuracies with floating point arithmetic (which you may already be aware of, in which case think of this as a warning to others who might stumble upon this post). Commented Dec 17, 2012 at 15:02
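
To put rough numbers on the two points raised in the comments, here is a small sketch that estimates the memory footprint of one 17201 x 10801 slice for a few dtypes and checks that -32768 survives the cast to float32 unchanged. The variable names are illustrative only, and the figures cover just the array itself; the real peak usage is higher because boolean masks and fancy-indexed copies are allocated on top of it.

```python
import numpy as np

rows, cols = 17201, 10801          # shape reported in the question
n = rows * cols                    # number of elements in one 2-D slice

# Bytes needed to hold the slice itself for each candidate dtype.
float32_bytes = n * np.dtype(np.float32).itemsize   # 4 bytes per element
int16_bytes = n * np.dtype(np.int16).itemsize       # 2 bytes per element
mask_bytes = n * np.dtype(np.bool_).itemsize        # 1 byte per element (e.g. data > 0)

for name, size in [("float32", float32_bytes),
                   ("int16", int16_bytes),
                   ("bool mask", mask_bytes)]:
    print(f"{name:>9}: {size / 2**20:.1f} MiB")

# -32768 is -(2**15), which float32 represents exactly, so the equality
# test against this particular value is safe after the cast.
exact = bool(np.float32(-32768) == -32768)
print("exact in float32:", exact)
```

The float32 slice alone is roughly 709 MiB, and each full-size boolean mask adds about 177 MiB more, which is plausibly enough to exhaust a memory-constrained process.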

1 Answer

OK, I'm answering this myself, since there is an acceptable solution, which I found after @mgilson questioned the data type.

If the data allows it, the memory error can be avoided by using a smaller data type while operating on the array. For the original question, this worked for me:

>>> data = d[:, :, 0].astype(np.short)
>>> data[data==-32768] = data[data>0].min()
>>> data = data.astype(np.float32)
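
If even the smaller dtype does not fit, another option is to process the slice in blocks of rows, so that the masks and temporary copies are only ever block-sized. This is a different technique from the one above, shown only as a minimal sketch. It assumes d supports NumPy-style slicing (as an h5py dataset does); replace_nodata, block_rows and d_demo are hypothetical names introduced for illustration.

```python
import numpy as np

def replace_nodata(src, nodata=-32768, block_rows=1024):
    """Hypothetical helper: return src[:, :, 0] as float32 with every
    `nodata` value replaced by the smallest positive value, reading
    `block_rows` rows at a time to keep temporaries small."""
    n_rows = src.shape[0]

    # Pass 1: find the smallest positive value block by block.
    min_pos = None
    for start in range(0, n_rows, block_rows):
        block = np.asarray(src[start:start + block_rows, :, 0])
        positive = block[block > 0]
        if positive.size:
            m = positive.min()
            min_pos = m if min_pos is None else min(min_pos, m)
    if min_pos is None:
        raise ValueError("no positive values found")

    # Pass 2: fill the float32 output, replacing nodata per block.
    out = np.empty(src.shape[:2], dtype=np.float32)
    for start in range(0, n_rows, block_rows):
        block = np.asarray(src[start:start + block_rows, :, 0])
        out_block = out[start:start + block_rows]   # view into out
        out_block[...] = block                      # cast happens on assignment
        out_block[block == nodata] = min_pos
    return out

# Tiny in-memory stand-in for the HDF5 dataset (assumption: same 3-D layout).
d_demo = np.array([[[5], [-32768]],
                   [[3], [-32768]],
                   [[-32768], [7]]], dtype=np.int16)
result = replace_nodata(d_demo, block_rows=2)
print(result)
```

The full float32 result still has to fit in memory; what this avoids is the extra full-size boolean masks and the fancy-indexed copy created by data[data>0].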