How can I allocate/store a single byte or a couple of bytes (e.g. 2 or 4) of information in Python?

I am not looking for an alternative to malloc/new in Python, but rather for some datatype that doesn't take a huge amount of memory.

I tried the following but, as shown below, all of them take a huge amount of memory.

Python 2.7.6 (default, Jun 22 2015, 17:58:13)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> i = 1 ; sys.getsizeof(i)
24
>>> i = None ; sys.getsizeof(i)
16
>>> i = 'c' ; sys.getsizeof(i)
38
>>> i = 'good' ; sys.getsizeof(i)
41
>>> i = bytearray(0) ; sys.getsizeof(i)
48
>>> i = bytearray(1) ; sys.getsizeof(i)
50
>>> from struct import *
>>> i = pack('h', 1) ; sys.getsizeof(i)
39
>>> from array import array
>>> i = array('l', [1]) ; sys.getsizeof(i)
64L

I love Python and am writing an application that will store some 100,000 firewall rules. Each rule will take some 500 bytes if I use Python's conventional datatypes (integer, string). I want to save space while also avoiding a switch to C/C++, as most of the rest of the application is in Python (2.7).

Also, I cannot persist the memory, as my application will check for updates or modifications of rules almost every 2 minutes.

My idea is to save memory by compressing the information. For example, instead of storing the 'direction' of a rule as 'input', 'output' or 'inout' in a string or integer, I would dedicate 2 or 3 bits to marking the direction. With that, I assume one rule's information can be saved in less than 10 bytes. For this, I want to know a method of storing only 2/4 bytes of information.
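The bit-packing idea described above could look roughly like the following sketch. The field layout (2 bits of direction, 1 allow/deny bit, 5 bits of protocol, a 16-bit port) and the names `pack_rule`, `unpack_rule` and `RULE_FORMAT` are assumptions made up for illustration; they are not taken from the question. Only the standard `struct` module is used.

```python
import struct

# Hypothetical layout (an assumption, not from the question):
#   flags byte: bits 0-1 direction, bit 2 allow/deny, bits 3-7 protocol
#   port:       unsigned 16-bit integer
DIRECTIONS = {"input": 0, "output": 1, "inout": 2}
DIRECTION_NAMES = {value: name for name, value in DIRECTIONS.items()}

# "<" selects standard sizes with no padding: 1 byte + 2 bytes = 3 bytes.
RULE_FORMAT = struct.Struct("<BH")


def pack_rule(direction, allow, protocol, port):
    """Pack one rule into RULE_FORMAT.size bytes."""
    flags = DIRECTIONS[direction] | (int(allow) << 2) | ((protocol & 0x1F) << 3)
    return RULE_FORMAT.pack(flags, port)


def unpack_rule(data):
    """Reverse pack_rule: return (direction, allow, protocol, port)."""
    flags, port = RULE_FORMAT.unpack(data)
    direction = DIRECTION_NAMES[flags & 0b11]
    allow = bool((flags >> 2) & 1)
    protocol = (flags >> 3) & 0x1F
    return direction, allow, protocol, port


packed = pack_rule("inout", True, 6, 443)
print(len(packed), unpack_rule(packed))
```

Note that the packed value is still a Python object, so keeping each rule as its own 3-byte string would again pay the per-object overhead seen in the measurements above. The payload itself, however, is only 3 bytes.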

Appreciate your feedback / suggestions / pointers.

  • 4
    Just making sure we're tackling the right problem, is your application made in such a way that 100k rules must be in memory at the same time? Commented Nov 1, 2016 at 18:06
  • 2
    Even if you did store them all at 500 bytes each, that'd only be 500 * 100000 / 1024^2 MB (47.7MB), which isn't an awful lot. A tab of Google Chrome uses over 100MB in some cases. Commented Nov 1, 2016 at 18:07
  • 4
    Don't make each rule its own object. A bytearray or array.array representing an array of rules will be much more efficient. Commented Nov 1, 2016 at 18:08
  • 1
    In Python, everything is an object -- which comes with some overhead. @user2357112's suggestion of using arrays is probably the best you can do. Commented Nov 1, 2016 at 18:11
  • 2
    @ViFI: Sometimes Stackoverflow users downvote questions in order to obtain their "critic" badge (which requires you to downvote some question). I suspect that the more negative votes a question already has accumulated, the more tempting it might appear to add another downvote to the lot... Commented Nov 1, 2016 at 18:19
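As a minimal sketch of the suggestion in the comments to keep all rules in one `bytearray` rather than one object per rule: the 3-byte record layout and the helper names `set_rule` and `get_rule` are hypothetical, chosen only for illustration. `struct.Struct.pack_into` and `unpack_from` read and write fixed-size records at a byte offset inside the shared buffer.

```python
import struct
import sys

# Hypothetical fixed-size record: 1 flags byte + 2-byte port, no padding.
RECORD = struct.Struct("<BH")
NUM_RULES = 100000

# One zero-filled buffer for every rule: object overhead is paid once.
table = bytearray(RECORD.size * NUM_RULES)


def set_rule(index, flags, port):
    """Write one record in place at the slot for `index`."""
    RECORD.pack_into(table, index * RECORD.size, flags, port)


def get_rule(index):
    """Read back the (flags, port) tuple stored at `index`."""
    return RECORD.unpack_from(table, index * RECORD.size)


set_rule(0, 0b10, 22)
set_rule(NUM_RULES - 1, 0b01, 443)

print(get_rule(0), get_rule(NUM_RULES - 1))
print("total bytes:", sys.getsizeof(table))
```

With this layout, updating a rule is an in-place write, so a periodic refresh does not need to rebuild the table. The trade-off is that every read goes through an unpack step.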

1 Answer

When measuring your sizes, you didn't separate the fixed per-object overhead from the size of the data stored. For example, the session below shows that a bytearray has about 48 bytes of overhead, after which each added byte costs about 1 byte. I presume the jumps from 50 to 53 to 56 bytes mean the buffer grows in small steps rather than one byte at a time.

>>> i = bytearray()
>>> sys.getsizeof(i)
48
>>> i = bytearray((1))  # (1) is the int 1, so this is bytearray(1): one zero byte
>>> sys.getsizeof(i)
50
>>> i = bytearray((1,2))
>>> sys.getsizeof(i)
53
>>> i = bytearray((1,2,3))
>>> sys.getsizeof(i)
53
>>> i = bytearray((1,2,3,4))
>>> sys.getsizeof(i)
53
>>> i = bytearray((1,2,3,4,5))
>>> sys.getsizeof(i)
56
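To make the overhead-versus-payload point concrete, here is a small sketch that compares one large buffer against many small objects. The figures of 100,000 rules and 4 bytes per rule are assumptions for illustration, and exact `sys.getsizeof` values differ between Python versions and platforms, so no specific byte counts are claimed here.

```python
import sys

N_RULES = 100000   # assumed rule count, taken from the question's estimate
RULE_SIZE = 4      # assumed bytes per packed rule, for illustration only

empty = sys.getsizeof(bytearray())
big = sys.getsizeof(bytearray(N_RULES))

# Marginal cost of each stored byte once the fixed overhead is subtracted.
per_byte = (big - empty) / float(N_RULES)

# All rules in a single buffer versus one bytearray object per rule.
one_buffer = sys.getsizeof(bytearray(N_RULES * RULE_SIZE))
separate = N_RULES * sys.getsizeof(bytearray(RULE_SIZE))

print("fixed overhead of an empty bytearray:", empty)
print("marginal cost per byte:", per_byte)
print("one shared buffer:", one_buffer)
print("one object per rule:", separate)
```

The fixed overhead is paid once per object, so it dominates when each object holds only a few bytes and becomes negligible when a single object holds all the data.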

2 Comments

I observed that but didn't know how to avoid this overhead. That could be another headline for the question, but I am afraid it would have attracted even more downvotes!
@ViFI Perhaps a new post with some sample data and the question "How to store this data most compactly?" Although you may end up trading compactness for the time required to unpack the data.
