I use instances of a custom class (similar to a 2D vector, with a bunch of extra state and methods) as dict keys.
It defines custom __hash__ and __eq__ magic methods that basically make an instance equal to a tuple of its initialization data.

The data set I'm dealing with is so big that memory (RAM) is a main concern and I need multiple different data structures with the same custom object instances as keys.

I want access to the actual reference of the dict keys.

If I can obtain the existing dict key from a tuple of the initialization data, I can avoid creating separate custom class instances with the same internal data in different data structures and reuse the same instance instead.

Is that possible? And if it is, how?

Example:

dict1 = {}
dict2 = {}

One code segment:

v = MyVect(1,5,"data",True)
dict1[v] = ("important", "data")

Second segment:
(this part has only access to the data that was used to create MyVect but no actual reference.)

keydata_without_reference = (1,5,"data",True)
mykey = dict1.getkeyref(keydata_without_reference) # getkeyref somehow
dict2[mykey] = "some other data"

As a result, I would save almost half the memory.
This is just to set up the initial data structures that the program uses later.

5 Comments
  • Your occasional change in terms is a bit confusing, but the code helps a lot. For instance, there is no such thing as a "2D vector" (that's a matrix), and your code uses only simple 4-tuples. Commented Jun 16, 2016 at 18:44
  • Now ... what stops you from using the same vector to key each dict (data structure)? Python dictionaries do their own hashing for you. Commented Jun 16, 2016 at 18:45
  • I don't know how to phrase it better, but it seems you don't understand what I am trying to explain. By "vector" I don't mean a C++ vector, which more or less translates to a Python list (a dynamic array), but a math vector as used in games etc. It's an object holding the data that points to a unique point in a simulated environment. So of course it is just a bunch of data you could also represent as a tuple, but a tuple can't represent state with methods etc. Commented Jun 16, 2016 at 19:21
  • Another thing it seems I have to clarify: the second part of the example code has no access to the "v" variable holding the object instance. It only has access to the data needed to create an instance of MyVect that matches the instance it needs. Commented Jun 16, 2016 at 19:28
  • Thanks; that explains what I missed. In that case, I think that @jstlaurent covered what you need. Commented Jun 16, 2016 at 19:59

1 Answer

What you want to do, essentially, is to control the creation of MyVect instances so that, for a given set of initialization data, only one instance of MyVect is created.

I would suggest using a Factory method pattern, implemented as a static method on the MyVect class, that will keep track of all the instances of the class that have been created.

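class MyVect(object):

    # Cache of every instance created so far, keyed by the initialization arguments
    instances = {}

    @staticmethod
    def get_instance(*args):
        # Return the cached instance for these arguments, creating it on first use
        instance = MyVect.instances.get(args)
        if instance is None:
            MyVect.instances[args] = instance = MyVect(*args)
        return instance

    def __init__(self, *args):
        # Memory intensive initialization here
        self.args = args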

I would recommend matching the factory method signature to that of the class constructor. I'm also using a simple dict as a cache, keyed with the initial arguments, but you can tweak this depending on your performance requirements to something more appropriate.
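One possible tweak, sketched here as an assumption and not something the original code does, is to make the cache hold weak references via weakref.WeakValueDictionary. The cache then stops keeping an instance alive once no dict or other structure references it. The `self.args` attribute is only a stand-in for the real initialization, and the arguments must be hashable because they are used as the cache key.

```python
import gc
import weakref


class MyVect:
    # Weak cache: an entry disappears once nothing else references the instance.
    instances = weakref.WeakValueDictionary()

    @staticmethod
    def get_instance(*args):
        instance = MyVect.instances.get(args)
        if instance is None:
            instance = MyVect(*args)
            MyVect.instances[args] = instance
        return instance

    def __init__(self, *args):
        self.args = args  # stand-in for the real initialization


a = MyVect.get_instance(1, 5, "data", True)
b = MyVect.get_instance(1, 5, "data", True)
same_object = a is b                      # both names refer to one instance
cached_while_alive = len(MyVect.instances)

del a, b          # drop the last strong references
gc.collect()      # make sure the instance has been collected
cached_after_release = len(MyVect.instances)
```

As long as an instance is used as a key in one of your dicts, it stays strongly referenced and therefore stays in the cache.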

When you need to create a new key to access your data, you can use the factory method to get the MyVect instance back.

keydata_without_reference = (1,5,"data",True)
mykey = MyVect.get_instance(*keydata_without_reference)
dict2[mykey] = "some other data"
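Putting the pieces together, a minimal end-to-end sketch might look like the following. The attribute name `key` and the `__hash__`/`__eq__` bodies are my assumptions about how your class behaves, based on your description that an instance is equal to a tuple of its initialization data.

```python
class MyVect:
    instances = {}

    @staticmethod
    def get_instance(*args):
        instance = MyVect.instances.get(args)
        if instance is None:
            MyVect.instances[args] = instance = MyVect(*args)
        return instance

    def __init__(self, x, y, label, flag):
        self.key = (x, y, label, flag)  # assumed internal representation

    def __hash__(self):
        # Same hash as the tuple of initialization data
        return hash(self.key)

    def __eq__(self, other):
        if isinstance(other, MyVect):
            return self.key == other.key
        if isinstance(other, tuple):
            return self.key == other
        return NotImplemented


dict1 = {}
dict2 = {}

# First segment: create the key through the factory
v = MyVect.get_instance(1, 5, "data", True)
dict1[v] = ("important", "data")

# Second segment: only the initialization data is known
keydata_without_reference = (1, 5, "data", True)
mykey = MyVect.get_instance(*keydata_without_reference)
dict2[mykey] = "some other data"

# Both dicts now hold the very same object as their key
shared = next(iter(dict1)) is next(iter(dict2))
```

Note that this only works if the first segment also goes through get_instance; an instance created with a direct MyVect(...) call is never registered in the cache.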

In languages with access control to methods and attributes (like C++, Java, etc.), you would set the class constructor to private, which would force the calling code to use the factory method to obtain instances and avoid any issues. In Python, that is not possible, so you will have to be careful not to call the constructor directly.
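If changing every call site is a concern, one alternative worth considering is to move the cache lookup into `__new__`, so that an ordinary MyVect(...) call returns the cached instance. This is a sketch under assumptions, not a drop-in replacement: the names `_instances` and `key` are hypothetical, it handles positional hashable arguments only, and the one-time initialization has to live in `__new__` because Python calls `__init__` again on every MyVect(...) call, even when `__new__` returned an existing object.

```python
class MyVect:
    _instances = {}  # hypothetical cache name

    def __new__(cls, *args):
        instance = cls._instances.get(args)
        if instance is None:
            instance = super().__new__(cls)
            instance.key = args  # one-time initialization happens here
            cls._instances[args] = instance
        return instance

    def __init__(self, *args):
        # Runs on every MyVect(...) call, including cache hits,
        # so it deliberately does nothing.
        pass

    def __hash__(self):
        return hash(self.key)

    def __eq__(self, other):
        other_key = other.key if isinstance(other, MyVect) else other
        return self.key == other_key


v = MyVect(1, 5, "data", True)
mykey = MyVect(*(1, 5, "data", True))  # built from the raw data only
reused = mykey is v
```

With this variant the existing code that calls the constructor directly keeps working and silently shares instances.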

2 Comments

Thank you, that's a solution, but it would require a LOT of refactoring because it's a working project that just unexpectedly outgrew the limitations of the initial implementation. By the way, a constructor that is merely documented as private should not be a problem because of the small team size. I think you somewhat misunderstood my question, but your answer is correct nonetheless xD. There are just very many MyVect instances, not just a few with massive data in them. I will wait a day or two before accepting, because I still hope for a solution that is easier to refactor.
You're right, I'm not sure I understand what you are trying to do. You want to have fewer instances of MyVect, correct? Do you have many of them because you have several instances for a given set of initialization data, or because you have an equally large number of initialization data sets?
