90

I have a large list like:

[A][B1][C1]=1
[A][B1][C2]=2
[A][B2]=3
[D][E][F][G]=4

I want to build a multi-level dict like:

A
--B1
-----C1=1
-----C2=1
--B2=3
D
--E
----F
------G=4

I know that if I use a recursive defaultdict I can write table[A][B1][C1]=1, table[A][B2]=3, but this works only if I hardcode those insert statements.

While parsing the list, I don't know beforehand how many []'s I need in order to call table[key1][key2][...].

2

9 Answers

228

You can do it without even defining a class:

from collections import defaultdict

nested_dict = lambda: defaultdict(nested_dict)
nest = nested_dict()

nest[0][1][2][3][4][5] = 6
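
To load the question's list without knowing the depth ahead of time, one option is to walk all keys but the last in a loop and assign at the final key. This is only a minimal sketch, assuming every entry looks like `[A][B1][C1]=1` with an integer value; the `insert_path` helper and the regular expression are illustrative names, not part of the answer above.

```python
import re
from collections import defaultdict

def nested_dict():
    return defaultdict(nested_dict)

def insert_path(table, keys, value):
    """Assign value at table[keys[0]][keys[1]]...[keys[-1]] (hypothetical helper)."""
    node = table
    for key in keys[:-1]:
        node = node[key]       # missing levels are created on access
    node[keys[-1]] = value

lines = ["[A][B1][C1]=1", "[A][B1][C2]=2", "[A][B2]=3", "[D][E][F][G]=4"]

table = nested_dict()
for line in lines:
    path, _, raw_value = line.partition("=")
    keys = re.findall(r"\[([^\]]+)\]", path)   # "[A][B1]" -> ["A", "B1"]
    insert_path(table, keys, int(raw_value))
```

Because only the last key is assigned, the number of brackets per line never has to be known in advance.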

8 Comments

This is sweet! But what if I want leaves to initialize via a standard (int, list, etc.) factory? E.g., I want to be able to say: table[0][1][2][3][4][5] += 1
Is there a way to do the same with a built-in dict and .get()?
class l(dict): __missing__=lambda a,b:a.setdefault(b,l()) and then continue from table=l()
PyCharm says it violates PEP 8: "do not assign a lambda expression use a def". Any ways to do this with a function to get rid of the warning?
def nested_dict(): return defaultdict(nested_dict), but I like the lambda version better. It looks a bit more cryptic ;-)
20

Your example says that at any level there can be a value, and also a dictionary of sub-elements. That is called a tree, and there are many implementations available for them. This is one:

from collections import defaultdict
class Tree(defaultdict):
    def __init__(self, value=None):
        super(Tree, self).__init__(Tree)
        self.value = value

root = Tree()
root.value = 1
root['a']['b'].value = 3
print(root.value)
print(root['a']['b'].value)
print(root['c']['d']['f'].value)

Outputs:

1
3
None

You could do something similar by writing the input in JSON and using json.load to read it as a structure of nested dictionaries.
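
As a rough illustration of the JSON route, the question's data could be written as nested objects and parsed into plain dictionaries. This sketch uses `json.loads` on a string rather than `json.load` on a file, and the JSON text itself is an assumed encoding of the question's data.

```python
import json

# Assumed JSON encoding of the list from the question.
text = '{"A": {"B1": {"C1": 1, "C2": 2}, "B2": 3}, "D": {"E": {"F": {"G": 4}}}}'

table = json.loads(text)   # nested plain dicts
print(table["A"]["B1"]["C2"])
```

Note that the result consists of ordinary dicts, so missing keys still raise KeyError instead of being created on access.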

4 Comments

I think the value construct is unnecessary, at least with respect to the proposed problem. Just remove references to value and assign values directly to dictionary keys.
+1: Although the value arg/attribute isn't really necessary.
@Martineau @Jason. The value instance variable is necessary because otherwise you'd lose the children when you assign directly to a node (see my comment to Jason's elegant solution). Intercepting __setitem__ would provide for a much more robust solution, but it would be too complicated a solution for the simple requirements.
I was unclear how to modify the other answers so that the collection property was a list instead of an int/float. This answer makes it clear, where self.value = [] was exactly what I was looking for!
11

I'd do it with a subclass of dict that defines __missing__:

>>> class NestedDict(dict):
...     def __missing__(self, key):
...         self[key] = NestedDict()
...         return self[key]
...
>>> table = NestedDict()
>>> table['A']['B1']['C1'] = 1
>>> table
{'A': {'B1': {'C1': 1}}}

A bare defaultdict(...) call can't name itself as its own factory: defaultdict expects the factory function at initialization time, and at that point there is nothing that describes "the same kind of defaultdict" unless you give it a name. The above construct does the same thing that defaultdict does, but since it's a named class (NestedDict), it can reference itself as missing keys are encountered. It is also possible to subclass defaultdict and override __init__.
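
The defaultdict-subclass variant mentioned in the last sentence might look roughly like this. It is a sketch only; the class name `NestedDefaultDict` is made up for illustration.

```python
from collections import defaultdict

class NestedDefaultDict(defaultdict):
    def __init__(self, *args, **kwargs):
        # Use the class itself as the factory, so every missing key
        # produces another NestedDefaultDict.
        super().__init__(NestedDefaultDict, *args, **kwargs)

table = NestedDefaultDict()
table['A']['B1']['C1'] = 1
```

Any extra positional or keyword arguments are passed through to defaultdict, so `NestedDefaultDict({'x': 1})` starts from existing data.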

5 Comments

That's not enough. You'll get an error if you try table['A']['B1']['C1']['D2'] = 2. The nodes must be able to hold a value and the children.
@Apalala: Actually, from the OP's example input, it appears that nodes only need to be able to hold a value or children, never both -- which is why @Jason and I claimed your answer's value attribute was unnecessary.
@martineau My humble opinion is that it all becomes unstable (bug-prone) unless it is solved as a tree. Syntax and implementation are irrelevant. Is it, or is it not, a problem that requires a tree structure? My point is that one should not force a design towards a pretty syntax unless there are compelling reasons to do it. KISS.
@Apalala I know this is old, but how do we implement a defaultdict that holds both values and children?
@HalcyonAbrahamRamirez Look at Apalala's answer in this same question.
6

This is equivalent to the above, but avoids lambda notation. Perhaps easier to read?

from collections import defaultdict

def dict_factory():
    return defaultdict(dict_factory)

your_dict = dict_factory()

Also, from the comments: if you'd like to update from an existing dict, you can simply call

your_dict[0][1][2].update({"some_key":"some_value"})

to add values to the dict.

1 Comment

This solution does not offer the ability to pass an initial value. I think Dan O'Huiginn's solution (via Dvd Avins' post) is slightly better for this reason.
4

Dan O'Huiginn posted a very nice solution on his journal in 2010:

http://ohuiginn.net/mt/2010/07/nested_dictionaries_in_python.html

>>> class NestedDict(dict):
...     def __getitem__(self, key):
...         if key in self: return self.get(key)
...         return self.setdefault(key, NestedDict())


>>> eggs = NestedDict()
>>> eggs[1][2][3][4][5]
{}
>>> eggs
{1: {2: {3: {4: {5: {}}}}}}

2 Comments

I find this approach nice when I want to create a nested dictionary quickly. If I want to "re-enable" KeyError, it's easy to convert back to a standard dictionary using dict().
return self.setdefault(key, NestedDict()) is sufficient. No need for the if.
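
One caveat about converting back with dict(): it only replaces the outermost level, so inner levels would still auto-create keys. A recursive conversion could be sketched as follows; `to_plain_dict` is a hypothetical helper, and the class is the shortened form suggested in the comment above.

```python
class NestedDict(dict):
    def __getitem__(self, key):
        return self.setdefault(key, NestedDict())

def to_plain_dict(node):
    """Recursively copy nested dict-like levels into ordinary dicts."""
    if isinstance(node, dict):
        return {key: to_plain_dict(value) for key, value in node.items()}
    return node

eggs = NestedDict()
eggs[1][2][3] = "spam"
plain = to_plain_dict(eggs)   # every level is now a regular dict
```

After the conversion, looking up a missing key at any depth raises KeyError again.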
3

You may achieve this with a recursive defaultdict.

from collections import defaultdict

def tree():
    def the_tree():
        return defaultdict(the_tree)
    return the_tree()

It is important to protect the default factory name, the_tree here, in a closure ("private" local function scope). Avoid the one-liner lambda version, which is fragile because Python looks the factory name up late (at call time), and implement this with a nested def instead.

The accepted answer, using a lambda, has a flaw: instances rely on the nested_dict name still existing in an outer scope. If for whatever reason the factory name cannot be resolved (e.g. it was rebound or deleted), then pre-existing instances will also become subtly broken:

>>> nested_dict = lambda: defaultdict(nested_dict)
>>> nest = nested_dict()
>>> nest[0][1][2][3][4][6] = 7
>>> del nested_dict
>>> nest[8][9] = 10
# NameError: name 'nested_dict' is not defined
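
For comparison, here is the same experiment with the closure-based tree() from this answer. This is only a sketch to show the difference in behavior.

```python
from collections import defaultdict

def tree():
    def the_tree():
        return defaultdict(the_tree)
    return the_tree()

nest = tree()
nest[0][1] = 2
del tree           # remove the public name
nest[8][9] = 10    # still works: the factory is kept alive by the closure
```

Existing instances hold a reference to the inner the_tree function, which refers to itself through its closure rather than through a module-level name.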

2

To add to @Hugo, here is a way to set a maximum depth:

from collections import defaultdict

l = lambda x: defaultdict(lambda: l(x - 1)) if x > 0 else defaultdict(dict)
arr = l(2)
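
A def-based variation on the same idea can also give the deepest level a standard factory such as int, which allows `+= 1` on leaves. This is a sketch under assumptions: the name `nested` and its `leaf_factory` parameter are illustrative, and its depth counting differs from `l(x)` above (here `depth` is the total number of keys).

```python
from collections import defaultdict

def nested(depth, leaf_factory=int):
    """Return a defaultdict `depth` keys deep whose leaves come from leaf_factory."""
    if depth <= 1:
        return defaultdict(leaf_factory)
    return defaultdict(lambda: nested(depth - 1, leaf_factory))

counts = nested(3)
counts["a"]["b"]["c"] += 1
counts["a"]["b"]["c"] += 1
```

Indexing deeper than `depth` keys fails, because the leaves are ordinary values rather than dictionaries.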

2

A slightly different possibility that allows regular dictionary initialization:

from collections import defaultdict

def superdict(arg=()):
    update = lambda obj, arg: obj.update(arg) or obj
    return update(defaultdict(superdict), arg)

Example:

>>> d = {"a":1}
>>> sd = superdict(d)
>>> sd["b"]["c"] = 2

1

You could use a NestedDict.

from ndicts.ndicts import NestedDict

nd = NestedDict()
nd[0, 1, 2, 3, 4, 5] = 6

The result as a dictionary:

>>> nd.to_dict()
{0: {1: {2: {3: {4: {5: 6}}}}}}

To install ndicts:

pip install ndicts
