
I'm transforming fields from an XML document so I can load them into a normal relational DB. I've converted the XML document into a bunch of nested dictionaries. Some of the values I want to extract sit inside nested dictionaries, so I need to flatten the structure first.

Easy enough, but I'd like to create a mapping that lets me specify upfront what to extract.

Example

input_dict = {
'authors': [{'name': u'Google, Inc.'}],
'islink': False,
}

mapping = ['islink',<???>]

Desired output

In: tuple(input_dict[key] for key in mapping)
Out: (False, 'Google, Inc.')

This obviously doesn't work:

In: [input_dict[key] for key in ['islink',['authors'][0]['name']]]
Out: TypeError: string indices must be integers, not str

2 Answers

and what about:

from collections.abc import Iterable  # the collections.Iterable alias was removed in Python 3.10

def flatten(x):
    result = []
    if isinstance(x, dict):
        x = x.values()
    for el in x:
        if isinstance(el, Iterable) and not isinstance(el, str):
            result.extend(flatten(el))
        else:
            result.append(el)
    return result

which, this time, is Python 3 friendly ;-)

>>> dd = {'a': 42, 'c': 12, 'b': [{1: 2, 2: 3, 3: 4}]}
>>> flatten(dd)
[42, 12, 2, 3, 4]

here's a version that supports key filtering:

def flatten(x, keys=None):
    result = []
    if isinstance(x, dict):
        if keys is None:
            x = x.values()
        else:
            x = [v for k, v in x.items() if k in keys]
    for el in x:
        if isinstance(el, Iterable) and not isinstance(el, str):
            result.extend(flatten(el, keys))
        else:
            result.append(el)
    return result

results:

>>> flatten(dd, keys=['a'])
[42]
>>> flatten(dd, keys=['a','b']) # even though 'b' is selected, no subkey has been selected
[42]
>>> flatten(dd, keys=['a','b',1]) # to get a subkey, 'b' needs to be selected
[42, 2]
>>> flatten(dd,keys=['a',1]) # if you don't then there's no subkey selected
[42]
>>> flatten(dd, keys=['a','b',2,3])
[42, 3, 4]

and for your use case:

>>> input_dict = {'authors': [{'name': 'Google, Inc.'}],'islink': False,}
>>> flatten(input_dict)
[False, 'Google, Inc.']
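
A generator-based variant is also possible with `yield from` (Python 3.3+). This is only a minimal sketch that mirrors `flatten` above; the name `iter_flatten` is made up for illustration, and treating `bytes` like `str` is an extra assumption that the list-based version does not make.

```python
from collections.abc import Iterable

def iter_flatten(x, keys=None):
    """Lazily yield the leaf values of nested dicts and iterables."""
    if isinstance(x, dict):
        # Keep only the selected keys at this level (every key if keys is None).
        x = (v for k, v in x.items() if keys is None or k in keys)
    for el in x:
        # Strings (and, by assumption, bytes) are iterable but count as leaves.
        if isinstance(el, Iterable) and not isinstance(el, (str, bytes)):
            yield from iter_flatten(el, keys)
        else:
            yield el

dd = {'a': 42, 'c': 12, 'b': [{1: 2, 2: 3, 3: 4}]}
print(list(iter_flatten(dd)))
print(list(iter_flatten(dd, keys=['a', 'b', 1])))
```

Wrap the call in `list()` or `tuple()` when you need all values at once, for example to build a row for a DB insert.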

N.B.: I adapted my answer from that answer about list flattening


5 Comments

Note: IIRC - the compiler module has been deprecated since 2.6/7 and is no longer in 3.x
@JonClements python3 ready, now, though I failed to build a generator based version :-s
Don't forget strings may need special casing as they're Iterable
of course, that's why I got the and not isinstance(el, str) special case!
My bad - was looking on mobile and didn't think to tap and scroll across! :-D

What about this:

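# Each entry is the path of keys/indices leading to one value.
indices = [['islink'], ['authors', 0, 'name']]
result = []
for index in indices:
    value = input_dict
    # Walk down one level per key or list index.
    for single_index in index:
        value = value[single_index]
    result.append(value)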
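
The same path-walking idea can be wrapped in a small helper so the mapping is declared upfront, as the question asks. This is a sketch rather than a definitive implementation: `extract` is a hypothetical name, and it assumes every path exists in the input (a missing key raises `KeyError`, a bad list index raises `IndexError`).

```python
from functools import reduce
from operator import getitem

def extract(data, paths):
    """Return a tuple holding the value found at each path in `paths`."""
    # reduce(getitem, path, data) evaluates data[path[0]][path[1]]...
    return tuple(reduce(getitem, path, data) for path in paths)

input_dict = {'authors': [{'name': 'Google, Inc.'}], 'islink': False}
mapping = [('islink',), ('authors', 0, 'name')]
print(extract(input_dict, mapping))  # (False, 'Google, Inc.')
```

Unlike `flatten`, the output order follows the mapping instead of the dictionary's own ordering, which makes it easier to line the values up with table columns.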
