
I'm working on something quite simple: I query a database, and it returns me a [huge] dictionary. That's fine, I love dictionaries, but I'm not a pro with them in Python.

My problem is that I want to convert this dictionary into a DataFrame. Fine, I googled it and it works. But inside my dictionary, I have other dictionaries (yeah, I know...).

I want to extract, from those nested dictionaries (which end up inside my DataFrame), the values of the "value" key.

Here's a sample and what I tried. Thanks in advance.

(res is my huge dictionary, the result of the query)

res :

{'head': {'vars': ['id', 'marque', 'modele']},
 'results': {'bindings': [{'id': {'type': 'literal', 'value': '1362'},
    'marque': {'type': 'literal', 'value': 'PEUGEOT'},
    'modele': {'type': 'literal', 'value': '206'}},....

pd.DataFrame(res['results']['bindings'], columns=res['head']['vars']) gives a DataFrame where every cell is still a dictionary.

As you can see, there's another dictionary inside each cell of my DataFrame! What I want is to take the values from the "value" key, in an efficient way (I know how to do that with a big for loop, but please, not in Python).

I tried things like res['results']['bindings']['values'], or res['results']['bindings'].values() (or .values), and other things on the DataFrame like df.values()['value'] = df.values(), but none of it works.

2 Answers


IIUC, you can use applymap and extract the value associated with the value key from every dictionary.

import operator
import pandas as pd

df = pd.DataFrame(res['results']['bindings'], columns=res['head']['vars'])
# Pull the 'value' entry out of the dict in every cell
df = df.applymap(operator.itemgetter('value'))

This operates under the assumption that each cell value is a dictionary.


It's possible that some of your dictionaries do not contain value as a key. In that case, a slight modification is required, using dict.get:

import numpy as np

df = df.applymap(lambda x: x.get('value', np.nan) if isinstance(x, dict) else np.nan)

This will also handle the potential problems that arise when x is not a dict.
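Putting it together with the sample bindings from the question (trimmed to one row), a quick sanity check might look like this:

```python
import numpy as np
import pandas as pd

res = {'head': {'vars': ['id', 'marque', 'modele']},
       'results': {'bindings': [{'id': {'type': 'literal', 'value': '1362'},
                                 'marque': {'type': 'literal', 'value': 'PEUGEOT'},
                                 'modele': {'type': 'literal', 'value': '206'}}]}}

df = pd.DataFrame(res['results']['bindings'], columns=res['head']['vars'])
# Extract the 'value' entry from each cell dict, falling back to NaN
df = df.applymap(lambda x: x.get('value', np.nan) if isinstance(x, dict) else np.nan)
print(df)
#      id   marque modele
# 0  1362  PEUGEOT    206
```

Note that on pandas 2.1+, DataFrame.applymap is deprecated in favour of DataFrame.map, which takes the same function.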


2 Comments

It works, and it's very efficient. Thank you! :) Edit: thanks for your advice on the second part of your answer, it's useful and yes, it can happen. – Clément
@ClementB Glad it helped.

You can use json_normalize, which handles missing keys nicely by filling in NaN:

d = {'head': {'vars': ['id', 'marque', 'modele']},
     'results': {'bindings': [{'id': {'type': 'literal', 'value': '1362'},
                               'marque': {'type': 'literal', 'value': 'PEUGEOT'},
                               'modele': {'type': 'literal', 'value': '206'}},
                              {'id': {'type': 'literal', 'value': '1362'},
                               'marque': {'type': 'literal', 'value': 'PEUGEOT'}}]}}

from pandas.io.json import json_normalize

df = json_normalize(d['results']['bindings']).filter(like='value')
# regex=False so the leading dot is matched literally
df.columns = df.columns.str.replace('.value', '', regex=False)
print(df)
     id   marque modele
0  1362  PEUGEOT    206
1  1362  PEUGEOT    NaN
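A side note: since pandas 1.0, json_normalize is exposed at the top level as pd.json_normalize, and the pandas.io.json import path is deprecated. The same answer in the modern spelling:

```python
import pandas as pd

d = {'results': {'bindings': [{'id': {'type': 'literal', 'value': '1362'},
                               'marque': {'type': 'literal', 'value': 'PEUGEOT'},
                               'modele': {'type': 'literal', 'value': '206'}},
                              {'id': {'type': 'literal', 'value': '1362'},
                               'marque': {'type': 'literal', 'value': 'PEUGEOT'}}]}}

# Flattens nested dicts into dotted column names like 'id.value',
# then keeps only the '.value' columns
df = pd.json_normalize(d['results']['bindings']).filter(like='value')
df.columns = df.columns.str.replace('.value', '', regex=False)
print(df)
#      id   marque modele
# 0  1362  PEUGEOT    206
# 1  1362  PEUGEOT    NaN
```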

