1

I have a data that is converted to a list. How do I convert this to a DataFrame such that each row in that list becomes a column in the DataFrame?

[{'date': '2019-01-01',
  'stats': [{'metrics': {'comp_1': 149, 
                         'comp_2': 276}}]},
 {'date': '2019-01-02',
  'stats': [{'metrics': {'comp_1': 232, 
                         'comp_2': 842}}]}]

I tried to do pd.DataFrame(c) where c was the variable holding the list but I saw all components of each date was stored in a single row

Expected output:

date, comp_1, comp_2
2019-01-01,149,276
2019-01-02,232,842
0

1 Answer 1

4

Use json.json_normalize first:

a = [{'date': '2019-01-01',
'stats': [{'metrics': {'comp_1': 149,
 'comp_2': 276}}]},
{'date': '2019-01-02',
'stats': [{'metrics': {'comp_1': 232,
 'comp_2': 842}}]}]

from pandas.io.json import json_normalize

df = json_normalize(a,'stats', ['date'])
print (df)
   metrics.comp_1  metrics.comp_2        date
0             149             276  2019-01-01
1             232             842  2019-01-02

Last convert columns names to list and reorder by subset:

df = df[df.columns[-1:].tolist() + df.columns[:-1].tolist()]
print (df)
         date  metrics.comp_1  metrics.comp_2
0  2019-01-01             149             276
1  2019-01-02             232             842

EDIT:

Solution with loop:

out = []
for x in a:
    for k, v in x.items():
        for z in v:
            if isinstance(z, dict):
                d = z['metrics']
                d['year'] = x['date']
                out.append(d)

df = pd.DataFrame(out)   
print (df)
   comp_1  comp_2        year
0     149     276  2019-01-01
1     232     842  2019-01-02
Sign up to request clarification or add additional context in comments.

11 Comments

the output I see has a date column and another column that stores both the metrics as a dict.. thanks..
@scottmartin - so input values a are different?
the only difference at my end is I have changed the metric label .. Everything else is same.
@scottmartin - are daat in json file? Maybe need with open('file.json') as data_file: df = json.load(data_file)
I load this data from another system so I have done this to get the list mentioned above.. b = json.loads(a.decode('utf-8')) and then I perform c = list(b) and I tried doing the steps you mentioned to the variable c
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.