4

I'm trying to get all the data out of an API call which is returned in the json format.

For this purpose I'm using the json_normalize library from pandas, but I'm left with a list within that list that is not unwrapped.

This is the code I am using:

data=requests.get(url,endpointParams)
data_read=json.loads(data.content)
values=json_normalize(data_read['data'])

This is what I end up with:

name period values title description id
follower_count day [{'value': 0, 'end_time': '2022-03-27T07:00:00+0000'}, {'value': 0, 'end_time': '2022-03-28T07:00:00+0000'}] Follower Count Total number of unique accounts following this profile 1/insights/follower_count/day
impressions day [{'value': 19100, 'end_time': '2022-03-27T07:00:00+0000'}, {'value': 6000, 'end_time': '2022-03-28T07:00:00+0000'}] Impressions Total number of times the Business Account's media objects have been viewed 1/insights/impressions/day
profile_views day [{'value': 80, 'end_time': '2022-03-27T07:00:00+0000'}, {'value': 90, 'end_time': '2022-03-28T07:00:00+0000'}] Profile Views Total number of users who have viewed the Business Account's profile within the specified period 1/insights/profile_views/day
reach day [{'value': 5000, 'end_time': '2022-03-27T07:00:00+0000'}, {'value': 2000, 'end_time': '2022-03-28T07:00:00+0000'}] Reach Total number of times the Business Account's media objects have been uniquely viewed 1/insights/reach/day

My question is how do I unwrap the values column?

EDIT: Here's the data_read before normalizing:

{'data': [{'name': 'follower_count',
   'period': 'day',
   'values': [{'value': 50, 'end_time': '2022-03-27T07:00:00+0000'},
    {'value': 50, 'end_time': '2022-03-28T07:00:00+0000'}],
   'title': 'Follower Count',
   'description': 'Total number of unique accounts following this profile',
   'id': '1/insights/follower_count/day'},
  {'name': 'impressions',
   'period': 'day',
   'values': [{'value': 19000, 'end_time': '2022-03-27T07:00:00+0000'},
    {'value': 6000, 'end_time': '2022-03-28T07:00:00+0000'}],
   'title': 'Impressions',
   'description': "Total number of times the Business Account's media objects have been viewed",
   'id': '1/insights/impressions/day'},
  {'name': 'profile_views',
   'period': 'day',
   'values': [{'value': 90, 'end_time': '2022-03-27T07:00:00+0000'},
    {'value': 99, 'end_time': '2022-03-28T07:00:00+0000'}],
   'title': 'Profile Views',
   'description': "Total number of users who have viewed the Business Account's profile within the specified period",
   'id': '1/insights/profile_views/day'},
  {'name': 'reach',
   'period': 'day',
   'values': [{'value': 5000, 'end_time': '2022-03-27T07:00:00+0000'},
    {'value': 2000, 'end_time': '2022-03-28T07:00:00+0000'}],
   'title': 'Reach',
   'description': "Total number of times the Business Account's media objects have been uniquely viewed",
   'id': '1/insights/reach/day'}],
 'paging': {'previous': 'someotherurl.com',
  'next': 'someurl.com'}}
1
  • Can you provide a sample of data_read? Commented Mar 28, 2022 at 13:43

1 Answer 1

4

Try:

metadata = ['name', 'period', 'title', 'description', 'id']
out = pd.json_normalize(data_read['data'], 'values', metadata)
value end_time name period title description id
50 2022-03-27T07:00:00+0000 follower_count day Follower Count Total number of unique accounts following this profile 1/insights/follower_count/day
50 2022-03-28T07:00:00+0000 follower_count day Follower Count Total number of unique accounts following this profile 1/insights/follower_count/day
19000 2022-03-27T07:00:00+0000 impressions day Impressions Total number of times the Business Account's media objects have been viewed 1/insights/impressions/day
6000 2022-03-28T07:00:00+0000 impressions day Impressions Total number of times the Business Account's media objects have been viewed 1/insights/impressions/day
90 2022-03-27T07:00:00+0000 profile_views day Profile Views Total number of users who have viewed the Business Account's profile within the specified period 1/insights/profile_views/day
99 2022-03-28T07:00:00+0000 profile_views day Profile Views Total number of users who have viewed the Business Account's profile within the specified period 1/insights/profile_views/day
5000 2022-03-27T07:00:00+0000 reach day Reach Total number of times the Business Account's media objects have been uniquely viewed 1/insights/reach/day
2000 2022-03-28T07:00:00+0000 reach day Reach Total number of times the Business Account's media objects have been uniquely viewed 1/insights/reach/day
Sign up to request clarification or add additional context in comments.

5 Comments

Thank you for the answer, it works. I've been reading up on the json_normalize but I can't understand how this solution works - could you just briefly explain the logic of it? Thanks so much
The third parameter of pd.json_normalize is the columns to be used as metadata (the outer level). So you extract all rows in values and set other fields as metadata.
master of the json_normalize :p
@Corralien How would you do this without having to list the column names in "metadata"? I have a json with 30+ fields so looking for a method where I won't have to write each column name individually
@Mitch. It depends on your data structure. For example, you can take the first record of your data and use set operations to build dynamically the metadata list.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.