19

I have some JSON data like:

{
  "status": "200",
  "msg": "",
  "data": {
    "time": "1515580011",
    "video_info": [
      {
          "announcement": "{\"announcement_id\":\"6\",\"name\":\"INS\\u8d26\\u53f7\",\"icon\":\"http:\\\/\\\/liveme.cms.ksmobile.net\\\/live\\\/announcement\\\/2017-08-18_19:44:54\\\/ins.png\",\"icon_new\":\"http:\\\/\\\/liveme.cms.ksmobile.net\\\/live\\\/announcement\\\/2017-10-20_22:24:38\\\/4.png\",\"videoid\":\"15154610218328614178\",\"content\":\"FOLLOW ME PLEASE\",\"x_coordinate\":\"0.22\",\"y_coordinate\":\"0.23\"}",
          "announcement_shop": "",

etc.

How do I grab the content "FOLLOW ME PLEASE"? I tried using

replay_data = raw_replay_data['data']['video_info'][0]
announcement = replay_data['announcement']

But now announcement is a string representing more JSON data. I can't continue indexing announcement['content'] results in TypeError: string indices must be integers.

How can I get the desired string in the "right" way, i.e. respecting the actual structure of the data?

1
  • I reopened the question after some fixes; it is not a proper duplicate, since it specifically deals with the question of interpreting nested JSON data as a string that needs to be re-parsed. Most of the questions that were originally closed as a duplicate of this, should be duplicates of stackoverflow.com/questions/12788217/… instead; I will edit them. Commented Jul 2, 2022 at 1:18

3 Answers 3

39

In a single line -

>>> json.loads(data['data']['video_info'][0]['announcement'])['content']
'FOLLOW ME PLEASE'

To help you understand how to access data (so you don't have to ask again), you'll need to stare at your data.

First, let's lay out your data nicely. You can either use json.dumps(data, indent=4), or you can use an online tool like JSONLint.com.

{
    'data': {
        'time': '1515580011',
        'video_info': [{
            'announcement': (    # ***
            """{
                "announcement_id": "6",
                "name": "INS\\u8d26\\u53f7",
                "icon": "http:\\\\/\\\\/liveme.cms.ksmobile.net\\\\/live\\\\/announcement\\\\/2017-08-18_19:44:54\\\\/ins.png",
                "icon_new": "http:\\\\/\\\\/liveme.cms.ksmobile.net\\\\/live\\\\/announcement\\\\/2017-10-20_22:24:38\\\\/4.png",
                "videoid": "15154610218328614178",
                "content": "FOLLOW ME PLEASE",
                "x_coordinate": "0.22",
                "y_coordinate": "0.23"
            }"""),
            'announcement_shop': ''
        }]
    },
    'msg': '',
    'status': '200'
} 

*** Note that the data in the announcement key is actually more json data, which I've laid out on separate lines.

First, find out where your data resides. You're looking for the data in the content key, which is accessed by the announcement key, which is part of a dictionary inside a list of dicts, which can be accessed by the video_info key, which is in turn accessed by data.

So, in summary, "descend" the ladder that is "data" using the following "rungs" -

  1. data, a dictionary
  2. video_info, a list of dicts
  3. announcement, a dict in the first dict of the list of dicts
  4. content residing as part of json data.

First,

i = data['data']

Next,

j = i['video_info']

Next,

k = j[0] # since this is a list

If you only want the first element, this suffices. Otherwise, you'd need to iterate:

for k in j:
    ...

Next,

l = k['announcement']

Now, l is JSON data. Load it -

import json
m = json.loads(l)

Lastly,

content = m['content']

print(content)
'FOLLOW ME PLEASE'

This should hopefully serve as a guide should you have future queries of this nature.

Sign up to request clarification or add additional context in comments.

3 Comments

How can we handle null cases here efficiently . Suppose if no m['content'] is present in JSON
@JibinMathew I'd imagine something along the lines of a try-except AttributeError or if block should be more than enough.
3

You have nested JSON data; the string associated with the 'annoucement' key is itself another, separate, embedded JSON document.

You'll have to decode that string first:

import json

replay_data = raw_replay_data['data']['video_info'][0]
announcement = json.loads(replay_data['announcement'])
print(announcement['content'])

then handle the resulting dictionary from there.

Comments

-2

The content of "announcement" is another JSON string. Decode it and then access its contents as you were doing with the outer objects.

1 Comment

@IgncioVazquez-Abrams Can you provide a code example?

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.