I'm trying to create a Pandas DataFrame from a JSON file that looks like this:

{
  "GameID": "1,218,463,841",
  "Date - Start": "1761097369",
  "Date - End": "1761098306",
  "TagSetID": 79,
  "Netplay": 1,
  "StadiumID": 5,
  "Away Player": "margoose",
  "Home Player": "dev",
  "Away Score": 0,
  "Home Score": 2,
  "Innings Selected": 5,
  "Innings Played": 5,
  "Quitter Team": 255,
  "Average Ping": 18,
  "Lag Spikes": 8,
  "Version": "2.1.1",
  "Character Game Stats": {
    "Away Roster 0": {
      "Team": "0",
      "RosterID": 0,
      "CharID": 15,
      "Superstar": 0,
      "Captain": 0,
      "Fielding Hand": 0,
      "Batting Hand": 0,
      "Defensive Stats": {
        "Batters Faced": 0,
        "Runs Allowed": 0,
        "Earned Runs": 0,
        "Batters Walked": 0,
        "Batters Hit": 0,
        "Hits Allowed": 0,
        "HRs Allowed": 0,
        "Pitches Thrown": 0,
        "Stamina": 10,
        "Was Pitcher": 0,
        "Strikeouts": 0,
        "Star Pitches Thrown": 0,
        "Big Plays": 0,
        "Outs Pitched": 0,
        "Batters Per Position": [
          {
            "CF": 3,
            "RF": 15
          }
        ],
        "Batter Outs Per Position": [
          {
            "CF": 2,
            "RF": 10
          }
        ],
        "Outs Per Position": [
          {
            "RF": 1
          }
        ]
      },
      "Offensive Stats": {
        "At Bats": 3,
        "Hits": 2,
        "Singles": 2,
        "Doubles": 0,
        "Triples": 0,
        "Homeruns": 0,
        "Successful Bunts": 0,
        "Sac Flys": 0,
        "Strikeouts": 0,
        "Walks (4 Balls)": 0,
        "Walks (Hit)": 0,
        "RBI": 0,
        "Bases Stolen": 0,
        "Star Hits": 0
      }
    },
    "Away Roster 1": {
      "Team": "0",
      "RosterID": 1,
      "CharID": 6,
      "Superstar": 0,
      "Captain": 0,
      "Fielding Hand": 0,
      "Batting Hand": 1,
      "Defensive Stats": {
        "Batters Faced": 0,
        "Runs Allowed": 0,
        "Earned Runs": 0,
        "Batters Walked": 0,
        "Batters Hit": 0,
        "Hits Allowed": 0,
        "HRs Allowed": 0,
        "Pitches Thrown": 0,
        "Stamina": 10,
        "Was Pitcher": 0,
        "Strikeouts": 0,
        "Star Pitches Thrown": 0,
        "Big Plays": 0,
        "Outs Pitched": 0,
        "Batters Per Position": [
          {
            "SS": 18
          }
        ],
        "Batter Outs Per Position": [
          {
            "SS": 12
          }
        ],
        "Outs Per Position": [
        ]
      },
      "Offensive Stats": {
        "At Bats": 3,
        "Hits": 0,
        "Singles": 0,
        "Doubles": 0,
        "Triples": 0,
        "Homeruns": 0,
        "Successful Bunts": 0,
        "Sac Flys": 0,
        "Strikeouts": 1,
        "Walks (4 Balls)": 0,
        "Walks (Hit)": 0,
        "RBI": 0,
        "Bases Stolen": 0,
        "Star Hits": 0
      }
    },
    "Away Roster 2": {
      "Team": "0",
      "RosterID": 2,
      "CharID": 20,
      "Superstar": 0,
      "Captain": 0,
      "Fielding Hand": 0,
      "Batting Hand": 1,
      "Defensive Stats": {
        "Batters Faced": 0,
        "Runs Allowed": 0,
        "Earned Runs": 0,
        "Batters Walked": 0,
        "Batters Hit": 0,
        "Hits Allowed": 0,
        "HRs Allowed": 0,
        "Pitches Thrown": 0,
        "Stamina": 10,
        "Was Pitcher": 0,
        "Strikeouts": 0,
        "Star Pitches Thrown": 0,
        "Big Plays": 0,
        "Outs Pitched": 0,
        "Batters Per Position": [
          {
            "3B": 18
          }
        ],
        "Batter Outs Per Position": [
          {
            "3B": 12
          }
        ],
        "Outs Per Position": [
          {
            "3B": 2
          }
        ]
      },
      "Offensive Stats": {
        "At Bats": 3,
        "Hits": 0,
        "Singles": 0,
        "Doubles": 0,
        "Triples": 0,
        "Homeruns": 0,
        "Successful Bunts": 0,
        "Sac Flys": 0,
        "Strikeouts": 1,
        "Walks (4 Balls)": 0,
        "Walks (Hit)": 0,
        "RBI": 0,
        "Bases Stolen": 0,
        "Star Hits": 0
      }
    }
  }
}

I'd like the resulting table to look something like this, but I can't figure out how:

   CharID  Batters Faced  Innings Pitched  At Bats
0      15              0                0        3

Any help is much appreciated. I'm quite unfamiliar with JSON. I tried passing "CharID" and "Defensive Stats" as the record_path, but I got this error:

If specifying a record_path, all elements of data should have the path.

2 Answers

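Index into the parsed JSON by key to pull out the values you need, then build a list of dictionaries (one per player) and pass it to pd.DataFrame. The JSON has no "Innings Pitched" field, so the code below derives it from "Outs Pitched", at three outs per inning.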

import pandas as pd

# Your JSON data, trimmed to the fields used below
json_data = {
  "GameID": "1,218,463,841",
  "Date - Start": "1761097369",
  "Date - End": "1761098306",
  "TagSetID": 79,
  "Netplay": 1,
  "StadiumID": 5,
  "Away Player": "margoose",
  "Home Player": "dev",
  "Away Score": 0,
  "Home Score": 2,
  "Innings Selected": 5,
  "Innings Played": 5,
  "Quitter Team": 255,
  "Average Ping": 18,
  "Lag Spikes": 8,
  "Version": "2.1.1",
  "Character Game Stats": {
    "Away Roster 0": {
      "Team": "0",
      "RosterID": 0,
      "CharID": 15,
      "Defensive Stats": {
        "Batters Faced": 0,
        "Outs Pitched": 0
      },
      "Offensive Stats": {
        "At Bats": 3
      }
    },
    "Away Roster 1": {
      "Team": "0",
      "RosterID": 1,
      "CharID": 6,
      "Defensive Stats": {
        "Batters Faced": 0,
        "Outs Pitched": 0
      },
      "Offensive Stats": {
        "At Bats": 3
      }
    },
    "Away Roster 2": {
      "Team": "0",
      "RosterID": 2,
      "CharID": 20,
      "Defensive Stats": {
        "Batters Faced": 0,
        "Outs Pitched": 0
      },
      "Offensive Stats": {
        "At Bats": 3
      }
    }
  }
}

character_stats = json_data["Character Game Stats"]

player_data_list = []

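# Each value is one roster entry; the "Away Roster N" keys themselves are not needed.
for player in character_stats.values():
    char_id = player['CharID']
    batters_faced = player['Defensive Stats']['Batters Faced']
    at_bats = player['Offensive Stats']['At Bats']

    # The file stores outs, not innings: three outs make one inning.
    outs_pitched = player['Defensive Stats']['Outs Pitched']
    innings_pitched = outs_pitched / 3.0
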
    player_data_list.append({
        'CharID': char_id,
        'Batters Faced': batters_faced,
        'Innings Pitched': innings_pitched,
        'At Bats': at_bats
    })

df = pd.DataFrame(player_data_list)

print(df)
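
Since the question mentions a JSON file rather than a dict literal, here is a minimal sketch of the same loop reading from disk. The helper name load_player_rows and the file name game.json are made up for illustration, and the demo writes a trimmed copy of the sample to a temporary file only so that it runs on its own.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd


def load_player_rows(path):
    """Read one game file and return one dict per roster entry (hypothetical helper)."""
    with open(path, encoding="utf-8") as fh:
        game = json.load(fh)

    rows = []
    for player in game["Character Game Stats"].values():
        defense = player["Defensive Stats"]
        rows.append({
            "CharID": player["CharID"],
            "Batters Faced": defense["Batters Faced"],
            # Assumption carried over from the loop above: innings = outs / 3.
            "Innings Pitched": defense["Outs Pitched"] / 3,
            "At Bats": player["Offensive Stats"]["At Bats"],
        })
    return rows


# Trimmed copy of the sample data, used only to make the demo self-contained.
sample = {
    "Character Game Stats": {
        "Away Roster 0": {
            "CharID": 15,
            "Defensive Stats": {"Batters Faced": 0, "Outs Pitched": 0},
            "Offensive Stats": {"At Bats": 3},
        },
        "Away Roster 1": {
            "CharID": 6,
            "Defensive Stats": {"Batters Faced": 0, "Outs Pitched": 0},
            "Offensive Stats": {"At Bats": 3},
        },
        "Away Roster 2": {
            "CharID": 20,
            "Defensive Stats": {"Batters Faced": 0, "Outs Pitched": 0},
            "Offensive Stats": {"At Bats": 3},
        },
    }
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "game.json"  # hypothetical file name
    path.write_text(json.dumps(sample), encoding="utf-8")
    df = pd.DataFrame(load_player_rows(path))

print(df)
```

With a real file you would skip the temporary-file part and call load_player_rows with your own path.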
4 Comments

I really don't think you should do this with manual record iteration.
@reinderin any reason you think iteration is not appropriate here?
It's work that you shouldn't be doing yourself when pandas offers this functionality.
What pandas offers in this particular case is convenience. The manual approach gives the user more control, though, and in some scenarios could even be more performant. If control does not outweigh convenience (depending on the use case), then yes, stick with json_normalize.

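pandas already has the machinery for this, so use it: pd.json_normalize flattens nested dictionaries into columns with dot-separated names such as 'Defensive Stats.Batters Faced'.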

import json
import pandas as pd

content = ''' ... '''
as_json = json.loads(content)

stats = as_json["Character Game Stats"]
df = (
    pd.json_normalize(list(stats.values()))  # one record per roster entry
    [['CharID', 'Defensive Stats.Batters Faced', 'Offensive Stats.At Bats']]
    .rename(columns={
        'Defensive Stats.Batters Faced': 'Batters Faced',
        'Offensive Stats.At Bats': 'At Bats',
    })
)
print(df)

Output:

   CharID  Batters Faced  At Bats
0      15              0        3
1       6              0        3
2      20              0        3
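
The output above leaves out the Innings Pitched column from the question. Here is a sketch of one way to add it with the same json_normalize approach, assuming (as the other answer does) that innings pitched is "Outs Pitched" divided by three. The stats dict is a trimmed copy of the sample so that the snippet runs on its own.

```python
import pandas as pd

# Trimmed copy of "Character Game Stats" from the question.
stats = {
    "Away Roster 0": {
        "CharID": 15,
        "Defensive Stats": {"Batters Faced": 0, "Outs Pitched": 0},
        "Offensive Stats": {"At Bats": 3},
    },
    "Away Roster 1": {
        "CharID": 6,
        "Defensive Stats": {"Batters Faced": 0, "Outs Pitched": 0},
        "Offensive Stats": {"At Bats": 3},
    },
    "Away Roster 2": {
        "CharID": 20,
        "Defensive Stats": {"Batters Faced": 0, "Outs Pitched": 0},
        "Offensive Stats": {"At Bats": 3},
    },
}

# Nested dicts become dot-separated columns, e.g. "Defensive Stats.Outs Pitched".
flat = pd.json_normalize(list(stats.values()))

df = (
    flat.rename(columns={
        "Defensive Stats.Batters Faced": "Batters Faced",
        "Offensive Stats.At Bats": "At Bats",
    })
    # Assumption: three outs per inning, so innings = outs / 3.
    .assign(**{"Innings Pitched": flat["Defensive Stats.Outs Pitched"] / 3})
    [["CharID", "Batters Faced", "Innings Pitched", "At Bats"]]
)
print(df)
```

As for the error in the question: as far as I can tell, record_path is meant to point at a key that holds a list of records (for example "Batters Per Position"), whereas "CharID" is a scalar and "Defensive Stats" is a dict, and neither exists at the top level of the file. Fields like these are reached through the flattened column names instead.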
