I have data in a column that looks like this in a pandas dataframe:
'Column name':
[{'Name': Dan Smith, 'Attribute1': 4, 'Attribute2': 10, 'Attribute3': 6}, {'Name': Bob Smith, 'Attribute1': 4, 'Attribute2': 10, 'Attribute3': 6}],
[{'Name': Shelly Smith, 'Attribute1': 4, 'Attribute2': 10, 'Attribute3': 6}, {'Name': Sam Smith, 'Attribute1': 4, 'Attribute2': 10, 'Attribute3': 6}],
{'Name': Jane Smith, 'Attribute1': 4, 'Attribute2': 10, 'Attribute3': 6},
[{'Name': Chris Smith, 'Attribute1': 4, 'Attribute2': 10, 'Attribute3': 6}, {'Name': Darryl Smith, 'Attribute1': 4, 'Attribute2': 15, 'Attribute3': 6}],
Companies are separated by [] except in instances where the company only has 1 observation (such as the 3rd observation in this example with Jane Smith). My problem is trying to parse the nested keys when the nested keys are identical. My goal is to grab the attribute with the highest value for each company.
I've tried:
df = df.explode('Column Name')
However, that does not do anything. The observations appear the same as before. After some research I tried the following
from ast import literal_eval
df['Column name'] = df['Column name'].apply(literal_eval)
df = df.explode('Column Name')
However, when I do this, I get a "KeyError:0" return. I found that this error occurs due to instances like the 3rd row where there is only 1 observation for that company. I can explode small samples of my data and grab the highest attribute and proceed according to plan. However, I have 1.62 million rows so splitting the sample into small batches is not wise.
Is there a way to pass the 'KeyError:0' exceptions? Or is there a better way to get where I'm trying to go? I'm new to Python/Pandas.