2

I have a df, sample data looks like below:

df

user    collaborators
A       {'first_name' : 'jack', 'last_name' : 'abc', 'id' : '[email protected]'}, {'first_name' : 'john', 'last_name' : 'pqr', 'id' : '[email protected]'}
B       {'first_name' : 'tom', 'last_name' : 'pwc', 'id' : '[email protected]'}
C       {'first_name' : 'jill', 'last_name' : 'def', 'id' : '[email protected]'}, {'first_name' : 'jill', 'last_name' : 'def', 'id' : '[email protected]'}

I need to get the first_name, last_name and id in individual columns. Expected Output:

enter image description here

I first tried to remove the square brackets using:

df['collaborators'].str.strip('{}')
df['collaborators'].str.replace('[{}]', '')

But I got only NaN as the result, not sure why. I thought of exploding the column after converting it to a list and extracting first_name, last_name and id. But all this looks like lot of steps. Could someone please let me know if there's a shorter way.

Updated DF:

enter image description here

I used to_list to convert the three columns into list:

df['first_name'] = df['first_name'].to_list()
df['last_name'] = df['last_name'].to_list()
df['id'] = df['id'].to_list()

I then tried below code from SO:

 df.set_index('collaborators').apply(lambda x:
    x.apply(pd.Series).stack()).reset_index().drop('level_1', 1)

But it didn't work for me.

3
  • have a look at pd.json_normalize() or df.explode() Commented May 5, 2022 at 7:55
  • stackoverflow.com/questions/46391291/… Commented May 5, 2022 at 8:04
  • What exactly is your data? Can you load it into pandas, apply .values on it and paste the output here? Commented May 5, 2022 at 8:13

2 Answers 2

1

Use ast.literal_eval with add [] for lists of DataFrames, so possible use DataFrame.explode with json_normalize and last append to original with remove helper column new in DataFrame.pop:

import ast

df['new'] = df['collaborators'].apply(lambda x: ast.literal_eval('[' + x + ']'))

df1 = df.explode('new').reset_index(drop=True)

df = df1.join(pd.json_normalize(df1.pop('new')))
print (df)

  user                                      collaborators first_name  \
0    A  {'first_name' : 'jack', 'last_name' : 'abc', '...       jack   
1    A  {'first_name' : 'jack', 'last_name' : 'abc', '...       john   
2    B  {'first_name' : 'tom', 'last_name' : 'pwc', 'i...        tom   
3    C  {'first_name' : 'jill', 'last_name' : 'def', '...       jill   
4    C  {'first_name' : 'jill', 'last_name' : 'def', '...       jill   

  last_name             id  
0       abc  [email protected]  
1       pqr  [email protected]  
2       pwc  [email protected]  
3       def  [email protected]  
4       def  [email protected]  
Sign up to request clarification or add additional context in comments.

17 Comments

When I run on my data, I am getting this error: could you please let me know how to fix this: ValueError: malformed node or string: <ast.Name object at 0x7fe68bb2c7c0>
@KarthikS - what is print (list(df.loc[0, 'collaborators'])) ?
I am getting the the array within a list. Apologies can't paste the actual output as it's company data. But for example it's looks like this: [{'first_name' : 'jack', 'last_name' : 'abc', 'id' : '[email protected]'}, {'first_name' : 'john', 'last_name' : 'pqr', 'id' : '[email protected]'}]
looking at the error, I am not able to figure out the issue. Could you please let me know if there's another way
No worries, have figured out, thanks a bunch for your patience and help
|
0

Use:

(df[0]+",").apply(eval).explode().apply(pd.Series)

Output:

enter image description here

Complete code:

data="""{'first_name' : 'jack', 'last_name' : 'abc', 'id' : '[email protected]'},  {'first_name' : 'john', 'last_name' : 'pqr', 'id' : '[email protected]'}
{'first_name' : 'tom', 'last_name' : 'pwc', 'id' : '[email protected]'}
{'first_name' : 'jill', 'last_name' : 'def', 'id' : '[email protected]'},  {'first_name' : 'jill', 'last_name' : 'def', 'id' : '[email protected]'}"""
df = pd.DataFrame([x for x in data.split('\n')])#
(df[0]+",").apply(eval).explode().apply(pd.Series)

1 Comment

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.