How to flatten pandas dataframe

Question

Here is my pandas DataFrame, which I would like to flatten. How can I do that?

The input I have

key column
1 {'health_1': 45, 'health_2': 60, 'health_3': 34, 'health_4': 60, 'name': 'Tom'}   
2 {'health_1': 28, 'health_2': 10, 'health_3': 42, 'health_4': 07, 'name': 'John'}  
3 {'health_1': 86, 'health_2': 65, 'health_3': 14, 'health_4': 52, 'name': 'Adam'}

The expected output

Each health_* key and the name key should become a column of its own, holding the corresponding values. Column order does not matter.

health_1  health_2  health_3  health_4  name  key
45        60        34        60        Tom   1
28        10        42        07        John  2
86        65        14        52        Adam  3

Please show the expected output. Do you want, e.g., 4 rows (health_...) from each source row? – Valdi_Bo, Commented Dec 5, 2018 at 14:38
@Valdi_Bo Not sure I understood you correctly. Basically, every row has 5 columns, if that helps. – PolarBear10, Commented Dec 5, 2018 at 14:44

A l w a y s S u n n y · Accepted Answer · 2018-12-05 14:52:48Z

6

You can do it in one line:

df_expected = pd.concat([df, df['column'].apply(pd.Series)], axis = 1).drop('column', axis = 1)

Full version:

import pandas as pd
df = pd.DataFrame({"column":[
{'health_1': 45, 'health_2': 60, 'health_3': 34, 'health_4': 60, 'name': 'Tom'}   ,
{'health_1': 28, 'health_2': 10, 'health_3': 42, 'health_4': 7, 'name': 'John'}  ,
{'health_1': 86, 'health_2': 65, 'health_3': 14, 'health_4': 52, 'name': 'Adam'}
]})

df_expected = pd.concat([df, df['column'].apply(pd.Series)], axis = 1).drop('column', axis = 1)
print(df_expected)

DEMO: https://repl.it/repls/ButteryFrightenedFtpclient
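The full version above builds df without the key column from the question. As a minimal sketch, assuming the input really has the two columns key and column, the same apply(pd.Series) idea can be written so that key is carried through to the result (the name expanded is only illustrative):

```python
import pandas as pd

# Rebuild the question's input, including the "key" column
# (assumed here to hold the integers 1, 2, 3).
df = pd.DataFrame({
    "key": [1, 2, 3],
    "column": [
        {"health_1": 45, "health_2": 60, "health_3": 34, "health_4": 60, "name": "Tom"},
        {"health_1": 28, "health_2": 10, "health_3": 42, "health_4": 7, "name": "John"},
        {"health_1": 86, "health_2": 65, "health_3": 14, "health_4": 52, "name": "Adam"},
    ],
})

# Turn every dict into a row of columns; the row index is preserved.
expanded = df["column"].apply(pd.Series)

# Drop the nested column and place the expanded columns next to "key".
df_expected = pd.concat([df.drop(columns="column"), expanded], axis=1)
print(df_expected)
```

Because apply(pd.Series) keeps the original index, the concat along axis=1 lines the rows up with their keys.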

edited Dec 5, 2018 at 14:52

answered Dec 5, 2018 at 14:47


BhishanPoudel · Accepted Answer · 2018-12-05 15:24:50Z

4

This should work:

df['column'].apply(pd.Series)

Gives:

   health_1  health_2  health_3  health_4  name
0  45        60        34        60        Tom 
1  28        10        42        7         John
2  86        65        14        52        Adam

answered Dec 5, 2018 at 15:24


Scott Boston · Accepted Answer · 2018-12-05 14:51:24Z

2

Try:

pd.concat([pd.DataFrame(i, index=[0]) for i in df.column], ignore_index=True)

Output:

   health_1  health_2  health_3  health_4  name
0        45        60        34        60   Tom
1        28        10        42         7  John
2        86        65        14        52  Adam

answered Dec 5, 2018 at 14:51


user3483203 · Accepted Answer · 2018-12-05 16:13:33Z

2

The solutions using apply are going overboard. You can create your desired DataFrame using a list of dictionaries like you have in your column Series. You can easily get this list of dictionaries by using the tolist method:

res = pd.concat([df.key, pd.DataFrame(df.column.tolist())], axis=1)
print(res)

   key  health_1  health_2  health_3  health_4  name
0    1        45        60        34        60   Tom
1    2        28        10        42         7  John
2    3        86        65        14        52  Adam
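The one-liner above relies on df having the default 0, 1, 2 index, because pd.concat with axis=1 aligns rows by index label. A minimal sketch of the same tolist approach, assuming a hypothetical non-default index (10, 20, 30) to show why passing index=df.index can be a sensible safeguard:

```python
import pandas as pd

# Same data as in the question, but with a hypothetical non-default index.
df = pd.DataFrame(
    {
        "key": [1, 2, 3],
        "column": [
            {"health_1": 45, "health_2": 60, "health_3": 34, "health_4": 60, "name": "Tom"},
            {"health_1": 28, "health_2": 10, "health_3": 42, "health_4": 7, "name": "John"},
            {"health_1": 86, "health_2": 65, "health_3": 14, "health_4": 52, "name": "Adam"},
        ],
    },
    index=[10, 20, 30],
)

# Build the flat frame straight from the list of dicts and reuse the
# original index so that concat can align the rows with their keys.
flat = pd.DataFrame(df["column"].tolist(), index=df.index)
res = pd.concat([df["key"], flat], axis=1)
print(res)
```

Without index=df.index, the new frame would be labelled 0, 1, 2 and the concat would produce extra rows padded with NaN instead of three matched rows.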

edited Dec 5, 2018 at 16:13

answered Dec 5, 2018 at 16:07


Rich · Accepted Answer · 2018-12-05 14:35:19Z

0

I am not sure I understand. Isn't this already the default format for a DataFrame?

import pandas as pd
df = pd.DataFrame([
{'health_1': 45, 'health_2': 60, 'health_3': 34, 'health_4': 60, 'name': 'Tom'}   ,
{'health_1': 28, 'health_2': 10, 'health_3': 42, 'health_4': 7, 'name': 'John'}  ,
{'health_1': 86, 'health_2': 65, 'health_3': 14, 'health_4': 52, 'name': 'Adam'}
])

answered Dec 5, 2018 at 14:35


2 Comments

PolarBear10 Over a year ago

Your answer shows how I want the result to look, but unfortunately my columns are nested inside each row together with their values. This is data I get from a really bad server, and I need to convert it to a pandas DataFrame.

PolarBear10 Over a year ago

What I present above is what I have: a DataFrame with 2 columns, key and column. I would like to unpack the rows so that each key in the row becomes a column of its own.
