How to reshape pandas dataframe with pivot?

Question

I have a dataframe,

    Year  Start  End   Name       Price
0   nan   0101   0331  Squirtle    876
1  2021   0101   1231  Squirtle    200
2   nan   0101   0331  Wartortle   1000
3  2021   0101   1231  Wartortle   1200
4   nan   0101   0331  Blastoise   3100
5  2021   0101   1231  Blastoise   4200
6  2022   0101   1231  Blastoise   10000

I want to reshape it like this,

                   Name    Squirtle      Wartortle       Blastoise
Year  Start End
nan   0101  0331              876           1000            3100
2021  0101  1231              200           1200            4200
2022  0101  1231                                            10000

I tried df.pivot(index=['Year', 'Start', 'End'], columns='Name', values='Price'), but it didn't work. Any help would be appreciated!

James · Accepted Answer · 2020-01-23 21:34:02Z

4

You are pretty close. Use pivot_table instead of pivot to get the grouping you want. The one caveat is that pivot_table drops rows whose index keys are missing, so you need to replace the NA values in Year first (if they are actually NA and not the string 'nan').

df.fillna('NA').pivot_table(index=['Year', 'Start', 'End'], columns='Name', values='Price')
# returns:
Name               Blastoise  Squirtle  Wartortle
Year   Start End
2021.0 101   1231     4200.0     200.0     1200.0
2022.0 101   1231    10000.0       NaN        NaN
NA     101   331      3100.0     876.0     1000.0
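
To check which case applies before filling, a minimal sketch follows. It rebuilds the question's table twice, once with real NaN years and once with the text 'nan', and assumes the other columns are integers as in the printed output. The variable names and the numeric sentinel -1 (used here instead of 'NA' so the Year column stays numeric) are illustrative choices, not part of the answer.

```python
import numpy as np
import pandas as pd

# Columns shared by both variants, copied from the question's table.
base = {
    "Start": [101] * 7,
    "End": [331, 1231, 331, 1231, 331, 1231, 1231],
    "Name": ["Squirtle", "Squirtle", "Wartortle", "Wartortle",
             "Blastoise", "Blastoise", "Blastoise"],
    "Price": [876, 200, 1000, 1200, 3100, 4200, 10000],
}
keys = ["Year", "Start", "End"]

# Variant A: Year holds real missing values (float NaN).
df_nan = pd.DataFrame(
    {"Year": [np.nan, 2021, np.nan, 2021, np.nan, 2021, 2022], **base}
)
# Variant B: Year holds text, including the literal string 'nan'.
df_text = pd.DataFrame(
    {"Year": ["nan", "2021", "nan", "2021", "nan", "2021", "2022"], **base}
)

real_missing = int(df_nan["Year"].isna().sum())   # real NaN values are counted
text_missing = int(df_text["Year"].isna().sum())  # the string 'nan' is not missing

# Real NaN keys are dropped by pivot_table's grouping unless filled first.
wide_dropped = df_nan.pivot_table(index=keys, columns="Name", values="Price")
# Illustrative sentinel: fill only the Year column, then pivot.
wide_filled = df_nan.fillna({"Year": -1}).pivot_table(
    index=keys, columns="Name", values="Price"
)
# Text keys need no filling; 'nan' is kept as an ordinary label.
wide_text = df_text.pivot_table(index=keys, columns="Name", values="Price")

print(real_missing, text_missing)
print(wide_dropped.shape, wide_filled.shape, wide_text.shape)
```

If isna() reports no missing years, the fillna step is unnecessary and pivot_table can be called directly.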

Scott Boston · Accepted Answer · 2020-01-23 22:26:30Z

3

Use set_index and unstack:

df.set_index(['Year','Start','End','Name'])['Price'].unstack()

Output:

Name               Blastoise  Squirtle  Wartortle
Year   Start End                                 
NaN    101   331      3100.0     876.0     1000.0
2021.0 101   1231     4200.0     200.0     1200.0
2022.0 101   1231    10000.0       NaN        NaN
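
For anyone who wants to run this end to end, here is a self-contained sketch. It assumes the missing years are real NaN values and rebuilds the table from the question. The second half shows one thing to keep in mind: unlike pivot_table, unstack does not aggregate, so a repeated (Year, Start, End, Name) combination raises an error. The `dup` frame is a made-up example used only to trigger that error.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Year": [np.nan, 2021, np.nan, 2021, np.nan, 2021, 2022],
    "Start": [101] * 7,
    "End": [331, 1231, 331, 1231, 331, 1231, 1231],
    "Name": ["Squirtle", "Squirtle", "Wartortle", "Wartortle",
             "Blastoise", "Blastoise", "Blastoise"],
    "Price": [876, 200, 1000, 1200, 3100, 4200, 10000],
})
index_cols = ["Year", "Start", "End", "Name"]

# Move the three row keys plus Name into the index, then pivot Name
# (the innermost level) out into columns.
wide = df.set_index(index_cols)["Price"].unstack()
print(wide)

# Hypothetical duplicate: repeat one row so a key combination occurs twice.
dup = pd.concat([df, df.iloc[[1]]], ignore_index=True)
try:
    dup.set_index(index_cols)["Price"].unstack()
    duplicate_error = None
except ValueError as exc:
    # unstack refuses to reshape when index entries are not unique.
    duplicate_error = str(exc)
print(duplicate_error)
```

If the data can contain such duplicates, pivot_table (which averages them by default) may be the better fit.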

Kaj · Accepted Answer · 2020-01-23 22:14:07Z

I think you mistakenly used pivot instead of pivot_table.

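import numpy as np
import pandas as pd

# np.nan marks the missing years (the np.NaN alias was removed in NumPy 2.0)
data = [[np.nan, 101, 331, 'Squirtle', 876],
[2021, 101, 1231, 'Squirtle', 200],
[np.nan, 101, 331, 'Wartortle', 1000],
[2021, 101, 1231, 'Wartortle', 1200],
[np.nan, 101, 331, 'Blastoise', 3100],
[2021, 101, 1231, 'Blastoise', 4200],
[2022, 101, 1231, 'Blastoise', 10000]]
df = pd.DataFrame(data, columns=['Year', 'Start', 'End', 'Name', 'Price'])

df.pivot_table(index=['Year', 'Start', 'End'], columns='Name', values='Price')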

Outputs (the rows where Year is NaN are dropped):

Name               Blastoise  Squirtle  Wartortle
Year   Start End                                 
2021.0 101   1231     4200.0     200.0     1200.0
2022.0 101   1231    10000.0       NaN        NaN

Whereas if you first replace the missing values with a placeholder such as 1000:

df = df.fillna(1000)
df.pivot_table(index=['Year', 'Start', 'End'], columns='Name', values='Price')

This gives the desired result:

Name               Blastoise  Squirtle  Wartortle
Year   Start End                                 
1000.0 101   331      3100.0     876.0     1000.0
2021.0 101   1231     4200.0     200.0     1200.0
2022.0 101   1231    10000.0       NaN        NaN
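
The placeholder only works if it cannot be confused with a real key: 1000 is safe here because no actual Year equals 1000. A minimal sketch of a guard for that follows. The helper name `pivot_with_placeholder` is hypothetical (not a pandas function), and the sketch assumes the column names from the question.

```python
import numpy as np
import pandas as pd

def pivot_with_placeholder(df, keys, placeholder):
    """Fill missing key values with `placeholder`, then pivot Name into columns.

    Hypothetical helper: it refuses a placeholder that already appears in a
    key column, because filled rows would then merge with real ones.
    """
    for key in keys:
        if df[key].eq(placeholder).any():
            raise ValueError(
                f"placeholder {placeholder!r} already used in column {key!r}"
            )
    # Fill only the key columns, leaving Name and Price untouched.
    filled = df.fillna({key: placeholder for key in keys})
    return filled.pivot_table(index=keys, columns="Name", values="Price")

df = pd.DataFrame({
    "Year": [np.nan, 2021, np.nan, 2021, np.nan, 2021, 2022],
    "Start": [101] * 7,
    "End": [331, 1231, 331, 1231, 331, 1231, 1231],
    "Name": ["Squirtle", "Squirtle", "Wartortle", "Wartortle",
             "Blastoise", "Blastoise", "Blastoise"],
    "Price": [876, 200, 1000, 1200, 3100, 4200, 10000],
})
keys = ["Year", "Start", "End"]

wide = pivot_with_placeholder(df, keys, 1000)
print(wide)

# Using a real year as the placeholder is rejected.
try:
    pivot_with_placeholder(df, keys, 2021)
    collision_error = None
except ValueError as exc:
    collision_error = str(exc)
print(collision_error)
```

Without the guard, filling with 2021 would silently average the NaN-year prices into the real 2021 row.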
