2

I have a dataframe,

    Year  Start  End   Name       Price
0   nan   0101   0331  Squirtle    876
1  2021   0101   1231  Squirtle    200
2   nan   0101   0331  Wartortle   1000
3  2021   0101   1231  Wartortle   1200
4   nan   0101   0331  Blastoise   3100
5  2021   0101   1231  Blastoise   4200
6  2022   0101   1231  Blastoise   10000

I want to reshape it like this,

                   Name    Squirtle      Wartortle       Blastoise
Year  Start End
nan   0101  0331              876           1000            3100
2021  0101  1231              200           1200            4200
2022  0101  1231                                            10000

I tried, df.pivot(index=['Year', 'Start', 'End'], columns='Name', values='Price'). But didn't get any luck. Any help would be appreciated!

3 Answers 3

4

You are pretty close. Use pivot_table instead of pivot to get the grouping you want. The only caveat is you will need to replace the NA values (if they are actually NA and not the string 'nan').

df.fillna('NA').pivot_table(index=['Year', 'Start', 'End'], columns='Name', values='Price')
# returns:
Name               Blastoise  Squirtle  Wartortle
Year   Start End
2021.0 101   1231     4200.0     200.0     1200.0
2022.0 101   1231    10000.0       NaN        NaN
NA     101   331      3100.0     876.0     1000.0
Sign up to request clarification or add additional context in comments.

Comments

3

Use set_index and unstack:

df.set_index(['Year','Start','End','Name'])['Price'].unstack()

Output:

Name               Blastoise  Squirtle  Wartortle
Year   Start End                                 
NaN    101   331      3100.0     876.0     1000.0
2021.0 101   1231     4200.0     200.0     1200.0
2022.0 101   1231    10000.0       NaN        NaN

Comments

2

I think you mistakenly used pivot instead of pivot_table.

data = [[np.NaN, 101, 331, 'Squirtle', 876],
[2021, 101, 1231, 'Squirtle', 200],
[np.NaN, 101, 331, 'Wartortle', 1000],
[2021, 101, 1231, 'Wartortle', 1200],
[np.NaN, 101, 331, 'Blastoise', 3100],
[2021, 101, 1231, 'Blastoise', 4200],
[2022, 101, 1231, 'Blastoise', 10000]]

df.pivot_table(index=['Year', 'Start', 'End'], columns='Name', values='Price')

Outputs:

Name               Blastoise  Squirtle  Wartortle
Year   Start End                                 
2021.0 101   1231     4200.0     200.0     1200.0
2022.0 101   1231    10000.0       NaN        NaN

Whereas if you replace the values with a placeholder value such as 1000

df = df.fillna(1000)
df.pivot_table(index=['Year', 'Start', 'End'], columns='Name', values='Price')

You would get what you desire of:

Name               Blastoise  Squirtle  Wartortle
Year   Start End                                 
1000.0 101   331      3100.0     876.0     1000.0
2021.0 101   1231     4200.0     200.0     1200.0
2022.0 101   1231    10000.0       NaN        NaN

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.