3

I have a table like:

country | name  | medals_won | year
-----------------------------------
US      | sarah |      1     | 2010
US      | sarah |      2     | 2011
US      | sarah |      5     | 2015
US      | alice |      3     | 2010
US      | alice |      4     | 2012
US      | alice |      1     | 2015
AU      | jones |      2     | 2013
AU      | jones |      8     | 2015

I want it like:

country | name  | 2010 | 2011 | 2012 | 2013 | 2014 | 2015
---------------------------------------------------------
US      | sarah | 1    | 2    | 0    | 0    | 0    | 5
US      | alice | 3    | 0    | 4    | 0    | 0    | 1
AU      | jones | 0    | 0    | 0    | 2    | 0    | 8

I've tinkered with df.apply, or even brute-force iteration, but you can probably guess that the tricky part is that these row values aren't strictly sequential, so this isn't a simple transpose operation (nobody won any medals in 2014, for e.g., but I want the resulting table to show that in a column full of zeros).

2 Answers 2

6

You can use set_index + unstack:

df = df.set_index(['country','name','year'])['medals_won'].unstack(fill_value=0)
print (df)
year           2010  2011  2012  2013  2015
country name                               
AU      jones     0     0     0     2     8
US      alice     3     0     4     0     1
        sarah     1     2     0     0     5

If duplicates need aggregation like mean, sum... with pivot_table or groupby + aggregate function + unstack:

print (df)
  country   name  medals_won  year
0      US  sarah           1  2010 <-same US  sarah 2010, different 1
1      US  sarah           4  2010 <-same US  sarah 2010, different 4
2      US  sarah           2  2011
3      US  sarah           5  2015
4      US  alice           3  2010
5      US  alice           4  2012
6      US  alice           1  2015
7      AU  jones           2  2013
8      AU  jones           8  2015

df = df.pivot_table(index=['country','name'], 
                    columns='year', 
                    values='medals_won', 
                    fill_value=0, 
                    aggfunc='mean')
print (df)
year           2010  2011  2012  2013  2015
country name                               
AU      jones   0.0     0     0     2     8
US      alice   3.0     0     4     0     1
        sarah   2.5     2     0     0     5 <- (1+4)/2 = 2.5

Alternatively:

df = df.groupby(['country','name','year'])['medals_won'].mean().unstack(fill_value=0)
print (df)
year           2010  2011  2012  2013  2015
country name                               
AU      jones   0.0   0.0   0.0   2.0   8.0
US      alice   3.0   0.0   4.0   0.0   1.0
        sarah   2.5   2.0   0.0   0.0   5.0

Last:

df = df.reset_index().rename_axis(None, axis=1)
print (df)
  country   name  2010  2011  2012  2013  2015
0      AU  jones     0     0     0     2     8
1      US  alice     3     0     4     0     1
2      US  sarah     1     2     0     0     5
Sign up to request clarification or add additional context in comments.

2 Comments

Now that is just greedy taking all the answers. :-)
Yes, but always I think is necessary explain difference between solutions - if automatically use always pivot_table and no idea about aggreagtion, then all works nice but if data changed people are surprised whats going on. But If know there is aggreagtion, no problem. ;)
-1

You can use the pivot_table() function of pandas and fill nan values with zero using pd.fillna(0)

    df = pd.DataFrame({
        'country' : pd.Series(['US', 'US', 'US', 'US', 'US', 'US', 'AU', 'AU']),
        'name' : pd.Series(['sarah', 'sarah','sarah','alice','alice','alice','jones','jones']),
        'medals_won' : pd.Series([1,2,5,3,4,1,2,8]),
        'year': pd.Series([2010,2011,2015,2010,2012,2015,2013,2015])    
        })
    pd.pivot_table(df, index=['country','name'], columns='year', aggfunc='sum').fillna(0)

my output

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.