1

I have this following dataframe:

  Status    Percentage  Value   Name    Tahun
0   X       66.666667    4.0     A      2021
1   Y       33.333333    2.0     A      2021
2   Z       0.000000     0.0     A      2021
0   X       25.000000    2.0     A      2020
1   Y       62.500000    5.0     A      2020
2   Z       12.500000    1.0     A      2020           

I want to transpose the dataframe and change the column header to Status values. Ideally the output should look like

X            Y           Z          Type         Name    Tahun
66.666667    33.333333   0.000000   Percentage    A       2021 
4.0          2.0         0.0        Value         A       2021
25.000000    62.500000   12.500000  Percentage    A       2020
2.0          5.0         1.0        Value         A       2020                             
               

I tried this one:

df = df.set_index('Status').T

but I didnt get output as my expected. How can I change the rest of column names?

2 Answers 2

1

stack (Percentage and Value) + unstack (Status):

(df.set_index(['Name', 'Tahun', 'Status'])
   .stack()
   .unstack(level='Status')
   .rename_axis(('Name', 'Tahun', 'Type'))
   .reset_index())

Status Name  Tahun        Type          X          Y     Z
0         A   2020  Percentage  25.000000  62.500000  12.5
1         A   2020       Value   2.000000   5.000000   1.0
2         A   2021  Percentage  66.666667  33.333333   0.0
3         A   2021       Value   4.000000   2.000000   0.0
Sign up to request clarification or add additional context in comments.

Comments

0

Or just use melt and pivot:

(df.melt(['Name', 'Tahun', 'Status'], var_name='Type')
   .pivot('value', ['Name', 'Tahun', 'Type'], 'Status')
   .reset_index()
   .rename_axis(columns=None))

  Name  Tahun        Type          X          Y     Z
0    A   2020  Percentage  25.000000  62.500000  12.5
1    A   2020       Value   2.000000   5.000000   1.0
2    A   2021  Percentage  66.666667  33.333333   0.0
3    A   2021       Value   4.000000   2.000000   0.0

This code melts the dataframe so that the Percentage and Value columns get merged and a new column Type get's created, then it pivots it so that the Status column values become columns.

If there are duplicates:

(df.melt(['Name', 'Tahun', 'Status'], var_name='Type')
   .pivot_table('value', ['Name', 'Tahun', 'Type'], 'Status')
   .reset_index()
   .rename_axis(columns=None))

Difference is that pivot_table has an aggfunc argument, default set to mean, so if there are duplicate values, it will find the average of the other values, whereas pivot doesn't have that argument.

9 Comments

@jezrael Ah, edited my answer now
@jezrael Then what should be aggfunc?
@jezrael What? I don't really understand what you say... :)
@jezrael What if I do aggfunc=pd.Series?
Super, it is exactly what missing here. Btw, your answers are very ofteen missing explanation :(
|

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.