I have a Pandas dataframe that looks like this:

      X1     X1     X1     X2     X2    X2
ABC   12.4   34.3   25.4   29.3   53.2  38.9
DEF   22.3   28.6   32.8   24.6   29.4  25.3

The left column is the index, and the top values are column labels. I am trying to swap the column names and index so that it looks like this:

      ABC    ABC    ABC    DEF    DEF    DEF
X1    12.4   34.3   25.4   22.3   28.6   32.8
X2    29.3   53.2   38.9   24.6   29.4   25.3

I can get the axes switched using stack and unstack if I add a numbered index, but then the replicates are listed vertically instead of horizontally. I can't figure out how to keep the individual replicates side by side, which is necessary for what I am trying to do with the table. The replicates need to stay separate; I do not want an aggregate such as the average or sum.

Any help/suggestions would be greatly appreciated.

Thanks!

edit:

This code gives a dataframe that is similar in structure to my actual data but with fewer columns:

import pandas as pd

names = ["G1","G2","G3","G4", "G5", "G6", "G7", "G8"]
df = pd.DataFrame([(7.345,"NaN","NaN",239.947,295.893,349.834),(13.872,"NaN","NaN",20.485,14.852,29.598),(764.298,"NaN","NaN",492.854,432.943,539.950),(0.00385,"NaN","NaN",0.184,0.384,0.285),(285.836,"NaN","NaN",495.284,395.486,368.952),(7.385,"NaN","NaN",5.293,4.295,4.692),(21.693,"NaN","NaN",25.843,15.843,15.386),(8.583,"NaN","NaN",4.397,6.295,6.39)], index=names, columns=["S1", "S1", "S1", "482.1", "482.1", "482.1"])

Giving this dataframe:

           S1   S1   S1    482.1    482.1    482.1
G1    7.34500  NaN  NaN  239.947  295.893  349.834
G2   13.87200  NaN  NaN   20.485   14.852   29.598
G3  764.29800  NaN  NaN  492.854  432.943  539.950
G4    0.00385  NaN  NaN    0.184    0.384    0.285
G5  285.83600  NaN  NaN  495.284  395.486  368.952
G6    7.38500  NaN  NaN    5.293    4.295    4.692
G7   21.69300  NaN  NaN   25.843   15.843   15.386
G8    8.58300  NaN  NaN    4.397    6.295    6.390

Running:

df2 = df.copy()
m = dict(zip(df2.index.unique(), df2.columns.unique()))
df2.index = df2.index.map(m.get)
df2.columns = df2.columns.map({v : k for k, v in m.items()}.get)

gives:

              G1   G1   G1       G2       G2       G2
S1       7.34500  NaN  NaN  239.947  295.893  349.834
482.1   13.87200  NaN  NaN   20.485   14.852   29.598
NaN    764.29800  NaN  NaN  492.854  432.943  539.950
NaN      0.00385  NaN  NaN    0.184    0.384    0.285
NaN    285.83600  NaN  NaN  495.284  395.486  368.952
NaN      7.38500  NaN  NaN    5.293    4.295    4.692
NaN     21.69300  NaN  NaN   25.843   15.843   15.386
NaN      8.58300  NaN  NaN    4.397    6.295    6.390

The column and index labels have moved, but the data associated with them have not, and several columns are missing. Running:

df2 = df.copy()
m = dict(zip(df2.index.unique(), df2.columns.unique()))
df2 = df2.rename(index=m, columns={v : k for k, v in m.items()})

gives:

              G1   G1   G1       G2       G2       G2
S1       7.34500  NaN  NaN  239.947  295.893  349.834
482.1   13.87200  NaN  NaN   20.485   14.852   29.598
G3     764.29800  NaN  NaN  492.854  432.943  539.950
G4       0.00385  NaN  NaN    0.184    0.384    0.285
G5     285.83600  NaN  NaN  495.284  395.486  368.952
G6       7.38500  NaN  NaN    5.293    4.295    4.692
G7      21.69300  NaN  NaN   25.843   15.843   15.386
G8       8.58300  NaN  NaN    4.397    6.295    6.390

This is also wrong for similar reasons.
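For reference, a minimal sketch of why both attempts only relabel the axes, assuming the same index and column labels as the example frame: `zip` stops at the shorter of its inputs, so the mapping holds just two pairs, and neither `map` nor `rename` moves any values. The variable names `index`, `columns`, and `unmapped` are illustrative only.

```python
import pandas as pd

index = pd.Index(["G1", "G2", "G3", "G4", "G5", "G6", "G7", "G8"])
columns = pd.Index(["S1", "S1", "S1", "482.1", "482.1", "482.1"])

# zip stops at the shorter input: 8 unique index labels, 2 unique column labels
m = dict(zip(index.unique(), columns.unique()))
print(m)  # {'G1': 'S1', 'G2': '482.1'}

# Only G1 and G2 have a mapping; the other six index labels have none,
# so map(m.get) leaves them empty and rename leaves them unchanged
unmapped = [label for label in index if label not in m]
print(unmapped)  # ['G3', 'G4', 'G5', 'G6', 'G7', 'G8']
```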

  • What if you only have two rows, but you have X1 X1 X1 X2 X2 X3 as columns? Commented Jan 5, 2018 at 22:50
  • Personally I think both representations will lead to a lot of problems. Usually it is not a good idea that columns/indices are repeated. Commented Jan 5, 2018 at 22:59
  • I realize that if I were going to manipulate the data further in Python, I would need a different solution. The output is going to be used in a GUI program that requires this format, though. Commented Jan 7, 2018 at 0:19
  • Can you add your expected output for this data? It seems I've badly misunderstood the intent of your question (my apologies). Commented Jan 8, 2018 at 22:20

1 Answer

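# Collect each label's replicates into a list, expand the lists to columns, then pivot ABC/DEF back into the columns.
New_df = (df.T.groupby(level=0).agg(lambda x: x.values.tolist())
          .stack().apply(pd.Series).unstack()
          .sort_index(level=1, axis=1))
New_df.columns = New_df.columns.droplevel(level=0)  # drop the replicate-number level
New_df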
Out[229]: 
     ABC   ABC   ABC   DEF   DEF   DEF
X1  12.4  34.3  25.4  22.3  28.6  32.8
X2  29.3  53.2  38.9  24.6  29.4  25.3
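
An alternative sketch, not part of the one-liner above: number the replicates explicitly, so that each column is uniquely identified by a (label, replicate) pair, and then transpose and unstack the replicate level. It assumes the small frame from the question; the names `labels`, `rep`, `wide`, and `out` and the level names `"label"` and `"rep"` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    [[12.4, 34.3, 25.4, 29.3, 53.2, 38.9],
     [22.3, 28.6, 32.8, 24.6, 29.4, 25.3]],
    index=["ABC", "DEF"],
    columns=["X1", "X1", "X1", "X2", "X2", "X2"],
)

# Number the replicates within each column label: 0, 1, 2, 0, 1, 2
labels = pd.Series(df.columns)
rep = labels.groupby(labels).cumcount()

# Make the columns unique by pairing each label with its replicate number
wide = df.copy()
wide.columns = pd.MultiIndex.from_arrays(
    [labels.to_numpy(), rep.to_numpy()], names=["label", "rep"]
)

# Transpose so (label, rep) is the row index, then move "rep" into the
# columns; the replicates of each sample end up side by side
out = wide.T.unstack("rep")

# Drop the helper replicate numbers, leaving duplicated ABC/DEF labels
out.columns = out.columns.get_level_values(0)
out.index.name = None
print(out)
```

One thing to watch: `unstack` should return the row labels in sorted order, so with labels like `S1` and `482.1` the rows may not come out in their original order and might need a `reindex` afterwards.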

2 Comments

Nice, you got it!
@cᴏʟᴅsᴘᴇᴇᴅ This is not a common question, since duplicated column names will create a lot of issues.
