I have an issue reshaping a pandas DataFrame. It looks like this (the number of rows and columns can vary):

columns       col1        col2       col3       col4
Species                                                
sp1     218.000000  521.000000 533.000000 793.000000
sp1       0.105569    0.252300   0.258111   0.384019
sp1              2           2          2          3
sp2     225.000000  521.000000 540.000000 800.000000
sp2       0.107862    0.249760   0.258869   0.383509
sp2              2           2          2          3
sp3     217.000000  477.000000 512.000000 725.000000
sp3       0.112377    0.247022   0.265148   0.375453
sp3              1           1          3          3

The column Species is my index. I want to reshape it like this:

Species columns          c        f p
sp1        col1 218.000000 0.105569 2
sp1        col2 521.000000 0.252300 2
sp1        col3 533.000000 0.258111 2
sp1        col4 793.000000 0.384019 3
sp2
sp2
sp2
sp2
sp3                         etc
sp3
sp3
sp3

But I can't figure out how to do it.

The purpose is to then make a heatmap with the p.rect() function of bokeh, the x-axis being the columns c or f, the y-axis being the column Species. The size of the rectangle would be determined by the column p.

Thanks in advance.

1 Answer

First create a MultiIndex by labelling each row with its position modulo 3, then reshape with stack and unstack:

import numpy as np

# label the three rows of each species as c, f and p
c = np.array(['c','f','p'])
df.index = [df.index, c[np.arange(len(df.index)) % 3]]
print(df)
columns          col1        col2        col3        col4
Species                                                  
sp1     c  218.000000  521.000000  533.000000  793.000000
        f    0.105569    0.252300    0.258111    0.384019
        p    2.000000    2.000000    2.000000    3.000000
sp2     c  225.000000  521.000000  540.000000  800.000000
        f    0.107862    0.249760    0.258869    0.383509
        p    2.000000    2.000000    2.000000    3.000000
sp3     c  217.000000  477.000000  512.000000  725.000000
        f    0.112377    0.247022    0.265148    0.375453
        p    1.000000    1.000000    3.000000    3.000000

# move col1..col4 into the index, then spread c/f/p out into columns
df = df.stack().unstack(1).reset_index()
print(df)
   Species columns      c         f    p
0      sp1    col1  218.0  0.105569  2.0
1      sp1    col2  521.0  0.252300  2.0
2      sp1    col3  533.0  0.258111  2.0
3      sp1    col4  793.0  0.384019  3.0
4      sp2    col1  225.0  0.107862  2.0
5      sp2    col2  521.0  0.249760  2.0
6      sp2    col3  540.0  0.258869  2.0
7      sp2    col4  800.0  0.383509  3.0
8      sp3    col1  217.0  0.112377  1.0
9      sp3    col2  477.0  0.247022  1.0
10     sp3    col3  512.0  0.265148  3.0
11     sp3    col4  725.0  0.375453  3.0
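
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. The frame is rebuilt by hand from the sample in the question, and the names `labels` and `long_df` are my own. It assumes every species has exactly three rows, in the order c, f, p. The final cast of p back to integers is an optional extra, not part of the steps above.

```python
import numpy as np
import pandas as pd

# Rebuild the question's frame: three rows per species, in the order c, f, p.
values = [
    [218.0, 521.0, 533.0, 793.0],
    [0.105569, 0.252300, 0.258111, 0.384019],
    [2, 2, 2, 3],
    [225.0, 521.0, 540.0, 800.0],
    [0.107862, 0.249760, 0.258869, 0.383509],
    [2, 2, 2, 3],
    [217.0, 477.0, 512.0, 725.0],
    [0.112377, 0.247022, 0.265148, 0.375453],
    [1, 1, 3, 3],
]
species = pd.Index(np.repeat(["sp1", "sp2", "sp3"], 3), name="Species")
columns = pd.Index(["col1", "col2", "col3", "col4"], name="columns")
df = pd.DataFrame(values, index=species, columns=columns)

# Position of each row modulo 3 picks its label: 0 -> c, 1 -> f, 2 -> p.
labels = np.array(["c", "f", "p"])[np.arange(len(df)) % 3]
df.index = pd.MultiIndex.from_arrays([df.index, labels], names=["Species", None])

# stack() moves col1..col4 into the index; unstack(1) spreads c/f/p into columns.
long_df = df.stack().unstack(1).reset_index()

# Optional: p went through a float frame, so cast it back to integers.
long_df["p"] = long_df["p"].astype(int)

print(long_df)
```

If the number of rows per species is not always three, the `% 3` labelling no longer lines up and the labels would have to be derived some other way.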