1

I have an empty DataFrame with Multi-Index index and columns. I also have list of strings that is cordinates of second level indexes. Since all of my second level index are unique, I am hoping to find cordinates and input values with my list of strings. Take a look at below example

df=
       DNA      Cat2                                 ....   
       Item     A   B   C   D   E   F   F   H   I   J   
DNA   Item
Cat2  A         0   0   0   0   0   0   0   0   0   0 
      B         0   0   0   0   0   0   0   0   0   0 
      C         0   0   0   0   0   0   0   0   0   0 
      D         0   0   0   0   0   0   0   0   0   0 
      E         0   0   0   0   0   0   0   0   0   0 
      F         0   0   0   0   0   0   0   0   0   0 
....

str_cord = [(A,B),(A,H),(A,I),(B,H),(B,I),(H,I)]
#and my output should be like below.

df_result=
       DNA      Cat2                                 ....   
       Item     A   B   C   D   E   F   F   H   I   J   
DNA   Item
Cat2  A         0   1   0   0   0   0   0   1   1   0 
      B         0   0   0   0   0   0   0   1   1   0 
      C         0   0   0   0   0   0   0   0   0   0 
      D         0   0   0   0   0   0   0   0   0   0 
      E         0   0   0   0   0   0   0   0   0   0 
      F         0   0   0   0   0   0   0   0   0   0 
      H         0   0   0   0   0   0   0   0   1   0
....

It looks kinda complicated, but all I want to do is use my str_cord[0] as my cordinate for df_result. I tried with .loc, but it seems like I need to input level 1 index. I am looking for the way that I do not have to input Multi-Index level1 and find cordinates with level2 strings. Hope it make sense and thanks in advance! (Oh the data itself is very big, so as efficient as possible)

1
  • Very difficult to reproduce this dataframe. Commented Jan 4, 2018 at 11:29

1 Answer 1

1

You can use:

for i, j in str_cord:
    idx = pd.IndexSlice
    df.loc[idx[:, i], idx[:, j]] = 1

Sample:

L = list('ABCDEFGHIJ')
mux = pd.MultiIndex.from_product([['Cat1','Cat2'], L])

df = pd.DataFrame(0, index=mux, columns=mux)
print (df)
       Cat1                            Cat2                           
          A  B  C  D  E  F  G  H  I  J    A  B  C  D  E  F  G  H  I  J
Cat1 A    0  0  0  0  0  0  0  0  0  0    0  0  0  0  0  0  0  0  0  0
     B    0  0  0  0  0  0  0  0  0  0    0  0  0  0  0  0  0  0  0  0
     C    0  0  0  0  0  0  0  0  0  0    0  0  0  0  0  0  0  0  0  0
     D    0  0  0  0  0  0  0  0  0  0    0  0  0  0  0  0  0  0  0  0
     E    0  0  0  0  0  0  0  0  0  0    0  0  0  0  0  0  0  0  0  0
     F    0  0  0  0  0  0  0  0  0  0    0  0  0  0  0  0  0  0  0  0
     G    0  0  0  0  0  0  0  0  0  0    0  0  0  0  0  0  0  0  0  0
     H    0  0  0  0  0  0  0  0  0  0    0  0  0  0  0  0  0  0  0  0
     I    0  0  0  0  0  0  0  0  0  0    0  0  0  0  0  0  0  0  0  0
     J    0  0  0  0  0  0  0  0  0  0    0  0  0  0  0  0  0  0  0  0
Cat2 A    0  0  0  0  0  0  0  0  0  0    0  0  0  0  0  0  0  0  0  0
     B    0  0  0  0  0  0  0  0  0  0    0  0  0  0  0  0  0  0  0  0
     C    0  0  0  0  0  0  0  0  0  0    0  0  0  0  0  0  0  0  0  0
     D    0  0  0  0  0  0  0  0  0  0    0  0  0  0  0  0  0  0  0  0
     E    0  0  0  0  0  0  0  0  0  0    0  0  0  0  0  0  0  0  0  0
     F    0  0  0  0  0  0  0  0  0  0    0  0  0  0  0  0  0  0  0  0
     G    0  0  0  0  0  0  0  0  0  0    0  0  0  0  0  0  0  0  0  0
     H    0  0  0  0  0  0  0  0  0  0    0  0  0  0  0  0  0  0  0  0
     I    0  0  0  0  0  0  0  0  0  0    0  0  0  0  0  0  0  0  0  0
     J    0  0  0  0  0  0  0  0  0  0    0  0  0  0  0  0  0  0  0  0

str_cord = [('A','B'),('A','H'),('A','I'),('B','H'),('B','I'),('H','I')]

for i, j in str_cord:
    idx = pd.IndexSlice
    df.loc[idx[:, i], idx[:, j]] = 1

print (df)
       Cat1                            Cat2                           
          A  B  C  D  E  F  G  H  I  J    A  B  C  D  E  F  G  H  I  J
Cat1 A    0  1  0  0  0  0  0  1  1  0    0  1  0  0  0  0  0  1  1  0
     B    0  0  0  0  0  0  0  1  1  0    0  0  0  0  0  0  0  1  1  0
     C    0  0  0  0  0  0  0  0  0  0    0  0  0  0  0  0  0  0  0  0
     D    0  0  0  0  0  0  0  0  0  0    0  0  0  0  0  0  0  0  0  0
     E    0  0  0  0  0  0  0  0  0  0    0  0  0  0  0  0  0  0  0  0
     F    0  0  0  0  0  0  0  0  0  0    0  0  0  0  0  0  0  0  0  0
     G    0  0  0  0  0  0  0  0  0  0    0  0  0  0  0  0  0  0  0  0
     H    0  0  0  0  0  0  0  0  1  0    0  0  0  0  0  0  0  0  1  0
     I    0  0  0  0  0  0  0  0  0  0    0  0  0  0  0  0  0  0  0  0
     J    0  0  0  0  0  0  0  0  0  0    0  0  0  0  0  0  0  0  0  0
Cat2 A    0  1  0  0  0  0  0  1  1  0    0  1  0  0  0  0  0  1  1  0
     B    0  0  0  0  0  0  0  1  1  0    0  0  0  0  0  0  0  1  1  0
     C    0  0  0  0  0  0  0  0  0  0    0  0  0  0  0  0  0  0  0  0
     D    0  0  0  0  0  0  0  0  0  0    0  0  0  0  0  0  0  0  0  0
     E    0  0  0  0  0  0  0  0  0  0    0  0  0  0  0  0  0  0  0  0
     F    0  0  0  0  0  0  0  0  0  0    0  0  0  0  0  0  0  0  0  0
     G    0  0  0  0  0  0  0  0  0  0    0  0  0  0  0  0  0  0  0  0
     H    0  0  0  0  0  0  0  0  1  0    0  0  0  0  0  0  0  0  1  0
     I    0  0  0  0  0  0  0  0  0  0    0  0  0  0  0  0  0  0  0  0
     J    0  0  0  0  0  0  0  0  0  0    0  0  0  0  0  0  0  0  0  0
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.