2

Say you have a multiindex DataFrame

    x y z
a 1 0 1 2
  2 3 4 5
b 1 0 1 2
  2 3 4 5
  3 6 7 8
c 1 0 1 2
  2 0 4 6

Now you have another DataFrame which is

 col1 col2
0 a    1
1 b    1
2 b    3
3 c    1
4 c    2

How do you split the multiindex DataFrame based on the one above?

3 Answers 3

2

Use loc by tuples:

df = df1.loc[df2.set_index(['col1','col2']).index.tolist()]
print (df)
     x  y  z
a 1  0  1  2
b 1  0  1  2
  3  6  7  8
c 1  0  1  2
  2  0  4  6

df = df1.loc[[tuple(x) for x in df2.values.tolist()]]
print (df)
     x  y  z
a 1  0  1  2
b 1  0  1  2
  3  6  7  8
c 1  0  1  2
  2  0  4  6

Or join:

df = df2.join(df1, on=['col1','col2']).set_index(['col1','col2'])
print (df)
           x  y  z
col1 col2         
a    1     0  1  2
b    1     0  1  2
     3     6  7  8
c    1     0  1  2
     2     0  4  6
Sign up to request clarification or add additional context in comments.

Comments

1

Simply using isin

df[df.index.isin(list(zip(df2['col1'],df2['col2'])))]


Out[342]: 
               0  1  2  3
index1 index2            
a      1       1  0  1  2
b      1       1  0  1  2
       3       3  6  7  8
c      1       1  0  1  2
       2       2  0  4  6

Comments

0

You can also do this using the MultiIndex reindex method https://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.reindex.html

## Recreate your dataframes
tuples = [('a', 1), ('a', 2),
              ('b', 1), ('b', 2),
              ('b', 3), ('c', 1),
              ('c', 2)]

data = [[1, 0, 1, 2],
        [2, 3, 4, 5],
        [1, 0, 1, 2],
        [2, 3, 4, 5],
        [3, 6, 7, 8],
        [1, 0, 1, 2],
        [2, 0, 4, 6]]

idx = pd.MultiIndex.from_tuples(tuples, names=['index1','index2'])
df= pd.DataFrame(data=data, index=idx)

df2 = pd.DataFrame([['a', 1],
                   ['b', 1],
                   ['b', 3],
                   ['c', 1],
                   ['c', 2]])

# Answer Question
idx_subset = pd.MultiIndex.from_tuples([(a, b) for a, b in df2.values], names=['index1', 'index2'])
out = df.reindex(idx_subset)
print(out)
                   0  1  2  3
index1 index2            
a      1       1  0  1  2
b      1       1  0  1  2
       3       3  6  7  8
c      1       1  0  1  2
       2       2  0  4  6

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.