3

I have a multi-index dataframe that looks like

uid tid text

abc x t1

bcd y t2

uid and tid are the indexes. I have a list of uids, and want to get the rows corresponding to the uids in that list, but keeping the 2nd level index values (tid). I want to do it without running any explicit loop. Is that possible?

0

1 Answer 1

3

Data:

L = ['abc', 'bcd']

print (df)
         text
uid  tid     
abc  x     t1
abc1 x     t1
bcd  y     t2

1.slicers

idx = pd.IndexSlice
df1 = df.loc[idx[L,:],:]

2.boolean indexing + mask with get_level_values + isin:

df1 = df[df.index.get_level_values(0).isin(L)]

3.query, docs:

df1 = df.query('@L in uid')
print (df1)
        text
uid tid     
abc x     t1
bcd y     t2
Sign up to request clarification or add additional context in comments.

2 Comments

thanks @jezrael. With approach one, I get 'MultiIndex Slicing requires the index to be fully lexsorted tuple len (2), lexsort depth (0)' error. But 2 works. Which is the most efficient? I have a huge data frame.
the fastest is sorting first by df = df.sort_index() and then use first method. sorting is explained 'In 97'

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.