When I slice into a MultiIndex DataFrame by a level 0 index value, I want to know the possible level 1+ index values that fall under that initial value. If my wording doesn't make sense, here's an example:

>>> import numpy as np
>>> import pandas as pd
>>> arrays = [['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
... ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two'],
... ['a','b','a','b','b','b','b','b']]
>>> tuples = list(zip(*arrays))
>>> index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second','third'])
>>> s = pd.Series(np.random.randn(8), index=index)
>>> s
first  second  third
bar    one     a       -0.598684
       two     b        0.351421
baz    one     a       -0.618285
       two     b       -1.175418
foo    one     b       -0.093806
       two     b        1.092197
qux    one     b       -1.515515
       two     b        0.741408
dtype: float64

s's index looks like:

>>> s.index
MultiIndex(levels=[[u'bar', u'baz', u'foo', u'qux'], [u'one', u'two'], [u'a', u'b']],
           labels=[[0, 0, 1, 1, 2, 2, 3, 3], [0, 1, 0, 1, 0, 1, 0, 1], [0, 1, 0, 1, 1, 1, 1, 1]],
           names=[u'first', u'second', u'third'])

When I take just the section of s whose first index value is foo and look up its index, I get:

>>> s_foo = s.loc['foo']
>>> s_foo
second  third
one     b       -0.093806
two     b        1.092197
dtype: float64

>>> s_foo.index
MultiIndex(levels=[[u'one', u'two'], [u'a', u'b']],
           labels=[[0, 1], [1, 1]],
           names=[u'second', u'third'])

I want the index of s_foo to behave as if the higher level of s did not exist. Yet the levels attribute of s_foo.index still lists a as a potential value of third, even though s_foo only contains b.

Essentially, I want to find all the possible third values of s_foo, i.e. b and only b. Right now I do set(s_foo.reset_index()['third']), but I was hoping for a more elegant solution.

2 Answers

You can create s_foo and explicitly drop the unused levels:

s_foo = s.loc['foo']
s_foo.index = s_foo.index.remove_unused_levels()
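To see the effect end to end, here is a minimal sketch that rebuilds the Series from the question and compares the levels before and after the call. The values come from np.arange instead of random numbers so the run is reproducible, and the names stale_levels, trimmed_levels and third_values are only illustrative. It also shows get_level_values, which reads the labels actually present and so does not depend on the stored levels at all.

```python
import numpy as np
import pandas as pd

arrays = [
    ["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"],
    ["one", "two", "one", "two", "one", "two", "one", "two"],
    ["a", "b", "a", "b", "b", "b", "b", "b"],
]
index = pd.MultiIndex.from_arrays(arrays, names=["first", "second", "third"])
# Deterministic values (an assumption for reproducibility; the question uses randn).
s = pd.Series(np.arange(8.0), index=index)

s_foo = s.loc["foo"]

# The stored levels of the slice may still list 'a', as in the question.
stale_levels = list(s_foo.index.levels[1])

# Rebuild the index so that only labels actually in use remain as levels.
s_foo.index = s_foo.index.remove_unused_levels()
trimmed_levels = list(s_foo.index.levels[1])

# Alternative that ignores the stored levels: read the labels that are present.
third_values = list(s_foo.index.get_level_values("third").unique())

print(stale_levels, trimmed_levels, third_values)
```

If all you need is the set of values, the get_level_values line is probably enough on its own; remove_unused_levels matters when other code inspects index.levels directly.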

1 Comment

This is exactly what I wanted. Thanks!

Resetting the index seems like the right way to go. It sounds like you don't really want third to be an index level (the result you're getting is just how indexes work).

s.reset_index(level=2).groupby(level=[0])['third'].unique()

or, if you want counts:

s.reset_index(level=2).groupby(level=[0])['third'].value_counts()
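As a rough, self-contained sketch of both one-liners, the snippet below rebuilds the Series from the question (with np.arange values and a hypothetical name "value", both assumptions made for reproducibility) and pulls out the results for individual first values. Levels are referred to by name rather than by position, which should be equivalent here.

```python
import numpy as np
import pandas as pd

arrays = [
    ["bar", "bar", "baz", "baz", "foo", "foo", "qux", "qux"],
    ["one", "two", "one", "two", "one", "two", "one", "two"],
    ["a", "b", "a", "b", "b", "b", "b", "b"],
]
index = pd.MultiIndex.from_arrays(arrays, names=["first", "second", "third"])
# The Series name "value" is illustrative; the question's Series is unnamed.
s = pd.Series(np.arange(8.0), index=index, name="value")

# Move 'third' out of the index into an ordinary column.
df = s.reset_index(level="third")

# Unique 'third' values under each 'first' value.
uniques = df.groupby(level="first")["third"].unique()
foo_third = list(uniques["foo"])
bar_third = list(uniques["bar"])

# Occurrence counts of each 'third' value under each 'first' value.
counts = df.groupby(level="first")["third"].value_counts()
foo_b_count = int(counts.loc[("foo", "b")])

print(foo_third, bar_third, foo_b_count)
```

This answers the question for every first value at once, so it may be more than you need if you only care about foo.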
