
I've started using pandas for some large datasets, and mostly it works really well. I have some questions about the indices, though:

  1. I have a MultiIndex with three levels, say a, b, c. How do I slice along level a? I just want the rows where a is 5, 7, 10 or 13. Doing df.ix[[5, 7, 10, 13]] does not work, as pointed out in the documentation.

  2. I need several different indices for one DataFrame. Can I create these indices without attaching them to a DataFrame, and use them to get back the raw ndarray (positional) index?

  3. Can I slice a MultiIndex on its own, outside a Series or DataFrame?

Thanks in advance

  • Do you think you could include a construction for a sample df for this example? Commented Dec 21, 2012 at 16:58
  • I created a github issue to examine in more detail: github.com/pydata/pandas/issues/2598 Commented Dec 26, 2012 at 1:15

2 Answers


For the first part, you can use boolean indexing with get_level_values:

df[df.index.get_level_values('a').isin([5, 7, 10, 13])]
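As a minimal sketch of that one-liner in context: the frame below is made up for illustration (the level names a, b, c come from the question, the values and the "value" column are assumptions), and it assumes a pandas version where get_level_values returns an Index that has isin.

```python
import pandas as pd

# Hypothetical sample data: three index levels named a, b, c as in the question.
index = pd.MultiIndex.from_product(
    [[5, 6, 7, 10], ["x", "y"], [0, 1]], names=["a", "b", "c"]
)
df = pd.DataFrame({"value": range(len(index))}, index=index)

# Boolean mask over the rows: True where level "a" is one of the wanted labels.
mask = df.index.get_level_values("a").isin([5, 7, 10, 13])

# Labels that do not occur in the index (13 here) are simply never matched.
subset = df[mask]
```

The mask is an ordinary boolean array, so it can be combined with other conditions using & and | before indexing.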

For the other two, you can get at the MultiIndex object itself with:

df.index

(and this object can be inspected and sliced on its own.)
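For questions 2 and 3, here is a rough sketch of working with a MultiIndex that is not attached to any DataFrame. The index contents and the names idx, positions and values are assumptions made up for the example; np.flatnonzero is just one way to turn the mask into integer positions.

```python
import numpy as np
import pandas as pd

# A standalone MultiIndex, not attached to any Series or DataFrame.
idx = pd.MultiIndex.from_product([[5, 6, 7], ["x", "y"]], names=["a", "b"])

# Slice the index on its own with a boolean mask over one level.
mask = idx.get_level_values("a").isin([5, 7])
sub_index = idx[mask]

# Turn the same mask into raw integer positions ...
positions = np.flatnonzero(mask)

# ... and use them against any plain ndarray of matching length.
values = np.arange(len(idx)) * 10
selected = values[positions]
```

Because positions is a plain integer array, the same index can be reused to pick rows out of several arrays that share its ordering.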


7 Comments

Brilliant! I still have one problem, though. I can slice on the MultiIndex, but I give it a raw index and it gives me back a tuple. I want it the other way around, so: myindex[1, 2, 4]
@WolfgangKerzendorf So you want it exported to an array? I think the issue is that behind the scenes pandas stores using indices of .level, and doesn't store this array... I will take another look. Hopefully there is a better way than np.array(map(np.array, df.index.values)) (!)
So I found that index.get_loc is similar to what I want. It translates from a key to an actual location, but it is not as useful as the .ix notation of a Series. For now I think I will just do my_index = Series(arange(len(df)), index=myselectedindex)
df.index.get_level_values('a') gives back an array which doesn't have a method isin.
@WolfgangKerzendorf what version of pandas are you using?
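The key-to-position lookup discussed in these comments could look roughly like the sketch below. The index contents are invented for illustration, and lookup is a hypothetical name for the Series of positions suggested above; it assumes a recent pandas where .loc and xs are available.

```python
import numpy as np
import pandas as pd

# Hypothetical standalone index with unique keys.
my_index = pd.MultiIndex.from_product([[5, 6, 7], ["x", "y"]], names=["a", "b"])

# get_loc maps a full key (one label per level) to its integer position.
pos = my_index.get_loc((6, "y"))

# The workaround from the comment: a Series of positions keyed by the index.
lookup = pd.Series(np.arange(len(my_index)), index=my_index)
pos_via_series = lookup.loc[(6, "y")]

# Positions of all rows where level "a" equals 7.
positions_for_a = lookup.xs(7, level="a").to_numpy()
```

The Series approach costs some memory, but it gives label-based selection on any level while the values stay plain integer positions.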

Edit: This answer applies to pandas versions lower than 0.10.0 only:

Okay @hayden had the right idea to start with:

An index has the method get_level_values(), which, however, returns a plain array in pandas versions < 0.10.0. Arrays have no isin() method, but this works:

from pandas import lib
lib.ismember(df.index.get_level_values('a'), set([5, 7, 10, 13]))
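If you would rather not rely on pandas' internal lib module, the same membership test on a plain array can be sketched with NumPy's np.isin. This is a swapped-in alternative rather than the call used above, it needs a NumPy version that provides np.isin, and the sample frame is made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample frame with a two-level index.
index = pd.MultiIndex.from_product([[5, 6, 7], ["x", "y"]], names=["a", "b"])
df = pd.DataFrame({"value": np.arange(len(index))}, index=index)

# Force a plain ndarray to mimic what old pandas returned from get_level_values.
level_a = np.asarray(df.index.get_level_values("a"))

# Element-wise membership test on the array, giving a boolean mask.
mask = np.isin(level_a, [5, 7, 10, 13])

# Use the mask for boolean indexing on the frame.
subset = df[mask]
```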

That only answers question 1, but I'll post an update if I crack 2 and 3 (half done with @hayden's help).
