Pandas multi-index data access

Question

I have a multi-index dataframe like this:

       TC_name  Year
id id2              
1  1      RITA  2020
   2      RITA  2020
2  1       IDA  2020
   2       IDA  2020
   3       IDA  2020
   4       IDA  2021
3  1      RITA  2021
   2      RITA  2021
   3      RITA  2021

Now, I want to access the first row of each 'id' group, i.e. (1, 1) = RITA 2020, (2, 1) = IDA 2020, (3, 1) = RITA 2021, and so on, and use those rows to form a new dataframe.

However, when I try df.loc[:,1], it does not work. df.loc[1] and df.loc[2] give me the right group, but I cannot work out how to select on the 'id2' level.

So what should I do next to get access to the data I want?

Thank you for your help.

Gonçalo Peres · Accepted Answer · 2022-09-24 09:08:41Z

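Assuming OP wants to create a dataframe from the first element of each group, one can use pandas.DataFrame.groupby. Since the groups are defined by the first index level, id, one should pass level=0. Finally, to take the first element of each group, one calls .first(). Note that .first() returns the first non-null value in each column, which matches the first row here because the data shown has no missing values.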

df2 = df.groupby(level=0).first()

[Out]:
   TC_name  Year
id              
1     RITA  2020
2      IDA  2020
3     RITA  2021
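
For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the frame from the question and compares a few ways of getting the same rows. The variable names (first_per_id, head_per_id, xs_id2_1, loc_id2_1) are made up for illustration, and the alternatives are not from the answer above; they are other standard pandas calls that should fit this case. As for why df.loc[:,1] fails: the second argument of .loc addresses columns, so pandas looks for a column labelled 1 rather than the 'id2' level.

```python
import pandas as pd

# Rebuild the frame from the question (values copied from the post).
index = pd.MultiIndex.from_tuples(
    [(1, 1), (1, 2), (2, 1), (2, 2), (2, 3), (2, 4), (3, 1), (3, 2), (3, 3)],
    names=["id", "id2"],
)
df = pd.DataFrame(
    {
        "TC_name": ["RITA", "RITA", "IDA", "IDA", "IDA", "IDA", "RITA", "RITA", "RITA"],
        "Year": [2020, 2020, 2020, 2020, 2020, 2021, 2021, 2021, 2021],
    },
    index=index,
)

# The answer's approach: first non-null value per column in each 'id' group.
# The result is indexed by 'id' only.
first_per_id = df.groupby(level="id").first()

# Alternative (assumption: you want the literal first row of each group,
# with both index levels kept).
head_per_id = df.groupby(level="id").head(1)

# Alternative (assumption: "first" means id2 == 1 in every group).
# xs drops the 'id2' level by default.
xs_id2_1 = df.xs(1, level="id2")

# Same selection spelled with .loc: a tuple addresses the index levels,
# and the trailing ':' selects all columns. Both index levels are kept.
loc_id2_1 = df.loc[(slice(None), 1), :]

print(first_per_id)
```

The groupby variants do not care what value id2 has in the first row, whereas the xs and .loc variants only pick rows where id2 equals 1, so they differ if some group does not start at 1.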

edited Sep 24, 2022 at 9:08

answered Sep 24, 2022 at 9:01
