python pandas - select particular values after groupby

Question

I have a grouped table:

df.groupby(['Age', 'Movie']).mean()

                  User  Raitings
Age Movie
1   1         4.666667  7.666667
    2         4.666667  8.000000
    3         2.000000  7.500000
    4         2.000000  5.500000
    5         3.000000  7.000000
18  1         3.000000  7.500000
    2         3.000000  8.000000
    3         3.000000  8.500000
25  1         8.000000  7.250000
    2         8.000000  7.500000
    3         5.500000  8.500000
    4         5.000000  7.000000
45  1         9.000000  7.500000
    2         9.000000  7.500000
    3        11.000000  7.000000
    4        11.000000  6.000000
60  1         8.000000  7.000000
    2         8.000000  9.000000
    3         8.000000  7.000000

Please help me write a function that takes an integer (Age) and returns the Movie with the minimum Raitings in that Age group. For example, f(1) should return 4 (the minimum Raitings in the Age 1 group is 5.5, and that value belongs to Movie 4).

I can get the overall minimum rating:

df['Raitings'].min()

But I don't know how to get the minimum rating within a particular group (Age).

Example: I have age 18; in this group the minimum value in the "Raitings" column is 7.5, and the corresponding Movie is 1.
– VakarinDmitriy, Commented Feb 17, 2018 at 16:46

MaxU - stand with Ukraine · Accepted Answer · 2018-02-17 16:34:06Z

4

Source multi-index DF:

In [221]: x
Out[221]:
                 User  Raitings
Age  Movie
1.0  1       4.666667  7.666667
     2       4.666667  8.000000
     3       2.000000  7.500000
     4       2.000000  5.500000
     5       3.000000  7.000000
18.0 1       3.000000  7.500000
     2       3.000000  8.000000
     3       3.000000  8.500000
25.0 1       8.000000  7.250000
     2       8.000000  7.500000
     3       5.500000  8.500000
     4       5.000000  7.000000
45.0 1       9.000000  7.500000
     2       9.000000  7.500000
     3      11.000000  7.000000
     4      11.000000  6.000000
60.0 1       8.000000  7.000000
     2       8.000000  9.000000
     3       8.000000  7.000000

Function:

In [222]: def f(df, age):
     ...:     return df.loc[pd.IndexSlice[age,:], 'Raitings'].idxmin()[1]
     ...:

Test:

In [223]: f(x, age=1)
Out[223]: 4
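For readers who want to run the function outside an IPython session, here is a minimal self-contained sketch. It assumes a two-age subset of the question's table, rebuilt by hand with `pd.MultiIndex.from_tuples`; the variable names `movie_for_age_1` and `movie_for_age_18` are only illustrative.

```python
import pandas as pd

# Hand-built subset of the grouped means from the question (Age 1 and 18 only).
index = pd.MultiIndex.from_tuples(
    [(1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (18, 1), (18, 2), (18, 3)],
    names=["Age", "Movie"],
)
x = pd.DataFrame(
    {"Raitings": [7.666667, 8.0, 7.5, 5.5, 7.0, 7.5, 8.0, 8.5]},
    index=index,
)


def f(df, age):
    # Select every row whose first index level equals `age`. The result keeps
    # both index levels, so idxmin returns an (Age, Movie) tuple; element [1]
    # is the Movie part.
    return df.loc[pd.IndexSlice[age, :], "Raitings"].idxmin()[1]


movie_for_age_1 = f(x, age=1)
movie_for_age_18 = f(x, age=18)
```

Because the slice keeps both index levels, the trailing `[1]` is what turns the `(Age, Movie)` label into a bare Movie number.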


4 Comments

BruceWayne Over a year ago

@VakarinDmitriy if it works, you can mark it as the answer: click the check mark to the left of the post (under the down arrow).

i1100362 Over a year ago

Hi, how can we get both columns, say 'user' and 'raitings', when 'user' is a string? In my case I have 'group' instead of 'age', and it starts from '0'.

MaxU - stand with Ukraine Over a year ago

@i1100362, I'd suggest opening a new question with a small sample input data set and your desired output. It's not very clear to me what you want to get. Are you after df.loc[pd.IndexSlice[1,:], :] ?

i1100362 Over a year ago

stackoverflow.com/questions/52661673/…

piRSquared · Accepted Answer · 2018-02-17 16:55:27Z

This gets all of them in one go.

df.groupby('Age').Raitings.idxmin().str[-1]

Age
1     4
18    1
25    4
45    4
60    1
Name: Raitings, dtype: int64
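To make the intermediate step visible, the sketch below splits the one-liner in two. It assumes a small hand-built subset of the table (ages 18 and 45); the names `labels`, `movies` and `best_rows` are illustrative and not part of the original answer.

```python
import pandas as pd

# Hand-built subset of the grouped means (Age 18 and 45 only).
index = pd.MultiIndex.from_tuples(
    [(18, 1), (18, 2), (18, 3), (45, 1), (45, 2), (45, 3), (45, 4)],
    names=["Age", "Movie"],
)
df = pd.DataFrame(
    {"Raitings": [7.5, 8.0, 8.5, 7.5, 7.5, 7.0, 6.0]},
    index=index,
)

# On a MultiIndex, idxmin returns the full (Age, Movie) label for each group.
labels = df.groupby("Age")["Raitings"].idxmin()

# .str[-1] indexes into each tuple and keeps only the last element (Movie).
movies = labels.str[-1]

# The same labels can also be used to pull the complete winning rows.
best_rows = df.loc[labels.tolist()]
```

Keeping `labels` around is handy when the minimum rating itself is needed alongside the Movie number.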

If you want a function, I'd use pd.DataFrame.xs (xs is for cross section).
By default, xs selects on the first level of the index and then drops that level. That conveniently leaves only the Movie level, which is exactly the value idxmin will hand us.

def f(df, age):
    return df.xs(age).Raitings.idxmin()

f(df, 1)

4
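The sketch below shows what `xs` leaves behind, and one possible way to handle an age that is not in the index. It assumes a hand-built subset of the table (ages 25 and 60). The helper name `movie_with_min_rating` and its return-`None` behaviour are hypothetical choices, not part of the original answer.

```python
import pandas as pd

# Hand-built subset of the grouped means (Age 25 and 60 only).
index = pd.MultiIndex.from_tuples(
    [(25, 1), (25, 2), (25, 3), (25, 4), (60, 1), (60, 2), (60, 3)],
    names=["Age", "Movie"],
)
df = pd.DataFrame(
    {"Raitings": [7.25, 7.5, 8.5, 7.0, 7.0, 9.0, 7.0]},
    index=index,
)

# xs(25) keeps the rows for Age 25 and drops the Age level,
# so the remaining index is just Movie.
age_25 = df.xs(25)
remaining_index_name = age_25.index.name


def movie_with_min_rating(df, age):
    # Hypothetical wrapper: return None instead of raising KeyError
    # when the requested age is absent from the Age level.
    if age not in df.index.get_level_values("Age"):
        return None
    return df.xs(age)["Raitings"].idxmin()
```

For Age 60 two movies tie at 7.0; `idxmin` reports the first occurrence, so Movie 1 is returned.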

Setup
Useful for those who try to parse this stuff.

txt = """\
Age  Movie       User  Raitings
1.0  1       4.666667  7.666667
     2       4.666667  8.000000
     3       2.000000  7.500000
     4       2.000000  5.500000
     5       3.000000  7.000000
18.0 1       3.000000  7.500000
     2       3.000000  8.000000
     3       3.000000  8.500000
25.0 1       8.000000  7.250000
     2       8.000000  7.500000
     3       5.500000  8.500000
     4       5.000000  7.000000
45.0 1       9.000000  7.500000
     2       9.000000  7.500000
     3      11.000000  7.000000
     4      11.000000  6.000000
60.0 1       8.000000  7.000000
     2       8.000000  9.000000"""

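import io
import pandas as pd

df = pd.read_fwf(io.StringIO(txt))
# Age is blank on continuation rows: forward-fill it, then cast back to int
df['Age'] = df['Age'].ffill().astype(int)
df = df.set_index(['Age', 'Movie'])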

Nicolas M. · Accepted Answer · 2018-02-17 16:32:02Z

0

If you want the minimum for a specific age, you can do the following (on the original, ungrouped frame, where Age is an ordinary column):

df[df["Age"] == 1]['Raitings'].min()

If you want to do it for the whole dataframe, you can do:

df.groupby("Age").agg({ "Raitings" : "min" })
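Both expressions return the minimum rating itself rather than the Movie that holds it. The sketch below runs them and then recovers the Movie with `idxmin`. It assumes a small flat frame in which Age and Movie are ordinary columns; the names `flat`, `min_age_1`, `mins` and `winners` are illustrative.

```python
import pandas as pd

# Assumed flat layout: Age and Movie are columns, not index levels.
flat = pd.DataFrame(
    {
        "Age": [1, 1, 1, 18, 18],
        "Movie": [3, 4, 5, 1, 2],
        "Raitings": [7.5, 5.5, 7.0, 7.5, 8.0],
    }
)

# Boolean mask: keep only the rows where Age is 1, then take the minimum.
min_age_1 = flat[flat["Age"] == 1]["Raitings"].min()

# Minimum rating per age for the whole frame.
mins = flat.groupby("Age").agg({"Raitings": "min"})

# idxmin gives the row label of each group's minimum, which lets us
# look up the Movie that goes with it.
winners = flat.loc[flat.groupby("Age")["Raitings"].idxmin(), ["Age", "Movie", "Raitings"]]
```

Here `winners` holds one row per age, pairing each minimum rating with its Movie.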

I hope it helps,


dhFrank · Accepted Answer · 2018-02-17 18:16:03Z

0

I would reset the index and build a pivot table. I think it will help:

df.reset_index(inplace=True)
df_Min = pd.pivot_table(df, index=['Movie', 'User'], columns='Age', values='Raitings', aggfunc='min')
