How to get filtered values of data frame in Python?

Question

I want to find, in a given column "type", the values that repeat more than "n" times.

I did this:

n = 5
df = dataf["type"].value_counts() > 5

print(df) will return something like this:

Bike           True
Truck          True
Car            False

How do I get the values "Bike" and "Truck" (the ones marked True)? I want to add them to a set.

Can you show us df?

– user1717828, Commented Oct 9, 2021 at 22:01

user1717828 · Accepted Answer · 2021-10-09 22:05:33Z

3

You can pass a lambda to loc for this:

import pandas as pd

df = pd.DataFrame({"vehicle": ["bike"] * 7 + ["truck"] * 8 + ["car"] * 4})
print(df)
print("\nUsing loc...")
print(df["vehicle"].value_counts().loc[lambda x: x > 5])

gives

   vehicle
0     bike
1     bike
2     bike
3     bike
4     bike
5     bike
6     bike
7    truck
8    truck
9    truck
10   truck
11   truck
12   truck
13   truck
14   truck
15     car
16     car
17     car
18     car

Using loc...
truck    8
bike     7
Name: vehicle, dtype: int64

2 Comments

jimmy Over a year ago

But then how do I extract only the values truck and bike from the series? Thanks!

Henry Ecker Over a year ago

df["vehicle"].value_counts().loc[lambda x: x > 5].index.tolist() @jimmy
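
To end up with the set the question asks for, the index of the filtered counts can be passed to set(). This is a minimal sketch that reuses the sample frame from the answer above; the names counts and frequent are only illustrative and do not appear in the original posts.

```python
import pandas as pd

# Sample data copied from the answer above
df = pd.DataFrame({"vehicle": ["bike"] * 7 + ["truck"] * 8 + ["car"] * 4})
n = 5

# Count each value, keep the counts above n, then collect the labels in a set
counts = df["vehicle"].value_counts()
frequent = set(counts.loc[lambda x: x > n].index)

print(frequent)  # a set containing 'bike' and 'truck' (set order is not guaranteed)
```

The labels live in the index of the value_counts() result, which is why the set is built from .index rather than from the values.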

Miguel Pinheiro · Accepted Answer · 2021-10-09 22:05:10Z

1

Try this:

aux = dataf["type"].value_counts()
greater_than_five = aux[aux > 5]

The first line gets the count of each type, and the second line keeps only the types whose count is greater than five.
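
To make the two steps visible, here is a small sketch with the boolean mask pulled out into its own variable. The dataf frame below is a hypothetical stand-in for the asker's data, and mask and labels are illustrative names.

```python
import pandas as pd

# Hypothetical stand-in for the question's dataf
dataf = pd.DataFrame({"type": ["Bike"] * 7 + ["Truck"] * 8 + ["Car"] * 4})

aux = dataf["type"].value_counts()   # counts per type, indexed by type
mask = aux > 5                       # boolean Series: True where the count exceeds 5
greater_than_five = aux[mask]        # keeps only the rows where mask is True

# The type names are the index of the filtered Series
labels = greater_than_five.index.tolist()
print(labels)
```

The mask is the same True/False Series the question printed; indexing aux with it is what drops the False rows.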

answered Oct 9, 2021 at 22:05

Miguel Pinheiro

3432 silver badges13 bronze badges

Comments

claudio paulo · Accepted Answer · 2021-10-09 22:12:56Z

1

Try this:

n = 5
df = dataf["type"].value_counts()[dataf["type"].value_counts() > n]
print(df)

2 Comments

jimmy Over a year ago

This returns Car 7. How do I extract only the value Car?

claudio paulo Over a year ago

df = dataf["type"].value_counts()[(dataf["type"].value_counts() > 5) & (dataf["type"].value_counts().index == 'Car')]

kağan hazal koçdemir · Accepted Answer · 2021-10-09 22:25:03Z

1

The most efficient way is the lambda approach that @user1717828 wrote. Another way:

df = pd.DataFrame({"vehicle": ["bike"] * 7 + ["truck"] * 8 + ["car"] * 4})


df2 = df["vehicle"].agg({'count':'value_counts'})
df2[df2['count'] > 5]

Doğu Can Elçi · Accepted Answer · 2021-10-09 22:06:20Z

0

You can add a new column called counter that contains 1:

df['counter'] = 1

and use groupby:

df = df.groupby(['type']).sum()
df = df[df.counter > n]
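
A related sketch, not the answer's exact method: groupby(...).size() counts the rows in each group directly, so the helper column is not strictly needed. The dataf frame is a hypothetical stand-in for the asker's data, and sizes and frequent are illustrative names.

```python
import pandas as pd

# Hypothetical stand-in for the question's dataf
dataf = pd.DataFrame({"type": ["Bike"] * 7 + ["Truck"] * 8 + ["Car"] * 4})
n = 5

# Number of rows per type, returned as a Series indexed by type
sizes = dataf.groupby("type").size()

# Keep the types that occur more than n times and collect them in a set
frequent = set(sizes[sizes > n].index)
print(frequent)
```

Using size() also avoids summing any other columns the frame happens to contain.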
