How do I subset columns in a Pandas dataframe based on criteria using a loop?

Question

I have a Pandas dataframe called "bag" with columns called beans1, beans2, and beans3:

import pandas as pd

bag = pd.DataFrame({'beans1': [3,1,2,5,6,7], 'beans2': [2,2,1,1,5,6], 'beans3': [1,1,1,3,3,2]})
bag
Out[50]: 
   beans1  beans2  beans3
0       3       2       1
1       1       2       1
2       2       1       1
3       5       1       3
4       6       5       3
5       7       6       2

I want to use a loop to subset each column, keeping only the observations greater than 1.

The way to do it manually is:

beans1=bag.loc[bag['beans1']>1,['beans1']]
beans2=bag.loc[bag['beans2']>1,['beans2']]
beans3=bag.loc[bag['beans3']>1,['beans3']]

But I need to employ a loop, with something like:

for i in range(1,4):
    beans+str(i).loc[beans.loc[bag['beans'+i]>1,['beans'+str(i)]]

But it didn't work. I need a Python version of R's eval(parse(text="")). Any help appreciated. Thanks much!

jezrael · Accepted Answer · 2020-01-03 09:25:51Z


This is possible with globals(), but not recommended:

# Build each variable name from the loop counter
for i in range(1,4):
    globals()['beans' + str(i)] = bag.loc[bag['beans'+str(i)]>1,['beans'+str(i)]]

# Or loop over the column names directly
for c in bag.columns:
    globals()[c] = bag.loc[bag[c]>1,[c]]

print (beans1)
   beans1
0       3
2       2
3       5
4       6
5       7

A better approach is to create a dictionary:

d = {c: bag.loc[bag[c]>1, [c]] for c in bag}

print (d['beans1'])
   beans1
0       3
2       2
3       5
4       6
5       7
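
As a minimal sketch of why the dictionary tends to be easier to work with: the subsets can be processed in a loop without knowing their names in advance. The row_counts name below is only illustrative and is not part of the original answer.

```python
import pandas as pd

bag = pd.DataFrame({
    'beans1': [3, 1, 2, 5, 6, 7],
    'beans2': [2, 2, 1, 1, 5, 6],
    'beans3': [1, 1, 1, 3, 3, 2],
})

# One single-column DataFrame per column, keyed by the column name.
d = {c: bag.loc[bag[c] > 1, [c]] for c in bag}

# Iterate over all subsets at once; row_counts is a hypothetical
# helper variable used here only to show the loop.
row_counts = {name: len(subset) for name, subset in d.items()}

for name, count in row_counts.items():
    print(name, count)
```

Each subset keeps the original row index, so the rows can still be matched back to bag if needed.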

edited Jan 3, 2020 at 9:25

answered Jan 3, 2020 at 9:14
