Pandas - remove cells based on value

Question

I have a dataframe with z-scores for several values. It looks like this:

ID    Cat1     Cat2     Cat3
A     1.05     -1.67    0.94
B     -0.88    0.22     -0.56
C     1.33     0.84     1.19

I want to write a script that will tell me, for each category, which IDs have values beyond a cut-off that I specify as needed. Because I am working with z-scores, I need to compare the absolute value against my cut-off.

So if I set my cut-off at 0.75, the resulting dataframe would be:

Cat1    Cat2    Cat3
A       A       A
B       C       C
C

If I set 1.0 as my cut-off value, the resulting dataframe would be:

Cat1    Cat2    Cat3
A       A       C
C

I know that I can do queries like this:

df1 = df[df['Cat1'] > 1]
df1
df1 = df[df['Cat1'] < -1]
df1

to individually query each column and find the information I'm looking for, but this is tedious even if I figure out how to use the abs function to combine the two queries into one. How can I apply this filtration to the whole dataframe?
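For what it's worth, the two single-column queries above collapse into one if the comparison is made on the absolute value. A minimal sketch, assuming the dataframe from the question is rebuilt by hand (the literal construction is only there to make the example self-contained):

```python
import pandas as pd

# Sample data copied from the table in the question
df = pd.DataFrame({
    "ID": ["A", "B", "C"],
    "Cat1": [1.05, -0.88, 1.33],
    "Cat2": [-1.67, 0.22, 0.84],
    "Cat3": [0.94, -0.56, 1.19],
})

cut_off = 1.0

# One query instead of two: keep rows whose Cat1 z-score lies
# beyond the cut-off in either direction
df1 = df[df["Cat1"].abs() > cut_off]
print(df1["ID"].tolist())
```

This still handles only one column at a time, so it does not remove the tedium the question is about.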

I've come up with this skeleton of a script:

cut_off = 1.0
cols = list(df.columns)
cols.remove('ID')
for col in cols:
    # FOR CELL IN VALUE OF EACH CELL IN COLUMN:
        if (abs.CELL < cut_off):
            CELL = NaN

to basically just eliminate any values that don't meet the cut-off. If I can get this to work, it will bring me closer to my goal, but I am stuck and don't even know if I am on the right track. Again, the overall goal is to quickly figure out which cells have absolute values above the cut-off in each category and to list the corresponding IDs.
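The cell-by-cell loop in the skeleton can probably be avoided, since pandas can apply the comparison to every cell at once. A rough sketch, assuming ID can be moved into the index so that only numeric columns remain; `scores` and `masked` are names made up for this example:

```python
import pandas as pd

# Sample data copied from the table in the question
df = pd.DataFrame({
    "ID": ["A", "B", "C"],
    "Cat1": [1.05, -0.88, 1.33],
    "Cat2": [-1.67, 0.22, 0.84],
    "Cat3": [0.94, -0.56, 1.19],
})

cut_off = 1.0

# Move ID into the index so every remaining column is numeric
scores = df.set_index("ID")

# where() keeps the cells for which the condition is True and fills
# the rest with NaN, i.e. cells with abs(value) < cut_off become NaN
masked = scores.where(scores.abs() >= cut_off)
print(masked)
```

`DataFrame.where` returns a new dataframe and leaves `scores` untouched, which may be preferable to overwriting cells in place.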

I apologize if anything is confusing or vague; let me know in comments and I'll fix it. I've been trying to figure this out for most of today and my brain is somewhat fried.

Not sure if this is close enough: df.apply(lambda x: x.index)[abs(df) > 1]
– EdChum, Commented Aug 4, 2014 at 20:08
This returned an error: TypeError: bad operand type for abs(): 'unicode'
– Slavatron, Commented Aug 5, 2014 at 17:26
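The TypeError most likely comes from the ID column, which holds strings and so cannot be passed through abs(). One way around it, sketched here on the assumption that every non-ID column is numeric, is to build the mask from the numeric columns only:

```python
import pandas as pd

# Sample data copied from the table in the question
df = pd.DataFrame({
    "ID": ["A", "B", "C"],
    "Cat1": [1.05, -0.88, 1.33],
    "Cat2": [-1.67, 0.22, 0.84],
    "Cat3": [0.94, -0.56, 1.19],
})

cut_off = 1.0

# Drop non-numeric columns (here: ID) before taking absolute values
numeric = df.select_dtypes(include="number")
mask = numeric.abs() > cut_off

# Rows with at least one z-score beyond the cut-off, ID column still attached
rows_with_hits = df[mask.any(axis=1)]
print(rows_with_hits)
```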

FooBar · Accepted Answer · 2014-08-04 20:02:56Z

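You don't have to apply the filter one column at a time; a boolean mask works on the whole dataframe. Since you are comparing z-scores, take the absolute value first (this assumes only numeric columns, e.g. with ID set as the index):

df[df.abs() > 1]

You can also assign through a mask, for example to blank out the cells that do not meet the cut-off:

df[df.abs() < 1] = np.nan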
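To get from the mask to the per-category ID lists the question asks for, something along these lines might work. This is only a sketch: `ids_beyond_cutoff` is a hypothetical helper, not part of pandas, and it assumes the ID column is named `ID` and every other column is numeric.

```python
import pandas as pd

# Sample data copied from the table in the question
df = pd.DataFrame({
    "ID": ["A", "B", "C"],
    "Cat1": [1.05, -0.88, 1.33],
    "Cat2": [-1.67, 0.22, 0.84],
    "Cat3": [0.94, -0.56, 1.19],
})


def ids_beyond_cutoff(df, cut_off, id_col="ID"):
    """Hypothetical helper: per category, list the IDs whose |z-score| exceeds cut_off."""
    scores = df.set_index(id_col)
    columns = {}
    for col in scores.columns:
        ids = scores.index[scores[col].abs() > cut_off]
        # A fresh 0..n-1 index lets columns of different lengths line up from the top
        columns[col] = pd.Series(list(ids), dtype="object")
    # Shorter columns are padded with NaN when the Series are combined
    return pd.DataFrame(columns)


print(ids_beyond_cutoff(df, 0.75))
print(ids_beyond_cutoff(df, 1.0))
```

With the sample data this should reproduce the two tables from the question, with NaN in place of the blank cells.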

