I have a dataframe with z-scores for several values. It looks like this:

ID    Cat1     Cat2     Cat3
A     1.05     -1.67    0.94
B     -0.88    0.22     -0.56
C     1.33     0.84     1.19

I want to write a script that tells me which IDs have values in each category that exceed a cut-off I specify. Because I am working with z-scores, I need to compare the absolute value against the cut-off.

So if I set my cut-off at 0.75, the resulting dataframe would be:

Cat1    Cat2    Cat3
A       A       A
B       C       C
C

If I set 1.0 as my cut-off value, the dataframe above would return:

Cat1    Cat2    Cat3
A       A       C
C

I know that I can do queries like this:

df1 = df[df['Cat1'] > 1]
df1
df1 = df[df['Cat1'] < -1]
df1

to individually query each column and find the information I'm looking for, but this is tedious even if I figure out how to use the abs function to combine the two queries into one. How can I apply this filtration to the whole dataframe?

I've come up with this skeleton of a script:

cut_off = 1.0
cols = list(df.columns)
cols.remove('ID')
for col in cols:
    # FOR CELL IN VALUE OF EACH CELL IN COLUMN:
        if (abs.CELL < cut_off):
            CELL = NaN

to basically just eliminate any values that don't meet the cut-off. If I can get this to work, it will bring me closer to my goal, but I am stuck and don't even know if I am on the right track. Again, the overall goal is to quickly figure out which cells have absolute values above the cut-off in each category and be able to list the corresponding IDs.

I apologize if anything is confusing or vague; let me know in the comments and I'll fix it. I've been trying to figure this out for most of today and my brain is somewhat fried.

  • Not sure if this is close enough: df.apply(lambda x: x.index)[abs(df) > 1] Commented Aug 4, 2014 at 20:08
  • This returned an error: TypeError: bad operand type for abs(): 'unicode' Commented Aug 5, 2014 at 17:26

1 Answer

You don't have to apply the filter column by column; you can index the whole dataframe with a boolean mask:

df[df > 1]

and you can also assign through the same mask:

df[df > 1] = np.nan
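
Note that both lines above compare the raw values rather than their absolute values, and the comparison also touches the string ID column, which is the likely source of the TypeError reported in the comments. Here is a minimal sketch of one way around both, assuming ID is moved into the index first so that only numeric columns remain. The helper name ids_above_cutoff is hypothetical (it is not part of pandas), and the sketch uses >= so that it keeps exactly the values the skeleton in the question would not blank out:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": ["A", "B", "C"],
    "Cat1": [1.05, -0.88, 1.33],
    "Cat2": [-1.67, 0.22, 0.84],
    "Cat3": [0.94, -0.56, 1.19],
})

# Move the string column into the index so abs() only sees numbers.
scores = df.set_index("ID")

# Keep values whose absolute value meets the cut-off; the rest become NaN.
masked = scores.where(scores.abs() >= 1.0)


def ids_above_cutoff(frame, cut_off, id_col="ID"):
    """Hypothetical helper: per category, list the IDs with |value| >= cut_off."""
    numeric = frame.set_index(id_col)
    mask = numeric.abs() >= cut_off
    # Each column becomes a Series of matching IDs on a fresh 0..n-1 index,
    # so columns of unequal length line up and short ones are padded with NaN.
    return pd.DataFrame({
        col: pd.Series(mask.index[mask[col].to_numpy()], dtype=object)
        for col in mask.columns
    })


print(masked)
print(ids_above_cutoff(df, 0.75))
print(ids_above_cutoff(df, 1.0))
```

With the sample data, the two calls list the same IDs per category as the tables in the question, with NaN in the cells the question leaves blank.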