I have a dataframe with z-scores for several values. It looks like this:
ID  Cat1   Cat2   Cat3
A   1.05  -1.67   0.94
B  -0.88   0.22  -0.56
C   1.33   0.84   1.19
I want to write a script that tells me, for each category, which IDs have values beyond a cut-off that I specify as needed. Because I am working with z-scores, I need to compare the absolute value of each score against the cut-off.
So if I set my cut-off at 0.75, the resulting dataframe would be:
Cat1  Cat2  Cat3
A     A     A
B     C     C
C
If I set 1.0 as my cut-off value, the dataframe above would return:
Cat1  Cat2  Cat3
A     A     C
C
I know that I can do queries like this:
df1 = df[df['Cat1'] > 1]
df1
df1 = df[df['Cat1'] < -1]
df1
to query each column individually and find the information I'm looking for, but this is tedious even if I figure out how to use the abs function to combine the two queries into one. How can I apply this filter to the whole dataframe?
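For a single column, the two queries can be merged with `Series.abs()`. A minimal sketch, assuming the sample data shown above (the `cut_off` variable name is just for illustration):

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    'ID': ['A', 'B', 'C'],
    'Cat1': [1.05, -0.88, 1.33],
    'Cat2': [-1.67, 0.22, 0.84],
    'Cat3': [0.94, -0.56, 1.19],
})

cut_off = 1.0

# One query instead of two: keep rows where |Cat1| exceeds the cut-off,
# which covers both the "> 1" and the "< -1" case.
df1 = df[df['Cat1'].abs() > cut_off]
print(df1['ID'].tolist())
```

This still handles only one column at a time, so it does not remove the need for a whole-dataframe approach.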
I've come up with this skeleton of a script:
cut_off = 1.0
cols = list(df.columns)
cols.remove('ID')
for col in cols:
    # FOR CELL IN VALUE OF EACH CELL IN COLUMN:
        if abs(CELL) < cut_off:
            CELL = NaN
to basically just eliminate any values that don't meet the cut-off. If I can get this to work, it will bring me closer to my goal, but I am stuck and don't even know if I am on the right track. Again, the overall goal is to quickly figure out which cells have absolute values above the cut-off in each category and to be able to list the corresponding IDs.
I apologize if anything is confusing or vague; let me know in the comments and I'll fix it. I've been trying to figure this out for most of today and my brain is somewhat fried.
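The cell-by-cell loop in the skeleton can be expressed without explicit iteration. A rough sketch, assuming the sample data from the top of the question and using `DataFrame.where` to blank out values below the cut-off (the name `masked` is made up here):

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    'ID': ['A', 'B', 'C'],
    'Cat1': [1.05, -0.88, 1.33],
    'Cat2': [-1.67, 0.22, 0.84],
    'Cat3': [0.94, -0.56, 1.19],
})

cut_off = 1.0
cols = [c for c in df.columns if c != 'ID']

# where() keeps a value when the condition is True and writes NaN otherwise,
# so anything with an absolute value below the cut-off is eliminated.
masked = df.copy()
masked[cols] = df[cols].where(df[cols].abs() >= cut_off)
print(masked)
```

The `ID` column is left untouched, so the surviving values can still be traced back to their rows.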
df = df.set_index('ID')  # abs() needs numeric columns only, so move ID into the index
df.apply(lambda x: x.index)[abs(df) > 1]
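To get the compact table shown in the question, with the matching IDs stacked at the top of each column, one option is to collect the IDs per category and let pandas pad the shorter columns with NaN. This is only a sketch under assumptions: the helper name `ids_above` and its `id_col` parameter are invented for illustration, and a strict `>` comparison is assumed to match the queries above.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    'ID': ['A', 'B', 'C'],
    'Cat1': [1.05, -0.88, 1.33],
    'Cat2': [-1.67, 0.22, 0.84],
    'Cat3': [0.94, -0.56, 1.19],
})


def ids_above(frame, cut_off, id_col='ID'):
    """Hypothetical helper: list, per category, the IDs with |z| > cut_off."""
    scores = frame.set_index(id_col)
    hits = {}
    for col in scores.columns:
        # Select the rows that pass the cut-off and keep only their IDs.
        ids = scores.loc[scores[col].abs() > cut_off].index.tolist()
        hits[col] = pd.Series(ids, dtype=object)
    # The Series have different lengths; the DataFrame constructor aligns
    # them on their 0..n-1 index and fills the gaps with NaN.
    return pd.DataFrame(hits)


result_1 = ids_above(df, 1.0)
result_075 = ids_above(df, 0.75)
print(result_1)
print(result_075)
```

Changing the cut-off then only means calling the helper with a different number.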