I am new to Python and would appreciate any advice on this approach.
I am using pandas in Python and have a DataFrame (loaded from a CSV file) like this one, but with 195 columns and ~300 individuals.
Index  IID  Sex  Disease 1  Disease 2  Disease 3
0      001  F    Absent     Absent     Present
1      002  M    Absent     Absent     Present
2      003  M    Present    Absent     Present
I want to count the number of individuals with each disease, which means I need to count the value "Present" in each of the 195 columns. Then I would like to have those counts grouped by sex. How can I do it?
The best I was able to do was:
GROUP = df1.loc[:, ["SEX", "Disease1", "Disease2", "Disease3"]].groupby('SEX')
GROUP.count()
but this just counted all the entries in the specified columns, grouped by sex. I don't know how to do the same while counting only the entries with the "Present" value, or at least how to count the number of entries for each of the values that appear in the rows ("Present", "Absent", "Unable_to_Code").
df[(df.filter(like='Disease') == 'Present').all(axis=1)].groupby('Sex')['IID'].size()
If I had more data, I could validate and verify this.
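For the per-disease counts asked about in the question, one possible approach is to turn the disease columns into a True/False table and sum it within each sex. The following is only a minimal sketch on toy data copied from the sample table; the column names ("Sex", "Disease 1", and so on) and the variable names are assumptions and would need to match the real CSV.

```python
import pandas as pd

# Toy data mirroring the sample table in the question (column names are assumed).
df = pd.DataFrame({
    "IID": ["001", "002", "003"],
    "Sex": ["F", "M", "M"],
    "Disease 1": ["Absent", "Absent", "Present"],
    "Disease 2": ["Absent", "Absent", "Absent"],
    "Disease 3": ["Present", "Present", "Present"],
})

# Select every column whose name contains "Disease".
disease = df.filter(like="Disease")

# Boolean table: True where a cell equals "Present".
is_present = disease.eq("Present")

# Summing booleans counts the True cells; grouping by Sex gives one row per sex.
present_by_sex = is_present.groupby(df["Sex"]).sum()
print(present_by_sex)

# To count every value ("Present", "Absent", ...), reshape to long form first.
long_df = df.melt(
    id_vars=["IID", "Sex"],
    value_vars=list(disease.columns),
    var_name="Disease",
    value_name="Status",
)
value_counts = (
    long_df.groupby(["Sex", "Disease", "Status"])
    .size()
    .unstack("Status", fill_value=0)
)
print(value_counts)
```

On the three-row sample, present_by_sex has one row per sex and one column per disease (for example, 2 for M under "Disease 3"), while value_counts has one row per sex and disease pair and one column per status.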