Thank you very much for your help!

Question: How can I count the number of rows that contain '9999-Don't Know' in multiple columns?

I have found solutions that take me halfway. For example, there are many examples that use the name of a column to count the rows matching a specific criterion. However, I have 76 columns, each representing a different survey question with a different label, so naming every column would be very inefficient.

Below is a sample df. Again, keep in mind I have 76 columns so using the name of the column is not a feasible option.

import pandas as pd

# DataFrame.from_items was removed in pandas 1.0; a dict keeps the same column order
df = pd.DataFrame({
    'RespondentId': ['1ghi3g', '335hduu', '4vlsiu4', '5nnvkkt', '634deds', '7kjng'],
    'Satisfaction - Timing': ['9-Excellent', '9-Excellent', '9999-Don\'t Know', '8-Very Good', '1-Very Unsatisfied', '9999-Don\'t Know'],
    'Response Speed - Time': ['9999-Don\'t Know', '9999-Don\'t Know', '9-Excellent', '9-Excellent', '9-Excellent', '9-Excellent'],
})

As you can see, '9999-Don't Know' appears in a total of 4 rows, so I would like to get the output 4.

3 Answers

This will give you the number of cells equal to "9999-Don't Know" in each column:

df.astype(object).eq("9999-Don't Know").sum()

This will give you the total count of "9999-Don't Know" across the whole DataFrame (thanks @Mitch):

df.astype(object).eq("9999-Don't Know").values.sum()

This will give you the total number of rows with at least one "9999-Don't Know":

df.astype(object).eq("9999-Don't Know").any(axis=1).sum()
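To make the difference between the three counts concrete, here is a minimal sketch that runs them on the sample data from the question. The variable names (`mask`, `per_column`, `total_cells`, `rows_with_any`) are only illustrative, and the frame is built from a dict on the assumption that a recent pandas version is in use.

```python
import pandas as pd

# Sample frame from the question (built from a dict instead of from_items).
df = pd.DataFrame({
    "RespondentId": ["1ghi3g", "335hduu", "4vlsiu4", "5nnvkkt", "634deds", "7kjng"],
    "Satisfaction - Timing": ["9-Excellent", "9-Excellent", "9999-Don't Know",
                              "8-Very Good", "1-Very Unsatisfied", "9999-Don't Know"],
    "Response Speed - Time": ["9999-Don't Know", "9999-Don't Know", "9-Excellent",
                              "9-Excellent", "9-Excellent", "9-Excellent"],
})

# Boolean frame with the same shape as df: True where the cell matches exactly.
mask = df.astype(object).eq("9999-Don't Know")

per_column = mask.sum()                       # matches in each column
total_cells = int(mask.values.sum())          # matching cells in the whole frame
rows_with_any = int(mask.any(axis=1).sum())   # rows with at least one match

print(per_column)
print(total_cells, rows_with_any)
```

In this sample no row contains the value twice, so the cell count and the row count happen to agree. On real survey data they can differ, and the `any(axis=1)` version is the one that counts rows.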

9 Comments

Might want .values.sum()?
or df.eq("9999-Don't Know").sum().sum() - double sum
I tried your solution but received the TypeError: Could not compare ["9999-Don't Know"] with block values. Does that mean that maybe the formatting of certain cells in this Excel file is inconsistent? @Mitch. I got the same message with your solution as well.
@AlexeyTrofimov I tried your solution and got the same msg as well
It seems that some values in your dataframe are lists rather than strings. Try this first to check whether it helps: df = df.applymap(lambda x: x[0] if type(x) == list else x)

You can also use this:

df.stack().str.contains("9999-Don't Know").sum()

However, this is slower than @piRSquared's solution:

In [38]: timeit df.astype(str).eq("9999-Don't Know").values.sum() 
1000 loops, best of 3: 182 us per loop

In [39]: timeit df.stack().str.contains("9999-Don't Know").sum()
1000 loops, best of 3: 467 us per loop
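One thing to keep in mind with the `stack` approach: `str.contains` does substring matching and treats the pattern as a regular expression by default, whereas `eq` tests for exact equality. The sketch below, on a small hypothetical frame that reuses values from the question, passes `regex=False` so the search string is taken literally and compares the result with an exact match.

```python
import pandas as pd

# Small illustrative frame; the values are taken from the question's sample.
df = pd.DataFrame({
    "Satisfaction - Timing": ["9-Excellent", "9999-Don't Know", "8-Very Good"],
    "Response Speed - Time": ["9999-Don't Know", "9-Excellent", "9-Excellent"],
})

# Stack all columns into one long Series so a single string test covers every cell.
stacked = df.stack()

# Literal substring search (regex=False disables regular-expression parsing).
substring_count = int(stacked.str.contains("9999-Don't Know", regex=False).sum())

# Exact-equality test, equivalent in spirit to df.eq(...).values.sum().
exact_count = int(stacked.eq("9999-Don't Know").sum())

print(substring_count, exact_count)
```

The two counts agree here because every matching cell contains exactly the search string. They would diverge if some cells carried extra text around it.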

3 Comments

thanks a bunch! this is exactly what i was looking for.
@techscolasticus you are welcome. But I recommend that you accept the other solution as it's faster..
i would but it was giving me an error so i wasn't able to get the total number of rows.

Another solution is:

df.eq("9999-Don't Know").sum().sum()

You also mentioned this type error:

TypeError: Could not compare ["9999-Don't Know"] with block values.

This suggests that some elements of the DataFrame are lists rather than strings. Such elements can be replaced by their first item with:

df = df.applymap(lambda x: x[0] if type(x) == list else x)
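As a rough end-to-end sketch of that fix, the example below builds a hypothetical frame in which some cells are one-element lists, unwraps them, and then counts the matches. The column names `Q1` and `Q2`, the helper `unwrap`, and the data are assumptions made for illustration only. It maps over each column with `Series.map` rather than calling `applymap`, since `applymap` has been renamed to `DataFrame.map` in newer pandas releases.

```python
import pandas as pd

# Hypothetical data: some cells hold a one-element list instead of a plain string.
df = pd.DataFrame({
    "Q1": [["9999-Don't Know"], "9-Excellent", "9999-Don't Know"],
    "Q2": ["9-Excellent", ["9999-Don't Know"], "8-Very Good"],
})

def unwrap(x):
    """Return the first item of a list cell; leave every other value unchanged."""
    return x[0] if isinstance(x, list) else x

# Apply the helper to every cell, one column at a time.
cleaned = df.apply(lambda col: col.map(unwrap))

# After unwrapping, any of the counting approaches above can be used.
total = int(cleaned.eq("9999-Don't Know").values.sum())
rows_with_any = int(cleaned.eq("9999-Don't Know").any(axis=1).sum())

print(total, rows_with_any)
```

Unwrapping with `x[0]` assumes each list holds a single answer. If a list could be empty or contain several values, the helper would need to handle those cases explicitly.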

2 Comments

so df=df.applymap... did work but I'm not sure what I'm supposed to do after that to get the total number of rows
After that you can use any of the solutions proposed here, e.g. df.eq("9999-Don't Know").sum().sum()
