I have a question about conditional looping over a pandas DataFrame. The DataFrame of interest is huge. It holds student names and their scores on tests taken at different times, one test per column (see below). A student is considered to have failed if his/her score is below 75 in any of the tests, and to have passed otherwise. I'm not able to do this efficiently. DataFrame:
import numpy as np
import pandas as pd

score = {'student_name': ['Jiten', 'Jac', 'Ali', 'Steve', 'Dave', 'James'],
         'test_quiz_1': [74, 81, 84, 67, 59, 96],
         'test_quiz_2': [76, np.nan, 99, 77, 53, 69],
         'test_mid_term': [76, 88, 84, 67, 58, np.nan],
         'test_final_term': [76, 78, 89, 67, 58, 96]}
df = pd.DataFrame(score, columns=['student_name', 'test_quiz_1', 'test_quiz_2', 'test_mid_term', 'test_final_term'])
My approach (modified based on Jacques Kvam's answer):
df.test_quiz_1 < 75
This gives me a boolean mask showing which students failed that particular test. The same can be repeated for the other tests (df.test_quiz_2, ...). Finally, I need to combine all of these into one final result in which a student is marked as failed if he/she failed any test.
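A minimal sketch of how the per-test comparisons described above might be combined in one step, offered as one possible approach rather than a definitive answer. The names `test_cols`, `failed_any`, and `status` are hypothetical, and the sketch assumes that every test column starts with `test_` and that a missing score (NaN) does not count as a fail.

```python
import numpy as np
import pandas as pd

score = {'student_name': ['Jiten', 'Jac', 'Ali', 'Steve', 'Dave', 'James'],
         'test_quiz_1': [74, 81, 84, 67, 59, 96],
         'test_quiz_2': [76, np.nan, 99, 77, 53, 69],
         'test_mid_term': [76, 88, 84, 67, 58, np.nan],
         'test_final_term': [76, 78, 89, 67, 58, 96]}
df = pd.DataFrame(score)

# Assumption: every test column is named with the prefix "test_".
test_cols = [c for c in df.columns if c.startswith('test_')]

# Compare all test columns against the threshold at once.
# NaN < 75 evaluates to False, so a missing score is not treated as a fail
# (an assumption; the question does not say how missing scores should count).
failed_any = df[test_cols].lt(75).any(axis=1)

# Build the result as a separate Series indexed by student name,
# without adding per-test helper columns to df.
status = pd.Series(np.where(failed_any, 'fail', 'pass'),
                   index=df['student_name'], name='status')
print(status)
```

Because the comparison and the `any(axis=1)` reduction are vectorized, this avoids an explicit Python loop over rows, which tends to matter on a large DataFrame.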
Edit: I have very little knowledge of Python and pandas, so here is pseudocode showing how I would have implemented this in C/C++.
for student in student_list:
    value=0
    for i in range (no_of_test):
        if (score<75):
            value=value+1
        else:
            continue
    if(value>0):
        student[status]=fail
    else:
        student[status]=pass
The above is only pseudocode. It does not create any additional column to mark whether a student failed a given test. Is it possible to implement something similar in Python using pandas?
Please advise.
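For reference, a literal Python rendering of the pseudocode above might look like the sketch below. It is only an illustration of the loop logic: the names `test_cols`, `failed`, and `status` are hypothetical, the test columns are assumed to start with `test_`, and a missing score (NaN) is assumed not to count as a fail. An explicit row loop like this is likely to be slow on a huge DataFrame.

```python
import numpy as np
import pandas as pd

score = {'student_name': ['Jiten', 'Jac', 'Ali', 'Steve', 'Dave', 'James'],
         'test_quiz_1': [74, 81, 84, 67, 59, 96],
         'test_quiz_2': [76, np.nan, 99, 77, 53, 69],
         'test_mid_term': [76, 88, 84, 67, 58, np.nan],
         'test_final_term': [76, 78, 89, 67, 58, 96]}
df = pd.DataFrame(score)

# Assumption: every test column is named with the prefix "test_".
test_cols = [c for c in df.columns if c.startswith('test_')]

status = {}  # maps student name -> 'fail' or 'pass'
for row in df.itertuples(index=False):
    failed = 0  # number of tests with a score below 75
    for col in test_cols:
        # NaN < 75 is False, so missing scores are skipped here.
        if getattr(row, col) < 75:
            failed += 1
    status[row.student_name] = 'fail' if failed > 0 else 'pass'

print(status)
```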