I am using pandas with Python and I have a DataFrame, data. I have another DataFrame, missing_vals, which contains a field column and a key column. The field column contains names that correspond to the columns of data, i.e. data.columns ~= missing_vals['field']. The mapping, however, is not one-to-one: some entries in missing_vals['field'] do not exist in data.columns. I took care of that with a set intersection, which gave me an array, result, containing all the values that are in both missing_vals['field'] and data.columns. Now, for each element of result, I want to look at the corresponding column of data, find every entry equal to the matching value in missing_vals['key'], and replace it with NaN. I tried using for-loops, but I know this is not the ideal way to do it. Is there a way to do this with vectorized or lambda operations, or perhaps with other DataFrame functions? I am new to pandas, so I would really appreciate some help.
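To make the setup concrete, here is a minimal sketch with made-up data. The column names, the key values, and the use of Index.intersection for the set intersection are assumptions for illustration only; the real frames and the real intersection step may look different.

```python
import pandas as pd

# Hypothetical stand-ins for the real frames.
data = pd.DataFrame({"age": [25, -1, 40], "income": [999, 50000, 999]})
missing_vals = pd.DataFrame({
    "field": ["age", "income", "height"],  # "height" is not a column of data
    "key": [-1, 999, 0],                   # value that marks "missing" per field
})

# Names that appear both in data.columns and in missing_vals['field'].
result = data.columns.intersection(missing_vals["field"])
print(list(result))
```

Here "height" drops out of result because data has no such column, which is the not-one-to-one case described above.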
Here is my code so far:
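import numpy as np

for i in range(len(missing_vals)):
    field = missing_vals['field'][i]
    if field not in result:
        continue  # skip fields that are not columns of data
    for j in range(data[field].size):
        if data[field][j] == missing_vals['key'][i]:
            # assign in place; data.replace(...) on its own returns a new frame
            data.loc[j, field] = np.nan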
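The row-by-row loop above can probably be avoided by comparing each column to its key in one vectorized step. The following is only a sketch under assumptions: mask_missing is a hypothetical helper name, the sample frames are made up, and it assumes the field and key columns are laid out as described in the question.

```python
import pandas as pd

def mask_missing(data, missing_vals):
    """Return a copy of data with each column's key value replaced by NaN."""
    # Series of keys indexed by field name, restricted to fields that
    # are actual columns of data (the set-intersection step).
    keys = missing_vals.set_index("field")["key"]
    keys = keys[keys.index.isin(data.columns)]
    out = data.copy()
    for field, key in keys.items():
        # One vectorized comparison per column; no loop over rows.
        out[field] = out[field].mask(out[field] == key)
    return out

# Hypothetical sample data, for illustration only.
data = pd.DataFrame({"age": [25, -1, 40], "income": [999, 50000, 999]})
missing_vals = pd.DataFrame({"field": ["age", "income", "height"],
                             "key": [-1, 999, 0]})
cleaned = mask_missing(data, missing_vals)
print(cleaned)
```

The remaining loop runs once per column rather than once per cell. DataFrame.replace also accepts a nested dict of the form {column: {old_value: new_value}}, which might let this be written as a single call, though that variant is not shown or tested here.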
Thanks