I am trying my hand at spam filters. I tried several methods to label text files as spam, and as a result I have three dataframes. They basically look like this:
import pandas as pd
df_method_1 = pd.DataFrame({'file': ['A', 'B', 'C'], 'spam': ['1', '0', '0']})
df_method_2 = pd.DataFrame({'file': ['A', 'B', 'C'], 'spam': ['1', '1', '0']})
df_method_3 = pd.DataFrame({'file': ['A', 'B', 'C'], 'spam': ['1', '1', '0']})
I am now trying to create a dataframe showing whether a file was labeled as spam and, if so, by which method.
In the best case, I can create a dataframe containing the following information:
df_summary = pd.DataFrame({'file': ['A', 'B', 'C'], 'spam': ['All methods', 'Method 2 & Method 3', 'No method']})
Obviously, I only need the information itself; the exact strings do not matter.
I tried pandas.DataFrame.isin() to make it happen. But I failed. Any ideas how to do this?
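
One possible approach, as a minimal sketch rather than a definitive answer: put each method's labels side by side as boolean columns keyed by file, then turn each row into a description. It assumes all three dataframes list the same files and that spam is stored as the strings '1' and '0', as in the sample data. The names methods, flags and describe, and the labels 'Method 1' to 'Method 3', are made up for illustration.

```python
import pandas as pd

df_method_1 = pd.DataFrame({'file': ['A', 'B', 'C'], 'spam': ['1', '0', '0']})
df_method_2 = pd.DataFrame({'file': ['A', 'B', 'C'], 'spam': ['1', '1', '0']})
df_method_3 = pd.DataFrame({'file': ['A', 'B', 'C'], 'spam': ['1', '1', '0']})

# Hypothetical labels for the three methods (not part of the original data).
methods = {
    'Method 1': df_method_1,
    'Method 2': df_method_2,
    'Method 3': df_method_3,
}

# One boolean column per method, indexed by file.
# Assumption: spam is stored as the strings '1' (spam) and '0' (not spam).
flags = pd.concat(
    {name: df.set_index('file')['spam'].eq('1') for name, df in methods.items()},
    axis=1,
).rename_axis('file')


def describe(row):
    """Turn one row of per-method booleans into a readable summary string."""
    hits = [name for name, hit in row.items() if hit]
    if len(hits) == len(row):
        return 'All methods'
    if not hits:
        return 'No method'
    return ' & '.join(hits)


# Collapse each row into a single label and restore 'file' as a column.
df_summary = flags.apply(describe, axis=1).rename('spam').reset_index()
```

If only the information is needed and not the strings, the intermediate flags frame may already be enough: it has one row per file and one True/False column per method.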