I have a pandas dataframe with 21 columns. I am focusing on a subset of rows that have exactly the same values in every column except for 6 columns, whose values are unique to each row. I don't know a priori which column headings these 6 values correspond to.
I tried converting each row to an Index object and performing a set operation on two rows. For example:
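import pandas as pd
row1 = pd.Index(sample_data[0])
row2 = pd.Index(sample_data[1])
# set difference; in recent pandas, row1 - row2 is element-wise subtraction instead
row1.difference(row2)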
which returns an Index object containing values unique to row1. Then I can manually deduce which columns have unique values.
How can I programmatically get the column headings that these values correspond to in the original dataframe? Alternatively, is there a way to compare two or more dataframe rows and extract the 6 differing column values of each row, along with the corresponding headings? Ideally, I would like to generate a new dataframe containing only the differing columns.
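To make the goal concrete, here is a minimal sketch of the kind of result I am after, on a made-up toy dataframe (the column names and values are hypothetical, and I have shrunk it to 4 columns with 2 differing ones). It counts distinct values per column with `nunique`, which is only one possible approach and not a set operation:

```python
import pandas as pd

# Hypothetical toy data: columns "a" and "d" are identical across rows,
# columns "b" and "c" differ from row to row.
df = pd.DataFrame({
    "a": [1, 1, 1],
    "b": [10, 20, 30],
    "c": ["x", "y", "z"],
    "d": [0.5, 0.5, 0.5],
})

# A column "differs" if it holds more than one distinct value across the rows.
# dropna=False makes NaN count as a value of its own (an assumption about
# how missing data should be treated).
varying = df.nunique(dropna=False) > 1   # boolean Series indexed by column name

diff_cols = df.columns[varying.to_numpy()]  # headings of the differing columns
unique_part = df.loc[:, varying]            # new dataframe with only those columns

print(list(diff_cols))
print(unique_part)
```

On my real data I would expect `diff_cols` to hold the 6 headings and `unique_part` to be the new dataframe with just those 6 columns.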
In particular, is there a way to do this using set operations?
Thank you.