Sample data:
import pandas as pd
from numpy import nan
sample_data = [
{'Case #': 'A25', 'Parent Case #': 'A24', 'Data': 'Blah blah'},
{'Case #': 'B46', 'Parent Case #': nan, 'Data': 'Waka waka'},
{'Case #': 'B89', 'Parent Case #': 'B46', 'Data': 'Moo moo'},
{'Case #': 'C12', 'Parent Case #': nan, 'Data': 'Meow'},
{'Case #': 'C44', 'Parent Case #': nan, 'Data': 'Woof'},
{'Case #': 'C77', 'Parent Case #': 'C12', 'Data': 'Hiss'},
{'Case #': 'D55', 'Parent Case #': 'D2', 'Data': 'Ribbet'}
]
df = pd.DataFrame(sample_data)
The data consists of cases that may or may not have parent cases (i.e., each case is either a child or a top-level case). There are no grandchildren, so the maximum depth is 1.
However, some of the referenced parents are not present in this data set, and so these cases are effectively orphans.
For the purposes of my data, simply removing the reference to the parent will suffice for orphans. I can identify these orphans like so:
df.loc[~df["Parent Case #"].isna() & ~df["Parent Case #"].isin(df["Case #"].values)]
For the two matching rows (A25 and D55), I want to remove the "Parent Case #" reference, i.e., set that value to nan / empty for only these two rows. How do I do this? I feel like I am just missing one final step: I'm not sure how to do an assignment through a condition that combines two tests with &.
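For reference, one possible way to do this is sketched below. It assumes the combined condition can be stored as a boolean Series and passed to `.loc` together with the column label. The names `parent` and `is_orphan` are only illustrative, and the sample data is repeated so the snippet runs on its own.

```python
import numpy as np
import pandas as pd

nan = np.nan
sample_data = [
    {'Case #': 'A25', 'Parent Case #': 'A24', 'Data': 'Blah blah'},
    {'Case #': 'B46', 'Parent Case #': nan, 'Data': 'Waka waka'},
    {'Case #': 'B89', 'Parent Case #': 'B46', 'Data': 'Moo moo'},
    {'Case #': 'C12', 'Parent Case #': nan, 'Data': 'Meow'},
    {'Case #': 'C44', 'Parent Case #': nan, 'Data': 'Woof'},
    {'Case #': 'C77', 'Parent Case #': 'C12', 'Data': 'Hiss'},
    {'Case #': 'D55', 'Parent Case #': 'D2', 'Data': 'Ribbet'},
]
df = pd.DataFrame(sample_data)

# Build the condition once: a parent is referenced, but it is not a known case.
parent = df["Parent Case #"]
is_orphan = parent.notna() & ~parent.isin(df["Case #"])

# .loc[row_mask, column_label] assigns only to the selected cells.
df.loc[is_orphan, "Parent Case #"] = np.nan

print(df)
```

Passing the column label as the second argument to `.loc` is what restricts the assignment to "Parent Case #"; without it, the whole matching rows would be overwritten.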