Say I have a dataframe of leads like this:
import pandas as pd
leads = {
    'Unique Identifier': ['1', '2', '3', '4', '5', '6', '7', '8'],
    'Name': ['brad', 'stacy', 'holly', 'mike', 'phil', 'chris', 'jane', 'glenn'],
    'Channel': [None, None, None, None, 'facebook', 'facebook', 'google', 'facebook'],
    'Campaign': [None, None, None, None, 'A', 'B', 'B', 'C'],
    'Gender': ['M', 'F', 'F', 'M', 'M', 'M', 'F', 'M'],
    'Signup Month': ['Mar', 'Mar', 'Apr', 'May', 'May', 'May', 'Jun', 'Jun'],
}
leads_df = pd.DataFrame(leads)
leads_df
The Channel and Campaign values are missing (None) for the first 4 leads.
I have a separate dataframe with the missing data:
missing = {
    'Unique Identifier': ['1', '2', '3', '4'],
    'Channel': ['google', 'email', 'facebook', 'google'],
    'Campaign': ['B', 'A', 'C', 'B'],
}
missing_df = pd.DataFrame(missing)
missing_df
Using the Unique Identifier column that both tables share, how can I fill in the missing data in the main leads table? For context, about 6,000 leads have missing data.
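To make the goal concrete, here is a minimal sketch of one possible approach. It is not necessarily the best one, and it assumes that Unique Identifier values are unique within each table and stored with the same dtype (strings here). The name `filled_df` is just an illustrative choice. The idea is to index both frames by the identifier so that `fillna` can align rows by key and fill only the cells that are currently missing:

```python
import pandas as pd

leads_df = pd.DataFrame({
    'Unique Identifier': ['1', '2', '3', '4', '5', '6', '7', '8'],
    'Name': ['brad', 'stacy', 'holly', 'mike', 'phil', 'chris', 'jane', 'glenn'],
    'Channel': [None, None, None, None, 'facebook', 'facebook', 'google', 'facebook'],
    'Campaign': [None, None, None, None, 'A', 'B', 'B', 'C'],
    'Gender': ['M', 'F', 'F', 'M', 'M', 'M', 'F', 'M'],
    'Signup Month': ['Mar', 'Mar', 'Apr', 'May', 'May', 'May', 'Jun', 'Jun'],
})
missing_df = pd.DataFrame({
    'Unique Identifier': ['1', '2', '3', '4'],
    'Channel': ['google', 'email', 'facebook', 'google'],
    'Campaign': ['B', 'A', 'C', 'B'],
})

key = 'Unique Identifier'

# Index both frames by the shared key so values align by identifier,
# then fill only the cells that are missing in the leads table.
# Assumption: identifiers are unique within each frame.
filled_df = (
    leads_df.set_index(key)
    .fillna(missing_df.set_index(key))
    .reset_index()
)

print(filled_df)
```

In this sketch, values already present in `leads_df` are left untouched, and `leads_df` itself is not modified; the result is a new frame with the original row and column order.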