I'm pulling my hair out here. I need to replace null values in a pandas DataFrame column. These are specifically null values, not NaN values.
I've tried:
trainData['Embarked'].replace(trainData['Embarked'].isnull, embarkedMost, regex=True)
trainData['Embarked'].replace('', embarkedMost, regex=True)
trainData['Embarked'].replace('', embarkedMost, regex=True, inplace=True)
trainData['Embarked'].str.replace('', embarkedMost, regex=True)
trainData['Embarked'].isnull().replace(np.nan, embarkedMost, regex=True)
trainData['Embarked'].fillna(embarkedMost)
trainData['Embarked'].str.replace(np.Nan, embarkedMost, regex=True)
trainData['Embarked'].str.replace(pd.isnull, embarkedMost, regex=True)
trainData['Embarked'].replace(r'^\s+$', embarkedMost, regex=True, inplace=True)
Then:
trainData.to_csv(os.path.join(os.path.dirname(__file__), 'full.csv'), sep=',')
Afterwards I load the dataset into Excel to check, but none of these change the dataset.
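For what it's worth, a minimal sketch of what I understand the assignment issue to be (toy DataFrame and fill value are hypothetical stand-ins for trainData and embarkedMost): fillna returns a new Series by default, so the result has to be assigned back (or inplace=True used) for the DataFrame to change.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for trainData and embarkedMost
trainData = pd.DataFrame({'Embarked': ['S', 'C', np.nan, 'Q', np.nan]})
embarkedMost = 'S'

# fillna returns a NEW Series; without assigning the result
# (or passing inplace=True) the change is silently discarded
trainData['Embarked'].fillna(embarkedMost)  # no effect on trainData

# Assigning the result back persists it in the DataFrame
trainData['Embarked'] = trainData['Embarked'].fillna(embarkedMost)
print(trainData['Embarked'].isnull().sum())  # prints 0
```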
This provides me with the correct indices for empty values:
print(np.where(pd.isnull(trainData['Embarked'])))
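The check above can be reproduced on a toy Series (values here are made up): pd.isnull flags None and np.nan, but not empty strings, which may matter if the "null" cells are actually '' in the CSV.

```python
import numpy as np
import pandas as pd

# Hypothetical column mixing an empty string with true missing values
s = pd.Series(['S', '', None, np.nan])

# pd.isnull catches None and np.nan, but NOT the empty string at index 1
print(np.where(pd.isnull(s)))
```

If the empty cells are strings, one option is to normalize them first with s.replace('', np.nan) and then fill.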
I wanted to use apply with a lambda, but I read that it is horribly inefficient.
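For completeness, a vectorized alternative that avoids apply entirely is boolean-mask assignment with .loc (again with hypothetical stand-ins for trainData and embarkedMost):

```python
import pandas as pd

# Hypothetical stand-ins mirroring the names used above
trainData = pd.DataFrame({'Embarked': ['S', None, 'Q']})
embarkedMost = 'S'

# Build a boolean mask of missing rows, then assign in place via .loc
mask = trainData['Embarked'].isnull()
trainData.loc[mask, 'Embarked'] = embarkedMost
```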