Try this to modify the objects in place, so the changes apply to the original DataFrames rather than to copies:
for i in [ages, vels, vendors, mt, base_tbl]:
    i.drop_duplicates(subset='IDs', keep="last", inplace=True)
    i['IDs'] = i['IDs'].astype(str)
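To see why the in-place form matters, here is a minimal sketch contrasting it with rebinding the loop variable. The small `ages` frame and the `rows_after_*` names are made up for illustration; only the `IDs` column name comes from the code above.

```python
import pandas as pd

# Hypothetical stand-in for one of the real tables (ages, vels, ...).
ages = pd.DataFrame({"IDs": [1, 1, 2], "val": [10, 20, 30]})

# Rebinding: `i` is pointed at a NEW DataFrame, so `ages` is untouched.
for i in [ages]:
    i = i.drop_duplicates(subset="IDs", keep="last")
rows_after_rebinding = len(ages)  # still 3 rows

# In place: the object that `ages` refers to is itself modified.
for i in [ages]:
    i.drop_duplicates(subset="IDs", keep="last", inplace=True)
    i["IDs"] = i["IDs"].astype(str)  # column assignment also mutates the object
rows_after_inplace = len(ages)  # now 2 rows
```

Assigning to `i` only changes what the name `i` points to, while `inplace=True` and `i['IDs'] = ...` both change the DataFrame that the original name still refers to.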
MVCE:
import pandas as pd
import numpy as np
np.random.seed(123)
df1 = pd.DataFrame(np.random.randint(0,100, (5,5)), columns=[*'abcde'])
df2 = pd.DataFrame(np.random.randint(0,100, (5,5)), columns=[*'abcde'])
df3 = pd.DataFrame(np.random.randint(0,100, (5,5)), columns=[*'abcde'])
for i in [df1, df2, df3]:
    i.drop_duplicates('b', keep='last', inplace=True)
    i['a'] = i['a'].astype(str)
df1.info()
df2.info()
df3.info()
print(df2)
Output:
<class 'pandas.core.frame.DataFrame'>
Int64Index: 5 entries, 0 to 4
Data columns (total 5 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   a       5 non-null      object
 1   b       5 non-null      int32
 2   c       5 non-null      int32
 3   d       5 non-null      int32
 4   e       5 non-null      int32
dtypes: int32(4), object(1)
memory usage: 160.0+ bytes
<class 'pandas.core.frame.DataFrame'>
Int64Index: 4 entries, 0 to 4
Data columns (total 5 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   a       4 non-null      object
 1   b       4 non-null      int32
 2   c       4 non-null      int32
 3   d       4 non-null      int32
 4   e       4 non-null      int32
dtypes: int32(4), object(1)
memory usage: 128.0+ bytes
<class 'pandas.core.frame.DataFrame'>
Int64Index: 5 entries, 0 to 4
Data columns (total 5 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   a       5 non-null      object
 1   b       5 non-null      int32
 2   c       5 non-null      int32
 3   d       5 non-null      int32
 4   e       5 non-null      int32
dtypes: int32(4), object(1)
memory usage: 160.0+ bytes
    a   b   c   d   e
0  84  39  66  84  47
1  61  48   7  99  92
3  34  97  76  40   3
4  69  64  75  34  58
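If you would rather avoid `inplace=True`, one alternative is to build cleaned copies and rebind the original names in a single assignment. This is only a sketch: the `clean` helper and the tiny frames are hypothetical, and the column names just follow the MVCE.

```python
import pandas as pd

# Hypothetical small frames; columns 'a' and 'b' mirror the MVCE.
df1 = pd.DataFrame({"a": [1, 2, 3], "b": [7, 7, 8]})
df2 = pd.DataFrame({"a": [4, 5, 6], "b": [9, 9, 9]})

def clean(df):
    """Return a de-duplicated copy with column 'a' cast to str."""
    out = df.drop_duplicates(subset="b", keep="last")
    return out.assign(a=out["a"].astype(str))

# Unpacking rebinds df1 and df2 themselves, not a throwaway loop variable.
df1, df2 = [clean(d) for d in (df1, df2)]
```

The trade-off is that every name has to be listed twice (once on each side of the assignment), so for many tables a dict of DataFrames keyed by name may be easier to maintain.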
df1
    a   b   c   d   e
0  97  30  52  12  50
3   2  86  41  11  98  # Note the gap in the index (rows 1 and 2 are gone): drop_duplicates worked.
4   0  48  71  94  61
i variable each time: i.drop_duplicates('IDs', keep='last', inplace=True)