I have a dataframe call it dfA,
ID Chronological Label
1 1 None
2 0 ONPEAPLFPH
3 0 JFECJGSQNS
4 1 None
5 1 None
6 0 MGMELTIVTJ
7 1 None
8 1 None
9 1 None
I want to assign a unique_id to the column Chronological such that each consequent repeated values has a "common" unique_id. That is I want the following desired output,
ID Chronological Label unique_id
1 1 None 1
2 0 ONPEAPLFPH 2
3 0 JFECJGSQNS 3
4 1 None 4
5 1 None 4
6 0 MGMELTIVTJ 5
7 1 None 6
8 1 None 6
9 1 None 6
I tried using a non-vectorized solution using for-loop but it is really slow,
starting_index = 0
unique_id = 1
dfs = []
for cL in dfA['Label'].unique():
if cL != "None":
current_index = dfA[dfA['Label']==cL].index.values[0]
sliced_df = dfA.iloc[starting_index:current_index+1, :]
sliced_df_ = sliced_df.copy()
if len(sliced_df_)>=1:
sliced_df_['unique_id'] = unique_id
starting_index = current_index
unique_id += 1
dfs.append(sliced_df_)
df_concat = pd.concat(dfs, axis=0)
Is there a more efficient way to solve it?