I am working with a dataframe that has to be grouped by DocumentID and PersonID. Within each group, if the EndDate column is empty, it should be filled with the EndDate from the row where DocCode is RT.
DocumentID,PersonID,DocCode,StartDate,Amount,EndDate
120303,110001,FB,5/18/21,245,
120303,110001,TW,5/25/21,460,
120303,110001,RT,6/1/21,,6/6/21
120303,110011,GK,4/1/21,0,
120303,110011,AK,4/8/21,128,
120303,110011,PL,4/12/21,128,
120303,110011,FB,4/16/21,256,
120303,110011,RT,4/28/21,,5/4/21
That part works fine, but there is another twist. Within each DocumentID and PersonID group, if the amount changes, the next amount's StartDate becomes the previous amount's EndDate. So the intermediate dataframe will look like this:
Then all rows within a group that have duplicate amounts, as well as rows with empty amounts, are collapsed into one row.
The final dataset will look like this:

Here is the code I am using to fill all the empty EndDate values from the row where DocCode is RT:
import pandas as pd

def fill_end_date(group):
    # Rows in this group whose DocCode is RT
    rt_doc = group[group["DocCode"] == "RT"]
    if not rt_doc.empty:
        end_date = rt_doc.iloc[0]["EndDate"]
        # Only fill when the RT row actually has an EndDate
        if pd.notnull(end_date):
            # All other rows with an empty EndDate get the RT row's EndDate
            group = group.fillna({"EndDate": end_date})
    return group

# Parse StartDate so the sort is chronological rather than alphabetical
df = pd.read_csv(path, parse_dates=["StartDate"]).sort_values(by=["StartDate"])
df = df.groupby(["DocumentID", "PersonID"]).apply(fill_end_date).reset_index(drop=True)
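
For the second step (amount changes and collapsing rows), here is a rough sketch of one possible reading of the rules above, not a confirmed solution. It assumes that each run of consecutive equal amounts keeps its earliest StartDate, that its EndDate is the StartDate of the next different amount, and that the last run takes the RT row's EndDate. The helper name `collapse_group` and the inline copy of the sample data are only for illustration.

```python
import io
import pandas as pd

# Sample rows copied from the question so the sketch runs on its own
CSV = """DocumentID,PersonID,DocCode,StartDate,Amount,EndDate
120303,110001,FB,5/18/21,245,
120303,110001,TW,5/25/21,460,
120303,110001,RT,6/1/21,,6/6/21
120303,110011,GK,4/1/21,0,
120303,110011,AK,4/8/21,128,
120303,110011,PL,4/12/21,128,
120303,110011,FB,4/16/21,256,
120303,110011,RT,4/28/21,,5/4/21
"""

df = pd.read_csv(io.StringIO(CSV))
for col in ["StartDate", "EndDate"]:
    # Assumed date format: month/day/two-digit year
    df[col] = pd.to_datetime(df[col], format="%m/%d/%y")


def collapse_group(group):
    """Hypothetical helper: one row per run of consecutive equal amounts."""
    group = group.sort_values("StartDate")
    # EndDate of the RT row, if the group has one
    rt_end = group.loc[group["DocCode"] == "RT", "EndDate"]
    final_end = rt_end.iloc[0] if not rt_end.empty else pd.NaT
    # Rows with an empty Amount (such as RT) only contribute their EndDate
    rows = group.dropna(subset=["Amount"])
    # Keep the first row of each run of consecutive equal amounts;
    # a repeated amount after a different one starts a new run
    rows = rows[rows["Amount"] != rows["Amount"].shift()].copy()
    # Each row ends where the next amount starts
    next_start = rows["StartDate"].shift(-1)
    if pd.notna(final_end):
        # The last row has no successor, so it takes the RT EndDate
        next_start = next_start.fillna(final_end)
    rows["EndDate"] = next_start
    return rows


# Looping over the groups avoids relying on groupby.apply behaviour
result = pd.concat(
    [collapse_group(g) for _, g in df.groupby(["DocumentID", "PersonID"])],
    ignore_index=True,
)
print(result)
```

If the intended rule differs (for example, if equal amounts that are not adjacent should also be merged), the run-detection line is the part that would need to change.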

pd.read_image doesn't exist yet