Input DF:
id  sub_id  id_created  id_last_modified  sub_id_created  lead_
1   10      12:00       7:00              12:00           1:00
1   20      12:00       7:00              1:00            2:30
1   30      12:00       7:00              2:30            7:00
1   40      12:00       7:05              7:00            null
Use case: I am trying to create a new column "time", where:
1. For the row with (id, max(sub_id)): time = id_last_modified - sub_id_created
2. Otherwise: time = sub_id_created - lead_
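To pin down the intended result independently of Spark, the two rules can be sketched in plain Python. This is only an illustration under stated assumptions: the clock times are read as 24-hour values on a single day (so 1:00 becomes 13:00, 2:30 becomes 14:30, 7:00 becomes 19:00), lead_ is taken to be the next row's sub_id_created within the same id (as the sample rows suggest), id_created is left out because neither rule uses it, and the helper names minutes and add_time are hypothetical.

```python
from itertools import groupby
from operator import itemgetter


def minutes(hhmm):
    """Convert an 'HH:MM' string to minutes since midnight."""
    hours, mins = hhmm.split(":")
    return int(hours) * 60 + int(mins)


# Sample rows from the table above. ASSUMPTION: times are 24-hour values on
# one day (1:00 -> 13:00, 2:30 -> 14:30, 7:00 -> 19:00, 7:05 -> 19:05).
rows = [
    {"id": 1, "sub_id": 10, "id_last_modified": "19:00", "sub_id_created": "12:00"},
    {"id": 1, "sub_id": 20, "id_last_modified": "19:00", "sub_id_created": "13:00"},
    {"id": 1, "sub_id": 30, "id_last_modified": "19:00", "sub_id_created": "14:30"},
    {"id": 1, "sub_id": 40, "id_last_modified": "19:05", "sub_id_created": "19:00"},
]


def add_time(rows):
    """Return copies of rows with 'lead_' and 'time' (in minutes) added."""
    out = []
    ordered = sorted(rows, key=itemgetter("id", "sub_id"))
    for _, group in groupby(ordered, key=itemgetter("id")):
        group = list(group)
        for i, row in enumerate(group):
            # lead_: the next row's sub_id_created within the same id;
            # there is no next row for max(sub_id), so it is None there.
            lead = group[i + 1]["sub_id_created"] if i + 1 < len(group) else None
            created = minutes(row["sub_id_created"])
            if lead is None:
                # Rule 1: the (id, max(sub_id)) row
                time = minutes(row["id_last_modified"]) - created
            else:
                # Rule 2: every other row, exactly as the rule is written
                time = created - minutes(lead)
            out.append({**row, "lead_": lead, "time": time})
    return out


result = add_time(rows)
for r in result:
    print(r["id"], r["sub_id"], r["lead_"], r["time"])
```

The point of the sketch is the branch: on the row with max(sub_id) there is no following row, so lead_ is empty and rule 2 alone cannot produce a value there. In PySpark the same branch would typically be written with when(...).otherwise(...) rather than a single subtraction.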
Code:
from pyspark.sql import Window
window = Window.partitionBy("id").orderBy("sub_id")
I am getting the expected output for all the rows except the (id, max(sub_id)) row, for which I am getting null.
Any suggestions on where I am going wrong would be helpful. Thanks.