Pandas - Mapping data based on a sequence

Question

I have a DataFrame of trip data in which each row represents one point/location on a trip.

trip_id, sequence, location, start_time
101, 1, point_a, 2020-05-01 00:00:01
101, 2, point_b, 2020-05-01 00:04:01
101, 3, point_c, 2020-05-01 00:14:01
102, 1, point_x, 2020-05-11 00:13:21
102, 2, point_y, 2020-05-11 00:14:01
103, 1, point_z, 2020-05-11 00:14:01
103, 3, point_za, 2020-05-11 00:20:01

I am trying to create a new DataFrame in which each row pairs two consecutive points/locations of the same trip, as shown below:

trip_id, sequence, start_location, start_time, sequence, end_location, end_time
101, 1, point_a, 2020-05-01 00:00:01, 2, point_b, 2020-05-01 00:04:01
101, 2, point_b, 2020-05-01 00:04:01, 3, point_c, 2020-05-01 00:14:01
102, 1, point_x, 2020-05-11 00:13:21, 2, point_y, 2020-05-11 00:14:01
103, 1, point_z, 2020-05-11 00:14:01, 3, point_za, 2020-05-11 00:20:01

Quang Hoang · Accepted Answer · 2020-05-20 16:09:44Z


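You can drop the first row of each trip to get the end points, drop the last row of each trip to get the start points, and concat the two side by side. This assumes the rows are already sorted by trip_id and sequence: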

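import pandas as pd

# every row except the first of each trip: the end points
bottoms = df[df.trip_id.duplicated()].reset_index(drop=True)
# every row except the last of each trip: the start points
tops = df[df.trip_id.duplicated(keep='last')].reset_index(drop=True)
# rename columns; drop bottoms' trip_id so it does not appear twice
tops = tops.rename(columns={'location': 'start_location'})
bottoms = bottoms.drop(columns='trip_id')
bottoms.columns = ['sequence', 'end_location', 'end_time']

pd.concat((tops, bottoms), axis=1)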
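
A different way to get the same pairing, shown here only as a sketch and not as the method above, is to shift each trip's rows up by one with groupby/shift and drop the last row of every trip. The sample frame is rebuilt from the question's data. The column name end_sequence is an assumption made here to avoid having two columns called sequence.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "trip_id": [101, 101, 101, 102, 102, 103, 103],
    "sequence": [1, 2, 3, 1, 2, 1, 3],
    "location": ["point_a", "point_b", "point_c",
                 "point_x", "point_y", "point_z", "point_za"],
    "start_time": pd.to_datetime([
        "2020-05-01 00:00:01", "2020-05-01 00:04:01", "2020-05-01 00:14:01",
        "2020-05-11 00:13:21", "2020-05-11 00:14:01",
        "2020-05-11 00:14:01", "2020-05-11 00:20:01",
    ]),
})

# Make sure points are ordered within each trip before pairing them.
df = df.sort_values(["trip_id", "sequence"]).reset_index(drop=True)

# For every row, fetch the next row of the same trip (NaN/NaT for the last point).
nxt = df.groupby("trip_id")[["sequence", "location", "start_time"]].shift(-1)
nxt.columns = ["end_sequence", "end_location", "end_time"]  # assumed names

pairs = (
    pd.concat([df.rename(columns={"location": "start_location"}), nxt], axis=1)
    .dropna(subset=["end_location"])  # the last point of a trip has no successor
    .reset_index(drop=True)
)
# shift() introduces NaN, which turns the integer column into float; cast it back.
pairs["end_sequence"] = pairs["end_sequence"].astype(int)

print(pairs)
```

Because the shift happens inside each trip_id group, a point is never paired with the first point of the next trip, and gaps in sequence (such as 1 to 3 in trip 103) are paired as they appear.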
