
I have a DataFrame:

Check In Date   Check Out Date  Number  stage
2020/5/22 16:23 2020/5/22 18:39 1         a
2020/5/22 22:41 2020/5/23 2:03  1         b
2020/5/23 2:04  2020/5/23 2:04  1         c
2020/5/23 2:04  2020/5/23 2:56  1         d
2020/5/23 2:56  2020/5/23 2:56  2         a
2020/5/24 8:39  2020/5/24 8:39  2         b
2020/5/24 8:40  2020/5/24 10:58 2         c
2020/5/24 10:59 2020/5/24 10:59 2         d


import pandas as pd

df = pd.DataFrame({'Check In Date': ['2020/5/22 16:23', '2020/5/22 22:41', '2020/5/23 2:04', '2020/5/23 2:04', '2020/5/23 2:56', '2020/5/24 8:39', '2020/5/24 8:40', '2020/5/24 10:59'],
                   'Check Out Date': ['2020/5/22 18:39', '2020/5/23 2:03', '2020/5/23 2:04', '2020/5/23 2:56', '2020/5/23 2:56', '2020/5/24 8:39', '2020/5/24 10:58', '2020/5/24 10:59'],
                   'Number': [1, 1, 1, 1, 2, 2, 2, 2],
                   'stage': ['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd']})

I am trying to calculate the time between consecutive stages for each Number, like this:

          1        2
a -> b  4:02:00  1 day 5:43:00
b -> c  0:01:00  0:01:00
c -> d  0:00:00  0:01:00

This is equivalent to:

                         1                                       2
a -> b  b: check in date - a: check out date    b: check in date - a: check out date
b -> c  c: check in date - b: check out date    c: check in date - b: check out date
c -> d  d: check in date - c: check out date    d: check in date - c: check out date

I have looked at examples for pandas and DataFrames, but I still don't know how to achieve this. Any thoughts?

1 Answer

Use DataFrameGroupBy.shift to shift the stage and Check Out Date columns within each Number, reshape with DataFrame.unstack, and then subtract the shifted check-out times from the check-in times with DataFrame.sub:

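# parse both columns so they can be subtracted
df['Check In Date'] = pd.to_datetime(df['Check In Date'])
df['Check Out Date'] = pd.to_datetime(df['Check Out Date'])

g = df.groupby('Number')
# shitfted: previous stage's check-out within each Number
# stage: label such as 'a -> b' built from the previous and current stage
df = (df.assign(shitfted = g['Check Out Date'].shift(),
                stage = g['stage'].shift() + ' -> ' + df['stage'])
        .set_index(['stage','Number'])[['Check In Date','shitfted']]
        .unstack()
        .dropna()
      )
# current check-in minus previous check-out, one column per Number
df = df['Check In Date'].sub(df['shitfted'])
print(df)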
Number        1               2
stage                          
a -> b 04:02:00 1 days 05:43:00
b -> c 00:01:00 0 days 00:01:00
c -> d 00:00:00 0 days 00:01:00
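
To make the shift step easier to follow, here is a minimal self-contained sketch that computes the same gaps in long form first and reshapes only at the end. It swaps set_index/unstack for DataFrame.pivot, and the names steps, transition, gap and wide are illustrative assumptions, not part of the answer's code.

```python
import pandas as pd

df = pd.DataFrame({
    'Check In Date': ['2020/5/22 16:23', '2020/5/22 22:41', '2020/5/23 2:04', '2020/5/23 2:04',
                      '2020/5/23 2:56', '2020/5/24 8:39', '2020/5/24 8:40', '2020/5/24 10:59'],
    'Check Out Date': ['2020/5/22 18:39', '2020/5/23 2:03', '2020/5/23 2:04', '2020/5/23 2:56',
                       '2020/5/23 2:56', '2020/5/24 8:39', '2020/5/24 10:58', '2020/5/24 10:59'],
    'Number': [1, 1, 1, 1, 2, 2, 2, 2],
    'stage': ['a', 'b', 'c', 'd', 'a', 'b', 'c', 'd'],
})
df['Check In Date'] = pd.to_datetime(df['Check In Date'])
df['Check Out Date'] = pd.to_datetime(df['Check Out Date'])

g = df.groupby('Number')
steps = df.assign(
    # previous stage label joined to the current one, e.g. 'a -> b'
    transition=g['stage'].shift() + ' -> ' + df['stage'],
    # current check-in minus the previous check-out within the same Number
    gap=df['Check In Date'] - g['Check Out Date'].shift(),
).dropna(subset=['transition'])  # the first stage of each Number has no predecessor

# one row per transition, one column per Number
wide = steps.pivot(index='transition', columns='Number', values='gap')
print(wide)
```

Keeping the long-form steps frame around can be convenient if the gaps need to be filtered or aggregated before reshaping.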

EDIT:

For all combinations, use a cross join within each Number (a self-merge on Number) and keep only the stage pairs generated by itertools.combinations:

df['Check In Date'] = pd.to_datetime(df['Check In Date'])
df['Check Out Date'] = pd.to_datetime(df['Check Out Date'])

from itertools import combinations

c = [f'{a} -> {b}' for a, b in combinations(df['stage'].unique(), 2)]
print(c)
['a -> b', 'a -> c', 'a -> d', 'b -> c', 'b -> d', 'c -> d']

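# self-merge on Number pairs every stage with every stage of the same Number
df = (df.merge(df, on='Number')
        .assign(stage = lambda x: x.pop('stage_x') + ' -> ' + x.pop('stage_y'))
        # keep only the pairs listed in c
        # (alternative to query: df = df[df['stage'].isin(c)])
        .query('stage in @c')
        .set_index(['stage','Number'])[['Check In Date_y','Check Out Date_x']]
        .unstack())
# later stage's check-in minus earlier stage's check-out
df = df['Check In Date_y'].sub(df['Check Out Date_x'])
print(df)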
Number        1               2
stage                          
a -> b 04:02:00 1 days 05:43:00
a -> c 07:25:00 1 days 05:44:00
a -> d 07:25:00 1 days 08:03:00
b -> c 00:01:00 0 days 00:01:00
b -> d 00:01:00 0 days 02:20:00
c -> d 00:00:00 0 days 00:01:00
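
As a rough illustration of why the filter is needed, the sketch below runs the same self-merge on a tiny three-stage frame. The merge produces every ordered pair, including reversed pairs and a stage paired with itself, and the combinations list keeps only the forward ones. The frame stages and the names pairs, wanted and kept are assumptions made up for this example.

```python
from itertools import combinations

import pandas as pd

stages = pd.DataFrame({'Number': [1, 1, 1], 'stage': ['a', 'b', 'c']})

# self-merge on Number: 3 x 3 = 9 ordered pairs, e.g. ('a', 'a'), ('b', 'a'), ('a', 'b')
pairs = stages.merge(stages, on='Number')
pairs['pair'] = pairs['stage_x'] + ' -> ' + pairs['stage_y']

# combinations yields each unordered pair once, in order of first appearance
wanted = [f'{a} -> {b}' for a, b in combinations(stages['stage'].unique(), 2)]

kept = pairs[pairs['pair'].isin(wanted)]
print(len(pairs), len(kept))
print(sorted(kept['pair']))
```

Because the merge grows with the square of the number of stages per Number, this approach is best suited to a modest number of stages.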

5 Comments

@jezrael Is it possible to convert the result from days, hours, minutes, and seconds into seconds?
@Joanne - Sure, change the last line to df = df['Check In Date'].sub(df['shitfted']).apply(lambda x: x.dt.total_seconds())
@jezrael - Thank you so much! One last question: is it possible to calculate all of the combinations between stages? For example, a->b, a->c, a->d, b->c, b->d, c->d
@Joanne - Yes, I need some time.
@Joanne - If there is any problem, let me know. It should work well as long as there are not too many combinations.
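
The seconds conversion suggested in the comments can be tried in isolation. This is a minimal sketch that rebuilds the first result table by hand (the frame gaps is an assumption for illustration) and applies Series.dt.total_seconds column by column.

```python
import pandas as pd

# hand-built copy of the timedelta result, one column per Number
gaps = pd.DataFrame(
    {1: ['04:02:00', '00:01:00', '00:00:00'],
     2: ['1 days 05:43:00', '00:01:00', '00:01:00']},
    index=['a -> b', 'b -> c', 'c -> d'],
).apply(pd.to_timedelta)

# total_seconds is a Series accessor method, so apply it to each column
seconds = gaps.apply(lambda col: col.dt.total_seconds())
print(seconds)
```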
