3

have stuck with pandas. I have df which contains every transaction (first column - index, sorted by time):

        email    date
43487   aaa     2017-10-11 08:28:39
42910   bbb     2017-09-24 07:49:52
45561   bbb     2017-12-03 11:03:56
47212   bbb     2018-01-02 12:25:52
89734   ccc     2018-02-02 12:25:52
89734   ccc     2018-03-02 12:20:52

I also have df2 which contains unique emails and min date (df1 with drop_duplicates, since it was sorted by time, i got min date by default):

        email    date
43487   aaa     2017-10-11 08:28:39
42910   bbb     2017-09-24 07:49:52
89734   ccc     2018-02-02 12:25:52

How to create column date2 in df2, containing the second min date for respective email in df1?

I tried for loop:

for email in df2['email']:
    df2.at[email, 'date2'] = df1.loc[df1['email'] == email]['date'].iloc[1] 
    if len(df1.loc[df1['email'] == email]['date']) > 1 else None

But it is very long (55k rows and 32 GB RAM - no result in 5 min).

Desired output is:

        email   date                date 2
43487   aaa     2017-10-11 08:28:39 None
42910   bbb     2017-09-24 07:49:52 2017-12-03 11:03:56
89734   ccc     2018-02-02 12:25:52 2018-03-02 12:20:52
3
  • Why first value of date2 column is None? Commented Dec 27, 2018 at 9:01
  • Because there is only one aaa transaction, there was no second one. Commented Dec 27, 2018 at 9:02
  • So, basically, I want a table with 3 columns - email, date of 1 transaction, date of 2 transactions. Commented Dec 27, 2018 at 9:03

1 Answer 1

2

You could use sort_values to sort the dates within each email, which can be done using a list of columns to sort by.

Then you GroupBy email and use nth to select the second element of each group.

date2 = df.sort_values(['email','date']).groupby('email').nth(1)

             date
email                    
bbb   2017-12-03 11:03:56
ccc   2018-03-02 12:20:52

Finally left merge with df2 on email:

df2.merge(date2, on = 'email', how = 'left')

    email        date_x              date_y
0   aaa 2017-10-11 08:28:39                 NaT
1   bbb 2017-09-24 07:49:52 2017-12-03 11:03:56
2   ccc 2018-02-02 12:25:52 2018-03-02 12:20:52
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.