How to join two tables with date conditions using SQL and Python?

Question

I have a SQL database with two tables, users and user_activities (see below). I'm trying to build a DataFrame from a query that returns the user_id and the number of sessions each user makes after the second day of their registration. A session is identified by the activity "session" in the user_activities table.

For that, I need to combine two tables. The first one, users, provides the user_id and the registration_date:

users table:

user_id	registration_date
1	2021-01-10 04:37:14
1	2021-01-10 10:37:24
2	2021-01-10 20:37:44
3	2021-01-10 20:10:14
2	2021-01-10 10:37:04

The other one, user_activities, tracks all the activities that each user makes:

user_activities table

user	activity	date
1	session	2021-01-10 04:37:14
1	mainPage	2021-01-10 10:37:24
2	session	2021-01-10 20:37:44
3	session	2021-01-10 20:10:14
4	session	2021-01-11 00:02:04
2	session	2021-01-12 00:03:04
4	session	2021-01-13 00:31:04
5	session	2021-01-14 20:23:04
2	session	2021-01-15 10:36:52
2	mainPage	2021-01-15 10:37:04

What I am trying to get

I would like to get a df with the user_id and the number of sessions made after the second day of their registration. Only the users with more than 0 sessions would be included in that df. It would be as follows:

user_id	n_sessions
2	2
4	1
5	1

To get the number of sessions per user, I previously wrote:

import mysql.connector
import pandas as pd

mydb = mysql.connector.connect(host="localhost", user="root", password="", database="users")
mycursor = mydb.cursor()

# sessions per user
mycursor.execute("SELECT user_id, COUNT(*) FROM user_activities WHERE name = 'session' GROUP BY user_id;")
sessions_per_user = pd.DataFrame(mycursor, columns=['user_id','n_sessions'])

But I don't know how to add the registration_date condition to the join. Does anyone know how to do it?

Gordon Linoff · Accepted Answer · 2021-04-12 23:32:46Z


This is a join and group by. Something like this:

SELECT u.user_id, COUNT(*)
FROM users u JOIN
     user_activities ua
     on ua.user = u.user_id
WHERE ua.name = 'session' AND
      ua.date > u.registration_date + interval 1 day
GROUP BY u.user_id;

I'm not sure exactly what you mean by "the number of sessions made after the second day of their registration." This interprets that as "at least 24 hours after the registration". The logic can be tweaked for other definitions.
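To make the difference between possible definitions concrete, here is a small Python sketch of two cutoffs. The registration timestamp is one of the question's sample rows; the session time and the calendar-day interpretation are assumptions chosen only for illustration.

```python
from datetime import datetime, timedelta

# Registration timestamp from one of the question's sample rows (user 2).
registered = datetime(2021, 1, 10, 20, 37, 44)

# Interpretation 1: at least 24 hours after the exact registration time,
# mirroring `registration_date + interval 1 day` in the query above.
cutoff_24h = registered + timedelta(days=1)

# Interpretation 2 (assumed alternative): midnight at the start of the
# second calendar day after the registration date.
midnight = datetime.combine(registered.date(), datetime.min.time())
cutoff_calendar = midnight + timedelta(days=2)

# Hypothetical session time, not taken from the question's data.
session = datetime(2021, 1, 11, 21, 0, 0)

print(cutoff_24h)       # 2021-01-11 20:37:44
print(cutoff_calendar)  # 2021-01-12 00:00:00
print(session > cutoff_24h, session > cutoff_calendar)  # True False
```

The same session counts under the first definition but not under the second, so it is worth pinning down which one is wanted before writing the WHERE clause.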

Based on your comment, you want:

      ua.date > date(u.registration_date) + interval 2 day
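If you want to check the logic without a MySQL server, something like the following sketch may help. It uses Python's built-in sqlite3 module as a stand-in, so the date arithmetic is written with SQLite's datetime(..., '+2 days') instead of MySQL's interval syntax. The column names (user, name, date) follow the queries above, and the rows are a subset of the question's sample data with exactly one registration per user, which is an assumption.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL one (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER, registration_date TEXT);
    CREATE TABLE user_activities (user INTEGER, name TEXT, date TEXT);
""")

# Illustrative rows: one registration per user (assumed, see lead-in).
conn.executemany("INSERT INTO users VALUES (?, ?)", [
    (1, "2021-01-10 04:37:14"),
    (2, "2021-01-10 20:37:44"),
    (3, "2021-01-10 20:10:14"),
])
conn.executemany("INSERT INTO user_activities VALUES (?, ?, ?)", [
    (1, "session", "2021-01-10 04:37:14"),
    (1, "mainPage", "2021-01-10 10:37:24"),
    (2, "session", "2021-01-10 20:37:44"),
    (3, "session", "2021-01-10 20:10:14"),
    (2, "session", "2021-01-12 00:03:04"),
    (2, "session", "2021-01-15 10:36:52"),
    (2, "mainPage", "2021-01-15 10:37:04"),
])

# SQLite equivalent of `date(u.registration_date) + interval 2 day`.
query = """
    SELECT u.user_id, COUNT(*) AS n_sessions
    FROM users u
    JOIN user_activities ua ON ua.user = u.user_id
    WHERE ua.name = 'session'
      AND ua.date > datetime(date(u.registration_date), '+2 days')
    GROUP BY u.user_id
"""
rows = conn.execute(query).fetchall()
print(rows)  # [(2, 2)]
conn.close()
```

Because this is an inner join, users with no qualifying sessions drop out of the result, and a user with several rows in users would have each session counted once per registration row.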

edited Apr 12, 2021 at 23:32

answered Apr 12, 2021 at 23:19


1 Comment

srgam Over a year ago

Yeah... "the number of sessions made after the second day of their registration" means that if someone registered on the 10th, I want to see the number of sessions after the 12th. I think your solution can already do that with the interval set to 2 days instead of 1.
