
I have a SQL database with two tables, users and user_activities (see below). I'm trying to build a dataframe from a query that returns the user_id and the number of sessions each user makes after the second day of their registration. A session is identified by the activity "session" in the user_activities table.

To do that, I need to combine the two tables. The first one, users, provides the user_id and the registration_date:

users table:

user_id registration_date
1 2021-01-10 04:37:14
1 2021-01-10 10:37:24
2 2021-01-10 20:37:44
3 2021-01-10 20:10:14
2 2021-01-10 10:37:04

The other one, user_activities, tracks all the activities that each user makes:

user_activities table:

user activity date
1 session 2021-01-10 04:37:14
1 mainPage 2021-01-10 10:37:24
2 session 2021-01-10 20:37:44
3 session 2021-01-10 20:10:14
4 session 2021-01-11 00:02:04
2 session 2021-01-12 00:03:04
4 session 2021-01-13 00:31:04
5 session 2021-01-14 20:23:04
2 session 2021-01-15 10:36:52
2 mainPage 2021-01-15 10:37:04

What I am trying to get

I would like to get a df with the user_id and the number of sessions made after the second day of their registration. Only the users with more than 0 sessions would be included in that df. It would be as follows:

user_id n_sessions
2 2
4 1
5 1

To get the number of sessions per user, I previously wrote:

import mysql.connector
import pandas as pd

mydb = mysql.connector.connect(host="localhost", user="root", password="", database="users")
mycursor = mydb.cursor()

# sessions per user
mycursor.execute("SELECT user_id, COUNT(*) FROM user_activities WHERE name = 'session' GROUP BY user_id;")
# fetchall() materializes the result rows before building the DataFrame
sessions_per_user = pd.DataFrame(mycursor.fetchall(), columns=['user_id','n_sessions'])

But I don't know how to join the two tables and apply the registration_date condition. Does anyone know how to do it?

1 Answer

This is a join and group by. Something like this:

SELECT u.user_id, COUNT(*)
FROM users u JOIN
     user_activities ua
     on ua.user = u.user_id
WHERE ua.name = 'session' AND
      ua.date > u.registration_date + interval 1 day
GROUP BY u.user_id;

I'm not sure exactly what you mean by "the number of sessions made after the second day of their registration." This interprets that as "at least 24 hours after the registration". The logic can be tweaked for other definitions.
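To see how much the definition matters, here is a small Python sketch (plain datetime, no database) that compares the 24-hour cutoff used above with a calendar-day cutoff, meaning midnight of the registration day plus two days. The helper names and the late-evening session timestamp are made up for illustration; the registration timestamp is taken from the sample data.

```python
from datetime import datetime, timedelta

def cutoff_24h(registration):
    # Mirrors the SQL condition: u.registration_date + interval 1 day
    return registration + timedelta(days=1)

def cutoff_calendar(registration, days=2):
    # Calendar-day reading: truncate to midnight, then add whole days
    midnight = datetime.combine(registration.date(), datetime.min.time())
    return midnight + timedelta(days=days)

registration = datetime(2021, 1, 10, 20, 37, 44)
# Hypothetical session late on the day after registration
session = datetime(2021, 1, 11, 23, 0, 0)

print(cutoff_24h(registration))                 # 2021-01-11 20:37:44
print(cutoff_calendar(registration))            # 2021-01-12 00:00:00
print(session > cutoff_24h(registration))       # True
print(session > cutoff_calendar(registration))  # False
```

The same session is counted under one definition and excluded under the other, so it is worth pinning down which one you mean before writing the WHERE clause.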

Based on your comment, you want:

      ua.date > date(u.registration_date) + interval 2 day
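For anyone who wants to try the full query without a MySQL server, here is a rough end-to-end sketch using Python's built-in sqlite3 as a stand-in. Several things here are assumptions: SQLite has no `interval` syntax, so the condition is written as `datetime(date(...), '+2 days')`; the column names `user`, `name` and `date` follow the query above; and the sample rows are a trimmed version of the question's data with exactly one registration row per user, because duplicate rows in users would multiply the counts after the join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER, registration_date TEXT);
    CREATE TABLE user_activities (user INTEGER, name TEXT, date TEXT);
""")

# Assumed sample data: one registration row per user
conn.executemany("INSERT INTO users VALUES (?, ?)", [
    (1, "2021-01-10 04:37:14"),
    (2, "2021-01-10 20:37:44"),
    (3, "2021-01-10 20:10:14"),
])
conn.executemany("INSERT INTO user_activities VALUES (?, ?, ?)", [
    (1, "session", "2021-01-10 04:37:14"),
    (1, "mainPage", "2021-01-10 10:37:24"),
    (2, "session", "2021-01-10 20:37:44"),
    (3, "session", "2021-01-10 20:10:14"),
    (2, "session", "2021-01-12 00:03:04"),
    (2, "session", "2021-01-15 10:36:52"),
    (2, "mainPage", "2021-01-15 10:37:04"),
])

# SQLite spelling of: ua.date > date(u.registration_date) + interval 2 day
query = """
    SELECT u.user_id, COUNT(*) AS n_sessions
    FROM users u
    JOIN user_activities ua ON ua.user = u.user_id
    WHERE ua.name = 'session'
      AND ua.date > datetime(date(u.registration_date), '+2 days')
    GROUP BY u.user_id
"""
rows = conn.execute(query).fetchall()
print(rows)  # [(2, 2)]
```

Against MySQL, the same rows can be loaded into pandas with `pd.DataFrame(mycursor.fetchall(), columns=['user_id', 'n_sessions'])` after executing the MySQL version of the query. Note that the inner join only returns users that exist in the users table, so activity from users without a registration row is not counted.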

1 Comment

Yeah, "the number of sessions made after the second day of their registration" means that if someone registered on the 10th, I want to see the number of sessions after the 12th. I think your solution already does that with the interval set to 2 days instead of 1.
