1

I'm creating a messaging app as a side project and I'm trying to query a user's conversations efficiently.

The messages table structure is basic right now with some dummy data:

| id   | sender_id | receiver_id | message | created_at |
|------|-----------|-------------|---------|------------|
| 1    | 1         | 2           | text    | time       |
| 2    | 2         | 1           | text    | time       |
| 3    | 1         | 2           | text    | time       |
| 4    | 1         | 3           | text    | time       |
| 5    | 3         | 2           | text    | time       |
| 6    | 3         | 1           | text    | time       |
| 7    | 2         | 1           | text    | time       |

I'd like to be able to query the DB and group by "conversation" - A.K.A any rows that have the same sender_id or receiver_id in either column - rows (1, 2, 3, 7), (4, 6), (5). I'd like to be able to limit each "group" to n rows and order them by the created_at column. It would ideally look like (created_at values are arbitrary numbers to show descending values):

| id   | sender_id | receiver_id | message | created_at |
|------|-----------|-------------|---------|------------|
| 1    | 1         | 2           | text    | 400        |
| 2    | 2         | 1           | text    | 300        |
| 3    | 1         | 2           | text    | 200        |
| 7    | 2         | 1           | text    | 100        |
| 4    | 1         | 3           | text    | 700        |
| 6    | 3         | 1           | text    | 500        |
| 5    | 3         | 2           | text    | 300        |

Ideally there would be an additional column added that would number each group (easy to create multi-dimensional array.

So far I've been able to "group" by sender/receiver ids, order by created_at, and limit the number per group. However, It's not quite right. Here's the query:

SELECT
    filter.id, filter.sender_id, filter.receiver_id, filter.message, filter.created_at
FROM (
    SELECT messages.*,
        rank() OVER (
            PARTITION BY sender_id
            ORDER BY created_at DESC
        )
    FROM messages
    WHERE messages.sender_id = 1 or messages.receiver_id = 1
) filter WHERE rank <= 50;

My result set looks like this:

| id   | sender_id | receiver_id | message | created_at |
|------|-----------|-------------|---------|------------|
| 1    | 1         | 2           | text    | 400        |
| 3    | 1         | 2           | text    | 300        |
| 4    | 1         | 3           | text    | 700        |
| 2    | 2         | 1           | text    | 300        |
| 7    | 2         | 1           | text    | 100        |
| 6    | 3         | 1           | text    | 500        |
| 5    | 3         | 2           | text    | 300        |

You can see that row 3 and 6 should be grouped but aren't.

1 Answer 1

2

You can use rank(). To limit the number of records per conversation (ie sender/receiver or receiver/sender tuple), you can use a partition like least(sender_id, receiver_id), greatest(sender_id, receiver_id):

select filter.id, filter.sender_id, filter.receiver_id, filter.message, filter.created_at
from (
    select
        t.*,
        rank() over(
            partition by least(sender_id, receiver_id), greatest(sender_id, receiver_id)
            order by created_at desc
        ) rn
    from mytable t
) t
where rn <= 50
order by least(sender_id, receiver_id), greatest(sender_id, receiver_id), rn
Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.