Postgres: getting latest rows for an array of keys

Question

I have a simple table for the events log:

uid | event_id | event_data
----+----------+------------
  1 |  1       | whatever
  2 |  2       |
  1 |  3       |
  4 |  4       |
  4 |  5       |

If I need the latest event for a given user, that's obvious:

SELECT * FROM events WHERE uid=needed_uid ORDER BY event_id DESC LIMIT 1

However, suppose I need the latest event for each user id in an array. For example, for the table above and users {1, 4} I'd expect events {3, 5}. Is that possible in plain SQL without resorting to a PL/pgSQL loop?

Do you actually want an array as result? Or whole rows in the order of elements in the input array? This would raise the more interesting question how to preserve the given order ...
– Erwin Brandstetter, Commented Nov 17, 2014 at 9:21
@ErwinBrandstetter I need the rows, but the order does not really matter.
– bereal, Commented Nov 17, 2014 at 9:23
I added another answer anyway, there is sweet potential for better performance.
– Erwin Brandstetter, Commented Nov 17, 2014 at 14:23

score 3 · Accepted Answer · 2014-11-17 17:48:51Z

3

A Postgres-specific solution is to use DISTINCT ON, which is usually faster than the solution using a window function:

select distinct on (uid) uid, event_id, event_data
from events 
where uid in (1,4)
order by uid, event_id DESC
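
Since the question asks about an array of keys, the same query can probably take the array directly through `= ANY` instead of an `IN` list. A minimal sketch, assuming the ids arrive as an `int[]` value; the literal `'{1,4}'` is just the question's example and would normally be a bound parameter:

```sql
-- Same DISTINCT ON query, filtering with an array instead of an IN list.
-- '{1,4}'::int[] stands in for a bound parameter such as $1 (assumption).
SELECT DISTINCT ON (uid) uid, event_id, event_data
FROM   events
WHERE  uid = ANY ('{1,4}'::int[])
ORDER  BY uid, event_id DESC;   -- per uid, the row with the highest event_id is kept
```

For the sample table in the question this should return the rows with event_id 3 and 5.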

edited Nov 17, 2014 at 17:48

answered Nov 17, 2014 at 8:39

user330315


3 Comments

bereal Over a year ago

Why is order by uid, event_id, not just by event_id?

user330315 Over a year ago

@bereal: because distinct on requires the order by to start with the column(s) that are specified for distinct on. Try running it without it ;)

Mladen Uzelac Over a year ago

I get an error with the above answer. It seems that the comma after (uid) needs to be removed.
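
On the ORDER BY question in the comments above: Postgres rejects a DISTINCT ON query whose ORDER BY does not start with the DISTINCT ON expressions ("SELECT DISTINCT ON expressions must match initial ORDER BY expressions"). If the final result should be sorted by event_id alone, one option is to wrap the query. A sketch, where the alias `latest` is a made-up name:

```sql
-- Inner query: pick the latest event per uid (ORDER BY must lead with uid).
-- Outer query: re-sort the surviving rows however you like.
SELECT *
FROM  (
   SELECT DISTINCT ON (uid) uid, event_id, event_data
   FROM   events
   WHERE  uid IN (1, 4)
   ORDER  BY uid, event_id DESC
   ) latest                      -- hypothetical alias
ORDER  BY event_id;
```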

Deep · Accepted Answer · 2014-11-17 08:33:48Z

2

Try the query below:

select DesiredColumnList 
from 
(
    select *, row_number() over ( partition by uid order by event_id desc) rn
    from yourtable
) t
where rn = 1

row_number() assigns a unique number, starting from 1, to each row in descending order of event_id, and PARTITION BY uid restarts that numbering for each uid. The row with rn = 1 is therefore the latest event of each user.
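
To see the numbering described above, here is a self-contained sketch that rebuilds the question's sample rows in a CTE. It assumes the blank event_data cells are NULL, which the question does not state:

```sql
-- Sample rows from the question; NULL for the blank event_data cells is an assumption.
WITH events (uid, event_id, event_data) AS (
   VALUES (1, 1, 'whatever')
        , (2, 2, NULL)
        , (1, 3, NULL)
        , (4, 4, NULL)
        , (4, 5, NULL)
)
SELECT uid, event_id,
       row_number() OVER (PARTITION BY uid ORDER BY event_id DESC) AS rn
FROM   events
ORDER  BY uid, rn;
-- uid | event_id | rn
--   1 |        3 |  1
--   1 |        1 |  2
--   2 |        2 |  1
--   4 |        5 |  1
--   4 |        4 |  2
```

Filtering on rn = 1 in an outer query, plus a condition on uid, then leaves events 3 and 5 for users 1 and 4.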

answered Nov 17, 2014 at 8:33

Deep


DirkNM · Accepted Answer · 2014-11-17 08:34:41Z

1

Maybe this helps:

SELECT uid,
       event_id
  FROM(SELECT uid,
              event_id,
              ROW_NUMBER() OVER (PARTITION BY uid ORDER BY event_id DESC) AS rank
         FROM events
      ) ranked  -- Postgres before version 16 requires an alias for a subquery in FROM
 WHERE uid IN (1, 4)
   AND rank = 1

answered Nov 17, 2014 at 8:34

DirkNM


Erwin Brandstetter · Accepted Answer · 2022-03-15 03:07:18Z

1

To return rows in the original order of array elements:

Postgres 9.4 or newer

SELECT e.*
FROM   unnest('{1, 4}'::int[]) WITH ORDINALITY a(uid, ord)  -- input array here
CROSS  JOIN LATERAL (
   SELECT * FROM events e
   WHERE  e.uid = a.uid
   ORDER  BY e.event_id DESC
   LIMIT  1
   ) e
ORDER  BY a.ord;

Details for WITH ORDINALITY:

PostgreSQL unnest() with element number

There is a subtle difference from @a_horse's query: if the given array has duplicate elements, this query returns duplicate rows, which may or may not be desirable. If it is not, add a DISTINCT step after unnest() and before the join to the big table.
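
One way to do that de-duplication while still keeping the array order might look like the sketch below. It swaps a plain DISTINCT for GROUP BY with min(ord), so each uid keeps the position of its first occurrence; the duplicated array `'{1, 4, 1}'` is a made-up input for illustration:

```sql
SELECT e.*
FROM  (
   -- Collapse duplicate array elements, keeping the first position of each uid.
   SELECT uid, min(ord) AS ord
   FROM   unnest('{1, 4, 1}'::int[]) WITH ORDINALITY a(uid, ord)  -- hypothetical input
   GROUP  BY uid
   ) a
CROSS  JOIN LATERAL (
   SELECT * FROM events e
   WHERE  e.uid = a.uid
   ORDER  BY e.event_id DESC
   LIMIT  1
   ) e
ORDER  BY a.ord;
```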

The main benefit is optimized index usage.
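
The answer does not show an index, so the following is only an assumption about what "optimized index usage" would rely on: a multicolumn index that lets each LATERAL lookup read a single index entry per uid. The index name is made up:

```sql
-- Hypothetical index; not part of the original answer.
-- Each "WHERE uid = ... ORDER BY event_id DESC LIMIT 1" lookup can then
-- stop at the first matching index entry instead of sorting the user's rows.
CREATE INDEX events_uid_event_id_idx ON events (uid, event_id DESC);
```

A plain `(uid, event_id)` index should serve equally well here, since Postgres can scan a B-tree index backward.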

Postgres 9.3 or older

Using an implicit JOIN LATERAL:

SELECT e.*
FROM  (SELECT '{1, 4}'::int[]) a(arr)  -- input array here
     , generate_subscripts(a.arr, 1) i 
CROSS  JOIN LATERAL (
   SELECT * FROM events e
   WHERE  e.uid = a.arr[i.i]
   ORDER  BY e.event_id DESC
   LIMIT  1
   ) e
ORDER  BY i.i;

edited Mar 15, 2022 at 3:07

answered Nov 17, 2014 at 14:18

Erwin Brandstetter


1 Comment

bereal Over a year ago

You, sir, rock, I have to notice.

bereal · Accepted Answer · 2014-11-17 09:01:50Z

0

This came to me a few seconds after I posted the question. It's not as efficient, but to consider all the options:

SELECT * FROM events WHERE event_id IN 
    (SELECT MAX(event_id) FROM events WHERE uid IN (1,4) GROUP BY uid)

answered Nov 17, 2014 at 9:01

bereal
