Join only the last row of an associated table with PostgreSQL

Question

Given the following tables:

Table "public.vs_protocolo"
     Column      |            Type             |                         Modifiers                         
-----------------+-----------------------------+-----------------------------------------------------------
 id              | integer                     | not null default nextval('vs_protocolo_id_seq'::regclass)
 data_criacao    | timestamp without time zone | not null default now()
 ano_processo    | integer                     | not null
 numero_processo | integer                     | not null


Table "public.vs_protocolo_historico"
    Column    |            Type             |                              Modifiers                              
--------------+-----------------------------+---------------------------------------------------------------------
 id           | integer                     | not null default nextval('vs_protocolo_historico_id_seq'::regclass)
 id_protocolo | integer                     | not null
 descricao    | character varying(255)      | not null
 status       | integer                     | not null default 0
 data_criacao | timestamp without time zone | not null default now()

I must select all rows from vs_protocolo, each joined with only its last (most recent) row from vs_protocolo_historico.

I'm worried about performance, so the query should avoid subqueries or, at least, avoid running a subquery for every row of vs_protocolo.

Note: vs_protocolo_historico(id_protocolo) REFERENCES vs_protocolo(id).

Clodoaldo Neto · Accepted Answer · 2014-02-06 22:56:54Z

9

I think this one is simpler and faster:

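select
    id,  -- the single id column produced by the "using (id)" join below
    p.ano_processo,
    p.numero_processo,
    h.descricao,
    h.status,
    h.data_modificacao
from
    vs_protocolo p
    inner join
    (
        -- distinct on keeps the first row per id_protocolo in the sort order,
        -- which is the history row with the latest data_criacao
        select distinct on (id_protocolo)
            id_protocolo as id,
            descricao,
            status,
            data_criacao as data_modificacao
        from vs_protocolo_historico
        order by id_protocolo, data_criacao desc
    ) h using (id);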

1 Comment

paulodiovani Over a year ago

Much better, thanks. The query plan cost was very similar, but there is no SubPlan, so it may be faster with large data (actually, I have just a couple of rows).
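
Estimated cost alone can be misleading on a table with only a few rows. A minimal sketch of how the two queries could be compared on real execution time, assuming PostgreSQL 9.0 or later for the parenthesized EXPLAIN options (note that ANALYZE actually runs the query):

```sql
-- Run the DISTINCT ON version and report actual timings and buffer usage.
-- Repeat with the other query and compare "actual time" rather than "cost".
EXPLAIN (ANALYZE, BUFFERS)
SELECT
    id,
    p.ano_processo,
    p.numero_processo,
    h.descricao,
    h.status,
    h.data_modificacao
FROM vs_protocolo p
INNER JOIN (
    SELECT DISTINCT ON (id_protocolo)
        id_protocolo AS id,
        descricao,
        status,
        data_criacao AS data_modificacao
    FROM vs_protocolo_historico
    ORDER BY id_protocolo, data_criacao DESC
) h USING (id);
```

The comparison is only meaningful once the tables hold a realistic number of rows, since the planner may pick different plans as the data grows.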

paulodiovani · Accepted Answer · 2014-02-06 13:19:06Z

0

This is the easiest query for the case. Not sure if it remains fast for thousands of rows.

SELECT 
    p.id,
    p.ano_processo,
    p.numero_processo,
    h.descricao,
    h.status,
    h.data_criacao AS data_modificacao
FROM vs_protocolo p
INNER JOIN vs_protocolo_historico h
    ON h.id_protocolo = p.id
    AND h.data_criacao = (
        SELECT data_criacao
        FROM vs_protocolo_historico
        WHERE id_protocolo = p.id
        ORDER BY data_criacao DESC
        LIMIT 1
    );

And here is its query plan.

QUERY PLAN                                         
-------------------------------------------------------------------------------------------
 Hash Join  (cost=40.75..3161.13 rows=2 width=173)
   Hash Cond: ((h.id_protocolo = p.id) AND (h.data_criacao = (subplan)))
   ->  Seq Scan on vs_protocolo_historico h  (cost=0.00..14.10 rows=410 width=161)
   ->  Hash  (cost=22.30..22.30 rows=1230 width=16)
         ->  Seq Scan on vs_protocolo p  (cost=0.00..22.30 rows=1230 width=16)
   SubPlan
     ->  Limit  (cost=15.13..15.14 rows=1 width=8)
           ->  Sort  (cost=15.13..15.14 rows=2 width=8)
                 Sort Key: vs_protocolo_historico.data_criacao
                 ->  Seq Scan on vs_protocolo_historico  (cost=0.00..15.12 rows=2 width=8)
                       Filter: (id_protocolo = $0)

2 Comments

Jayadevan Over a year ago

Does the plan change if you index vs_protocolo_historico (id_protocolo) and vs_protocolo (id)?

paulodiovani Over a year ago

I thought pg would automatically create indexes for foreign keys. It seems that it doesn't. I'll add an index on vs_protocolo_historico (id_protocolo).
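
For reference, PostgreSQL creates indexes automatically for primary key and unique constraints, but not for the referencing column of a foreign key. A minimal sketch of the index mentioned in this comment follows. The index names are hypothetical, and the second, composite variant is an assumption about what may help the "latest row per id_protocolo" queries above, not something tested in this thread.

```sql
-- Plain index on the foreign key column, as discussed in the comments.
-- (The index name is hypothetical.)
CREATE INDEX vs_protocolo_historico_id_protocolo_idx
    ON vs_protocolo_historico (id_protocolo);

-- Alternative (assumption): a composite index that matches
-- ORDER BY id_protocolo, data_criacao DESC, so the planner has the option
-- of reading the latest history row per protocol in index order.
-- It also serves lookups on id_protocolo alone, so only one of the two
-- indexes is normally needed.
CREATE INDEX vs_protocolo_historico_protocolo_data_idx
    ON vs_protocolo_historico (id_protocolo, data_criacao DESC);
```

Whether the planner actually uses either index depends on table size and statistics, so it is worth checking the plan again with EXPLAIN after creating it.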
