Given the following tables:

Table "public.vs_protocolo"
     Column      |            Type             |                         Modifiers                         
-----------------+-----------------------------+-----------------------------------------------------------
 id              | integer                     | not null default nextval('vs_protocolo_id_seq'::regclass)
 data_criacao    | timestamp without time zone | not null default now()
 ano_processo    | integer                     | not null
 numero_processo | integer                     | not null


Table "public.vs_protocolo_historico"
    Column    |            Type             |                              Modifiers                              
--------------+-----------------------------+---------------------------------------------------------------------
 id           | integer                     | not null default nextval('vs_protocolo_historico_id_seq'::regclass)
 id_protocolo | integer                     | not null
 descricao    | character varying(255)      | not null
 status       | integer                     | not null default 0
 data_criacao | timestamp without time zone | not null default now()

I need to select all rows from vs_protocolo, each joined with its most recent row from vs_protocolo_historico.

I'm worried about performance, so the query should avoid subqueries or, at least, avoid running a subquery for every row of vs_protocolo.

Note: vs_protocolo_historico(id_protocolo) REFERENCES vs_protocolo(id).
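
For reference, the relationship in the note above could be declared roughly as follows. This is only a sketch: the constraint name is hypothetical, and it assumes vs_protocolo(id) is the primary key (or at least unique), which the table listing above does not show.

```sql
-- Hypothetical constraint name; the column pairing comes from the note above.
-- Assumes vs_protocolo.id has a PRIMARY KEY or UNIQUE constraint.
ALTER TABLE vs_protocolo_historico
    ADD CONSTRAINT vs_protocolo_historico_id_protocolo_fkey
    FOREIGN KEY (id_protocolo) REFERENCES vs_protocolo (id);
```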

2 Answers

I think this one is simpler and faster:

select
    id,
    p.ano_processo,
    p.numero_processo,
    h.descricao,
    h.status,
    h.data_modificacao
from
    vs_protocolo p
    inner join
    (
        -- distinct on keeps the first row of each id_protocolo group,
        -- which the order by below makes the most recent one
        select distinct on (id_protocolo)
            id_protocolo as id,
            descricao,
            status,
            data_criacao as data_modificacao
        from vs_protocolo_historico
        order by id_protocolo, data_criacao desc
    ) h using (id)
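
If two history rows for the same protocol can share the same data_criacao, distinct on returns one of them arbitrarily. A possible variant of the inner query adds the serial id as a tie-breaker. This assumes that a higher id means a later insert, which the question does not state.

```sql
select distinct on (id_protocolo)
    id_protocolo as id,
    descricao,
    status,
    data_criacao as data_modificacao
from vs_protocolo_historico
-- id desc is an assumed tie-breaker for rows with identical timestamps
order by id_protocolo, data_criacao desc, id desc;
```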

1 Comment

Much better, thanks. The query plan cost was very similar, but there is no SubPlan, so it may be faster with large data (actually, I have just a couple of rows).

This is the simplest query I could come up with for this case. I'm not sure it stays fast with thousands of rows.

SELECT 
    p.id,
    p.ano_processo,
    p.numero_processo,
    h.descricao,
    h.status,
    h.data_criacao AS data_modificacao
FROM vs_protocolo p
INNER JOIN vs_protocolo_historico h
    ON h.id_protocolo = p.id
    AND h.data_criacao = (
        SELECT data_criacao
        FROM vs_protocolo_historico
        WHERE id_protocolo = p.id
        ORDER BY data_criacao DESC
        LIMIT 1
    );

And here is its query plan.

QUERY PLAN                                         
-------------------------------------------------------------------------------------------
 Hash Join  (cost=40.75..3161.13 rows=2 width=173)
   Hash Cond: ((h.id_protocolo = p.id) AND (h.data_criacao = (subplan)))
   ->  Seq Scan on vs_protocolo_historico h  (cost=0.00..14.10 rows=410 width=161)
   ->  Hash  (cost=22.30..22.30 rows=1230 width=16)
         ->  Seq Scan on vs_protocolo p  (cost=0.00..22.30 rows=1230 width=16)
   SubPlan
     ->  Limit  (cost=15.13..15.14 rows=1 width=8)
           ->  Sort  (cost=15.13..15.14 rows=2 width=8)
                 Sort Key: vs_protocolo_historico.data_criacao
                 ->  Seq Scan on vs_protocolo_historico  (cost=0.00..15.12 rows=2 width=8)
                       Filter: (id_protocolo = $0)

2 Comments

Does the plan change if you index vs_protocolo_historico (id_protocolo) and vs_protocolo (id)?
I thought PostgreSQL automatically created indexes for foreign keys. It seems that it doesn't. I'll add an index to vs_protocolo_historico (id_protocolo).
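
A minimal sketch of the index discussed in these comments. Both index names are hypothetical. The composite form is an assumption rather than something suggested in the thread: including data_criacao may let the planner find the latest row per protocol without a separate sort, but that is worth confirming with EXPLAIN on real data.

```sql
-- Index on the referencing column, as proposed in the comment above.
CREATE INDEX vs_protocolo_historico_id_protocolo_idx
    ON vs_protocolo_historico (id_protocolo);

-- Alternative (assumption): a composite index matching the
-- "latest row per protocol" ordering used by both queries.
CREATE INDEX vs_protocolo_historico_protocolo_data_idx
    ON vs_protocolo_historico (id_protocolo, data_criacao DESC);
```

In practice only one of the two would normally be needed, since the composite index also serves lookups on id_protocolo alone.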
