Subquery or union of joins in postgres?

Question

I have so-called links that can have tags assigned to them, so I store them in three tables:

tag: id, name
tag_in_link: tag_id, link_id
link: id, url

Now I need to get basic tag counts: how many times a tag was used (including 0 times). I have two queries:

select t.id, t.name, count(*)
from tag as t inner join tag_in_link as tl
    on tl.tag_id = t.id
group by t.id, t.name
union
select t.id, t.name, 0
from tag as t left outer join tag_in_link as tl
    on tl.tag_id = t.id where tl.tag_id is null

and

select t.id, t.name,
       (select count(*) from tag_in_link as tl
              where tl.tag_id = t.id
       ) as count from tag as t

They both give the same results (up to the order of records) and run at about the same speed.

The problem is that I don't have much data to test with, but I need to pick one approach today. All I know is that there will be:

up to 100 tags
millions of links

So my question:

Which approach, a dependent subquery or a union of joins, performs better on large tables in Postgres?

Could you show the EXPLAIN output for both queries?

– khusnetdinov, Commented Jan 7, 2018 at 17:07

Laurenz Albe · Accepted Answer · 2018-01-08 07:11:11Z

1

The first query will be better for large data sets, because it does not force a nested loop.

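But why not use a single left join? Counting tl.tag_id rather than * makes unused tags come out as 0, because count ignores the NULLs produced for unmatched rows:

SELECT t.id, t.name, count(tl.tag_id)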
FROM tag AS t LEFT JOIN tag_in_link AS tl
    ON tl.tag_id = t.id
GROUP BY t.id, t.name;
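
One way to check the performance claim yourself is to generate synthetic data at roughly the expected scale and compare the plans. This is only a sketch: the column types, the keys, and the row counts (100 tags, 1,000,000 links, one tag per link) are assumptions based on the question, not the asker's real schema or data distribution.

```sql
-- Hypothetical schema for the three tables described in the question.
CREATE TABLE tag  (id int PRIMARY KEY, name text NOT NULL);
CREATE TABLE link (id int PRIMARY KEY, url text NOT NULL);
CREATE TABLE tag_in_link (
    tag_id  int NOT NULL REFERENCES tag (id),
    link_id int NOT NULL REFERENCES link (id),
    PRIMARY KEY (tag_id, link_id)
);

-- Assumed sizes: 100 tags and 1,000,000 links.
INSERT INTO tag
SELECT g, 'tag ' || g::text FROM generate_series(1, 100) AS g;

INSERT INTO link
SELECT g, 'http://example.com/' || g::text FROM generate_series(1, 1000000) AS g;

-- One tag per link, using only tags 1..90 so that tags 91..100 stay unused.
INSERT INTO tag_in_link (tag_id, link_id)
SELECT (g % 90) + 1, g FROM generate_series(1, 1000000) AS g;

ANALYZE;  -- refresh planner statistics before comparing plans

-- Single left join; count(tl.tag_id) so that unused tags report 0.
EXPLAIN (ANALYZE, BUFFERS)
SELECT t.id, t.name, count(tl.tag_id)
FROM tag AS t LEFT JOIN tag_in_link AS tl
    ON tl.tag_id = t.id
GROUP BY t.id, t.name;

-- Correlated subquery from the question, for comparison.
EXPLAIN (ANALYZE, BUFFERS)
SELECT t.id, t.name,
       (SELECT count(*) FROM tag_in_link AS tl
         WHERE tl.tag_id = t.id
       ) AS count
FROM tag AS t;
```

The plans you get depend on your Postgres version, configuration, and indexes, so treat the output as a comparison on your own setup rather than a general result.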

edited Jan 8, 2018 at 7:11

answered Jan 7, 2018 at 12:26


2 Comments

Trident D'Gao Over a year ago

this query doesn't give me 0 counts, which i need

Laurenz Albe Over a year ago

Sorry, I used an inner join instead of a left join. Fixed.

Parfait · Accepted Answer · 2018-01-07 14:40:35Z

0

Consider replacing the UNION with conditional aggregation over a single left join, which still avoids running a correlated subquery for every row.

select t.id, t.name, 
       sum(case when tl.tag_id is null then 0 else 1 end) as tag_count
from tag as t 
left join tag_in_link as tl
    on tl.tag_id = t.id
group by t.id, t.name
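
The CASE expression above adds 1 for every joined row whose tl.tag_id is not NULL. As a sketch of two other spellings that should return the same counts in Postgres (table and column names are taken from the question; the FILTER clause needs Postgres 9.4 or later):

```sql
-- count(column) skips NULLs, so unmatched tags contribute 0.
SELECT t.id, t.name,
       count(tl.tag_id) AS tag_count
FROM tag AS t
LEFT JOIN tag_in_link AS tl
    ON tl.tag_id = t.id
GROUP BY t.id, t.name;

-- The same condition written with an aggregate FILTER clause.
SELECT t.id, t.name,
       count(*) FILTER (WHERE tl.tag_id IS NOT NULL) AS tag_count
FROM tag AS t
LEFT JOIN tag_in_link AS tl
    ON tl.tag_id = t.id
GROUP BY t.id, t.name;
```

One small difference: for an empty group, sum returns NULL while count returns 0. That cannot occur here because the left join always yields at least one row per tag.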

answered Jan 7, 2018 at 14:40
