How to optimize sql query with subqueries, perhaps by lateral join?

Question

I'm trying to optimize sophisticated sql query, it would be executed on each map bonding box change. I thought that INNER LATERAL JOIN would be fastest but it isn't. Does anybody know how to speed up this query and how to make better use of LATERAL JOIN?

The fastest query I've made:

SELECT r0."id", r0."name" 
FROM "hiking"."routes" AS r0 
INNER JOIN "hiking"."hierarchy" AS h1 ON r0."id" = h1."parent" 
INNER JOIN (SELECT DISTINCT unnest(s0."rels") AS "rel" 
            FROM "hiking"."segments" AS s0 
            WHERE (ST_Intersects(s0."geom", ST_SetSrid(ST_MakeBox2D(ST_GeomFromText('POINT(1285982.015631 7217169.814674)', -1), ST_GeomFromText('POINT(2371999.313507 6454022.524275)', -1)), 3857)))) AS s2 ON TRUE 
WHERE (s2."rel" = h1."child");

Planning time: ~0.605 ms Execution time: ~37.232 ms

Actually the same as above but with LATERAL JOIN, is it correct that it is slower?

SELECT r0."id", r0."name" 
FROM "hiking"."routes" AS r0 
INNER JOIN "hiking"."hierarchy" AS h1 ON r0."id" = h1."parent" 
INNER JOIN LATERAL (SELECT DISTINCT unnest(s0."rels") AS "rel" 
                    FROM "hiking"."segments" AS s0 
                    WHERE (ST_Intersects(s0."geom", ST_SetSrid(ST_MakeBox2D(ST_GeomFromText('POINT(1285982.015631 7217169.814674)', -1), ST_GeomFromText('POINT(2371999.313507 6454022.524275)', -1)), 3857)))) AS s2 ON TRUE 
WHERE (s2."rel" = h1."child");

Planning time: ~1.353 ms Execution time: ~38.518 ms

Slowest query with subquery in subquery (that was my first so I've improved it a bit):

SELECT r0."id", r0."name" 
FROM "hiking"."routes" AS r0 
INNER JOIN (SELECT DISTINCT h0."parent" AS "parent" 
            FROM "hiking"."hierarchy" AS h0 
            INNER JOIN (SELECT DISTINCT unnest(s0."rels") AS "rel" 
                        FROM "hiking"."segments" AS s0 
                        WHERE (ST_Intersects(s0."geom", ST_SetSrid(ST_MakeBox2D(ST_GeomFromText('POINT(1285982.015631 7217169.814674)', -1), ST_GeomFromText('POINT(2371999.313507 6454022.524275)', -1)), 3857)))) AS s1 ON TRUE 
            WHERE (h0."child" = s1."rel")) AS s1 ON TRUE 
WHERE (r0."top" AND (r0."id" = s1."parent"));

Planning time: ~1.017 ms Execution time: ~41.288 ms

You're trying to optimize a query that completes in <40 ms? What do you want, light speed? — Evan Carroll
– Evan Carroll, Commented Jan 21, 2018 at 0:40
We also don't care about the slower-than-best queries, what we do care about is the EXPLAIN ANALYZE — Evan Carroll
– Evan Carroll, Commented Jan 21, 2018 at 0:41

Abelisto · Accepted Answer · 2018-01-21 01:07:16Z

3

It is hard to reproduce the logic of your query without any knowledges about your DB but I will try, so be patient:

SELECT r0."id", r0."name" 
FROM "hiking"."routes" AS r0 
INNER JOIN "hiking"."hierarchy" AS h1 ON r0."id" = h1."parent" 
WHERE 
  EXISTS (
    SELECT 1
    FROM "hiking"."segments" AS s0 
    WHERE (
      ST_Intersects(
        s0."geom",
        ST_SetSrid(ST_MakeBox2D(ST_GeomFromText('POINT(1285982.015631 7217169.814674)', -1), ST_GeomFromText('POINT(2371999.313507 6454022.524275)', -1)),
        3857)))
      AND array[h1."child"] <@ s0."rels");

There are two points:

Filtering data by EXISTS or NOT EXISTS sometimes faster then by joining
Instead of unnesting array field to compare its elements with some value you could to use the array comparison operator(s). Having appropriate GIN index it is much faster (docs here and here).

Here is simple example how to use indexes on arrays and how its faster:

create table foo(bar int[]);
insert into foo(bar) select array[1,2,3,x] from generate_series(1,1000000) as x;
create index idx on foo using gin (bar); // Note this
select * from foo where 666 in (select unnest(bar)); // 6936,345 ms on my HW
select * from foo where array[666] <@ bar; // 45,524 ms

edited Jan 21, 2018 at 1:07

answered Jan 21, 2018 at 0:43

Abelisto

15.8k3 gold badges38 silver badges47 bronze badges

Sign up to request clarification or add additional context in comments.

3 Comments

Evan Carroll Over a year ago

Think this is any easier method

ST_Intersects(     ST_SetSRID(ST_MakeEnvelope( 1285982.015631, 6454022.524275, 2371999.313507, 7217169.814674), 3857),     s0.geom   );

Abelisto Over a year ago

@EvanCarroll Shame to me! I never used PostGIS extension... :(

luzny Over a year ago

Your query reduced time significantly Planning time: ~0.438 ms Execution time: ~10.656 ms, thanks a lot for all new knowledge! I've imported only small pice of world from osm, so in whole dataset it would perform even better.

Collectives™ on Stack Overflow

How to optimize sql query with subqueries, perhaps by lateral join?

1 Answer 1

3 Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

3 Comments

Your Answer

Sign up or log in

Post as a guest

Related