Postgres / PostGIS query optimisation

Question

I've put together a query that works, and I'd like to learn how to optimise it. Given a particular row in table A, the query takes its geometry and finds the closest geometry in table B, filtered by certain criteria.

SELECT     a.id,
           closest_pt.dist,
           closest_pt.name,
           closest_pt.meters
FROM       "hex-hex-uk" a
CROSS JOIN lateral
       (
                SELECT   a.id,
                         b.name                            AS name,
                         a.geom <-> b.way                  AS dist,
                         st_distance(a.geom, b.way, FALSE) AS meters
                FROM     "osm-polygons-uk" b
                WHERE    (
                                  b.landuse='industrial'
                         OR       b.man_made='works')
                AND      st_area(b.way, FALSE)>15000
                ORDER BY a.geom <-> b.way
                LIMIT    1) AS closest_pt
WHERE      a.id='abc'

Currently the query executes in 30-90 ms, but I need to perform millions of these lookups. I tried swapping a.id='abc' for a.id IN ('abc','def','ghi',...) and looking up 10,000 ids at a time, but that takes 10+ minutes, which doesn't really add up.

Here's the query plan as it stands:

  ->  Index Scan using "hex-hex-uk_id_idx" on "hex-hex-uk" a  (cost=0.43..8.45 rows=1 width=168) (actual time=0.029..0.046 rows=1 loops=1)
        Index Cond: ((id)::text = '89195c849a3ffff'::text)
  ->  Limit  (cost=0.28..536.88 rows=1 width=43) (actual time=33.009..33.062 rows=1 loops=1)
        ->  Index Scan using "idx_osm-polygons-uk_geom" on "osm-polygons-uk" b  (cost=0.28..4935623.77 rows=9198 width=43) (actual time=32.992..33.001 rows=1 loops=1)
              Order By: (way <-> a.geom)
              Filter: (((landuse = 'industrial'::text) OR (man_made = 'works'::text)) AND (st_area((way)::geography, false) > '15000'::double precision))
              Rows Removed by Filter: 7
Planning Time: 0.142 ms
Execution Time: 33.311 ms

What would be the process for optimising a query like this? I learn best by example, so I thought it made sense to post here rather than just read about optimisation techniques.

Thanks!

CREATE TABLE "osm-polygons-uk" (id bigint,name text,landuse text, man_made text,way geometry);
CREATE INDEX "idx_osm-polygons-uk_geom" ON "osm-polygons-uk" USING gist (way);
ALTER TABLE "osm-polygons-uk" ADD PRIMARY KEY (id);

CREATE TABLE "hex-hex-uk" (id varchar(15), geom geometry);
CREATE UNIQUE INDEX ON "hex-hex-uk" (id);

We still need to see the CREATE TABLE statements and the indexes; they are vital for optimising queries.
– nbk, Commented Apr 25, 2022 at 12:09
You can create a partial index using only the 3 conditions. Otherwise, if the polygons found have many vertices, you can look at applying ST_Subdivide first (in an indexed materialized view, or something similar).
– JGH, Commented Apr 25, 2022 at 13:39
Collect the plan using EXPLAIN (ANALYZE, BUFFERS). Turn track_io_timing on first if you can.
– jjanes, Commented Apr 25, 2022 at 13:59
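
The two concrete suggestions in the comments (a partial index on the three filter conditions, and a plan collected with buffer statistics) could be sketched roughly as follows. This is only an illustration under assumptions: the index name idx_osm_polygons_uk_industrial is made up, the predicate simply repeats the filter from the question with the geography cast written out as it appears in the plan, and changing track_io_timing normally requires superuser or specifically granted rights.

```sql
-- Hypothetical partial GiST index: only rows that pass the three
-- filter conditions from the question are indexed.
CREATE INDEX idx_osm_polygons_uk_industrial
    ON "osm-polygons-uk" USING gist (way)
    WHERE (landuse = 'industrial' OR man_made = 'works')
      AND ST_Area(way::geography, false) > 15000;

-- Refresh planner statistics after creating the index.
ANALYZE "osm-polygons-uk";

-- Include I/O timings in the plan output (may need elevated privileges).
SET track_io_timing = on;

-- Re-run the single-row lookup and collect timing plus buffer usage.
EXPLAIN (ANALYZE, BUFFERS)
SELECT a.id,
       closest_pt.name,
       closest_pt.dist,
       closest_pt.meters
FROM "hex-hex-uk" a
CROSS JOIN LATERAL (
    SELECT b.name,
           a.geom <-> b.way                  AS dist,
           ST_Distance(a.geom, b.way, false) AS meters
    FROM "osm-polygons-uk" b
    -- Same conditions as the index predicate, so the planner can
    -- consider the partial index for the nearest-neighbour scan.
    WHERE (b.landuse = 'industrial' OR b.man_made = 'works')
      AND ST_Area(b.way::geography, false) > 15000
    ORDER BY a.geom <-> b.way
    LIMIT 1
) AS closest_pt
WHERE a.id = 'abc';
```

If the planner picks the partial index, the inner Index Scan should name it, and the "Rows Removed by Filter" count should shrink or disappear because non-matching polygons are no longer in the index.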

richwol · Accepted Answer · 2022-04-26 09:10:01Z

Some great tips above. The comment about the indexed materialized view led me to create a view with only the filtered data. It cut the number of rows down from 1 million to ~20,000 and executed in a couple of seconds.
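
A minimal sketch of what such a pre-filtered view might look like is given here. The name tmp_industrial comes from the query further down; whether it was built as a materialized view or a plain table is not stated, so the materialized view, the column list, and the index name tmp_industrial_way_idx are all assumptions. The filter is the one from the question.

```sql
-- Assumed definition: keep only the polygons that pass the original filter,
-- so the nearest-neighbour search never has to evaluate it again.
CREATE MATERIALIZED VIEW tmp_industrial AS
SELECT id, name, way
FROM "osm-polygons-uk"
WHERE (landuse = 'industrial' OR man_made = 'works')
  AND ST_Area(way::geography, false) > 15000;

-- Hypothetical index name. The GiST index is what lets
-- ORDER BY a.geom <-> b.way run as an index-assisted nearest-neighbour scan.
CREATE INDEX tmp_industrial_way_idx ON tmp_industrial USING gist (way);

-- Give the planner row counts and statistics for the new relation.
ANALYZE tmp_industrial;
```

With the filter applied up front, the expensive ST_Area check runs once per polygon when the view is built instead of once per candidate on every lookup. If the underlying OSM data changes, REFRESH MATERIALIZED VIEW tmp_industrial would be needed to pick that up.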

From there I tweaked the original query, and it ended up blasting through 2,400,000 rows in a couple of minutes. A huge improvement on the 13 hours it was originally going to take to run!

SELECT a.id,
       closest_pt.name,
       ST_Distance(a.geom, closest_pt.way, false) AS meters
FROM "hex-hex-uk" a
CROSS JOIN LATERAL
     (SELECT b.id,
             b.name AS name,
             a.geom <-> b.way AS dist,
             b.way AS way
      FROM "tmp_industrial" b
      ORDER BY dist ASC
      LIMIT 1) AS closest_pt
WHERE a.id IN ('abc','def','ghi',...);

Thanks for the tips; they give me a bit of a guide as to how to go about debugging query performance.
