0

I am doing a query to select records that lie in a certain geographic region and I am doing some joins and couple of filtering as well.

This is my query:

SELECT "events".* FROM "events" INNER JOIN "albums" ON "albums"."event_id" = "events"."id" INNER JOIN "photos" ON "photos"."album_id" = "albums"."id" WHERE "events"."deleted_at" IS NULL AND "albums"."deleted_at" IS NULL AND "photos"."deleted_at" IS NULL AND (events.latitude BETWEEN -44.197088742316055 AND -23.22003941183816 AND events.longitude BETWEEN 133.226480859375 AND 165.570230859375) GROUP BY events.id HAVING count(albums.id) > 0  ORDER BY start_date DESC

I have the following indexes:

Events:

"events_pkey" PRIMARY KEY, btree (id)
"index_events_on_deleted_at" btree (deleted_at)
"index_events_on_latitude_and_longitude" btree (latitude, longitude)

Albums:

"albums_pkey" PRIMARY KEY, btree (id)
"index_albums_on_deleted_at" btree (deleted_at)
"index_albums_on_event_id" btree (event_id)

Photos:

"photos_pkey" PRIMARY KEY, btree (id)
"index_photos_on_album_id" btree (album_id)
"index_photos_on_deleted_at" btree (deleted_at)

Doing an EXPLAIN ANALYZE results in this and I dont see any usage of my indexes which. I am not sure how to force it to use the indexes. Can anyone help me optimize this?

 Sort  (cost=4057.46..4057.84 rows=150 width=668) (actual time=556.114..556.187 rows=76 loops=1)
   Sort Key: events.start_date
   Sort Method: quicksort  Memory: 78kB
   ->  HashAggregate  (cost=4050.16..4052.04 rows=150 width=668) (actual time=555.667..555.783 rows=76 loops=1)
         Filter: (count(albums.id) > 0)
         ->  Hash Join  (cost=76.14..3946.54 rows=20724 width=668) (actual time=3.675..467.578 rows=48050 loops=1)
               Hash Cond: (photos.album_id = albums.id)
               ->  Seq Scan on photos  (cost=0.00..3441.87 rows=59013 width=4) (actual time=0.008..169.206 rows=60599 loops=1)
                     Filter: (deleted_at IS NULL)
               ->  Hash  (cost=74.10..74.10 rows=163 width=668) (actual time=3.633..3.633 rows=318 loops=1)
                     Buckets: 1024  Batches: 1  Memory Usage: 176kB
                     ->  Hash Join  (cost=49.80..74.10 rows=163 width=668) (actual time=1.195..2.519 rows=318 loops=1)
                           Hash Cond: (albums.event_id = events.id)
                           ->  Seq Scan on albums  (cost=0.00..21.47 rows=321 width=8) (actual time=0.011..0.458 rows=321 loops=1)
                                 Filter: (deleted_at IS NULL)
                           ->  Hash  (cost=47.92..47.92 rows=150 width=664) (actual time=1.151..1.151 rows=195 loops=1)
                                 Buckets: 1024  Batches: 1  Memory Usage: 126kB
                                 ->  Seq Scan on events  (cost=0.00..47.92 rows=150 width=664) (actual time=0.007..0.488 rows=195 loops=1)
                                       Filter: ((deleted_at IS NULL) AND (latitude >= (-44.1970887423161)::double precision) AND (latitude <= (-23.2200394118382)::double precision) AND (longitude >= 133.226480859375::double precision) AND (longitude <= 165.570230859375::double precision))
 Total runtime: 556.459 ms

Thanks!!

EDIT: Thanks for the links. I have tried disabling seqscan. Now my plan is:

 Sort  (cost=5565.73..5566.10 rows=150 width=46) (actual time=451.208..451.290 rows=76 loops=1)
   Sort Key: (date(events.start_date))
   Sort Method: quicksort  Memory: 31kB
   ->  GroupAggregate  (cost=0.00..5560.31 rows=150 width=46) (actual time=2.990..450.850 rows=76 loops=1)
         Filter: (count(albums.id) > 0)
         ->  Nested Loop  (cost=0.00..5454.44 rows=20724 width=46) (actual time=0.077..278.319 rows=48050 loops=1)
               ->  Merge Join  (cost=0.00..205.35 rows=163 width=46) (actual time=0.051..2.856 rows=318 loops=1)
                     Merge Cond: (events.id = albums.event_id)
                     ->  Index Scan using events_pkey on events  (cost=0.00..118.72 rows=150 width=42) (actual time=0.024..0.792 rows=195 loops=1)
                           Filter: ((deleted_at IS NULL) AND (latitude >= (-44.1970887423161)::double precision) AND (latitude <= (-23.2200394118382)::double precision)
 AND (longitude >= 133.226480859375::double precision) AND (longitude <= 165.570230859375::double precision))
                     ->  Index Scan using index_albums_on_event_id on albums  (cost=0.00..83.83 rows=321 width=8) (actual time=0.017..0.832 rows=321 loops=1)
                           Filter: (deleted_at IS NULL)
               ->  Index Scan using index_photos_on_album_id on photos  (cost=0.00..30.27 rows=155 width=4) (actual time=0.010..0.409 rows=151 loops=318)
                     Index Cond: (album_id = albums.id)
                     Filter: (deleted_at IS NULL)
 Total runtime: 451.562 ms

Still Indexes are not being fully used especially for the latitude and long conditions. Do I have my indexes setup correctly?

EDIT: After looking at answers at http://stackoverflow.com/questions/8228326/how-can-i-avoid-postgresql-sometimes-choosing-a-bad-query-plan-for-one-of-two-ne, I assumed that its behaving like this because my query was returning all records, then I updated the conditions, and the new query plan is:

 Sort  (cost=786.18..786.22 rows=19 width=668) (actual time=3.754..3.755 rows=2 loops=1)
   Sort Key: events.start_date
   Sort Method: quicksort  Memory: 25kB
   ->  HashAggregate  (cost=785.54..785.77 rows=19 width=668) (actual time=3.700..3.703 rows=2 loops=1)
         Filter: (count(albums.id) > 0)
         ->  Nested Loop  (cost=48.39..765.51 rows=2670 width=668) (actual time=1.116..2.968 rows=543 loops=1)
               ->  Hash Join  (cost=48.39..89.25 rows=21 width=668) (actual time=1.093..1.128 rows=3 loops=1)
                     Hash Cond: (events.id = albums.event_id)
                     ->  Bitmap Heap Scan on events  (cost=9.42..49.44 rows=19 width=664) (actual time=0.061..0.080 rows=9 loops=1)
                           Recheck Cond: ((latitude >= (-33.7474111086624)::double precision) AND (latitude <= (-33.581678187556)::double precision) AND (longitude >= 151.193933862305::double precision) AND (longitude <= 151.44661940918::double precision))
                           Filter: (deleted_at IS NULL)
                           ->  Bitmap Index Scan on index_events_on_latitude_and_longitude  (cost=0.00..9.42 rows=28 width=0) (actual time=0.050..0.050 rows=9 loops=1)
                                 Index Cond: ((latitude >= (-33.7474111086624)::double precision) AND (latitude <= (-33.581678187556)::double precision) AND (longitude >= 151.193933862305::double precision) AND (longitude <= 151.44661940918::double precision))
                     ->  Hash  (cost=34.95..34.95 rows=321 width=8) (actual time=0.992..0.992 rows=321 loops=1)
                           Buckets: 1024  Batches: 1  Memory Usage: 13kB
                           ->  Bitmap Heap Scan on albums  (cost=14.74..34.95 rows=321 width=8) (actual time=0.069..0.570 rows=321 loops=1)
                                 Recheck Cond: (deleted_at IS NULL)
                                 ->  Bitmap Index Scan on index_albums_on_deleted_at  (cost=0.00..14.66 rows=321 width=0) (actual time=0.056..0.056 rows=321 loops=1)
                                       Index Cond: (deleted_at IS NULL)
               ->  Index Scan using index_photos_on_album_id on photos  (cost=0.00..30.27 rows=155 width=4) (actual time=0.014..0.273 rows=181 loops=3)
                     Index Cond: (album_id = albums.id)
                     Filter: (deleted_at IS NULL)
 Total runtime: 3.958 ms

And the time is verrry less!! Any suggestions?

7
  • also related stackoverflow.com/questions/14554302/… and postgresql.org/docs/current/static/indexes-examine.html Commented Jul 7, 2014 at 9:36
  • Hi @pozs I have tried to disable seqscan. Updated the query plan. Commented Jul 7, 2014 at 10:24
  • I see, btw your query plan is not updated yet. Commented Jul 7, 2014 at 10:50
  • It has not? It seems to be using couple of indexes? Commented Jul 7, 2014 at 11:04
  • I mean you didn't update your question. If those solved your problem, please close this question, or if you have other problems, update it according to them. Commented Jul 7, 2014 at 11:06

1 Answer 1

1

This is mostly because you did not set a LIMIT clause on your query. Without LIMIT you always requests all data from your tables, so looking at their indexes too would be insufficient.

SQLFiddle 1, 2 vs. 3, 4

Also note, that FOREIGN KEY does not add indexes (but UNIQUE & PRIMARY KEY constraint does). So you might want to add indexes for albums.event_id & photos.album_id.

SQLFiddle 3 vs. 4

Sorting can use an index, if present. At your query this means an index on events.start_date.

Sign up to request clarification or add additional context in comments.

4 Comments

Thanks a lot @pozs. Unfortunately I cannot do LIMIT at the moment in my app as I need to do a lot of front end changes. but it did help!! :)
And also, I already have indexes on those two FKs. Btw, let me add index for start_date too! Helped me a lot. Thanks!
adding LIMIT is not helping. I added index for sorting too. And I removed sorting too.. yet the time stays at 450-500ms.
Hi @pozs I have been using that needless JOIN with photos table!! :) I realized that I dont need to do it. Removing that drastically reduced time to around 4ms!!

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.