
I'm trying to optimize the following scenario:

In words: I have two tables, alerts and user_devices. In user_devices we track whether a device coupled to a user_id wants to receive notifications; in the alerts table we track the user-to-notifier relation. The task is to select every user_id that has any alerts and allows notifications on any of the devices registered to it.

Table 'alerts', around 900k records:

               Table "public.alerts"
   Column    |           Type           | Modifiers 
-------------+--------------------------+-----------
 id          | uuid                     | not null
 user_id     | uuid                     | 
 target_id   | uuid                     | 
 target_type | text                     | 
 added_on    | timestamp with time zone | 
 old_id      | text                     | 
Indexes:
    "alerts_pkey" PRIMARY KEY, btree (id)
    "one_alert_per_business_per_user" UNIQUE CONSTRAINT, btree (user_id, target_id)
    "addedon" btree (added_on)
    "targetid" btree (target_id)
    "userid" btree (user_id)
    "userid_targetid" btree (user_id, target_id)
Foreign-key constraints:
    "alerts_user_id_fkey" FOREIGN KEY (user_id) REFERENCES users(id)

Table 'user_devices', around 12k records:

                Table "public.user_devices"
       Column        |           Type           | Modifiers 
---------------------+--------------------------+-----------
 id                  | uuid                     | not null
 user_id             | uuid                     | 
 device_id           | text                     | 
 device_token        | text                     | 
 push_notify_enabled | boolean                  | 
 device_type         | integer                  | 
 device_name         | text                     | 
 badge_count         | integer                  | 
 added_on            | timestamp with time zone | 
Indexes:
    "user_devices_pkey" PRIMARY KEY, btree (id)
    "push_notification" btree (push_notify_enabled)
    "user_id" btree (user_id)
    "user_id_push_notification" btree (user_id, push_notify_enabled)
Foreign-key constraints:
    "user_devices_user_id_fkey" FOREIGN KEY (user_id) REFERENCES users(id)

The following query:

select COUNT(DISTINCT a.user_id) 
from alerts a 
  inner join user_devices ud on a.user_id = ud.user_id 
WHERE ud.push_notify_enabled = true;

Takes about 3 seconds and produces the following plan:

explain select COUNT(DISTINCT a.user_id) from alerts a inner join user_devices ud on a.user_id = ud.user_id WHERE ud.push_notify_enabled = true;
                                     QUERY PLAN                                     
------------------------------------------------------------------------------------
 Aggregate  (cost=49777.32..49777.33 rows=1 width=16)
   ->  Hash Join  (cost=34508.97..48239.63 rows=615074 width=16)
         Hash Cond: (ud.user_id = a.user_id)
         ->  Seq Scan on user_devices ud  (cost=0.00..480.75 rows=9202 width=16)
               Filter: push_notify_enabled
         ->  Hash  (cost=20572.32..20572.32 rows=801732 width=16)
               ->  Seq Scan on alerts a  (cost=0.00..20572.32 rows=801732 width=16)

What am I missing, is there a way to speed this up?

Thank you.

== edit ==

As per the suggestion, I tried moving the condition into the join; it made no difference:

=> explain select COUNT(DISTINCT a.user_id) from alerts a inner join user_devices ud on a.user_id = ud.user_id and ud.push_notify_enabled;
                                     QUERY PLAN                                     
------------------------------------------------------------------------------------
 Aggregate  (cost=49777.32..49777.33 rows=1 width=16)
   ->  Hash Join  (cost=34508.97..48239.63 rows=615074 width=16)
         Hash Cond: (ud.user_id = a.user_id)
         ->  Seq Scan on user_devices ud  (cost=0.00..480.75 rows=9202 width=16)
               Filter: push_notify_enabled
         ->  Hash  (cost=20572.32..20572.32 rows=801732 width=16)
               ->  Seq Scan on alerts a  (cost=0.00..20572.32 rows=801732 width=16)

So there is no way to get away from the two full table scans? If I could at least get it to use the index on the alerts table, that would be great.

== edit ==

Adding `EXPLAIN ANALYZE` output.

=> explain ANALYZE select COUNT(DISTINCT a.user_id) from alerts a inner join user_devices ud on a.user_id = ud.user_id and ud.push_notify_enabled;
                                                             QUERY PLAN                                                              
-------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=49777.32..49777.33 rows=1 width=16) (actual time=5254.355..5254.356 rows=1 loops=1)
   ->  Hash Join  (cost=34508.97..48239.63 rows=615074 width=16) (actual time=1824.607..2863.635 rows=614768 loops=1)
         Hash Cond: (ud.user_id = a.user_id)
         ->  Seq Scan on user_devices ud  (cost=0.00..480.75 rows=9202 width=16) (actual time=0.048..16.784 rows=9186 loops=1)
               Filter: push_notify_enabled
         ->  Hash  (cost=20572.32..20572.32 rows=801732 width=16) (actual time=1824.229..1824.229 rows=801765 loops=1)
               Buckets: 4096  Batches: 32  Memory Usage: 990kB
               ->  Seq Scan on alerts a  (cost=0.00..20572.32 rows=801732 width=16) (actual time=0.047..878.429 rows=801765 loops=1)
 Total runtime: 5255.427 ms
(9 rows)

== edit ==

Adding requested configuration. Most of it is Ubuntu PG9.1 defaults:

/etc/postgresql/9.1/main# cat postgresql.conf | grep -e "work_mem" -e "effective_cache" -e "shared_buff" -e "random_page_c"
shared_buffers = 24MB           # min 128kB
#work_mem = 1MB             # min 64kB
#maintenance_work_mem = 16MB        # min 1MB
#wal_buffers = -1           # min 32kB, -1 sets based on shared_buffers
#random_page_cost = 4.0         # same scale as above
#effective_cache_size = 128MB
  • Well, PostgreSQL needs to visit nearly every row in the alerts table, so the seq scan will be the fastest thing to do. If you are on 9.2 it might actually do an index-only scan on the userid index. Commented Jan 18, 2013 at 17:59
  • Is there any difference if you move the where condition to the join condition like inner join user_devices ud on a.user_id = ud.user_id and ud.push_notify_enabled? There is no need for the = true part BTW. Commented Jan 18, 2013 at 18:18
  • Will try moving and see if it helps. Afk now, will check in a couple of hours. Thank you for your comments. Commented Jan 18, 2013 at 19:18
  • @Clodoaldo: tried, didn't help, see edit :( Commented Jan 19, 2013 at 13:43
  • please add the output of EXPLAIN ANALYZE that will show you both the expected and observed number of rows. (maybe your statistics are off, or absent) Commented Jan 19, 2013 at 13:47

2 Answers


Replacing the index with a partial index:

DROP INDEX user_id_push_notification;
CREATE INDEX user_id_push_notification ON user_devices (user_id)
  WHERE push_notify_enabled = true;

and setting random_page_cost to a lower value:

SET random_page_cost = 1.1;

Caused an Index Scan using push_notification on user_devices ud for me (< 300ms). YMMV.

The seq scan on alerts seems more or less unavoidable, since the query expects roughly 800K of 900K rows (≈ 88%). An index scan would only be effective if the row size were very large, IMHO.
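Another thing worth testing (a sketch of my own, not verified against your data): rewrite the join as a semi-join driven from the small user_devices table, so the planner can probe the userid index on alerts once per device row instead of hashing all 900K alert rows:

```sql
-- Semi-join sketch: for each push-enabled device, check whether the
-- user has at least one alert; DISTINCT collapses multi-device users.
SELECT COUNT(DISTINCT ud.user_id)
FROM user_devices ud
WHERE ud.push_notify_enabled
  AND EXISTS (SELECT 1 FROM alerts a WHERE a.user_id = ud.user_id);
```

With only ~9K qualifying device rows this turns the big scan into ~9K index probes; whether that actually wins depends on caching and random_page_cost.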

UPDATE: adding the users table to the query seems to force a triple index scan (but takes approximately the same time).

explain  ANALYZE
select COUNT(DISTINCT a.user_id)
from alerts a
join user_devices ud on a.user_id = ud.user_id
join users us on a.user_id = us.id
WHERE ud.push_notify_enabled = true;

11 Comments

9.1 or 9.2? Causes no difference for me, but I'm upgrading to 9.2.2 as we speak.
9.1.2. I need to upgrade, too ;-)
Am I doing something wrong? Your proposed query forces a triple Seq scan for me :)
Strange. I did change the uuids to integer and serial, though (for easier inserts via generate_series(), which doesn't seem to support uuids).
I'd gladly shoot the guy who used uuids for this database in the head, but he's about 6000 miles further away. :(
1

As said in the comments, the real hog is the full scan of the alerts table. Logically, for a given user ID, any and all records in alerts may match that user ID.

You have one condition that might limit the scan: push_notify_enabled; you don't need rows where it's false. But without a usable index for that condition on the alerts side, a full scan on alerts is still the fastest way to join the two tables.

Try using a bitmap index on push_notify_enabled, if your version of Postgres supports it. (Obviously, a btree index on a two-value column is no good.)

To speed up the query you have to limit the number of rows to be scanned in alerts, that is, to add a condition on some indexed column of alerts. Then an index scan instead of a full scan might be possible, if the index is selective enough.

For instance, you could filter by target ID, or by some date-related column, if this makes sense.
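As a sketch (assuming, hypothetically, that a 30-day cutoff is acceptable for your business logic), a recency filter could use the existing addedon index:

```sql
-- Hypothetical filter: only count users with recent alerts, so the
-- btree index 'addedon' can replace the full scan with a range scan.
SELECT COUNT(DISTINCT a.user_id)
FROM alerts a
JOIN user_devices ud ON ud.user_id = a.user_id
WHERE ud.push_notify_enabled
  AND a.added_on > now() - interval '30 days';
```

The gain depends entirely on how selective the date condition is; if most alerts are recent, the planner will rightly fall back to the seq scan.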

If you have 900k alerts which are all active, and which can be arbitrarily shared between users, you have little choice; adding RAM so that the alerts table stays cached might help. (Adding hardware is often the easiest and most cost-effective solution.)

AFAICT you are only interested in alerts that are associated with push notifications. If users with push notifications never share an alert with users without push notifications, you may effectively split alerts by this condition.

If you had bitmap indexes, you might move the push_notify_enabled column to alerts. Otherwise, you might try to physically split alerts on that column, using partitioning. If the number of alerts with push notifications is significantly lower than the total, a far smaller portion of alerts will be scanned to make the join.
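A Postgres-flavoured variant of the column-move idea (a sketch only; the denormalized flag would have to be kept in sync by application code or a trigger, which is a real maintenance cost):

```sql
-- Denormalize: carry a per-alert copy of the user's push setting.
ALTER TABLE alerts ADD COLUMN push_notify_enabled boolean;

-- Partial index: covers only the rows the query cares about.
CREATE INDEX alerts_push_user ON alerts (user_id)
  WHERE push_notify_enabled;

-- The count then touches only the indexed subset, no join needed:
SELECT COUNT(DISTINCT user_id)
FROM alerts
WHERE push_notify_enabled;
```

This only pays off if push-enabled alerts are a small fraction of the table; if nearly all 900k rows qualify, the partial index degenerates into the same full scan.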

5 Comments

Unfortunately no bitmap index is available here, it seems. Another question (might be total bs, but just thinking aloud): what if I transfer push_notify_enabled to the alerts table?
My bad — I for some reason mistook push_notify_enabled for a column from alerts. I will edit my response.
Trying to upgrade the thing to 9.2 for a chance. Gah, I need to learn more about postgres, this is so frustrating :)
@favoretti: "Experience is what you get when you don't get what you wanted" :)
Also: good judgement stems from experience; experience stems from bad judgement.
