Create some test data...
create unlogged table users( user_id serial primary key, login text unique not null );
insert into users (login) select 'user'||n from generate_series(1,100000) n;
create unlogged table messages( message_id serial primary key, sender_id integer not null, receiver_id integer not null);
insert into messages (sender_id,receiver_id) select random()*100000+1, random()*100000+1 from generate_series(1,1000000);
create index messages_s on messages(sender_id);
create index messages_r on messages(receiver_id);
vacuum analyze users,messages;
And then:
EXPLAIN ANALYZE
SELECT user_id, count(DISTINCT m1.message_id), count(DISTINCT m2.message_id)
FROM users u
LEFT JOIN messages m1 ON m1.receiver_id = user_id
LEFT JOIN messages m2 ON m2.sender_id = user_id
GROUP BY user_id;
GroupAggregate (cost=4.39..326190.22 rows=100000 width=20) (actual time=4.023..3331.031 rows=100000 loops=1)
  Group Key: u.user_id
  ->  Merge Left Join (cost=4.39..250190.22 rows=10000000 width=12) (actual time=3.987..2161.032 rows=9998915 loops=1)
        Merge Cond: (u.user_id = m1.receiver_id)
        ->  Merge Left Join (cost=2.11..56522.26 rows=1000000 width=8) (actual time=3.978..515.730 rows=1000004 loops=1)
              Merge Cond: (u.user_id = m2.sender_id)
              ->  Index Only Scan using users_pkey on users u (cost=0.29..2604.29 rows=100000 width=4) (actual time=0.016..10.149 rows=100000 loops=1)
                    Heap Fetches: 0
              ->  Index Scan using messages_s on messages m2 (cost=0.42..41168.40 rows=1000000 width=8) (actual time=0.011..397.128 rows=999996 loops=1)
        ->  Materialize (cost=0.42..43668.42 rows=1000000 width=8) (actual time=0.008..746.748 rows=9998810 loops=1)
              ->  Index Scan using messages_r on messages m1 (cost=0.42..41168.42 rows=1000000 width=8) (actual time=0.006..392.426 rows=999997 loops=1)
Execution Time: 3432.131 ms
Since I put in 100k users and 1M messages, each user has about 10 messages as sender and about 10 as receiver, so the two joins generate roughly 10*10 = 100 rows per user, or about 10 million rows in total (rows=9998915 in the plan above), which all have to be processed by the count(DISTINCT ...) aggregates. Postgres doesn't realize this is all unnecessary work: the counts and GROUP BY should really be pushed down into the joined tables, so the query ends up being extremely slow.
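To double-check that on this test data, something like the following should report an average of roughly 10 in both columns (the exact figures vary since the rows were generated with random()):

SELECT (SELECT avg(cnt) FROM (SELECT count(*) cnt FROM messages GROUP BY sender_id) s)   AS avg_sent,
       (SELECT avg(cnt) FROM (SELECT count(*) cnt FROM messages GROUP BY receiver_id) r) AS avg_received;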
The solution is to move the aggregation inside the joined tables manually, to avoid generating all these unnecessary extra rows.
EXPLAIN ANALYZE
SELECT user_id, m1.cnt, m2.cnt
FROM users u
LEFT JOIN (SELECT receiver_id, count(*) cnt FROM messages GROUP BY receiver_id) m1 ON m1.receiver_id = user_id
LEFT JOIN (SELECT sender_id, count(*) cnt FROM messages GROUP BY sender_id) m2 ON m2.sender_id = user_id;
Hash Left Join (cost=46780.40..48846.42 rows=100000 width=20) (actual time=469.699..511.613 rows=100000 loops=1)
  Hash Cond: (u.user_id = m2.sender_id)
  ->  Hash Left Join (cost=23391.68..25195.19 rows=100000 width=12) (actual time=237.435..262.545 rows=100000 loops=1)
        Hash Cond: (u.user_id = m1.receiver_id)
        ->  Seq Scan on users u (cost=0.00..1541.00 rows=100000 width=4) (actual time=0.015..5.162 rows=100000 loops=1)
        ->  Hash (cost=22243.34..22243.34 rows=91867 width=12) (actual time=237.252..237.253 rows=99991 loops=1)
              Buckets: 131072 Batches: 1 Memory Usage: 5321kB
              ->  Subquery Scan on m1 (cost=20406.00..22243.34 rows=91867 width=12) (actual time=210.817..227.793 rows=99991 loops=1)
                    ->  HashAggregate (cost=20406.00..21324.67 rows=91867 width=12) (actual time=210.815..222.794 rows=99991 loops=1)
                          Group Key: messages.receiver_id
                          Batches: 1 Memory Usage: 14353kB
                          ->  Seq Scan on messages (cost=0.00..15406.00 rows=1000000 width=4) (actual time=0.010..47.173 rows=1000000 loops=1)
  ->  Hash (cost=22241.52..22241.52 rows=91776 width=12) (actual time=232.003..232.004 rows=99992 loops=1)
        Buckets: 131072 Batches: 1 Memory Usage: 5321kB
        ->  Subquery Scan on m2 (cost=20406.00..22241.52 rows=91776 width=12) (actual time=205.401..222.517 rows=99992 loops=1)
              ->  HashAggregate (cost=20406.00..21323.76 rows=91776 width=12) (actual time=205.400..217.518 rows=99992 loops=1)
                    Group Key: messages_1.sender_id
                    Batches: 1 Memory Usage: 14353kB
                    ->  Seq Scan on messages messages_1 (cost=0.00..15406.00 rows=1000000 width=4) (actual time=0.008..43.402 rows=1000000 loops=1)
Planning Time: 0.574 ms
Execution Time: 515.753 ms
I used a schema that is a bit different from yours, but you get the idea: instead of generating lots of duplicate rows by doing what is essentially a cross product, push the aggregation into the joined subqueries so each one returns only one row per value of the column you're joining on, then drop the GROUP BY from the main query since it is no longer necessary.
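One difference from the count(DISTINCT ...) version worth noting: with the aggregated subqueries, users who have no messages at all get NULL from the LEFT JOIN rather than 0. If you need the zeros, wrap the counts in COALESCE; a minimal sketch against the same test schema (the received/sent aliases are just illustrative):

SELECT user_id,
       coalesce(m1.cnt, 0) AS received,
       coalesce(m2.cnt, 0) AS sent
FROM users u
LEFT JOIN (SELECT receiver_id, count(*) cnt FROM messages GROUP BY receiver_id) m1 ON m1.receiver_id = user_id
LEFT JOIN (SELECT sender_id, count(*) cnt FROM messages GROUP BY sender_id) m2 ON m2.sender_id = user_id;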
Note that count(DISTINCT table.*) is not smart enough to realize it only needs to look at the table's primary key (if there is one), so it pulls the whole row and runs the DISTINCT on that. When a table is named "message" or "question_response" it smells like it has a largish TEXT column in it, which will make this very slow. So if you really do need a count(DISTINCT ...), use count(DISTINCT table.primarykey) instead:
explain analyze SELECT count(distinct user_id) from users;
Aggregate (cost=1791.00..1791.01 rows=1 width=8) (actual time=15.220..15.221 rows=1 loops=1)
  ->  Seq Scan on users (cost=0.00..1541.00 rows=100000 width=4) (actual time=0.016..5.830 rows=100000 loops=1)
Execution Time: 15.263 ms
explain analyze SELECT count(distinct users.*) from users;
Aggregate (cost=1791.00..1791.01 rows=1 width=8) (actual time=90.896..90.896 rows=1 loops=1)
  ->  Seq Scan on users (cost=0.00..1541.00 rows=100000 width=37) (actual time=0.038..38.497 rows=100000 loops=1)
Execution Time: 90.958 ms