Hi, I can't seem to find the right answer, so I might as well write a post.
Could any DB expert help me improve the following query (see the explain plan below), which is slowing down our production application quite a bit?
- a bid is related to a realty
- a realty is owned by an agency
- I'm using PostgreSQL
- a table stores the views per user: HIT(user_id, bid_id, date)
The aim is to retrieve the number of hits per bid for a particular agency.
Here is the query:
select hit.bid_id, count(hit.id)
from hit
cross join bid
cross join realty
where hit.bid_id=bid.id
and realty.id=bid.realty_id
and realty.agency_id = 91
group by hit.bid_id
order by count(hit.id) desc
Here is the explain plan:
"Sort (cost=167474.69..167493.30 rows=7445 width=16)"
"  Sort Key: (count(hit.id)) DESC"
"  -> HashAggregate (cost=166921.45..166995.90 rows=7445 width=16)"
"        Group Key: hit.bid_id"
"        -> Nested Loop (cost=694.81..162541.34 rows=876021 width=16)"
"              -> Hash Join (cost=694.38..7217.46 rows=1986 width=8)"
"                    Hash Cond: (bid.realty_id = realty.id)"
"                    -> Seq Scan on bid (cost=0.00..6398.98 rows=27798 width=16)"
"                    -> Hash (cost=669.92..669.92 rows=1957 width=8)"
"                          -> Bitmap Heap Scan on realty (cost=63.45..669.92 rows=1957 width=8)"
"                                Recheck Cond: (agency_id = 91)"
"                                -> Bitmap Index Scan on agency_idx (cost=0.00..62.97 rows=1957 width=0)"
"                                      Index Cond: (agency_id = 91)"
"              -> Index Scan using hit_bid_id_idx on hit (cost=0.43..61.74 rows=1647 width=16)"
"                    Index Cond: (bid_id = bid.id)"
I tried using exists or a select ... in subquery instead, but both are worse.
[EDIT] I'm using QueryDsl (Java API), which generates the cross joins, but even with inner joins the execution takes too long. Here is the explain plan with analyze and verbose:
"Sort (cost=169479.60..169498.99 rows=7756 width=16) (actual time=15350.858..15351.819 rows=821 loops=1)"
"  Output: hit.bid_id, (count(hit.id))"
"  Sort Key: (count(hit.id)) DESC"
"  Sort Method: quicksort Memory: 63kB"
"  -> HashAggregate (cost=168900.96..168978.52 rows=7756 width=16) (actual time=15348.418..15349.550 rows=821 loops=1)"
"        Output: hit.bid_id, count(hit.id)"
"        Group Key: hit.bid_id"
"        -> Nested Loop (cost=699.70..164385.85 rows=903022 width=16) (actual time=17.777..14364.165 rows=582723 loops=1)"
"              Output: hit.bid_id, hit.id"
"              -> Hash Join (cost=699.26..7225.23 rows=2013 width=8) (actual time=8.427..146.966 rows=1977 loops=1)"
"                    Output: bid.id"
"                    Hash Cond: (bid.realty_id = realty.id)"
"                    -> Seq Scan on public.bid (cost=0.00..6400.88 rows=27988 width=16) (actual time=0.018..84.389 rows=27994 loops=1)"
"                          Output: bid.id, bid.created_by, bid.created_date, bid.last_modified_by, bid.last_modified_date, bid.agency_costs, bid.availability_begin_date, bid.availability_end_date, bid.bail, bid.description, bid.imported_bid, bid.is_availabl (...)"
"                    -> Hash (cost=674.46..674.46 rows=1984 width=8) (actual time=8.186..8.186 rows=1977 loops=1)"
"                          Output: realty.id"
"                          Buckets: 2048 Batches: 1 Memory Usage: 94kB"
"                          -> Bitmap Heap Scan on public.realty (cost=67.66..674.46 rows=1984 width=8) (actual time=0.533..4.967 rows=1977 loops=1)"
"                                Output: realty.id"
"                                Recheck Cond: (realty.agency_id = 91)"
"                                Heap Blocks: exact=208"
"                                -> Bitmap Index Scan on agency_idx (cost=0.00..67.17 rows=1984 width=0) (actual time=0.491..0.491 rows=1978 loops=1)"
"                                      Index Cond: (realty.agency_id = 91)"
"              -> Index Scan using hit_bid_id_idx on public.hit (cost=0.43..61.88 rows=1619 width=16) (actual time=2.198..6.376 rows=295 loops=1977)"
"                    Output: hit.id, hit.created_by, hit.created_date, hit.last_modified_by, hit.last_modified_date, hit.date, hit.ip, hit.user_id, hit.bid_id, hit.display_phone"
"                    Index Cond: (hit.bid_id = bid.id)"
"Planning time: 3.037 ms"
"Execution time: 15353.187 ms"
Tables DDL (realty first, since bid references it):
CREATE TABLE public.realty
(
    id bigint NOT NULL,
    CONSTRAINT realty_pkey PRIMARY KEY (id)
);
CREATE TABLE public.bid
(
    id bigint NOT NULL,
    realty_id bigint,
    CONSTRAINT bid_pkey PRIMARY KEY (id),
    CONSTRAINT bid_fkey_realty FOREIGN KEY (realty_id)
        REFERENCES public.realty (id) MATCH SIMPLE
        ON UPDATE NO ACTION ON DELETE NO ACTION
);
CREATE TABLE public.hit
(
    id bigint NOT NULL,
    bid_id bigint,
    CONSTRAINT hit_pkey PRIMARY KEY (id),
    CONSTRAINT hit_fkey_bid FOREIGN KEY (bid_id)
        REFERENCES public.bid (id) MATCH SIMPLE
        ON UPDATE NO ACTION ON DELETE NO ACTION
);
explain (analyze, verbose, buffers). But those cross joins don't make any sense. Why don't you just use a regular join, as apparently that is what you want to do?
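For reference, the query from the question could be written with explicit inner joins and wrapped in the explain options mentioned in the comment above, roughly as follows. This is only a sketch: it is logically equivalent to the cross join plus where form, so the planner will most likely produce the same plan, as the [EDIT] in the question already reports.

```sql
-- Same query as in the question, with explicit inner joins in place of
-- cross join + where. The table and column names are the ones used in
-- the question; the schema itself is assumed to be unchanged.
explain (analyze, verbose, buffers)
select hit.bid_id, count(hit.id)
from hit
inner join bid on bid.id = hit.bid_id
inner join realty on realty.id = bid.realty_id
where realty.agency_id = 91
group by hit.bid_id
order by count(hit.id) desc;
```

The buffers option adds shared-buffer hit and read counts per plan node, which helps show whether the time in the index scan on hit is spent on disk reads.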