
Below is my query. It takes 9387.430 ms to execute, which is far too long for such a request, and I would like to reduce this execution time. Can you please help me with this? I have also included the EXPLAIN ANALYZE output.

EXPLAIN ANALYZE 
SELECT a.artist, b.artist, COUNT(*) 
FROM release_has_artist a, release_has_artist b 
WHERE a.release = b.release AND a.artist <> b.artist 
GROUP BY (a.artist, b.artist) 
ORDER BY (a.artist, b.artist);

Output of EXPLAIN ANALYZE :

     Sort  (cost=1696482.86..1707588.14 rows=4442112 width=48) (actual time=9253.474..9314.510 rows=461386 loops=1)
   Sort Key: (ROW(a.artist, b.artist))
   Sort Method: external sort  Disk: 24832kB
   ->  Finalize GroupAggregate  (cost=396240.32..932717.19 rows=4442112 width=48) (actual time=1928.058..2911.463 rows=461386 loops=1)
         Group Key: a.artist, b.artist
         ->  Gather Merge  (cost=396240.32..860532.87 rows=3701760 width=16) (actual time=1928.049..2494.638 rows=566468 loops=1)
               Workers Planned: 2
               Workers Launched: 2
               ->  Partial GroupAggregate  (cost=395240.29..432257.89 rows=1850880 width=16) (actual time=1912.809..2156.951 rows=188823 loops=3)
                     Group Key: a.artist, b.artist
                     ->  Sort  (cost=395240.29..399867.49 rows=1850880 width=8) (actual time=1912.794..2003.776 rows=271327 loops=3)
                           Sort Key: a.artist, b.artist
                           Sort Method: external merge  Disk: 4848kB
                           ->  Merge Join  (cost=0.85..177260.72 rows=1850880 width=8) (actual time=2.143..1623.628 rows=271327 loops=3)
                                 Merge Cond: (a.release = b.release)
                                 Join Filter: (a.artist <> b.artist)
                                 Rows Removed by Join Filter: 687597
                                 ->  Parallel Index Only Scan using release_has_artist_pkey on release_has_artist a  (cost=0.43..67329.73 rows=859497 width=8) (actual time=0.059..240.998 rows=687597 loops=3)
                                       Heap Fetches: 711154
                                 ->  Index Only Scan using release_has_artist_pkey on release_has_artist b  (cost=0.43..79362.68 rows=2062792 width=8) (actual time=0.072..798.402 rows=2329742 loops=3)
                                       Heap Fetches: 2335683
 Planning time: 2.101 ms
 Execution time: 9387.430 ms
Comments:
  • What indices do you have on release_has_artist? Commented Nov 17, 2019 at 21:51
  • Set a larger value for work_mem (configuration parameter); this avoids the "Sort Method: external merge  Disk" spill. Set this parameter to 40 or 50 MB before executing that query. Commented Nov 18, 2019 at 1:30

1 Answer


In your EXPLAIN ANALYZE output, there are two "Sort Method: external merge  Disk: ####kB" lines, indicating that the sorts spilled to disk instead of completing in memory, due to an insufficiently sized work_mem. Try increasing your work_mem to 32MB (30 might be OK, but I like multiples of 8) and run the query again.

Note that you can set work_mem on a per-session basis; a global change to work_mem could have negative side effects, such as running out of memory, because the work_mem configured in postgresql.conf can be allocated by every sort or hash operation in every session (it effectively has a multiplicative effect).
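A per-session change can be sketched like this (the 32MB figure is the suggestion above; tune it to your workload and available RAM):

```sql
-- Raise work_mem for the current session only; postgresql.conf is untouched.
SET work_mem = '32MB';

-- Re-run the query. If the setting is large enough, the plan's sort nodes
-- should report an in-memory sort (e.g. "Sort Method: quicksort  Memory: ...kB")
-- instead of "external merge  Disk".
EXPLAIN ANALYZE
SELECT a.artist, b.artist, COUNT(*)
FROM release_has_artist a, release_has_artist b
WHERE a.release = b.release AND a.artist <> b.artist
GROUP BY a.artist, b.artist
ORDER BY a.artist, b.artist;

-- Revert to the configured default when done.
RESET work_mem;
```

As a side note, `ORDER BY (a.artist, b.artist)` with parentheses sorts on a single ROW value (visible as `Sort Key: (ROW(a.artist, b.artist))` in your plan); the un-parenthesized form above sorts on the two columns directly.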
