PostgreSQL join fetch all rows in table, too slow

Question

i have the two tables "commissions" and "mt4_trades". In "mt4_trades" "ticket" column is private key, in "commissions" there is "order_id" and it is has relation to mt4_trades.ticket as one to many (one "ticket" to many "order_id"). And i have this statement:

SELECT commissions.ibs_account AS ibs_account                      
       FROM "public"."mt4_trades" 
    INNER JOIN commissions ON commissions.order_id = mt4_trades.ticket 
        WHERE "mt4_trades"."close_time" >= '2014.11.01' 
          AND "mt4_trades"."close_time" < '2014.12.01'

commissions table constains about 4 millions rows. This statement return 480000 rows. But it is too slow: executions time 9 sec. I did EXPLAIN ANALYZE:

Hash Join  (cost=43397.07..216259.97 rows=144233 width=7) (actual time=3993.839..9459.896 rows=488131 loops=1)
  Hash Cond: (commissions.order_id = mt4_trades.ticket)
  ->  Seq Scan on commissions  (cost=0.00..116452.08 rows=3997708 width=15) (actual time=0.005..4185.254 rows=3997157 loops=1)
  ->  Hash  (cost=42485.10..42485.10 rows=72958 width=4) (actual time=288.767..288.767 rows=97260 loops=1)
        Buckets: 8192  Batches: 1  Memory Usage: 3420kB
        ->  Index Scan using "INDEX_CLOSETIME" on mt4_trades  (cost=0.43..42485.10 rows=72958 width=4) (actual time=0.020..174.810 rows=97260 loops=1)
              Index Cond: ((close_time >= '2014-11-01 00:00:00'::timestamp without time zone) AND (close_time < '2014-12-01 00:00:00'::timestamp without time zone))
Total runtime: 9881.979 ms

This row:

->  Seq Scan on commissions  (cost=0.00..116452.08 rows=3997708 width=15) (actual time=0.005..4185.254 rows=3997157 loops=1)

Means that scan the whole "commissions" table instead compare "order_id" and "ticket" first. Can you help me how i can improve this query. THanks

Paste CREATE TABLE statements for these tables into your question. Update statistics by running analyze commissions; analyze mt4_trades;. Then explain analyze your-query again. — Mike Sherrill 'Cat Recall'
– Mike Sherrill 'Cat Recall', Commented Nov 12, 2014 at 4:14
@tsnorri: An index probably won't help on this hash join unless the OP can also add a WHERE condition to ibs_account. — Mike Sherrill 'Cat Recall'
– Mike Sherrill 'Cat Recall', Commented Nov 12, 2014 at 14:00

Ramfjord · Accepted Answer · 2014-11-18 22:08:53Z

1

9 seconds to return half a million rows is not terrible, and a sequential scan on 4M could be much faster than faster than 100K indexed lookups on 4M. Assuming you've got an index on order_id, you can test this by running set enable_seqscan TO false; before running your query (this will only affect the current connection).

Do you really need all 500K rows every time you run this query? Or are you going to be filtering the results? If you're almost always going to be filtering the results in some other way, you probably want to optimize against that query rather than the one that returns all 500K rows.

answered Nov 18, 2014 at 22:08

Ramfjord

9578 silver badges15 bronze badges

Sign up to request clarification or add additional context in comments.

Collectives™ on Stack Overflow

PostgreSQL join fetch all rows in table, too slow

1 Answer 1

Comments

Your Answer

Hot Network Questions

Collectives™ on Stack Overflow

1 Answer 1

Comments

Your Answer

Sign up or log in

Post as a guest

Related