2

i have the two tables "commissions" and "mt4_trades". In "mt4_trades" "ticket" column is private key, in "commissions" there is "order_id" and it is has relation to mt4_trades.ticket as one to many (one "ticket" to many "order_id"). And i have this statement:

SELECT commissions.ibs_account AS ibs_account                      
       FROM "public"."mt4_trades" 
    INNER JOIN commissions ON commissions.order_id = mt4_trades.ticket 
        WHERE "mt4_trades"."close_time" >= '2014.11.01' 
          AND "mt4_trades"."close_time" < '2014.12.01'

commissions table constains about 4 millions rows. This statement return 480000 rows. But it is too slow: executions time 9 sec. I did EXPLAIN ANALYZE:

Hash Join  (cost=43397.07..216259.97 rows=144233 width=7) (actual time=3993.839..9459.896 rows=488131 loops=1)
  Hash Cond: (commissions.order_id = mt4_trades.ticket)
  ->  Seq Scan on commissions  (cost=0.00..116452.08 rows=3997708 width=15) (actual time=0.005..4185.254 rows=3997157 loops=1)
  ->  Hash  (cost=42485.10..42485.10 rows=72958 width=4) (actual time=288.767..288.767 rows=97260 loops=1)
        Buckets: 8192  Batches: 1  Memory Usage: 3420kB
        ->  Index Scan using "INDEX_CLOSETIME" on mt4_trades  (cost=0.43..42485.10 rows=72958 width=4) (actual time=0.020..174.810 rows=97260 loops=1)
              Index Cond: ((close_time >= '2014-11-01 00:00:00'::timestamp without time zone) AND (close_time < '2014-12-01 00:00:00'::timestamp without time zone))
Total runtime: 9881.979 ms

This row:

->  Seq Scan on commissions  (cost=0.00..116452.08 rows=3997708 width=15) (actual time=0.005..4185.254 rows=3997157 loops=1)

Means that scan the whole "commissions" table instead compare "order_id" and "ticket" first. Can you help me how i can improve this query. THanks

3
  • 1
    Have you indexed commissions.order_id? Commented Nov 11, 2014 at 23:56
  • 1
    Paste CREATE TABLE statements for these tables into your question. Update statistics by running analyze commissions; analyze mt4_trades;. Then explain analyze your-query again. Commented Nov 12, 2014 at 4:14
  • 1
    @tsnorri: An index probably won't help on this hash join unless the OP can also add a WHERE condition to ibs_account. Commented Nov 12, 2014 at 14:00

1 Answer 1

1

9 seconds to return half a million rows is not terrible, and a sequential scan on 4M could be much faster than faster than 100K indexed lookups on 4M. Assuming you've got an index on order_id, you can test this by running set enable_seqscan TO false; before running your query (this will only affect the current connection).

Do you really need all 500K rows every time you run this query? Or are you going to be filtering the results? If you're almost always going to be filtering the results in some other way, you probably want to optimize against that query rather than the one that returns all 500K rows.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.