1

I'm running a very heavy query on MySQL which takes a few hours and would love to improve the process time. It looks like this:

insert into customers_input
select *
from 
(
   select *
   from cust_a a join cut_b b on a.id=b.id
   where a.date='2015-01-01' and a.date='2015-01-01'
) a left join
(
   select *
   from cust_c
) b on a.id=b.id;

cust_a - has 8,000,000 rows and only two different values for the date column, in addition a BTREE index on the id column

cust_b - has 600,000 rows and only two different values for the date column , in addition a BTREE index on the id column

cust_c - has 20,000 rows

My suspicion is that the problem is with the sub query (a) that joins tables cust_a and cust_b, simply because the process time has dramatically increased since I've added this sub query.

Any ideas or suggestions will be much appreciated.

Thanks in advance

2
  • You need to show us the table and index definitions, as well as row counts for each of the tables. Maybe your tables are defined poorly. Maybe the indexes aren't created correctly. Maybe you don't have an index on that column you thought you did. Without seeing the table and index definitions, we can't tell. We need row counts because that can affect query planning. If you know how to do an EXPLAIN or get an execution plan, put the results in the question as well. If you have no indexes, visit use-the-index-luke.com. Commented Jul 1, 2015 at 15:44
  • Until recent versions of MySQL, those subqueries could have no indexes, hence large table scans (of the tmp tables) were needed. Commented Jul 2, 2015 at 0:03

1 Answer 1

1

First, don't use subqueries. Your query can also be written as:

select *
from cust_a a join
     cust_b b
     on a.id = b.id and a.date = b.date
     where a.date = '2015-01-01' left join
     cust_c c
     on a.id = c.id;

In addition,

  • I fixed the typo in a table name.
  • I fixed the typo in the date comparison.
  • I moved the date comparison for b to the on clause.
  • I added an alias for c.

This query can then benefit from indexes: cust_a(date, id), cust_b(id, date) and cust_c(id). (The columns for the cust_b index can be in either order.)

Try out the query and see if it returns values in a timely manner. You can then put the insert back in.

Sign up to request clarification or add additional context in comments.

1 Comment

Thanks allot! taking out the sub-query and putting indexes helped allot. Got it down to a few minutes...

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.