0

I am trying to optimize this MySQL query. I want to get a count of the number of customers that do not have an appointment prior to the current appointment being looked at. In other words, if they have an appointment (which is what the NOT IN( subquery is checking for), then exclude them.

However, this query is absolutely killing performance. I know that MySQL is not very good with NOT IN( queries, but I am not sure on the best way to go about optimizing this query. It takes anywhere from 15 to 30 seconds to run. I have created indexes on CustNo, AptStatus, and AptNum.

SELECT
        COUNT(*)             AS NumOfCustomersWithPriorAppointment,
FROM 
        transaction_log AS tl
LEFT JOIN
        appointment AS a
ON
        a.AptNum = tl.AptNum
INNER JOIN
        customer AS c
ON
        c.CustNo = tl.CustNo
WHERE
        a.AptStatus  IN (2)
AND     a.CustNo NOT IN
        (
                SELECT
                        a2.CustNo
                FROM
                        appointment a2
                WHERE
                        a2.AptDateTime < a.AptDateTime)


AND     a.AptDateTime > BEGIN_QUERY_DATE
AND     a.AptDateTime < END_QUERY_DATE

Thank you in advance.

3
  • Please provide SHOW CREATE TABLE statements for each of the tables in your query. Commented Oct 15, 2017 at 3:32
  • Is BEGIN_QUERY_DATE a constant? Or another column? Rewrite the query using LEFT JOIN .. IS NULL; tell us if it is faster. Commented Oct 15, 2017 at 4:01
  • Incidentally, your outer join (a) is really an inner join Commented Oct 15, 2017 at 7:09

1 Answer 1

1

Try the following:

SELECT
        COUNT(*)             AS NumOfCustomersWithPriorAppointment,
FROM 
        transaction_log AS tl
INNER JOIN
        appointment AS a
ON
        a.AptNum = tl.AptNum
LEFT OUTER JOIN appointment AS earlier_a
        ON earlier_a.CustNo = a.CustNo
        AND earlier_a.AptDateTime < a.AptDateTime
INNER JOIN
        customer AS c
ON
        c.CustNo = tl.CustNo
WHERE
        a.AptStatus  IN (2)
AND     earlier_a.AptNum IS NULL


AND     a.AptDateTime > BEGIN_QUERY_DATE
AND     a.AptDateTime < END_QUERY_DATE

This will benefit from a composite index on (CustNo,AptDateTime). Make it unique if that fits your business model (logically it seems like it should, but practically it may not, depending on how you handle conflicts in your application.)

Provide SHOW CREATE TABLE statements for all tables if this does not create a sufficient performance improvement.

Sign up to request clarification or add additional context in comments.

1 Comment

This query nailed it exactly. Thank you so much for your help and suggestions!

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.