Optimizing SQL queries with a NOT IN clause on the same table

Question

I would like to select all entries from the orders table where a certain product was ordered prior to 2019 but not after it. The table has close to 7M entries, and the query below takes almost 4 minutes to run. Note that in the orders table, productId is a foreign key to the products table and is indexed. Can the query below be rewritten to perform better? Any help is greatly appreciated. Thank you.

SELECT distinct *
FROM orders o
WHERE o.year < '2019'
AND o.productid NOT IN (
                        SELECT distinct(productid)
                        FROM orders
                        WHERE year > '2019');

Please find below the output from the EXPLAIN command:

+----+--------------------+-------+------------+------+------------------------+------------------------+---------+--------------------------+---------+----------+-------------+
| id | select_type        | table | partitions | type | possible_keys          | key                    | key_len | ref                      | rows    | filtered | Extra       |
+----+--------------------+-------+------------+------+------------------------+------------------------+---------+--------------------------+---------+----------+-------------+
|  1 | PRIMARY            | o     | NULL       | ALL  | NULL                   | NULL                   | NULL    | NULL                     | 2124177 |    33.33 | Using where |
|  2 | DEPENDENT SUBQUERY | o2    | NULL       | ref  | FK_orders_product      | FK_orders_product      | 4       | test-db.o.productid      |       3 |    33.33 | Using where |
+----+--------------------+-------+------------+------+------------------------+------------------------+---------+--------------------------+---------+----------+-------------+
2 rows in set, 2 warnings (0.05 sec)

Please run explain before your query and post the result in the question. Add table description as well.
– Ergest Basha, Commented Jun 4, 2022 at 20:22
Why do you mention a foreign key to products? You don't use that table in this query.
– HoneyBadger, Commented Jun 4, 2022 at 22:19
@ErgestBasha - Please find the output from explain select in the question above. Thanks
– shashank hr, Commented Jun 6, 2022 at 2:45

Stu · Accepted Answer · 2022-06-04 20:04:05Z

1

You could use NOT EXISTS instead of NOT IN.

Hopefully the year column is not a varchar, in which case you should compare it against numbers rather than string literals. Presumably select * already returns a unique key for each row, so there can't be any duplicates and you should remove distinct.

Your year ranges (year < '2019' and year > '2019') also exclude 2019 completely, so presumably one of your predicates should include 2019?

select *
from orders o
where o.year < 2019
  and not exists (
    select *
    from orders o2
    where o2.productid = o.productid
      and o2.year >= 2019
  );
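
Two quick checks may be worth running alongside the query above. This is only a sketch, assuming MySQL (which the format of the EXPLAIN output in the question suggests) and the table and column names from the question. The first statement shows the declared type of year, and the second produces the plan for the NOT EXISTS version so it can be compared with the plan for NOT IN.

```sql
-- Show the declared type of the year column (for example YEAR, INT, or VARCHAR).
SHOW COLUMNS FROM orders LIKE 'year';

-- Plan for the NOT EXISTS rewrite; compare it with the plan posted in the question.
EXPLAIN
SELECT *
FROM orders o
WHERE o.year < 2019
  AND NOT EXISTS (
    SELECT 1
    FROM orders o2
    WHERE o2.productid = o.productid
      AND o2.year >= 2019
  );
```

If the type turns out to be a string type, the numeric comparisons here would need to be revisited.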

answered Jun 4, 2022 at 20:04


4 Comments

shashank hr Over a year ago

Thanks for your help. I made the changes you mentioned above, and they have definitely improved/corrected the results returned. The query still takes about the same time, though, and I'm not sure how to optimize it further. Thank you

Rick James Over a year ago

@shashankhr - Please provide the EXPLAIN SELECT ...

shashank hr Over a year ago

@RickJames - Please find the output from explain select in the question above. Thanks

Rick James Over a year ago

@shashankhr - And for Stu's NOT EXISTS version?

Rick James · Accepted Answer · 2022-06-04 21:21:47Z

0

Probably both uses of DISTINCT were useless.

Add this composite index (to at least help the NOT EXISTS):

INDEX(productid, year)
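
As a rough sketch, the index could be added like this, assuming MySQL and assuming the column is named productid as in the question's query. The index name idx_orders_productid_year is made up for illustration.

```sql
-- idx_orders_productid_year is a hypothetical name; any unused index name works.
-- productid comes first because the subquery matches it by equality;
-- year comes second so the range condition on year can be checked inside the index.
ALTER TABLE orders
  ADD INDEX idx_orders_productid_year (productid, year);
```

After adding it, re-running EXPLAIN on the NOT EXISTS query should show whether the optimizer picks the new index for the subquery.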

answered Jun 4, 2022 at 21:21
