I would like to select all entries from the orders table for products that were ordered before 2019 but not after it. The table has close to 7M rows, and the query below takes almost 4 minutes to run. Note that productId in the orders table is a foreign key to the products table and is indexed. Can the query below be rewritten to perform better? Any help is greatly appreciated. Thank you.
SELECT DISTINCT *
FROM orders o
WHERE o.year < '2019'
  AND o.productid NOT IN (
        SELECT DISTINCT productid
        FROM orders
        WHERE year > '2019'
      );
Please find below the output from the EXPLAIN command:
+----+--------------------+-------+------------+------+-------------------+-------------------+---------+---------------------+---------+----------+-------------+
| id | select_type        | table | partitions | type | possible_keys     | key               | key_len | ref                 | rows    | filtered | Extra       |
+----+--------------------+-------+------------+------+-------------------+-------------------+---------+---------------------+---------+----------+-------------+
|  1 | PRIMARY            | o     | NULL       | ALL  | NULL              | NULL              | NULL    | NULL                | 2124177 |    33.33 | Using where |
|  2 | DEPENDENT SUBQUERY | o2    | NULL       | ref  | FK_orders_product | FK_orders_product | 4       | test-db.o.productid |       3 |    33.33 | Using where |
+----+--------------------+-------+------------+------+-------------------+-------------------+---------+---------------------+---------+----------+-------------+
2 rows in set, 2 warnings (0.05 sec)
Comments:
Put EXPLAIN before your query and post the result in the question. Add the table description as well.
products? You don't use that table in this query.
SHOW CREATE TABLE orders
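One possible direction, as a sketch only and not a measured fix: the EXPLAIN output above shows a full scan of the outer table (type ALL) and a per-row lookup on FK_orders_product that still has to filter on year ("Using where"). A composite index covering both productid and year might let that inner check be resolved from the index alone, and the same condition can be written with NOT EXISTS. The index name idx_orders_product_year is made up for illustration, and the sketch assumes productid is never NULL (NOT IN and NOT EXISTS behave differently when NULLs are present). The year comparisons are kept exactly as in the original query, so rows from 2019 itself are still ignored on both sides.

```sql
-- Hypothetical composite index (the name is an assumption, not from the question).
-- With productid first and year second, the per-product "any order after 2019?"
-- check could be answered from the index without reading the table rows.
CREATE INDEX idx_orders_product_year ON orders (productid, year);

-- Same intent as the original query, written with NOT EXISTS.
-- Assumes orders.productid is NOT NULL; with NULLs the result can differ from NOT IN.
SELECT o.*
FROM orders o
WHERE o.year < '2019'
  AND NOT EXISTS (
        SELECT 1
        FROM orders o2
        WHERE o2.productid = o.productid
          AND o2.year > '2019'
      );
```

DISTINCT is dropped here on the assumption that orders has a primary key, in which case whole rows are already unique. If that does not hold, keep SELECT DISTINCT o.*. Running EXPLAIN again after adding the index would show whether the optimizer actually picks it up.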