I have a MySQL query (Ubuntu 10.04, InnoDB, Core i7, 16 GB RAM, SSD drives, MySQL parameters optimized):
SELECT
COUNT(DISTINCT subscriberid)
FROM
em_link_data
WHERE
linkid in (SELECT l.id FROM em_link l WHERE l.campaignid = '2900' AND l.link != 'open')
The table em_link_data has about 7 million rows; em_link has a few thousand. This query takes about 18 seconds to complete. However, if I substitute the results of the subquery and do this:
SELECT
COUNT(DISTINCT subscriberid)
FROM
em_link_data
WHERE
linkid in (24899,24900,24901,24902);
then the query runs in less than 1 millisecond. The subquery alone runs in less than 1 ms, and the column linkid is indexed.
If I rewrite the query as a join, it also runs in less than 1 ms. Why is an "IN" query so slow with a subquery in it and so fast with literal values in it? I can't rewrite the query (it is bought software), so I was hoping there is some tweak or hint to speed up this query. Any help is appreciated.
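For reference, the join form mentioned above might look like the following. This is a sketch rather than the exact statement that was tested; the aliases d and l are chosen here for illustration, and it assumes em_link.id is the table's primary key. Because the outer query counts distinct subscriberid values, the join returns the same result as the IN version.

```sql
-- Join rewrite of the IN (subquery) query.
-- Assumption: em_link.id is the primary key of em_link.
SELECT
    COUNT(DISTINCT d.subscriberid)
FROM
    em_link_data AS d
    JOIN em_link AS l ON l.id = d.linkid
WHERE
    l.campaignid = '2900'
    AND l.link != 'open';
```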
em_link needs an index containing campaignid and link.
... ackci.em_link_data.subscriberid) AS COUNT(DISTINCT subscriberid) from ackci.em_link_data where <in_optimizer>(ackci.em_link_data.linkid, <exists>(<primary_index_lookup>(<cache>(ackci.em_link_data.linkid) in em_link on PRIMARY where ((ackci.l.campaignid = '2900') and (ackci.l.link <> 'open') and (<cache>(ackci.em_link_data.linkid) = ackci.l.id)))))
... materialization option. If the sub-query is independent of the outer query, then it gets executed once, turned into a temporary table internally, then joined to the outer query. This has always been a very frustrating problem with MySQL, something that Oracle managed to get right several decades ago.
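The suggestions above can be sketched in SQL as follows. The index name idx_campaign_link is hypothetical, and the optimizer_switch flags assume MySQL 5.6 or later (where materialization and semijoin exist and are normally on by default); they are not available in the 5.1 series that Ubuntu 10.04 typically ships. The <exists>(<primary_index_lookup>(...)) form in the output above suggests the optimizer turned the IN into a lookup that is evaluated for each row of em_link_data, which would explain why the literal list is so much faster.

```sql
-- Composite index on the columns the subquery filters on.
-- idx_campaign_link is a hypothetical name. If link is a TEXT or BLOB
-- column, MySQL requires a prefix length, for example link(20).
CREATE INDEX idx_campaign_link ON em_link (campaignid, link);

-- Show how the optimizer rewrites the statement. EXPLAIN EXTENDED is the
-- older syntax; recent MySQL versions use plain EXPLAIN instead.
EXPLAIN EXTENDED
SELECT COUNT(DISTINCT subscriberid)
FROM em_link_data
WHERE linkid IN (SELECT l.id
                 FROM em_link l
                 WHERE l.campaignid = '2900' AND l.link != 'open');
SHOW WARNINGS;

-- Assumption: MySQL 5.6 or later. Lets the optimizer run an independent
-- subquery once, store it in an internal temporary table, and join to it.
SET SESSION optimizer_switch = 'materialization=on,semijoin=on';
```

Since the application's SQL cannot be edited, only the index and the server-side optimizer setting are usable here; the EXPLAIN step is just for checking whether the plan changes.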