Here is the query I'm trying to launch:
SELECT c.creative_id, c.creative_title, c.creative_image_name, c.gravity, c.ad_strength
FROM creatives AS c
INNER JOIN term_relationships AS tr ON c.creative_id = tr.creative_id
WHERE tr.term_id
IN ( 14, 1, 50, 76, 104 )
GROUP BY c.creative_id
HAVING COUNT(tr.term_id ) =5
ORDER BY c.gravity ASC
LIMIT 30;
Here is what EXPLAIN for this query outputs:
Here is the creatives table structure:
CREATE TABLE `creatives` (
`creative_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
`scraper_id` bigint(20) unsigned DEFAULT NULL,
`creative_title` varchar(255) NOT NULL,
`creative_image_name` varchar(255) DEFAULT NULL,
`image_attrib` varchar(12) DEFAULT NULL,
`original_image_name` varchar(255) DEFAULT NULL,
`creative_subtext` varchar(255) DEFAULT NULL,
`dest_url` varchar(2083) NOT NULL,
`lp_url` varchar(2083) NOT NULL,
`lp_image_name` varchar(255) DEFAULT NULL,
`lp_image_flag` tinyint(1) unsigned NOT NULL DEFAULT '0',
`creative_first_seen` date NOT NULL,
`creative_last_seen` date NOT NULL,
`daily_ad_count` int(5) unsigned NOT NULL,
`ad_strength` int(11) unsigned NOT NULL,
`prev_ad_strength` int(11) unsigned DEFAULT NULL,
`gravity` int(11) unsigned DEFAULT NULL,
PRIMARY KEY (`creative_id`),
KEY `gravity` (`gravity`)
) ENGINE=InnoDB AUTO_INCREMENT=173037591 DEFAULT CHARSET=utf8
I'm concerned about Using temporary; using filesort when launching both with GROUP BY and ORDER BY on another column. If I remove ORDER BY, the temporary and filesort go away and the query runs really fast.
What I don't understand, why mysql needs temporary table, why can't it first where filter + sort by c.gravity, then group by the resulting table and filter according to HAVING clause. The filtered table will be sorted by c.gravity correctly as the gravity value remains unchanged after the grouping and having filter.
What I tried:
Selected everything without
ORDER BY, wrapped into a subquery and joined again oncreativestable - same result, using temporary, filesort and slowtried to add
FORCE USE INDEX FOR ORDER BY (gravity)and it doesn't change anything.EXPLAINand execution time remain the same.
UPDATE: the question has been answered by @Rick and it's really much faster with his correlated subquery and not using GROUP BY. I'm adding here an EXPLAIN output for the query:
And the output of SHOW CREATE TABLE term_relationships with the newly created index:
And one more question to @Rick: why do we need the outer query with c3? It seems just to join creatives on its own one more just to get the values from other columns and order the records by gravity. However, they are already sorted with the inner query and we can easily add missing columns in c1 making it:
SELECT c1.creative_id,c1.creative_title,c1.creative_image_name,c1.gravity, c1.ad_strength
FROM creatives AS c1
WHERE
( SELECT COUNT(*)
FROM term_relationships
WHERE c1.creative_id = creative_id
AND term_id IN ( 14, 1, 50, 76, 104 )
) = 5
ORDER BY c1.gravity ASC
LIMIT 30;
Is my understanding correct or am I missing something in your query?



c3) Withoutc3, onlycreative_idis hauled around in the tmp table. Withc3, lots of columns are hauled around. If there is a big difference between theLIMIT(30) and the number of rows beforeLIMITing, then there could be a noticeable performance difference.