I have a query on a large table with millions of rows that looks like this:
SELECT
    COUNT(DISTINCT clicks.tx_id) AS unique_clicks_count,
    clicks.offer_id
FROM
    clicks
WHERE
    clicks.offer_id = 5
    AND created_at > '2014-11-27 18:00:02';
The created_at column is a timestamp. I have a compound index on (offer_id, created_at), which gets used. The following is the EXPLAIN output:
| 1 | SIMPLE | clicks | range | clicks_created_at_index,clicks_offer_id_created_at_index | clicks_offer_id_created_at_index | 8 | NULL | 215380 | Using index condition; Using MRR |
Keeping the range condition in mind, what kind of index would I need to count the distinct
tx_ids efficiently? Most likely one that covers tx_id as well? What would the index look like without specifying
clicks.offer_id = 5, and instead doing GROUP BY offer_id?
An index on (offer_id, created_at, tx_id)? With created_at being the range column, MySQL cannot use the columns after it for the WHERE condition, but it can still read them from the index. That means it can skip reading the table and read only the index.

... offer_id and then adding an index on (created_at, tx_id). That should give you both the range scan and the filter on the value.
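
As a rough sketch of the three-column index on (offer_id, created_at, tx_id) mentioned above, something like the following could be tried. The index name is made up for illustration, and whether the optimizer actually picks the index depends on the data and the MySQL version, so it is worth checking with EXPLAIN rather than assuming.

```sql
-- Hypothetical index name; the column order puts the equality column first,
-- then the range column, then the column that is only read (tx_id).
ALTER TABLE clicks
    ADD INDEX clicks_offer_id_created_at_tx_id_index (offer_id, created_at, tx_id);

-- Re-check the plan for the original filter. The query touches only
-- offer_id, created_at and tx_id, all of which are in the index.
EXPLAIN
SELECT
    COUNT(DISTINCT tx_id) AS unique_clicks_count
FROM
    clicks
WHERE
    offer_id = 5
    AND created_at > '2014-11-27 18:00:02';
```

If the index is used as a covering index, the Extra column of the EXPLAIN output should include "Using index", meaning the rows are read from the index alone without touching the table.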