Two tables

Table 1 - customer table

user_id      Name
1            first
2            Second

Table 2 - customer_activity table

user_id      type
1            downloaded_software
1            filled_download_form
2            downloaded_software
2            filled_download_form
2            purchased

Goal - To select all the customers who have downloaded_software, filled_download_form and purchased.

My Query

    SELECT SQL_CALC_FOUND_ROWS DISTINCT(c.user_id) 
      FROM customer AS c 
INNER JOIN customer_activity AS ca ON ca.user_id = c.user_id 
     WHERE ca.type IN('downloaded_software','filled_download_form','purchased') 
  ORDER BY c.user_id asc 
     LIMIT 0, 100

Result

1
2

Desired Result

2

EDIT: Comments summary:

The answer to this question is below, but a useful follow-up scenario is excluding some items from the list. For example, searching for customers who have downloaded_software and filled_download_form but not purchased.

This was answered by @Serpiton in this fiddle in the comments.

2 Answers

You could group by user_id and count the distinct values of type. Since the WHERE clause restricts type to only three possible values, the count of distinct values is 3 exactly when all three are present for that user:

SELECT SQL_CALC_FOUND_ROWS c.user_id 
FROM customer AS c 
JOIN customer_activity AS ca ON ca.user_id = c.user_id 
WHERE ca.type IN('downloaded_software', 'filled_download_form', 'purchased') 
GROUP BY c.user_id
HAVING COUNT(DISTINCT ca.type) = 3
ORDER BY c.user_id 
LIMIT 0, 100

An SQLfiddle to test with.
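
For readers without access to the fiddle, here is a minimal sketch that runs the same GROUP BY / HAVING check against the question's sample data. It assumes Python's built-in sqlite3 module rather than MySQL, so the MySQL-only SQL_CALC_FOUND_ROWS modifier is dropped; the column types and the name `matching_ids` are illustrative choices, not part of the original post.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (user_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE customer_activity (user_id INTEGER, type TEXT);
    INSERT INTO customer VALUES (1, 'first'), (2, 'Second');
    INSERT INTO customer_activity VALUES
        (1, 'downloaded_software'),
        (1, 'filled_download_form'),
        (2, 'downloaded_software'),
        (2, 'filled_download_form'),
        (2, 'purchased');
""")

# Same shape as the query above, without SQL_CALC_FOUND_ROWS (MySQL only).
ALL_THREE = """
    SELECT c.user_id
    FROM customer AS c
    JOIN customer_activity AS ca ON ca.user_id = c.user_id
    WHERE ca.type IN ('downloaded_software', 'filled_download_form', 'purchased')
    GROUP BY c.user_id
    HAVING COUNT(DISTINCT ca.type) = 3
    ORDER BY c.user_id
    LIMIT 100
"""

matching_ids = [row[0] for row in conn.execute(ALL_THREE)]
print(matching_ids)  # [2]
```

Only user 2 has all three activity types, which matches the desired result in the question.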

EDIT: To answer your question from the comments: if you need to exclude a type, you can't easily use GROUP BY to find the results. You can either do a self join per type (left join the excluded one and check that it produces no row):

SELECT SQL_CALC_FOUND_ROWS c.user_id 
FROM customer AS c 
JOIN customer_activity AS ca1
  ON ca1.user_id = c.user_id AND ca1.type = 'downloaded_software'
JOIN customer_activity AS ca2
  ON ca2.user_id = c.user_id AND ca2.type = 'filled_download_form'
LEFT JOIN customer_activity AS ca3
  ON ca3.user_id = c.user_id AND ca3.type = 'purchased'
WHERE ca3.user_id IS NULL 
ORDER BY c.user_id 
LIMIT 0, 100

Alternatively (less efficient, but perhaps easier if you're generating the query automatically), you can use three simple subqueries, with IN or NOT IN deciding whether each type must be present or absent:

SELECT SQL_CALC_FOUND_ROWS c.user_id 
FROM customer AS c
WHERE c.user_id IN (
  SELECT user_id FROM customer_activity WHERE type='downloaded_software'
) AND c.user_id IN (
  SELECT user_id FROM customer_activity WHERE type='filled_download_form'
) AND c.user_id NOT IN (
  SELECT user_id FROM customer_activity WHERE type='purchased'
) 
ORDER BY c.user_id
LIMIT 0,100;

An SQLfiddle showing both in action.
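
As a rough check that the two exclusion queries agree, the sketch below runs both against the question's sample data. It again assumes Python's sqlite3 module instead of MySQL (so SQL_CALC_FOUND_ROWS and LIMIT are left out), and the names `ANTI_JOIN`, `SUBQUERIES`, `anti_join_ids` and `subquery_ids` are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (user_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE customer_activity (user_id INTEGER, type TEXT);
    INSERT INTO customer VALUES (1, 'first'), (2, 'Second');
    INSERT INTO customer_activity VALUES
        (1, 'downloaded_software'),
        (1, 'filled_download_form'),
        (2, 'downloaded_software'),
        (2, 'filled_download_form'),
        (2, 'purchased');
""")

# Variant 1: inner join each required type, left join the excluded type
# and keep only customers for whom the left join found no row.
ANTI_JOIN = """
    SELECT c.user_id
    FROM customer AS c
    JOIN customer_activity AS ca1
      ON ca1.user_id = c.user_id AND ca1.type = 'downloaded_software'
    JOIN customer_activity AS ca2
      ON ca2.user_id = c.user_id AND ca2.type = 'filled_download_form'
    LEFT JOIN customer_activity AS ca3
      ON ca3.user_id = c.user_id AND ca3.type = 'purchased'
    WHERE ca3.user_id IS NULL
    ORDER BY c.user_id
"""

# Variant 2: one IN / NOT IN subquery per type.
SUBQUERIES = """
    SELECT c.user_id
    FROM customer AS c
    WHERE c.user_id IN (
      SELECT user_id FROM customer_activity WHERE type = 'downloaded_software'
    ) AND c.user_id IN (
      SELECT user_id FROM customer_activity WHERE type = 'filled_download_form'
    ) AND c.user_id NOT IN (
      SELECT user_id FROM customer_activity WHERE type = 'purchased'
    )
    ORDER BY c.user_id
"""

anti_join_ids = [row[0] for row in conn.execute(ANTI_JOIN)]
subquery_ids = [row[0] for row in conn.execute(SUBQUERIES)]
print(anti_join_ids, subquery_ids)  # [1] [1]
```

Both variants return user 1, who downloaded and filled in the form but never purchased.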

7 Comments

Yep, this one is much better.
Nice!! That works. What if I want to exclude a type...So for example I want to target customers who have downloaded_software, filled_download_form but not purchased.
@KaranKhanna Added two possible examples of that.
It's possible to exclude a type with the first query by changing the HAVING to HAVING COUNT(DISTINCT ca.type) = 2 AND COUNT(CASE WHEN ca.type = 'purchased' THEN 1 ELSE NULL END) = 0; the type in the CASE is the one to exclude.
@KaranKhanna maybe I wasn't clear, check this fiddle

Disclaimer: in a comment on Joachim Isaksson's answer I suggested a variation on one of his queries to the OP, who then asked for clarification. This is that clarification.

Starting from his query, with my modification:

SELECT SQL_CALC_FOUND_ROWS c.user_id 
FROM customer AS c 
JOIN customer_activity AS ca 
  ON ca.user_id = c.user_id 
WHERE ca.type IN('downloaded_software',
                 'filled_download_form',
                 'purchased') 
GROUP BY c.user_id
HAVING COUNT(DISTINCT ca.type) = 2
   AND COUNT(CASE WHEN ca.type = 'purchased' THEN 1 ELSE NULL END) = 0
ORDER BY c.user_id 
LIMIT 0, 100

The HAVING clause (shown in bold in the original post) is my suggested edit.

If anything outside the part I modified is unclear, you should ask Joachim Isaksson, as he is the one who wrote the query.

My edit does what it says on the tin: the first condition checks that exactly two of the three valid values of type are present, and the second checks that 'purchased' is the one left out. The second condition is equivalent to

SUM(CASE WHEN ca.type = 'purchased' THEN 1 ELSE 0 END) = 0

which may be simpler to read.
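
To see why the two forms agree, the sketch below computes both expressions per customer on the question's sample data. COUNT ignores NULLs, so counting a CASE that yields 1 or NULL gives the same number as summing a CASE that yields 1 or 0. The harness assumes Python's sqlite3 module rather than MySQL, and the aliases `via_count` and `via_sum` are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (user_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE customer_activity (user_id INTEGER, type TEXT);
    INSERT INTO customer VALUES (1, 'first'), (2, 'Second');
    INSERT INTO customer_activity VALUES
        (1, 'downloaded_software'),
        (1, 'filled_download_form'),
        (2, 'downloaded_software'),
        (2, 'filled_download_form'),
        (2, 'purchased');
""")

# Count 'purchased' rows per customer in both styles, side by side.
COMPARE = """
    SELECT c.user_id,
           COUNT(CASE WHEN ca.type = 'purchased' THEN 1 ELSE NULL END) AS via_count,
           SUM(CASE WHEN ca.type = 'purchased' THEN 1 ELSE 0 END) AS via_sum
    FROM customer AS c
    JOIN customer_activity AS ca ON ca.user_id = c.user_id
    GROUP BY c.user_id
    ORDER BY c.user_id
"""

per_customer = conn.execute(COMPARE).fetchall()
print(per_customer)  # [(1, 0, 0), (2, 1, 1)]
```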

The whole query is equivalent to

SELECT SQL_CALC_FOUND_ROWS c.user_id 
FROM customer AS c 
JOIN customer_activity AS ca 
  ON ca.user_id = c.user_id 
WHERE ca.type IN('downloaded_software',
                 'filled_download_form',
                 'purchased') 
GROUP BY c.user_id
HAVING COUNT(DISTINCT 
             CASE WHEN ca.type = 'downloaded_software' THEN 1 ELSE NULL END) = 1
   AND COUNT(DISTINCT 
             CASE WHEN ca.type = 'filled_download_form' THEN 1 ELSE NULL END) = 1
   AND COUNT(DISTINCT 
             CASE WHEN ca.type = 'purchased' THEN 1 ELSE NULL END) = 0
ORDER BY c.user_id 
LIMIT 0, 100

(If those three are the only type values in the table, the WHERE clause is not necessary.)

If you're building the query from a programming language, I would use this as the template:

SELECT SQL_CALC_FOUND_ROWS c.user_id 
FROM customer AS c 
JOIN customer_activity AS ca 
  ON ca.user_id = c.user_id 
WHERE ca.type IN('downloaded_software',
                 'filled_download_form',
                 'purchased') 
GROUP BY c.user_id
HAVING COUNT(DISTINCT 
             CASE WHEN ca.type = 'downloaded_software' 
                  THEN 1 ELSE NULL END) = ?downloaded?
   AND COUNT(DISTINCT 
             CASE WHEN ca.type = 'filled_download_form' 
                  THEN 1 ELSE NULL END) = ?filled?
   AND COUNT(DISTINCT 
             CASE WHEN ca.type = 'purchased' 
                  THEN 1 ELSE NULL END) = ?purchased?
ORDER BY c.user_id 
LIMIT 0, 100

with ?downloaded?, ?filled? and ?purchased? as parameters: 1 means the type must be present, 0 means the type must be absent.

To answer the other question ("For example, only target customers who have neither filled_download_form nor purchased. What would be the query for that?"): just fill in the parameters accordingly.
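
One way the template might be driven from application code is sketched below. It assumes Python's sqlite3 module, whose positional `?` placeholders stand in for the ?downloaded?, ?filled? and ?purchased? markers; the helper name `find_customers` is hypothetical, and SQL_CALC_FOUND_ROWS is omitted because it is MySQL-specific.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (user_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE customer_activity (user_id INTEGER, type TEXT);
    INSERT INTO customer VALUES (1, 'first'), (2, 'Second');
    INSERT INTO customer_activity VALUES
        (1, 'downloaded_software'),
        (1, 'filled_download_form'),
        (2, 'downloaded_software'),
        (2, 'filled_download_form'),
        (2, 'purchased');
""")

# The template from the answer, with one bound parameter per type.
TEMPLATE = """
    SELECT c.user_id
    FROM customer AS c
    JOIN customer_activity AS ca ON ca.user_id = c.user_id
    WHERE ca.type IN ('downloaded_software', 'filled_download_form', 'purchased')
    GROUP BY c.user_id
    HAVING COUNT(DISTINCT CASE WHEN ca.type = 'downloaded_software'
                               THEN 1 ELSE NULL END) = ?
       AND COUNT(DISTINCT CASE WHEN ca.type = 'filled_download_form'
                               THEN 1 ELSE NULL END) = ?
       AND COUNT(DISTINCT CASE WHEN ca.type = 'purchased'
                               THEN 1 ELSE NULL END) = ?
    ORDER BY c.user_id
    LIMIT 100
"""


def find_customers(conn, downloaded, filled, purchased):
    """Hypothetical helper: each flag is 1 (must be present) or 0 (must be absent)."""
    rows = conn.execute(TEMPLATE, (downloaded, filled, purchased))
    return [row[0] for row in rows]


not_purchased = find_customers(conn, 1, 1, 0)       # downloaded + filled, no purchase
all_three = find_customers(conn, 1, 1, 1)           # all three present
neither_form_nor_purchase = find_customers(conn, 1, 0, 0)
print(not_purchased, all_three, neither_form_nor_purchase)  # [1] [2] []
```

Note that the inner join and WHERE clause only keep customers with at least one row of the three listed types, so passing 0 for all three can never match anything. A filter made up only of exclusions probably needs a different query shape, such as the NOT IN form from the other answer.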

2 Comments

Thanks a lot mate for a good explanation. Also this solution will not work if I just have a 1 negation parameter right? For example, just select customers which have not purchased today?
Sorry...the database was too large to map out that's why I didn't write in the question. There is a date column in there but let's leave the today part. Thing is, these are all filters which a customer can select. They may or may not select all. So suppose if they are only selecting "not purchased" then this one doesn't work.
