I tried searching other posts but could only find ones about finding duplicates of a single fixed value.

So imagine the following table:

╔══════════╦════════╗
║ customer ║ color  ║
╠══════════╬════════╣
║        1 ║ black  ║
║        1 ║ black  ║
║        2 ║ red    ║
║        2 ║ black  ║
║        3 ║ red    ║
║        3 ║ red    ║
║        3 ║ red    ║
║        4 ║ black  ║
║        5 ║ black  ║
║        5 ║ green  ║
║        6 ║ purple ║
╚══════════╩════════╝

I want to select the "duplicates", meaning the following customers:

  • with more than one black
  • one black plus at least one red also counts as a duplicate
  • not a duplicate: a customer can have as many reds as they want

What I have so far

Currently I can only select the black duplicates; I cannot combine that with the condition "one black and one or more reds".

SELECT customer
FROM events
WHERE color = 'black'
GROUP BY customer
HAVING COUNT(*) > 1

Maybe I could first count the blacks and then join again with the existing table to count the additional blacks and reds?
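A minimal sketch of that count-then-join idea, run through Python's `sqlite3` module so it can be tried on the sample rows. The in-memory SQLite database standing in for the real one, and the aliases `b`, `e`, `blacks` and `reds`, are assumptions made for illustration; they are not part of the original post.

```python
import sqlite3

# Sample rows copied from the question's table: (customer, color).
SAMPLE = [(1, "black"), (1, "black"), (2, "red"), (2, "black"),
          (3, "red"), (3, "red"), (3, "red"), (4, "black"),
          (5, "black"), (5, "green"), (6, "purple")]

# Count blacks per customer first, then join back to pick up that customer's reds.
COUNT_THEN_JOIN = """
SELECT b.customer, b.blacks, COUNT(e.color) AS reds
FROM (
    SELECT customer, COUNT(*) AS blacks
    FROM events
    WHERE color = 'black'
    GROUP BY customer
) AS b
LEFT JOIN events AS e
       ON e.customer = b.customer AND e.color = 'red'
GROUP BY b.customer, b.blacks
HAVING b.blacks > 1 OR COUNT(e.color) > 0
ORDER BY b.customer
"""

conn = sqlite3.connect(":memory:")  # assumption: SQLite stands in for the real database
conn.execute("CREATE TABLE events (customer INTEGER, color TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?)", SAMPLE)

join_rows = conn.execute(COUNT_THEN_JOIN).fetchall()
print(join_rows)  # [(1, 2, 0), (2, 1, 1)]
```

The LEFT JOIN keeps customers that have blacks but no reds, so a double black such as customer 1 still survives the HAVING filter.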

Desired Output

I would like to get customers 1 and 2 as the result. Even better would be an output that tells me whether the customer was a double black or a black plus some reds:

╔══════════╦═══════════╦══════════════╗
║ customer ║ blackOnly ║ blackPlusRed ║
╠══════════╬═══════════╬══════════════╣
║        1 ║ yes       ║ no           ║
║        2 ║ no        ║ yes          ║
╚══════════╩═══════════╩══════════════╝

Sorry, I had to modify my post

  • I added customers 5 and 6 to the example table, along with more colors. So maybe some suggestions do not work anymore :-(. (I just wanted to edit fast, so if I didn't follow some editing rules, just tell me.)
  • Thanks for the very fast answers so far
5 Comments
  • What if customer 1 also had a red? Would he still be blackOnly? Commented Oct 26, 2015 at 9:09
  • Take your pick from any of the three responses below :-) Commented Oct 26, 2015 at 9:19
  • @pablomatico: Right, I didn't specify this. It would be best to count on both sides then, and I should relabel "blackOnly" to "blackDouble". But the more important thing is just the right selection of customers. Thanks Commented Oct 26, 2015 at 9:33
  • You didn't mention the results you are expecting regarding the new colors. Commented Oct 26, 2015 at 9:49
  • @NeriaNachum: The result should not change due to the new colors. The same conditions for the meaning of "duplicate" apply. I had to add them because some solutions correctly assumed that the example was the whole problem set, so they were built to meet the simplified condition. Commented Oct 26, 2015 at 10:02

2 Answers

2

This query first builds a derived table (a subquery in the FROM clause) containing the counts of black and red for each customer, and then queries that derived table to obtain the blackOnly and blackPlusRed column values for each customer.

SELECT t.customer,
    CASE WHEN t.black > 1 AND t.red = 0 THEN 'yes' ELSE 'no' END AS blackOnly,
    CASE WHEN t.black > 0 AND t.red > 0 THEN 'yes' ELSE 'no' END AS blackPlusRed
FROM
(
    -- one row per customer with its black and red counts
    SELECT customer,  -- select only the grouped column so this also runs under ONLY_FULL_GROUP_BY
        SUM(CASE WHEN color='black' THEN 1 ELSE 0 END) AS black,
        SUM(CASE WHEN color='red' THEN 1 ELSE 0 END) AS red
    FROM events
    GROUP BY customer
) t

If you want to add a new color condition, e.g. only red, then you can add a new CASE expression to the outer query:

CASE WHEN t.red > 1 AND t.black = 0 THEN 'yes' ELSE 'no' END AS redOnly
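The answer's main query returns one row per customer, including those who are not duplicates. A minimal sketch of how it could be narrowed to the duplicates only, run through Python's `sqlite3` module as a stand-in for the real database (an assumption): the outer WHERE clause and the ORDER BY are additions for illustration, not part of the original answer.

```python
import sqlite3

# Sample rows copied from the question's table: (customer, color).
SAMPLE = [(1, "black"), (1, "black"), (2, "red"), (2, "black"),
          (3, "red"), (3, "red"), (3, "red"), (4, "black"),
          (5, "black"), (5, "green"), (6, "purple")]

FLAGS_FOR_DUPLICATES = """
SELECT t.customer,
       CASE WHEN t.black > 1 AND t.red = 0 THEN 'yes' ELSE 'no' END AS blackOnly,
       CASE WHEN t.black > 0 AND t.red > 0 THEN 'yes' ELSE 'no' END AS blackPlusRed
FROM (
    SELECT customer,
           SUM(CASE WHEN color = 'black' THEN 1 ELSE 0 END) AS black,
           SUM(CASE WHEN color = 'red' THEN 1 ELSE 0 END) AS red
    FROM events
    GROUP BY customer
) t
-- added filter (assumption): keep only customers matching the duplicate rules
WHERE t.black > 1 OR (t.black > 0 AND t.red > 0)
ORDER BY t.customer
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (customer INTEGER, color TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?)", SAMPLE)

flag_rows = conn.execute(FLAGS_FOR_DUPLICATES).fetchall()
print(flag_rows)  # [(1, 'yes', 'no'), (2, 'no', 'yes')]
```

Because the counts only look at black and red, the green and purple rows added later in the question do not affect the result.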

Here is a demo:

SQLFiddle

2 Comments

You shouldn't encourage the use of non-standard double quotes for string literals. Use single quotes instead.
There are a lot of things MySQL tolerates which are not part of the ANSI standard, such as allowing a selected column in a GROUP BY query that is neither grouped nor aggregated. Really bad practice, but welcome to the world of open source.
1

You want all customers having 'black' and at least two records. You can do this with conditional aggregation:

select 
  customer,  
  case when count(distinct color) = 1 then 'yes' else 'no' end as blackOnly,
  case when count(distinct color) > 1 then 'yes' else 'no' end as blackPlusRed
from events 
group by customer
having count(*) > 1
and count(case when color = 'black' then 1 end) > 0;
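A quick check, sketched with Python's `sqlite3` module as a stand-in for the real database (an assumption), suggests this first version only fits the original black-and-red data. On the extended sample table from the question it also reports customer 5, who has one black and one green. The ORDER BY is added here only to make the output order predictable.

```python
import sqlite3

# Extended sample rows from the edited question: (customer, color).
SAMPLE = [(1, "black"), (1, "black"), (2, "red"), (2, "black"),
          (3, "red"), (3, "red"), (3, "red"), (4, "black"),
          (5, "black"), (5, "green"), (6, "purple")]

# The query above, unchanged apart from the added ORDER BY.
TWO_COLOR_QUERY = """
select
  customer,
  case when count(distinct color) = 1 then 'yes' else 'no' end as blackOnly,
  case when count(distinct color) > 1 then 'yes' else 'no' end as blackPlusRed
from events
group by customer
having count(*) > 1
and count(case when color = 'black' then 1 end) > 0
order by customer
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (customer INTEGER, color TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?)", SAMPLE)

two_color_rows = conn.execute(TWO_COLOR_QUERY).fetchall()
print(two_color_rows)  # [(1, 'yes', 'no'), (2, 'no', 'yes'), (5, 'no', 'yes')]
```

Customer 5 slips through because `count(*) > 1` and `count(distinct color) > 1` treat any second color as if it were red.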

UPDATE: If you allow for other colors, the query changes slightly:

select 
  customer,  
  case when count(case when color = 'red' then 1 end) = 0 then 'yes' else 'no' end as blackOnly,
  case when count(case when color = 'red' then 1 end) > 0 then 'yes' else 'no' end as blackPlusRed
from events 
group by customer
having count(case when color = 'black' then 1 end) > 1
or
(
  count(case when color = 'black' then 1 end) > 0
  and 
  count(case when color = 'red' then 1 end) > 0
);
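In a comment under the question, the asker says counts on both sides would be the most useful output. A minimal sketch of that variant, keeping the HAVING clause from the query above and returning counts in place of the yes/no flags. It runs through Python's `sqlite3` module as a stand-in for the real database; the column names `blacks` and `reds` and the extra customer 7 (two blacks plus a red) are hypothetical and added only for illustration.

```python
import sqlite3

# Extended sample rows from the edited question: (customer, color).
SAMPLE = [(1, "black"), (1, "black"), (2, "red"), (2, "black"),
          (3, "red"), (3, "red"), (3, "red"), (4, "black"),
          (5, "black"), (5, "green"), (6, "purple")]
# Hypothetical customer 7, not in the question: a double black who also has a red.
EXTRA = [(7, "black"), (7, "black"), (7, "red")]

COUNTS_QUERY = """
select
  customer,
  count(case when color = 'black' then 1 end) as blacks,
  count(case when color = 'red' then 1 end) as reds
from events
group by customer
having count(case when color = 'black' then 1 end) > 1
or
(
  count(case when color = 'black' then 1 end) > 0
  and
  count(case when color = 'red' then 1 end) > 0
)
order by customer
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (customer INTEGER, color TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?)", SAMPLE + EXTRA)

count_rows = conn.execute(COUNTS_QUERY).fetchall()
print(count_rows)  # [(1, 2, 0), (2, 1, 1), (7, 2, 1)]
```

With counts, a customer like the hypothetical number 7 is visible as both a double black and a black plus red, which the two yes/no flags cannot express at the same time.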

6 Comments

Sorry, I edited my post and added more colors. So I cannot break the condition down to "You want all customers having 'black' and at least two records" anymore. I would appreciate it if there were still a solution without a subselect.
@timguy There is nothing wrong with a subquery if it is necessary. Nice first name, by the way ^ ^
@timguy: Then simply change the conditions. I've updated my answer.
@TimBiegeleisen: You are right, there is nothing wrong with a subquery, and I usually like them because they are easier to read. It's just that in my case the SQL I use from the Java side is easier to adapt without one. Nice first name, indeed ;-). Thanks
I like EXISTS clauses, too, and always wonder why so many people join instead, only to have to use DISTINCT afterwards in order to get rid of the generated duplicates :-) Here, however, where you only want information on a customer from that one table, aggregation is the straightforward way and usually the faster one, too.