Return Distinct Concat Count if More Than One Result

Question

I'm trying to get a distinct count of multiple columns grouped by another column, but I want the results to only include counts greater than ONE. So if I have the following:

SELECT *
FROM cast
ORDER BY cast_characters, cast_identities, cast_roles

cast_characters	cast_identities	cast_roles
Barry	William	Hero
Barry	William	Hero
Barry	Scott	Hero
Barry	Scott	Hero
Alice	Susan	Villain
Jerry	Smith	Villain
Jerry	Smith	Villain
Carlos	Salvador	Supporting
Carlos	Salvador	Supporting

I want to count, per role, the unique character/identity combinations that appear more than once. Based on the above, "Hero" should have a count of two (Barry/William and Barry/Scott), "Villain" should have a count of one (Jerry/Smith; Alice/Susan should be ignored, as there's only one instance of her), and "Supporting" should have a count of one (Carlos/Salvador). So I tried the following query, and this is the result I want:

SELECT cast_roles, COUNT(DISTINCT CONCAT(cast_characters, cast_identities, cast_roles)) AS 'cnt'
FROM cast
GROUP BY cast_roles
HAVING cnt > 1;

cast_roles	cnt
Hero	2
Villain	1
Supporting	1

But I get...

cast_roles	cnt
Hero	2
Villain	2
Supporting	1

So pretty close, but it looks like it's counting all distinct characters/identities/roles regardless of how many instances. Indeed, when I remove the "having" element from the query, I get the same results, so it doesn't seem to be doing anything, though it doesn't give me an error message, either.

What am I missing?

I made this fiddle to help us collaborate on a solution: dbfiddle.uk/ZES8SS2d
– Bart McEndree, Commented Jan 8 at 19:07
Please provide a table that shows what your desired result is.
– Bart McEndree, Commented Jan 8 at 19:14
Don't use CONCAT(). It will group characters_id=12, identities_id=3, roles_id=4 together with characters_id=1, identities_id=23, roles_id=4, because both concatenate to the same string. Just use COUNT(DISTINCT cast_characters_id, cast_identities_id, cast_roles_id).
– Barmar, Commented Jan 8 at 19:44
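
The collision described in that comment can be checked without any table. This is only a minimal sketch: the numbers are the hypothetical IDs from the comment, not data from the question, and the column aliases are made up for illustration.

```sql
-- Two different ID triples, (12, 3, 4) and (1, 23, 4), concatenated without a separator.
-- CONCAT() converts the numbers to strings, so both become '1234'.
SELECT CONCAT(12, 3, 4)                     AS first_key,
       CONCAT(1, 23, 4)                     AS second_key,
       CONCAT(12, 3, 4) = CONCAT(1, 23, 4)  AS keys_collide;  -- 1 means the keys are equal
```

Because both keys come out identical, COUNT(DISTINCT CONCAT(...)) would count these two combinations as one. COUNT(DISTINCT col1, col2, col3) compares the columns separately and does not have this problem.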

Barmar · Accepted Answer · 2025-01-08 21:02:27Z

2

You should start with a subquery that removes all non-duplicated rows from the original data. Then get the per-role counts from that.

SELECT cast_roles, COUNT(DISTINCT cast_characters, cast_identities) AS cnt
FROM (
    SELECT *
    FROM cast
    GROUP BY cast_characters, cast_identities, cast_roles
    HAVING COUNT(*) > 1
) AS multiples
GROUP BY cast_roles

Result:

cast_roles  cnt
Hero        2
Supporting  1
Villain     1
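
To reproduce this locally, something like the following setup should work. It is a sketch under assumptions: the thread never shows the table definition, so the VARCHAR(50) column types are guesses, and the inner query lists the grouped columns explicitly rather than using SELECT * so that it would still run if the real table had extra columns.

```sql
-- Hypothetical schema: column types are an assumption, not taken from the thread.
-- Backticks around `cast` are a precaution because CAST is also a built-in function name.
CREATE TABLE `cast` (
    cast_characters VARCHAR(50),
    cast_identities VARCHAR(50),
    cast_roles      VARCHAR(50)
);

-- The nine sample rows from the question.
INSERT INTO `cast` (cast_characters, cast_identities, cast_roles) VALUES
    ('Barry',  'William',  'Hero'),
    ('Barry',  'William',  'Hero'),
    ('Barry',  'Scott',    'Hero'),
    ('Barry',  'Scott',    'Hero'),
    ('Alice',  'Susan',    'Villain'),
    ('Jerry',  'Smith',    'Villain'),
    ('Jerry',  'Smith',    'Villain'),
    ('Carlos', 'Salvador', 'Supporting'),
    ('Carlos', 'Salvador', 'Supporting');

-- Step 1 (inner query): keep only combinations that occur more than once.
-- Step 2 (outer query): count the surviving combinations per role.
SELECT cast_roles, COUNT(DISTINCT cast_characters, cast_identities) AS cnt
FROM (
    SELECT cast_characters, cast_identities, cast_roles
    FROM `cast`
    GROUP BY cast_characters, cast_identities, cast_roles
    HAVING COUNT(*) > 1
) AS multiples
GROUP BY cast_roles
ORDER BY cast_roles;
```

With the sample rows, Alice/Susan is dropped by the inner HAVING, so Villain ends up with a count of 1. The ORDER BY is added only to make the row order predictable.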

Note that you shouldn't include the column you're grouping by in the COUNT(DISTINCT ...) expression. It has the same value for every row in a group, so it can't change the count.
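
One way to convince yourself of this is to compute both variants side by side. A small sketch, assuming the cast table from the question; the aliases without_role and with_role are invented here for illustration.

```sql
-- Within one GROUP BY cast_roles group, cast_roles is constant,
-- so adding it to the DISTINCT list cannot create new distinct combinations.
SELECT cast_roles,
       COUNT(DISTINCT cast_characters, cast_identities)             AS without_role,
       COUNT(DISTINCT cast_characters, cast_identities, cast_roles) AS with_role
FROM `cast`
GROUP BY cast_roles;
```

The two count columns should match on every row, as long as cast_roles is not NULL. COUNT(DISTINCT ...) skips combinations that contain a NULL, so a NULL role group would show 0 in with_role.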

edited Jan 8 at 21:02

answered Jan 8 at 20:17

4 Comments

Michael Kaiser Jan 8 at 20:57

This creates the right results so it's obviously the right solution. Unfortunately I don't think "WITH" works on the version of MySQL my host provides. Guess I need a new host.

Barmar Jan 8 at 21:01

I've recoded it as a traditional subquery.

Bart McEndree Jan 8 at 21:02

WITH was added in MySQL v8.0

Barmar Jan 8 at 21:03

@MichaelKaiser If your provider still only provides MySQL 5.x, you definitely should find a new host. That has been obsolete for years.
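
For readers on MySQL 8.0 or later, the same logic could be written as a common table expression. The WITH version of the answer is not preserved in this thread, so treat this as a sketch: the CTE name multiples is borrowed from the subquery alias, and the explicit column list is an assumption.

```sql
-- Check the server version first; WITH needs MySQL 8.0 or later.
SELECT VERSION();

-- CTE form of the duplicate filter followed by the per-role count.
WITH multiples AS (
    SELECT cast_characters, cast_identities, cast_roles
    FROM `cast`
    GROUP BY cast_characters, cast_identities, cast_roles
    HAVING COUNT(*) > 1          -- keep only combinations seen more than once
)
SELECT cast_roles, COUNT(DISTINCT cast_characters, cast_identities) AS cnt
FROM multiples
GROUP BY cast_roles;
```

On MySQL 5.x the WITH clause is a syntax error, which is why the answer uses a derived table instead.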

Bart McEndree · Answer · 2025-01-08 20:59:32Z

0

This subquery might help get the results you want:

SELECT cast_roles, COUNT(DISTINCT cast_characters, cast_identities) AS cnt
FROM (
    SELECT *
    FROM cast
    GROUP BY cast_characters, cast_identities, cast_roles
    HAVING COUNT(*) > 1
) t
GROUP BY cast_roles

cast_roles	cnt
Hero	2
Supporting	1
Villain	1

edited Jan 8 at 20:59

answered Jan 8 at 19:16

10 Comments

Barmar Jan 8 at 19:41

If their code works as is, just say so in a comment. An answer with the same code doesn't really solve anything.

Bart McEndree Jan 8 at 19:43

Yes, but this code is accompanied by different output than the OP suggests. I can't include a table in a comment. It also has a fiddle to help us find the discrepancy.

Barmar Jan 8 at 19:45

In cases like this I just write a comment that says "It works as expected" followed by a link to the fiddle.

Michael Kaiser Jan 8 at 19:50

It does not work as I want. I have edited my original post to hopefully communicate my problem more effectively.

Michael Kaiser Jan 8 at 19:58

But Alice/Susan was only a villain once, so I don't want her in the result.
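
To check which combinations pass the "more than once" filter, the duplicate-finding step can be run on its own. A minimal sketch, assuming the cast table from the question; the occurrences alias is introduced here for illustration.

```sql
-- List each character/identity/role combination that appears more than once,
-- together with how many times it appears.
SELECT cast_characters, cast_identities, cast_roles, COUNT(*) AS occurrences
FROM `cast`
GROUP BY cast_characters, cast_identities, cast_roles
HAVING COUNT(*) > 1
ORDER BY cast_roles, cast_characters, cast_identities;
```

With the sample data, Alice/Susan/Villain occurs only once and so should not appear in this list; the per-role count is then taken over these rows only.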
