Is it possible in SQL to create a query that returns all rows where some columns are duplicates, but not all?

Actionable example: consider this hypothetical SQL table with five rows in it:

| Column_A | Column_B | Column_C |
| -------- | -------- | -------- |
| ABC      | DEF      | GHI      |
| ABC      | DEF      | JKL      |
| DEF      | GHI      | GHI      |
| DEF      | GHI      | JKL      |
| ABC      | GHI      | GHI      |

The question I'm asking is this: how can I write a query that will return/"select" all rows where both Column_A and Column_B are equal to that of at least one other row in the table?

To eliminate vagueness, here is a concrete problem; solving it will resolve my issue:

What SQL query will return exactly these four rows and no other rows?

| Column_A | Column_B | Column_C |
| -------- | -------- | -------- |
| ABC      | DEF      | GHI      |
| ABC      | DEF      | JKL      |
| DEF      | GHI      | GHI      |
| DEF      | GHI      | JKL      |

To do this the query must check if column A and B are duplicates of other rows, but ignore column C.

I thought that using GROUP BY and HAVING would work, but those only work when entire rows are duplicated, because the query just returns each unique row once. I need to return all rows where only some columns are duplicated.

Is this possible in SQL, and if so, how?

  • select * from t where (a, b) in (select a, b from t group by a, b having count(*) > 1) ? Commented Apr 8, 2024 at 19:07
  • @TheImpaler: works - IF your RDBMS does support this kind of syntax - not all RDBMS do ... (and the OP unfortunately didn't mention what concrete RDBMS he's using...) Commented Apr 8, 2024 at 19:31
  • Window functions are probably the most efficient: COUNT(*) OVER (PARTITION BY Column_A, Column_B), then check that for >1 Commented Apr 8, 2024 at 19:38
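
The window-function idea from the last comment can be sketched as follows. This is only an illustration under assumptions: the table name `t` is hypothetical, the data is the five-row sample from the question, and Python's built-in `sqlite3` module is used purely to make the SQL runnable (it needs an SQLite build with window-function support, 3.25 or newer). Other RDBMSs may need small syntax changes.

```python
import sqlite3

# Hypothetical table name `t`; rows are the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (Column_A TEXT, Column_B TEXT, Column_C TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("ABC", "DEF", "GHI"),
        ("ABC", "DEF", "JKL"),
        ("DEF", "GHI", "GHI"),
        ("DEF", "GHI", "JKL"),
        ("ABC", "GHI", "GHI"),
    ],
)

# Count how many rows share each (Column_A, Column_B) pair, without
# collapsing the rows, then keep only rows whose pair occurs more than once.
WINDOW_QUERY = """
SELECT Column_A, Column_B, Column_C
FROM (
    SELECT Column_A, Column_B, Column_C,
           COUNT(*) OVER (PARTITION BY Column_A, Column_B) AS pair_count
    FROM t
) counted
WHERE pair_count > 1
ORDER BY Column_A, Column_B, Column_C
"""

window_rows = conn.execute(WINDOW_QUERY).fetchall()
for row in window_rows:
    print(row)
```

The filter has to sit in an outer query because a window function cannot be referenced in the WHERE clause of the same SELECT that computes it.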

1 Answer

You can select all the columns and use a subquery in the WHERE clause to keep only the rows whose (Column_A, Column_B) pair occurs more than once:

-- my_table is a placeholder for your table name (TABLE itself is a reserved word)
SELECT
    t.Column_A, t.Column_B, t.Column_C
FROM
    my_table t
WHERE
    (t.Column_A, t.Column_B) IN
         -- pairs that appear in more than one row
         (SELECT Column_A, Column_B
          FROM my_table
          GROUP BY Column_A, Column_B
          HAVING COUNT(*) > 1);

1 Comment

Word of advice: not every SQL-based RDBMS supports this syntax.
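
Where the `(a, b) IN (...)` form is not accepted, one possible workaround is to join the table against the grouped subquery instead. The sketch below is an assumption-laden illustration, not a tested recipe for any particular RDBMS: the table name `t` and the alias `dup` are hypothetical, and Python's built-in `sqlite3` module is used only so that the SQL can be run against the question's sample data.

```python
import sqlite3

# Hypothetical table name `t`; rows are the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (Column_A TEXT, Column_B TEXT, Column_C TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("ABC", "DEF", "GHI"),
        ("ABC", "DEF", "JKL"),
        ("DEF", "GHI", "GHI"),
        ("DEF", "GHI", "JKL"),
        ("ABC", "GHI", "GHI"),
    ],
)

# The derived table `dup` holds each (Column_A, Column_B) pair that occurs
# more than once; joining back to `t` returns every row carrying such a pair.
JOIN_QUERY = """
SELECT t.Column_A, t.Column_B, t.Column_C
FROM t
JOIN (
    SELECT Column_A, Column_B
    FROM t
    GROUP BY Column_A, Column_B
    HAVING COUNT(*) > 1
) dup
  ON dup.Column_A = t.Column_A
 AND dup.Column_B = t.Column_B
ORDER BY t.Column_A, t.Column_B, t.Column_C
"""

join_rows = conn.execute(JOIN_QUERY).fetchall()
for row in join_rows:
    print(row)
```

One caveat: `=` never matches NULL, so rows where Column_A or Column_B is NULL are not returned by this form, even if several rows share the same NULL pair.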
