MySQL: Exclude duplicate data

Question

In one of my columns there are duplicates, and I want to grab only the first occurrence. How can I do that? In the example, I want one row for each unique value in col C, so I only want hello ladies, hello team, hello cats, and hello sexy.

Example Table
---------------

column A | col B | col C 
--------------------------
hello    | ladies| 1
hello    | guys  | 1
hello    | team  | 2
hello    | dogs  | 2
hello    | cats  | 3
hello    | cats  | 3
hello    | sexy  | 4

arnehehe · Accepted Answer · 2011-04-01 06:52:35Z

4

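Use the DISTINCT keyword.

SELECT DISTINCT colc, cola, colb FROM table

DISTINCT is a keyword, not a function, and it applies to the whole select list, so this returns only the unique combinations of (colc, cola, colb). The parentheses in distinct(colc) have no effect.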


Will · Accepted Answer · 2011-04-01 08:21:00Z

3

The DISTINCT keyword is not applicable in your case.
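To see why, here is a minimal sketch that loads the example table and runs a DISTINCT query over all three columns. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table name t and column names cola, colb, colc are assumptions made for the sketch.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (cola TEXT, colb TEXT, colc INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("hello", "ladies", 1),
        ("hello", "guys", 1),
        ("hello", "team", 2),
        ("hello", "dogs", 2),
        ("hello", "cats", 3),
        ("hello", "cats", 3),
        ("hello", "sexy", 4),
    ],
)

# DISTINCT compares whole result rows, not just colc.
distinct_rows = conn.execute(
    "SELECT DISTINCT colc, cola, colb FROM t"
).fetchall()

# Only the exact duplicate (hello, cats, 3) collapses: 7 rows in, 6 rows out.
print(len(distinct_rows))
```

The rows with colc 1 and 2 all survive, because their colb values differ.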

Rows in a table have no inherent order, so "first occurrence" is not well defined unless you sort on something. You can, however, select just one B value for each unique C value by using an aggregate function that works with strings. MAX is such a function, if the 'maximum' (the value that sorts last) is an acceptable choice:

mysql> select A,max(B),C from Test group by C,A;
+-------+--------+------+
| A     | max(B) | C    |
+-------+--------+------+
| hello | ladies |    1 |
| hello | team   |    2 |
| hello | cats   |    3 |
| hello | sexy   |    4 |
+-------+--------+------+
4 rows in set (0.00 sec)
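The same query can be tried without a MySQL server. The sketch below is only an approximation: it uses Python's built-in sqlite3 module, the lowercase names t, cola, colb, colc are assumptions, and the ORDER BY is added just to make the output order predictable.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (cola TEXT, colb TEXT, colc INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("hello", "ladies", 1),
        ("hello", "guys", 1),
        ("hello", "team", 2),
        ("hello", "dogs", 2),
        ("hello", "cats", 3),
        ("hello", "cats", 3),
        ("hello", "sexy", 4),
    ],
)

# One row per (colc, cola) group; MAX picks the colb that sorts last.
grouped = conn.execute(
    "SELECT cola, MAX(colb), colc FROM t GROUP BY colc, cola ORDER BY colc"
).fetchall()

for row in grouped:
    print(row)
```

Note that the result matches the rows asked for only because 'ladies' and 'team' happen to sort after 'guys' and 'dogs'. MAX does not know which row was inserted first.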


1 Comment

Strawberry Over a year ago

Wait, how would this work if I wanted to select all the columns? Wouldn't this mean I have to type out each one instead of using *? I tried this: SELECT DISTINCT col C,* FROM table

Isotopp · Accepted Answer · 2011-04-01 08:35:40Z

With LIMIT you can fetch just one row when several rows match. And if rows are completely identical, there is no way to distinguish them at all, so it does not matter which one you get.

SELECT * FROM t WHERE colc=3 LIMIT 1

Sometimes you want a report of how many rows share each value:

SELECT colc, COUNT(*) AS cnt FROM t GROUP BY colc

A GROUP BY clause looks at the fields that you name (here: colc) and considers all rows with the same colc value identical. It makes heaps for each colc value, so all colc=1 go onto one heap, colc=2 onto another and so on. The COUNT() aggregate function measures the height of these heaps.

A HAVING clause is a WHERE-like condition applied after the GROUP BY. We can use it to pick the values that are unique (cnt = 1) or duplicated (cnt > 1):

-- list all unique rows
SELECT colc, COUNT(*) AS cnt FROM t GROUP BY colc HAVING cnt = 1

You can make the actual contents of the heaps visible:

SELECT colc, COUNT(*) as cnt, GROUP_CONCAT(colb) AS content FROM t GROUP BY colc HAVING cnt > 1
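The two HAVING queries can be sketched end to end as follows. This is an approximation run against SQLite through Python's sqlite3 module rather than MySQL, the sample data is copied from the question, and the ORDER BY is an addition to keep the output order stable. The order of items inside GROUP_CONCAT is not guaranteed.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (cola TEXT, colb TEXT, colc INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("hello", "ladies", 1),
        ("hello", "guys", 1),
        ("hello", "team", 2),
        ("hello", "dogs", 2),
        ("hello", "cats", 3),
        ("hello", "cats", 3),
        ("hello", "sexy", 4),
    ],
)

# colc values that occur exactly once.
unique_values = conn.execute(
    "SELECT colc, COUNT(*) AS cnt FROM t GROUP BY colc HAVING cnt = 1"
).fetchall()

# colc values that occur more than once, with the colb values in each heap.
duplicate_values = conn.execute(
    "SELECT colc, COUNT(*) AS cnt, GROUP_CONCAT(colb) AS content "
    "FROM t GROUP BY colc HAVING cnt > 1 ORDER BY colc"
).fetchall()

print(unique_values)
for colc, cnt, content in duplicate_values:
    print(colc, cnt, content)
```

With the example data, only colc=4 is unique. The values 1, 2, and 3 each form a heap of two rows.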

It is possible to delete all but one copy of duplicate rows using the MySQL extension of LIMIT with DELETE:

DELETE FROM t WHERE colc=3 LIMIT 1

This matches ALL rows with colc=3, but deletes only one of them because of the LIMIT. With two copies that leaves exactly one. For n copies, use LIMIT n-1.
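A rough way to experiment with this without MySQL is shown below. It is not the same statement: the SQLite bundled with Python usually lacks DELETE ... LIMIT, so this sketch swaps in a rowid subquery that removes one matching row. The table layout is an assumption taken from the question.

```python
import sqlite3

# In-memory SQLite stands in for MySQL here (an assumption for this sketch).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (cola TEXT, colb TEXT, colc INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("hello", "ladies", 1),
        ("hello", "guys", 1),
        ("hello", "team", 2),
        ("hello", "dogs", 2),
        ("hello", "cats", 3),
        ("hello", "cats", 3),
        ("hello", "sexy", 4),
    ],
)

# SQLite substitute for MySQL's "DELETE FROM t WHERE colc=3 LIMIT 1":
# pick one matching rowid and delete only that row.
conn.execute(
    "DELETE FROM t WHERE rowid IN "
    "(SELECT rowid FROM t WHERE colc = 3 LIMIT 1)"
)

remaining = conn.execute(
    "SELECT COUNT(*) FROM t WHERE colc = 3"
).fetchone()[0]
total = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]

print(remaining, total)
```

One of the two identical colc=3 rows is gone and the rest of the table is untouched.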
