One of my columns contains duplicates, and I want to grab only the first occurrence of each value. How can I do that? In the example, I want one row for each distinct value in col C, so I only want hello ladies, hello team, hello cats, and hello sexy.

Example Table
---------------

column A | col B | col C 
--------------------------
hello    | ladies| 1
hello    | guys  | 1
hello    | team  | 2
hello    | dogs  | 2
hello    | cats  | 3
hello    | cats  | 3
hello    | sexy  | 4

3 Answers

4

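Use the DISTINCT keyword.

select distinct colc, cola, colb from table 

This selects only the unique rows. Note that DISTINCT is not a function: it applies to the whole select list, not just the first column, so rows that share a colc value but differ in colb are all returned.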
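To see what this query returns on the example data, here is a minimal sketch. The table name `example_table` and the column types are assumptions for illustration (the answer uses the placeholder `table`, which is a reserved word in MySQL); only the rows come from the question.

```sql
-- Hypothetical table name and column types; the data is from the question.
CREATE TABLE example_table (
    cola VARCHAR(20),
    colb VARCHAR(20),
    colc INT
);

INSERT INTO example_table (cola, colb, colc) VALUES
    ('hello', 'ladies', 1),
    ('hello', 'guys',   1),
    ('hello', 'team',   2),
    ('hello', 'dogs',   2),
    ('hello', 'cats',   3),
    ('hello', 'cats',   3),
    ('hello', 'sexy',   4);

-- DISTINCT compares the whole select list (colc, cola, colb),
-- so only the two fully identical 'cats' rows collapse into one.
SELECT DISTINCT colc, cola, colb
FROM example_table;
-- Returns 6 rows: both colc = 1 rows and both colc = 2 rows are kept.
```

So on this data DISTINCT alone removes the exact duplicate but does not reduce the result to one row per colc value.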

3

The DISTINCT keyword is not applicable in your case.

In a database, rows have no inherent order, so "first occurrence" is not well defined unless a column records the order. You can, however, select just one B value for each unique C value by using an aggregate function that works with strings. MAX is such a function, if the 'maximum' of the strings is an acceptable choice:

mysql> select A,max(B),C from Test group by C,A;
+-------+--------+------+
| A     | max(B) | C    |
+-------+--------+------+
| hello | ladies |    1 |
| hello | team   |    2 |
| hello | cats   |    3 |
| hello | sexy   |    4 |
+-------+--------+------+
4 rows in set (0.00 sec)
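If the table had a column that records insertion order, the "first" row per C value could be selected explicitly. The sketch below assumes a hypothetical auto-increment `id` column, which is not part of the original table, and shows two possible approaches: a window function (needs MySQL 8.0 or later) and a join against the smallest `id` per group (also works on older versions, and allows `SELECT t.*` without listing every column).

```sql
-- Hypothetical schema: `id` is an assumed column that records insertion order.
CREATE TABLE Test (
    id INT AUTO_INCREMENT PRIMARY KEY,
    A  VARCHAR(20),
    B  VARCHAR(20),
    C  INT
);

INSERT INTO Test (A, B, C) VALUES
    ('hello', 'ladies', 1),
    ('hello', 'guys',   1),
    ('hello', 'team',   2),
    ('hello', 'dogs',   2),
    ('hello', 'cats',   3),
    ('hello', 'cats',   3),
    ('hello', 'sexy',   4);

-- Approach 1 (MySQL 8.0+): number the rows inside each C group by id
-- and keep only the first one.
SELECT A, B, C
FROM (
    SELECT A, B, C,
           ROW_NUMBER() OVER (PARTITION BY C ORDER BY id) AS rn
    FROM Test
) AS ranked
WHERE rn = 1;

-- Approach 2: find the smallest id per C value, then join back
-- to fetch every column of those rows.
SELECT t.*
FROM Test AS t
JOIN (
    SELECT C, MIN(id) AS first_id
    FROM Test
    GROUP BY C
) AS firsts
  ON t.id = firsts.first_id;
```

With the sample rows inserted in the order shown, both queries keep the ladies, team, cats, and sexy rows.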

1 Comment

Wait, how would this work if I wanted to select all the columns? Wouldn't this mean I have to type out each one instead of using *? I tried this: SELECT DISTINCT col C,* FROM table

1

With LIMIT you can get just one row when several rows match. Also, if rows are completely identical, there is no way to distinguish them at all, so it does not matter which one you get.

SELECT * FROM t WHERE colc=3 LIMIT 1

Sometimes you want a report of how often each value occurs, to see which ones are duplicated:

SELECT colc, COUNT(*) AS cnt FROM t GROUP BY colc

A GROUP BY clause looks at the fields that you name (here: colc) and considers all rows with the same colc value identical. It makes heaps for each colc value, so all colc=1 go onto one heap, colc=2 onto another and so on. The COUNT() aggregate function measures the height of these heaps.

A HAVING clause is a WHERE-like condition applied after the GROUP BY. We can use that to choose colc values that are unique or that are duplicated, by asking for cnt to be 1 or larger than 1:

-- list all colc values that occur exactly once
SELECT colc, COUNT(*) AS cnt FROM t GROUP BY colc HAVING cnt = 1

You can make the actual contents of the heaps visible:

SELECT colc, COUNT(*) as cnt, GROUP_CONCAT(colb) AS content FROM t GROUP BY colc HAVING cnt > 1
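As a worked example, the sketch below runs these queries against the data from the question. The column types are assumptions, and `ORDER BY` clauses are added (they are not in the queries above) so that the output shown in the comments is deterministic.

```sql
-- Column types are assumed; the rows are the example data from the question.
CREATE TABLE t (
    cola VARCHAR(20),
    colb VARCHAR(20),
    colc INT
);

INSERT INTO t (cola, colb, colc) VALUES
    ('hello', 'ladies', 1),
    ('hello', 'guys',   1),
    ('hello', 'team',   2),
    ('hello', 'dogs',   2),
    ('hello', 'cats',   3),
    ('hello', 'cats',   3),
    ('hello', 'sexy',   4);

-- colc values that occur exactly once.
SELECT colc, COUNT(*) AS cnt
FROM t
GROUP BY colc
HAVING cnt = 1;
-- colc | cnt
--    4 |   1

-- colc values that occur more than once, with the colb values in each heap.
SELECT colc,
       COUNT(*) AS cnt,
       GROUP_CONCAT(colb ORDER BY colb) AS content
FROM t
GROUP BY colc
HAVING cnt > 1
ORDER BY colc;
-- colc | cnt | content
--    1 |   2 | guys,ladies
--    2 |   2 | dogs,team
--    3 |   2 | cats,cats
```

Without the `ORDER BY` inside `GROUP_CONCAT`, the order of the values within `content` is not guaranteed.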

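It is possible to delete surplus copies of duplicate rows using the MySQL extension of LIMIT with DELETE. When there are exactly two copies, LIMIT 1 deletes all but one:

DELETE FROM t WHERE colc=3 LIMIT 1

This matches ALL rows with colc=3, but deletes only one of them due to the LIMIT. With n matching rows, LIMIT n-1 would leave exactly one. Without an ORDER BY, which of the matching rows gets deleted is unspecified.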
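When many colc values have duplicates, running one DELETE per value becomes tedious. A possible alternative, sketched below, uses MySQL's multi-table DELETE with a self-join. It assumes a hypothetical unique `id` column that is not part of the original table, and it keeps the row with the lowest `id` for each colc value.

```sql
-- Hypothetical schema: `id` is an assumed unique column used to tell copies apart.
CREATE TABLE t (
    id   INT AUTO_INCREMENT PRIMARY KEY,
    cola VARCHAR(20),
    colb VARCHAR(20),
    colc INT
);

INSERT INTO t (cola, colb, colc) VALUES
    ('hello', 'ladies', 1),
    ('hello', 'guys',   1),
    ('hello', 'team',   2),
    ('hello', 'dogs',   2),
    ('hello', 'cats',   3),
    ('hello', 'cats',   3),
    ('hello', 'sexy',   4);

-- Delete every row for which another row with the same colc
-- and a smaller id exists, so only the lowest id per colc survives.
DELETE t1
FROM t AS t1
JOIN t AS t2
  ON t1.colc = t2.colc
 AND t1.id > t2.id;

-- Remaining rows: ladies (1), team (2), cats (3), sexy (4).
SELECT cola, colb, colc
FROM t
ORDER BY id;
```

Note that this treats rows as duplicates when they share a colc value, even if colb differs, which matches what the question asks for. Running the inner join as a SELECT first is a cheap way to check what would be deleted.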
