7

I have table a with two fields: id (PK) and f.

Consider following records:

id | f
1  | NULL
2  | 'foo'
3  | 'bar'
4  | NULL
5  | 'foo'
6  | 'baz'

I want to retrieve and count all the records having distinct f values including every record WHERE f IS NULL. Given this criteria, the query should return every record except #5, because the same value is already included in the set, and the total count would be 5.

The query I'm using to retrieve all records looks like this:

SELECT CASE WHEN EXISTS (SELECT id FROM a a2 WHERE a2.f = a.f AND a.id < a2.id) THEN 1 END AS not_distinct FROM a HAVING not_distinct IS NULL

If this query could be improved, I'd welcome any feedback. Anyway, the main problem is counting. Obviously adding a COUNT(*) will not help here and I'm totally lost how to count the records after the filtering.

3
  • Can you please provide sample data you wish to get as o/p? Commented Nov 28, 2012 at 12:51
  • @AjithSasidharan: The question is already clear on that: the query should return every record except #5, because the same value is already included in the set, and the total count would be 5. Commented Nov 28, 2012 at 13:14
  • Yes go the requirement now. Commented Nov 28, 2012 at 13:18

3 Answers 3

16

There's a pretty simple approach that might work for you:

select count(distinct ifnull(f, id))
from a

Note that this query assumes that f values are never id values, and based on sample data and experience this is reasonable.

Edited:

I thought about it and there's an even simpler approach:

select count(distinct f) + sum(f is null) from a;

which you can see running on sqlfiddle

This works because distinct throws away nulls, and sum(condition) counts the number of times condition is true because in mysql true is 1 and false is 0.

Sign up to request clarification or add additional context in comments.

2 Comments

Although your solution generally is better performance-wise, I chose Erwin's answer since it suits my needs better in current situation. Thanks anyway!
@package See edited answer for another pretty simple approach that will perform well too, but is less "smarty pants". (btw, the answer you selected will always suck performance wise).
4

Use NOT EXISTS in a WHERE clause:

SELECT count(*)
FROM   a
WHERE  NOT EXISTS (SELECT * FROM a a2 WHERE a2.f = a.f AND a2.id < a.id);

This way you can also get actual rows - if you need more than the bare count:

SELECT *
FROM   a
WHERE  NOT EXISTS (SELECT * FROM a a2 WHERE a2.f = a.f AND a2.id < a.id)

The = operator makes sure that all rows with f IS NULL are included. You had that in your query already.

-> sqlfiddle

Neither of these would work:

SELECT DISTINCT f FROM a;

SELECT * FROM a GROUP BY f;

.. because both would also fold NULL values, and you want

every record WHERE f IS NULL.

2 Comments

This solution seems so obvious I wonder why I haven't figured it out myself. Thanks a lot, works beautifully.
That's the thing with solutions: most seem so obvious once you have them. :)
0
    UPDATE comments SET view =     
                              (SELECT COUNT(DISTINCT browser, ip, ifnull(user_id, 0)) 
                              FROM counters WHERE item_type=comments.item_type )
    WHERE id=1

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.