0

Suppose database.tbl contains two fields: fruit and price.

  • Row 1: fruit='apple', price=0.60.
  • Row 2: fruit='pear', price=0.60.
  • Row 3: fruit='peach', price=0.50.

I want to run a query which returns the count of distinct values for each field, i.e. it should return

  • fruit 3
  • price 2

because there are 3 fruits, but only 2 distinct prices.

I've tried

SELECT C.COLUMN_NAME, COUNT(DISTINCT C.COLUMN_NAME) FROM tbl JOIN (SELECT COLUMN_NAME FROM INFORMATION_SCHEMA.COLUMNS  WHERE TABLE_SCHEMA='database' AND TABLE_NAME='tbl') AS C GROUP BY C.COLUMN_NAME;

but this returns:

  • fruit 1
  • price 1
4
  • It's only returning 1 because you're counting the column name, not the data in the column. Does this have to work without knowing what columns you want the count for? Commented Feb 9, 2015 at 20:12
  • Thanks. Yes, it has to work without knowing the columns in advance. All it knows is that it must count for all columns. Commented Feb 9, 2015 at 20:14
  • Is it acceptable to use a stored procedure to generate the results? Commented Feb 9, 2015 at 20:21
  • Yes, a stored procedure would be fine. Commented Feb 9, 2015 at 20:27

1 Answer 1

1

The only way I know to do it is to take the column names from the INFORMATION_SCHEMA query and put them in a temp table. Then iterate over the column names in the temp table and execute dynamic sql to query the distinct values in those columns. Here is an example in T-SQL, so you'll have to make some changes.

CREATE TABLE #values
(
    COLUMN_NAME VARCHAR(100),
    DISTINCT_COUNT INT
)

DECLARE @column AS VARCHAR(100) = ''

SELECT COLUMN_NAME INTO #columns FROM INFORMATION_SCHEMA.COLUMNS  WHERE TABLE_SCHEMA='dbo' AND TABLE_NAME='tbl'

SELECT  TOP 1@column = COLUMN_NAME
FROM    #columns
WHERE   COLUMN_NAME > @column
ORDER BY COLUMN_NAME ASC

WHILE (@@ROWCOUNT > 0)
BEGIN
    DECLARE @sql AS NVARCHAR(1000) = N'INSERT INTO #values (COLUMN_NAME, DISTINCT_COUNT) SELECT ''' + @column + ''', COUNT(DISTINCT ' + @column + N') FROM [db].[dbo].[tbl]'
    EXECUTE sp_executesql @sql

    SELECT  TOP 1 @column = COLUMN_NAME
    FROM    #columns
    WHERE   COLUMN_NAME > @column
    ORDER BY COLUMN_NAME ASC
END

SELECT * FROM #values

DROP TABLE #columns
DROP TABLE #values

This is pretty rough and just an example of how it can be done. If this were production code, I'd make sure the variable types are all consistent and make sure the column names are safe. This will help: How To have Dynamic SQL in MySQL Stored Procedure

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.