MySQL COUNT of distinct values from two independent columns of a table

Question

I want to get the COUNT of distinct values from two independent columns of a table.

My table is:

ID     CR PB      DB CB    
-----------------------------
1      1000       1000
2     60000       1000
3      1000     (NULL)
4   1500000      13000
5     60000      12000
6      1000     (NULL)

expected output:

CR PB    cnt_crpb   DB CB    cnt_dbcb
1000       3        1000        2
60000      2        13000       1
1500000    1        12000       1

I tried splitting the CR PB and DB CB columns into two different tables and joining them with LEFT JOIN, but that does not give the expected output, because MySQL does not support FULL OUTER JOIN.

I also tried UNION, but that returns the results as separate rows rather than side by side.

Any help will be appreciated...

Thank you.

How would you relate CR PB and DB CB so they appear in the same row?
– Barranka, Commented Oct 6, 2014 at 15:10
@Barranka: it looks like OP is not expecting them to appear on the same row (which is what we'd expect); it appears that the rows are correlated on ascending values of cnt_crpb and cnt_dbcb (highest value correlated with highest value, next highest with next highest), which is a rather bizarre result. It's possible to return a result like this, but the SQL is way more involved. The normative approach would be to return the counts as separate rowsets.
– spencer7593, Commented Oct 6, 2014 at 15:22

Gordon Linoff · Accepted Answer · 2014-10-07 10:59:38Z

3

I think you need to do this using union all:

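-- Number the distinct values of each column separately, stack the two
-- lists with UNION ALL, then collapse the rows that share a row number.
-- test_table is a placeholder for the real table name (TABLE is a reserved word);
-- CRPB and DBCB stand for the question's `CR PB` and `DB CB` columns.
select max(CRPB) as CRPB, max(CRPB_cnt) as CRPB_cnt, max(DBCB) as DBCB, max(DBCB_cnt) as DBCB_cnt
from ((select (@rn1 := @rn1 + 1) as rn, CRPB, count(CRPB) as CRPB_cnt, NULL as DBCB, NULL as DBCB_cnt
       from test_table t cross join
            (select @rn1 := 0) as vars
       group by CRPB
      ) union all
      (select (@rn2 := @rn2 + 1) as rn, NULL, NULL, DBCB, count(DBCB) as DBCB_cnt
       from test_table t cross join
            (select @rn2 := 0) as vars
       group by DBCB
      )
     ) x
group by rn;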

This returns every row from both lists, regardless of which list is longer.
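The UNION ALL plus GROUP BY rn idea can be tried without a MySQL server. The following is only a sketch under several assumptions: it uses Python's standard-library sqlite3 module as a stand-in for MySQL, it swaps the user variables for ROW_NUMBER() (which would need MySQL 8.0 or later if the SQL were ported back), it filters out NULLs, and it numbers the distinct values in ascending order, which the question does not specify. The names test_table, crpb, dbcb and paired_counts are hypothetical.

```python
import sqlite3

# Sample rows from the question as (CR PB, DB CB); None stands for NULL.
SAMPLE = [(1000, 1000), (60000, 1000), (1000, None),
          (1500000, 13000), (60000, 12000), (1000, None)]

# Each branch counts one column and numbers its distinct values.
# The outer GROUP BY rn folds the two stacked lists into one row per number.
PAIRED_COUNTS_SQL = """
SELECT MAX(crpb) AS crpb, MAX(crpb_cnt) AS crpb_cnt,
       MAX(dbcb) AS dbcb, MAX(dbcb_cnt) AS dbcb_cnt
FROM (
    SELECT ROW_NUMBER() OVER (ORDER BY crpb) AS rn,
           crpb, COUNT(*) AS crpb_cnt, NULL AS dbcb, NULL AS dbcb_cnt
    FROM test_table
    WHERE crpb IS NOT NULL
    GROUP BY crpb
    UNION ALL
    SELECT ROW_NUMBER() OVER (ORDER BY dbcb) AS rn,
           NULL, NULL, dbcb, COUNT(*)
    FROM test_table
    WHERE dbcb IS NOT NULL
    GROUP BY dbcb
) x
GROUP BY rn
ORDER BY rn
"""


def paired_counts(rows):
    """Load (crpb, dbcb) pairs into an in-memory table and pair up the counts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE test_table (id INTEGER PRIMARY KEY, crpb INTEGER, dbcb INTEGER)"
    )
    conn.executemany("INSERT INTO test_table (crpb, dbcb) VALUES (?, ?)", rows)
    result = conn.execute(PAIRED_COUNTS_SQL).fetchall()
    conn.close()
    return result
```

Calling paired_counts(SAMPLE) gives three rows, one per row number. When one column has more distinct values than the other, the shorter side is padded with NULL (None in Python) instead of rows being dropped.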

edited Oct 7, 2014 at 10:59

answered Oct 6, 2014 at 15:30

Gordon Linoff


5 Comments

spencer7593 Over a year ago

This is similar to the approach I would use. Mine was a little more complicated: I assumed the specification was to "correlate" the counts by ordering them in descending value (highest count for CRPB with highest count for DBCB, next highest with next highest). I got the counts first, then assigned the rownum, and then did the UNION ALL of those results.

spencer7593 Over a year ago

(that second group by should be group by DBCB)

John Ruddell Over a year ago

The only other issue is that this does not handle NULL values, so it returns an additional row. See HERE in case you want to remove the NULLs first :) +1 from me though, nice answer!

tejas033 Over a year ago

Thanks a lot, it worked as per my expectations... Instead of using COUNT(*) I used COUNT(CRPB) and COUNT(DBCB), which reports the NULL group with a count of 0... Will this change cause any runtime issues? With our sample data no issue has occurred so far.

Gordon Linoff Over a year ago

@tejas033 . . . Using count(*) versus count(<column name>) is pretty similar in this case, because you are grouping by the column name.
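The one place where the two forms differ is the NULL group. This can be checked with a small sketch, again using Python's sqlite3 module as a stand-in for MySQL (an assumption; both count NULLs the same way here). The names test_table, dbcb and grouped_counts are hypothetical.

```python
import sqlite3


def grouped_counts(values):
    """Return (value, COUNT(*), COUNT(column)) for each group of a nullable column."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE test_table (dbcb INTEGER)")
    conn.executemany(
        "INSERT INTO test_table (dbcb) VALUES (?)", [(v,) for v in values]
    )
    # COUNT(*) counts rows in the group; COUNT(dbcb) skips rows where dbcb IS NULL.
    rows = conn.execute(
        "SELECT dbcb, COUNT(*) AS count_star, COUNT(dbcb) AS count_col "
        "FROM test_table GROUP BY dbcb ORDER BY dbcb"
    ).fetchall()
    conn.close()
    return rows


# DB CB values from the question; None stands for NULL.
DB_CB = [1000, 1000, None, 13000, 12000, None]
```

For grouped_counts(DB_CB), every non-NULL group reports the same number in both count columns, while the NULL group reports 2 for COUNT(*) and 0 for COUNT(dbcb).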

John Ruddell · Accepted Answer · 2014-10-06 15:37:58Z

1

Note that you need to determine which column produces more distinct values, CR PB or DB CB. Whichever produces more rows should be the first SELECT, and the other is then LEFT JOINed to it. This matters whenever the two columns yield a different number of results.

SELECT `CR PB`, cnt_crpb, `DB CB`, cnt_dbcb
FROM
(   SELECT `CR PB`, COUNT(*) as cnt_crpb, @a := @a + 1 as num_rows_a
    FROM test_table
    CROSS JOIN (SELECT @a := 0 ) temp
    WHERE `CR PB` is not null
    GROUP BY `CR PB`
)t
LEFT JOIN
(   SELECT `DB CB`, COUNT(*) as cnt_dbcb, @b := @b + 1 as num_rows_b
    FROM test_table
    CROSS JOIN (SELECT @b := 0)temp1
    WHERE `DB CB` is not null
    GROUP BY `DB CB`
)t1 ON t1.num_rows_b = t.num_rows_a;

Fiddle Demo

edited Oct 6, 2014 at 15:37

answered Oct 6, 2014 at 15:22

John Ruddell


8 Comments

Gordon Linoff Over a year ago

With a left join, you will lose results if the second list is longer than the first.
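The row loss described in this comment can be reproduced with a small sketch. It assumes Python's sqlite3 module as a stand-in for MySQL and uses ROW_NUMBER() in place of the @a and @b user variables (MySQL 8.0 or later if ported back); test_table, crpb, dbcb and left_join_counts are hypothetical names.

```python
import sqlite3

# Same shape as the LEFT JOIN answer: number each list, then join on the number.
LEFT_JOIN_SQL = """
SELECT a.crpb, a.cnt_crpb, b.dbcb, b.cnt_dbcb
FROM (
    SELECT crpb, COUNT(*) AS cnt_crpb,
           ROW_NUMBER() OVER (ORDER BY crpb) AS rn
    FROM test_table
    WHERE crpb IS NOT NULL
    GROUP BY crpb
) a
LEFT JOIN (
    SELECT dbcb, COUNT(*) AS cnt_dbcb,
           ROW_NUMBER() OVER (ORDER BY dbcb) AS rn
    FROM test_table
    WHERE dbcb IS NOT NULL
    GROUP BY dbcb
) b ON b.rn = a.rn
ORDER BY a.rn
"""


def left_join_counts(rows):
    """Pair the two count lists with a LEFT JOIN driven by the first column."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE test_table (id INTEGER PRIMARY KEY, crpb INTEGER, dbcb INTEGER)"
    )
    conn.executemany("INSERT INTO test_table (crpb, dbcb) VALUES (?, ?)", rows)
    result = conn.execute(LEFT_JOIN_SQL).fetchall()
    conn.close()
    return result


# One distinct crpb value but three distinct dbcb values.
LOPSIDED = [(1000, 1), (1000, 2), (1000, 3)]
```

With LOPSIDED, left_join_counts returns a single row: the counts for dbcb values 2 and 3 have no matching row number on the left side and are silently dropped.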

spencer7593 Over a year ago

This is an approach. Some issues: What if there are more "distinct" values of "DB CB" than there are of "distinct" values of "CR PB"? What if udv @a has a value of 3 when this statement is executed? (I think the specification is ambiguous, whether the rows are to be ordered by ascending values of CR PB, or ordered by descending value of count, descending value of DB CB.)

John Ruddell Over a year ago

True, the number of rows may be higher on one side than the other... Correlating two unrelated columns isn't easy, but that is something the OP will need to test on his side.

John Ruddell Over a year ago

@spencer7593 I think you should post your answer, as Gordon's doesn't return the correct count. See Here

spencer7593 Over a year ago

I noticed that the second GROUP BY in Gordon's answer should be GROUP BY DBCB rather than GROUP BY CRPB. As written, the second set of counts returns the same number of rows and the same count values as the first query, which is not what OP specified.
