I want to get the count of each distinct value in two independent columns of a table.

My table is:

ID   CR PB     DB CB
----------------------
1       1000    1000
2      60000    1000
3       1000  (NULL)
4    1500000   13000
5      60000   12000
6       1000  (NULL)
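
For anyone reproducing this, a minimal setup sketch follows. The table name test_table and the INT column types are assumptions; only the column names and values come from the table above.

```sql
-- Assumed table name and column types; adjust to your schema.
CREATE TABLE test_table (
    ID      INT PRIMARY KEY,
    `CR PB` INT,
    `DB CB` INT
);

-- Sample rows copied from the table in the question.
INSERT INTO test_table (ID, `CR PB`, `DB CB`) VALUES
    (1,    1000,  1000),
    (2,   60000,  1000),
    (3,    1000,  NULL),
    (4, 1500000, 13000),
    (5,   60000, 12000),
    (6,    1000,  NULL);
```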

Expected output:

CR PB     cnt_crpb   DB CB   cnt_dbcb
1000      3          1000    2
60000     2          13000   1
1500000   1          12000   1

I have tried separating the two columns, CR PB and DB CB, into two different tables and joining them with LEFT JOIN, but that does not give the expected output, as MySQL does not support FULL OUTER JOIN.

I have also tried using UNION, but that returns the results as separate rows.

Any help will be appreciated...

Thank you.

  • How would you relate CR PB and DB CB so they appear in the same row? Commented Oct 6, 2014 at 15:10
  • @Barranka: it looks like OP is not expecting them to appear on the same row (which is what we'd expect); it appears that the rows are correlated on ascending values of cnt_crpb and cnt_dbcb (highest value correlated with highest value, next highest with next highest), which is a rather bizarre result. It's possible to return a result like this, but the SQL is way more involved. The normative approach would be to return the counts as separate rowsets. Commented Oct 6, 2014 at 15:22
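
The "separate rowsets" approach mentioned in the comment above could look roughly like this. It is only a sketch: test_table is an assumed table name, and the WHERE clauses that drop NULLs are an assumption based on the expected output, which lists no NULL row.

```sql
-- test_table is an assumed name; substitute the real table.

-- Rowset 1: one row per distinct `CR PB` value with its count.
SELECT `CR PB`, COUNT(*) AS cnt_crpb
FROM test_table
WHERE `CR PB` IS NOT NULL
GROUP BY `CR PB`
ORDER BY cnt_crpb DESC;

-- Rowset 2: one row per distinct `DB CB` value with its count.
SELECT `DB CB`, COUNT(*) AS cnt_dbcb
FROM test_table
WHERE `DB CB` IS NOT NULL
GROUP BY `DB CB`
ORDER BY cnt_dbcb DESC;
```

The application can then display the two result sets side by side, without having to invent a relationship between the two columns in SQL.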

2 Answers

3

I think you need to do this using union all:

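-- `table` is a placeholder for the actual table name
select max(CRPB) as CRPB, max(CRPB_cnt) as CRPB_cnt, max(DBCB) as DBCB, max(DBCB_cnt) as DBCB_cnt
from (
      -- number each distinct CRPB value and count its occurrences
      (select (@rn1 := @rn1 + 1) as rn, CRPB, count(CRPB) as CRPB_cnt, NULL as DBCB, NULL as DBCB_cnt
       from table t cross join
            (select @rn1 := 0) as vars
       group by CRPB
      ) union all
      -- do the same for DBCB, filling the other pair of columns
      (select (@rn2 := @rn2 + 1) as rn, NULL, NULL, DBCB, count(DBCB) as DBCB_cnt
       from table t cross join
            (select @rn2 := 0) as vars
       group by DBCB
      )
     ) x
-- collapse the rows that share a row number into one output row
group by rn;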

This returns every row from both lists, regardless of which list is longer.

5 Comments

This is similar to the approach I would use. (Mine was a little more complicated: I assumed the specification was to "correlate" the counts by ordering them in descending value, i.e. highest count for CRPB with highest count for DBCB, next highest with next highest.) I got the counts first, then assigned the rownum, and then did the UNION ALL of those results.
(that second group by should be group by DBCB)
The only other issue is that this does not handle NULL values, so it returns an additional row (see HERE), in case you want to remove the NULLs first :) +1 from me though, nice answer!
Thanks a lot, it worked as per my expectations... Instead of using COUNT(*) I used COUNT(CRPB) and COUNT(DBCB), which resulted in the NULL occurrence being counted as 0... Will this change cause any runtime issues? As per our sample data, no issue has occurred so far.
@tejas033 . . . Using count(*) versus count(<column name>) is pretty similar in this case, because you are grouping by the column name.
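
To make the difference between the two forms concrete, here is a small sketch run against the sample data from the question. The table name test_table is an assumption.

```sql
-- test_table is an assumed name; substitute the real table.
SELECT `DB CB`,
       COUNT(*)       AS cnt_star,  -- counts every row in the group
       COUNT(`DB CB`) AS cnt_col    -- counts only non-NULL values
FROM test_table
GROUP BY `DB CB`;
-- For the sample data, the NULL group has cnt_star = 2 and cnt_col = 0;
-- every non-NULL group has the same value in both columns.
```

So the two forms differ only for the NULL group, which matches the "NULL occurrence as 0" behaviour described in the comment above.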
1

Note that you need to determine which column produces more distinct values, CR PB or DB CB. Whichever produces more rows should be the first SELECT, and the other is then LEFT JOINed to it. This matters when the two columns produce an unequal number of results.

SELECT `CR PB`, cnt_crpb, `DB CB`, cnt_dbcb
FROM
(   SELECT `CR PB`, COUNT(*) as cnt_crpb, @a := @a + 1 as num_rows_a
    FROM test_table
    CROSS JOIN (SELECT @a := 0 ) temp
    WHERE `CR PB` is not null
    GROUP BY `CR PB`
)t
LEFT JOIN
(   SELECT `DB CB`, COUNT(*) as cnt_dbcb, @b := @b + 1 as num_rows_b
    FROM test_table
    CROSS JOIN (SELECT @b := 0)temp1
    WHERE `DB CB` is not null
    GROUP BY `DB CB`
)t1 ON t1.num_rows_b = t.num_rows_a;

8 Comments

With a left join, you will lose results if the second list is longer than the first.
This is an approach. Some issues: What if there are more "distinct" values of "DB CB" than there are of "distinct" values of "CR PB"? What if udv @a has a value of 3 when this statement is executed? (I think the specification is ambiguous, whether the rows are to be ordered by ascending values of CR PB, or ordered by descending value of count, descending value of DB CB.)
True, the number of rows may be greater on one side than the other... correlating two unrelated columns isn't easy, but that is something the OP will need to test on his side.
@spencer7593 I think you should post your answer, as Gordon's doesn't return the correct count. See Here
I noticed that the second GROUP BY in Gordon's answer should be group by DBCB rather than group by CRPB; that's going to cause the second set of counts to return the same number of rows and the same count values as the first query, which is not what OP specified.
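
For readers on MySQL 8.0 or later, the issues raised in these comments (rows lost when the second list is longer, leftover user-variable values, unspecified ordering) can be avoided by using ROW_NUMBER() instead of user variables. The following is a sketch of that swapped-in technique, not either answerer's query. The table name test_table is an assumption, and pairing rows by descending count, then descending value, is just one reading of the expected output.

```sql
-- Requires MySQL 8.0+ (CTEs and window functions).
-- test_table is an assumed name; substitute the real table.
WITH cr AS (
    -- One row per distinct `CR PB` value, numbered by descending count.
    SELECT `CR PB` AS val, COUNT(*) AS cnt,
           ROW_NUMBER() OVER (ORDER BY COUNT(*) DESC, `CR PB` DESC) AS rn
    FROM test_table
    WHERE `CR PB` IS NOT NULL
    GROUP BY `CR PB`
),
db AS (
    -- One row per distinct `DB CB` value, numbered the same way.
    SELECT `DB CB` AS val, COUNT(*) AS cnt,
           ROW_NUMBER() OVER (ORDER BY COUNT(*) DESC, `DB CB` DESC) AS rn
    FROM test_table
    WHERE `DB CB` IS NOT NULL
    GROUP BY `DB CB`
)
SELECT MAX(crpb)     AS `CR PB`,
       MAX(cnt_crpb) AS cnt_crpb,
       MAX(dbcb)     AS `DB CB`,
       MAX(cnt_dbcb) AS cnt_dbcb
FROM (
    SELECT rn, val AS crpb, cnt AS cnt_crpb, NULL AS dbcb, NULL AS cnt_dbcb FROM cr
    UNION ALL
    SELECT rn, NULL, NULL, val, cnt FROM db
) x
GROUP BY rn   -- merge the rows that share a row number
ORDER BY rn;
```

Because the two numbered lists are stacked with UNION ALL and then grouped by rn, no rows are dropped when either list is longer, which is the same idea as the UNION ALL answer above.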