0

I have currently written an SQL query like this:

SELECT   a, b, count(id) / ????
FROM     X
group by a, b

Note the ???? in the query above. The GROUP BY has two grouping columns, a and b.

What I intend to do is divide (1) the count of items in the second-level group (each value of b within a) by (2) the count of items in the first-level group (a).

count(id) accomplishes (1), but I don't know what to write for (2).

P.S.: I know it is possible to replace "????" with another, more complex SELECT query, but I want to learn whether there is a simpler way to aggregate the items of the first GROUP BY column.

Update: Sample data:

id       a           b
______________________
1       'A'         'G'
2       'A'         'H'
3       'A'         'H'
4       'B'         'G'
5       'B'         'G'
6       'B'         'K'
7       'B'         'K'

results:

a           b          [unnamed]
________________________________
'A'         'G'        0.33333
'A'         'H'        0.66667
'B'         'G'        0.5
'B'         'K'        0.5

The third column is the fraction of rows with a given value of b among all rows that share the same value of a.

Thank you.

  • Edit your question and provide sample data and desired results. What is the first group? What is the second group? Commented Nov 15, 2017 at 13:14
  • First, count(*), as I have learned, is not what you want: it selects all columns, and on large tables this can be a performance hit. You may want to switch to count(eventid), assuming that eventid exists and is your primary key. Commented Nov 15, 2017 at 13:22
  • Yes @Roland, count(eventid) is correct. However, as I tested, count(*) also works identically. I am editing the question to make it clearer. Commented Nov 15, 2017 at 13:23
  • Then one idea might be to use views (see CREATE VIEW) for each count and then select from (that is, "materialize") both views. But this could be a performance hit again, as these SELECTs and views can be "expensive" on large tables with many millions of rows. Maybe a better solution is to switch to a programmatic/SQL approach: first use your programming language's "count rows" method/function, then divide the two counts and make sure division by zero does not happen. Commented Nov 15, 2017 at 13:26
  • @AliJey Yes, sure, it produces the same result, as the primary key and all columns together are both unique and will give you the same row count. Commented Nov 15, 2017 at 13:27
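
The programmatic route suggested in the comments above can be sketched in plain Python. This is an illustration under assumptions, not code from the thread: the rows are the question's sample data copied by hand, and the helper name group_ratios is hypothetical. Because every (a, b) pair that is counted also contributes to the count for its a, the divisor is never zero here.

```python
from collections import Counter

# Sample rows from the question: (id, a, b)
SAMPLE_ROWS = [
    (1, "A", "G"), (2, "A", "H"), (3, "A", "H"),
    (4, "B", "G"), (5, "B", "G"), (6, "B", "K"), (7, "B", "K"),
]


def group_ratios(rows):
    """Return {(a, b): count(a, b) / count(a)} for the given rows."""
    rows = list(rows)
    per_a = Counter(a for _, a, _ in rows)         # rows per value of a
    per_ab = Counter((a, b) for _, a, b in rows)   # rows per (a, b) pair
    # per_a[a] >= 1 for every pair seen, so no division by zero occurs.
    return {(a, b): n / per_a[a] for (a, b), n in sorted(per_ab.items())}


if __name__ == "__main__":
    for (a, b), ratio in group_ratios(SAMPLE_ROWS).items():
        print(a, b, round(ratio, 5))
```

This pulls every row into the client, so for large tables it is likely to be slower than letting the database do the aggregation.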

2 Answers

2

This may work if you are using SQL Server:

SELECT DISTINCT a, b,
       COUNT(*) OVER (PARTITION BY a, b) * 1.0 / COUNT(*) OVER (PARTITION BY a) AS unnamed
FROM Sample
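
The window-function idea in this answer can be tried against the question's sample data without a SQL Server instance. The sketch below is an illustration, not part of the original answer. It uses Python's built-in sqlite3 module and assumes the bundled SQLite is version 3.25 or newer (the first release with window functions). The table is named X as in the question; the helper window_ratios and the column alias ratio are hypothetical names chosen for this example.

```python
import sqlite3

# Sample rows from the question: (id, a, b)
SAMPLE_ROWS = [
    (1, "A", "G"), (2, "A", "H"), (3, "A", "H"),
    (4, "B", "G"), (5, "B", "G"), (6, "B", "K"), (7, "B", "K"),
]

# Every row carries count(a, b) / count(a); DISTINCT collapses the repeats.
WINDOW_QUERY = """
SELECT DISTINCT
       a,
       b,
       COUNT(*) OVER (PARTITION BY a, b) * 1.0
         / COUNT(*) OVER (PARTITION BY a) AS ratio
FROM   X
ORDER  BY a, b
"""


def window_ratios():
    """Load the sample rows into an in-memory table and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE X (id INTEGER PRIMARY KEY, a TEXT, b TEXT)")
        conn.executemany("INSERT INTO X (id, a, b) VALUES (?, ?, ?)", SAMPLE_ROWS)
        return conn.execute(WINDOW_QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for a, b, ratio in window_ratios():
        print(a, b, round(ratio, 5))
```

Multiplying by 1.0 matters: without it both counts are integers and the division would truncate to 0 for every group smaller than its parent.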

4 Comments

I tried it and it seems to work. Since I edited the question, can you change the column names so that the answer conforms to the question and is clearer?
Can you try an EXPLAIN in front of it and check how it performs? I hope your DBMS supports this.
@Roland, you're right. I assumed this was SQL Server; it seems the OP is not using that.
In my case this answer (query) takes 14 seconds to execute, while Tree Frog's answer does it almost instantly. However, this answer suits the question's requirements better.
2
SELECT A.a, B.b, A.num, B.num, CAST(B.num AS float) / CAST(A.num AS float)
FROM
    (SELECT a, COUNT(*) AS num
     FROM @table
     GROUP BY a) AS A
JOIN
    (SELECT a, b, COUNT(*) AS num
     FROM @table
     GROUP BY a, b) AS B ON A.a = B.a
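
The same join of two grouped subqueries can be checked against the question's sample data. The sketch below is an illustration, not the answerer's code. It runs through Python's built-in sqlite3 module, so the table variable @table is swapped for a regular table named X and float becomes SQLite's REAL. The subquery aliases per_a and per_ab and the helper join_ratios are hypothetical names chosen for readability.

```python
import sqlite3

# Sample rows from the question: (id, a, b)
SAMPLE_ROWS = [
    (1, "A", "G"), (2, "A", "H"), (3, "A", "H"),
    (4, "B", "G"), (5, "B", "G"), (6, "B", "K"), (7, "B", "K"),
]

# One subquery counts rows per a, the other per (a, b); the join on a
# pairs each (a, b) count with the total for its a.
JOIN_QUERY = """
SELECT per_a.a,
       per_ab.b,
       CAST(per_ab.num AS REAL) / per_a.num AS ratio
FROM   (SELECT a, COUNT(*) AS num FROM X GROUP BY a) AS per_a
JOIN   (SELECT a, b, COUNT(*) AS num FROM X GROUP BY a, b) AS per_ab
       ON per_a.a = per_ab.a
ORDER  BY per_a.a, per_ab.b
"""


def join_ratios():
    """Load the sample rows into an in-memory table and run the query."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE X (id INTEGER PRIMARY KEY, a TEXT, b TEXT)")
        conn.executemany("INSERT INTO X (id, a, b) VALUES (?, ?, ?)", SAMPLE_ROWS)
        return conn.execute(JOIN_QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for a, b, ratio in join_ratios():
        print(a, b, round(ratio, 5))
```

Each subquery aggregates the table once, so the join only has to match a handful of grouped rows, which may explain why the asker found this form much faster than the window-function version.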

1 Comment

Please use backticks around your code, like `SELECT foo FROM bar`.
