Using COUNT and GROUP BY in Spark SQL

Question

I'm trying to get pretty basic output that pulls unique NDC Codes for medications and counts the number of unique patients that take each drug. My dataset basically looks like this:

patient_id | drug_ndc
---------------------
01         | 250
02         | 725       
03         | 1075
04         | 1075
05         | 250
06         | 250

I want the output to look something like this:

NDC  | Patients
--------------
250  |  3
1075 |  2
725  |  1

I tried using some queries like this:

select distinct drug_ndc as NDC, count patient_id as Patients
from table 1
group by 1
order by 1

But I keep getting errors. I've tried with and without using an alias, but to no avail.

Gordon Linoff · Accepted Answer · 2019-09-12 16:15:28Z

2

The correct syntax should be:

select drug_ndc as NDC, count(*) as Patients
from table 1
group by drug_ndc
order by 1;

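SELECT DISTINCT is almost never appropriate with GROUP BY, because GROUP BY already returns one row per group. COUNT is a function, so its argument needs parentheses. And you can use COUNT(*) unless the patient id can be NULL: COUNT(*) counts every row in the group, while COUNT(patient_id) skips rows where patient_id is NULL.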
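As a rough illustration of the COUNT(*) versus COUNT(patient_id) point, here is a minimal sketch that runs the grouped query against the question's sample data. It makes several assumptions: Python's built-in sqlite3 module stands in for a Spark session, the table name `prescriptions` is hypothetical (the question only says "table 1"), and one extra row with a NULL patient id is added purely to show the difference. The SELECT itself uses only standard aggregate syntax, so it should carry over to Spark SQL, but that is not verified here.

```python
import sqlite3

# Assumption: sqlite3 is only a stand-in engine so the SQL can be run locally.
# Assumption: `prescriptions` is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prescriptions (patient_id TEXT, drug_ndc INTEGER)")

sample_rows = [
    ("01", 250), ("02", 725), ("03", 1075),
    ("04", 1075), ("05", 250), ("06", 250),
    (None, 250),  # extra row, NOT in the question's data: NULL patient id
]
conn.executemany("INSERT INTO prescriptions VALUES (?, ?)", sample_rows)

query = """
    SELECT drug_ndc          AS NDC,
           COUNT(*)          AS all_rows,          -- counts every row in the group
           COUNT(patient_id) AS non_null_patients  -- skips NULL patient ids
    FROM prescriptions
    GROUP BY drug_ndc
    ORDER BY 1
"""
result = conn.execute(query).fetchall()

for ndc, all_rows, non_null_patients in result:
    print(ndc, all_rows, non_null_patients)
```

For NDC 250 the two counts differ by one because of the NULL row; for the other codes they agree.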


3 Comments

Raphael Roth Over a year ago

but this is not counting the number of unique patients... just the total number

Gordon Linoff Over a year ago

@RaphaelRoth . . . There are no duplicates in the data in the question -- or mentioned. count(distinct) is not appropriate.

Raphael Roth Over a year ago

Yes, it is mentioned: "counts the number of unique patients"

Raphael Roth · Accepted Answer · 2019-09-12 18:34:54Z

1

To get the number of unique patients, count distinct patient ids within each group:

select drug_ndc as NDC, count(distinct patient_id) as Patients
from table 1
group by drug_ndc;
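To show where the two answers diverge, here is a small sketch that puts COUNT(*) and COUNT(DISTINCT patient_id) side by side. The assumptions: Python's sqlite3 module is used as a stand-in for Spark, the table name `prescriptions` is hypothetical, and one duplicate row for patient 01 is added on purpose (the question's sample has no duplicates). The ORDER BY on the count, descending, is also an addition, chosen because the desired output in the question lists the most-used drug first.

```python
import sqlite3

# Assumption: sqlite3 is only a stand-in engine; `prescriptions` is a
# hypothetical table name, since the question only says "table 1".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prescriptions (patient_id TEXT, drug_ndc INTEGER)")

sample_rows = [
    ("01", 250), ("02", 725), ("03", 1075),
    ("04", 1075), ("05", 250), ("06", 250),
    ("01", 250),  # extra row, NOT in the question's data: repeat for patient 01
]
conn.executemany("INSERT INTO prescriptions VALUES (?, ?)", sample_rows)

query = """
    SELECT drug_ndc                   AS NDC,
           COUNT(*)                   AS total_rows,  -- repeat rows are counted
           COUNT(DISTINCT patient_id) AS Patients     -- each patient counted once
    FROM prescriptions
    GROUP BY drug_ndc
    ORDER BY Patients DESC
"""
result = conn.execute(query).fetchall()

for ndc, total_rows, patients in result:
    print(ndc, total_rows, patients)
```

With the duplicate row present, NDC 250 has four rows but three unique patients. If the table is guaranteed to hold one row per patient and drug, both counts give the same numbers.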

