I am trying to get the following to return a count for every organization using a left join in PostgreSQL, but I cannot figure out why it's not working:

  select o.name as organisation_name,
         coalesce(COUNT(exam_items.id)) as total_used
  from organisations o
  left join exam_items e on o.id = e.organisation_id
  where e.item_template_id = #{sanitize(item_template_id)}
  and e.used = true
  group by o.name
  order by o.name

Using coalesce doesn't seem to work. I'm at my wit's end! Any help would certainly be appreciated!

To clarify what's not working, at the moment the query only returns values for organisations that have a count greater than 0. I would like it to return a line for every organisation, regardless of the count.

Table definitions:

TABLE exam_items
  id serial NOT NULL
  exam_id integer
  item_version_id integer
  used boolean DEFAULT false
  question_identifier character varying(255)
  organisation_id integer
  created_at timestamp without time zone NOT NULL
  updated_at timestamp without time zone NOT NULL
  item_template_id integer
  stem_id integer
  CONSTRAINT exam_items_pkey PRIMARY KEY (id)

TABLE organisations
  id serial NOT NULL
  slug character varying(255)
  name character varying(255)
  code character varying(255)
  address text
  organisation_type integer
  created_at timestamp without time zone NOT NULL
  updated_at timestamp without time zone NOT NULL
  super boolean DEFAULT false
  CONSTRAINT organisations_pkey PRIMARY KEY (id)
  • coalesce(COUNT(exam_items.id),0) as total_used ??? Commented Mar 17, 2013 at 23:40
  • well yeah, I've also tried COUNT(exam_items.id) as total_used, but i have to say sql is not really my forte! Commented Mar 17, 2013 at 23:47
  • sorry I see what you mean! I've tried that, but still no luck :/ Commented Mar 17, 2013 at 23:53
  • When posting a question, please spell out what "it's not working" means within the question - returning nothing, returning an incorrect value, yielding an error, etc. Commented Mar 17, 2013 at 23:58
  • Not relevant, but still important: organisation_id integer REFERENCES organisations(id) Commented Mar 18, 2013 at 0:30

2 Answers


Fix the LEFT JOIN

This should work:

SELECT o.name AS organisation_name, count(e.id) AS total_used
FROM   organisations   o
LEFT   JOIN exam_items e ON e.organisation_id = o.id 
                        AND e.item_template_id = #{sanitize(item_template_id)}
                        AND e.used
GROUP  BY o.name
ORDER  BY o.name;

You had a LEFT [OUTER] JOIN, but the later WHERE conditions made it act like a plain [INNER] JOIN.
Move the condition(s) to the JOIN clause to make it work as intended. That way, only rows that fulfill all of these conditions are joined in the first place; where none qualify, the columns from the right table are filled with NULL. As you had it, joined rows are tested against the additional conditions virtually after the LEFT JOIN and removed if they don't pass, just like with a plain JOIN.
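
To see the difference side by side, here is a minimal sketch in Python. It uses the standard-library sqlite3 module with an in-memory database as a stand-in for PostgreSQL (an assumption made only so the example is runnable), a cut-down version of the two tables, made-up sample rows, and a bound parameter in place of the #{sanitize(item_template_id)} interpolation.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL here (assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE organisations (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE exam_items (
        id INTEGER PRIMARY KEY,
        organisation_id INTEGER REFERENCES organisations(id),
        item_template_id INTEGER,
        used BOOLEAN DEFAULT 0
    );
    -- Hypothetical sample data.
    INSERT INTO organisations (id, name)
    VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
    -- Alpha: two used items for template 7; Beta: one unused item; Gamma: none.
    INSERT INTO exam_items (organisation_id, item_template_id, used)
    VALUES (1, 7, 1), (1, 7, 1), (2, 7, 0);
""")

# Conditions in WHERE: NULL-extended rows fail the test and are removed,
# so organisations without a qualifying item disappear.
where_rows = conn.execute("""
    SELECT o.name, count(e.id)
    FROM   organisations o
    LEFT   JOIN exam_items e ON e.organisation_id = o.id
    WHERE  e.item_template_id = ? AND e.used
    GROUP  BY o.name
    ORDER  BY o.name
""", (7,)).fetchall()

# Conditions in ON: every organisation is kept, unmatched ones count 0.
on_rows = conn.execute("""
    SELECT o.name, count(e.id)
    FROM   organisations o
    LEFT   JOIN exam_items e ON e.organisation_id = o.id
                            AND e.item_template_id = ?
                            AND e.used
    GROUP  BY o.name
    ORDER  BY o.name
""", (7,)).fetchall()

print(where_rows)  # [('Alpha', 2)]
print(on_rows)     # [('Alpha', 2), ('Beta', 0), ('Gamma', 0)]
```

Only the position of the two conditions differs between the queries; the join itself and the grouping are identical.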

count() never returns NULL to begin with. It's an exception among aggregate functions in this respect. Therefore, COALESCE(COUNT(col)) never makes sense, even with additional parameters. The manual:

It should be noted that except for count, these functions return a null value when no rows are selected.

Bold emphasis mine.
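
The quoted behaviour is easy to check against an empty table. A minimal sketch, again with Python's sqlite3 standing in for PostgreSQL (an assumption; the table t and column x are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table, deliberately left empty.
conn.execute("CREATE TABLE t (x INTEGER)")

# With no rows selected, count() yields 0 while sum() and max() yield NULL.
row = conn.execute(
    "SELECT count(x), count(*), sum(x), max(x) FROM t"
).fetchone()

print(row)  # (0, 0, None, None)
```

NULL arrives in Python as None, so only sum() and max() would have anything for COALESCE to replace.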

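In the example, count() must be applied to a column defined NOT NULL (like e.id), or to one that the join condition guarantees to be NOT NULL in matched rows (e.organisation_id, e.item_template_id, or e.used). Otherwise the NULL-extended row that the LEFT JOIN produces for an organisation without matches would be counted.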
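
Why the counted column matters can be shown with a small sketch (Python's sqlite3 as a stand-in for PostgreSQL, which is an assumption; tables are cut down and the rows are made up). count(*) counts the NULL-extended row of an unmatched organisation, count(e.id) does not:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE organisations (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE exam_items (id INTEGER PRIMARY KEY, organisation_id INTEGER);
    -- Hypothetical sample data: Alpha has two items, Beta has none.
    INSERT INTO organisations (id, name) VALUES (1, 'Alpha'), (2, 'Beta');
    INSERT INTO exam_items (organisation_id) VALUES (1), (1);
""")

rows = conn.execute("""
    SELECT o.name
         , count(e.id) AS by_column  -- skips the NULL-extended row
         , count(*)    AS by_star    -- counts it
    FROM   organisations o
    LEFT   JOIN exam_items e ON e.organisation_id = o.id
    GROUP  BY o.id
    ORDER  BY o.name
""").fetchall()

print(rows)  # [('Alpha', 2, 2), ('Beta', 0, 1)]
```

Beta has no items, yet count(*) reports 1 because the outer join still emits one row for it.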

Since used is type boolean, the expression e.used = true is noise that burns down to just e.used.

Since o.name is not defined UNIQUE NOT NULL, you may want to GROUP BY o.id instead (id being the PK) - unless you intend to fold rows with the same name (including NULL).
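
The folding effect can be illustrated with two organisations that share a name. This is a sketch under assumptions: Python's sqlite3 stands in for PostgreSQL, and the rows are invented for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE organisations (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE exam_items (id INTEGER PRIMARY KEY, organisation_id INTEGER);
    -- Hypothetical sample data: two distinct organisations named 'Acme'.
    INSERT INTO organisations (id, name)
    VALUES (1, 'Acme'), (2, 'Acme'), (3, 'Zeta');
    INSERT INTO exam_items (organisation_id) VALUES (1), (1), (2);
""")

# Grouping by name folds both 'Acme' rows into a single group.
by_name = conn.execute("""
    SELECT o.name, count(e.id)
    FROM   organisations o
    LEFT   JOIN exam_items e ON e.organisation_id = o.id
    GROUP  BY o.name
    ORDER  BY o.name
""").fetchall()

# Grouping by the primary key keeps one row per organisation.
by_id = conn.execute("""
    SELECT o.id, o.name, count(e.id)
    FROM   organisations o
    LEFT   JOIN exam_items e ON e.organisation_id = o.id
    GROUP  BY o.id
    ORDER  BY o.name, o.id
""").fetchall()

print(by_name)  # [('Acme', 3), ('Zeta', 0)]
print(by_id)    # [(1, 'Acme', 2), (2, 'Acme', 1), (3, 'Zeta', 0)]
```

Selecting o.name while grouping only by o.id relies on id being the primary key, as in the table definition from the question.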

Aggregate first, join later

If most or all rows of exam_items are counted in the process, this equivalent query is typically considerably faster / cheaper:

SELECT o.id, o.name AS organisation_name, COALESCE(e.total_used, 0) AS total_used
FROM   organisations o
LEFT   JOIN (
   SELECT organisation_id AS id   -- alias to simplify join syntax
        , count(*) AS total_used  -- count(*) = fastest to count all
   FROM   exam_items
   WHERE  item_template_id = #{sanitize(item_template_id)}
   AND    used
   GROUP  BY 1
   ) e USING (id)
ORDER  BY o.name, o.id;

(This assumes that you don't want to fold rows with the same name, as mentioned above - the typical case.)

Now we can use the faster / simpler count(*) in the subquery, and we need no GROUP BY in the outer SELECT.
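
One thing worth checking with the aggregate-first form is what the outer LEFT JOIN yields for organisations that have no row in the subquery. The sketch below (Python's sqlite3 as a stand-in for PostgreSQL, cut-down tables, invented rows, and a bound parameter instead of the interpolation, all of which are assumptions) selects the subquery's count both raw and wrapped in COALESCE:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE organisations (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE exam_items (
        id INTEGER PRIMARY KEY,
        organisation_id INTEGER,
        item_template_id INTEGER,
        used BOOLEAN DEFAULT 0
    );
    -- Hypothetical sample data.
    INSERT INTO organisations (id, name)
    VALUES (1, 'Alpha'), (2, 'Beta'), (3, 'Gamma');
    INSERT INTO exam_items (organisation_id, item_template_id, used)
    VALUES (1, 7, 1), (1, 7, 1), (2, 7, 0);
""")

rows = conn.execute("""
    SELECT o.id, o.name
         , e.total_used              AS raw_total   -- NULL when no match
         , COALESCE(e.total_used, 0) AS total_used  -- 0 when no match
    FROM   organisations o
    LEFT   JOIN (
       SELECT organisation_id AS id, count(*) AS total_used
       FROM   exam_items
       WHERE  item_template_id = ? AND used
       GROUP  BY organisation_id
       ) e USING (id)
    ORDER  BY o.name, o.id
""", (7,)).fetchall()

print(rows)
# [(1, 'Alpha', 2, 2), (2, 'Beta', None, 0), (3, 'Gamma', None, 0)]
```

Here COALESCE does have a job: the NULL comes from the outer join, not from count(), so the earlier point about COALESCE(COUNT(col)) being pointless still holds.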


7 Comments

@wildplasser: That one has been missed by the best. It's a sneaky one.
I guess, it is basically a style/taste issue. Personally, I would never write such a line (I hope ;-) Just like coalesce() with only one argument makes no sense. And the lack of correlation names / aliases confuses me.
@wildplasser: I always start by formatting and cleaning up the query, before I even try to understand the heap of text.
Me, too. Normally. (that's the real reason for my love for CTEs) But in case I think I get the issue in one reading pass, I get too trigger happy.
Just in case you are getting 1 instead of 0 for the count: stackoverflow.com/questions/44155025/… (the column specified in the count function is important).

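To make it clear:

The important line is the GROUP BY on the key of MAIN_TABLE. It produces one group per MAIN_TABLE row, so rows without a match in SOME_TABLE (where ST.ID is NULL) still appear, with a count of 0.

SELECT MT.ID, COUNT(ST.ID)
FROM MAIN_TABLE MT
LEFT JOIN SOME_TABLE ST ON MT.ID = ST.MT_ID
GROUP BY MT.ID -- this line is a must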

1 Comment

Simple & clear 👍🔥
