I am having an issue with the GROUP BY clause. The following SQL returns partially correct results, but I can't figure out how to arrange the GROUP BY and COUNT so that it returns the correct number of rows.

The table contains the following rows, where the count of bookdescid is greater than 1: https://i.sstatic.net/QKzaL.png

In the data shown in the screenshot above there are 24 unique publisherid values.

In the subselect of the query shown below, I am trying to return every publisherid whose bookdescid occurs more than once. However, when I group by bookdescid it only returns 14 rows. The result should be 24.

This is annoying because I had it working earlier, but for the life of me I cannot figure out where I am going wrong now or how to get it back.

The result should only include rows where the count of bookdescid is > 1. Grouping by bookdescid alone removes rows without considering the publisherid.

I've tried many combinations of GROUP BY, but I cannot nail this one.

Here is the code:

SELECT publisherfullname
FROM publisher
WHERE publisherid IN (
  SELECT publisherid
  FROM published_by
  GROUP BY bookdescid
  HAVING count(bookdescid) > 1);
  • "Display the names of publishers that have worked together on the same book in any capacity, identifying which publisher they worked with. There must be no rows that duplicate the same information." Commented Nov 8, 2020 at 3:17
  • What is the expected output here? Commented Nov 8, 2020 at 3:21
  • The expected outcome of the subselect is a list of 24 publisherid values where the bookdescid count is greater than 1. However, when I group by bookdescid, it removes some of the rows where the publisherid is different. I want something like the following: where the bookdescid count is greater than 1, it shows me all of the publisherid rows that are associated with that bookdescid - prnt.sc/vfev11 Commented Nov 8, 2020 at 3:30
  • Basically, if the bookdescid count is greater than 1, return all of the publisherid values that are associated with that bookdescid. Commented Nov 8, 2020 at 3:32
  • Voting to close as unclear what you are asking. No sample input, no sample output => hard to figure out what you want. Commented Nov 8, 2020 at 3:55

1 Answer

Your subquery groups by bookdescid, so it produces one row per book and therefore at most one publisherid for each. What you need to do instead is find all the bookdescid values which have a count > 1, and then select the publisherid values associated with those bookdescid values. You can then join that result to the publisher table to get the publisher names. Without sample data it's hard to be certain, but this should work:

SELECT p.publisherfullname
FROM (
    SELECT DISTINCT publisherid
    FROM published_by
    WHERE bookdescid IN (
        SELECT bookdescid 
        FROM published_by
        GROUP BY bookdescid
        HAVING COUNT(bookdescid) > 1
    )
) pb
JOIN publisher p ON p.publisherid = pb.publisherid

(Small) demo on db-fiddle
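
To make the difference concrete without the real table, here is a minimal sketch that runs both the question's subquery and the query above against an in-memory SQLite database. The publisher names and ids are made-up sample data, not taken from the question, and SQLite is only an assumption for convenience. It accepts the bare publisherid column in a grouped query and picks one value per group; stricter engines (PostgreSQL, or MySQL with ONLY_FULL_GROUP_BY enabled) reject that subquery outright.

```python
import sqlite3

# Hypothetical sample data: books 100 and 101 each have several publishers,
# book 102 has only one and should therefore be excluded.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE publisher (publisherid INTEGER PRIMARY KEY, publisherfullname TEXT);
    CREATE TABLE published_by (bookdescid INTEGER, publisherid INTEGER);
    INSERT INTO publisher VALUES
        (1, 'Alpha Press'), (2, 'Beta Books'), (3, 'Gamma House'),
        (4, 'Delta Media'), (5, 'Epsilon Ltd');
    INSERT INTO published_by VALUES
        (100, 1), (100, 2), (101, 2), (101, 3), (101, 4), (102, 5);
""")

# Subquery from the question: GROUP BY collapses each qualifying book to a
# single row, so only one (arbitrarily chosen) publisherid survives per book.
grouped_rows = conn.execute("""
    SELECT publisherid
    FROM published_by
    GROUP BY bookdescid
    HAVING COUNT(bookdescid) > 1
""").fetchall()

# Query from this answer: first find the qualifying books, then collect every
# publisherid attached to them, then look up the names.
fixed_names = sorted(row[0] for row in conn.execute("""
    SELECT p.publisherfullname
    FROM (
        SELECT DISTINCT publisherid
        FROM published_by
        WHERE bookdescid IN (
            SELECT bookdescid
            FROM published_by
            GROUP BY bookdescid
            HAVING COUNT(bookdescid) > 1
        )
    ) pb
    JOIN publisher p ON p.publisherid = pb.publisherid
"""))

print(len(grouped_rows))  # one row per qualifying book
print(fixed_names)        # every publisher attached to a qualifying book
```

With this sample data the grouped subquery yields two rows (one per shared book), while the rewritten query returns the four publishers that share a book. That is the same kind of shortfall as the 14 versus 24 rows described in the question.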

6 Comments

Worked like a charm! Thanks very much Nick. I see what you are saying now about the order of processes.
@KJDD1 I think you might need a DISTINCT on the outer subquery to avoid duplication of results where a publisher is involved with two separate books. See my edit.
@KJDD1 In terms of sample data, text in the post is far preferred. There are tools that can convert it into DDL, which can then be used to test queries on sites such as db-fiddle.com and dbfiddle.uk (both of which I highly recommend for testing).
@KJDD1 I've added a small demo to my answer with a subset of your data so you can check out one of the db fiddle sites. When you're asking a question, setting up a fiddle with sample data is a big plus.
Awesome! Thanks heaps Nick, much appreciated
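
On the DISTINCT point raised in the comments above: a small sketch of what it guards against, again using SQLite from Python with invented sample data (none of these names or ids come from the question). Here the hypothetical publisher 'Beta Books' is attached to two different shared books, so without DISTINCT it appears twice in the output.

```python
import sqlite3

# Hypothetical sample data: publisher 2 ('Beta Books') worked on both
# shared books, 100 and 101.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE publisher (publisherid INTEGER PRIMARY KEY, publisherfullname TEXT);
    CREATE TABLE published_by (bookdescid INTEGER, publisherid INTEGER);
    INSERT INTO publisher VALUES
        (1, 'Alpha Press'), (2, 'Beta Books'), (3, 'Gamma House'),
        (4, 'Delta Media'), (5, 'Epsilon Ltd');
    INSERT INTO published_by VALUES
        (100, 1), (100, 2), (101, 2), (101, 3), (101, 4), (102, 5);
""")

# Same query as in the answer, with the DISTINCT keyword made switchable.
QUERY = """
    SELECT p.publisherfullname
    FROM (
        SELECT {distinct} publisherid
        FROM published_by
        WHERE bookdescid IN (
            SELECT bookdescid
            FROM published_by
            GROUP BY bookdescid
            HAVING COUNT(bookdescid) > 1
        )
    ) pb
    JOIN publisher p ON p.publisherid = pb.publisherid
"""

without_distinct = [r[0] for r in conn.execute(QUERY.format(distinct=""))]
with_distinct = [r[0] for r in conn.execute(QUERY.format(distinct="DISTINCT"))]

print(len(without_distinct), without_distinct.count("Beta Books"))
print(len(with_distinct), with_distinct.count("Beta Books"))
```

Without DISTINCT the sample yields five rows with 'Beta Books' listed twice; with it, four rows and each publisher once.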