8

I'm using Django 1.6 with PostgreSQL and have following model:

# models.py
class Game(AbstractContentModel, AbstractScoreModel):
    name = models.CharField(_("name"), max_length=100, blank=True)
    developer = models.CharField(_('Developer'), max_length=255)
    distributor = models.CharField(_('Distributor'), max_length=255, blank=True)
    # ...
    reviews = models.ManyToManyField(Review, related_name="games", blank=True, verbose_name=_("Reviews"))
    videos = models.ManyToManyField(Video, related_name="games", blank=True, verbose_name=_("Videos"))
    images = models.ManyToManyField(Image, related_name="games", blank=True, verbose_name=_("Gallery"))

I'm trying to get all games and to each of them add count of related videos, reviews and images as follows:

# views.py
qs = Game.objects.all()
qs = qs.annotate(video_count=models.Count('videos'))
qs = qs.annotate(review_count=models.Count('reviews'))
qs = qs.annotate(image_count=models.Count('images'))

Result query is:

SELECT 
"content_game"."id", 
"content_game"."name",
"content_game"."developer", 
"content_game"."distributor",
COUNT("content_game_videos"."video_id") AS "video_count",
COUNT("content_game_reviews"."review_id") AS "review_count", 
COUNT("content_game_images"."image_id") AS "image_count" 
FROM "content_game" 
LEFT OUTER JOIN "content_game_videos" ON ( "content_game"."id" = "content_game_videos"."game_id" )
LEFT OUTER JOIN "content_game_reviews" ON ( "content_game"."id" = "content_game_reviews"."game_id" ) 
LEFT OUTER JOIN "content_game_images" ON ( "content_game"."id" = "content_game_images"."game_id" ) 
GROUP BY 
"content_game"."id", 
"content_game"."name",
"content_game"."developer", 
"content_game"."distributor";

My question is - why there are all selected fields in GROUP BY clause? And most importantly, how can I get rid of them besides raw SQL?

I gathered that it would be good to use .values(), but I want every field in result query. If I just use GROUP BY "content_game"."id", the result is the same, but I don't know how to use it like this with Django ORM.

4
  • 1
    My question is - why there are all selected fields in GROUP BY clause? And most importantly, how can I get rid of them besides raw SQL? Postgresql is not MySQL more_info Just try to execute raw query with only "content_game"."id" in the GROUP BY Commented Dec 15, 2015 at 8:58
  • @lad2025 Thanks, the issue is really with MySQL vs PostgreSQL. But I don't want to use raw SQL if I don't really have to. I found out that this bug was fixed by code.djangoproject.com/ticket/19259 , nevertheless I don't see the solution to it. Btw Django version is 1.6, I will add it to my question. Commented Dec 15, 2015 at 9:51
  • At the end I solved it by NOT counting related items in query. I have created fields images_count and videos_count on Game model and I'm updating those fields on save and delete. At least this way it will be always optimal for viewing :) Commented Dec 19, 2015 at 18:59
  • This might be helpful: docs.djangoproject.com/en/1.11/topics/db/aggregation/… Commented Jan 26, 2018 at 14:01

1 Answer 1

2

I'm too late to answer but I have kind of studied answer. First of all, the behavior what OP said, is correct. And it should be for all kind of relational database, but as @lad2025 said in the comment the only MySQL (MariaDB as well) was excusing to this behavior. So instead of getting rid of it we must learn it. As this is the only correct way to use it / understand it and get used to with it.

Currently OP has found solution that is workaround of keep updating the table fields with counts of rows somewhere and populate later instead of retrieving it data on the fly by query.

As whatever you add into GROUP BY must be included into SELECT and vice versa while using aggregative functions.

"content_game"."id", "content_game"."name", "content_game"."developer",  "content_game"."distributor"

Above fields must come into GROUP BY because if I try to print id or any field from the selection which row's field (single) value this should return? either last or first row from the group bunch which is incorrect (which MySQL does). So better keep all into GROUP BY.

Why? Because of ambiguity - group by produces a single record for each group of records. Group By columns contain the same value for all records in the group and any aggregative function produces the single value for a group, but "content_game"."developer" can be different for "content_game"."distributor" group "content_game"."id" will be different for all rows as it's primary key. Same for all fields.

Good explanation can be found here

And I would like people to share (edit my post) more statement in more nicer way. So other people can found this useful in the future.

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.