I've spent the last day trying to get an aggregation over a time series from my database. I tried to use the Django ORM but quickly gave up and went back to raw SQL. I don't think there's a way to use PostgreSQL's generate_series with the ORM; I assume the expectation is that you'd use itertools or some other approach in Python.
I have a model much like this:
from django.db import models

class Vote(models.Model):
    value = models.IntegerField(default=0)
    timestamp = models.DateTimeField('date voted', auto_now_add=True)
    location = models.ForeignKey('location', on_delete=models.CASCADE)
What I want to do is show a series of metrics over time: for now, an aggregation per hour of the current day for the current user. The user has a timezone set (it defaults to 'America/Chicago'). I've been fiddling with the Postgres query, inserting tons of AT TIME ZONE casts in an effort to wrangle the bounds and return values of the query. I had it returning the correct results late last night, but this morning it's off again. I know it's got to be something very dumb that I'm doing. I even resorted to double-casting timestamps because of the weird way Postgres handles AT TIME ZONE (converting TO UTC instead of FROM it).
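For reference, the two directions of AT TIME ZONE can be mimicked in plain Python. This is only a sketch of the documented PostgreSQL behaviour, not part of the query: the helper names are hypothetical, and it assumes Python 3.9+ with the standard `zoneinfo` module and a tz database available on the system.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def naive_at_time_zone(naive, tz_name):
    """Mimic `timestamp AT TIME ZONE zone` (hypothetical helper).

    The naive wall-clock value is treated as being local to `tz_name`
    and the result is an aware instant, shown here in UTC.
    """
    return naive.replace(tzinfo=ZoneInfo(tz_name)).astimezone(timezone.utc)


def aware_at_time_zone(aware, tz_name):
    """Mimic `timestamptz AT TIME ZONE zone` (hypothetical helper).

    The instant is converted to wall-clock time in `tz_name` and the
    zone information is dropped.
    """
    return aware.astimezone(ZoneInfo(tz_name)).replace(tzinfo=None)


# Chaining both, as in `TIMESTAMP '...' AT TIME ZONE 'UTC' AT TIME ZONE %s`:
utc_midnight = naive_at_time_zone(datetime(2016, 4, 20), "UTC")
chicago_wall_clock = aware_at_time_zone(utc_midnight, "America/Chicago")
```

Which direction applies depends on whether the input carries a zone, so applying AT TIME ZONE twice first attaches a zone and then strips it again.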
Again, I'd like to show buckets of aggregates for each hour of the user's current day up to/including 'now'.
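To make the target concrete, the buckets described above can be sketched in plain Python. This is a minimal illustration, not the ORM or SQL solution: `hour_buckets` is a hypothetical helper, and it assumes Python 3.9+ with `zoneinfo` (an older setup would need pytz instead).

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo


def hour_buckets(now_utc, tz_name="America/Chicago"):
    """Return the UTC start of every hourly bucket of the user's local day.

    The day boundary is taken from the user's local clock, then converted
    to UTC; stepping happens in UTC so each bucket is a real elapsed hour.
    """
    tz = ZoneInfo(tz_name)
    local_now = now_utc.astimezone(tz)
    local_midnight = local_now.replace(hour=0, minute=0, second=0, microsecond=0)
    start = local_midnight.astimezone(timezone.utc)

    buckets = []
    while start <= now_utc:
        buckets.append(start)
        start += timedelta(hours=1)
    return buckets


# 21:00 in Chicago on April 20 is already April 21 in UTC.
evening = hour_buckets(datetime(2016, 4, 21, 2, 0, tzinfo=timezone.utc))
```

In this sketch the first bucket always comes from the local date, even when the UTC date has already rolled over, which is the behaviour I'd expect from the query as well.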
This is my current query:
WITH hour_intervals AS (
    SELECT * FROM generate_series(
        date_trunc('day', (SELECT TIMESTAMP 'today' AT TIME ZONE 'UTC' AT TIME ZONE %s)),
        (LOCALTIMESTAMP AT TIME ZONE 'UTC' AT TIME ZONE %s),
        '1 hour'
    ) start_time
)
SELECT f.start_time,
       COUNT(id) total,
       COUNT(CASE WHEN value > 0 THEN 1 END) AS positive_votes,
       COUNT(CASE WHEN value = 0 THEN 1 END) AS indifferent_votes,
       COUNT(CASE WHEN value < 0 THEN 1 END) AS negative_votes,
       SUM(CASE WHEN value > 0 THEN 2 WHEN value = 0 THEN 1 WHEN value < 0 THEN -4 END) AS score
FROM votes_vote m
RIGHT JOIN hour_intervals f
    ON m.timestamp AT TIME ZONE %s >= f.start_time
   AND m.timestamp AT TIME ZONE %s < f.start_time + '1 hour'::interval
   AND m.location_id = %s
GROUP BY f.start_time
ORDER BY f.start_time
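To pin down what each row of the query above is meant to contain, here is a small pure-Python sketch of the same aggregation. It is an illustration only: `score_of` and `aggregate` are hypothetical names, votes are assumed to be `(timestamp, value)` pairs with timezone-aware timestamps, and the weights (2, 1, -4) are copied from the CASE expression.

```python
from datetime import datetime, timedelta, timezone


def score_of(value):
    """Weight of a single vote, mirroring the CASE expression in the query."""
    if value > 0:
        return 2
    if value == 0:
        return 1
    return -4


def aggregate(votes, bucket_starts):
    """Build one row per hourly bucket, including buckets with no votes."""
    rows = []
    for start in bucket_starts:
        end = start + timedelta(hours=1)
        # Half-open interval [start, end), same as the JOIN condition.
        values = [value for ts, value in votes if start <= ts < end]
        rows.append({
            "start_time": start,
            "total": len(values),
            "positive_votes": sum(1 for v in values if v > 0),
            "indifferent_votes": sum(1 for v in values if v == 0),
            "negative_votes": sum(1 for v in values if v < 0),
            # SUM over zero rows is NULL in SQL, hence None here.
            "score": sum(score_of(v) for v in values) if values else None,
        })
    return rows
```

Empty buckets still produce a row, which is what the RIGHT JOIN against the generated series is for.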
DEBUGGING INFO
Django 1.9.2, and my settings.py has USE_TZ=True
Postgres 9.5.2, and my login role for Django has:
ALTER ROLE yesno_django
    SET client_encoding = 'utf8';
ALTER ROLE yesno_django
    SET default_transaction_isolation = 'read committed';
ALTER ROLE yesno_django
    SET TimeZone = 'UTC';
UPDATE: Fiddling with the query some more, this is now a working query for today's votes:
WITH hour_intervals AS (
    SELECT * FROM generate_series(
        (SELECT TIMESTAMP 'today' AT TIME ZONE 'UTC'),
        (LOCALTIMESTAMP AT TIME ZONE 'UTC' AT TIME ZONE %s),
        '1 hour'
    ) start_time
)
SELECT f.start_time,
       COUNT(id) total,
       COUNT(CASE WHEN value > 0 THEN 1 END) AS positive_votes,
       COUNT(CASE WHEN value = 0 THEN 1 END) AS indifferent_votes,
       COUNT(CASE WHEN value < 0 THEN 1 END) AS negative_votes,
       SUM(CASE WHEN value > 0 THEN 2 WHEN value = 0 THEN 1 WHEN value < 0 THEN -4 END) AS score
FROM votes_vote m
RIGHT JOIN hour_intervals f
    ON m.timestamp AT TIME ZONE %s >= f.start_time
   AND m.timestamp AT TIME ZONE %s < f.start_time + '1 hour'::interval
   AND m.location_id = %s
GROUP BY f.start_time
ORDER BY f.start_time
How come the query I had earlier worked perfectly from 7pm to about 10pm last night but then fails today? Should I expect this new query to fall down as well?
Can someone explain where I went wrong the first time (or every time)?
DATE_TRUNC? Django has a built-in option for using it.

    votes = Vote.objects.filter(location=l).filter(timestamp__date=timezone.now().date()).extra({"hour":"date_trunc('hour',timestamp)"}).values("hour").order_by().annotate(score=score_annotation, count=Count('id'))

I think it's close. I'm going to play with this method a bit more. Thanks!