Using function output in SQLAlchemy join clause

Question

I am trying to translate a fairly short bit of SQL into a SQLAlchemy ORM query. The SQL uses Postgres's generate_series to make a set of dates, and my goal is to build a set of time series arrays grouped by one of the columns.

The (simplified) tables are:

counts:
-----------------
count   (Integer)
day     (Date)
placeID (foreign key related to places)

"counts_pkey" PRIMARY KEY (day, placeID)

places:
-----------------
id
name   (varchar)

The output I'm after is a time series of counts for each place including null values when counts are not reported for a day. For example, this would correspond to a series over four days:

    array_agg    |    name
-----------------+-------------------
 {NULL,0,7,NULL} | A Place
 {NULL,1,NULL,2} | Some other place
 {5,NULL,3,NULL} | Yet another

I can do this fairly easily by taking a CROSS JOIN on a date range and places and joining that with the counts:

SELECT array_agg(counts.count), places.name 
FROM generate_series('2018-11-01', '2018-11-04', interval '1 days') as day 
CROSS JOIN  places 
LEFT OUTER JOIN counts on counts.day = day.day AND counts.PlaceID = places.id 
GROUP BY places.name;
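
To make the intent of that SQL concrete, here is a small pure-Python sketch of what the CROSS JOIN, LEFT OUTER JOIN and array_agg do together. The sample rows and the helper name time_series are made up for illustration; they are not part of the real schema or of any library.

```python
from datetime import date, timedelta
from itertools import product

# Hypothetical sample rows standing in for the two tables.
places = {1: "A Place", 2: "Some other place"}  # id -> name
counts = {                                      # (day, placeID) -> count
    (date(2018, 11, 2), 1): 0,
    (date(2018, 11, 3), 1): 7,
    (date(2018, 11, 2), 2): 1,
    (date(2018, 11, 4), 2): 2,
}


def time_series(start, end):
    """Build one list of counts per place, with None for unreported days."""
    # generate_series: one entry per day, both endpoints included.
    days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
    series = {name: [] for name in places.values()}
    # CROSS JOIN: every (day, place) pair, whether or not a count exists.
    for day, (place_id, name) in product(days, places.items()):
        # LEFT OUTER JOIN: a missing count becomes None (SQL NULL).
        series[name].append(counts.get((day, place_id)))
    return series


result = time_series(date(2018, 11, 1), date(2018, 11, 4))
for name, values in result.items():
    print(values, name)
```

The sketch appends values in day order. In SQL, array_agg only guarantees an order when the aggregate has its own ORDER BY, for example array_agg(counts.count ORDER BY day.day).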

What I can't seem to figure out is how to get SQLAlchemy to do this. After a lot of digging, I found an old Google Groups thread that almost works, leading to this:

date_list = select([column('generate_series')])\
.select_from(func.generate_series(backthen, today, '1 day'))\
.alias('date_list')

time_series = db.session.query(Place.name, func.array_agg(Count.count))\
.select_from(date_list)\
.outerjoin(Count, (Count.day == date_list.c.generate_series) & (Count.placeID == Place.id ))\
.group_by(Place.name)

This creates a sub-select for the time series, but it produces a database error:

There is an entry for table "places", but it cannot be referenced from this part of the query.

So my question is: how would you do this in SQLAlchemy? I'm also open to the idea that this is difficult because my approach with the SQL is bone-headed.

Ilja Everilä · Accepted Answer · 2018-11-04 21:06:29Z

The problem is that, given that query construct, SQLAlchemy produces a query along the lines of

SELECT ...
FROM places,
     (...) AS date_list LEFT OUTER JOIN counts ON ... AND counts."placeID" = places.id
...

There are two FROM-list items: places and the join. FROM-list items cannot cross-reference each other¹, hence the error caused by places.id in the ON clause.

SQLAlchemy does not support an explicit CROSS JOIN, but a CROSS JOIN is equivalent to an INNER JOIN ... ON (TRUE), so you can join with true() as the ON clause. You can also skip wrapping the function expression in a subquery and use it directly by giving it an alias:

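from sqlalchemy import column, func, true

date_list = func.generate_series(backthen, today, '1 day').alias('gen_day')

# select_from(Place) names places as the explicit left side of the first join;
# newer SQLAlchemy versions may refuse to guess it from the column list.
time_series = session.query(Place.name, func.array_agg(Count.count))\
    .select_from(Place)\
    .join(date_list, true())\
    .outerjoin(Count, (Count.day == column('gen_day')) &
                      (Count.placeID == Place.id))\
    .group_by(Place.name)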
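
To inspect the SQL this produces without a database connection, the query can be compiled against the PostgreSQL dialect. The following is a minimal sketch, assuming SQLAlchemy 1.4 or later. The model definitions are guesses reconstructed from the simplified tables in the question, build_time_series is a hypothetical helper name, and the select_from(Place) call is an addition that names the left side of the join explicitly.

```python
from datetime import date

from sqlalchemy import Column, Date, ForeignKey, Integer, String
from sqlalchemy import column, func, true
from sqlalchemy.dialects import postgresql
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


# Assumed models, reconstructed from the simplified tables in the question.
class Place(Base):
    __tablename__ = "places"
    id = Column(Integer, primary_key=True)
    name = Column(String)


class Count(Base):
    __tablename__ = "counts"
    day = Column(Date, primary_key=True)
    placeID = Column(Integer, ForeignKey("places.id"), primary_key=True)
    count = Column(Integer)


def build_time_series(session, start, end):
    """Hypothetical helper: one array of counts per place between start and end."""
    # Alias the function call so it can be used directly as a FROM item.
    gen_day = func.generate_series(start, end, "1 day").alias("gen_day")
    return (
        session.query(Place.name, func.array_agg(Count.count))
        .select_from(Place)        # explicit left side of the first join
        .join(gen_day, true())     # INNER JOIN ... ON true, i.e. a cross join
        .outerjoin(
            Count,
            (Count.day == column("gen_day")) & (Count.placeID == Place.id),
        )
        .group_by(Place.name)
    )


# An unbound Session is enough to build and compile the query.
query = build_time_series(Session(), date(2018, 11, 1), date(2018, 11, 4))
sql = str(query.statement.compile(dialect=postgresql.dialect()))
print(sql)
```

Because places, the function alias and counts now sit in a single join chain, the ON clause of the outer join can reference places.id.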

¹: Except function-call FROM-items, or using LATERAL.
