sqlalchemy hybrid_attribute expression

Question

Assuming the following models:

class Worker(Model):
    __tablename__ = 'workers'
    ...
    jobs = relationship('Job',
                        back_populates='worker',
                        order_by='desc(Job.started)',
                        lazy='dynamic')

    @hybrid_property
    def latest_job(self):
        return self.jobs.first()  # jobs already ordered descending

    @latest_job.expression
    def latest_job(cls):
        Job = db.Model._decl_class_registry.get('Job')
        return select([func.max(Job.started)]).where(cls.id == Job.worker_id).as_scalar()

class Job(Model):
    ...
    started = db.Column(db.DateTime, default=datetime.utcnow)
    worker_id = db.Column(db.Integer, db.ForeignKey('workers.id'))
    worker = db.relationship('Worker', back_populates='jobs')

While this query provides correct results:

db.session.query(Worker).join(Job).filter(Job.started >= datetime.datetime(2017, 5, 10, 0, 2, 45, 932983)).distinct().count()

I was under the assumption I could query that field directly, but this query fails:

db.session.query(Worker).join(Job).filter(Worker.latest_job.started >= datetime.datetime(2017, 5, 10, 0, 2, 45, 932983)).count()

with this error:

AttributeError: Neither 'hybrid_property' object nor 'ExprComparator' object associated with Worker.latest_job has an attribute 'started'

How can I query this property directly? What am I missing here?

EDIT 1: Following @Ilja's advice from his answer, I have attempted:

db.session.query(Worker).\
    join(Job).\
    filter(Worker.latest_job >= datetime.datetime(2017, 5, 10, 0, 2, 45, 932983)).\
    count()

but get this error:

TypeError: '>=' not supported between instances of 'Select' and 'datetime.datetime'

If you're getting TypeError: '>=' not supported between instances of 'Select' and 'datetime.datetime', you've removed the call to as_scalar().
– Ilja Everilä, Commented Apr 11, 2019 at 19:33
@IljaEverilä after digging around I see what you were trying to show me. Coming from Django I was expecting certain behavior which doesn't map to SQL the way SA does. Your answer was helpful and I would like to select it but it has been deleted. Please repost it.
– Verbal_Kint, Commented Apr 11, 2019 at 19:37
@IljaEverilä are scalar values the only kind allowed to be returned in subqueries?
– Verbal_Kint, Commented Apr 11, 2019 at 19:39
No, you can for example compare a constructed row against a subquery's result, or results if using the IN predicate. Also it is quite common to use subqueries in the FROM clause to produce derived tables.
– Ilja Everilä, Commented Apr 11, 2019 at 19:43

Ilja Everilä · Accepted Answer · 2019-10-14 14:40:55Z

You're returning a scalar subquery from your hybrid property when used in SQL (class) context, so just use it as you'd use a value expression:

db.session.query(Worker).\
    filter(Worker.latest_job >= datetime.datetime(2017, 5, 10, 0, 2, 45, 932983)).\
    count()

The hybrid property itself needs to explicitly handle correlation in this case:

@latest_job.expression
def latest_job(cls):
    Job = db.Model._decl_class_registry.get('Job')
    return select([func.max(Job.started)]).\
        where(cls.id == Job.worker_id).\
        correlate(cls).\
        as_scalar()
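
To see what the correlated scalar subquery amounts to in SQL, here is a minimal sketch that uses Python's built-in sqlite3 module in place of the real database. The table layout and sample rows are assumptions based on the models in the question, and the statement is hand-written SQL of roughly the shape the hybrid expression should render, not output captured from SQLAlchemy.

```python
import sqlite3

# In-memory stand-in for the workers/jobs tables (assumed layout, made-up rows).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE workers (id INTEGER PRIMARY KEY);
    CREATE TABLE jobs (
        id INTEGER PRIMARY KEY,
        worker_id INTEGER REFERENCES workers (id),
        started TEXT
    );
    INSERT INTO workers (id) VALUES (1), (2), (3);
    INSERT INTO jobs (worker_id, started) VALUES
        (1, '2017-05-01 00:00:00'),
        (1, '2017-05-11 00:00:00'),
        (2, '2017-05-09 00:00:00');
""")

cutoff = "2017-05-10 00:02:45"

# The inner SELECT refers to workers.id from the enclosing statement; that
# reference is the correlation. It yields one row of one column per worker,
# so it can be compared like a plain value.
correlated_count = conn.execute(
    """
    SELECT count(*)
    FROM workers
    WHERE (SELECT max(jobs.started)
           FROM jobs
           WHERE jobs.worker_id = workers.id) >= ?
    """,
    (cutoff,),
).fetchone()[0]

print(correlated_count)
```

Only worker 1 has a latest job at or after the cutoff. Worker 3 has no jobs at all, so its subquery yields NULL and the comparison filters it out.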

Note that there's some asymmetry between your hybrid property's Python side and SQL side. Accessed on an instance, it produces the latest Job object, whereas in SQL it produces a correlated scalar subquery of max(started). If you'd like it to return a Job row in SQL as well, you'd do something like:

@latest_job.expression
def latest_job(cls):
    Job = db.Model._decl_class_registry.get('Job')
    return Job.query.\
        filter(cls.id == Job.worker_id).\
        order_by(Job.started.desc()).\
        limit(1).\
        correlate(cls).\
        subquery()

but that is mostly less useful, because this kind of correlated subquery is usually, though not always, slower than joining against a subquery. For example, to fetch the workers whose latest job meets the original criterion:

job_alias = db.aliased(Job)
# This reads as: find worker_id and started of jobs that have no matching
# jobs with the same worker_id and greater started, or in other words the
# worker_id, started of the latest jobs.
latest_jobs = db.session.query(Job.worker_id, Job.started).\
    outerjoin(job_alias, and_(Job.worker_id == job_alias.worker_id,
                              Job.started < job_alias.started)).\
    filter(job_alias.id == None).\
    subquery()

db.session.query(Worker).\
    join(latest_jobs, Worker.id == latest_jobs.c.worker_id).\
    filter(latest_jobs.c.started >= datetime.datetime(2017, 5, 10, 0, 2, 45, 932983)).\
    count()
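
The self outer join used for latest_jobs can be tried out in plain SQL. The following is a sketch using the standard-library sqlite3 module, with an assumed table layout and made-up rows. The SQL is written by hand to mirror the queries above and is not what SQLAlchemy literally emits.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE workers (id INTEGER PRIMARY KEY);
    CREATE TABLE jobs (
        id INTEGER PRIMARY KEY,
        worker_id INTEGER REFERENCES workers (id),
        started TEXT
    );
    INSERT INTO workers (id) VALUES (1), (2), (3);
    INSERT INTO jobs (worker_id, started) VALUES
        (1, '2017-05-01 00:00:00'),
        (1, '2017-05-11 00:00:00'),
        (2, '2017-05-08 00:00:00'),
        (2, '2017-05-09 00:00:00');
""")

# Keep a job only if no other job of the same worker started later:
# the outer join finds no "newer" partner, so newer.id is NULL.
latest_jobs_sql = """
    SELECT j.worker_id, j.started
    FROM jobs AS j
    LEFT OUTER JOIN jobs AS newer
        ON j.worker_id = newer.worker_id AND j.started < newer.started
    WHERE newer.id IS NULL
"""
latest_rows = conn.execute(latest_jobs_sql + " ORDER BY j.worker_id").fetchall()

# Join workers against that derived table and filter on the latest start time.
matching_workers = conn.execute(
    f"""
    SELECT workers.id
    FROM workers
    JOIN ({latest_jobs_sql}) AS latest_jobs
        ON workers.id = latest_jobs.worker_id
    WHERE latest_jobs.started >= ?
    ORDER BY workers.id
    """,
    ("2017-05-10 00:02:45",),
).fetchall()

print(latest_rows)
print(matching_workers)
```

Each worker with jobs contributes exactly one row to the derived table, provided no two of its jobs share the same latest started value. Ties would produce one row per tied job.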

And of course, if you just want the count, you don't need the join against Worker at all:

job_alias = db.aliased(Job)
# select_from() gives the outer join an explicit left side, since
# func.count() on its own does not name a table to join from.
db.session.query(func.count()).\
    select_from(Job).\
    outerjoin(job_alias, and_(Job.worker_id == job_alias.worker_id,
                              Job.started < job_alias.started)).\
    filter(job_alias.id == None,
           Job.started >= datetime.datetime(2017, 5, 10, 0, 2, 45, 932983)).\
    scalar()

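Please note that Query.scalar() is not the same as Query.as_scalar(): scalar() executes the query and returns the first column of the first row, whereas as_scalar() turns the query into a scalar subquery for use inside another statement.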
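
The distinction can be illustrated without SQLAlchemy. The sketch below uses sqlite3 as a rough analogue: fetching the first column of the first row plays the part of Query.scalar(), and embedding the same SELECT in parentheses inside another statement plays the part of as_scalar(). The table and rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE jobs (id INTEGER PRIMARY KEY, worker_id INTEGER, started TEXT);
    INSERT INTO jobs (worker_id, started) VALUES
        (1, '2017-05-01 00:00:00'),
        (1, '2017-05-11 00:00:00'),
        (2, '2017-05-09 00:00:00');
""")

# Analogue of Query.scalar(): run the statement now and keep the first
# column of the first row as an ordinary Python value.
latest_started = conn.execute("SELECT max(started) FROM jobs").fetchone()[0]

# Analogue of as_scalar(): the same SELECT is not run on its own; it is
# embedded in a larger statement, where it acts as a single value.
newest_job_worker = conn.execute(
    "SELECT worker_id FROM jobs WHERE started = (SELECT max(started) FROM jobs)"
).fetchone()[0]

print(latest_started, newest_job_worker)
```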

edited Oct 14, 2019 at 14:40

answered Apr 11, 2019 at 9:26

Ilja Everilä


11 Comments

Ilja Everilä Over a year ago

You would not have to. The definition of a scalar subquery, usable as a scalar value, is that it is a table of 1 row, of 1 column. SQL is funny that way, and I'd wish it wasn't that "magical". Regarding the correlation issue, just add correlate(cls) to your select() construct, before as_scalar().

Ilja Everilä Over a year ago

For example this is true: select 1 = (select 1);. In SQL (almost) everything is a table. In fact the comparison = is defined in terms of rows, and when you say 1 = 1 it actually means (1) = (1), or in other words compare this row of 1 column to this other row of 1 column.
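
The claim in the comment above is easy to check. Here is a small sketch using sqlite3, which reports true as the integer 1; other databases may display booleans differently. The IN example picks up the earlier remark about comparing against several subquery results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# A one-row, one-column subquery is usable wherever a value is expected.
scalar_equal = conn.execute("SELECT 1 = (SELECT 1)").fetchone()[0]

# Parenthesized values compare the same way as bare ones.
parenthesized_equal = conn.execute("SELECT (1) = (1)").fetchone()[0]

# With IN, the subquery may return several rows instead of exactly one.
in_subquery = conn.execute(
    "SELECT 2 IN (SELECT 1 UNION ALL SELECT 2)"
).fetchone()[0]

print(scalar_equal, parenthesized_equal, in_subquery)
```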

Ilja Everilä Over a year ago

A <comparison predicate> is defined in the standard as <comparison predicate> ::= <row value predicand> <comparison predicate part 2>, where <comparison predicate part 2> ::= <comp op> <row value predicand>.

Ilja Everilä Over a year ago

Technically session.query(Worker.latest_job_started).all() produces a list of result tuples. The reason it is not a 1-tuple of column of ... is that SQL treats a scalar subquery specially, as if a value.

Ilja Everilä Over a year ago

Usually you'd use a suitable join. Fixed the mish mash of ORM and Core style that you noted.
