9

Say I have the following models:

class Department(Base):
    __tablename__ = 'departments'
    id = Column(Integer, primary_key=True)

class Employee(Base):
    __tablename__ = 'employees'
    id = Column(Integer, primary_key=True)
    department_id = Column(None, ForeignKey(Department.id), nullable=False)
    department = relationship(Department, backref=backref('employees'))

Sometimes, when I query departments, I would also like to fetch the number of employees they have. I can achieve this with a column_property, like so:

Department.employee_count = column_property(
    select([func.count(Employee.id)])
    .where(Employee.department_id == Department.id)
    .correlate_except(Employee)
)

Department.query.get(1).employee_count # Works

But then the count is always fetched via a subquery, even when I don't need it. Apparently I can't ask SQLAlchemy not to load this at query time, either:

Department.query.options(noload(Department.employee_count)).all()
# Exception: can't locate strategy for <class 'sqlalchemy.orm.properties.ColumnProperty'> (('lazy', 'noload'),)

I've also tried implementing this with a hybrid property instead of a column property:

class Department(Base):
    #...
    
    @hybrid_property
    def employee_count(self):
        return len(self.employees)

    @employee_count.expression
    def employee_count(cls):
        return (
            select([func.count(Employee.id)])
            .where(Employee.department_id == cls.id)
            .correlate_except(Employee)
        )

With no luck:

Department.query.options(joinedload('employee_count')).all()
# AttributeError: 'Select' object has no attribute 'property'

I know I can just query the count as a separate entity, but I need it often enough that I'd really prefer the convenience of having it as an attribute on the model. Is this even possible in SQLAlchemy?

Edit: To clarify, I want to avoid the N+1 problem and have the employee count get loaded in the same query as the departments, not in a separate query for each department.

5
  • Can't you just use a plain descriptor? Commented Sep 14, 2016 at 0:32
  • @univerio: No, because I need the counts to be calculated in the same query. If I have a collection of Employers, I don't want to run a query for each one. Commented Sep 14, 2016 at 3:19
  • Did you ask in sqlachemy user group? It is pretty active. Commented Sep 17, 2016 at 3:20
  • Can't you just add a calculated column with number of employees? Commented Sep 17, 2016 at 3:24
  • @denfromufa: No actually, I hadn't thought of that! I will ask there. As for the calculated column, I can do that, but this is for a legacy codebase and I really want to make as few changes as possible. Commented Sep 17, 2016 at 4:26

1 Answer 1

12
+500

The loading strategies that you tried are for relationships. The loading of a column_property is altered in the same way as normal columns, see Deferred Column Loading.

You can defer the loading of employee_count by default by passing deferred=True to column_property. When a column is deferred, a select statement is emitted when the property is accessed.

defer and undefer from sqlalchemy.orm allow this to be changed when constructing a query:

from sqlalchemy.orm import undefer
Department.query.options(undefer('employee_count')).all()
Sign up to request clarification or add additional context in comments.

1 Comment

Aha! I knew I was close, but I didn't know about deferred column loading. Just tested it out and it works exactly as expected. Thank you!

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.