2

I'm having a hard time using foreign key references for inheritance in SQLAlchemy.

I have a drives table that looks like this with id as the primary key:

   Column     |         Type          | Collation | Nullable |              Default
---------------+-----------------------+-----------+----------+------------------------------------
 id            | integer               |           | not null | nextval('drives_id_seq'::regclass)
 model         | integer               |           | not null |

I also have another table called smart that looks like this with <ts, drive> as a primary key and drive is a foreign key referencing drives.id:

   Column     |         Type          | Collation | Nullable |              Default
---------------+-----------------------+-----------+----------+------------------------------------
 drive         | integer                    |           | not null | nextval('drives_id_seq'::regclass)
 ts            | timestamp without timezone |           | not null |
 value         | integer                    |           |          |

I have the following class definitions to represent the above tables.

class Drives(Base):
    __tablename__ = 'drives'
    id = Column('id', Integer, primary_key=True)
    model = Column('model', String)

class Smart(Base):
   ___tablename__ = 'smart'
   drive = Column('drive', Integer, ForeignKey=Drives.id)
   ts = Column('ts', TIMESTAMP)
   value = Column('value', Integer)
   drives = relationship('Drives')
   # I would like something like the following to work, but I get an AttributeError for `model`
   __mapper_args__ = {'primary_key': [ts, drive], 'polymorphic_on': drives.model} 

I would like to create two derived classes ModelASmart or ModelBSmart where smart.value is interpreted differently based on the model corresponding to the drive.

class ModelASmart(Smart):
    __mapper_args__ = {'polymorphic_identity': 'ModelA', 'primary_key': [Smart.ts, Smart.drive]}
    @hybrid_property
    def actual_value(self):
        return self.value * 2

class ModelBSmart(Smart):
    __mapper_args__ = {'polymorphic_identity': 'ModelB', 'primary_key': [Smart.ts, Smart.drive]}
    @hybrid_property
    def actual_value(self):
        return self.value * 3

My question: How do I refer to a column (model) from another table (drives) as a discriminator in the main table smart?

1 Answer 1

3

You could use a column_property attribute to make the model "local" to Smart, at the expense of having a correlated subquery:

class Drives(Base):
    __tablename__ = 'drives'
    id = Column(Integer, primary_key=True)
    model = Column(String)

class Smart(Base):
   __tablename__ = 'smart'
   drive = Column(Integer, ForeignKey(Drives.id), primary_key=True)
   ts = Column(DateTime, primary_key=True)
   value = Column(Integer)
   drives = relationship(Drives)
   model = column_property(select([Drives.model]).where(Drives.id == drive))
   __mapper_args__ = {'polymorphic_on': model}

class ModelASmart(Smart):
    __mapper_args__ = {'polymorphic_identity': 'ModelA'}
    @hybrid_property
    def actual_value(self):
        return self.value * 2

class ModelBSmart(Smart):
    __mapper_args__ = {'polymorphic_identity': 'ModelB'}
    @hybrid_property
    def actual_value(self):
        return self.value * 3

The column property will always be included in your queries, which means that the correlated subquery will probably determine query performance.

SQLAlchemy has other means of introducing attributes over relationships as well, such as association proxies and hybrid properties, but those cannot be used as a polymorphic_on discriminator. Yet another, a bit more exotic, possibility would be to map Smart over a join between smart and drives table.


Another option would be to forgo using inheritance, and use a plain hybrid property on Smart instead:

class Drives(Base):
    __tablename__ = 'drives'
    id = Column(Integer, primary_key=True)
    model = Column(String)

class Smart(Base):
   __tablename__ = 'smart'
   drive = Column(Integer, ForeignKey(Drives.id), primary_key=True)
   ts = Column(DateTime, primary_key=True)
   value = Column(Integer)
   drives = relationship(Drives)
   _model_coeff = {
       'ModelA': 2,
       'ModelB': 3,
   }
   @hybrid_property
   def actual_value(self):
       return self.value * self._model_coeff[self.drives.model]
   @actual_value.expression
   def actual_value(cls):
       return cls.value * case(
           cls._model_coeff,
           value=select([Drives.model]).
                 where(Drives.id == cls.drive).
                 as_scalar())

This uses the "shorthand" format of case() to map the lookup dict to an SQL CASE expression. A query such as:

session.query(Smart, Smart.actual_value)

will use the correlated subquery to choose between coefficients, but there is another option as well using eager loading:

session.query(Smart).options(joinedload(Smart.drives, innerjoin=True))

This way the related Drives instance is loaded in the same query, so the hybrid property does not need to perform a fetch when accessed on an instance:

# `s` is an instance from the previous eager loading query. This access
# will not fire additional queries.
s.actual_value
Sign up to request clarification or add additional context in comments.

3 Comments

I tested the column_property approach mentioned above and can confirm that it works. I was a little lost figuring out the construct of column_property and this answer was helpful. I find the second approach interesting for my use case. Once I understand the intricacies of the application, I may choose hybrid_property instead of inheritance.
Ilja, If we expect the Smart table in the above example to have 500 million rows, do you have thoughts on which of the two approaches would work better?
The inheritance and using the expression of the hybrid approach are very similar in that both use a correlated subquery, which I'm sure will determine the performance. It'll depend on how well your DBMS optimizes it. The hybrid approach offers another approach as well, though: you can fetch Smart instances while eager loading Smart.drives relationship. Since it is a many-to-one relationship, joined eager loading is going to perform well, especially since you can set innerjoin=True, as Smart.drive is (part of) primary key.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.