
With SQLAlchemy, is it possible to build a query that will update only the first matching row?

In my case, I need to update the most recent log entry:

from sqlalchemy import Column, Integer, Boolean
from sqlalchemy.ext.declarative import declarative_base

Base = declarative_base()

class Log(Base):
    __tablename__ = 'logs'
    id = Column(Integer, primary_key=True)
    #...
    analyzed = Column(Boolean)

session.query(Log)  \
    .order_by(Log.id.desc())  \
    .limit(1)  \
    .update({ 'analyzed': True })

Which results in:

InvalidRequestError: Can't call Query.update() when limit() has been called

It makes sense, since UPDATE ... LIMIT 1 is a MySQL-only feature (a MySQL-specific solution is given here).

But how would I do the same with PostgreSQL? Possibly, using the subquery approach?

1 Comment

The best solution depends on whether each concurrent transaction should update the same first row according to ORDER BY, or the next row, not yet locked, or a single, random / arbitrary row matching some criteria.

3 Answers


The subquery recipe is the right way to do it; now we only need to build this query with SQLAlchemy.

Let's start with the subquery:

# ssn is the SQLAlchemy Session; FOR UPDATE locks the selected row
sq = ssn.query(Log.id)  \
    .order_by(Log.id.desc())  \
    .limit(1)  \
    .with_for_update()

Now use it as a scalar subquery via as_scalar(), following the example from the update() docs:

from sqlalchemy import update

q = update(Log)  \
    .values({'analyzed': True})  \
    .where(Log.id == sq.as_scalar())

Print the query to have a look at the result:

UPDATE logs 
SET analyzed=:analyzed 
WHERE logs.id = (
    SELECT logs.id 
    FROM logs ORDER BY logs.id DESC 
    LIMIT :param_1 
    FOR UPDATE
)
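
To actually apply it, execute the statement through the session and commit. A minimal sketch, assuming ssn is the same Session as above (newer SQLAlchemy versions spell Query.as_scalar() as scalar_subquery()):

# Run the UPDATE built above and persist the change.
ssn.execute(q)
ssn.commit()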

Enjoy!


2 Comments

... and if you're trying to use it for queuing, keep concurrency in mind. It's not atomic, so multiple sessions can grab and update the same row.
That's a little unfortunate, because a subquery isn't strictly needed in plenty of RDBMSs. Is there a way to force SQLAlchemy to limit an update?

To prevent the same row from being updated multiple times, add:

WHERE analyzed <> :analyzed

Or if NULL values are allowed:

WHERE analyzed IS DISTINCT FROM :analyzed

Add the same condition to the outer UPDATE as well, which is almost always a good idea in any case to avoid empty updates.
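
Translated back to the SQLAlchemy query from the accepted answer, that could look roughly like this (a sketch; is_distinct_from() is the column operator SQLAlchemy provides for IS DISTINCT FROM, and the filter is repeated in both the subquery and the outer UPDATE):

# Subquery: newest row that is not yet analyzed, locked FOR UPDATE.
sq = ssn.query(Log.id)  \
    .filter(Log.analyzed.is_distinct_from(True))  \
    .order_by(Log.id.desc())  \
    .limit(1)  \
    .with_for_update()

# The outer UPDATE repeats the condition to avoid empty updates.
q = update(Log)  \
    .values({'analyzed': True})  \
    .where(Log.id == sq.as_scalar())  \
    .where(Log.analyzed.is_distinct_from(True))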

Concurrent transactions that are blocked by the ROW SHARE lock from FOR UPDATE wake up as soon as the first transaction finishes. Since the changed row no longer passes the WHERE condition, the subquery returns no row and nothing happens, while later transactions lock a new row to update.

You could use advisory locks to always update the next unlocked row without waiting. More details here:

There is also the related PGQ to implement queues. (Never used it myself).
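
A rough sketch of the advisory-lock idea, using PostgreSQL's pg_try_advisory_xact_lock() via SQLAlchemy's func (only illustrative; the lock-key scheme and exact filtering would need tailoring to the workload):

from sqlalchemy import func, update

# Pick the newest not-yet-analyzed row whose advisory lock can be taken
# immediately; rows already locked by other transactions are skipped
# instead of waited on. The lock is released at transaction end.
sq = ssn.query(Log.id)  \
    .filter(Log.analyzed.is_distinct_from(True))  \
    .filter(func.pg_try_advisory_xact_lock(Log.id))  \
    .order_by(Log.id.desc())  \
    .limit(1)

q = update(Log)  \
    .values({'analyzed': True})  \
    .where(Log.id == sq.as_scalar())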

Comments


My DB won't accept a LIMIT inside the subquery, so I ended up using something like this:

# Step 1: fetch the id of the newest row.
log_id = session.query(Log.id)  \
    .order_by(Log.id.desc())  \
    .limit(1)
log_id = [log.id for log in log_id]
# Step 2: delete by primary key in a second query.
session.query(Log).filter(Log.id.in_(log_id)).delete()
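
For the original update case, the same two-step idea might look like this (a sketch using the session and Log model from the question):

# Step 1: fetch the id of the most recent log entry.
latest_id = session.query(Log.id)  \
    .order_by(Log.id.desc())  \
    .limit(1)  \
    .scalar()

# Step 2: update that row by primary key, if one exists.
if latest_id is not None:
    session.query(Log)  \
        .filter(Log.id == latest_id)  \
        .update({'analyzed': True})
    session.commit()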

Comments
