
I have a caching problem when I use sqlalchemy.

I use sqlalchemy to insert data into a MySQL database. Then, I have another application process this data, and update it directly.

But SQLAlchemy always returns the old data rather than the updated data. I think SQLAlchemy has cached my query ... so how should I disable that?


9 Answers


The usual cause for people thinking there's a "cache" at play, besides the usual SQLAlchemy identity map which is local to a transaction, is that they are observing the effects of transaction isolation. SQLAlchemy's session works by default in a transactional mode, meaning it waits until session.commit() is called in order to persist data to the database. During this time, other transactions in progress elsewhere will not see this data.

However, due to the isolated nature of transactions, there's an extra twist. Those other transactions in progress will not only not see your transaction's data until it is committed, they also can't see it in some cases until they are committed or rolled back also (which is the same effect your close() is having here). A transaction with an average degree of isolation will hold onto the state that it has loaded thus far, and keep giving you that same state local to the transaction even though the real data has changed - this is called repeatable reads in transaction isolation parlance.

http://en.wikipedia.org/wiki/Isolation_%28database_systems%29
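The effect can be reproduced with nothing but the standard library's sqlite3 module (the table and data here are illustrative, not from the question): an uncommitted write is invisible to a second connection until the writer commits, with no cache involved.

```python
import os
import sqlite3
import tempfile

# Two connections to one database file stand in for the two applications
# in the question; the table and data are illustrative.
path = os.path.join(tempfile.mkdtemp(), "demo.db")
writer = sqlite3.connect(path)
reader = sqlite3.connect(path)

writer.execute("CREATE TABLE users (name TEXT)")
writer.commit()

# The writer inserts inside a transaction that is still open ...
writer.execute("INSERT INTO users VALUES ('alice')")
before = reader.execute("SELECT count(*) FROM users").fetchone()[0]

# ... and only after it commits does a fresh read see the row:
# transaction isolation, not a cache, was hiding the data.
writer.commit()
after = reader.execute("SELECT count(*) FROM users").fetchone()[0]
print(before, after)  # 0 1
```

The same reasoning applies to the ORM: until your own session's transaction ends (commit, rollback, or close), what other transactions have written may simply not be visible to it.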


4 Comments

"SQLAlchemy's session works by default in a transactional mode" --- can you show us a way to stop the default, please? I don't want explanations, I just want one line of code to disable transactions completely, especially for simple SELECT calls.
Actually there IS caching in SQLAlchemy (at least as of 2021). I faced this problem with the session.execute command. You can find information about caching here (search for the "cached since" string on the page): github.com/sqlalchemy/sqlalchemy/blob/master/doc/build/core/…
@AnarSalimkhanov Mind though, that the caching you are referring to is only a statement compilation cache. From your linked doc: it "is caching the SQL string that is passed to the database only, and not the data returned by a query. It is in no way a data cache and does not impact the results returned for a particular SQL statement nor does it imply any memory use linked to fetching of result rows."
@amain Hmm... interesting, because I really did have a problem with caching. Though the DB was updated, I kept getting old response data until I disabled it. I can't test it now, because it was in one of my old projects and I don't remember where it was.

This issue has been really frustrating for me, but I have finally figured it out.

I have a Flask/SQLAlchemy Application running alongside an older PHP site. The PHP site would write to the database and SQLAlchemy would not be aware of any changes.

I tried the sessionmaker setting autoflush=True without success. I tried db_session.flush(), db_session.expire_all(), and db_session.commit() before querying, and none of them worked; the query still showed stale data.

Finally I came across this section of the SQLAlchemy docs: http://docs.sqlalchemy.org/en/latest/dialects/postgresql.html#transaction-isolation-level

Setting the isolation_level worked great. Now my Flask app is "talking" to the PHP app. Here's the code:

engine = create_engine(
    "postgresql+pg8000://scott:tiger@localhost/test",
    isolation_level="READ UNCOMMITTED"
)

When the SQLAlchemy engine is started with the "READ UNCOMMITTED" isolation_level, it will perform "dirty reads", meaning it reads uncommitted changes directly from the database.

Hope this helps
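The dirty-read behaviour can be sketched with only the standard library's sqlite3 module in shared-cache mode (the database name and table are illustrative; on a server database like PostgreSQL or MySQL, the isolation_level argument above is the proper mechanism):

```python
import sqlite3

# Two connections to one shared-cache in-memory database stand in for the
# two applications in the question; the table and names are illustrative.
writer = sqlite3.connect("file:dirty_demo?mode=memory&cache=shared", uri=True)
reader = sqlite3.connect("file:dirty_demo?mode=memory&cache=shared", uri=True)

writer.execute("CREATE TABLE users (name TEXT)")
writer.commit()

# In SQLite's shared-cache mode, this PRAGMA enables dirty reads -
# roughly what READ UNCOMMITTED gives you on a server database.
reader.execute("PRAGMA read_uncommitted = 1")

writer.execute("INSERT INTO users VALUES ('alice')")  # still uncommitted

# The dirty reader already sees the uncommitted row.
dirty_count = reader.execute("SELECT count(*) FROM users").fetchone()[0]
print(dirty_count)
writer.commit()
```

As the comments below note, READ COMMITTED is usually enough when both applications commit promptly; dirty reads trade consistency for immediacy.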


Here is a possible solution courtesy of AaronD in the comments

from flask_sqlalchemy import SQLAlchemy

class UnlockedAlchemy(SQLAlchemy):
    def apply_driver_hacks(self, app, info, options):
        if "isolation_level" not in options:
            options["isolation_level"] = "READ COMMITTED"
        return super(UnlockedAlchemy, self).apply_driver_hacks(app, info, options)

7 Comments

If you are using Flask-SQLAlchemy, you can subclass flask.ext.sqlalchemy.SQLAlchemy and override the apply_driver_hacks function to set the isolation level, while still keeping all of the Flask integration. Also, probably isolation level READ COMMITTED is sufficient providing both applications are committing their writes after they make them and not waiting for a long time. That way you don't have to worry about dirty reads - it just gives you a fresh DB snapshot every time you read.
@AaronD Could you post your code to subclass flask.ext.sqlalchemy.SQLAlchemy as you mentioned?
I just have this in my code: class UnlockedAlchemy(SQLAlchemy): def apply_driver_hacks(self, app, info, options): if not "isolation_level" in options: options["isolation_level"] = "READ COMMITTED" return super(UnlockedAlchemy, self).apply_driver_hacks(app, info, options)
Lifesaver! I am using engine_from_config to read the sqlalchemy configuration from file and I simply added: sqlalchemy.isolation_level = READ UNCOMMITTED to my config file and external changes are now properly reflected in my app :-)
This does not make sense. If the transaction updating the database is properly committed (by the PHP site), why do you need to set the isolation level to "READ UNCOMMITTED"? It sounds more like a problem with how your PHP site is updating the database.

I tried session.commit() and session.flush(), but neither worked for me.

After going through the SQLAlchemy source code, I found the solution to disable caching.
Setting query_cache_size=0 in create_engine worked.

create_engine(connection_string, convert_unicode=True, echo=True, query_cache_size=0)

3 Comments

It's worth noting that the question and the other answers discuss apparent data caching, where retrieved data doesn't match the latest data in the database. query_cache_size controls the size of SQLAlchemy's cache of recently generated SQL queries as strings. It has no effect on query results, apart from potentially making them slower. It would of course affect memory usage.
Actually, despite the wise conversation above, this is the direct answer to the question title. The source code comments say: "Set to zero to disable caching." And it works: compare an experiment with caching disabled against one with the default caching.
@Dmitry no, that setting only disables statement caching, it has nothing to do with the caching of results.

In addition to zzzeek's excellent answer:

I had a similar issue. I solved the problem by using short living sessions.

from contextlib import closing

with closing(new_session()) as sess:
    # do your stuff

I used a fresh session per task, task group or request (in case of web app). That solved the "caching" problem for me.
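The pattern can be sketched with a stand-in session class (Session and new_session here are illustrative placeholders, not SQLAlchemy's API); contextlib.closing guarantees close() runs even if the task raises:

```python
from contextlib import closing

class Session:
    """Stand-in for a SQLAlchemy Session; illustrative only."""
    def __init__(self):
        self.closed = False

    def query(self):
        # A brand-new session starts a fresh transaction,
        # so it reads fresh state rather than a stale snapshot.
        return "fresh data"

    def close(self):
        self.closed = True

def new_session():
    return Session()

# One short-lived session per task: state cannot go stale across tasks.
with closing(new_session()) as sess:
    result = sess.query()
print(result, sess.closed)
```

Because each task gets its own session and closes it at the end, no long-lived transaction is left holding an old snapshot of the data.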

This material was very useful for me:

When do I construct a Session, when do I commit it, and when do I close it

2 Comments

The link above is going to the docs for session. The title implies it should be pointing here: docs.sqlalchemy.org/en/rel_0_8/orm/…

This was happening in my Flask application, and my solution was to expire all objects in the session after every request.

from flask.signals import request_finished
def expire_session(sender, response, **extra):
    app.db.session.expire_all()
request_finished.connect(expire_session, app)

Worked like a charm.

Comments


One thing that might be relevant for others: when creating the session and engine, pass expire_on_commit=True to the sessionmaker call.

sessionmaker(bind=engine, expire_on_commit=True)

I was experiencing a case where a SQLAlchemy relationship was not being refreshed after creation and committing (my relationship defines an order_by, but the secondary objects were not in the expected order), and this fixed the issue.

1 Comment

expire_on_commit is true by default, FWIW.

First, there is no result cache in SQLAlchemy. Depending on how you fetch data from the DB, you should run a test after the database is updated by the other application and see whether you get the new data.

(1) Use a Connection:

connection = engine.connect()
result = connection.execute("select username from users")
for row in result:
    print("username:", row['username'])
connection.close()

(2) Use the Engine directly ...
(3) Use MetaData ...

Please follow the steps in: http://docs.sqlalchemy.org/en/latest/core/connections.html

Another possibility is that your MySQL DB was not permanently updated. Restart the MySQL service and check.

1 Comment

Thanks for the reply. I have solved it. I just forgot session.close() when using scoped_session...

import sqlalchemy as db
db.create_engine(db_string, query_cache_size=0, pool_size=30, max_overflow=0, pool_pre_ping=True, echo=True, echo_pool=True)

This works in SQLAlchemy versions greater than 1.3.13; below that version, disabling the statement cache is not supported.

Comments


As far as I know, SQLAlchemy does not cache results, so you need to look at the logging output.

4 Comments

I think so. I turned on echo=True but got nothing useful.
I update the data without using SQLAlchemy (using MySQLdb directly), and I've confirmed the data has been updated in MySQL.
Try setting autocommit to True in your sessionmaker: sessionmaker(bind=self.engine, autocommit=True).
Thanks for the reply. I have solved it. I just forgot session.close() when using scoped_session. Faint...
