
I have a Flask REST API running on a gunicorn/nginx stack. A global SQLAlchemy session is set up once for each thread that the API runs on. I set up an endpoint, /test/, that runs the unit tests for the API. One test makes a POST request to add something to the database, then uses a finally: clause to clean up:

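import requests

def test_something():
    try:
        url = "http://myposturl"
        data = {"content": "test post"}
        headers = {'content-type': 'application/json'}
        # POST the test object through the API, i.e. through whichever worker picks it up
        result = requests.post(url, json=data, headers=headers).json()
        validate(result, myschema)
    finally:
        # Clean up directly in the database, using the session of the thread running the tests
        db.sqlsession.query(MyTable).filter(MyTable.content == "test post").delete()
        db.sqlsession.commit()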

The problem is that the thread that handled the POST request now has a "test post" object in its session, but the database has no such object, because the thread on which the tests ran deleted it. So when I make a GET request to the server, about 1 in 4 times (I have 4 gunicorn workers) I get the "test post" object, and 3 in 4 times I do not. The threads each have their own session object and those sessions are getting out of sync, but I don't really know what to do about it.

Here is my setup for my SQLAlchemy session:

def connectSQLAlchemy():
    import sqlalchemy
    import sqlalchemy.orm
    engine = sqlalchemy.create_engine(connection_string(DBConfig.USER, DBConfig.PASSWORD, DBConfig.HOST, DBConfig.DB))
    session_factory = sqlalchemy.orm.sessionmaker(bind=engine)
    Session = sqlalchemy.orm.scoped_session(session_factory)
    return Session()

# Create a global session for everyone
sqlsession = connectSQLAlchemy()

2 Answers

7

Please use Flask-SQLAlchemy if you're using Flask; it takes care of the lifecycle of the session for you.

If you insist on doing it yourself, the correct pattern is to create a session for each request instead of having a global session. You should be doing

Session = scoped_session(session_factory, scopefunc=flask._app_ctx_stack.__ident_func__)
return Session

instead of

Session = scoped_session(session_factory)
return Session()

And do

session = Session()

every time you need a session. By virtue of the scoped_session and the scopefunc, this will return you a different session in each request, but the same session in the same request.
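As a rough sketch of that behaviour (not taken from the question's code): the in-memory SQLite URL and the handle_request function below are assumptions for illustration only. It shows that a scoped_session hands back the same session within a scope, and that calling Session.remove() at the end of a request makes the next request start with a fresh session.

```python
from sqlalchemy import create_engine
from sqlalchemy.orm import scoped_session, sessionmaker

# Assumption: in-memory SQLite stands in for the real connection string.
engine = create_engine("sqlite:///:memory:")
Session = scoped_session(sessionmaker(bind=engine))


def handle_request():
    """Hypothetical request handler: use the scoped session, then discard it."""
    session = Session()
    try:
        # Repeated calls within the same scope return the same session object.
        assert Session() is session
        return session
    finally:
        # Close the session and drop it from the registry for this scope.
        Session.remove()


first = handle_request()
second = handle_request()
print(first is second)  # False: the second "request" got a new session
```

Without the Session.remove() call, both calls would return the same long-lived session, which is the situation described in the question.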


9 Comments

I did what you said, and I put self.session = db.Session() in the init function for a new APIResource(Resource) class, but I'm having the same problem... I add something to the database and delete it in separate threads, but the adding thread remembers the thing.
@Scott You're saying you have multiple threads within the same request?
No. Just the gunicorn threads. I think the problem with this solution is that I need to tear down the session at the end of the request somewhere.
@Scott Well, yes. You weren't doing that before? Again, I highly recommend using flask-sqlalchemy, as it takes care of that for you.
@univerio I'm having trouble understanding what the effective difference is between scoped_session(session_factory) and scoped_session(session_factory, scopefunc=flask._app_ctx_stack.__ident_func__). In the former, we are using thread-local storage: each thread that calls Session() will get its own session. Each request to the Flask app will create a new thread and thus get its own session. Why isn't this effectively the same as the latter?
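One way to look at the last question is with a toy model of a scoped registry, written here with the standard library only. This is an illustration, not SQLAlchemy's implementation; ScopedRegistry and serve_two_requests are hypothetical names. It assumes a worker thread that handles several requests one after another (as a gunicorn sync worker does), so "a new thread per request" does not hold. The scope key only decides who shares an object; it does not end the object's life when a request finishes.

```python
import threading


class ScopedRegistry:
    """Toy stand-in for a scoped registry: one object per scope key."""

    def __init__(self, factory, scopefunc):
        self._factory = factory
        self._scopefunc = scopefunc
        self._objects = {}

    def __call__(self):
        key = self._scopefunc()
        if key not in self._objects:
            self._objects[key] = self._factory()
        return self._objects[key]

    def remove(self):
        self._objects.pop(self._scopefunc(), None)


def serve_two_requests(registry, cleanup):
    """Simulate one worker thread handling two requests back to back."""
    seen = []
    for _ in range(2):
        seen.append(registry())  # what Session() would return in a view
        if cleanup:
            registry.remove()    # what a teardown hook would do
    return seen


# The scope key is the thread id, as with a default thread-local scope.
no_cleanup = serve_two_requests(ScopedRegistry(object, threading.get_ident), cleanup=False)
with_cleanup = serve_two_requests(ScopedRegistry(object, threading.get_ident), cleanup=True)

print(no_cleanup[0] is no_cleanup[1])      # True: the second request reuses the first one's object
print(with_cleanup[0] is with_cleanup[1])  # False: each request got a fresh object
```

Under these assumptions, whichever scope function is used, the session still has to be removed at the end of each request, which matches the earlier comments about tearing the session down.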
-1

Figured it out. What I did was to add a setup and teardown to the request in my app's __init__.py:

@app.before_request
def startup_session():
    db.session = db.connectSQLAlchemy()

@app.teardown_request
def shutdown_session(exception=None): 
    db.session.close()

I'm still using the global session object in my db module:

db.py:

....
session = None
....

The scoped_session handles the different threads, I think...

Please advise if this is a terrible way to do this for some reason. =c)

4 Comments

scoped_session does not handle multiple threads like this. As soon as you have multiple threads you're going to run into trouble.
You need to connect per process (e.g. 4 processes = 4 DB connections) and open sessions per request (1 request = 1 new Session).
@alextsil "connect in a per-process fashion" so db.connectSQLAlchemy() creates both a connection AND a session?
@Him No, you need to run the create_engine inside an @app.before.server_start listener and then create new session inside an @app.before_request listener. (The naming is not precise but you get the idea.) This way each worker will get its own connection pool and each request will get its own session.
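A minimal sketch of what the last comment describes, under stated assumptions: the engine is created at module level, so each gunicorn worker builds its own engine and connection pool when it imports the app (this holds when the app is not preloaded), and Flask's before_request and teardown_request hooks open and close one session per request. The /ping route, the in-memory SQLite URL, and the names SessionFactory, open_session and close_session are hypothetical.

```python
from flask import Flask, g, jsonify
from sqlalchemy import create_engine, text
from sqlalchemy.orm import sessionmaker

app = Flask(__name__)

# Assumption: in-memory SQLite stands in for the real connection string.
# One engine (and connection pool) per process.
engine = create_engine("sqlite:///:memory:")
SessionFactory = sessionmaker(bind=engine)


@app.before_request
def open_session():
    # One new session per request, stored on Flask's per-request `g`.
    g.session = SessionFactory()


@app.teardown_request
def close_session(exception=None):
    # Runs even if the view raised; close the session if one was opened.
    session = g.pop("session", None)
    if session is not None:
        session.close()


@app.route("/ping")
def ping():
    # Use the request's own session; nothing is shared across requests.
    value = g.session.execute(text("SELECT 1")).scalar()
    return jsonify(value=value)
```

Because the session lives on g rather than in a module-level variable, concurrent requests in the same process cannot overwrite each other's session, which is the risk with assigning to db.session in before_request.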
