From the Best Practices for Working with AWS Lambda Functions:

Take advantage of execution context reuse to improve the performance of your function. Initialize SDK clients and database connections outside of the function handler, [...]

I would like to apply this principle to my Lambda function, which currently opens and closes a database handle every time it is invoked. Take the following example:

def lambda_handler(event, context):
    # Open a connection to the database
    db_handle = connect_database()    

    # Do something with the database
    result = perform_actions(db_handle)  

    # Clean up, close the connection
    db_handle.close()       

    # Return the result
    return result    

From my understanding of the AWS documentation, the code should be optimized as follows:

# Initialize the database connection outside the handler
db_handle = connect_database()

def lambda_handler(event, context):
    # Do something with the database and return the result
    return perform_actions(db_handle)

This would result in the db_handle.close() method not being called, thus potentially leaking a connection.

How should I handle the cleanup of such resources when using AWS Lambda with Python?

2 Answers

1

Many people are looking for the same thing. I believe it is not possible at this time, but the issue can be handled from the database side.

Take a look at this one

0

The connection leak lasts only as long as the Lambda execution environment is alive; in other words, once the execution environment is destroyed, the connection will time out and be closed.

Whether a global connection object is worth implementing depends on your particular use case:
- how much of the total execution time is spent initializing the database connection
- how often your function is called
- how you handle database connection errors
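To put a number on the first point, a rough local measurement can help. The following is a minimal sketch, not a benchmark: `connect_database` and `perform_actions` are the placeholder names from the question, stubbed here with `time.sleep` so the example runs on its own, and `timed` and `measure_init_share` are hypothetical helpers introduced only for illustration.

```python
import time


def connect_database():
    # Hypothetical stand-in: replace with your real connection call.
    time.sleep(0.05)
    return object()


def perform_actions(db_handle):
    # Hypothetical stand-in for the real database work.
    time.sleep(0.01)
    return "ok"


def timed(func, *args):
    """Call func once and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


def measure_init_share():
    """Return the fraction of total time spent opening the connection."""
    db_handle, connect_seconds = timed(connect_database)
    _, work_seconds = timed(perform_actions, db_handle)
    return connect_seconds / (connect_seconds + work_seconds)


if __name__ == "__main__":
    print(f"connection setup share: {measure_init_share():.0%}")
```

If the share turns out to be small, or the function is invoked rarely enough that each call lands in a fresh execution environment anyway, keeping the open/close pair inside the handler may be the simpler choice.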

If you want a bit more control over the connection, you can try this approach, which recycles the database connection every two hours or when a database-related exception occurs:

import datetime

# Global object holding the database connection and its creation time.
# It survives between invocations while the execution environment is reused.
db_conn = {
    "db_handle": None,
    "init_dt": None
}

def lambda_handler(event, context):
    # Open a connection if there is none yet
    if not db_conn["db_handle"]:
        db_conn["db_handle"] = connect_database()
        db_conn["init_dt"] = datetime.datetime.now()
    # Do something with the database and return the result
    try:
        result = do_work(db_conn["db_handle"])
    except DBError:  # DBError stands for your driver's exception class
        # Drop the broken connection so the next invocation reconnects
        try:
            db_conn["db_handle"].close()
        except Exception:
            pass
        db_conn["db_handle"] = None
        return "db error occurred"
    # Recycle the connection once it is older than two hours
    if datetime.datetime.now() - db_conn["init_dt"] > datetime.timedelta(hours=2):
        db_conn["db_handle"].close()
        db_conn["db_handle"] = None
    return result

Please note I haven't tested the above on Lambda so you need to check it with your setup.
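One way to check the recycling logic before deploying is to run it locally against a fake connection and a controllable clock. The sketch below is an assumption-laden variant, not the exact code above: `FakeConnection` and `ConnectionCache` are hypothetical names, the clock is injected so no real waiting is needed, and the age check happens before the connection is handed out rather than after the work is done.

```python
import datetime

MAX_AGE = datetime.timedelta(hours=2)


class FakeConnection:
    """Hypothetical stand-in for a real database handle."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class ConnectionCache:
    """Holds one connection and recycles it once it is older than MAX_AGE."""

    def __init__(self, connect, now=datetime.datetime.now):
        self._connect = connect  # callable that opens a new connection
        self._now = now          # injectable clock, useful for tests
        self.handle = None
        self.created_at = None

    def get(self):
        # Close a connection that has outlived MAX_AGE, then (re)connect.
        if self.handle is not None and self._now() - self.created_at > MAX_AGE:
            self.discard()
        if self.handle is None:
            self.handle = self._connect()
            self.created_at = self._now()
        return self.handle

    def discard(self):
        # Close and forget the current connection; ignore close() failures.
        if self.handle is not None:
            try:
                self.handle.close()
            except Exception:
                pass
            self.handle = None


if __name__ == "__main__":
    clock = [datetime.datetime(2024, 1, 1)]
    cache = ConnectionCache(FakeConnection, now=lambda: clock[0])
    first = cache.get()
    assert cache.get() is first               # reused while still young
    clock[0] += datetime.timedelta(hours=3)   # simulate three hours passing
    second = cache.get()
    assert first.closed and second is not first
```

In a handler, the same object would live at module level and `discard()` would be called from the `except` branch, so a failed connection is replaced on the next invocation.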

3 Comments

The connection would indeed time out after a while (depending on the database settings), but that's not the same as cleaning up resources. My question is specifically about cleaning up resources (of any kind) before the execution context is cancelled.
I don't think there is an event hook for the destruction of the Lambda execution environment at the moment; you can try to reduce the leak duration by adjusting the recycle interval and the idle connection timeout setting for the database (idle_in_transaction_session_timeout for PostgreSQL, wait_timeout for MySQL, etc.). You can also put a connection pooler in between.
This solves the issue in the example, but doesn't answer the question.