
I am using an AWS Lambda function with Python and MongoDB Atlas. I ran the code below.

import json

from bson import json_util
from pymongo import MongoClient

# Created once at module level, outside the handler, so warm invocations reuse it.
client = MongoClient('mongodb+srv://app_user:[email protected]/test')
db = client.test

def lambda_handler(event, context):
    print("Received event: " + json.dumps(event, indent=1))
    user_profile = db.user_profile
    email = event['email']
    # A value without '@' is treated as a username, otherwise as an email address.
    field = "email" if "@" in email else "username"
    docs = list(user_profile.find({field: email}, {"name": 1, "photo": 1, "bio": 1}))
    # Round-trip through json_util so BSON types such as ObjectId become JSON-serializable.
    return json.loads(json.dumps(docs, default=json_util.default))

When this code runs, the number of MongoDB connections keeps growing until it reaches the maximum. I have heard of MongoDB connection pooling, but I don't understand how to implement it with PyMongo in a Lambda function. Can you please help me with a solution?

  • The simplest way is to put the DB connection code in its own file and call it in every handler before any operation runs (so once the request is validated and the DB connection is established, you go ahead with your business logic). The downside is that, because Lambda functions are tied directly to a handler, you need to call this file in every handler. Alternatively, you could create some kind of middleware that sets up the DB connection and have the actual API calls go through it. Either way, how to make use of connection pooling is covered in the answer below. Commented Nov 19, 2019 at 16:47
  • @srinivasy I understand the concept, but I don't understand how to implement it in code. Commented Nov 19, 2019 at 17:06
  • What do you mean by implementing it in code? Commented Nov 19, 2019 at 17:07
  • @srinivasy I mean that I don't understand how to write the code for MongoDB connection pooling. Commented Nov 19, 2019 at 17:21
  • As stated below, all you need to do is pass additional options to your connection; which options you include is your choice. I don't follow what you mean when you say you can't implement connection pooling in code. What exactly do you mean by implementing it? Commented Nov 19, 2019 at 17:33

1 Answer


The lines below are quoted from the link given at the end. They are a good explanation and a better starting point, so please read through them:

Every MongoClient instance has a built-in connection pool per server in your MongoDB topology. These pools open sockets on demand to support the number of concurrent MongoDB operations that your multi-threaded application requires. There is no thread-affinity for sockets.

The size of each connection pool is capped at maxPoolSize, which defaults to 100. If there are maxPoolSize connections to a server and all are in use, the next request to that server will wait until one of the connections becomes available.

The client instance opens one additional socket per server in your MongoDB topology for monitoring the server’s state.

For example, a client connected to a 3-node replica set opens 3 monitoring sockets. It also opens as many sockets as needed to support a multi-threaded application’s concurrent operations on each server, up to maxPoolSize. With a maxPoolSize of 100, if the application only uses the primary (the default), then only the primary connection pool grows and the total connections is at most 103. If the application uses a ReadPreference to query the secondaries, their pools also grow and the total connections can reach 303.
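The arithmetic in this example can be checked with a few lines of Python. This is only an illustrative sketch: the function max_total_connections and its parameter names are made up here and are not part of PyMongo.

```python
def max_total_connections(servers, max_pool_size, servers_queried):
    """Upper bound on the sockets one MongoClient opens to a topology.

    servers: number of servers in the topology (one monitoring socket each)
    max_pool_size: the client's maxPoolSize
    servers_queried: how many servers the application sends operations to
    """
    monitoring = servers
    pooled = servers_queried * max_pool_size
    return monitoring + pooled


# 3-node replica set, operations sent to the primary only
primary_only = max_total_connections(3, 100, 1)
# 3-node replica set, reads also sent to both secondaries
all_members = max_total_connections(3, 100, 3)
print(primary_only, all_members)  # 103 303
```

Note that this bound is per MongoClient. If many processes (or many Lambda execution environments) each hold their own client, the totals add up on the server side.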

It is possible to set the minimum number of concurrent connections to each server with minPoolSize, which defaults to 0. The connection pool will be initialized with this number of sockets. If sockets are closed due to any network errors, causing the total number of sockets (both in use and idle) to drop below the minimum, more sockets are opened until the minimum is reached.

The maximum number of milliseconds that a connection can remain idle in the pool before being removed and replaced can be set with maxIdleTimeMS, which defaults to None (no limit).
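The pool options can also be written into the connection string instead of being passed as keyword arguments. The sketch below only builds such a string with the standard library. The helper with_pool_options, the localhost address, and the chosen values are assumptions for illustration, and in the connection string the idle-time option is spelled maxIdleTimeMS.

```python
from urllib.parse import urlencode


def with_pool_options(base_uri, **options):
    """Append pool options to a MongoDB URI that has no query string yet."""
    return base_uri + "?" + urlencode(options)


uri = with_pool_options(
    "mongodb://localhost:27017/",  # placeholder host, not from the question
    maxPoolSize=10,
    minPoolSize=1,
    maxIdleTimeMS=60000,
)
print(uri)
# mongodb://localhost:27017/?maxPoolSize=10&minPoolSize=1&maxIdleTimeMS=60000
```

The resulting string would then be handed to MongoClient as its first argument.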

The default configuration for a MongoClient works for most applications:

client = MongoClient(host, port)

Create this client once for each process, and reuse it for all operations. It is a common mistake to create a new client for each request, which is very inefficient.
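Applied to AWS Lambda, "once for each process" roughly means once per execution environment: the client is created outside the handler and reused on warm invocations. The following is a minimal sketch under stated assumptions, not a definitive implementation. The MONGODB_URI environment variable, the get_client helper, its factory parameter, and the maxPoolSize value of 10 are all hypothetical. Since an execution environment processes one invocation at a time, a small pool is probably enough, and a lower maxPoolSize limits how many connections each environment can hold open.

```python
import os

_client = None  # module-level, so it survives between warm invocations


def _default_factory():
    # Imported lazily so the caching logic can be exercised without pymongo.
    from pymongo import MongoClient

    # MONGODB_URI is an assumed environment variable; 10 is an arbitrary cap.
    return MongoClient(os.environ["MONGODB_URI"], maxPoolSize=10)


def get_client(factory=_default_factory):
    """Return the shared client, creating it on first use only."""
    global _client
    if _client is None:
        _client = factory()
    return _client


def lambda_handler(event, context):
    db = get_client().test
    value = event["email"]
    # A value without '@' is treated as a username, as in the question.
    field = "email" if "@" in value else "username"
    # Excluding _id keeps the result free of ObjectId values.
    projection = {"_id": 0, "name": 1, "photo": 1, "bio": 1}
    return list(db.user_profile.find({field: value}, projection))
```

The handler never calls close() on the client, because closing it would discard the pooled sockets that the next invocation could have reused.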

To support extremely high numbers of concurrent MongoDB operations within one process, increase maxPoolSize:

client = MongoClient(host, port, maxPoolSize=200)

… or make it unbounded:

client = MongoClient(host, port, maxPoolSize=None)

Once the pool reaches its maximum size, additional threads have to wait for sockets to become available. PyMongo does not limit the number of threads that can wait for sockets to become available and it is the application’s responsibility to limit the size of its thread pool to bound queuing during a load spike. Threads are allowed to wait for any length of time unless waitQueueTimeoutMS is defined:

client = MongoClient(host, port, waitQueueTimeoutMS=100)

A thread that waits more than 100ms (in this example) for a socket raises ConnectionFailure. Use this option if it is more important to bound the duration of operations during a load spike than it is to complete every operation.
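The waiting behaviour described here can be imitated with the standard library alone. The toy class below is not PyMongo's pool and shares no code with it. TinyPool and its method names are invented for this sketch, and it raises TimeoutError where PyMongo raises ConnectionFailure.

```python
import threading


class TinyPool:
    """Toy pool: at most max_size checkouts at once, with a bounded wait."""

    def __init__(self, max_size, wait_timeout_s=None):
        self._slots = threading.BoundedSemaphore(max_size)
        self._wait_timeout_s = wait_timeout_s

    def checkout(self):
        # timeout=None blocks indefinitely, like leaving waitQueueTimeoutMS unset.
        if not self._slots.acquire(timeout=self._wait_timeout_s):
            raise TimeoutError("timed out waiting for a free socket")

    def checkin(self):
        self._slots.release()


pool = TinyPool(max_size=1, wait_timeout_s=0.1)
pool.checkout()  # takes the only slot
try:
    pool.checkout()  # pool exhausted: waits 100 ms, then gives up
    timed_out = False
except TimeoutError:
    timed_out = True
pool.checkin()
print(timed_out)  # True
```

With a real client, the equivalent failure would be caught by wrapping the operation in a try/except for pymongo.errors.ConnectionFailure.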

When close() is called by any thread, all idle sockets are closed, and all sockets that are in use will be closed as they are returned to the pool.

Ref: Connection pooling in pymongo
