
I have a Lambda that gets thousands of events sent to it at one time. Concurrency is left at the default, which means AWS will spin up several instances to handle the incoming events. The Lambda takes each event and inserts a record into a database if that record doesn't already exist. The Lambda is written in Node.js and uses Knex to connect to a Postgres database.

The Lambda essentially contains this logic:

Does a record with ID X exist?     
a. Yes: do nothing
b. No: create a new record with ID X.

The problem is that when 50 Lambdas spin up at the same time, they'll enter a race condition where, say, 3 or 4 of them will check for the existing record at the same time (or within microseconds of each other) and not find it, therefore inserting multiple, duplicate records.

I know one way to solve this would be to create a unique constraint on the table to prevent multiple records with ID X. Then my logic would look like this:

Does a record with ID X exist? 
a. Yes: do nothing 
b. No: create a new record with ID X.
   b.1. Did that succeed?
      a. Yes: continue on.
      b. No, it threw a unique constraint error: go back to the first step.

This seems a bit contrived, but should work. Is there a better option?

EDIT:

Here is the actual code:

let location = await Location.query().where({ external_id }).first();
if(!location){
    location = await Location.query().insert({
        name,
        external_id
    });
}
  • I think this question is not necessarily specific to aws-lambda or postgresql; it generalizes to consistency models and which route to take. Jepsen has some diagrams to help explain the strengths of each model. Commented Apr 19, 2019 at 5:53
  • And perhaps a good first read would be with: Correctness. This link has more drawings to help explain. Commented Apr 19, 2019 at 5:58
  • @JaromandaX I have added the actual code. It literally does what I explained, but I hope you're satisfied. Commented Apr 19, 2019 at 6:00
  • Some databases let you perform an "upsert" operation, which means "update if found, otherwise insert". Commented Apr 19, 2019 at 6:06
  • @RichS that's an interesting idea. I think Postgres does support that, but I'm not sure Knex does. I will investigate. Thanks for the idea. Commented Apr 19, 2019 at 6:09

1 Answer


Code like this:

Does a record with ID X exist?      
a. Yes: do nothing 
b. No: create a new record with ID X.

without some form of locking in the database is a race condition. Between querying for record X and creating it, another request can create it too. Don't ever do it this way.

You have to look at the specific tools your database offers, but a common way to execute the above sequence is to set up the database so that it doesn't allow duplicates for ID X (for example, with a unique constraint) and then just attempt to create the record with ID X. The record will then atomically either be created or the insert will return an error, so there is no opportunity for a race condition. You just check for the error and handle it.
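A minimal sketch of that insert-first approach, using the plain Knex query builder rather than the model class from the question. It assumes a table named `locations` with a UNIQUE constraint on `external_id`, and that the driver error carries the Postgres SQLSTATE in `err.code` (node-postgres does this). The function name `findOrCreateLocation` is hypothetical.

```javascript
// Postgres SQLSTATE for unique_violation.
const PG_UNIQUE_VIOLATION = '23505';

// Assumption: table `locations` has a UNIQUE constraint on external_id.
// `knex` is an already-configured Knex instance for Postgres.
async function findOrCreateLocation(knex, { name, external_id }) {
  try {
    // Attempt the insert first; the constraint makes this atomic.
    const [created] = await knex('locations')
      .insert({ name, external_id })
      .returning('*');
    return created;
  } catch (err) {
    // Anything other than a duplicate-key error is a real failure.
    if (err.code !== PG_UNIQUE_VIOLATION) throw err;
    // Another invocation won the race, so the row exists now.
    return knex('locations').where({ external_id }).first();
  }
}
```

The select only runs when another invocation inserted the row first, and in both cases the caller gets back the full row, including its ID.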


5 Comments

Thank you. So your recommendation is to do as I suggested and create a unique constraint, letting an insert fail, and then falling back to trying to select the record a second time, when it should now exist?
@user2719094 - Generally yes. But I don't know what you mean by selecting it a second time. If you're trying to create or update the record, then many databases have an atomic operation for exactly that (sometimes called upsert), which will do an insert or an update.
@jfriend00 yeah, I realize that probably wasn't clear. I really need to get the row ID for the inserted record so I can use it somewhere else, so if the insert fails, I still need to retry the select so I can get the ID I need.
Long version: Postgres UPSERT. Short version: INSERT ... ON CONFLICT ... DO UPDATE
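A hedged sketch of the upsert route that also returns the row ID in one statement, sent through `knex.raw`. The table and column names (`locations`, `name`, `external_id`, `id`) and the function name `upsertLocationId` are assumptions, and the statement requires a unique constraint or index on `external_id`. `DO UPDATE` is used rather than `DO NOTHING` because `RETURNING` yields no row when a conflicting insert is skipped.

```javascript
// Assumption: `locations` has a unique constraint or index on external_id.
const UPSERT_LOCATION_SQL = `
  INSERT INTO locations (name, external_id)
  VALUES (?, ?)
  ON CONFLICT (external_id) DO UPDATE SET name = EXCLUDED.name
  RETURNING id
`;

// `knex` is an already-configured Knex instance for Postgres.
async function upsertLocationId(knex, { name, external_id }) {
  // With the pg driver, knex.raw resolves to the driver result,
  // whose `rows` array holds the RETURNING output.
  const result = await knex.raw(UPSERT_LOCATION_SQL, [name, external_id]);
  return result.rows[0].id;
}
```

Note that this overwrites `name` on an existing row, which may or may not be wanted; if existing rows must stay untouched, catching the unique constraint error and selecting is the safer choice.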
