I am working on a project where we create a bunch of entries in Firestore from a Cloud Function, based on results from an API endpoint we do not control. The API endpoint returns ids, which we use as the document ids, but it does not include any timestamp information. Since we want to include a createdDate in our documents, we use admin.firestore.Timestamp.now() to set the timestamp of the document. On subsequent runs of the function some of the documents will already exist, so a batch that uses create will fail on commit. However, if we use a batch with update, we will either not be able to include a timestamp, or the existing timestamp will be overwritten. As a final requirement, we do update these documents from a web application to set properties such as a state, so we can't restrict the permissions on the documents to disallow updates completely.

What would be the best way to achieve this? I am currently using .create and have removed the batch, but I feel like this is less performant, and I occasionally do get the error Error: 4 DEADLINE_EXCEEDED on the firestore function.

First prize would be a batch that can create or update the documents, but does not edit the createdDate field. I'm also hoping to avoid reading the documents first to save a read, but I'd be happy to add it in if it's the best solution. Thanks!

Current code is something like this:

  const createDocPromise = docRef
    .create(newDoc)
    .then(() => {
      // success, do nothing
    })
    .catch(err => {
      if (
        err.details &&
        err.details.includes('Document already exists')
      ) {
        // doc already exists, ignore error
      } else {
        console.error(`Error creating doc`, err);
      }
    });

2 Answers

This might not be possible with batched writes: set() will overwrite the existing document, update() will overwrite the timestamp (and fails for documents that don't exist yet), and create() will throw an error if the document already exists, as you've mentioned. One workaround is to call create() for each document and pass the promises to Promise.allSettled(), which waits for every promise to settle and does not reject when some of them fail.

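// `db` is assumed to be the Firestore instance, e.g. admin.firestore()
const results = []; // results from the API

// create() rejects if the document already exists
const promises = results.map((r) => db.doc(`col/${r.id}`).create(r));

// allSettled() resolves once every promise has either fulfilled or rejected
const newDocs = await Promise.allSettled(promises);

// each status is either "fulfilled" or "rejected"
newDocs.forEach((result) => console.log(result.status));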

If a document already exists, create() will reject and the status for that entry will be "rejected". This way you don't have to read the documents first.
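To tell "already exists" apart from real failures (such as the DEADLINE_EXCEEDED from the question), the settled results can be classified afterwards. A minimal sketch: `summarizeCreateResults` is a hypothetical helper, and it assumes the rejected error carries the gRPC status code in `err.code` (6 is ALREADY_EXISTS), falling back to the `err.details` check used in the question.

```javascript
// gRPC status code for ALREADY_EXISTS; assumed to be what a rejected
// create() carries in `err.code` when the document is already present.
const ALREADY_EXISTS = 6;

// Hypothetical helper: splits Promise.allSettled() results into
// created, skipped (already existed) and genuinely failed writes.
function summarizeCreateResults(settled) {
  const summary = { created: 0, skipped: 0, failures: [] };
  for (const result of settled) {
    if (result.status === 'fulfilled') {
      summary.created += 1;
      continue;
    }
    const err = result.reason || {};
    const alreadyExists =
      err.code === ALREADY_EXISTS ||
      (typeof err.details === 'string' &&
        err.details.includes('Document already exists'));
    if (alreadyExists) {
      summary.skipped += 1; // expected on repeat runs, not an error
    } else {
      summary.failures.push(err); // anything else deserves a log or retry
    }
  }
  return summary;
}
```

It would be called as `summarizeCreateResults(await Promise.allSettled(promises))`, after which only `summary.failures` needs to be logged.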


Alternatively, you could store all the IDs in a single document (or in the Realtime Database), filter out the IDs you have already seen (this should only cost 1 read per invocation), and then add only the new data.
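The filtering step of that alternative could look roughly like this. It is only a sketch: `filterNewResults` is a hypothetical helper, and `knownIds` is assumed to be an array of ids that you have already read from the single index document.

```javascript
// Hypothetical helper: keeps only API results whose id is not yet known.
// `knownIds` is the array of ids read from the single index document.
function filterNewResults(results, knownIds) {
  const seen = new Set(knownIds);
  const fresh = [];
  for (const r of results) {
    if (!seen.has(r.id)) {
      seen.add(r.id); // also drops duplicates within the same API response
      fresh.push(r);
    }
  }
  return fresh;
}
```

Since every result returned by the helper is new, those documents can go into a normal batch with create(), and the index document can be updated in the same batch. Keep in mind that a single Firestore document is limited to 1 MiB, so this only works while the id list stays reasonably small.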

2 Comments

Thanks! I ended up with Promise.all. I didn't know that Promise.allSettled existed, but now I see I can use it if I target ES2020, so I think it would make sense to make that change. Do you know how the performance of this compares to using a transaction or batch? I would imagine it is slower without a batch.
@TimTrewartha yes, batched writes would be a bit faster than individually writing each document, but the difference shouldn't be noticeable given that a batched write also has a max limit of 500 operations per batch.
0

Since you prefer to keep the batch and you want to avoid reading the documents, a possible solution would be to store the timestamps in a field of type Array. So, you don't overwrite the createdDate field but save all the values corresponding to the different writes.
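As a rough sketch of the write side: the helper below queues one set() with merge per API result and appends the current timestamp with arrayUnion. The names `queueUpserts` and `createdDates` are hypothetical, and `db`, `batch`, `FieldValue` and `now` are assumed to be passed in by the caller (for example admin.firestore(), db.batch(), admin.firestore.FieldValue and admin.firestore.Timestamp.now()).

```javascript
// Hypothetical helper: queues one merge-write per API result on `batch`.
// `createdDates` is an illustrative field name for the timestamp array.
function queueUpserts(db, batch, FieldValue, results, now) {
  for (const r of results) {
    const ref = db.doc(`col/${r.id}`);
    batch.set(
      ref,
      // arrayUnion appends `now` instead of replacing earlier values
      { ...r, createdDates: FieldValue.arrayUnion(now) },
      // merge keeps fields that are not listed here (e.g. state) untouched
      { merge: true }
    );
  }
  return batch; // the caller still has to await batch.commit()
}
```

Because set() with merge works for both new and existing documents, the same batch covers the create and the update case. The trade-off is that the array grows by one entry on every run of the function.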

This way, when you read one of the documents you sort this array and take the oldest value: it is the very first timestamp that was saved and corresponds to the document creation.
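Picking the oldest value on the reading side might look like this. It is a small sketch with a hypothetical helper name, and it assumes each array entry exposes toMillis(), as Firestore Timestamp objects do.

```javascript
// Hypothetical helper: returns the earliest entry of a timestamp array,
// or null when the field is missing or empty.
function earliestTimestamp(timestamps) {
  if (!Array.isArray(timestamps) || timestamps.length === 0) return null;
  // A single pass is enough; no need to sort the whole array.
  return timestamps.reduce((min, t) => (t.toMillis() < min.toMillis() ? t : min));
}
```

On the client you would pass it the array field of the document you just read and treat the result as the creation date.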

This way you don't need any extra writes or extra reads.

2 Comments

I see, that option did cross my mind. Would you use a sentinel to push to the array? Could you still order on that field if it is an array? Our frontend application currently sorts the entries by createdDate descending.
Yes you can use FieldValue.arrayUnion to push to the Array. And you can, in the same Cloud Function, set the value of a single field that you will use for sorting if and only if the Array has one element (the first element being the creation date).
