
I have a Lambda function that gets triggered every time a file is written to an S3 bucket. My understanding is that every time a single file arrives (a likely scenario, rather than files being sent in batches), an event notification fires, my function is invoked, and I am charged for that invocation. My question is: can I batch multiple files so that an invocation only happens when, for example, 10 files have accumulated? Is this good practice? My processing time should never exceed 15 minutes, so using Lambda is still fine.

Thank you

2 Answers


You can use SQS to decouple this scenario: have S3 send its event notifications to an SQS queue, make that queue the Lambda's trigger, and set whatever batch size you want on the event source mapping.
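To make the batching concrete, here is a minimal sketch of what the Lambda handler might look like when SQS is the trigger. With this setup, one invocation receives up to `batch size` messages, and each message body is the S3 event notification that S3 delivered to the queue. The bucket and key names are placeholders, and the handler only parses the batch; the actual per-file work is up to you.

```python
import json

def handler(event, context):
    """Hypothetical handler for a Lambda triggered by SQS.

    Each SQS record's body is assumed to be an S3 event
    notification (the JSON S3 sends when a file is created).
    """
    files = []
    for record in event.get("Records", []):          # one entry per SQS message
        body = json.loads(record["body"])
        for s3_record in body.get("Records", []):    # one entry per S3 event
            bucket = s3_record["s3"]["bucket"]["name"]
            key = s3_record["s3"]["object"]["key"]
            files.append((bucket, key))
    # all files in the batch are processed in this single invocation
    return {"processed": len(files)}
```

So instead of 10 invocations for 10 files, you pay for one invocation that sees all 10 records in `event["Records"]`.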



2 Comments

Thank you, this is a good point indeed. I was trying to avoid implementing SQS at this stage, since I will only be getting files weekly; however, I do not yet know whether they will all be sent at once, or whether I will be covering multiple third-party producers. I understand that I cannot really batch once a file lands in the bucket, so it seems I need something in the middle like SQS.
Yes, it is a good AWS service for decoupling components. Even if you don't want to make SQS the trigger for the Lambda, you can have CloudWatch schedule the Lambda with a cron expression and pull the messages from SQS in bulk.
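A rough sketch of that pull-in-bulk variant, assuming boto3's SQS client (in a real Lambda you would pass `boto3.client("sqs")` and your queue URL; both are left as parameters here so nothing is hard-coded):

```python
def drain_queue(sqs_client, queue_url, max_batches=10):
    """Pull up to max_batches * 10 messages from an SQS queue.

    sqs_client is expected to expose the boto3 SQS client API
    (receive_message / delete_message_batch).
    """
    messages = []
    for _ in range(max_batches):
        resp = sqs_client.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=10,   # SQS maximum per receive call
            WaitTimeSeconds=1,        # short poll; raise for long polling
        )
        batch = resp.get("Messages", [])
        if not batch:                 # queue drained, stop early
            break
        messages.extend(batch)
        # delete what we received so it is not redelivered
        sqs_client.delete_message_batch(
            QueueUrl=queue_url,
            Entries=[
                {"Id": m["MessageId"], "ReceiptHandle": m["ReceiptHandle"]}
                for m in batch
            ],
        )
    return messages
```

The scheduled Lambda would call `drain_queue(...)` once per run and process everything it returns, trading event-driven latency for fewer, larger invocations.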
  • 1 - One solution is to group your files into a single archive (e.g. a zip) and put that into S3. That way, for multiple files, your Lambda is triggered only once.

  • 2 - The other solution, as kamprasad said, is to use SQS.

  • 3 - One last solution I can think of is to use a cron job to trigger the Lambda on your own schedule. Inside the Lambda, do the processing with threads to finish faster. Keep in mind you have to choose memory and timeout carefully in this scenario.

I've personally used the last solution quite frequently.
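The threaded processing in option 3 can be sketched as below. `process_file` is a placeholder for whatever per-file work you do (the real version would download from S3, parse, and so on); the point is that S3 work is network-bound, so a thread pool shortens wall-clock time inside Lambda's 15-minute limit:

```python
from concurrent.futures import ThreadPoolExecutor

def process_file(key):
    # placeholder for the real per-file work (download, parse, upload, ...)
    return key.upper()

def process_batch(keys, max_workers=8):
    """Fan per-file work out across a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(process_file, keys))
```

Note that more threads only help while the work is I/O-bound; the memory setting you pick also controls how much CPU the Lambda gets, which is why memory and timeout need tuning together.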

1 Comment

I would recommend against the cron-job trigger. It sounds like you are describing an event-driven workflow, which is more resilient to faults. With a cron job you are pulling events on a schedule instead of having them pushed when they occur, so you risk unnecessary invocations, or errors when upstream work has not finished yet.
