
I have a long-running AWS Lambda function that I invoke from my web app. Following the documentation [1], it works fine. My problem is that this particular Lambda function does not return anything to the application: its output is saved to S3, and it runs for a long time (20-30 s). Is there a way to trigger the Lambda without waiting for the return value? I don't want to block my app while the Lambda is running. Right now I use an ExecutorService as a queue to execute Lambda requests, since I have to wait for each invocation; when the app crashes or restarts, I lose the jobs that are waiting to be executed.

[1] https://aws.amazon.com/blogs/developer/invoking-aws-lambda-functions-from-java/

  • If something is asynchronous, why not use SQS/Kinesis/MSK? Commented May 3, 2021 at 13:50
  • Similarly, you can decouple Java from calling the function: trigger the Lambda from one of those services by producing a message, in any client language, to the respective queue/shard/topic. Commented May 3, 2021 at 13:52
  • @OneCricketeer I would like to keep moving parts to a minimum and not rely on AWS services too much, so that it is easier to test and develop locally. Commented May 3, 2021 at 13:56
  • Look at the localstack project for integration testing outside of mocking or the AWS ecosystem. If you use MSK, you can run Kafka locally. Commented May 3, 2021 at 13:59
  • Otherwise, you can fire-and-forget a thread that calls the Lambda, but then you need to persist each execution event to track whether those calls actually completed, or constantly poll S3 for updates. Commented May 3, 2021 at 14:01

1 Answer


Tracking status is not necessarily difficult. Use a simple S3 "file exists" call after each job execution to find out whether the Lambda is done.
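A minimal sketch of that check, assuming a hypothetical `outputExists` callback that wraps the actual S3 lookup (for example a HeadObject request through whichever AWS SDK version you use). The class name, parameters, and retry policy are illustrative and not from the question:

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

/** Polls an "output exists" check until it succeeds or the attempts run out. */
final class OutputPoller {

    /**
     * @param outputExists hypothetical check, e.g. a wrapper around an S3 "object exists" call
     * @param maxAttempts  how many times to check before giving up
     * @param delay        pause between two checks
     * @return true if the output appeared within the allowed attempts
     */
    static boolean waitForOutput(BooleanSupplier outputExists, int maxAttempts, Duration delay) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (outputExists.getAsBoolean()) {
                return true;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(delay.toMillis());
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and stop polling.
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }
}
```

For a 20-30 s job, something like `OutputPoller.waitForOutput(check, 20, Duration.ofSeconds(3))` run on a background thread would cover the expected run time without blocking request handling.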

However, as you've pointed out, you might lose job information at some point. To avoid this, you need a persistence layer outside your JVM. A KV store would work: store some (timestamp, jobId, status) fields in a database, check them periodically from your web server, and update them only from the Lambda.
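The shape of such a record can be sketched as follows. `JobStore` is a hypothetical interface, the status values are assumptions, and the in-memory implementation exists only to make the sketch runnable; it does not survive a restart, so a real version would be backed by a database or KV store outside the JVM:

```java
import java.time.Instant;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Assumed set of states; adjust to whatever the job lifecycle needs. */
enum JobStatus { PENDING, RUNNING, DONE, FAILED }

/** One (timestamp, jobId, status) row. */
final class JobRecord {
    final String jobId;
    final Instant updatedAt;
    final JobStatus status;

    JobRecord(String jobId, Instant updatedAt, JobStatus status) {
        this.jobId = jobId;
        this.updatedAt = updatedAt;
        this.status = status;
    }
}

/** Hypothetical persistence interface: the web server reads, the job writes. */
interface JobStore {
    void put(JobRecord record);
    Optional<JobRecord> get(String jobId);
}

/** In-memory stand-in for illustration only; NOT durable across restarts. */
final class InMemoryJobStore implements JobStore {
    private final Map<String, JobRecord> rows = new ConcurrentHashMap<>();

    @Override
    public void put(JobRecord record) {
        // The latest write for a jobId replaces the previous row.
        rows.put(record.jobId, record);
    }

    @Override
    public Optional<JobRecord> get(String jobId) {
        return Optional.ofNullable(rows.get(jobId));
    }
}
```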

Alternatively, to reduce the end-to-end time further, a queuing mechanism would be better (unless you also want the full history of jobs, though that can be built alongside the queue). As mentioned in the comments, AWS offers many built-in solutions that can be used directly with Lambda; otherwise, you need additional infrastructure such as RabbitMQ or Redis to build a task event bus.

With that, Lambda becomes optional. You'd effectively pull events off the queue periodically into a worker queue, whose workers can either be very dumb passthroughs that invoke the Lambda or do the work themselves directly. Combine this with ECS/EKS/EC2 autoscaling and it might actually run faster than Lambda, since you can scale in and out based on queue size. The workers then write output events to a success/error notification "channel" after the S3 file is written.

Back in the web server, you'll have to modify your code to listen asynchronously for messages from that channel; when you get a success message, you'll know that you should be able to access the S3 resources.
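The worker and the notification channel can be sketched with plain `java.util.concurrent` queues. This is only an in-process illustration: the two `BlockingQueue` instances stand in for a durable queue and channel (SQS, RabbitMQ, Redis, and so on), and `JobResult`, `JobWorker`, `ResultListener`, and the `task` callback are hypothetical names, with `task` representing either a Lambda invocation or the work itself:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Message published to the notification channel once a job has finished. */
final class JobResult {
    final String jobId;
    final boolean success;

    JobResult(String jobId, boolean success) {
        this.jobId = jobId;
        this.success = success;
    }
}

/** Takes job ids off the job queue, runs the task, and publishes success or error. */
final class JobWorker implements Runnable {
    private final BlockingQueue<String> jobs;
    private final BlockingQueue<JobResult> results;
    private final Consumer<String> task; // hypothetical: invoke the Lambda or do the work

    JobWorker(BlockingQueue<String> jobs, BlockingQueue<JobResult> results, Consumer<String> task) {
        this.jobs = jobs;
        this.results = results;
        this.task = task;
    }

    @Override
    public void run() {
        try {
            while (!Thread.currentThread().isInterrupted()) {
                String jobId = jobs.take();
                boolean ok = true;
                try {
                    task.accept(jobId);
                } catch (RuntimeException e) {
                    ok = false; // report the failure instead of killing the worker
                }
                results.put(new JobResult(jobId, ok));
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // shut down quietly
        }
    }
}

/** Web-server side: waits for the next message on the notification channel. */
final class ResultListener {
    /** Returns the next result, or null if none arrives within the timeout. */
    static JobResult await(BlockingQueue<JobResult> results, long timeoutMillis) {
        try {
            return results.poll(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return null;
        }
    }
}
```

In the web server, a background thread would call `ResultListener.await` in a loop and read the S3 output only for results whose `success` flag is true. Swapping the in-memory queues for a durable service is what actually protects waiting jobs from an app crash or restart.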
