
I have a file of more than 30 GB stored in S3, and I want to write a Lambda function that will access that file, parse it, and then run some algorithm on it. I am not sure my Lambda function can handle a file that big, since the maximum execution time for a Lambda function is 300 seconds (5 minutes). I found the S3 Transfer Acceleration feature, but will it help?

Considering this scenario, other than a Lambda function, can anyone suggest another service to host my code as a microservice and parse the file?

Thanks in Advance

  • EMR is well suited for this. Commented Jan 21, 2017 at 11:54

1 Answer


It depends entirely on your processing requirements and how often you need to run the job.

You can use Amazon EMR to parse the file and run the algorithm; depending on your requirements, you can terminate the cluster afterwards or keep it alive for frequent processing. https://aws.amazon.com/emr/getting-started/
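For a one-off job, a transient cluster that terminates itself when the step finishes is usually what you want. A minimal boto3 sketch, where the instance types, EMR release, and the Spark script URI are placeholders you would replace with your own:

```python
# Sketch: parameters for a transient EMR cluster that runs one Spark step
# over the S3 file and then shuts down. All names/URIs are placeholders.
def build_job_flow(log_uri, script_s3_uri):
    """Build the kwargs for boto3's emr.run_job_flow()."""
    return {
        "Name": "parse-large-s3-file",
        "ReleaseLabel": "emr-5.3.0",
        "LogUri": log_uri,
        "Instances": {
            "MasterInstanceType": "m4.large",
            "SlaveInstanceType": "m4.large",
            "InstanceCount": 3,
            # Terminate the cluster when the steps finish (one-off processing).
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        "Steps": [{
            "Name": "run-algorithm",
            "ActionOnFailure": "TERMINATE_CLUSTER",
            "HadoopJarStep": {
                "Jar": "command-runner.jar",
                "Args": ["spark-submit", script_s3_uri],
            },
        }],
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }

# To actually launch (requires credentials and boto3):
# emr = boto3.client("emr", region_name="us-east-1")
# emr.run_job_flow(**build_job_flow("s3://my-bucket/logs/", "s3://my-bucket/job.py"))
```

For frequent processing, flip `KeepJobFlowAliveWhenNoSteps` to `True` and submit steps to the running cluster instead.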

You can also try Amazon Athena (recently launched), which lets you parse and query files stored in S3 with SQL; the infrastructure is managed entirely by Amazon. http://docs.aws.amazon.com/athena/latest/ug/getting-started.html
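With Athena you define an external table over the file's S3 location and query it in place. A sketch assuming the file is delimited text (CSV); the database, table, column names, and bucket are placeholders:

```python
# Sketch: DDL for an Athena external table over a file in S3. Athena scans
# the data in place; no loading step. Column names/types are placeholders.
def build_create_table_ddl(database, table, s3_location):
    """Return CREATE EXTERNAL TABLE DDL for a comma-delimited file."""
    return (
        f"CREATE EXTERNAL TABLE {database}.{table} (\n"
        "    col1 string,\n"
        "    col2 string\n"
        ")\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        f"LOCATION '{s3_location}'"
    )

# Submitting the DDL (and later SELECT queries) with boto3:
# athena = boto3.client("athena")
# athena.start_query_execution(
#     QueryString=build_create_table_ddl("mydb", "bigfile", "s3://my-bucket/data/"),
#     ResultConfiguration={"OutputLocation": "s3://my-bucket/athena-results/"},
# )
```

Note that Athena fits parsing/aggregation expressible in SQL; an arbitrary algorithm may still need EMR or EC2.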

For complex processing-flow requirements, you can combine AWS services: AWS Data Pipeline to manage the flow, and AWS EMR or EC2 to run the processing tasks. https://aws.amazon.com/datapipeline/
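A Data Pipeline definition is a list of objects with key/value fields wiring resources to activities. A rough sketch of a minimal definition that runs an EMR step on demand; the field names follow the Data Pipeline object model, but the instance type, step string, and IDs here are illustrative placeholders:

```python
# Sketch: minimal pipeline objects for boto3's
# datapipeline.put_pipeline_definition(). All IDs/values are placeholders.
def build_pipeline_objects(step):
    """Pipeline objects: a default config, an EMR cluster, and one activity."""
    return [
        {"id": "Default", "name": "Default",
         "fields": [{"key": "scheduleType", "stringValue": "ondemand"}]},
        {"id": "MyCluster", "name": "MyCluster",
         "fields": [
             {"key": "type", "stringValue": "EmrCluster"},
             {"key": "masterInstanceType", "stringValue": "m4.large"},
             {"key": "coreInstanceCount", "stringValue": "2"},
         ]},
        {"id": "ParseStep", "name": "ParseStep",
         "fields": [
             {"key": "type", "stringValue": "EmrActivity"},
             {"key": "runsOn", "refValue": "MyCluster"},
             # EMR step string: "jar,arg1,arg2,..."
             {"key": "step", "stringValue": step},
         ]},
    ]

# Creating and activating the pipeline with boto3:
# dp = boto3.client("datapipeline")
# pid = dp.create_pipeline(name="parse-s3-file", uniqueId="parse-s3-file-1")["pipelineId"]
# dp.put_pipeline_definition(pipelineId=pid, pipelineObjects=build_pipeline_objects(
#     "command-runner.jar,spark-submit,s3://my-bucket/job.py"))
# dp.activate_pipeline(pipelineId=pid)
```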

Hope this helps, thanks
