I am trying to export a DynamoDB table to S3 in JSON format and from there import it into BigQuery. The hard part is the export to S3, because the table I am working on is not small: it contains 5.6 million records, and about 15,000 new records are inserted every day (on a quiet day). I came across a blog post that suggests a Lambda function (ref: http://randomwits.com/blog/export-dynamodb-s3), but its table.scan() approach does not work well with large tables.
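
For context, the scan-based Lambda from that post looks roughly like this (a minimal sketch with hypothetical table and bucket names); it has to paginate through the whole table and hold every item in memory, which is exactly what breaks down at 5.6 million records:

```python
import json
import boto3

dynamodb = boto3.resource("dynamodb")
s3 = boto3.client("s3")

def lambda_handler(event, context):
    table = dynamodb.Table("my-table")  # hypothetical table name
    items = []

    # Scan is paginated: each call returns at most 1 MB of data,
    # so we have to loop on LastEvaluatedKey to cover the whole table.
    response = table.scan()
    items.extend(response["Items"])
    while "LastEvaluatedKey" in response:
        response = table.scan(ExclusiveStartKey=response["LastEvaluatedKey"])
        items.extend(response["Items"])

    # Write everything as newline-delimited JSON (default=str handles Decimal).
    body = "\n".join(json.dumps(item, default=str) for item in items)
    s3.put_object(
        Bucket="my-export-bucket",          # hypothetical bucket
        Key="export/my-table.json",
        Body=body,
    )
    return {"count": len(items)}
```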

So how can I efficiently export a DynamoDB table to S3 in JSON format and import it into BigQuery from there? I have seen options like HEVO, AWS Glue, etc., but I don't know which one would be the most efficient.

1 Answer

You can do this with AWS Lambda: the Lambda is triggered by a DynamoDB stream and writes each change to Cloud Logging; from Cloud Logging you then create a sink with BigQuery as the destination.
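
Something along these lines (a minimal sketch, assuming the stream view type includes new images, Google credentials are available to the function via GOOGLE_APPLICATION_CREDENTIALS, the google-cloud-logging package is bundled with the Lambda, and a hypothetical logger name "dynamodb-cdc"; the Cloud Logging sink with BigQuery as its destination is configured separately on the GCP side):

```python
import json
from boto3.dynamodb.types import TypeDeserializer
from google.cloud import logging as gcp_logging

deserializer = TypeDeserializer()
gcp_client = gcp_logging.Client()          # uses GOOGLE_APPLICATION_CREDENTIALS
logger = gcp_client.logger("dynamodb-cdc")  # hypothetical logger name

def lambda_handler(event, context):
    for record in event["Records"]:
        if record["eventName"] not in ("INSERT", "MODIFY"):
            continue
        # NewImage is only present if the stream view type is NEW_IMAGE
        # or NEW_AND_OLD_IMAGES. Convert the DynamoDB-typed attributes
        # ({"S": ...}, {"N": ...}) to a plain dict.
        image = record["dynamodb"]["NewImage"]
        item = {k: deserializer.deserialize(v) for k, v in image.items()}
        # Numbers deserialize to Decimal, so round-trip through json with
        # default=str to keep the payload JSON-serializable.
        item = json.loads(json.dumps(item, default=str))
        # One structured log entry per item; the sink routes these to BigQuery.
        logger.log_struct({"event": record["eventName"], "item": item})
```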

3 Comments

That will help export new data, but not the existing data.
For the existing data you can use DynamoDB's native export to S3, then query the exported data with Athena; the query results can be written to a new bucket -> AWS Lambda -> Cloud Logging -> sink to BigQuery (see the sketch after these comments). docs.aws.amazon.com/amazondynamodb/latest/developerguide/…
Right, I'm just pointing out that your answer addresses the change data capture, but not the original data.
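
For the bulk export mentioned in the comments, the native export can be started with boto3 along these lines (a minimal sketch, assuming point-in-time recovery is enabled on the table and using hypothetical ARN and bucket values; the export runs server-side, so nothing is scanned from a Lambda):

```python
import time
import boto3

dynamodb = boto3.client("dynamodb")

# Kick off a server-side export of the table to S3.
export = dynamodb.export_table_to_point_in_time(
    TableArn="arn:aws:dynamodb:eu-west-1:123456789012:table/my-table",  # hypothetical
    S3Bucket="my-export-bucket",                                        # hypothetical
    S3Prefix="exports/my-table",
    ExportFormat="DYNAMODB_JSON",  # or "ION"
)
export_arn = export["ExportDescription"]["ExportArn"]

# Poll until the export finishes.
while True:
    desc = dynamodb.describe_export(ExportArn=export_arn)["ExportDescription"]
    if desc["ExportStatus"] != "IN_PROGRESS":
        print(desc["ExportStatus"])
        break
    time.sleep(30)
```

Athena can then query the exported files in S3, and the results can follow the same Lambda -> Cloud Logging -> sink path described in the answer.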
