Is there a way to have the DynamoDB rows for each user backed up to S3 as a CSV file?

Then, using DynamoDB Streams, when a row is mutated, update that row in the CSV file in S3.

The CSV readers that are currently out there are geared towards parsing the CSV for use within the Lambda.

Whereas I would like to find a specific row, identified by the stream record, and replace it with another row without having to load the whole file into memory, as it may be quite big. The reason I would like a backup on S3 is that in the future I will need to do batch processing on it, and reading 300k items from DynamoDB within a short period of time is not preferable.
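
For illustration, a minimal sketch of what such a stream-triggered Lambda handler could look like (the bucket name, object key, and attribute name below are assumptions, not part of the question):

BUCKET = 'my-backup-bucket'   # hypothetical bucket holding the CSV backup
KEY = 'users.csv'             # hypothetical object key

def handler(event, context):
    # A DynamoDB Streams event delivers a batch of records, one per mutated item.
    for record in event['Records']:
        if record['eventName'] in ('INSERT', 'MODIFY'):
            new_image = record['dynamodb']['NewImage']
            user_id = new_image['userId']['S']   # assumed attribute name
            # find this user's row in s3://BUCKET/KEY and replace it
            # (see the read/overwrite pattern in the answer below)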

  • You could use a Lambda triggered on DynamoDB updates: docs.aws.amazon.com/amazondynamodb/latest/developerguide/… Commented Mar 11, 2018 at 1:43
  • @avigil The problem I am having is getting that Lambda to update the file, i.e. a way to read it from S3, find the line, and update it. I have used fast-csv, for example, and it only allowed me to parse the line, not update it. Commented Mar 11, 2018 at 1:45
  • You will need to read in the contents of the S3 object, parse it and update as necessary, then overwrite the object with your updated version. See the boto3 documentation for S3 put or upload_fileobj. Commented Mar 11, 2018 at 2:14
  • @avigil I was hoping to avoid reading the whole file into the Lambda just to update one line. Commented Mar 11, 2018 at 13:32
  • You sadly can't do that if you are using S3. Consider switching to a database for easy incremental updates. Commented Mar 11, 2018 at 15:55

1 Answer

Read the data from S3, parse it as CSV using your favorite library, update it, then write it back to S3:

import io
import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('mybucket')

with io.BytesIO() as data:
    # download the existing object into an in-memory buffer
    bucket.download_fileobj('my_key', data)

    # parse csv data and update as necessary
    # then write back to s3

    # rewind the buffer before uploading, otherwise the uploaded body would be empty
    data.seek(0)
    bucket.upload_fileobj(data, 'my_key')
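
As a rough sketch of the parse-and-update step, using Python's standard csv module (the column layout and the assumption that the first column holds the user id are illustrative, not part of the answer):

import csv
import io
import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('mybucket')

def replace_row(key, user_id, new_row):
    # S3 has no partial update, so read the whole object into memory
    with io.BytesIO() as data:
        bucket.download_fileobj(key, data)
        text = data.getvalue().decode('utf-8')

    # replace the matching row; assumes the first column holds the user id
    rows = list(csv.reader(io.StringIO(text)))
    updated = [new_row if row and row[0] == user_id else row for row in rows]

    # serialize and overwrite the object
    out = io.StringIO()
    csv.writer(out).writerows(updated)
    bucket.upload_fileobj(io.BytesIO(out.getvalue().encode('utf-8')), key)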

Note that S3 does not support appending to or updating objects in place, if that was what you were hoping for. You can only read and overwrite whole objects, so you may want to take this into account when designing your system.

2 Comments

With this approach, if the file is large, I would need to read and rewrite the whole file back to S3?
Yes, but that's the only way to do it in S3. Make many small objects and it won't be a problem.
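
If you follow that suggestion, one possible layout is a separate small CSV object per user, so each stream event only overwrites that user's object (the key scheme and bucket name below are assumptions):

import csv
import io
import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('mybucket')   # hypothetical bucket

def write_user_row(user_id, row):
    # one small CSV object per user; overwriting it on each stream event is cheap
    out = io.StringIO()
    csv.writer(out).writerow(row)
    bucket.put_object(Key=f'backups/{user_id}.csv', Body=out.getvalue().encode('utf-8'))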
