
This question was asked earlier in the following link: How to write dynamodb scan data's in CSV and upload to s3 bucket using python?

I have amended the code as advised in the comments. The code now looks as follows:

import csv
import boto3
import json
dynamodb = boto3.resource('dynamodb')
db = dynamodb.Table('employee_details')
def lambda_handler(event, context):
    AWS_BUCKET_NAME = 'session5cloudfront'
    s3 = boto3.resource('s3')
    bucket = s3.Bucket(AWS_BUCKET_NAME)
    path = '/tmp/' + 'employees.csv'
    try:
        response = db.scan()
        myFile = open(path, 'w')  

        for i in response['Items']:
            csv.register_dialect('myDialect', delimiter=' ', quoting=csv.QUOTE_NONE)
            with myFile:
                writer = csv.writer(myFile, dialect='myDialect')
                writer.writerows(i)
            print(i)
    except:
        print("error")

    bucket.put_object(
        ACL='public-read',
        ContentType='application/csv',
        Key=path,
        # Body=json.dumps(i),
    )
    # print("here")
    body = {
        "uploaded": "true",
        "bucket": AWS_BUCKET_NAME,
        "path": path,
    }
    # print("then here")
    return {
        "statusCode": 200,
        "body": json.dumps(body)
    }

I am a novice; please help me fix this code, as it is having a problem inserting data into the file created in the S3 bucket.

Thanks

  • Perfect, thanks John. It worked smoothly. (Commented May 23, 2020 at 10:43)

1 Answer


I have revised the code to be simpler and to also handle paginated responses for tables with more than 1MB of data:

import csv
import boto3
import json

TABLE_NAME = 'employee_details'
OUTPUT_BUCKET = 'my-bucket'
TEMP_FILENAME = '/tmp/employees.csv'
OUTPUT_KEY = 'employees.csv'

s3_resource = boto3.resource('s3')
dynamodb_resource = boto3.resource('dynamodb')
table = dynamodb_resource.Table(TABLE_NAME)

def lambda_handler(event, context):

    with open(TEMP_FILENAME, 'w') as output_file:
        writer = csv.writer(output_file)
        header = True
        first_page = True

        # Paginate results
        while True:

            # Scan DynamoDB table
            if first_page:
                response = table.scan()
                first_page = False
            else:
                response = table.scan(ExclusiveStartKey=response['LastEvaluatedKey'])

            for item in response['Items']:

                # Write header row?
                if header:
                    writer.writerow(item.keys())
                    header = False

                writer.writerow(item.values())

            # Last page?
            if 'LastEvaluatedKey' not in response:
                break

    # Upload temp file to S3
    s3_resource.Bucket(OUTPUT_BUCKET).upload_file(TEMP_FILENAME, OUTPUT_KEY)
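
For comparison, the same pagination can also be handled by the low-level DynamoDB client's built-in scan paginator instead of a manual loop. This is only an illustrative sketch (the table and file names are the same placeholders as above); note that the low-level client returns DynamoDB-typed attribute values such as {'S': 'Alice'}, so each item is deserialized before it is written:

import csv
import boto3
from boto3.dynamodb.types import TypeDeserializer

dynamodb_client = boto3.client('dynamodb')
deserializer = TypeDeserializer()

def scan_to_csv(table_name='employee_details', filename='/tmp/employees.csv'):
    # The paginator follows LastEvaluatedKey automatically,
    # so no explicit pagination loop is needed
    paginator = dynamodb_client.get_paginator('scan')

    with open(filename, 'w', newline='') as output_file:
        writer = csv.writer(output_file)
        header_written = False

        for page in paginator.paginate(TableName=table_name):
            for raw_item in page['Items']:
                # Convert DynamoDB-typed values ({'S': ..., 'N': ...}) into plain Python values
                item = {k: deserializer.deserialize(v) for k, v in raw_item.items()}
                if not header_written:
                    writer.writerow(item.keys())
                    header_written = True
                writer.writerow(item.values())

The resulting file can then be uploaded with the same upload_file call as above.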

1 Comment

Please note that, as of Aug 2020, this use case (DDB to S3) is supported natively by AWS and requires no coding. More info on their blog here: aws.amazon.com/blogs/aws/…
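
For reference, starting that managed export from Python would look roughly like the sketch below. It assumes the export API is available in your boto3 version, uses a placeholder table ARN and bucket, and requires point-in-time recovery (PITR) to be enabled on the table; the output is written as DynamoDB JSON (or Ion), not CSV:

import boto3

dynamodb_client = boto3.client('dynamodb')

# Placeholder ARN and bucket; PITR must be enabled on the table
response = dynamodb_client.export_table_to_point_in_time(
    TableArn='arn:aws:dynamodb:us-east-1:123456789012:table/employee_details',
    S3Bucket='my-bucket',
    S3Prefix='exports/',
    ExportFormat='DYNAMODB_JSON',
)

# The export runs asynchronously; poll the status until it completes
print(response['ExportDescription']['ExportStatus'])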
