I'm trying to migrate data from a CSV file into an existing AWS DynamoDB table, as part of an AWS Amplify web app.

I followed this CloudFormation tutorial, using the template below.

However, I was only able to create a new DynamoDB table; I could not point the template at an existing table and add data to it.

QUESTION: Is there a way to modify the template so that I can provide an existing table name at the "Specify stack details" step in the wizard, under "DynamoDBTableName", so that the CSV data will be added to that table? If not, is there an alternative process?

{
    "AWSTemplateFormatVersion": "2010-09-09",
    "Metadata": {

    },
    "Parameters" : {
        "BucketName": {
            "Description": "Name of the S3 bucket you will deploy the CSV file to",
            "Type": "String",
            "ConstraintDescription": "must be a valid bucket name."
        },
        "FileName": {
            "Description": "Name of the S3 file (including suffix)",
            "Type": "String",
            "ConstraintDescription": "Valid S3 file name."
        },
        "DynamoDBTableName": {
            "Description": "Name of the dynamoDB table you will use",
            "Type": "String",
            "ConstraintDescription": "must be a valid dynamoDB name."
        }
    },
    "Resources": {
        "DynamoDBTable":{
            "Type": "AWS::DynamoDB::Table",
            "Properties":{
                "TableName": {"Ref" : "DynamoDBTableName"},
                "BillingMode": "PAY_PER_REQUEST",
                "AttributeDefinitions":[
                    {
                        "AttributeName": "id",
                        "AttributeType": "S"
                    }
                ],
                "KeySchema":[
                    {
                        "AttributeName": "id",
                        "KeyType": "HASH"
                    }
                ],
                "Tags":[
                    {
                        "Key": "Name",
                        "Value": {"Ref" : "DynamoDBTableName"}
                    }
                ]
            }
        },
        "LambdaRole" : {
          "Type" : "AWS::IAM::Role",
          "Properties" : {
            "AssumeRolePolicyDocument": {
              "Version" : "2012-10-17",
              "Statement" : [
                {
                  "Effect" : "Allow",
                  "Principal" : {
                    "Service" : ["lambda.amazonaws.com","s3.amazonaws.com"]
                  },
                  "Action" : [
                    "sts:AssumeRole"
                  ]
                }
              ]
            },
            "Path" : "/",
            "ManagedPolicyArns":["arn:aws:iam::aws:policy/service-role/AWSLambdaBasicExecutionRole","arn:aws:iam::aws:policy/AWSLambdaInvocation-DynamoDB","arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess"],
            "Policies": [{
                             "PolicyName": "policyname",
                             "PolicyDocument": {
                                       "Version": "2012-10-17",
                                       "Statement": [{
                                    "Effect": "Allow",
                                                "Resource": "*",
                                                  "Action": [
                                                              "dynamodb:PutItem",
                                                              "dynamodb:BatchWriteItem"
                                                  ]
                                      }]
                             }
                    }]
          }
       },
        "CsvToDDBLambdaFunction": {
            "Type": "AWS::Lambda::Function",
            "Properties": {
                "Handler": "index.lambda_handler",
                "Role": {
                    "Fn::GetAtt": [
                        "LambdaRole",
                        "Arn"
                    ]
                },
                "Code": {
                    "ZipFile": {
                        "Fn::Join": [
                            "\n",
                            [
                                "import json",
                                "import boto3",
                                "import os",
                                "import csv",
                                "import codecs",
                                "import sys",
                                "",
                                "s3 = boto3.resource('s3')",
                                "dynamodb = boto3.resource('dynamodb')",
                                "",
                                "bucket = os.environ['bucket']",
                                "key = os.environ['key']",
                                "tableName = os.environ['table']",
                                "",
                                "def lambda_handler(event, context):",
                                "",  
                                "",
                                "   #get() does not store in memory",
                                "   try:",
                                "       obj = s3.Object(bucket, key).get()['Body']",
                                "   except:",
                                "       print(\"S3 Object could not be opened. Check environment variable. \")",
                                "   try:",
                                "       table = dynamodb.Table(tableName)",
                                "   except:",
                                "       print(\"Error loading DynamoDB table. Check if table was created correctly and environment variable.\")",
                                "",
                                "   batch_size = 100",
                                "   batch = []",
                                "",
                                "   #DictReader is a generator; not stored in memory",
                                "   for row in csv.DictReader(codecs.getreader('utf-8-sig')(obj)):",
                                "      if len(batch) >= batch_size:",
                                "         write_to_dynamo(batch)",
                                "         batch.clear()",
                                "",
                                "      batch.append(row)",
                                "",
                                "   if batch:",
                                "      write_to_dynamo(batch)",
                                "",
                                "   return {",
                                "      'statusCode': 200,",
                                "      'body': json.dumps('Uploaded to DynamoDB Table')",
                                "   }",
                                "",
                                "",    
                                "def write_to_dynamo(rows):",
                                "   try:",
                                "      table = dynamodb.Table(tableName)",
                                "   except:",
                                "      print(\"Error loading DynamoDB table. Check if table was created correctly and environment variable.\")",
                                "",
                                "   try:",
                                "      with table.batch_writer() as batch:",
                                "         for i in range(len(rows)):",
                                "            batch.put_item(",
                                "               Item=rows[i]",
                                "            )",
                                "   except:",
                                "      print(\"Error executing batch_writer\")"
                            ]
                        ]
                    }
                },
                "Runtime": "python3.7",
                "Timeout": 900,
                "MemorySize": 3008,
                "Environment" : {
                    "Variables" : {"bucket" : { "Ref" : "BucketName" }, "key" : { "Ref" : "FileName" },"table" : { "Ref" : "DynamoDBTableName" }}
                }
            }
        },

        "S3Bucket": {
            "DependsOn" : ["CsvToDDBLambdaFunction","BucketPermission"],
            "Type": "AWS::S3::Bucket",
            "Properties": {

                "BucketName": {"Ref" : "BucketName"},
                "AccessControl": "BucketOwnerFullControl",
                "NotificationConfiguration":{
                    "LambdaConfigurations":[
                        {
                            "Event":"s3:ObjectCreated:*",
                            "Function":{
                                "Fn::GetAtt": [
                                    "CsvToDDBLambdaFunction",
                                    "Arn"
                                ]
                            }
                        }
                    ]
                }
            }
        },
        "BucketPermission":{
            "Type": "AWS::Lambda::Permission",
            "Properties":{
                "Action": "lambda:InvokeFunction",
                "FunctionName":{"Ref" : "CsvToDDBLambdaFunction"},
                "Principal": "s3.amazonaws.com",
                "SourceAccount": {"Ref":"AWS::AccountId"}
            }
        }
    },
    "Outputs" : {

    }
}

Edit: Dennis' answer below is one solution, but you can also comment out (or rather, since JSON has no comment syntax, delete) the "DynamoDBTable" resource within the "Resources" section of the template. With that resource gone, the stack no longer creates a table, and the Lambda loads the CSV into the existing table named under "DynamoDBTableName".
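For reference, here is a minimal sketch of how the template could skip table creation via a CloudFormation Condition instead of deleting the block outright. The "CreateTable" parameter and the "ShouldCreateTable" condition name are my own additions, not part of the tutorial; everything else is merged into the existing template as shown. Since the Lambda's "table" environment variable already references the "DynamoDBTableName" parameter rather than the resource, nothing else needs to change:

{
    "Parameters": {
        "CreateTable": {
            "Description": "Set to false to load data into an existing table instead of creating one",
            "Type": "String",
            "AllowedValues": ["true", "false"],
            "Default": "true"
        }
    },
    "Conditions": {
        "ShouldCreateTable": {
            "Fn::Equals": [{"Ref": "CreateTable"}, "true"]
        }
    },
    "Resources": {
        "DynamoDBTable": {
            "Type": "AWS::DynamoDB::Table",
            "Condition": "ShouldCreateTable",
            "Properties": {
                "TableName": {"Ref": "DynamoDBTableName"},
                "BillingMode": "PAY_PER_REQUEST",
                "AttributeDefinitions": [
                    {"AttributeName": "id", "AttributeType": "S"}
                ],
                "KeySchema": [
                    {"AttributeName": "id", "KeyType": "HASH"}
                ]
            }
        }
    }
}

When "CreateTable" is false, CloudFormation simply skips the "DynamoDBTable" resource, and the Lambda writes to whatever existing table you name in "DynamoDBTableName" at the "Specify stack details" step.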

1 Answer


You can migrate CSV files from Amazon S3 to Amazon DynamoDB using the AWS Database Migration Service (DMS). Have a look at this step-by-step walkthrough.
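If you go the DMS route, the walkthrough has the authoritative setup details, but the core idea is that the S3 source endpoint needs an external table definition (a JSON document) describing the CSV's columns. Roughly along these lines, where the table name, path, owner, and columns are placeholders for your own data:

{
    "TableCount": "1",
    "Tables": [
        {
            "TableName": "items",
            "TablePath": "csv-folder/items/",
            "TableOwner": "admin",
            "TableColumns": [
                {
                    "ColumnName": "id",
                    "ColumnType": "STRING",
                    "ColumnLength": "50",
                    "ColumnNullable": "false",
                    "ColumnIsPk": "true"
                },
                {
                    "ColumnName": "name",
                    "ColumnType": "STRING",
                    "ColumnLength": "100"
                }
            ],
            "TableColumnsTotal": "2"
        }
    ]
}

DMS then replicates each CSV row as an item into the table behind the DynamoDB target endpoint. For a one-off load of a single file, though, the modified-template approach in the question is considerably less setup.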


1 Comment

Thanks Dennis. I wish they had easier tools and GUIs for tasks this common. I was also able to add to the existing table using the template I posted by just commenting out the DynamoDBTable resource; I didn't realize that was the resource creating the table.
