
I'm loading data from a CSV file into DynamoDB using a Lambda function. The data is added, but there's an error: in my DynamoDB table I can see the CSV headers inserted as a row. Here's my code:

import boto3

s3 = boto3.client("s3")
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('maysales')

def lambda_handler(event, context):
    bucketna = event['Records'][0]['s3']['bucket']['name']
    s3_name = event['Records'][0]['s3']['object']['key']
    response = s3.get_object(Bucket=bucketna, Key=s3_name)
    data = response['Body'].read().decode("utf-8")
    salesnbs = data.split("\n")
    for ko in salesnbs:
        kos = ko.split(",")
        table.put_item(
            Item={
                "Date": kos[0],
                "name": kos[1],
                "fam": kos[2],
                "locati": kos[3],
                "adress": kos[4],
                "country": kos[5],
                "city": kos[6]
            })

My table ends up containing the header row as an item.
  • You might want to consider using the csv module, which has a DictReader class. This automatically converts your CSV into dictionaries where the headers are used as dictionary keys. This means your header is also skipped when parsing the rows. Commented Apr 17, 2020 at 15:02

4 Answers

The first row of most CSV files contains the header labels. If you don't want to add that row to your DynamoDB table, you need to skip past it before you start doing your insertions, i.e.:

row = 0
for ko in salesnbs:
    if row == 0:
        row = row + 1
        continue  # don't process the header line

    row = row + 1
    kos = ko.split(",")
    table.put_item(
        Item={
            "Date": kos[0],
            "name": kos[1],
            "fam": kos[2],
            "locati": kos[3],
            "adress": kos[4],
            "country": kos[5],
            "city": kos[6]
        })

(syntax might not be 100% correct, but that is the idea)
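The same idea reads a bit more cleanly with enumerate, which replaces the manual counter. This is a sketch using a hard-coded sample list in place of the salesnbs parsed from S3, and it collects the items in a list instead of calling table.put_item so the logic can be checked in isolation:

```python
# Sample rows standing in for salesnbs (header first, then data)
salesnbs = [
    "Date,name,fam,locati,adress,country,city",
    "2020-05-01,John,Doe,NY,5th Ave,USA,New York",
]

items = []
for row, ko in enumerate(salesnbs):
    if row == 0:
        continue  # enumerate gives 0 for the header line, so it is skipped
    kos = ko.split(",")
    items.append({"Date": kos[0], "name": kos[1], "fam": kos[2]})

# Each dict in items would then be passed to table.put_item(Item=...)
```

With enumerate there is no counter to forget to increment, which is exactly the bug the comment below points out.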


1 Comment

row = row + 1 needs to run before the continue (or inside the if block as well); otherwise row stays at 0 and every line is skipped.

It's not entirely clear what the problem is from the description, but I suggest using Python's built-in csv module to handle CSV data. That way you won't need to worry about headers or about splitting the file into columns, since the module provides tools for both.

import csv
import io
...

# The S3 body is a byte stream, so decode it to text first.
# A delimiter can also be passed to DictReader if need be.
body = response['Body'].read().decode("utf-8")
reader = csv.DictReader(io.StringIO(body))
for row in reader:
    table.put_item(
        Item={
            "Date": row["Date"],
            "name": row["name"],
            "fam": row["fam"],
            ...
        })

The module uses the first row of the file as the column names, so the header is never inserted as data.
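To see this behavior in isolation, here is a minimal sketch with an in-memory string standing in for the decoded S3 object body:

```python
import csv
import io

# Sample CSV text; the first line becomes the dictionary keys
data = "Date,name,fam\n2020-05-01,John,Doe\n"
rows = list(csv.DictReader(io.StringIO(data)))

# rows[0] == {"Date": "2020-05-01", "name": "John", "fam": "Doe"}
```

Note that only the data row comes back; the header line is consumed to build the keys.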

2 Comments

If you provide an example using the CSV module, this would be a good answer.
@OleksiiDonoha nice

The modified version of @E.J. Brennan's code below skips the header while pushing a CSV file from S3 into DynamoDB. Replace the body of your Lambda function with this:

import boto3

s3_client = boto3.client('s3')
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('yourdynamodbtablename')

def lambda_handler(event, context):
    bucket = event['Records'][0]['s3']['bucket']['name']
    s3_file_name = event['Records'][0]['s3']['object']['key']
    response = s3_client.get_object(Bucket=bucket, Key=s3_file_name)
    fileData = response['Body'].read().decode("utf-8")
    print(fileData)
    modelData = fileData.split("\n")
    header = 0
    for row in modelData:
        print(row)
        if header == 0:
            header = header + 1
            continue  # skip the header row
        row_data = row.split(",")
        try:
            table.put_item(
                Item={
                    'ID': row_data[0],
                    'NAME': row_data[1],
                    'SUBJECT': row_data[2]
                }
            )
        except Exception as e:
            # A trailing blank line from split("\n") ends up here
            print('Failed to insert row:', e)
    return 'New rows were inserted successfully, without the header, into the db'
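The header-skipping and blank-line handling above can be factored into a small helper that is easy to test without AWS (csv_to_items is a hypothetical name, not part of boto3):

```python
def csv_to_items(file_data):
    """Turn raw CSV text into a list of DynamoDB item dicts,
    skipping the header row and any blank (trailing) lines."""
    items = []
    for i, row in enumerate(file_data.split("\n")):
        if i == 0 or not row.strip():
            continue  # skip the header and empty lines
        row_data = row.split(",")
        items.append({
            'ID': row_data[0],
            'NAME': row_data[1],
            'SUBJECT': row_data[2],
        })
    return items

# Each returned dict can then be passed to table.put_item(Item=...)
```

Separating the parsing from the put_item calls also avoids relying on the except block to swallow the trailing empty line.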

Comments


Use the boto3 client instead of the resource, and install dynamodb-json:

from dynamodb_json import json_util as dynamo_json
import json
import boto3

s3 = boto3.client("s3")
dynamodb = boto3.client('dynamodb')

def lambda_handler(event, context):
    bucketna = event['Records'][0]['s3']['bucket']['name']
    s3_name = event['Records'][0]['s3']['object']['key']
    response = s3.get_object(Bucket=bucketna, Key=s3_name)
    data = response['Body'].read().decode("utf-8")
    salesnbs = data.split("\n")
    for ko in salesnbs:
        kos = ko.split(",")
        item = {
            "Date": kos[0],
            "name": kos[1],
            "fam": kos[2],
            "locati": kos[3],
            "adress": kos[4],
            "country": kos[5],
            "city": kos[6]
        }
        dynamodb.put_item(TableName='maysales',
                          Item=json.loads(dynamo_json.dumps(item)))
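For context on what dynamodb-json is doing here: the low-level client expects every attribute wrapped in a type marker (e.g. {"S": ...} for strings). For rows where every value is a string, that wrapping can be sketched with the standard library alone (to_dynamo_item is a hypothetical helper name):

```python
def to_dynamo_item(plain):
    """Wrap each plain string value in DynamoDB's {'S': ...} type
    marker, the shape the low-level client's put_item expects."""
    return {key: {"S": str(value)} for key, value in plain.items()}

# to_dynamo_item({"Date": "2020-05-01"}) == {"Date": {"S": "2020-05-01"}}
```

Note this only covers string attributes; dynamodb-json (or the resource-level Table API, which does the conversion for you) handles numbers, lists, and maps as well.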

2 Comments

How does this answer the OP's problem?
Your answer could be improved by elaborating on why using boto3.client over boto3.resource will answer the question. stackoverflow.com/help/how-to-answer
