I am trying to read a file from S3, which has the following content stored in it:

   {"empID":{"n":"7"},"name":{"s":"NewEntry"}}
   {"empID":{"n":"3"},"name":{"s":"manish"}}
   {"empID":{"n":"2"},"name":{"s":"mandeep"}}
   {"empID":{"n":"4"},"name":{"s":"Vikas"}}
   {"empID":{"n":"1"},"name":{"s":"babbar"}}

I want to iterate over each object and do some processing on it.

I am working from this code:

import json
import boto3
s3_obj = boto3.client('s3')

s3_clientobj = s3_obj.get_object(Bucket='dane-fetterman-bucket', Key='mydata.json')
s3_clientdata = s3_clientobj['Body'].read().decode('utf-8')

print("printing s3_clientdata")
print(s3_clientdata)
print(type(s3_clientdata))


s3clientlist = json.loads(s3_clientdata)
print("json loaded data")
print(s3clientlist)
print(type(s3clientlist))

but there is no "Body" attribute in the file. Can I get some pointers on how to do this?

1 Answer

The issue is that the file contains a separate JSON object on each line, rather than being a single JSON document, so calling json.loads() on the whole body fails.
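
To see the difference without touching S3, here is a minimal sketch that uses two of the sample lines from the question as an in-memory string. The variable names (raw, whole_file_ok, records) are only for illustration.

```python
import json

# Two sample lines copied from the question: one JSON document per line.
raw = (
    '{"empID":{"n":"7"},"name":{"s":"NewEntry"}}\n'
    '{"empID":{"n":"3"},"name":{"s":"manish"}}\n'
)

# Parsing the whole string fails, because a second document follows the first.
try:
    json.loads(raw)
    whole_file_ok = True
except json.JSONDecodeError:
    whole_file_ok = False

# Parsing line by line works; blank lines are skipped.
records = [json.loads(line) for line in raw.splitlines() if line.strip()]

print(whole_file_ok)
print([r["name"]["s"] for r in records])
```

The first print shows that the whole-string parse did not succeed, while the second lists the names recovered from the individual lines.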

Therefore, the program needs to process each line independently:

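import json
import boto3

s3_client = boto3.client('s3')

s3_clientobj = s3_client.get_object(Bucket='my-bucket', Key='mydata.json')

# iter_lines() yields each line of the streamed body as bytes
for line in s3_clientobj['Body'].iter_lines():
    if not line:
        continue  # skip blank lines
    # 'record' avoids shadowing the built-in name 'object'
    record = json.loads(line)
    print(f"ID: {record['empID']['n']} Name: {record['name']['s']}")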

Alternatively, you could download the whole object to disk, then iterate over it with the usual for line in open('file'): loop.
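
A rough sketch of that download-first approach might look like the following. The helper names iter_records and download_and_iter, and the local path you pass in, are assumptions made for illustration; download_file is the boto3 S3 client method for saving an object to a local file.

```python
import json


def iter_records(path):
    """Yield one parsed JSON object per non-empty line of a local file."""
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                yield json.loads(line)


def download_and_iter(bucket, key, local_path):
    """Download s3://bucket/key to local_path, then yield its records."""
    import boto3  # imported here so iter_records works without boto3 installed

    boto3.client("s3").download_file(bucket, key, local_path)
    yield from iter_records(local_path)
```

You would then loop with something like for record in download_and_iter('my-bucket', 'mydata.json', '/tmp/mydata.json'): and read record['empID']['n'] and record['name']['s'] as before. This uses local disk space, but the parsing step can be tested without any S3 access.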

See also: Read a file line by line from S3 using boto?
