I have an AWS Lambda function which creates a pandas DataFrame, and I need to write that DataFrame to an S3 bucket as a JSON file.

import datetime
import io

import pandas as pd
import boto3

# code to get the df

destination = "output_" + datetime.datetime.now().strftime('%Y_%m_%d_%H_%M_%S') + '.json'

df.to_json(destination)  # this file should be written to the S3 bucket instead

2 Answers

The following code runs in AWS Lambda and uploads the JSON file to S3.

The Lambda execution role must have S3 write permission (s3:PutObject) on the target bucket.
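
For reference, a minimal sketch of a policy statement granting that permission; the bucket name is a placeholder to replace with your own:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::my-bucket-name/*"
        }
    ]
}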

import datetime
import io

import pandas as pd
import boto3

# code to get the df

destination = "output_" + datetime.datetime.now().strftime('%Y_%m_%d_%H_%M_%S') + '.json'

# Serialize the DataFrame to an in-memory buffer instead of the local filesystem
json_buffer = io.StringIO()
df.to_json(json_buffer)

# Upload the buffer contents under the timestamped key
s3 = boto3.resource('s3')
my_bucket = s3.Bucket('my-bucket-name')
my_bucket.put_object(Key=destination, Body=json_buffer.getvalue())
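
Wired into a complete Lambda entry point, the same approach might look like the sketch below; the bucket name and the DataFrame contents are placeholders:

import datetime
import io

import boto3
import pandas as pd

def lambda_handler(event, context):
    # Placeholder for the real "code to get the df"
    df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

    destination = "output_" + datetime.datetime.now().strftime('%Y_%m_%d_%H_%M_%S') + '.json'

    # Serialize to memory, then upload; nothing touches the local filesystem
    json_buffer = io.StringIO()
    df.to_json(json_buffer)

    s3 = boto3.resource('s3')
    s3.Bucket('my-bucket-name').put_object(Key=destination, Body=json_buffer.getvalue())

    return {"status": "ok", "key": destination}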



1 Comment

Nailed it. All other s3/json answers on StackOverflow would not reconcile with the df.to_json part, which was a must for my problem. Thank you.

You can use the following code as well:

import io

import boto3
from pyspark.sql.functions import lit

# Create a session using Boto3 (in Lambda, prefer the execution role's
# credentials over hard-coded keys)
session = boto3.Session(
    aws_access_key_id='<key ID>',
    aws_secret_access_key='<secret_key>'
)

# Create an S3 resource from the session
s3 = session.resource('s3')

json_buffer = io.StringIO()

# Create a Spark DataFrame (assumes an active SparkSession named `spark`)
# and convert it to pandas
df = spark.range(4).withColumn("organisation", lit("stackoverflow"))
df_p = df.toPandas()
df_p.to_json(json_buffer, orient='records')

# Create the S3 object and upload the buffer contents
s3_object = s3.Object('<bucket-name>', '<JSON file name>')
result = s3_object.put(Body=json_buffer.getvalue())
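
As a side note: if the s3fs package is available in the runtime, pandas can write straight to an S3 URL and the in-memory buffer can be skipped entirely. A sketch under that assumption, with a placeholder bucket and key:

import pandas as pd

df = pd.DataFrame({"a": [1, 2]})

# pandas hands s3:// paths to s3fs/fsspec, which picks up the ambient
# AWS credentials (e.g. the Lambda execution role)
df.to_json('s3://my-bucket-name/output.json', orient='records')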
