I am trying to write a CSV file to my S3 bucket from inside a Lambda function. Everything is fine, except I cannot capture special characters; basically I need my file to be UTF-8 encoded. I do not want to use pandas or unicodecsv as those are not inbuilt to Lambda's environment.
Below is my current Lambda function:
import boto3
import csv
import io


def lambda_handler(event, context):
    s3 = boto3.resource('s3')
    bucket = s3.Bucket("my-bucket-name-goes-here")
    fn = "sample_csv_lambda.csv"
    write_csv(fn, bucket)


def write_csv(target_filename, bucket):
    buff = io.StringIO()
    writer = csv.writer(buff, dialect="excel", delimiter=",")
    writer.writerow([f"header{i}" for i in range(1, 6)])
    writer.writerow([1, 2, 3, 4, 5])
    writer.writerow(["u", "b", "w", "d", "ş"])
    writer.writerow(["n", "p", "m", "q", "ğ"])
    buff2 = io.BytesIO(buff.getvalue().encode(encoding="UTF-8"))
    print(buff2.getvalue().decode("utf-8"))
    bucket.upload_fileobj(buff2, target_filename)
The print call on the second-to-last line outputs the special characters as intended. However, once I download and open the CSV file, the characters in it are still not UTF-8.
PS: I like the current formulation of my code as I do not need to temporarily save the file in a "/tmp" folder as suggested by some other questions/answers. I also do not need to package and upload pandas/unicodecsv to my Lambda environment; too complicated for a beginner like me. Please keep this in mind when you answer.
Comments:

What does "are still not UTF-8." mean? Is the text mangled? Did you expect non-English characters to somehow change? This page is UTF8, the code you posted is UTF8, "ğ" is a UTF8 string with a single character.

[...] saving CSVs as "utf-8-sig", but here you used utf-8, which doesn't emit the BOM that would tell Excel this is a UTF8 file.

Use utf-8-sig instead of utf-8. If you want to create Excel files, you can use a library like openpyxl to create real Excel files. What you do now is force Excel to import a text file using defaults. If you used the Data > Import menu you'd be able to specify the encoding. Right now Excel has to guess.
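A minimal sketch of the utf-8-sig suggestion from the comments, assuming the file is being opened in Excel and that the missing BOM is the cause. The helper name build_csv_buffer and the rows parameter are hypothetical, introduced only for illustration; bucket is assumed to be the same boto3 Bucket resource as in the question. Nothing is written to "/tmp" and no extra packages are needed.

```python
import csv
import io


def build_csv_buffer(rows):
    """Serialize rows to CSV text, then encode as UTF-8 with a leading BOM."""
    text_buff = io.StringIO()
    writer = csv.writer(text_buff, dialect="excel", delimiter=",")
    writer.writerows(rows)
    # "utf-8-sig" prepends the byte order mark (EF BB BF), which Excel
    # uses as a hint that the file is UTF-8 rather than a legacy code page.
    return io.BytesIO(text_buff.getvalue().encode("utf-8-sig"))


def write_csv(target_filename, bucket, rows):
    # Assumption: bucket is a boto3 s3.Bucket resource, as in the question.
    bucket.upload_fileobj(build_csv_buffer(rows), target_filename)


rows = [
    [f"header{i}" for i in range(1, 6)],
    [1, 2, 3, 4, 5],
    ["u", "b", "w", "d", "ş"],
    ["n", "p", "m", "q", "ğ"],
]
buff = build_csv_buffer(rows)
```

The only change relative to the question's code is the codec name passed to encode. The bytes for "ş" and "ğ" are the same under both codecs; the BOM just stops Excel from having to guess the encoding.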