
I am writing a Python 3 AWS Lambda function that takes the S3 bucket and key (source_key) from the Lambda event object and copies the file to another S3 bucket under the same key (destination_key).

However, the S3 key in the event object is encoded in such a way that when I pass the source_key value back to S3 for the copy, S3 returns a 404 error.

Key returned by S3 Lambda Event object:

'object': {'key': 'SBN-Fwd_+USPS+-+Springdale%2C+OH+-+Mail+Processing+Facility+-+Bid+Extension+Notice.eml'}

Error when submitting 'key' value back to S3:

{'Error': {'Code': 'NoSuchKey', 'Message': 'The specified key does not exist.', 'Key': 'SBN-Fwd_+USPS+-+Springdale%2C+OH+-+Mail+Processing+Facility+-+Bid+Extension+Notice.eml'}, 'ResponseMetadata': {'RequestId': '2C0154D58032B5B4', 'HostId': 'zxp56SHdODohW5ln8B5GOW+YPqGfL4/kJGD+qV46yMhLZU92BrOC/hlh/HPHywAuGuJiICL0RFk=', 'HTTPStatusCode': 404, 'HTTPHeaders': {'x-amz-request-id': '2C0154D58032B5B4', 'x-amz-id-2': 'zxp56SHdODohW5ln8B5GOW+YPqGfL4/kJGD+qV46yMhLZU92BrOC/hlh/HPHywAuGuJiICL0RFk=', 'content-type': 'application/xml', 'transfer-encoding': 'chunked', 'date': 'Thu, 20 Sep 2018 16:40:00 GMT', 'server': 'AmazonS3'}, 'RetryAttempts': 0}}

I simply used boto3 to copy source_key to destination_key while specifying a different bucket.

 copy_source = {'Bucket': source_bucket, 'Key': source_key}
 destination_key = source_key

 s3resource.copy(copy_source, destination_bucket, destination_key)

This routine works perfectly as long as the source_key does not contain any special characters (spaces, commas, etc.).

How can I process the source_key to make sure that it is compatible as a destination key? I could not find any documentation on what S3 expects for encoding.

1 Answer

S3 keys in event messages are URL encoded. From AWS documentation:

The s3 key provides information about the bucket and object involved in the event. The object key name value is URL encoded. For example, "red flower.jpg" becomes "red+flower.jpg" (Amazon S3 returns "application/x-www-form-urlencoded" as the content type in the response).
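As a small illustration of that encoding, the sketch below decodes the key from the question with the standard library. The variable names are only for this example; the comparison with plain unquote is added to show why the "plus" variant is the one that matches this encoding.

```python
from urllib.parse import unquote, unquote_plus

# Key exactly as it appears in the S3 event from the question.
encoded_key = (
    "SBN-Fwd_+USPS+-+Springdale%2C+OH+-+Mail+Processing+Facility"
    "+-+Bid+Extension+Notice.eml"
)

# unquote_plus reverses form-style encoding: '+' becomes a space
# and percent escapes such as %2C become the original character.
decoded_key = unquote_plus(encoded_key)

# Plain unquote decodes %2C but leaves every '+' in place, so the
# result would still not match the real object key.
partially_decoded = unquote(encoded_key)

print(decoded_key)
print(partially_decoded)
```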

To reuse the key you need to decode it first. In Python 3 you can use urllib.parse.unquote_plus, which also turns each "+" back into a space:

from urllib.parse import unquote_plus

# The key in the S3 event is URL encoded; decode it before calling S3.
source_key = unquote_plus(source_key, encoding='utf-8')

# Use the decoded key for both the source and the destination.
copy_source = {'Bucket': source_bucket, 'Key': source_key}
destination_key = source_key

s3resource.copy(copy_source, destination_bucket, destination_key)
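For completeness, here is a minimal sketch of the whole flow, reading the bucket and key from the event records. The function name copy_event_objects and its parameters are hypothetical, and it assumes s3_client is a boto3 S3 client (anything exposing a compatible copy(CopySource, Bucket, Key) method would work), which also keeps the function easy to test without AWS access.

```python
from urllib.parse import unquote_plus


def copy_event_objects(event, s3_client, destination_bucket):
    """Copy every object named in an S3 event to destination_bucket.

    Hypothetical helper: s3_client is assumed to be a boto3 S3 client.
    Returns the list of decoded keys that were copied.
    """
    copied_keys = []
    for record in event.get("Records", []):
        source_bucket = record["s3"]["bucket"]["name"]
        # The key arrives URL encoded in the event, so decode it once
        # and use the decoded value for both source and destination.
        key = unquote_plus(record["s3"]["object"]["key"], encoding="utf-8")
        copy_source = {"Bucket": source_bucket, "Key": key}
        s3_client.copy(copy_source, destination_bucket, key)
        copied_keys.append(key)
    return copied_keys
```

Inside a Lambda handler this could be called as copy_event_objects(event, boto3.client("s3"), "my-destination-bucket"), where the bucket name is a placeholder.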
