I need to convert a .zip file stored in S3 to a .gzip file using boto3 (Python) in an AWS Lambda function. Any suggestions on how to do this?

Here is what I have so far:

import json
import boto3
import zipfile
import gzip

s3 = boto3.resource('s3')

def lambda_handler(event, context):

    bucket = event['Records'][0]['s3']['bucket']['name']
    key = event['Records'][0]['s3']['object']['key']

    try: 
        s3Obj = s3.Object(bucket_name=bucket, key=key)
        response = s3Obj.get()
        data = response['Body'].read()
        zipToGzip = gzip.open(data, 'wb')
        zipToGzip.write(s3.upload_file(bucket, (s3 + '.gz')))
        zipToGzip.close()
    except Exception as e:
        print(e)
        print('Error getting object {} from bucket {}. Make sure they exist and your bucket is in the same region as this function.'.format(key, bucket))
        raise e
  • More details and your current code would help. Do you want to re-upload the file to S3 gzipped, or just do something with it locally gzipped? Why does it have to be a Lambda function? Do you mean a Python lambda, or AWS Lambda? Commented Oct 14, 2015 at 17:09
  • I mean AWS Lambda Function using python as it is supported now. I have a file on S3 that is in .zip format, I need to change it to .gzip format. Commented Oct 14, 2015 at 17:26
  • Great, thanks for the clarification. What happens with the current code? Does it raise an exception, or not do what you want...? Commented Oct 14, 2015 at 17:51
  • In that current code, when it gets to the zipToGzip = gzip.open(data, 'wb') piece, it errors saying: file() argument 1 must be encoded string without NULL bytes, not str Commented Oct 14, 2015 at 17:58
  • Sounds like either the S3 object doesn't exist, or the bucket / key are incorrect? Or could be a permission issue on the object possibly. I'd suggest checking that, and printing out data to verify what is there. Commented Oct 14, 2015 at 18:07
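
On the error quoted above: `gzip.open` treats a str or bytes first argument as a file path, so passing the raw contents from `response['Body'].read()` makes it try to open the zip bytes as a filename. Below is a minimal sketch of compressing bytes in memory instead, using `gzip.GzipFile` with `fileobj=`. The helper names `gzip_bytes` and `gunzip_bytes` are made up for illustration and are not part of the original code.

```python
import gzip
import io


def gzip_bytes(payload):
    """Compress a bytes payload to gzip format without touching the filesystem."""
    buffer = io.BytesIO()
    # fileobj= makes GzipFile write into the buffer instead of opening a path.
    with gzip.GzipFile(fileobj=buffer, mode='wb') as gz:
        gz.write(payload)
    # The gzip trailer is written when the with-block exits; the buffer stays open.
    return buffer.getvalue()


def gunzip_bytes(blob):
    """Inverse of gzip_bytes, included only to check the round trip."""
    with gzip.GzipFile(fileobj=io.BytesIO(blob), mode='rb') as gz:
        return gz.read()
```

Note that this only gzips whatever bytes it is given. Gzipping the zip archive itself would produce a gzipped zip, so the zip members still have to be extracted first.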

1 Answer

OK, got it figured out. Thanks for your input Lee.

import gzip
import zipfile

import boto3

print('Loading function')

s3_client = boto3.client('s3')

def lambda_handler(event, context):

    # Get the bucket name and object key from the S3 event record
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = event['Records'][0]['s3']['object']['key']

    try:
        # Download the archive to the Lambda scratch directory
        s3_client.download_file(bucket, key, '/tmp/file.zip')

        with zipfile.ZipFile('/tmp/file.zip') as zfile:
            namelist = zfile.namelist()

            if len(namelist) > 1:
                pass
                #alertme()

            # Extract each member to /tmp. Only the last member's data is
            # gzipped below, so this assumes a single-file archive.
            for filename in namelist:
                data = zfile.read(filename)
                with open('/tmp/' + str(filename), 'wb') as f:
                    f.write(data)

        # Write the extracted contents as gzip, upload, then remove the original
        with gzip.open('/tmp/data.gz', 'wb') as zipToGzip:
            zipToGzip.write(data)
        s3_client.upload_file('/tmp/data.gz', bucket, key + '.gz')
        s3_client.delete_object(Bucket=bucket, Key=key)
    except Exception as e:
        print(e)
        print('Error getting object {} from bucket {}. Make sure they exist and your bucket is in the same region as this function.'.format(key, bucket))
        raise e
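
For archives with more than one member (the `len(namelist) > 1` case above), here is a rough sketch of one way to handle it: gzip each member separately, entirely in memory. It assumes Python 3 (`gzip.compress`, `ZipInfo.is_dir`), and the function name `zip_members_to_gzip` is hypothetical, not something from the code above.

```python
import gzip
import io
import zipfile


def zip_members_to_gzip(zip_bytes):
    """Return {member_name: gzip_bytes} for every file inside a zip archive."""
    converted = {}
    # Wrap the raw bytes in BytesIO so ZipFile can seek within them.
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # directory entries have no content to compress
            # Decompress the member, then recompress it as a gzip stream.
            converted[info.filename] = gzip.compress(archive.read(info))
    return converted
```

In the handler, the input could come from `s3_client.get_object(Bucket=bucket, Key=key)['Body'].read()` and each result could be written back with `s3_client.put_object(Bucket=bucket, Key=name + '.gz', Body=blob)`. That key naming is only an assumption. Everything is held in memory, so this suits archives that fit comfortably within the function's memory limit.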

3 Comments

What was the issue? Quite a few changes in your code. But saving the object to a temporary file fixed it?
@Lee, I'm pretty sure the issue is that the Boto3 object isn't being read correctly as binary. So Scotty is having to download the file from the S3 bucket, instead of using response['Body'].read() with a get - which should be the binary file contents. I'm having the same issue - trying to print response['Body'].read() just gives me the string "PK". Not sure what this means. Having to pull from the bucket is pretty annoying.
You can accept your answer to your own question if you like
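
On the "PK" output mentioned in the comments: a non-empty zip archive starts with the signature bytes `PK\x03\x04`, so seeing "PK" when printing the body suggests the read did return the zip contents, with the rest being non-printable binary. Here is a small sketch that checks the signature and opens the bytes directly with `zipfile` via `io.BytesIO`. The name `list_zip_members` is invented for this example.

```python
import io
import zipfile

# Local file header signature found at the start of a non-empty zip archive.
ZIP_LOCAL_HEADER = b'PK\x03\x04'


def list_zip_members(data):
    """Return the member names if data looks like a zip archive, else raise ValueError."""
    if not data.startswith(ZIP_LOCAL_HEADER):
        raise ValueError('payload does not start with the zip signature')
    # BytesIO gives ZipFile the seekable file-like object it needs.
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        return archive.namelist()
```

The signature check is deliberately strict: an empty archive or one with data prepended (for example a self-extracting zip) would be rejected here even though `zipfile` can open it.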
