
The problem I have is similar to this SO question, but the answer doesn't work for me. I am trying to import a Python library (let's say xgboost) from the /tmp folder in AWS Lambda.

The requests library is added to a Lambda layer, and what I did is:

import json
import io
import os
import zipfile
import requests
import sys

sys.path.insert(0, '/tmp/')
sys.path.append('/tmp/')

os.environ["PYTHONPATH"] = "/var/task"

def get_pkgs(url):
    print("Getting Packages...")
    re = requests.get(url)
    z = zipfile.ZipFile(io.BytesIO(re.content))
    print("Extracting Packages...")
    z.extractall("/tmp/")
    print("Packages are downloaded and extracted.")
    
def attempt_import():
    print("="*50)
    print("ATTEMPT TO IMPORT DEPENDENCIES...")
    print("="*50)
    import xgboost
    print("IMPORTING DONE.")
    
def main():
    URL = "https://MY_BUCKET.s3.MY_REGION.amazonaws.com/MY_FOLDER/xgboost/xgboost.zip"

    get_pkgs(URL)
    attempt_import()
    
def lambda_handler(event, context):
    main()
    return "Hello Lambda"

The error I get is [ERROR] ModuleNotFoundError: No module named 'xgboost'. I gave my S3 bucket all the necessary permissions, and I am positive that Lambda can access the .zip file, since requests.get works and the variable z is:

<zipfile.ZipFile file=<_io.BytesIO object at 0x7fddaf31c400> mode='r'>
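One thing that may help narrow this down (a minimal debugging sketch, not part of the original question code): list what actually ends up under /tmp after extraction, to confirm that the xgboost package directory sits directly in /tmp rather than inside a nested wrapper folder.

import os

def show_tmp_layout():
    # Print the top-level entries in /tmp so we can see whether
    # /tmp/xgboost exists, or whether everything landed in a nested folder.
    for name in sorted(os.listdir("/tmp")):
        print(name)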

  • Downloading packages during the Lambda execution is wasting your $. Instead, you should either package your dependencies into the deployment package or build a Lambda layer. Commented Mar 10, 2021 at 17:00
  • @jellycsc The issue is that I already have multiple packages across 5 layers, close to the 260 MB limit. The temporary Lambda folder has an additional 512 MB of space, so this solution can work for me. Commented Mar 10, 2021 at 17:12
  • Ok, I see. You can try the EFS integration then. Commented Mar 10, 2021 at 17:14
  • @jellycsc Do you mean SageMaker? Can you send some reference/material? Commented Mar 10, 2021 at 17:16
  • No, here is what I mean (see the sketch after these comments): aws.amazon.com/blogs/compute/… Commented Mar 10, 2021 at 17:32
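For context on the EFS suggestion above, a hedged sketch of how it is typically used (the mount path /mnt/efs/lib is a placeholder; it assumes the function has an EFS access point configured and that packages were installed there beforehand, e.g. with pip install --target):

import sys

# Hypothetical EFS mount path configured on the Lambda function.
EFS_PACKAGES = "/mnt/efs/lib"

# Make the EFS-hosted packages importable, then import as usual.
if EFS_PACKAGES not in sys.path:
    sys.path.insert(0, EFS_PACKAGES)

import xgboost  # noqa: F401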

2 Answers


You could try using the boto3 library to download the file from S3 to the /tmp directory, as explained in https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Client.download_file

import boto3
s3 = boto3.resource('s3')
s3.meta.client.download_file('mybucket', 'hello.txt', '/tmp/hello.txt')
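Building on that, a minimal sketch of how the full download-extract-import flow could look with boto3 (the bucket name, key, and target path here are placeholders, not from the original post):

import sys
import zipfile

import boto3

def load_xgboost_from_s3(bucket, key):
    # Download the zipped package from S3 into /tmp, extract it there,
    # and make /tmp importable before importing the module.
    s3 = boto3.client("s3")
    s3.download_file(bucket, key, "/tmp/xgboost.zip")
    with zipfile.ZipFile("/tmp/xgboost.zip") as z:
        z.extractall("/tmp/")
    if "/tmp/" not in sys.path:
        sys.path.insert(0, "/tmp/")
    import xgboost
    return xgboost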

2 Comments

This could be the solution, but the issue is that when you try to import a large Python package (such as xgboost), the downloaded .zip file and its extracted folder in the /tmp directory are larger than 500 MB, which results in: [ERROR] OSError: [Errno 28] No space left on device
The /tmp directory (ephemeral storage) of a Lambda function can now be expanded up to 10 GB; a sketch of how to configure that follows.
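As an illustration (assuming a reasonably recent boto3 version; the function name is a placeholder), the ephemeral storage size could be raised like this:

import boto3

lambda_client = boto3.client("lambda")

# Raise /tmp (ephemeral storage) from the default 512 MB to 10 GB.
# "my-function" is a placeholder for the actual function name.
lambda_client.update_function_configuration(
    FunctionName="my-function",
    EphemeralStorage={"Size": 10240},  # size in MB, 512-10240
)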

Actually, my code above works, and I had a rather silly error. Instead of zipping the xgboost package folders (xgboost, xgboost.libs and xgboost.dist-info) directly, I zipped their parent folder, which I had named package-xgboost, and that didn't work in AWS Lambda. Be sure that you actually zip those 3 folders directly; a quick way to check the zip layout is sketched below.

Also, make sure your xgboost library is up to date. Previously I used version 1.2.1, which didn't work either. Upgrading the library and zipping the newest xgboost version (in my case 1.3.3) finally worked.
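As an illustration (not from the original answer), a small check that could be run locally before uploading, to confirm the archive's top-level entries are the package folders themselves and not a single wrapping directory:

import zipfile

def top_level_entries(zip_path):
    # Collect the first path component of every member in the archive.
    # For a correctly built archive this should contain entries like
    # "xgboost", not one wrapper folder such as "package-xgboost".
    with zipfile.ZipFile(zip_path) as z:
        return sorted({name.split("/")[0] for name in z.namelist()})

print(top_level_entries("xgboost.zip"))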
