I wrote a Python script, initialized with Poetry, that connects to the Google Drive API, and dockerized it using this Dockerfile:

# First stage: build the application
FROM python:3.10.2-slim as builder

# Set the working directory
WORKDIR /app

# Install Poetry and dependencies for building packages
RUN apt-get update && apt-get install -y libpq-dev gcc \
    && pip install --no-cache-dir --upgrade pip \
    && pip install --no-cache-dir poetry

# Copy necessary files
COPY . .

# Install Python dependencies in /.venv
RUN poetry config virtualenvs.create true \
    && poetry config virtualenvs.in-project true \
    && poetry install --no-interaction --no-ansi

# Second stage: create the final image
FROM python:3.10.2-slim

# Set the working directory
WORKDIR /home/appuser

# Install the libpq-dev package
RUN apt-get update && apt-get install -y libpq-dev

# Copy the virtual environment from the builder stage
COPY --from=builder /app/.venv /.venv
ENV PATH="/.venv/bin:$PATH"

# Copy only the necessary files to the image
COPY qdm_report_project/ qdm_report_project/

RUN ls ./qdm_report_project/

# Create a new user (to avoid security risks with the root user)
RUN useradd --create-home appuser \
    && mkdir -p /home/appuser/.config \
    && chown -R appuser:appuser /home/appuser/

# Create a new entrypoint script - Used for output files ownership
RUN echo '#!/bin/bash' > /entrypoint.sh \
    && echo 'if [ "$(id -u)" = "0" ]; then' >> /entrypoint.sh \
    && echo '  groupadd -g ${USER_GID} appusergrp' >> /entrypoint.sh \
    && echo '  usermod -u ${USER_UID} -g ${USER_GID} appuser' >> /entrypoint.sh \
    && echo '  exec runuser -u appuser -- python qdm_report_project/main.py "$@"' >> /entrypoint.sh \
    && echo 'else' >> /entrypoint.sh \
    && echo '  exec python qdm_report_project/main.py "$@"' >> /entrypoint.sh \
    && echo 'fi' >> /entrypoint.sh \
    && chmod +x /entrypoint.sh

# Switch to appuser
USER appuser

ENTRYPOINT ["/entrypoint.sh"]

The structure of the project inside the container looks like this:

[image: project directory structure inside the container]

main.py contains:

from googleDrive import GoogleApi


def main():
    ...  # settings is created earlier (omitted)
    google_api = GoogleApi(settings)

and googleDrive.py contains:

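import os
from pathlib import Path

import pkg_resources


def get_resource_path(package_name: str, file_name: str) -> str:
    """
    Get the path of a file at the top level of the package's resources folder.

    :param package_name: name of an importable package that contains a "resources" folder
    :param file_name: name of the file inside that folder
    :return: filesystem path to the resource
    """
    return pkg_resources.resource_filename(package_name, os.path.join("resources", file_name))


class GoogleApi:
    def __init__(self, config: LazySettings):  # LazySettings import omitted in the original
        ...
        # The package name is derived from the folder containing this file
        credentials_path = get_resource_path(Path(__file__).parent.name, "serv_acc_google.json")
        print(credentials_path)
        ...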

The code works fine when executed locally, but not inside the container:

root@7dea7641260b:/home/appuser/qdm_report_project# python main.py --config "/some_path"
Traceback (most recent call last):
  File "/.venv/lib/python3.10/site-packages/pkg_resources/__init__.py", line 416, in get_provider
    module = sys.modules[moduleOrReq]
KeyError: 'qdm_report_project'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/appuser/qdm_report_project/main.py", line 486, in <module>
    main()
  File "/home/appuser/qdm_report_project/main.py", line 416, in main
    google_api = GoogleApi(settings)
  File "/home/appuser/qdm_report_project/googleDrive.py", line 129, in __init__
    credentials_path = get_resource_path(Path(__file__).parent.name, "serv_acc_google.json")
  File "/home/appuser/qdm_report_project/googleDrive.py", line 117, in get_resource_path
    return pkg_resources.resource_filename(package_name, os.path.join("resources", file_name))
  File "/.venv/lib/python3.10/site-packages/pkg_resources/__init__.py", line 1375, in resource_filename
    return get_provider(package_or_requirement).get_resource_filename(
  File "/.venv/lib/python3.10/site-packages/pkg_resources/__init__.py", line 418, in get_provider
    __import__(moduleOrReq)
ModuleNotFoundError: No module named 'qdm_report_project'
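
The traceback shows the two steps pkg_resources takes: it first looks the name up in sys.modules, then falls back to importing it. A minimal sketch of the same check, using only the standard library, can tell whether a package name is importable from the current sys.path. The helper name is_importable is hypothetical and not part of the project.

```python
import importlib.util
import sys


def is_importable(package_name: str) -> bool:
    """Return True if the top-level package is already loaded or can be found on sys.path."""
    if package_name in sys.modules:
        return True
    # find_spec returns None when no finder can locate a top-level module
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # When a script is run as `python main.py`, sys.path[0] is the script's own directory
    print("sys.path[0]:", sys.path[0])
    print("qdm_report_project importable:", is_importable("qdm_report_project"))
```

Run from inside /home/appuser/qdm_report_project, sys.path[0] would be the package folder itself rather than its parent, so the name qdm_report_project can only resolve through an installed distribution.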

Do you have any idea what could be causing this?

Edit:

Running python -m pip list shows the following:

[image: output of python -m pip list]

The missing module is listed with "/app" as its editable project location.
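
Judging by the Dockerfile, /app appears to exist only in the builder stage, since the final stage copies just /.venv and qdm_report_project/. One possible workaround, sketched here under the assumption that the resources folder sits next to googleDrive.py (the screenshot is not reproduced, so this layout is a guess), is to build the path from the module's own location instead of asking pkg_resources to import the package by name. The base_dir parameter is hypothetical and only there to make the function easy to try out.

```python
from pathlib import Path
from typing import Optional


def get_resource_path(file_name: str, base_dir: Optional[Path] = None) -> str:
    """Return the path of file_name inside the "resources" folder under base_dir.

    base_dir defaults to the directory containing this module, so the package
    does not need to be importable by name.
    """
    if base_dir is None:
        base_dir = Path(__file__).resolve().parent
    resource = base_dir / "resources" / file_name
    if not resource.is_file():
        # Fail early with a readable message instead of a later open() error
        raise FileNotFoundError(f"resource not found: {resource}")
    return str(resource)
```

With this variant the call in GoogleApi.__init__ would become get_resource_path("serv_acc_google.json"). It trades the packaging-aware lookup for a plain filesystem path, which is only appropriate when the code always runs from an unpacked directory.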

Thank you

  • How are you actually running the container? How are you getting that shell? In particular, it looks like the command you're running in the debugging shell is in a different directory from the base directory of the application, and the fact that you're in the module directory as opposed to having qdm_report_project as a subdirectory of the current directory makes a difference. Commented Jan 21 at 12:27
  • The code is now executed via an Airflow DAG using the DockerOperator, which of course results in the same error. The shell I shared comes from this command: docker run -it -u 0 --entrypoint /bin/bash <image id> Commented Jan 21 at 14:29
  • If you remove all of the options – just docker run image-name – do you have the same problem, or does the application start up? How are you launching it in Airflow (you wouldn't be able to use an interactive shell there)? Commented Jan 21 at 14:36
  • Yes, docker run gives the same error. The problem is somewhere in the container, before the Airflow DAG is reached. Commented Jan 21 at 14:42
  • Running python -m pip list, I can see that the editable project location of the missing module is "/app". Commented Jan 21 at 14:53
