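I am working with mlflow==2.19.0 on Red Hat Enterprise Linux Server release 7.9 (Maipo). Everything works fine except the log_image method, which for some reason converts part of the artifact path into special characters. In the wrong path below, the "hdfs:" prefix is missing and the "%da" before the UUID has been replaced by a single garbled character.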

wrong path /data/gcprcmsbx/work/hive/gcprcmsbx_work/citilabs/sherlock/mlflow/artifacts/ip70574/1/3403fe396bf64129980fb6e7f26b5875/artifacts/images/sample_image%step%0%timestamp%1745516773738�f3cea2-6baa-4b8a-9107-68f0ab83b572.png

right path hdfs:/data/gcprcmsbx/work/hive/gcprcmsbx_work/citilabs/sherlock/mlflow/artifacts/ip70574/1/3403fe396bf64129980fb6e7f26b5875/artifacts/images/sample_image%step%0%timestamp%1745516773738%daf3cea2-6baa-4b8a-9107-68f0ab83b572.png
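One possible reading of the difference between the two paths is that some component treats "%da" as a URL percent-escape: "da" is a valid pair of hex digits, so "%da" decodes to the single byte 0xDA, which is not valid UTF-8 on its own and is displayed as a replacement character. The sketch below only illustrates that mechanism on the tail of the file name using Python's standard urllib.parse.unquote; it is an assumption that something similar happens here, and it does not identify which layer (MLflow, the HDFS client, or something else) does the decoding.

```python
from urllib.parse import unquote

# Tail of the artifact file name from the "right path" above:
# the "%" separator followed by a UUID that happens to start with "da".
tail = "%daf3cea2-6baa-4b8a-9107-68f0ab83b572.png"

# unquote() reads "%da" as the byte 0xDA; decoding that lone byte as UTF-8
# fails, so it is replaced with U+FFFD (the replacement character).
decoded = unquote(tail)

print(repr(tail))
print(repr(decoded))
```

If this is what is happening, the problem would only show up when the random UUID begins with two hexadecimal digits that do not form a printable ASCII byte, which would make it appear intermittent.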

I tried setting each of the following environment variables

os.environ["ESCAPE_PERCENT"] = "1"
os.environ["LANG"] = "en_US.UTF-8"
os.environ["LC_ALL"] = "en_US.UTF-8"
os.environ["PYTHONUTF8"] = "1"
os.environ["PYTHONIOENCODING"] = "UTF-8"
os.environ["NO_PROXY"] = "*"

in my Python scripts, but none of them made a difference.
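A side note on why some of these settings may have no effect when set this way: PYTHONUTF8 and PYTHONIOENCODING are read once at interpreter startup, so assigning them through os.environ inside an already running script does not change the current process (it only affects child processes started afterwards). The following minimal check illustrates this for PYTHONUTF8; it says nothing about whether UTF-8 mode would fix the path problem.

```python
import os
import sys

# UTF-8 mode as decided when the interpreter started.
utf8_mode_before = sys.flags.utf8_mode

# Setting the variable now only changes the environment mapping;
# the running interpreter does not re-read it.
os.environ["PYTHONUTF8"] = "1"
utf8_mode_after = sys.flags.utf8_mode

print(utf8_mode_before, utf8_mode_after)
```

To actually apply such variables to the script itself, they would have to be exported in the shell (or passed on the command line, e.g. `python -X utf8`) before Python starts.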

Does anyone know how to address this issue? Two constraints: I cannot use sudo because I do not have admin permissions, and I have to log the images in this format so that I can compare images between runs in MLflow.

Keep in mind that I am using the mlflow.log_image() function, whose source is the following:

def log_image(
    image: Union["numpy.ndarray", "PIL.Image.Image", "mlflow.Image"],
    artifact_file: Optional[str] = None,
    key: Optional[str] = None,
    step: Optional[int] = None,
    timestamp: Optional[int] = None,
    synchronous: Optional[bool] = False,
) -> None:
    """
    Logs an image in MLflow, supporting two use cases:

    1. Time-stepped image logging:
        Ideal for tracking changes or progressions through iterative processes (e.g.,
        during model training phases).

        - Usage: :code:`log_image(image, key=key, step=step, timestamp=timestamp)`

    2. Artifact file image logging:
        Best suited for static image logging where the image is saved directly as a file
        artifact.

        - Usage: :code:`log_image(image, artifact_file)`

    The following image formats are supported:
        - `numpy.ndarray`_
        - `PIL.Image.Image`_

        .. _numpy.ndarray:
            https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html

        .. _PIL.Image.Image:
            https://pillow.readthedocs.io/en/stable/reference/Image.html#PIL.Image.Image

        - :class:`mlflow.Image`: An MLflow wrapper around PIL image for convenient image logging.

    Numpy array support
        - data types:

            - bool (useful for logging image masks)
            - integer [0, 255]
            - unsigned integer [0, 255]
            - float [0.0, 1.0]

            .. warning::

                - Out-of-range integer values will raise ValueError.
                - Out-of-range float values will auto-scale with min/max and warn.

        - shape (H: height, W: width):

            - H x W (Grayscale)
            - H x W x 1 (Grayscale)
            - H x W x 3 (an RGB channel order is assumed)
            - H x W x 4 (an RGBA channel order is assumed)

    Args:
        image: The image object to be logged.
        artifact_file: Specifies the path, in POSIX format, where the image
            will be stored as an artifact relative to the run's root directory (for
            example, "dir/image.png"). This parameter is kept for backward compatibility
            and should not be used together with `key`, `step`, or `timestamp`.
        key: Image name for time-stepped image logging. This string may only contain
            alphanumerics, underscores (_), dashes (-), periods (.), spaces ( ), and
            slashes (/).
        step: Integer training step (iteration) at which the image was saved.
            Defaults to 0.
        timestamp: Time when this image was saved. Defaults to the current system time.
        synchronous: *Experimental* If True, blocks until the image is logged successfully.

    .. code-block:: python
        :caption: Time-stepped image logging numpy example

        import mlflow
        import numpy as np

        image = np.random.randint(0, 256, size=(100, 100, 3), dtype=np.uint8)

        with mlflow.start_run():
            mlflow.log_image(image, key="dogs", step=3)

    .. code-block:: python
        :caption: Time-stepped image logging pillow example

        import mlflow
        from PIL import Image

        image = Image.new("RGB", (100, 100))

        with mlflow.start_run():
            mlflow.log_image(image, key="dogs", step=3)

    .. code-block:: python
        :caption: Time-stepped image logging with mlflow.Image example

        import mlflow
        from PIL import Image

        # If you have a preexisting saved image
        Image.new("RGB", (100, 100)).save("image.png")

        image = mlflow.Image("image.png")
        with mlflow.start_run() as run:
            mlflow.log_image(run.info.run_id, image, key="dogs", step=3)

    .. code-block:: python
        :caption: Legacy artifact file image logging numpy example

        import mlflow
        import numpy as np

        image = np.random.randint(0, 256, size=(100, 100, 3), dtype=np.uint8)

        with mlflow.start_run():
            mlflow.log_image(image, "image.png")

    .. code-block:: python
        :caption: Legacy artifact file image logging pillow example

        import mlflow
        from PIL import Image

        image = Image.new("RGB", (100, 100))

        with mlflow.start_run():
            mlflow.log_image(image, "image.png")
    """
    run_id = _get_or_start_run().info.run_id
    MlflowClient().log_image(run_id, image, artifact_file, key, step, timestamp, synchronous)

You can use the following code to reproduce the issue:

import io

import matplotlib.pyplot as plt
import mlflow
from PIL import Image

x = range(10)
y = [i**2 for i in x]

plt.plot(x, y, marker='o')

plt.title('Basic Line Plot')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')

name_jpeg = 'line_plot.jpeg'
plt.savefig(name_jpeg, format='jpeg')


fig = plt.gcf()

def fig2img(fig):
    """Converts a Matplotlib figure to a PIL Image."""
    buf = io.BytesIO()
    fig.savefig(buf, format='png', bbox_inches='tight')
    buf.seek(0)
    img = Image.open(buf)
    return img


pil_image = fig2img(fig=fig)

mlflow.log_image(image=pil_image,                 
                 key='sample_image',
                 step=None,
                 timestamp=None)

Note: I must pass a key (key is not None) because I need to compare images between runs.
