I am working with mlflow==2.19.0 on Red Hat Enterprise Linux Server release 7.9 (Maipo). Everything works fine except the log_image method, which for some reason converts part of the artifact file name into special characters:
Wrong path: /data/gcprcmsbx/work/hive/gcprcmsbx_work/citilabs/sherlock/mlflow/artifacts/ip70574/1/3403fe396bf64129980fb6e7f26b5875/artifacts/images/sample_image%step%0%timestamp%1745516773738�f3cea2-6baa-4b8a-9107-68f0ab83b572.png
Right path: hdfs:/data/gcprcmsbx/work/hive/gcprcmsbx_work/citilabs/sherlock/mlflow/artifacts/ip70574/1/3403fe396bf64129980fb6e7f26b5875/artifacts/images/sample_image%step%0%timestamp%1745516773738%daf3cea2-6baa-4b8a-9107-68f0ab83b572.png
I tried setting each of the following environment variables
os.environ["ESCAPE_PERCENT"] = "1"
os.environ["LANG"] = "en_US.UTF-8"
os.environ["LC_ALL"] = "en_US.UTF-8"
os.environ["PYTHONUTF8"] = "1"
os.environ["PYTHONIOENCODING"] = "UTF-8"
os.environ["NO_PROXY"] = "*"
in my Python script, but none of them had any effect.
Does anyone know how to address this issue? By the way, I cannot use sudo because I do not have admin permissions, and I have to log the images in that format so that I can compare images between runs in MLflow.
Keep in mind that I am using the log_image() method, which is defined in MLflow as follows:
def log_image(
    image: Union["numpy.ndarray", "PIL.Image.Image", "mlflow.Image"],
    artifact_file: Optional[str] = None,
    key: Optional[str] = None,
    step: Optional[int] = None,
    timestamp: Optional[int] = None,
    synchronous: Optional[bool] = False,
) -> None:
    """
Logs an image in MLflow, supporting two use cases:
1. Time-stepped image logging:
Ideal for tracking changes or progressions through iterative processes (e.g.,
during model training phases).
- Usage: :code:`log_image(image, key=key, step=step, timestamp=timestamp)`
2. Artifact file image logging:
Best suited for static image logging where the image is saved directly as a file
artifact.
- Usage: :code:`log_image(image, artifact_file)`
The following image formats are supported:
- `numpy.ndarray`_
- `PIL.Image.Image`_
.. _numpy.ndarray:
https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html
.. _PIL.Image.Image:
https://pillow.readthedocs.io/en/stable/reference/Image.html#PIL.Image.Image
- :class:`mlflow.Image`: An MLflow wrapper around PIL image for convenient image logging.
Numpy array support
- data types:
- bool (useful for logging image masks)
- integer [0, 255]
- unsigned integer [0, 255]
- float [0.0, 1.0]
.. warning::
- Out-of-range integer values will raise ValueError.
- Out-of-range float values will auto-scale with min/max and warn.
- shape (H: height, W: width):
- H x W (Grayscale)
- H x W x 1 (Grayscale)
- H x W x 3 (an RGB channel order is assumed)
- H x W x 4 (an RGBA channel order is assumed)
Args:
image: The image object to be logged.
artifact_file: Specifies the path, in POSIX format, where the image
will be stored as an artifact relative to the run's root directory (for
example, "dir/image.png"). This parameter is kept for backward compatibility
and should not be used together with `key`, `step`, or `timestamp`.
key: Image name for time-stepped image logging. This string may only contain
alphanumerics, underscores (_), dashes (-), periods (.), spaces ( ), and
slashes (/).
step: Integer training step (iteration) at which the image was saved.
Defaults to 0.
timestamp: Time when this image was saved. Defaults to the current system time.
synchronous: *Experimental* If True, blocks until the image is logged successfully.
.. code-block:: python
:caption: Time-stepped image logging numpy example
import mlflow
import numpy as np
image = np.random.randint(0, 256, size=(100, 100, 3), dtype=np.uint8)
with mlflow.start_run():
mlflow.log_image(image, key="dogs", step=3)
.. code-block:: python
:caption: Time-stepped image logging pillow example
import mlflow
from PIL import Image
image = Image.new("RGB", (100, 100))
with mlflow.start_run():
mlflow.log_image(image, key="dogs", step=3)
.. code-block:: python
:caption: Time-stepped image logging with mlflow.Image example
import mlflow
from PIL import Image
# If you have a preexisting saved image
Image.new("RGB", (100, 100)).save("image.png")
image = mlflow.Image("image.png")
with mlflow.start_run() as run:
mlflow.log_image(run.info.run_id, image, key="dogs", step=3)
.. code-block:: python
:caption: Legacy artifact file image logging numpy example
import mlflow
import numpy as np
image = np.random.randint(0, 256, size=(100, 100, 3), dtype=np.uint8)
with mlflow.start_run():
mlflow.log_image(image, "image.png")
.. code-block:: python
:caption: Legacy artifact file image logging pillow example
import mlflow
from PIL import Image
image = Image.new("RGB", (100, 100))
with mlflow.start_run():
mlflow.log_image(image, "image.png")
"""
    run_id = _get_or_start_run().info.run_id
    MlflowClient().log_image(run_id, image, artifact_file, key, step, timestamp, synchronous)
You can use the following code to replicate what I am doing:
import io

import matplotlib.pyplot as plt
import mlflow
from PIL import Image

# Build a simple line plot and save a JPEG copy to disk.
x = range(10)
y = [i**2 for i in x]
plt.plot(x, y, marker='o')
plt.title('Basic Line Plot')
plt.xlabel('X-axis')
plt.ylabel('Y-axis')
name_jpeg = 'line_plot.jpeg'
plt.savefig(name_jpeg, format='jpeg')
fig = plt.gcf()

def fig2img(fig):
    """Converts a Matplotlib figure to a PIL Image."""
    buf = io.BytesIO()
    fig.savefig(buf, format='png', bbox_inches='tight')
    buf.seek(0)
    img = Image.open(buf)
    return img

pil_image = fig2img(fig=fig)
# Passing key (instead of artifact_file) selects time-stepped image logging.
mlflow.log_image(image=pil_image,
                 key='sample_image',
                 step=None,
                 timestamp=None)
Note: I must pass a key (key is not None) because I need to compare images between runs.
urllib.parse.unquote() and it may convert %da to a strange character.
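To check that suspicion in isolation, here is a minimal sketch using only the standard library. The fragment is copied from the file name in the question; whether MLflow or the HDFS layer actually unquotes the artifact path at some point is an assumption here, not something this snippet confirms.

```python
from urllib.parse import quote, unquote

# Fragment copied from the file name in the question: the "%" separator
# followed by the first characters of the UUID ("daf3cea2...").
fragment = "%daf3cea2"

# unquote() reads "%da" as a percent-escape for the byte 0xDA. That byte is
# not valid UTF-8 on its own, so it is replaced with U+FFFD.
decoded = unquote(fragment)
print(ascii(decoded))

# Escaping the literal "%" first lets the value survive one round of unquoting.
escaped = quote(fragment, safe="")
print(escaped)
print(unquote(escaped) == fragment)
```

If the decoded value shows the same replacement character as the wrong path, a percent-decoding step somewhere between log_image() and the artifact store is a plausible place to look.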