I have a folder called data, which is created by the following code:

import shutil
import os

folderName = "..Data"
current_directory = os.getcwd()
final_directory = os.path.join(current_directory,folderName)
if os.path.exists(final_directory):
    shutil.rmtree(final_directory)
if not os.path.exists(final_directory):
    os.makedirs(final_directory)

I want to get the path of this data folder and reference it in a YAML file. I don't want to state the path manually, as this will run on Azure using Azure Pipelines. Is there a way I can get this path from a Python script into a YAML file? For reference, I am trying to create a URI folder data asset: https://learn.microsoft.com/en-us/azure/machine-learning/how-to-create-register-data-assets?tabs=CLI

1 Answer

If you have the option, you could use the Python SDK to create a uri_folder data asset.

Here is the sample provided in the docs, slightly tweaked for your case:

from azure.ai.ml.entities import Data
from azure.ai.ml.constants import AssetTypes
import shutil
import os

folderName = "..Data"
current_directory = os.getcwd()
final_directory = os.path.join(current_directory,folderName)
if os.path.exists(final_directory):
    shutil.rmtree(final_directory)
if not os.path.exists(final_directory):
    os.makedirs(final_directory)

# Supported paths include:
# local: './<path>'
# blob:  'https://<account_name>.blob.core.windows.net/<container_name>/<path>'
# ADLS gen2: 'abfss://<file_system>@<account_name>.dfs.core.windows.net/<path>/'
# Datastore: 'azureml://datastores/<data_store_name>/paths/<path>'

my_data = Data(
    path=final_directory,
    type=AssetTypes.URI_FOLDER,
    description="<description>",
    name="<name>",
    version='<version>'
)

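# ml_client is assumed to be an already authenticated azure.ai.ml.MLClient
# connected to your workspace; it is not created in this snippet.
ml_client.data.create_or_update(my_data)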

Update

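If YAML is the only option, you could try embedding the Python code in the YAML file as shown below. This is untested: YAML itself does not execute code, and the `|` block scalar stores the script as a literal multi-line string, so it only helps if whatever consumes the file evaluates that string.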

$schema: https://azuremlschemas.azureedge.net/latest/data.schema.json

# Supported paths include:
# local: ./<path>
# blob:  https://<account_name>.blob.core.windows.net/<container_name>/<path>
# ADLS gen2: abfss://<file_system>@<account_name>.dfs.core.windows.net/<path>/
# Datastore: azureml://datastores/<data_store_name>/paths/<path>
type: uri_folder
name: <name_of_data>
description: <description goes here>
path: |
    import os
    folderName = "..Data"
    current_directory = os.getcwd()
    final_directory = os.path.join(current_directory,folderName)
    print(final_directory)
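
An alternative that does not rely on anything evaluating the embedded script is to let the Python script write the YAML file itself, with the computed path already filled in. The following is only a minimal sketch under some assumptions: the helper names `render_data_asset_yaml` and `write_data_asset_yaml`, the output file name `data.yml`, and the example asset name and description are all made up for illustration, and the YAML is built with plain string formatting so that no extra package is needed.

```python
import os


def _quote(value):
    """Wrap a value in single quotes for YAML; an inner ' is escaped by doubling it."""
    return "'" + str(value).replace("'", "''") + "'"


def render_data_asset_yaml(folder_path, name, description):
    """Return the text of a uri_folder data asset YAML that points at folder_path."""
    lines = [
        "$schema: https://azuremlschemas.azureedge.net/latest/data.schema.json",
        "type: uri_folder",
        f"name: {_quote(name)}",
        f"description: {_quote(description)}",
        f"path: {_quote(folder_path)}",
    ]
    return "\n".join(lines) + "\n"


def write_data_asset_yaml(yaml_path, folder_path, name, description):
    """Write the rendered YAML to yaml_path and return yaml_path."""
    with open(yaml_path, "w", encoding="utf-8") as handle:
        handle.write(render_data_asset_yaml(folder_path, name, description))
    return yaml_path


if __name__ == "__main__":
    # Same path logic as in the question; the asset name and description
    # below are placeholders, not values from the original post.
    folderName = "..Data"
    final_directory = os.path.join(os.getcwd(), folderName)
    print(render_data_asset_yaml(final_directory, "my_data", "Folder created by the pipeline"))
```

In a pipeline step you would call `write_data_asset_yaml("data.yml", final_directory, ...)` after creating the folder, and a later step could then pass the generated file to the CLI as described in the linked docs (for example `az ml data create -f data.yml`, assuming the Azure ML CLI v2 extension is installed). Please test this in your own pipeline before relying on it.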

2 Comments

If possible, I want to avoid using the Python SDK for this. In my use case it would be better if I could use the Azure CLI, since this runs in Azure Pipelines.
@muhammad, I have updated the answer, but please test and see.