I'm trying to retrieve a big file from an API and save it to an Azure Storage account, so I am designing an Azure Function. I don't want my code to download all the data first and then write it all out. The API gives me an input data stream, and I would like to stream that data to an output blob.

Here is a small example:

import azure.functions as func

def main(req: func.HttpRequest, outputblob: func.Out[func.InputStream]) -> func.HttpResponse:
    name = "stranger"

    # mimic a stream
    for char in name:
        outputblob.set(char)

    return func.HttpResponse(
        "Hello "+name,
        status_code=200
    )

Here is my function.json:

{
  "scriptFile": "__init__.py",
  "bindings": [
    {
      "authLevel": "function",
      "type": "httpTrigger",
      "direction": "in",
      "name": "req",
      "methods": [
        "get",
        "post"
      ]
    },
    {
      "type": "http",
      "direction": "out",
      "name": "$return"
    },
    {
      "type": "blob",
      "direction": "out",
      "name": "outputblob",
      "path": "container/hello.txt",
      "connection": "connection_storage"
    }
  ]
}

When I open the file container/hello.txt in my storage account, it contains only the last character, "r", and is only 1 byte in size.

I think that each call to outputblob.set(data) overwrites the data in the output blob.

How can I stream data and append it to my output blob? I'd rather use output blob bindings, but I can use "ContainerClient" objects.

(EDIT: The docs specify that we can use streams as func.Out[func.InputStream].)

2 Answers

1

In the loop that mimics a stream, each call to outputblob.set() overwrites the content of the output blob, which is why only the last letter remains at the end.
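The overwrite behaviour can be reproduced without Azure. The sketch below uses a hypothetical stand-in class, FakeOut, and assumes that set() simply stores the most recent value, which matches what the question observes. It is not the real func.Out implementation.

```python
class FakeOut:
    """Hypothetical stand-in for the output binding (not the real func.Out)."""

    def __init__(self):
        self._value = None

    def set(self, value):
        # Assumption: each call replaces whatever was stored before.
        self._value = value

    def get(self):
        return self._value


def write_char_by_char(text, out):
    """Call set() once per character, as the loop in the question does."""
    for char in text:
        out.set(char)
    return out.get()


result = write_char_by_char("stranger", FakeOut())
print(result)  # only the value from the final set() call survives
```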

Solution: assign the whole byte array to outputblob in a single call.

See https://learn.microsoft.com/en-us/azure/azure-functions/functions-bindings-storage-blob-output?tabs=python for the reference.
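As a minimal sketch of this approach, the helper below joins the chunks and calls set() exactly once. The names write_all_at_once and RecordingOut are hypothetical, and out is assumed to be any object exposing set(), such as the outputblob binding from the question. It holds the whole payload in memory, so it only suits data that fits in RAM.

```python
def write_all_at_once(chunks, out):
    """Join every chunk into one bytes object and call set() a single time."""
    data = b"".join(chunks)
    out.set(data)  # one call, so no earlier content is overwritten
    return len(data)


class RecordingOut:
    """Hypothetical test double that records every set() call."""

    def __init__(self):
        self.calls = []

    def set(self, value):
        self.calls.append(value)


out = RecordingOut()
size = write_all_at_once((c.encode() for c in "stranger"), out)
```

In the function from the question, this would presumably be called as write_all_at_once(chunks, outputblob).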


7 Comments

Thanks for the answer. Unfortunately, I sometimes can't hold all the data in memory at once before writing it to the output blob. For instance, I might download a 10 GB file from an API, which would not fit in memory.
I am not a Python expert, but I guess there is a way to "append" bytes to an existing stream. Maybe that's the way to go?
If you find such a way, please write an answer and I'll accept it :)
The simplest way would be to copy the current content, append the new data manually, and set the result on the blob: out = outputblob.read().decode("utf-8") + char; outputblob.set(out)
Or to implement AppendBlobService from Azure Storage SDK :) learn.microsoft.com/en-us/python/api/azure-storage-blob/…
1

I used the upload_blob method of azure.storage.blob.BlobClient with blob_type="AppendBlob":

import logging

import requests
from azure.storage.blob import BlobServiceClient


def stream_to_blob(url: str, filename: str) -> int:
    """Download a stream from a URL into a blob."""
    logging.info("Downloading...")
    sess = requests.Session()
    get_fichier = sess.get(url, stream=True)
    get_fichier.raise_for_status()

    # Connect to the storage account:
    connection_str = "DefaultEndpointsProtocol=..."
    blob_service_client = BlobServiceClient.from_connection_string(connection_str)
    container_client = blob_service_client.get_container_client("mycontainer")
    blob_client = container_client.get_blob_client(filename)
    filesize = 0

    # Appending data one block at a time
    # We can upload data up to 4 MB at a time
    for block in get_fichier.iter_content(4 * 1024 * 1024):
        filesize += len(block)
        logging.info(
            "Appending %d bytes to the blob (total = %d)...", len(block), filesize
        )
        blob_client.upload_blob(block, blob_type="AppendBlob")

    logging.info("Downloading finished")
    return filesize
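An alternative sketch of the same idea uses the explicit append-blob calls create_append_blob() and append_block() on the blob client. Creating the blob first should mean each run starts from an empty blob rather than appending to leftover content from a previous run. The helper name append_stream, the FakeBlobClient double, and the 4 MiB block size (taken from the comment in the code above) are assumptions. blob_client is expected to be an azure.storage.blob.BlobClient, and chunks any iterable of bytes, for example get_fichier.iter_content(...).

```python
BLOCK_SIZE = 4 * 1024 * 1024  # assumed per-block limit, taken from the answer above


def append_stream(blob_client, chunks, block_size=BLOCK_SIZE):
    """Create an empty append blob, then append the stream block by block."""
    blob_client.create_append_blob()
    total = 0
    buffer = bytearray()
    for chunk in chunks:
        buffer.extend(chunk)
        # Flush full blocks so memory stays bounded by about one block plus one chunk.
        while len(buffer) >= block_size:
            blob_client.append_block(bytes(buffer[:block_size]))
            total += block_size
            del buffer[:block_size]
    if buffer:
        # Append whatever is left once the stream is exhausted.
        blob_client.append_block(bytes(buffer))
        total += len(buffer)
    return total


class FakeBlobClient:
    """Hypothetical in-memory double used only to exercise the helper."""

    def __init__(self):
        self.blocks = []

    def create_append_blob(self):
        self.blocks = []

    def append_block(self, data):
        self.blocks.append(data)


client = FakeBlobClient()
written = append_stream(client, [b"ab", b"cde", b"f"], block_size=4)
```

With a real client, the call would look like append_stream(blob_client, get_fichier.iter_content(BLOCK_SIZE)), reusing the names from the answer above.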
