Azure Speech-Text '_io.BytesIO' object has no attribute '_handle'

Question

I am trying to transcribe a .wav file containing audio of someone talking. It is a mobile app, so I am using React Native and Expo Go for development. The audio is sent, encoded as Base64, to an Azure HTTP trigger function, where it is decoded and passed to Azure's speech recognition. I have made sure that the sample rate, channel count, and sample width are all correct for the SDK.

def speech_recognize_continuous_from_file(audio_data):
    speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)

    # ERROR OCCURS HERE: stream=audio_data does not work
    audio_config = speechsdk.audio.AudioConfig(stream=audio_data)

    speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)


def transcriptionFunction(req: func.HttpRequest) -> func.HttpResponse:
    logging.info('Python HTTP trigger function processed a request.')

    try:
        req_body = req.get_json()
        audioBase64 = req_body.get('audioBase64')

        # Converts base64 to wav
        decodedAudio = base64.b64decode(audioBase64)
        audioIO = io.BytesIO(decodedAudio)

        # Begins transcription
        speech_recognize_continuous_from_file(audioIO)
        

        return func.HttpResponse("Check Server Console for response", status_code=200)

I have tested my continuous speech recognition function with a .wav file on disk, so I know that part works. I have also checked that the format of the .wav file is correct. Because this is a serverless function, I cannot use filename=, as there is no local storage.
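As a sanity check on the format claim above, the decoded bytes can be inspected in memory with the standard-library wave module. This is only a sketch: describe_wav is a hypothetical helper name, not part of the Speech SDK, and it assumes the payload is an uncompressed PCM WAV file.

```python
import io
import wave


def describe_wav(wav_bytes: bytes) -> dict:
    """Read the WAV header from in-memory bytes and report its format."""
    # wave.open accepts any seekable file-like object, so no file on disk is needed.
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return {
            "sample_rate": wav.getframerate(),
            "channels": wav.getnchannels(),
            "bits_per_sample": wav.getsampwidth() * 8,
            "duration_seconds": wav.getnframes() / wav.getframerate(),
        }
```

Calling describe_wav(decodedAudio) right after the Base64 decode and logging the result would show whether the function receives the format the app is believed to send.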

Matt Drinkall · Accepted Answer · 2024-02-20 15:14:45Z

Your answer helped a lot and sent me down the right path. You are correct that I need to use filename. The next error I encountered was that you cannot store files in Azure Functions, as they are stateless. All that was needed to get around this was the tempfile library. The completed solution is below.

    with tempfile.NamedTemporaryFile(delete=False, suffix=".wav") as tmp_audio_file:
        tmp_audio_file.write(decodedAudio)
        tmp_filename = tmp_audio_file.name

    # Transcribe the temporary file and keep the resulting text
    text_result = speech_recognize_continuous_from_stream(tmp_filename)

    os.unlink(tmp_filename)
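In the snippet above, the temporary file is left behind if recognition raises before os.unlink runs. One possible variant, sketched under the assumption that the transcription step is any function taking a file path, wraps the call in try/finally. run_with_temp_wav is a hypothetical helper name introduced here for illustration.

```python
import os
import tempfile
from typing import Callable


def run_with_temp_wav(wav_bytes: bytes, transcribe: Callable[[str], str]) -> str:
    """Write wav_bytes to a temporary .wav file, transcribe it, then delete it."""
    # delete=False so the file can be reopened by path after this handle closes.
    with tempfile.NamedTemporaryFile(delete=False, suffix=".wav") as tmp:
        tmp.write(wav_bytes)
        tmp_path = tmp.name
    try:
        return transcribe(tmp_path)
    finally:
        # Runs whether transcribe() returned or raised.
        os.unlink(tmp_path)
```

With this helper, the call would look like run_with_temp_wav(decodedAudio, speech_recognize_continuous_from_stream).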

Dasari Kamali · Accepted Answer · 2024-02-11 08:15:48Z

The error '_io.BytesIO' object has no attribute '_handle' indicates that the object passed as stream is not the type the Speech SDK expects.

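The issue arises from passing a BytesIO object directly to speechsdk.audio.AudioConfig(stream=audio_stream). The stream parameter expects one of the SDK's own audio input stream objects (for example a PushAudioInputStream or PullAudioInputStream), not a Python file-like object. The SDK reads an internal _handle attribute from that object, and a BytesIO object has no such attribute, which causes the error.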
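For reference, a rough sketch of what an in-memory route through the stream parameter might look like, using the SDK's PushAudioInputStream and AudioStreamFormat classes. The helper names wav_to_pcm and audio_config_from_wav_bytes are hypothetical, the code assumes an uncompressed PCM WAV payload, and it has not been run against the live service here.

```python
import io
import wave


def wav_to_pcm(wav_bytes: bytes):
    """Return (raw PCM frames, sample rate, bits per sample, channels)."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        frames = wav.readframes(wav.getnframes())
        return frames, wav.getframerate(), wav.getsampwidth() * 8, wav.getnchannels()


def audio_config_from_wav_bytes(wav_bytes: bytes):
    """Build an AudioConfig backed by an SDK push stream instead of a file."""
    # Imported lazily so wav_to_pcm stays usable without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    pcm, rate, bits, channels = wav_to_pcm(wav_bytes)
    stream_format = speechsdk.audio.AudioStreamFormat(
        samples_per_second=rate, bits_per_sample=bits, channels=channels
    )
    push_stream = speechsdk.audio.PushAudioInputStream(stream_format=stream_format)
    push_stream.write(pcm)  # hand the raw samples (no WAV header) to the SDK
    push_stream.close()     # signal that no more audio will arrive
    return speechsdk.audio.AudioConfig(stream=push_stream)
```

The returned object would be passed as audio_config when constructing the SpeechRecognizer, which avoids touching the file system at all.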

One way to fix this is to write the decoded audio to a .wav file and pass its path with speechsdk.audio.AudioConfig(filename="temp.wav"), instead of passing the raw audio data directly to the speechsdk.audio.AudioConfig() constructor. Here's the modified code:

Code :

import logging
import azure.functions as func
import base64
import os
import azure.cognitiveservices.speech as speechsdk

speech_key = "<speech_key>"
service_region = "<speech_region>"

def speech_recognize_continuous_from_stream(audio_path):
    speech_config = speechsdk.SpeechConfig(subscription=speech_key, region=service_region)
    # Read the audio from the path passed in rather than a hard-coded file name
    audio_config = speechsdk.audio.AudioConfig(filename=audio_path)
    speech_recognizer = speechsdk.SpeechRecognizer(speech_config=speech_config, audio_config=audio_config)

    # Note: recognize_once() returns after a single utterance, despite the function name
    result = speech_recognizer.recognize_once()
    return result.text if result.reason == speechsdk.ResultReason.RecognizedSpeech else ""

def main(req: func.HttpRequest) -> func.HttpResponse:
    logging.info('Python HTTP trigger function processed a request.')

    try:
        req_body = req.get_json()
        audioBase64 = req_body.get('audioBase64')
        decodedAudio = base64.b64decode(audioBase64)
        
        with open("temp.wav", "wb") as audio_file:
            audio_file.write(decodedAudio)
        try:
            transcription_result = speech_recognize_continuous_from_stream("temp.wav")
        finally:
            # Remove the file even if recognition raises
            os.unlink("temp.wav")

        return func.HttpResponse(transcription_result, status_code=200)

    except Exception as e:
        logging.error(f"Error: {str(e)}")
        return func.HttpResponse("Internal Server Error", status_code=500)
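The handler above answers every failure with a 500, including a request whose audioBase64 field is missing or malformed. A minimal sketch of separating that case, assuming the body has already been parsed into a dict, is shown below. decode_audio_payload is a hypothetical helper name, not part of azure.functions.

```python
import base64
import binascii


def decode_audio_payload(body: dict) -> bytes:
    """Return the decoded audio bytes, or raise ValueError for a bad payload."""
    encoded = body.get("audioBase64")
    if not isinstance(encoded, str) or not encoded:
        raise ValueError("audioBase64 is missing or empty")
    try:
        # validate=True rejects characters outside the Base64 alphabet
        return base64.b64decode(encoded, validate=True)
    except binascii.Error as exc:
        raise ValueError("audioBase64 is not valid Base64") from exc
```

The handler could catch ValueError separately and return a 400 response, keeping the 500 for genuine server-side failures.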

Postman request body :

{
    "audioBase64":"your_base64_data"
}

Response :

Hello, this is a test of the speech synthesis service.

Output :

It ran successfully as shown below.

C:\Users\xxxxxxx\Documents\xxxxxxx>func start
Found Python version 3.10.11 (python).

Azure Functions Core Tools
Core Tools Version:       4.0.5030 Commit hash: N/A  (64-bit)
Function Runtime Version: 4.15.2.20177


Functions:

        HttpTrigger1: [GET,POST] http://localhost:7071/api/HttpTrigger1

For detailed output, run func with --verbose flag.
[2024-02-10T19:58:48.856Z] Worker process started and initialized.
[2024-02-10T19:58:54.658Z] Host lock lease acquired by instance ID '00000xxxxxxxxxxxxxxxxxx'.
[2024-02-10T19:58:56.634Z] Executing 'Functions.HttpTrigger1' (Reason='This function was programmatically called via the host APIs.', Id=3cd9c444b944xxxxxxxxxxxx)
[2024-02-10T19:58:56.843Z] Python HTTP trigger function processed a request.
[2024-02-10T19:59:00.598Z] Executed 'Functions.HttpTrigger1' (Succeeded, Id=3cd9c444xxxxxxxxxx, Duration=4040ms)
