I want to read an audio file from S3 directly in Python.

First, I record the audio. Here are my blob settings:

blob = new Blob(audioChunks,{type: 'audio/wav'});

Then, using Django, I upload this file to S3:

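import base64
import random
from io import BytesIO
from django.core.files.storage import default_storage

req = request.POST.get('data')
d = req.split(",")[1]  # drop the "data:...;base64," prefix of the data URL
file_content_io = BytesIO(base64.b64decode(d))
s3_path = 'audio/file_name_{}.wav'.format(random.randint(0, 99))
default_storage.save(s3_path, file_content_io)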

Then I download the file directly:

from scipy.io.wavfile import read
from io import BytesIO
from urllib.request import urlopen

with urlopen(file) as response:
    audio=BytesIO(response.read())
    speech_array=read(audio)

Now it gives me the following error:

ValueError: File format b'OggS' not understood. Only 'RIFF' and 'RIFX' supported.

Is there a solution? I also tried librosa, but that does not work either. All I want is to read the file directly, without saving it to disk.

1 Answer

I don't know if you are willing to switch from urllib to boto3, but I resolved the issue of loading .wav files from S3 directly into Python using boto3:

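import io
import boto3
import librosa

s3 = boto3.resource('s3')
bucket = s3.Bucket('bucket_name')
for file in bucket.objects.filter(Prefix='your_prefix'):
    # Read the whole object into memory; nothing is written to disk.
    bin_obj = file.get()['Body'].read()
    # librosa.load returns a tuple: (samples, sampling rate).
    data, sample_rate = librosa.load(io.BufferedReader(io.BytesIO(bin_obj)))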
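
One more thing that may be worth checking: the error in the question says the downloaded bytes start with b'OggS', which is the signature of an Ogg container, not a RIFF/WAV file. The type passed to the Blob constructor is only a label and does not convert the recorded data, so a file named .wav can still hold Ogg data. Below is a minimal sketch for inspecting the first bytes before choosing a decoder. The helper names sniff_audio_container and make_silent_wav are hypothetical, made up for this illustration, and only the two signatures mentioned in the error message are handled.

```python
import io
import wave


def sniff_audio_container(data: bytes) -> str:
    """Guess the container from the first four bytes (hypothetical helper)."""
    magic = data[:4]
    if magic in (b"RIFF", b"RIFX"):
        return "wav"   # the only headers scipy.io.wavfile.read accepts
    if magic == b"OggS":
        return "ogg"   # Ogg container, which a WAV-only reader rejects
    return "unknown"


def make_silent_wav(num_frames: int = 8, sample_rate: int = 16000) -> bytes:
    """Build a tiny mono 16-bit WAV entirely in memory, for demonstration."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(b"\x00\x00" * num_frames)
    return buffer.getvalue()


if __name__ == "__main__":
    print(sniff_audio_container(make_silent_wav()))
    print(sniff_audio_container(b"OggS\x00\x02"))
```

In the loop above, calling sniff_audio_container(bin_obj) before decoding would show whether the object is really WAV. If it reports ogg, a WAV-only reader will keep failing no matter how the bytes are downloaded, and a decoder that understands Ogg is needed instead.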