Python: write a wav file into numpy float array

Question

ifile = wave.open("input.wav")

How can I read this file into a NumPy float array?

@JoranBeasley it has to be float.

– IAM, Commented May 27, 2013 at 18:51

SuperStormer · Accepted Answer · 2021-11-07 01:54:24Z

45

>>> import numpy
>>> from scipy.io.wavfile import read
>>> a = read("adios.wav")  # returns (sample_rate, data)
>>> numpy.array(a[1], dtype=float)
array([ 128.,  128.,  128., ...,  128.,  128.,  128.])

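read returns a tuple of (sample rate, data). The data is typically an integer array (raw bytes interpreted as ints, with the dtype depending on the file's bit depth); here we just convert it to float type.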

You can read about read here: https://docs.scipy.org/doc/scipy/reference/tutorial/io.html#module-scipy.io.wavfile
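Note that the conversion above only changes the dtype and does not rescale the values. The run of 128. values in the output suggests 8-bit data, which WAV stores as unsigned bytes with silence at 128. If floats in roughly [-1.0, 1.0) are wanted, a minimal sketch could look like the following; the helper name pcm_to_float is hypothetical and not part of SciPy or NumPy.

```python
import numpy as np

def pcm_to_float(data):
    """Scale integer PCM samples to float32 in roughly [-1.0, 1.0)."""
    data = np.asarray(data)
    if data.dtype == np.uint8:
        # 8-bit WAV data is unsigned, with silence at 128.
        return (data.astype(np.float32) - 128.0) / 128.0
    if np.issubdtype(data.dtype, np.signedinteger):
        # Divide by the magnitude of the most negative value, e.g. 32768 for int16.
        full_scale = float(-int(np.iinfo(data.dtype).min))
        return data.astype(np.float32) / full_scale
    # Already floating point: only change the dtype.
    return data.astype(np.float32)

silence_8bit = np.array([128, 128, 128], dtype=np.uint8)
print(pcm_to_float(silence_8bit))  # → [0. 0. 0.]
```

The array passed in would be the second element of the tuple returned by read, i.e. a[1] in the session above.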

edited Nov 7, 2021 at 1:54

SuperStormer


answered May 27, 2013 at 18:58

Joran Beasley


1 Comment

IAM Over a year ago

Thanks! One more question: how could I do this for all .wav files in the current working directory? I mean reading each file into an array in a loop, and concatenating it onto a main array at the end of each step?
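A minimal sketch of one way to approach this, using only the standard wave module and NumPy. It assumes every file holds 16-bit PCM samples; the function names read_wav_as_float and read_all_wavs are hypothetical. Multi-channel files are left interleaved, and files with different sample rates are concatenated without resampling.

```python
import glob
import os
import wave

import numpy as np

def read_wav_as_float(path):
    """Read one 16-bit PCM WAV file as float32 samples in [-1.0, 1.0)."""
    with wave.open(path, "rb") as wav_file:
        if wav_file.getsampwidth() != 2:
            raise ValueError(f"{path}: this sketch only handles 16-bit samples")
        frames = wav_file.readframes(wav_file.getnframes())
    # WAV stores 16-bit samples as little-endian signed integers.
    return np.frombuffer(frames, dtype="<i2").astype(np.float32) / 32768.0

def read_all_wavs(directory="."):
    """Concatenate every .wav file in `directory`, in sorted filename order."""
    paths = sorted(glob.glob(os.path.join(directory, "*.wav")))
    arrays = [read_wav_as_float(path) for path in paths]
    # Concatenate once at the end; growing an array inside the loop
    # would copy all previous data on every step.
    return np.concatenate(arrays) if arrays else np.empty(0, dtype=np.float32)
```

Calling read_all_wavs() with no argument reads from the current working directory.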

Matthew Walker · Accepted Answer · 2020-06-21 06:40:49Z

25

Seven years after the question was asked...

import wave
import numpy

# Read file to get buffer
ifile = wave.open("input.wav")
samples = ifile.getnframes()
audio = ifile.readframes(samples)

# Convert buffer to float32 using NumPy
# (assumes 16-bit samples, i.e. ifile.getsampwidth() == 2)
audio_as_np_int16 = numpy.frombuffer(audio, dtype=numpy.int16)
audio_as_np_float32 = audio_as_np_int16.astype(numpy.float32)

# Normalise float32 array so that values are between -1.0 and +1.0
max_int16 = 2**15
audio_normalised = audio_as_np_float32 / max_int16

edited Jun 21, 2020 at 6:40

answered Jun 10, 2020 at 8:05

Matthew Walker


9 Comments

Unsigned_Arduino Over a year ago

How should I install the wave module? pip install wave?

Matthew Walker Over a year ago

@Unsigned_Arduino Have you just tried it? According to the docs, the wave module has been part of Python since at least version 2.7, and it's still included in version 3.8: docs.python.org/3.8/library/wave.html

Unsigned_Arduino Over a year ago

Just tried it, it's included. I had never seen this module before, so I questioned its existence in the PSL.

Trees Over a year ago

Hi Matthew Walker, thanks for such a nice answer. I want to ask: the size of audio_normalised is twice that of samples, so does it represent data for 2 channels, or something else? Could you elaborate a bit?

Matthew Walker Over a year ago

@avocado getsampwidth() returns the sample width in bytes, so 2 bytes => int16, or 4 bytes => int32. I guess I just hadn't come across WAV files with anything other than 2 bytes per sample. Good point.
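The two points raised in these comments (sample width and channel count) can be sketched together as follows. This is an illustration rather than part of the original answer: the names DTYPE_BY_WIDTH and read_wav_frames are hypothetical, and 24-bit files are deliberately left out because NumPy has no 3-byte integer dtype.

```python
import wave

import numpy as np

# Little-endian dtypes for common PCM sample widths (in bytes).
# 8-bit WAV data is unsigned; 16-bit and 32-bit are signed.
DTYPE_BY_WIDTH = {1: np.dtype("u1"), 2: np.dtype("<i2"), 4: np.dtype("<i4")}

def read_wav_frames(path):
    """Return an integer array of shape (frames, channels) and the sample rate."""
    with wave.open(path, "rb") as wav_file:
        width = wav_file.getsampwidth()
        channels = wav_file.getnchannels()
        rate = wav_file.getframerate()
        raw = wav_file.readframes(wav_file.getnframes())
    if width not in DTYPE_BY_WIDTH:
        raise ValueError(f"unsupported sample width: {width} bytes")
    samples = np.frombuffer(raw, dtype=DTYPE_BY_WIDTH[width])
    # Interleaved samples (L, R, L, R, ...) become one row per frame.
    return samples.reshape(-1, channels), rate
```

For a stereo file the flat sample array is twice as long as getnframes(), because each frame holds one sample per channel; the reshape makes that explicit.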


Community · Accepted Answer · 2020-12-20 16:53:02Z

13

Use the librosa package and simply load the wav file into a NumPy array with:

y, sr = librosa.load(filename)

This loads and decodes the audio as a time series y, represented as a one-dimensional NumPy floating-point array. The variable sr contains the sampling rate of y, that is, the number of samples per second of audio. By default, all audio is mixed to mono and resampled to 22050 Hz at load time. This behavior can be overridden by supplying additional arguments to librosa.load().

More information is available in the librosa library documentation.

edited Dec 20, 2020 at 16:53

CommunityBot


answered Dec 13, 2019 at 15:07

Esterlinkof


Andreas Prokopiou · Accepted Answer · 2021-04-21 10:35:12Z

0

I don't have enough reputation to comment on @Matthew Walker's answer, so I am posting a new answer to add an observation: max_int16 should be 2**15-1, not 2**15.

Better yet, I think the normalization line should be replaced with:

audio_normalised = audio_as_np_float32 / numpy.iinfo(numpy.int16).max

If the audio is stereo (i.e. two channels), the left and right samples are interleaved, so the following can be used to get an array with one column per channel:

channels = ifile.getnchannels()
# Samples are interleaved frame by frame (L, R, L, R, ...), so reshaping
# to (frames, channels) puts each channel in its own column.
audio_stereo = audio_normalised.reshape(-1, channels)

I believe this answers @Trees's question in the comments section.

edited Apr 21, 2021 at 10:35

answered Apr 21, 2021 at 9:57

Andreas Prokopiou


1 Comment

Matthew Walker Over a year ago

The issue with the definition of max_int16 is interesting. The range of 16 bit integers is -32,768 to 32,767. If we want to scale from -1 to 1 then we want to divide by the largest possible value, in an absolute sense, or 32,768, which is 2**15. Hence the definition of max_int16 in my answer.
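The trade-off between the two divisors can be checked numerically. The snippet below is only an illustration of the arithmetic discussed in this comment and in the answer above; the variable names are made up for the example.

```python
import numpy as np

# The two extreme int16 values, converted to float32 before dividing.
extremes = np.array([-32768, 32767], dtype=np.int16).astype(np.float32)

by_2_15 = extremes / 2**15             # divide by 32768
by_int16_max = extremes / (2**15 - 1)  # divide by 32767

print(by_2_15)       # lowest value is exactly -1.0, highest is just below 1.0
print(by_int16_max)  # highest value is exactly 1.0, lowest is just below -1.0
```

Dividing by 2**15 keeps every sample inside [-1.0, 1.0), while dividing by 2**15-1 maps the largest positive sample to exactly 1.0 at the cost of letting -32768 fall slightly outside the range.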
