So, I'm trying to use the Python Wave module to get an audio file and basically get all of the frames from it, examine them, and then write them back to another file. I tried to output the sound that I'm reading to another file just now, but it came out either as noise, or as no sound at all. So, I'm pretty sure that I'm not analyzing the file and getting the correct frames...? I'm dealing with a stereo 16-bit sound file. While I could use a simpler file to just understand the process, I eventually want to be able to accept any kind of sound file to work with, so I need to understand what the problem is.

I also noted that 32-bit sound files wouldn't be read by the Wave module - it gave me an error of "Unknown Format". Any ideas about that? Is it something I can bypass so that I could at least, for example, read 32-bit audio files, even if I can only 'render' 16-bit files?

I'm somewhat aware that wave files interleave the left and right channels (the first sample is for the left channel, the second for the right, etc.), but how do I separate the channels? Here's my code. I cut out the output code to just see if I'm reading the files correctly. I'm using Python 2.7.2:

import scipy
import wave
import struct
import numpy
import pylab

fp = wave.open('./sinewave16.wav', 'rb') # Problem loading certain kinds of wave files in binary?

samplerate = fp.getframerate()
totalsamples = fp.getnframes()
fft_length = 2048 # Guess
num_fft = (totalsamples / fft_length) - 2

temp = numpy.zeros((num_fft, fft_length), float)

leftchannel = numpy.zeros((num_fft, fft_length), float)
rightchannel = numpy.zeros((num_fft, fft_length), float)

for i in range(num_fft):

    tempb = fp.readframes(fft_length / fp.getnchannels() / fp.getsampwidth());

    #tempb = fp.readframes(fft_length)

    up = (struct.unpack("%dB"%(fft_length), tempb))

    #up = (struct.unpack("%dB"%(fft_length * fp.getnchannels() * fp.getsampwidth()), tempb))
    #print (len(up))
    temp[i,:] = numpy.array(up, float) - 128.0


temp = temp * numpy.hamming(fft_length)

temp.shape = (-1, fp.getnchannels())

fftd = numpy.fft.rfft(temp)

pylab.plot(abs(fftd[:,1]))

pylab.show()

#Frequency of an FFT should be as follows:

#The first bin in the FFT is DC (0 Hz), the second bin is Fs / N, where Fs is the sample rate and N is the size of the FFT. The next bin is 2 * Fs / N. To express this in general terms, the nth bin is n * Fs / N.
# (It would appear to me that n * Fs / N gives you the frequency in hertz, and you can use sqrt(real part**2 + imaginary part**2), i.e. abs() of the complex value, to find the magnitude of the signal.)

Currently, this will load the sound file, unpack it into a struct, and plot the sound file so that I can look at it, but I don't think it's getting all of the audio file, or it's not getting it correctly. Am I reading the wave file into the struct correctly? Are there any up-to-date resources on using Python to read and analyze wave / audio files? Any help would be greatly appreciated.
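For reference, here is a minimal sketch of how interleaved 16-bit stereo frames could be unpacked with only the standard library. It is written for Python 3 rather than the 2.7.2 used above, and the function name `read_stereo_16bit` and the `frames_per_block` parameter are hypothetical names chosen for illustration. The two points it tries to show are that `readframes()` takes a frame count (not a byte count) and that 16-bit samples need the signed `h` format rather than `B`.

```python
import struct
import wave


def read_stereo_16bit(path, frames_per_block=2048):
    """Yield (left, right) tuples of signed 16-bit samples, one block at a time.

    Hypothetical helper for illustration; it handles 16-bit stereo files only.
    """
    with wave.open(path, "rb") as wav:
        n_channels = wav.getnchannels()
        sample_width = wav.getsampwidth()
        if n_channels != 2 or sample_width != 2:
            raise ValueError("this sketch only handles 16-bit stereo files")
        while True:
            # readframes() takes a count of frames, not bytes. One frame holds
            # one sample per channel, so it is n_channels * sample_width bytes.
            raw = wav.readframes(frames_per_block)
            if not raw:
                break
            n_samples = len(raw) // sample_width
            # "<h" is a little-endian signed 16-bit integer. "B" would instead
            # split every sample into two unsigned bytes.
            samples = struct.unpack("<%dh" % n_samples, raw)
            # Samples are interleaved L, R, L, R, ... so stride by two.
            yield samples[0::2], samples[1::2]
```

Each yielded pair can then be converted to arrays and windowed separately, so that the left and right channels are never mixed into the same FFT block.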

1 Answer

Perhaps you should try SciPy's scipy.io.wavfile module:

http://docs.scipy.org/doc/scipy/reference/io.html
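A minimal sketch of what that might look like, assuming Python 3 with NumPy and SciPy installed. The helper names `load_channels` and `spectrum` are hypothetical, chosen only for this example. `wavfile.read` returns the sample rate and a NumPy array that is one-dimensional for mono files and shaped (frames, channels) for multi-channel files, so separating the channels is just a matter of slicing columns. Its documentation also lists 32-bit integer and 32-bit float data among the supported formats, which may help with the "Unknown Format" error.

```python
import numpy as np
from scipy.io import wavfile


def load_channels(path):
    """Return (sample_rate, list of 1-D arrays, one per channel)."""
    sample_rate, data = wavfile.read(path)
    # Mono files come back 1-D; stereo files come back as (n_frames, n_channels).
    if data.ndim == 1:
        data = data[:, np.newaxis]
    return sample_rate, [data[:, ch] for ch in range(data.shape[1])]


def spectrum(channel, sample_rate, fft_length=2048):
    """Magnitude spectrum of the first fft_length samples of one channel."""
    block = channel[:fft_length].astype(float)
    n = len(block)  # may be shorter than fft_length for very short files
    # Window the block, then take the magnitude of each complex FFT bin.
    magnitudes = np.abs(np.fft.rfft(block * np.hamming(n)))
    # Bin k sits at k * sample_rate / n Hz.
    freqs = np.fft.rfftfreq(n, d=1.0 / sample_rate)
    return freqs, magnitudes
```

For example, `sample_rate, (left, right) = load_channels('./sinewave16.wav')` followed by `spectrum(left, sample_rate)` would give the frequencies and magnitudes for the left channel of a stereo file.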

2 Comments

I just checked it out, and it appears to read the audio clearly, which is good. Thanks for the suggestion.
The link is dead, this is the new one: docs.scipy.org/doc/scipy/reference/io.html
