
I am trying to read a wave file so that I can take its raw bytes and build an array of time->amplitude points.

import wave

class WaveFile:

    # `fileName` is the name of the wav file to open
    def __init__(self, fileName):
        self.wf = wave.open(fileName, 'r')
        self.soundBytes = self.wf.readframes(-1)
        self.timeAmplitudeArray = self.__calcTimeAmplitudeArray()


    def __calcTimeAmplitudeArray(self):
        self.internalTimeAmpList = [] # zero out the internal representation

        byteList = self.soundBytes
        if((byteList[i+1] & 0x080) == 0):
            amp = (byteList[i] & 0x0FF) + byteList[i+1] << 8
            #more code continues.....

Error:

if((int(byteList[i+1]) & 0x080) == 0):
TypeError: unsupported operand type(s) for &: 'str' and 'int'

I have tried using int() to convert to an integer type, but to no avail. I come from a Java background, where this would be done using the byte type, but that does not appear to be a language feature of Python. Any direction would be appreciated.

2 Answers

Your problem comes from the fact that the wave library is just giving you raw binary data (in the form of a string).
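To make that concrete, here is a minimal sketch of decoding 16-bit little-endian samples by hand. It assumes Python 3, where indexing a bytes object already yields an int (in Python 2 each element is a one-character str, so ord() would be needed first). The helper name decode_sample and the sample bytes are made up for the illustration.

```python
import struct

def decode_sample(raw, i):
    """Return the signed 16-bit little-endian sample starting at raw[i]."""
    lo = raw[i]              # low byte, already an int in Python 3
    hi = raw[i + 1]          # high byte
    value = lo | (hi << 8)   # shift the high byte first, then combine
    if hi & 0x80:            # sign bit set, so the value is negative
        value -= 1 << 16
    return value

raw = b"\x01\x00\xff\xff\x00\x80"   # hypothetical data: 1, -1, -32768
manual = [decode_sample(raw, i) for i in range(0, len(raw), 2)]
packed = list(struct.unpack("<3h", raw))  # same decoding via the struct module
print(manual, packed)
```

Note that in the original expression `a + b << 8` the addition happens before the shift, so the explicit parentheses around the shift are needed there.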

You'll probably need to check the form of the data with self.wf.getparams(). This returns (nchannels, sampwidth, framerate, nframes, comptype, compname). If you do have 1 channel, a sample width of 2, and no compression (fairly common type of wave), you can use the following (import numpy as np) to get the data:

byteList = np.fromstring(self.soundBytes,'<h')

This returns a numpy array with the data, so you don't need to loop. You'll need something different in the second parameter if you have a different sample width. I've tested this with a simple .wav file, and plot(byteList); show() (pylab mode in IPython) worked.

See Reading *.wav files in Python for other methods to do this.

Numpyless version

If you need to avoid numpy, you can do:

import array
byteList = array.array('h')
byteList.fromstring(self.soundBytes)

This works like before (tested with plot(byteList); show()). 'h' means signed short. len() and the other usual sequence operations work on the result. This does load the whole wav file at once, but .wav files are usually (though not always) small.
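For reference, a rough Python 3 sketch of the same idea, where array.fromstring has been replaced by array.frombytes. It assumes mono, 16-bit, uncompressed audio; the function name read_mono_16bit and the in-memory test file are only there to make the example self-contained and are not part of the original code.

```python
import array
import io
import sys
import wave

def read_mono_16bit(wav_source):
    """Return (framerate, samples) for a mono, 16-bit PCM WAV file."""
    with wave.open(wav_source, "rb") as wf:
        if wf.getnchannels() != 1 or wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles mono 16-bit audio")
        framerate = wf.getframerate()
        raw = wf.readframes(wf.getnframes())
    samples = array.array("h")
    samples.frombytes(raw)          # Python 3 name for fromstring
    if sys.byteorder == "big":      # WAV sample data is little-endian
        samples.byteswap()
    return framerate, samples

# Build a tiny WAV in memory so the example runs without a file on disk.
buf = io.BytesIO()
with wave.open(buf, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)
    out.setframerate(8000)
    out.writeframes(b"\x00\x00\xe8\x03\x18\xfc")   # samples 0, 1000, -1000
buf.seek(0)

rate, samples = read_mono_16bit(buf)
# Pair each sample with its time in seconds: the time->amplitude points.
time_amplitude = [(n / rate, amp) for n, amp in enumerate(samples)]
print(time_amplitude)
```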


3 Comments

Unfortunately I cannot use numpy because our developers work on 64 bit windows machines.
Anyway, use self.wf.getnframes() to get the length of the array.
By the way, I usually use 32 bit python on 64 bit windows, so I use the official builds of numpy, etc. If you run 64 bit python and are interested in going that route, this very useful site should help: Unofficial Windows Binaries for Python Extension Packages.

I usually use the array module and its fromstring method for this.

My standard pattern for operating on chunks of data is this:

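import array

def bytesfromfile(f):
    # yield the file's contents as array('B') chunks of up to 8192 bytes
    while True:
        raw = array.array('B')
        raw.fromstring(f.read(8192))
        if not raw:
            break
        yield raw

# f_in is the path of the input file
with open(f_in, 'rb') as fd_in:
    for chunk in bytesfromfile(fd_in):
        pass  # do stuff with each chunk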

Above, 'B' denotes unsigned char, i.e. one byte per element.
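A generator has no len(), so if the total byte count is needed, one option is to add up the lengths of the chunks as they are consumed. The following is a sketch for Python 3 (using frombytes rather than fromstring); the chunk_size parameter and the in-memory fake_file are assumptions added so that the example runs on its own.

```python
import array
import io

def bytes_from_file(f, chunk_size=8192):
    """Yield array('B') chunks of at most chunk_size bytes until EOF."""
    while True:
        data = f.read(chunk_size)
        if not data:                # empty bytes object means end of file
            break
        chunk = array.array("B")
        chunk.frombytes(data)
        yield chunk

# Hypothetical input: 20000 bytes held in memory instead of a real file.
fake_file = io.BytesIO(bytes(range(250)) * 80)
sizes = [len(chunk) for chunk in bytes_from_file(fake_file)]
total_bytes = sum(sizes)
print(sizes, total_bytes)
```

Each individual chunk is an ordinary array, so len(chunk) works on it even though len() does not work on the generator itself.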

If the file isn't huge, then you can just slurp it:

In [8]: f = open('foreman_cif_frame_0.yuv', 'rb')

In [9]: raw = array.array('B')

In [10]: raw.fromstring(f.read())

In [11]: raw[0:10]
Out[11]: array('B', [10, 40, 201, 255, 247, 254, 254, 254, 254, 254])

In [12]: len(raw)
Out[12]: 152064

Guido can't be wrong...

If you instead prefer numpy, I tend to use:

    import numpy as np

    fd_i = open('file.bin', 'rb')
    fd_o = open('out.bin', 'wb')

    while True:
        # Read as uint8
        chunk = np.fromfile(fd_i, dtype=np.uint8, count=8192)
        # stop at end of file (an empty chunk), not on a chunk of zeros
        if chunk.size == 0:
            break
        # use int for calculations since uint wraps
        chunk = chunk.astype(int)
        # do some calculations
        data = ...

        # convert back to uint8 prior to writing.
        data = data.astype(np.uint8)
        data.tofile(fd_o)

    fd_i.close()
    fd_o.close()

Or, to read the whole file:

In [18]: import numpy as np

In [19]: f = open('foreman_cif_frame_0.yuv', 'rb')

In [20]: data = np.fromfile(f, dtype=np.uint8)

In [21]: data[0:10]
Out[21]: array([ 10,  40, 201, 255, 247, 254, 254, 254, 254, 254], dtype=uint8)

3 Comments

Unfortunately I cannot use numpy as I am developing on 64 bit Windows. The end environment will be a Linux box, but I develop on a Windows box. :( Could you explain the 8192 magic number in there?
Then go for the array approach. Added an example above... 8192 is just the chunk size; in this case I'm reading 8192 bytes per iteration...
What would be a good way to get the number of bytes from the return array? len() does not work on a "generator". Sorry, I am very new to Python so I don't know a lot about the standard libraries or even basic language syntax.
