I have:

import librosa
from scipy import signal 
import scipy.io.wavfile as sf    

samples, sample_rate = sf.read(args.file)
nperseg = int(sample_rate * 0.001 * 20)
frequencies, times, spectrogram = signal.spectrogram(samples, 
                                                     sample_rate, 
                                                     nperseg=nperseg, 
                                                     window=signal.hann(nperseg))

audio_signal = librosa.griffinlim(spectrogram)
print(audio_signal, audio_signal.shape)

sf.write('test.wav', audio_signal, sample_rate)

However, this produces a (near) empty sound file.

  • To use Griffin Lim, you need a magnitude spectrogram. I'd try to specify the mode in your signal.spectrogram(... mode='magnitude') call. Haven't tested. Commented Feb 24, 2020 at 14:09
  • Nope. Same result Commented Feb 24, 2020 at 22:45
  • I can't comment on the librosa library. Assuming that is not the problem, did you try the scipy.io library for reading and writing the audio file (scipy.io.wavfile.read and scipy.io.wavfile.write)? Note that the order changes from signal, signal_rate to signal_rate, signal. (docs.scipy.org/doc/scipy/reference/…) Commented Feb 27, 2020 at 11:17
  • I'm using soundfile Commented Mar 1, 2020 at 18:37

2 Answers

As @DrSpill mentioned, the order used with scipy.io.wavfile.read and scipy.io.wavfile.write was wrong: read returns (rate, data) and write expects (filename, rate, data). The import from librosa was also incorrect. This should do it:

import librosa
import numpy as np
import scipy.signal
import scipy.io.wavfile

# read file
file    = "temp/processed_file.wav"
fs, sig = scipy.io.wavfile.read(file)
nperseg = int(fs * 0.001 * 20)

# process
frequencies, times, spectrogram = scipy.signal.spectrogram(sig, 
                                                           fs, 
                                                           nperseg=nperseg, 
                                                           window=scipy.signal.windows.hann(nperseg))
audio_signal = librosa.core.spectrum.griffinlim(spectrogram)
print(audio_signal, audio_signal.shape)

# write output
scipy.io.wavfile.write('test.wav', fs, np.array(audio_signal, dtype=np.int16))
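A caveat on the final cast above: if audio_signal is a floating-point array with small values, np.array(audio_signal, dtype=np.int16) truncates toward zero and can leave a file that is almost entirely silent. The following is a minimal sketch of peak-normalizing before the cast; the helper name float_to_int16, the peak parameter, and the synthetic 440 Hz test tone are assumptions made for illustration, not part of the original answer.

```python
import numpy as np


def float_to_int16(audio, peak=0.9):
    """Rescale a float signal so its largest sample sits at `peak` of the int16 range."""
    audio = np.asarray(audio, dtype=np.float64)
    max_abs = np.max(np.abs(audio))
    if max_abs == 0:
        # An all-zero signal stays silent; avoid dividing by zero.
        return np.zeros(audio.shape, dtype=np.int16)
    scaled = audio / max_abs * peak * np.iinfo(np.int16).max
    return np.round(scaled).astype(np.int16)


# Stand-in for a Griffin-Lim result: a quiet 440 Hz tone in floating point.
t = np.arange(8000) / 8000.0
audio_signal = 0.01 * np.sin(2 * np.pi * 440.0 * t)

naive = np.array(audio_signal, dtype=np.int16)   # direct cast, as in the code above
scaled = float_to_int16(audio_signal)

print(np.abs(naive).max(), np.abs(scaled).max())
```

Note that peak normalization discards the original level of the recording, so it may not be what you want if absolute loudness matters.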

Remark: The resulting file sounded sped up when I listened to it. I think this is caused by your processing parameters, but with some tweaking it should work.
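One possible explanation for the speed-up, offered as an assumption rather than something verified in this thread: scipy.signal.spectrogram defaults to noverlap = nperseg // 8, while librosa's griffinlim defaults to hop_length = n_fft // 4, with n_fft inferred from the number of frequency bins. The two stages then step through the signal at different rates. The sketch below only does the arithmetic; the function names are hypothetical, and nperseg = 160 corresponds to a 20 ms window at an assumed 8 kHz sampling rate.

```python
def scipy_default_hop(nperseg):
    """Frame step used by scipy.signal.spectrogram when noverlap is left unset."""
    noverlap = nperseg // 8          # scipy's documented default overlap
    return nperseg - noverlap


def librosa_default_hop(n_bins):
    """Frame step griffinlim falls back to when hop_length is left unset."""
    n_fft = 2 * (n_bins - 1)         # inferred from the spectrogram's row count
    return n_fft // 4


def speedup_factor(nperseg):
    """Ratio of the analysis hop to the synthesis hop for a one-sided spectrogram."""
    n_bins = nperseg // 2 + 1        # rows produced when nfft == nperseg
    return scipy_default_hop(nperseg) / librosa_default_hop(n_bins)


print(scipy_default_hop(160), librosa_default_hop(81), speedup_factor(160))
```

For a 160-sample window the analysis hop is 140 samples and the synthesis hop is 40, a ratio of 3.5. If this is indeed the cause, making the hops agree (for example by passing noverlap to signal.spectrogram and a matching hop_length to griffinlim) should restore the tempo; I have not checked this against the original file.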

A good alternative would be to use only librosa, like this:

import librosa
import numpy as np

# read file
file    = "temp/processed_file.wav"
sig, fs = librosa.core.load(file, sr=8000)

# process
abs_spectrogram = np.abs(librosa.core.spectrum.stft(sig))
audio_signal = librosa.core.spectrum.griffinlim(abs_spectrogram)

print(audio_signal, audio_signal.shape)

# write output
librosa.output.write_wav('test2.wav', audio_signal, fs)
 
2 Comments

Does the reconstructed file have the same number of samples?
Sorry for the late answer. I tested this further, and in the second solution one should specify the input sampling rate (I edited the code accordingly). Using ffmpeg, I verified that the input and output signals have the same sampling rate. However, you should be careful about the bit rate and the encoding when using the second solution. For more, please refer to the librosa documentation.

librosa.output was removed: librosa no longer provides its deprecated output module. Use soundfile.write instead:

import numpy as np
import soundfile as sf
sf.write('stereo_file.wav', np.random.randn(10, 2), 44100, 'PCM_24')

# Per your code, you could try:
sf.write('test.wav', audio_signal, sample_rate, 'PCM_24')
