How to get a spectrogram offline with the right shape as an input to recognize()?

Question

I am trying to perform offline recognition with my own trained model according to this doc: https://github.com/tensorflow/tfjs-models/tree/master/speech-commands

I ran into the same issue described in https://github.com/tensorflow/tfjs/issues/3820 and tried all the solutions suggested there, including the preprocessing-model colab: https://colab.research.google.com/github/tensorflow/tfjs-models/blob/master/speech-commands/training/browser-fft/training_custom_audio_model_in_python.ipynb#scrollTo=1AjdTru5NnQQ. The colab works fine with the WAV files it provides, but returns an array of NaN values when I use my own WAV files:

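import tensorflow as tf

# TARGET_SAMPLE_RATE, EXPECTED_WAVEFORM_LEN and preproc_model come from the colab notebook.
filepath = '/my/own/file.wav'
file_contents = tf.io.read_file(filepath)
audio = tf.audio.decode_wav(
    file_contents,
    desired_channels=-1,
    desired_samples=TARGET_SAMPLE_RATE).audio
# Drop the channel axis (this assumes a mono file) and add a batch axis.
waveform = tf.expand_dims(tf.squeeze(audio, axis=-1), 0)
cropped_waveform = tf.slice(waveform, begin=[0, 0], size=[1, EXPECTED_WAVEFORM_LEN])
spectrogram = tf.squeeze(preproc_model(cropped_waveform), axis=0)
print(spectrogram)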


Output:

tf.Tensor(
[[[nan]
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]
   ...
   [nan]
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]]], shape=(43, 232, 1), dtype=float32)

Is there a way to solve this problem?

For instance, should I modify my WAV files to match the provided ones, and if so, how? Did I miss an important preprocessing step when handling my own WAV files? Or is there a simpler way to achieve this in JavaScript instead of Python?

EchoShao · Accepted Answer · 2021-09-08 04:00:11Z

Your problem is identical to the GitHub issue https://github.com/tensorflow/tfjs/issues/3820.

Can you check whether the input tensor to preproc_model() contains a lot of zero entries? I think these zero entries are what cause the NaN values.
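One way to run that check is sketched below. It is only an illustration: zero_report and its eps parameter are hypothetical names, not part of the colab or of tfjs-models, and it uses NumPy rather than TensorFlow, so a tensor such as cropped_waveform would first be converted with .numpy().

```python
import numpy as np

def zero_report(waveform, eps=0.0):
    """Summarize how much of a waveform is (near-)zero.

    Hypothetical helper for illustration; eps is the magnitude at or below
    which a sample is counted as zero.
    """
    samples = np.asarray(waveform, dtype=np.float32).ravel()
    if samples.size == 0:
        return {"zero_fraction": 0.0, "longest_zero_run": 0, "has_nan": False}
    is_zero = np.abs(samples) <= eps
    # Pad with False on both sides so every run of zeros has a start and an end edge.
    padded = np.concatenate(([False], is_zero, [False])).astype(np.int8)
    edges = np.diff(padded)
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)
    longest = int((ends - starts).max()) if starts.size else 0
    return {
        "zero_fraction": float(is_zero.mean()),
        "longest_zero_run": longest,
        "has_nan": bool(np.isnan(samples).any()),
    }

if __name__ == "__main__":
    # Toy input; with the colab this would be cropped_waveform.numpy().
    print(zero_report([0.0, 0.0, 0.5, 0.0]))
```

A high zero fraction or a long run of zeros would support the explanation above, but it does not prove it.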


2 Comments

Jia Li Over a year ago

Thanks! I've already solved the problem, but I haven't had time to write up a detailed explanation and solution.

EchoShao Over a year ago

That's great! preproc_model() seems to have trouble with input data containing a lot of zero entries, which may happen if the recorder has some latency. If that is your case, adding small random noise to the input may help. (Just leaving a comment here in case someone needs the information.)
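A minimal sketch of the noise suggestion, assuming the waveform holds float samples in [-1, 1] as returned by tf.audio.decode_wav. The helper name add_small_noise and the default scale of 1e-4 are assumptions made for illustration, not values taken from the colab or the GitHub issue.

```python
import numpy as np

def add_small_noise(waveform, scale=1e-4, seed=None):
    """Return a copy of the waveform with low-amplitude Gaussian noise added.

    Hypothetical helper; scale is an assumed noise standard deviation and
    may need tuning for real recordings.
    """
    rng = np.random.default_rng(seed)
    samples = np.asarray(waveform, dtype=np.float32)
    noise = rng.normal(0.0, scale, size=samples.shape).astype(np.float32)
    # Keep the result inside the [-1, 1] range used for decoded audio.
    return np.clip(samples + noise, -1.0, 1.0)

if __name__ == "__main__":
    silent = np.zeros((1, 1000), dtype=np.float32)  # stand-in for a waveform full of zeros
    noisy = add_small_noise(silent, seed=0)
    print(np.count_nonzero(noisy), "of", noisy.size, "samples are now non-zero")
```

With the colab, the noisy array could be passed to preproc_model() in place of cropped_waveform, for example after wrapping it with tf.convert_to_tensor.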
