I am trying to perform offline recognition with my own trained model, following this doc: https://github.com/tensorflow/tfjs-models/tree/master/speech-commands

I ran into the same issue described in https://github.com/tensorflow/tfjs/issues/3820 and tried all the solutions suggested there, including the Colab notebook that builds the preprocessing model: https://colab.research.google.com/github/tensorflow/tfjs-models/blob/master/speech-commands/training/browser-fft/training_custom_audio_model_in_python.ipynb#scrollTo=1AjdTru5NnQQ. The notebook works fine with the WAV files it provides, but I get an array of NaN values when I use my own WAV files:

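import tensorflow as tf

# TARGET_SAMPLE_RATE, EXPECTED_WAVEFORM_LEN and preproc_model come from the Colab notebook.
filepath = '/my/own/file.wav'
file_contents = tf.io.read_file(filepath)
# Decode the WAV file, drop the channel axis, then add a batch axis.
waveform = tf.expand_dims(tf.squeeze(tf.audio.decode_wav(
    file_contents,
    desired_channels=-1,
    desired_samples=TARGET_SAMPLE_RATE).audio, axis=-1), 0)
# Crop to the length the preprocessing model expects.
cropped_waveform = tf.slice(waveform, begin=[0, 0], size=[1, EXPECTED_WAVEFORM_LEN])
spectrogram = tf.squeeze(preproc_model(cropped_waveform), axis=0)
print(spectrogram)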


Output:

tf.Tensor(
[[[nan]
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]
  ...
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]
  [nan]]], shape=(43, 232, 1), dtype=float32)

Is there a way to solve this problem?

For instance, should I modify my WAV files to match the provided ones, and if so, how? Did I miss an important preprocessing step when handling my own WAV files? Or is there a simpler way to do this in JavaScript instead of Python?

1 Answer

Your problem looks identical to the GitHub issue https://github.com/tensorflow/tfjs/issues/3820.

Can you check whether the input tensor you pass to preproc_model() contains a lot of zero entries? I think these zero entries are what cause the NaN problem.
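
One way to run that check is sketched below. It is only a rough diagnostic, not something taken from the notebook: the helper name zero_fraction is made up for illustration, and it works on a NumPy copy of the waveform rather than on the TensorFlow tensor directly.

```python
import numpy as np


def zero_fraction(waveform):
    """Return the fraction of samples that are exactly zero.

    Hypothetical helper for illustration; `waveform` is any array-like
    of audio samples (for example, a NumPy copy of the cropped waveform).
    """
    samples = np.asarray(waveform, dtype=np.float32).ravel()
    if samples.size == 0:
        raise ValueError("waveform is empty")
    # Comparing against 0.0 yields booleans; their mean is the zero fraction.
    return float(np.mean(samples == 0.0))
```

In eager mode you could call it as zero_fraction(cropped_waveform.numpy()). A value close to 1.0 would suggest the clip is mostly digital silence, which matches the suspicion above.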


2 Comments

Thanks! I've already solved the problem, but I haven't had time to write up a detailed explanation and solution.
That's great! preproc_model() seems to have a problem with input data that contains a lot of zero entries, which can happen if the recorder has some latency. If that is your case, adding small random noise to the input may help. (Just leaving this comment here in case someone needs the information.)
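
The noise suggestion could look roughly like the sketch below. The function name add_small_noise, the uniform distribution, and the 1e-6 amplitude are all assumptions chosen for illustration; the thread does not say what noise level is appropriate, and whether this removes the NaN values depends on the preprocessing model.

```python
import numpy as np


def add_small_noise(waveform, scale=1e-6, seed=None):
    """Return a float32 copy of `waveform` with low-amplitude noise added.

    Hypothetical helper: `scale` (assumed value) bounds the noise
    amplitude, and `seed` makes the result reproducible.
    """
    samples = np.asarray(waveform, dtype=np.float32)
    rng = np.random.default_rng(seed)
    # Uniform noise in [-scale, scale), cast to match the audio dtype.
    noise = rng.uniform(-scale, scale, size=samples.shape).astype(np.float32)
    return samples + noise
```

The result keeps the input shape, so it can be converted back with tf.constant(...) and passed to preproc_model() in place of the original waveform.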
