0 votes
1 answer
182 views

I tried to deploy Whisper on Azure ML. I am using the Whisper-openAI-v3 model for deployment. The endpoint creation succeeds, but the deployment failed with the error ResourceOperationFailed, and so the ...
Mostafa M. Galal
0 votes
0 answers
179 views

Why, in the transcribe stage, do we remove N_FRAMES from the mel, and why doesn't the for loop over mel_segment take the last segment if it is less than 3000 frames? Suppose that the mel = [80, 4100]; the first mel ...
AbdElRhaman Fakhrygmailcom
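For context on the question above: Whisper's transcribe() walks the mel spectrogram in windows of N_FRAMES (3000 frames, about 30 seconds of audio), and as I understand it, a shorter final slice is padded back up to N_FRAMES via pad_or_trim rather than skipped. A minimal sketch of just the slicing arithmetic (the helper function is illustrative, not Whisper's actual code; N_FRAMES and the [80, 4100] shape come from the question):

```python
# Sketch of how a mel spectrogram is split into fixed windows,
# assuming Whisper's N_FRAMES = 3000 (about 30 seconds of audio).
N_FRAMES = 3000

def segment_widths(total_frames: int, n_frames: int = N_FRAMES) -> list[int]:
    """Return the width of each mel slice taken by a seek loop."""
    widths = []
    seek = 0
    while seek < total_frames:
        widths.append(min(n_frames, total_frames - seek))
        seek += n_frames
    return widths

# For the [80, 4100] mel from the question: one full window plus
# a 1100-frame remainder, which is then padded back to N_FRAMES.
print(segment_widths(4100))  # prints [3000, 1100]
```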
0 votes
1 answer
149 views

I'm trying to use Mozilla DeepSpeech to transcribe text, but I'm running into issues importing the Model module. Here is my code: from deepspeech.model import model model_file_path='deepspeech-0.9.3-...
Beginner_coder
1 vote
0 answers
13 views

I'm adapting a Sphinx model for Brazilian Portuguese with my own data by following their tutorial and got stuck on the bw command in the "Accumulating observation counts" section. I made ...
Ícaro Lorran
0 votes
1 answer
526 views

It's a simple React package that converts user audio to text. I installed the package and tried its basic code example, but it shows an error: "RecognitionManager.js:247 Uncaught ReferenceError: ...
Aakash Saini
0 votes
1 answer
1k views

Word Information Lost (WIL) is a measure of the performance of an automated speech recognition (ASR) service (e.g. AWS Transcribe, Google Speech-to-Text, etc.) against a gold standard (usually human-...
jayp
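The WIL metric described in the question above can be computed directly from alignment counts. A minimal sketch, assuming the standard formulation WIL = 1 − WIP, where WIP = H²/(N₁·N₂), H is the hit count, N₁ = H+S+D is the reference length, and N₂ = H+S+I is the hypothesis length (the function name is illustrative):

```python
# Word Information Lost from alignment counts, assuming the standard
# formulation: WIL = 1 - H^2 / ((H+S+D) * (H+S+I)).
def word_information_lost(hits: int, subs: int, dels: int, ins: int) -> float:
    ref_len = hits + subs + dels   # N1: words in the reference
    hyp_len = hits + subs + ins    # N2: words in the hypothesis
    if ref_len == 0 or hyp_len == 0:
        return 1.0
    return 1.0 - (hits * hits) / (ref_len * hyp_len)

# 8 hits, 1 substitution, 1 deletion, 0 insertions:
print(round(word_information_lost(8, 1, 1, 0), 4))  # prints 0.2889
```

A perfect transcript (no errors) gives WIL = 0; a transcript sharing no words with the reference gives WIL = 1.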
4 votes
0 answers
841 views

I'm trying to use Google's Speech-to-Text v2 API for transcription and speaker diarization. Per this supported languages page, I should be able to create a Recognizer using the "long" model ...
jayp
0 votes
1 answer
33 views

I am going through hbka.pdf (the WFST paper): https://cs.nyu.edu/~mohri/pub/hbka.pdf. A WFST figure is included for reference. Here the input label i, the output label o, and the weight w of a transition are marked on the ...
Anantha Krishnan
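The i:o/w arc notation from the figure in the question above maps naturally onto a tuple-per-arc representation. A small illustrative sketch (not code from the paper) that follows one path and scores it in the tropical semiring, where a path's weight is the sum of its arc weights:

```python
# Toy WFST arcs in the paper's i:o/w notation: each transition is
# (source state, input label i, output label o, weight w, destination state).
arcs = [
    (0, "a", "x", 0.5, 1),
    (1, "b", "y", 1.5, 2),
    (2, "c", "z", 2.0, 3),
]

def transduce(arcs, start, inputs):
    """Follow arcs matching the input labels; return (outputs, total weight).

    In the tropical semiring, extending a path multiplies weights via
    addition, so the path weight is the sum of its arc weights.
    """
    state, outputs, weight = start, [], 0.0
    for symbol in inputs:
        arc = next(a for a in arcs if a[0] == state and a[1] == symbol)
        outputs.append(arc[2])   # emit the output label o
        weight += arc[3]         # accumulate the weight w
        state = arc[4]
    return outputs, weight

print(transduce(arcs, 0, ["a", "b", "c"]))  # prints (['x', 'y', 'z'], 4.0)
```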
1 vote
0 answers
693 views

I'm trying to fine-tune whisper-medium for the Korean language. Here is the tutorial that I followed, and here is my experiment setting: python==3.9.16 transformers==4.27.4 tokenizers==0.13.3 torch==2.0.0 ...
남영우
3 votes
3 answers
4k views

Is there any way to get a list of the models available on Hugging Face? E.g. for Automatic Speech Recognition (ASR).
Neerav Mathur Jazzy
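For the question above, the huggingface_hub package exposes a list_models helper that can filter by pipeline tag. A minimal sketch (assumes huggingface_hub is installed and network access to the Hub is available):

```python
# Query the Hugging Face Hub for ASR models, assuming the
# huggingface_hub package is installed (pip install huggingface_hub).
from huggingface_hub import list_models

# Filter by the "automatic-speech-recognition" pipeline tag; limit the
# result to a handful of entries rather than the full catalog.
asr_models = list_models(filter="automatic-speech-recognition", limit=5)
for model in asr_models:
    print(model.id)
```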
5 votes
1 answer
8k views

I want to segment a video transcript into chapters based on the content of each line of speech. The transcript would be used to generate a series of start and end timestamps for each chapter. This is ...
nonsequiter
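One common approach to the chapter-segmentation question above is TextTiling-style segmentation: score lexical similarity between adjacent transcript lines and cut a chapter boundary where similarity drops. A minimal bag-of-words sketch (the threshold and sample lines are illustrative, not from the question):

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def chapters(lines: list[str], threshold: float = 0.3) -> list[list[str]]:
    """Start a new chapter wherever adjacent lines are lexically dissimilar."""
    bags = [Counter(line.lower().split()) for line in lines]
    out = [[lines[0]]]
    for prev, cur, line in zip(bags, bags[1:], lines[1:]):
        if cosine(prev, cur) < threshold:
            out.append([line])       # similarity dropped: cut a boundary
        else:
            out[-1].append(line)     # same topic: extend current chapter
    return out

lines = [
    "we chop the onions",
    "then we fry the onions",
    "stocks rose sharply today",
    "investors sold their stocks today",
]
print(chapters(lines))  # two chapters: cooking lines, then finance lines
```

With per-line timestamps available, each chapter's start and end would simply be the timestamps of its first and last lines.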