Commit 27f485a

vad : Silero VAD v6.2.0 (#3524)
* Add ggml-silero-v6.2.0 to download candidates
* Make default VAD model ggml-silero-v6.2.0
* Make VAD model in documentations ggml-silero-v6.2.0
1 parent d9b7613 commit 27f485a

12 files changed, +38 -37 lines changed

README.md

Lines changed: 11 additions & 11 deletions
@@ -755,23 +755,23 @@ written in Python that is fast and accurate.

  Models can be downloaded by running the following command on Linux or MacOS:
  ```console
- $ ./models/download-vad-model.sh silero-v5.1.2
- Downloading ggml model silero-v5.1.2 from 'https://huggingface.co/ggml-org/whisper-vad' ...
- ggml-silero-v5.1.2.bin 100%[==============================================>] 864.35K --.-KB/s in 0.04s
- Done! Model 'silero-v5.1.2' saved in '/path/models/ggml-silero-v5.1.2.bin'
+ $ ./models/download-vad-model.sh silero-v6.2.0
+ Downloading ggml model silero-v6.2.0 from 'https://huggingface.co/ggml-org/whisper-vad' ...
+ ggml-silero-v6.2.0.bin 100%[==============================================>] 864.35K --.-KB/s in 0.04s
+ Done! Model 'silero-v6.2.0' saved in '/path/models/ggml-silero-v6.2.0.bin'
  You can now use it like this:

- $ ./build/bin/whisper-cli -vm /path/models/ggml-silero-v5.1.2.bin --vad -f samples/jfk.wav -m models/ggml-base.en.bin
+ $ ./build/bin/whisper-cli -vm /path/models/ggml-silero-v6.2.0.bin --vad -f samples/jfk.wav -m models/ggml-base.en.bin

  ```
  And the following command on Windows:
  ```console
- > .\models\download-vad-model.cmd silero-v5.1.2
- Downloading vad model silero-v5.1.2...
- Done! Model silero-v5.1.2 saved in C:\Users\danie\work\ai\whisper.cpp\ggml-silero-v5.1.2.bin
+ > .\models\download-vad-model.cmd silero-v6.2.0
+ Downloading vad model silero-v6.2.0...
+ Done! Model silero-v6.2.0 saved in C:\Users\danie\work\ai\whisper.cpp\ggml-silero-v6.2.0.bin
  You can now use it like this:

- C:\path\build\bin\Release\whisper-cli.exe -vm C:\path\ggml-silero-v5.1.2.bin --vad -m models/ggml-base.en.bin -f samples\jfk.wav
+ C:\path\build\bin\Release\whisper-cli.exe -vm C:\path\ggml-silero-v6.2.0.bin --vad -m models/ggml-base.en.bin -f samples\jfk.wav

  ```

@@ -783,15 +783,15 @@ This model can be also be converted manually to ggml using the following command
  $ python3 -m venv venv && source venv/bin/activate
  $ (venv) pip install silero-vad
  $ (venv) $ python models/convert-silero-vad-to-ggml.py --output models/silero.bin
- Saving GGML Silero-VAD model to models/silero-v5.1.2-ggml.bin
+ Saving GGML Silero-VAD model to models/silero-v6.2.0-ggml.bin
  ```
  And it can then be used with whisper as follows:
  ```console
  $ ./build/bin/whisper-cli \
  --file ./samples/jfk.wav \
  --model ./models/ggml-base.en.bin \
  --vad \
- --vad-model ./models/silero-v5.1.2-ggml.bin
+ --vad-model ./models/silero-v6.2.0-ggml.bin
  ```

  ### VAD Options

bindings/ruby/README.md

Lines changed: 5 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -134,20 +134,20 @@ Support for Voice Activity Detection (VAD) can be enabled by setting `Whisper::P
134134
```ruby
135135
Whisper::Params.new(
136136
vad: true,
137-
vad_model_path: "silero-v5.1.2",
137+
vad_model_path: "silero-v6.2.0",
138138
# other arguments...
139139
)
140140
```
141141

142-
When you pass the model name (`"silero-v5.1.2"`) or URI (`https://huggingface.co/ggml-org/whisper-vad/resolve/main/ggml-silero-v5.1.2.bin`), it will be downloaded automatically.
143-
Currently, "silero-v5.1.2" is registered as pre-converted model like ASR models. You also specify file path or URI of model.
142+
When you pass the model name (`"silero-v6.2.0"`) or URI (`https://huggingface.co/ggml-org/whisper-vad/resolve/main/ggml-silero-v6.2.0.bin`), it will be downloaded automatically.
143+
Currently, "silero-v6.2.0" is registered as pre-converted model like ASR models. You also specify file path or URI of model.
144144

145145
If you need configure VAD behavior, pass params for that:
146146

147147
```ruby
148148
Whisper::Params.new(
149149
vad: true,
150-
vad_model_path: "silero-v5.1.2",
150+
vad_model_path: "silero-v6.2.0",
151151
vad_params: Whisper::VAD::Params.new(
152152
threshold: 1.0, # defaults to 0.5
153153
min_speech_duration_ms: 500, # defaults to 250
@@ -330,7 +330,7 @@ Using VAD separately from ASR
330330
VAD feature itself is useful. You can use it separately from ASR:
331331
332332
```ruby
333-
vad = Whisper::VAD::Context.new("silero-v5.1.2")
333+
vad = Whisper::VAD::Context.new("silero-v6.2.0")
334334
vad
335335
.detect("path/to/audio.wav", Whisper::VAD::Params.new)
336336
.each_with_index do |segment, index|

bindings/ruby/lib/whisper/model/uri.rb

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -206,6 +206,7 @@ def escaping(path)
206206

207207
%w[
208208
silero-v5.1.2
209+
silero-v6.2.0
209210
].each do |name|
210211
@pre_converted_models[name] = URI.new("https://huggingface.co/ggml-org/whisper-vad/resolve/main/ggml-#{name}.bin")
211212
end
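
The hunk above adds the new name to a list whose entries are interpolated into a Hugging Face URL. As a rough, standalone illustration of that pattern, the lookup could be sketched as follows. The constant names are hypothetical, and the standard library's `URI` stands in for the `URI` class that `uri.rb` itself uses, so this is not the gem's actual implementation.

```ruby
require "uri"

# Hypothetical names for illustration only; the gem stores this registry in
# @pre_converted_models and builds each entry with its own URI class.
VAD_MODEL_NAMES = %w[silero-v5.1.2 silero-v6.2.0].freeze

# Map each model name to the URL pattern shown in the diff.
VAD_MODEL_URLS = VAD_MODEL_NAMES.to_h do |name|
  [name, URI("https://huggingface.co/ggml-org/whisper-vad/resolve/main/ggml-#{name}.bin")]
end

puts VAD_MODEL_URLS.fetch("silero-v6.2.0")
```

Because the old entry stays in the list, code that still passes `"silero-v5.1.2"` would continue to resolve under this scheme.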

bindings/ruby/test/test_params.rb

Lines changed: 8 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -218,12 +218,12 @@ def test_vad
218218

219219
def test_vad_model_path
220220
assert_nil @params.vad_model_path
221-
@params.vad_model_path = "silero-v5.1.2"
222-
assert_equal Whisper::Model.pre_converted_models["silero-v5.1.2"].to_path, @params.vad_model_path
221+
@params.vad_model_path = "silero-v6.2.0"
222+
assert_equal Whisper::Model.pre_converted_models["silero-v6.2.0"].to_path, @params.vad_model_path
223223
end
224224

225225
def test_vad_model_path_with_nil
226-
@params.vad_model_path = "silero-v5.1.2"
226+
@params.vad_model_path = "silero-v6.2.0"
227227
@params.vad_model_path = nil
228228
assert_nil @params.vad_model_path
229229
end
@@ -235,13 +235,13 @@ def test_vad_model_path_with_invalid
235235
end
236236

237237
def test_vad_model_path_with_URI_string
238-
@params.vad_model_path = "https://huggingface.co/ggml-org/whisper-vad/resolve/main/ggml-silero-v5.1.2.bin"
239-
assert_equal @params.vad_model_path, Whisper::Model.pre_converted_models["silero-v5.1.2"].to_path
238+
@params.vad_model_path = "https://huggingface.co/ggml-org/whisper-vad/resolve/main/ggml-silero-v6.2.0.bin"
239+
assert_equal @params.vad_model_path, Whisper::Model.pre_converted_models["silero-v6.2.0"].to_path
240240
end
241241

242242
def test_vad_model_path_with_URI
243-
@params.vad_model_path = URI("https://huggingface.co/ggml-org/whisper-vad/resolve/main/ggml-silero-v5.1.2.bin")
244-
assert_equal @params.vad_model_path, Whisper::Model.pre_converted_models["silero-v5.1.2"].to_path
243+
@params.vad_model_path = URI("https://huggingface.co/ggml-org/whisper-vad/resolve/main/ggml-silero-v6.2.0.bin")
244+
assert_equal @params.vad_model_path, Whisper::Model.pre_converted_models["silero-v6.2.0"].to_path
245245
end
246246

247247
def test_vad_params
@@ -289,7 +289,7 @@ def test_new_with_kw_args_default_values(param)
289289
in [/_user_data\Z/, *]
290290
Object.new
291291
in [:vad_model_path, *]
292-
Whisper::Model.pre_converted_models["silero-v5.1.2"].to_path
292+
Whisper::Model.pre_converted_models["silero-v6.2.0"].to_path
293293
in [:vad_params, *]
294294
Whisper::VAD::Params.new
295295
end

bindings/ruby/test/test_vad.rb

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -6,7 +6,7 @@ def setup
66
vad_params = Whisper::VAD::Params.new
77
@params = Whisper::Params.new(
88
vad: true,
9-
vad_model_path: "silero-v5.1.2",
9+
vad_model_path: "silero-v6.2.0",
1010
vad_params:
1111
)
1212
end

bindings/ruby/test/test_vad_context.rb

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -2,12 +2,12 @@
22

33
class TestVADContext < TestBase
44
def test_initialize
5-
context = Whisper::VAD::Context.new("silero-v5.1.2")
5+
context = Whisper::VAD::Context.new("silero-v6.2.0")
66
assert_instance_of Whisper::VAD::Context, context
77
end
88

99
def test_detect
10-
context = Whisper::VAD::Context.new("silero-v5.1.2")
10+
context = Whisper::VAD::Context.new("silero-v6.2.0")
1111
segments = context.detect(AUDIO, Whisper::VAD::Params.new)
1212
assert_instance_of Whisper::VAD::Segments, segments
1313

@@ -32,7 +32,7 @@ def test_detect
3232
assert_equal segment.start_time, start_time
3333
assert_equal segment.end_time, end_time
3434

35-
assert_equal 5, segments.length
35+
assert_equal 4, segments.length
3636
end
3737

3838
def test_invalid_model_type

examples/addon.node/README.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -54,7 +54,7 @@ Before using VAD, download a VAD model:
5454

5555
```shell
5656
# From the whisper.cpp root directory
57-
./models/download-vad-model.sh silero-v5.1.2
57+
./models/download-vad-model.sh silero-v6.2.0
5858
```
5959

6060
### VAD Parameters
@@ -85,7 +85,7 @@ const vadParams = {
8585
model: path.join(__dirname, "../../models/ggml-base.en.bin"),
8686
fname_inp: path.join(__dirname, "../../samples/jfk.wav"),
8787
vad: true,
88-
vad_model: path.join(__dirname, "../../models/ggml-silero-v5.1.2.bin"),
88+
vad_model: path.join(__dirname, "../../models/ggml-silero-v6.2.0.bin"),
8989
vad_threshold: 0.5,
9090
progress_callback: (progress) => console.log(`Progress: ${progress}%`)
9191
};

examples/addon.node/vad-example.js

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -23,7 +23,7 @@ const vadParams = {
2323
max_len: 0,
2424
// VAD parameters
2525
vad: true,
26-
vad_model: path.join(__dirname, "../../models/ggml-silero-v5.1.2.bin"), // You need to download this model
26+
vad_model: path.join(__dirname, "../../models/ggml-silero-v6.2.0.bin"), // You need to download this model
2727
vad_threshold: 0.5,
2828
vad_min_speech_duration_ms: 250,
2929
vad_min_silence_duration_ms: 100,
@@ -63,7 +63,7 @@ async function runVADExample() {
6363
const fs = require('fs');
6464
if (!fs.existsSync(vadParams.vad_model)) {
6565
console.log("⚠️ VAD model not found. Please download the VAD model first:");
66-
console.log(" ./models/download-vad-model.sh silero-v5.1.2");
66+
console.log(" ./models/download-vad-model.sh silero-v6.2.0");
6767
console.log(" Or run: python models/convert-silero-vad-to-ggml.py");
6868
console.log("\n Falling back to traditional transcription without VAD...\n");
6969

examples/vad-speech-segments/README.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -15,7 +15,7 @@ The examples can be run using the following command, which uses a model
1515
that we use internally for testing:
1616
```console
1717
./build/bin/vad-speech-segments \
18-
-vad-model models/for-tests-silero-v5.1.2-ggml.bin \
18+
-vad-model models/for-tests-silero-v6.2.0-ggml.bin \
1919
--file samples/jfk.wav \
2020
--no-prints
2121

models/download-vad-model.cmd

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -25,7 +25,7 @@ rem Count number of arguments passed to script
2525
set argc=0
2626
for %%x in (%*) do set /A argc+=1
2727

28-
set models=silero-v5.1.2
28+
set models=silero-v5.1.2 silero-v6.2.0
2929

3030
rem If argc is not equal to 1 or 2, print usage information and exit
3131
if %argc% NEQ 1 (
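
After this change the `models` variable lists both versions, so either name is a download candidate on Windows. The hunk does not show how the script uses that variable. The Ruby sketch below only illustrates the idea, assuming the value is treated as a space-separated allow-list. `valid_vad_model?` is a hypothetical helper, not part of the script.

```ruby
# Mirrors the value of `set models=...` after this commit.
MODELS = "silero-v5.1.2 silero-v6.2.0"

# Hypothetical helper: true when the requested name is one of the candidates.
def valid_vad_model?(name, candidates = MODELS)
  candidates.split.include?(name)
end

puts valid_vad_model?("silero-v6.2.0")
puts valid_vad_model?("silero-v4.0")
```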
