1 of 8

An update on the Web Speech API

Evan Liu (evliu@google.com) & Paul Adenot (padenot@mozilla.com)

Last updated: 11/11/2025

2 of 8

Background

  • Web Speech API provides speech-to-text and text-to-speech functionality to websites
  • Supported by Chrome, Safari, and several mobile browsers
  • Implemented in Firefox behind a flag; shipping soon
  • Initially proposed in 2011 by the Speech API Community Group
  • Adopted by the Audio Working Group in 2024

3 of 8

TPAC 2024 Proposals

Offline Speech Recognition

Enable offline speech recognition support and allow websites to choose on-device vs. cloud speech recognition.

MediaStreamTrack Support

Enable captioning of other audio sources via MediaStreamTrack support.

Spoken Punctuation Parameter

Allow websites to select how spoken punctuation is treated for speech recognition.

Biasing Support

Allow websites to bias speech recognition toward specific phrases.

Remove SpeechGrammar

Remove the section on "grammar" from the Web Speech API.

4 of 8

Offline Speech Recognition

4.1.1. SpeechRecognition Attributes

processLocally attribute, of type boolean

Controls whether speech recognition happens on-device. When set to true, speech recognition must happen on-device. When set to false, speech recognition may happen on-device or in the cloud. The default value is false.

4.1.2. SpeechRecognition Methods

static Promise<AvailabilityStatus> available(SpeechRecognitionOptions options) method

The available method is used to get the availability of on-device speech recognition matching the given options.

static Promise<boolean> install(SpeechRecognitionOptions options) method

The install method is used to install on-device speech recognition matching the given options.
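A minimal sketch of how a page might combine available() and install() before requesting on-device recognition. The status strings, the langs field, and the names Availability, RecognizerOptions, RecognizerStatics, planFor and ensureOnDevice are assumptions made for illustration; the slide only names the AvailabilityStatus and SpeechRecognitionOptions types. The static methods are passed in as an object so the logic can be exercised outside a browser.

```typescript
// Assumed status values; the slide does not enumerate AvailabilityStatus.
type Availability = "unavailable" | "downloadable" | "downloading" | "available";

// Hypothetical shape of the options dictionary; `langs` is an assumed field.
interface RecognizerOptions {
  langs: string[];
  processLocally: boolean;
}

// The two static methods from the slide, typed structurally so a mock can stand in.
interface RecognizerStatics {
  available(options: RecognizerOptions): Promise<Availability>;
  install(options: RecognizerOptions): Promise<boolean>;
}

type Plan = "use-now" | "install-first" | "fall-back";

// Maps an availability status to the next step the page should take.
function planFor(status: Availability): Plan {
  switch (status) {
    case "available":
      return "use-now";
    case "downloadable":
      return "install-first";
    default:
      // "downloading" or "unavailable": on-device recognition is not usable right now.
      return "fall-back";
  }
}

// Resolves to true when on-device recognition can be used for `options`.
async function ensureOnDevice(
  api: RecognizerStatics,
  options: RecognizerOptions,
): Promise<boolean> {
  const plan = planFor(await api.available(options));
  if (plan === "use-now") return true;
  if (plan === "install-first") return api.install(options);
  return false;
}
```

In a browser that implements the proposal, the SpeechRecognition constructor itself would presumably be passed as api. When the promise resolves to true, the page could set processLocally = true on its recognizer; otherwise it could leave the default of false.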

GitHub Issue #108

5 of 8

MediaStreamTrack Support

4.1. SpeechRecognition Interface

start(MediaStreamTrack audioTrack);

An overloaded start() method allows the Web Speech API to use an audio track instead of the microphone for speech recognition. The kind attribute of the audio track must be "audio" and the readyState attribute must be "live".
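The two preconditions above can be checked before calling start(). This is a sketch only: TrackLike, TrackRecognizer, checkTrack and startFromTrack are hypothetical names, and the types are structural stand-ins for MediaStreamTrack and SpeechRecognition so the check runs without a browser.

```typescript
// Minimal structural view of MediaStreamTrack: only the two attributes the slide mentions.
interface TrackLike {
  kind: string;
  readyState: string;
}

// Hypothetical recognizer shape exposing the overloaded start() from the slide.
interface TrackRecognizer<T extends TrackLike> {
  start(audioTrack: T): void;
}

// Returns a description of the first violated precondition, or null if the track qualifies.
function checkTrack(track: TrackLike): string | null {
  if (track.kind !== "audio") {
    return `expected kind "audio", got "${track.kind}"`;
  }
  if (track.readyState !== "live") {
    return `expected readyState "live", got "${track.readyState}"`;
  }
  return null;
}

// Starts recognition only when the track passes both checks; returns whether it started.
function startFromTrack<T extends TrackLike>(
  recognizer: TrackRecognizer<T>,
  track: T,
): boolean {
  if (checkTrack(track) !== null) return false;
  recognizer.start(track);
  return true;
}
```

In a page, the track might come from getAudioTracks() on a MediaStream, for example one returned by navigator.mediaDevices.getUserMedia({ audio: true }).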

GitHub Issue #66

6 of 8

Biasing Support

4.1.1. SpeechRecognition Attributes

phrases attribute, of type ObservableArray<SpeechRecognitionPhrase>

The phrases attribute stores the collection of SpeechRecognitionPhrase objects which are used to bias the speech recognition results.

// The object representing a phrase for contextual biasing.
[SecureContext, Exposed=Window]
interface SpeechRecognitionPhrase {
  constructor(DOMString phrase, optional float boost = 1.0);
  readonly attribute DOMString phrase;
  readonly attribute float boost;
};
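A sketch of how a page might fill the phrases attribute from a list of terms. PhraseLike, BiasableRecognizer and addBiasPhrases are hypothetical names; a plain array stands in for the ObservableArray and a factory function stands in for the SpeechRecognitionPhrase constructor, so the helper runs outside a browser. Skipping blank terms is a choice made for this example, not something the slide requires.

```typescript
// Structural stand-in for SpeechRecognitionPhrase (the same two readonly attributes).
interface PhraseLike {
  readonly phrase: string;
  readonly boost: number;
}

// Stand-in for a recognizer exposing `phrases`; a plain array models the ObservableArray.
interface BiasableRecognizer {
  phrases: PhraseLike[];
}

// Appends one phrase object per non-empty term. An omitted boost is passed through
// as undefined so the factory (or the real constructor) applies its default of 1.0.
function addBiasPhrases(
  recognizer: BiasableRecognizer,
  makePhrase: (phrase: string, boost?: number) => PhraseLike,
  terms: Array<[string, number?]>,
): void {
  for (const [text, boost] of terms) {
    const trimmed = text.trim();
    if (trimmed.length === 0) continue; // skip blank entries (assumption of this sketch)
    recognizer.phrases.push(makePhrase(trimmed, boost));
  }
}
```

In a browser that implements the proposal, the factory could be (p, b) => new SpeechRecognitionPhrase(p, b). The slide does not state the accepted range for boost, so the values passed in are left to the caller.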

7 of 8

Demo!

8 of 8

2025 Proposals

Model quality hint

Update the SpeechRecognitionOptions struct to include a quality hint (e.g. "standard" or "high").

Faster than realtime recognition

Enable speech recognition to run faster than realtime.