Google Streaming STT does not return is_final = True

Question

I'm building a phone call application using Twilio Media Streams.

The workflow is as follows:

Twilio Media Stream → Google STT (Streaming) → LLM → TTS

I'm using the sample code from the following GitHub repository: https://github.com/twilio/media-streams/tree/master/python/realtime-transcriptions

I've modified the on_transcription_response function as shown below:

def on_transcription_response(response):
    if not response.results:
        return

    result = response.results[0]
    if not result.alternatives:
        return

    transcription = result.alternatives[0].transcript
    print("Transcription: " + transcription + " is_final: " + str(result.is_final))

The issue is that result.is_final never returns True, which prevents me from sending the transcription to the LLM.

I tried adding an is_silence function to pause when silence is detected, but is_final still always returns False.

import audioop

def is_silence(buffer, threshold=500):
    pcm = audioop.ulaw2lin(buffer, 2)  # Convert to 16-bit PCM
    rms = audioop.rms(pcm, 2)          # Calculate root mean square amplitude
    return rms < threshold

def add_request(self, buffer):
    if is_silence(buffer):
        print("Skipping silence based on amplitude")
        return
    self._queue.put(bytes(buffer), block=False)

Additionally, I need to continuously recognize speech with language_code="yue-Hant-HK", as the caller may speak at any time during the call. I’m not looking to stop recognition after a single utterance—the STT should stay active and detect complete sentences dynamically.

Any suggestions on how to handle this with Google STT streaming while keeping is_final working properly?

cheers

shiro · Accepted Answer · 2025-07-30 12:44:49Z

Try setting single_utterance=True in the streaming config, or manually half-close the stream when the API sends an END_OF_SINGLE_UTTERANCE response.
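A minimal sketch of the half-close idea, not a definitive implementation. It assumes the request generator reads audio from a queue, as the Twilio sample does, and that a None sentinel ends that generator. The names audio_until_closed, handle_responses and on_final are hypothetical, and the END_OF_SINGLE_UTTERANCE constant is a stand-in for the event type enum exposed by the real client library. It also checks every result in a response rather than only results[0].

```python
import queue

# Stand-in for the API's END_OF_SINGLE_UTTERANCE event type (assumption:
# in real code, compare against the enum from the client library instead).
END_OF_SINGLE_UTTERANCE = "END_OF_SINGLE_UTTERANCE"


def audio_until_closed(audio_queue):
    """Yield audio chunks until a None sentinel half-closes the request stream."""
    while True:
        chunk = audio_queue.get()
        if chunk is None:
            return  # ending the generator ends the outgoing request stream
        yield chunk


def handle_responses(responses, audio_queue, on_final):
    """Consume one streaming session and pass final transcripts to on_final."""
    for response in responses:
        if response.speech_event_type == END_OF_SINGLE_UTTERANCE:
            # The server expects no more audio: stop sending and keep
            # reading, since the final result may arrive after this event.
            audio_queue.put(None)
            continue
        for result in response.results:
            if result.is_final and result.alternatives:
                on_final(result.alternatives[0].transcript)
```

For continuous recognition during a call, one option is to call handle_responses in a loop and open a new streaming session with a fresh queue each time it returns.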

If this does not work, the issue needs further investigation. Please open a new issue on the issue tracker describing your problem.

2 Comments

J7er Aug 1 at 2:28

Thank you, @shiro. I’m currently working on setting single_utterance=true, checking for the end_of_file utterance, and then closing and rebuilding the stream to flush the recognized sentence. However, is_final still never becomes True for me.

shiro Aug 1 at 8:49

In that case, you need to open an issue on the issue tracker.
