[Q] How to record and transcribe (STT) audio at the same time on Android?

I'm building a feature in an Android app that lets users speak a sentence. The app needs to recognize the speech in real time and, at the same time, save a recording of the same input.

However, I've hit a limitation:
Android does not let SpeechRecognizer and MediaRecorder capture from the microphone (AudioSource.MIC) simultaneously; only one component can own the mic at a time.

I explored alternatives like:

  • Vosk (on-device STT): works offline but accuracy is poor for proper nouns (e.g., cities like "Seoul", "Busan")
  • Whisper: better accuracy but needs server-side infrastructure

Is there a known workaround or architectural pattern for any of the following?

  • Splitting or duplicating the microphone stream (AudioRecord) so it feeds both transcription and saving
  • Using a custom SpeechRecognizer-like implementation (maybe passing buffers manually)
  • An existing working solution or OSS repo
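
To make the first option concrete, here is a rough sketch of the split I have in mind, written in plain Java so it runs off-device. The names `PcmTee`, `pump`, and `wavHeader` are hypothetical helpers of mine, not Android APIs. I'm assuming 16-bit PCM, and that on a device the `InputStream` would be replaced by a loop around `AudioRecord.read(...)`, with one sink writing to a file and the other feeding a recognizer that accepts raw PCM buffers.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical helper: a single PCM reader that fans each chunk out to several sinks. */
final class PcmTee {
    private PcmTee() {}

    /**
     * Reads chunks from `source` until end of stream and hands every sink
     * its own copy of each chunk. Returns the total number of bytes read.
     * Assumption: on Android, `source.read(...)` stands in for AudioRecord.read(...).
     */
    static long pump(InputStream source, int chunkBytes, List<Consumer<byte[]>> sinks) {
        byte[] buf = new byte[chunkBytes];
        long total = 0;
        try {
            int n;
            while ((n = source.read(buf, 0, buf.length)) > 0) {
                for (Consumer<byte[]> sink : sinks) {
                    // Copy so that a slow sink never sees the buffer being overwritten.
                    sink.accept(Arrays.copyOf(buf, n));
                }
                total += n;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    /** Builds the 44-byte canonical WAV header for 16-bit PCM data of the given length. */
    static byte[] wavHeader(int sampleRate, int channels, int dataBytes) {
        int blockAlign = channels * 2;            // bytes per sample frame
        ByteBuffer b = ByteBuffer.allocate(44).order(ByteOrder.LITTLE_ENDIAN);
        b.put("RIFF".getBytes(StandardCharsets.US_ASCII));
        b.putInt(36 + dataBytes);                 // size of everything after this field
        b.put("WAVE".getBytes(StandardCharsets.US_ASCII));
        b.put("fmt ".getBytes(StandardCharsets.US_ASCII));
        b.putInt(16);                             // size of the fmt chunk
        b.putShort((short) 1);                    // audio format 1 = uncompressed PCM
        b.putShort((short) channels);
        b.putInt(sampleRate);
        b.putInt(sampleRate * blockAlign);        // byte rate
        b.putShort((short) blockAlign);
        b.putShort((short) 16);                   // bits per sample
        b.put("data".getBytes(StandardCharsets.US_ASCII));
        b.putInt(dataBytes);
        return b.array();
    }
}
```

The idea would be to run `pump` on a background thread, collect the raw bytes in one sink, and prepend `wavHeader(...)` once the final length is known, while the other sink forwards the same chunks to the recognizer. Whether a given STT engine accepts buffers this way is exactly what I'm unsure about.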

Would really appreciate any ideas, examples, or tips 🙏

1
  • Can't you just do the STT and save the speech at the end of the session? Commented May 7 at 9:38
