[Q] How to record and transcribe (STT) audio at the same time on Android?
I'm building a feature in an Android app that lets users speak a sentence. The app needs to transcribe the speech in real time and, at the same time, save a recording of the same audio.
However, I'm facing a limitation:
Android does not allow simultaneous access to the microphone (AudioSource.MIC) from both SpeechRecognizer and MediaRecorder, as only one component can own the mic at a time.
I explored alternatives like:
- Vosk (on-device STT): works offline but accuracy is poor for proper nouns (e.g., cities like "Seoul", "Busan")
- Whisper: better accuracy but needs server-side infrastructure
Is there any known workaround or architectural pattern that can:
- Split or duplicate the microphone stream (AudioRecord) for both transcription and saving
- Use a custom SpeechRecognizer-like implementation (maybe passing buffers manually)
- Any working solution or OSS repo?
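To make the first option concrete, here is a rough sketch of the fan-out I have in mind: one reader pulls PCM chunks from a single source and hands each chunk to several consumers (one that saves, one that transcribes). It is plain Java so it runs without a device; on Android I assume the source would be a single AudioRecord instance read in a loop rather than an InputStream. `PcmFanOut`, `PcmSink`, `pump`, and `wavHeader` are names I made up for illustration, and the recognizer sink is only a stand-in for an engine that accepts raw PCM buffers.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

public class PcmFanOut {

    /** Hypothetical consumer of raw PCM chunks (a file writer, an STT engine wrapper, ...). */
    public interface PcmSink {
        void accept(byte[] chunk, int length) throws IOException;
    }

    /**
     * Reads PCM from one source and passes every chunk to every sink.
     * Assumption: on Android the source would be AudioRecord.read(...) on one
     * AudioRecord; an InputStream stands in for it here.
     * Returns the total number of bytes read.
     */
    public static long pump(InputStream source, int chunkSize, List<PcmSink> sinks)
            throws IOException {
        byte[] buffer = new byte[chunkSize];
        long total = 0;
        int read;
        while ((read = source.read(buffer)) > 0) {
            for (PcmSink sink : sinks) {
                // Each sink gets its own copy, so a sink that keeps the array
                // is not affected when the read buffer is overwritten.
                sink.accept(buffer.clone(), read);
            }
            total += read;
        }
        return total;
    }

    /** Builds the 44-byte header of a 16-bit PCM WAV file. */
    public static byte[] wavHeader(int sampleRate, int channels, int dataLength) {
        ByteBuffer h = ByteBuffer.allocate(44).order(ByteOrder.LITTLE_ENDIAN);
        h.put("RIFF".getBytes(StandardCharsets.US_ASCII));
        h.putInt(36 + dataLength);                   // remaining file size
        h.put("WAVE".getBytes(StandardCharsets.US_ASCII));
        h.put("fmt ".getBytes(StandardCharsets.US_ASCII));
        h.putInt(16);                                // fmt chunk size
        h.putShort((short) 1);                       // format 1 = PCM
        h.putShort((short) channels);
        h.putInt(sampleRate);
        h.putInt(sampleRate * channels * 2);         // byte rate
        h.putShort((short) (channels * 2));          // block align
        h.putShort((short) 16);                      // bits per sample
        h.put("data".getBytes(StandardCharsets.US_ASCII));
        h.putInt(dataLength);
        return h.array();
    }

    public static void main(String[] args) throws IOException {
        byte[] fakeMic = new byte[32000];            // 1 s of silence at 16 kHz mono
        ByteArrayOutputStream recording = new ByteArrayOutputStream();
        long[] bytesSeenByRecognizer = {0};

        PcmSink saver = (chunk, len) -> recording.write(chunk, 0, len);
        // Stand-in for feeding the chunk to an STT engine.
        PcmSink recognizer = (chunk, len) -> bytesSeenByRecognizer[0] += len;

        long total = pump(new ByteArrayInputStream(fakeMic), 3200,
                Arrays.asList(saver, recognizer));
        byte[] header = wavHeader(16000, 1, recording.size());
        System.out.println("read=" + total + " saved=" + recording.size()
                + " recognized=" + bytesSeenByRecognizer[0]
                + " headerBytes=" + header.length);
    }
}
```

In a real app the pump loop would run on a background thread, and the saved file would be the header followed by the recorded bytes. What I don't know is how to plug the platform SpeechRecognizer in as one of the sinks, since as far as I can tell it opens the microphone itself.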
Would really appreciate any ideas, examples, or tips 🙏