Microsoft VibeVoice speech-to-text model

2 items · 2 sources · updated 2h ago · trend 2

Microsoft has released VibeVoice, an open-source speech-to-text model under the MIT license that combines speech recognition with speaker diarization. The model runs on consumer hardware: a 4-bit quantized version takes up 5.71 GB and transcribes one hour of audio in approximately 9 minutes on an M5 MacBook.
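
To put the reported timing in perspective, the short sketch below converts "one hour of audio in approximately 9 minutes" into a real-time speed factor. It uses only the two figures quoted above; the function name `realtime_speedup` is a hypothetical helper for illustration and is not part of VibeVoice or any of its tooling.

```python
def realtime_speedup(audio_minutes: float, processing_minutes: float) -> float:
    """Return how many minutes of audio are transcribed per minute of processing."""
    if processing_minutes <= 0:
        raise ValueError("processing_minutes must be positive")
    return audio_minutes / processing_minutes


# Figures quoted in the summary: 60 minutes of audio, roughly 9 minutes to transcribe.
speedup = realtime_speedup(audio_minutes=60, processing_minutes=9)
print(f"about {speedup:.1f}x faster than real time")
```

Because the 9-minute figure is itself approximate and was measured on one specific machine, the result is a rough indication rather than a benchmark.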