Parakeet by NVIDIA is a modern automatic speech recognition (ASR) model designed for accurate and efficient conversion of English speech to text. Unlike Whisper, it supports only English, but delivers higher-quality results for English speech. It also generates fairly accurate timestamps. Quality metric: a word error rate (WER) of 6.03 on the Hugging Face Open ASR Leaderboard.
Model page: https://huggingface.co/nvidia/parakeet-tdt-0.6b-v2
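
To make the WER figure above easier to interpret, here is a minimal sketch of how word error rate is commonly computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and the model output, divided by the number of reference words. The function name `word_error_rate` is a hypothetical helper for illustration, not part of Parakeet or any leaderboard tooling, and real evaluations typically normalize text (casing, punctuation) before scoring, which this sketch does not do.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return word-level edit distance divided by the reference word count.

    Hypothetical illustration only: no text normalization is applied.
    """
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]  # deleting all i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,  # deletion (reference word missing in output)
                    curr[j - 1] + 1,  # insertion (extra word in output)
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr

    return prev[-1] / len(ref_words)


if __name__ == "__main__":
    # One inserted word against a two-word reference.
    print(word_error_rate("hello world", "hello there world"))  # 0.5
```

Multiplying the result by 100 gives the percentage form in which WER is usually reported, so a score near 6 corresponds to roughly six word errors per hundred reference words.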