Kyutai's Speech-To-Text and Text-To-Speech models based on the Delayed Streams Modeling framework.
Updated 2026-01-26 09:28:20 +00:00