delayed-streams-modeling

mirror of https://github.com/kyutai-labs/delayed-streams-modeling.git synced 2025-12-22 19:09:57 +00:00

Author	SHA1	Message	Date
Václav Volhejn	13f343f88c	Document TTS configuration better (#142) * Document TTS configuration better * Improve CFG documentation	2025-11-26 19:19:28 +01:00
Bissmella Bahaduri	8fb54b2b07	Fix typo in README.md (#161) Just a small typo fix: "an voice AI system" -> "a voice AI system".	2025-11-20 10:15:45 +01:00
Alexandre Défossez	f6074cb684	fix notebook for safetensors	2025-09-11 17:22:30 +02:00
Vaclav Volhejn	22e0b400e8	Replace "pre-print coming soon"	2025-09-11 13:00:03 +02:00
Alexandre Défossez	f741fc473f	Add citation section to README	2025-09-11 10:43:02 +02:00
Mohaidoss	263ed6ac04	Fix reading file encoding with UTF-8 (#130)	2025-09-01 10:17:51 +02:00
Laurent Mazare	ada17799b8	Bugfix. (#126)	2025-08-26 14:40:43 +02:00
Laurent Mazare	b331930f8e	Refactor frame callback to remove last_time tracking (#125) * Refactor frame callback to remove last_time tracking Removed unnecessary time tracking in the frame callback. * Formatting.	2025-08-26 14:36:27 +02:00
Laurent Mazare	2d301c9da0	Enable ruff in the pre-commit hooks (#124) * Enable ruff in the pre-commit hooks. * Disable the old hooks. * Install uv in the CI.	2025-08-26 13:48:11 +02:00
Laurent Mazare	f4016a8844	Clippy fix. (#122)	2025-08-25 09:33:07 +02:00
Bai Li	7d1e4b703a	Add quantization support and GGUF loading to standalone STT Rust script (#120) * scripts to int8 quantize the thing * target bf16 to uint8, 2x reduction * able to load the model * quantized working * remove unused scripts * conditional init depending on quantized	2025-08-25 09:28:48 +02:00
Laurent Mazare	affc0a052b	Display the generated audio length in the mlx script. (#114)	2025-08-13 07:17:04 +02:00
Laurent Mazare	cf97f8d863	Workaround for the mlx kv-cache bug. (#108)	2025-08-04 16:37:00 +02:00
Laurent Mazare	09468c239a	Print the duration of the audio generated so far. (#107)	2025-08-04 09:24:31 +02:00
Laurent Mazare	07729ed47e	Use the proper repos when vad is on. (#103)	2025-08-01 15:55:49 +02:00
Laurent Mazare	af2283de3f	Use a streaming input in the rust example. (#102) * Use a streaming input in the rust example. * Formatting. * Another formatting tweak.	2025-07-31 17:41:57 +02:00
Laurent Mazare	7dc926d50c	Allow for using local voices in the pytorch examples. (#100)	2025-07-31 12:48:05 +02:00
Laurent Mazare	ab8e8c59b7	Bump the version numbers. (#91)	2025-07-19 15:57:53 +02:00
laurent	5f17114618	More faq.	2025-07-18 08:43:10 +02:00
laurent	405a82ba3f	FAQ tweaks.	2025-07-18 08:37:51 +02:00
Laurent Mazare	3b584b100c	Sketch a FAQ and add some issue templates. (#88)	2025-07-18 08:31:31 +02:00
Laurent Mazare	a98eb94ade	Add a streaming example for the mlx tts. (#85) * Add a streaming example for the mlx tts. * Fix the CI. * Formatting fix. * Yet another CI fix.	2025-07-16 22:35:44 +02:00
Laurent Mazare	a2f031deb5	Fix the pytorch tts streaming example. (#84) * Fix the pytorch tts streaming example. * Edit the readme too.	2025-07-16 21:07:02 +02:00
Laurent Mazare	66a33c989f	Add a TTS streaming example. (#83) * Add a TTS streaming example. * Get the streaming example to work.	2025-07-16 20:54:13 +02:00
laurent	89a2ced839	Bump the moshi version.	2025-07-16 16:02:28 +02:00
Laurent	baf0c75bba	VAD support in the mlx-stt example that uses the microphone.	2025-07-08 16:08:32 +02:00
Laurent Mazare	952319de90	Add a MLX STT example that uses VAD. (#70) * Add a MLX STT example that uses VAD. * VAD support. * More MLX VAD example. * Use the latest moshi-mlx.	2025-07-08 16:04:50 +02:00
laurent	6d3bb6b1f1	Avoid the config override for the extra-heads.	2025-07-08 15:39:17 +02:00
Laurent Mazare	12dbe36b0b	Add some VAD to the pytorch speech-to-text example. (#68)	2025-07-08 11:30:34 +02:00
Václav Volhejn	cafac63222	Run pre-commit correctly in CI (#66) * fix and break * Remove intentional error	2025-07-08 10:11:52 +02:00
Václav Volhejn	7336d7a3da	Fix instructions on how to install the Rust server (#65) * Fix instructions on Rust server installation * Plug Unmute	2025-07-07 18:00:23 +02:00
Laurent Mazare	70500c620e	Add a device argument to the tts pytorch script. (#62)	2025-07-07 08:36:47 +02:00
Chenghao Mou	f8e97aa4f3	fix minor issues with readme commands (#55)	2025-07-07 08:18:05 +02:00
laurent	91a4d120cb	Use moshi 0.2.8.	2025-07-07 08:12:16 +02:00
Laurent	bfc200f6ee	Use bfloat16 rather than half by default.	2025-07-05 23:02:58 +02:00
laurent	f9739881e6	Typo.	2025-07-03 19:02:42 +02:00
Alexandre Défossez	99599fa408	Update README.md	2025-07-03 16:15:01 +02:00
Pierre-Hugues HUSSON	3a4165a84f	Fix stt_from_file_pytorch (#39) 1. argparse declares in_file, but code reads file 2. text_tokens.numpy().tolist() is a list of list of list of int instead of the supported list of list of int. this is a debugging print just drop it Co-authored-by: Pierre-Hugues Husson <phhusson@freebox.fr>	2025-07-03 15:26:34 +02:00
Alexandre Défossez	e9bac066ea	Update README.md	2025-07-03 15:09:41 +02:00
Alexandre Défossez	eae5e17975	Some updates to the colab and script (#38) * changing streaming to be robust to repeated generation * some changes * plop * plop * plop * plop	2025-07-03 15:06:37 +02:00
Václav Volhejn	c1d248abba	Fix text tokenizer path (#36)	2025-07-03 14:27:06 +02:00
Václav Volhejn	c6f262346f	Don't install moshi from Git (#37) * Don't install moshi from Git * Remove commented-out invalid message send in websocket_client	2025-07-03 13:37:38 +02:00
laurent	3573ee90af	Oops.	2025-07-03 13:08:00 +02:00
laurent	25574aa104	Fixes for the notebook.	2025-07-03 13:05:00 +02:00
laurent	1cd9529f65	Json fix.	2025-07-03 12:57:22 +02:00
laurent	0ee2354176	Chunk decoding in the pth notebook.	2025-07-03 12:56:00 +02:00
laurent	dc8bffabe0	Remove the dataset bit.	2025-07-03 12:48:04 +02:00
Laurent Mazare	5f8e924176	Streaming output for the pytorch tts example. (#33) * Streaming output for the pytorch tts example. * Run the pre-commit hooks.	2025-07-03 11:05:06 +02:00
Laurent Mazare	d3bed09f9a	Pin the moshi_mlx version. (#35)	2025-07-03 09:53:53 +02:00
Václav Volhejn	ef52b8ef0f	Add Rust server usage example (#32) * Run Ruff on tts_mlx.py * Add tts_rust_server.py example * Remove unused HF repo arguments and reset audio output data in TTS server script	2025-07-03 09:47:50 +02:00

104 Commits