Compare commits

...

418 Commits

Author SHA1 Message Date
tuna2134
dc5fa88432 write license 2025-08-21 12:15:22 +09:00
tuna2134
a5f45cd2ef fix: license 2025-08-21 12:14:44 +09:00
tuna2134
84e9118d99 Merge pull request #237 from neodyland/dependabot/cargo/ureq-3.1.0
build(deps): bump ureq from 3.0.12 to 3.1.0
2025-08-18 22:58:45 +09:00
tuna2134
3050cc1e99 Merge pull request #238 from neodyland/dependabot/cargo/thiserror-2.0.15
build(deps): bump thiserror from 2.0.12 to 2.0.15
2025-08-18 22:58:16 +09:00
tuna2134
d5fcacd799 Merge pull request #239 from neodyland/dependabot/cargo/ort-d269461
build(deps): bump ort from `f4ab181` to `d269461`
2025-08-18 22:58:01 +09:00
tuna2134
25ca89e341 Merge pull request #240 from neodyland/dependabot/cargo/anyhow-1.0.99
build(deps): bump anyhow from 1.0.98 to 1.0.99
2025-08-18 22:57:45 +09:00
dependabot[bot]
0c2a397775 build(deps): bump anyhow from 1.0.98 to 1.0.99
Bumps [anyhow](https://github.com/dtolnay/anyhow) from 1.0.98 to 1.0.99.
- [Release notes](https://github.com/dtolnay/anyhow/releases)
- [Commits](https://github.com/dtolnay/anyhow/compare/1.0.98...1.0.99)

---
updated-dependencies:
- dependency-name: anyhow
  dependency-version: 1.0.99
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-08-18 11:08:26 +00:00
dependabot[bot]
470a0348fe build(deps): bump ort from f4ab181 to d269461
Bumps [ort](https://github.com/pykeio/ort) from `f4ab181` to `d269461`.
- [Release notes](https://github.com/pykeio/ort/releases)
- [Commits](f4ab181702...d269461e21)

---
updated-dependencies:
- dependency-name: ort
  dependency-version: d269461e2130b407589feff404025df25faeb3bb
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-08-18 11:06:31 +00:00
dependabot[bot]
9a99b88b00 build(deps): bump thiserror from 2.0.12 to 2.0.15
Bumps [thiserror](https://github.com/dtolnay/thiserror) from 2.0.12 to 2.0.15.
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](https://github.com/dtolnay/thiserror/compare/2.0.12...2.0.15)

---
updated-dependencies:
- dependency-name: thiserror
  dependency-version: 2.0.15
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-08-18 10:49:47 +00:00
dependabot[bot]
29f39f0795 build(deps): bump ureq from 3.0.12 to 3.1.0
Bumps [ureq](https://github.com/algesten/ureq) from 3.0.12 to 3.1.0.
- [Changelog](https://github.com/algesten/ureq/blob/main/CHANGELOG.md)
- [Commits](https://github.com/algesten/ureq/compare/3.0.12...3.1.0)

---
updated-dependencies:
- dependency-name: ureq
  dependency-version: 3.1.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-08-18 10:49:32 +00:00
tuna2134
9f22694df0 Merge pull request #236 from neodyland/dependabot/cargo/ort-f4ab181 2025-08-12 07:48:08 +09:00
tuna2134
62ba2c802f Merge pull request #235 from kono-dada/fix/inplace-model-load 2025-08-11 23:46:42 +09:00
dependabot[bot]
4f5b936f6f build(deps): bump ort from 5f96a2d to f4ab181
Bumps [ort](https://github.com/pykeio/ort) from `5f96a2d` to `f4ab181`.
- [Release notes](https://github.com/pykeio/ort/releases)
- [Commits](5f96a2d585...f4ab181702)

---
updated-dependencies:
- dependency-name: ort
  dependency-version: f4ab181702495bff99a488322d3a8de0d7050349
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-08-11 12:22:23 +00:00
kono-dada
3c8efc716c Fix: Load model in-place and safely evict sessions without removing entries
- Avoid removing and re-inserting model entries during load
- Preserve metadata (bytes, style_vectors) when evicting
- Ensure eviction targets a different loaded model, not always the first
- Reduce unnecessary memory allocations and keep list order stable
2025-08-11 16:31:57 +08:00
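The commit body above is specific enough to sketch. Below is a hypothetical illustration of the described eviction scheme; the `Holder`/`ModelEntry` names and fields are invented for illustration and are not the actual sbv2-api API:

```rust
// Hypothetical sketch of the eviction idea described above; all names
// are illustrative, not the real sbv2-api types.
struct Session; // stand-in for an ONNX Runtime inference session

struct ModelEntry {
    name: String,
    bytes: Vec<u8>,           // kept even while evicted
    style_vectors: Vec<f32>,  // kept even while evicted
    session: Option<Session>, // dropped on eviction, rebuilt in place
}

struct Holder {
    entries: Vec<ModelEntry>,
    max_loaded: usize,
}

impl Holder {
    /// Load `name` in place, evicting some *other* loaded model when the
    /// budget is exceeded. No entry is ever removed from `entries`, so
    /// list order and per-model metadata stay stable.
    fn load(&mut self, name: &str) {
        let loaded = self.entries.iter().filter(|e| e.session.is_some()).count();
        if loaded >= self.max_loaded {
            // Pick a victim that is loaded and is not the requested model,
            // instead of always evicting the first entry.
            if let Some(victim) = self
                .entries
                .iter_mut()
                .find(|e| e.session.is_some() && e.name != name)
            {
                victim.session = None; // bytes and style_vectors survive
            }
        }
        if let Some(entry) = self.entries.iter_mut().find(|e| e.name == name) {
            entry.session.get_or_insert(Session); // build the session in place
        }
    }
}
```

The design point: eviction drops only the expensive session while the retained bytes and style vectors stay behind, so a later load can rebuild the session in place without reallocating or reordering the list.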
tuna2134
e9ced32b70 fix: streamline tone value handling in JTalkProcess 2025-08-11 17:30:46 +09:00
tuna2134
e7a1575cbc Merge pull request #233 from kono-dada/feature/stereo-output
feat: add stereo synthesis option via SBV2_FORCE_STEREO env var
2025-08-11 17:13:19 +09:00
kono-dada
873bbb77b6 feat: add stereo synthesis option via SBV2_FORCE_STEREO env var
Previously, synthesis output was fixed to mono (channels=1).
Now, setting the environment variable SBV2_FORCE_STEREO=1 forces stereo (2-channel) output.

This allows generating stereo audio without changing the code, useful for users needing dual-channel output.
2025-08-11 11:38:32 +08:00
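As a rough illustration of the behavior this commit describes (only the SBV2_FORCE_STEREO variable name comes from the commit; the function and types are invented, and the real code may differ), forcing stereo from a mono synthesis result amounts to duplicating each sample into an interleaved left/right pair:

```rust
// Illustrative sketch; only SBV2_FORCE_STEREO itself comes from the commit above.
fn apply_channel_option(mono: Vec<f32>) -> (u16, Vec<f32>) {
    // SBV2_FORCE_STEREO=1 requests 2-channel output; anything else keeps mono.
    let force_stereo = std::env::var("SBV2_FORCE_STEREO")
        .map(|v| v == "1")
        .unwrap_or(false);
    if force_stereo {
        // Each mono sample s becomes an interleaved [s, s] left/right pair.
        let stereo: Vec<f32> = mono.iter().flat_map(|&s| [s, s]).collect();
        (2, stereo)
    } else {
        (1, mono)
    }
}
```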
tuna2134
1725863fca Merge pull request #228 from neodyland/dependabot/cargo/serde_json-1.0.142
build(deps): bump serde_json from 1.0.141 to 1.0.142
2025-08-04 22:12:36 +09:00
tuna2134
55f05580e4 Merge pull request #229 from neodyland/dependabot/cargo/tokenizers-0.21.4
build(deps): bump tokenizers from 0.21.2 to 0.21.4
2025-08-04 22:12:24 +09:00
tuna2134
320664eae2 Merge pull request #231 from neodyland/dependabot/cargo/tokio-1.47.1
build(deps): bump tokio from 1.47.0 to 1.47.1
2025-08-04 22:12:07 +09:00
tuna2134
87903827fa Merge pull request #230 from neodyland/dependabot/cargo/ort-5f96a2d
build(deps): bump ort from `d28c835` to `5f96a2d`
2025-08-04 22:11:55 +09:00
dependabot[bot]
9b8e9dc39d build(deps): bump tokio from 1.47.0 to 1.47.1
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.47.0 to 1.47.1.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.47.0...tokio-1.47.1)

---
updated-dependencies:
- dependency-name: tokio
  dependency-version: 1.47.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-08-04 11:10:28 +00:00
dependabot[bot]
bbc38081b6 build(deps): bump ort from d28c835 to 5f96a2d
Bumps [ort](https://github.com/pykeio/ort) from `d28c835` to `5f96a2d`.
- [Release notes](https://github.com/pykeio/ort/releases)
- [Commits](d28c835c3c...5f96a2d585)

---
updated-dependencies:
- dependency-name: ort
  dependency-version: 5f96a2d5857c3fe9f06282dbf4bdcddbca6c5fe6
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-08-04 10:24:07 +00:00
dependabot[bot]
0b822f704a build(deps): bump tokenizers from 0.21.2 to 0.21.4
Bumps [tokenizers](https://github.com/huggingface/tokenizers) from 0.21.2 to 0.21.4.
- [Release notes](https://github.com/huggingface/tokenizers/releases)
- [Changelog](https://github.com/huggingface/tokenizers/blob/main/RELEASE.md)
- [Commits](https://github.com/huggingface/tokenizers/compare/v0.21.2...v0.21.4)

---
updated-dependencies:
- dependency-name: tokenizers
  dependency-version: 0.21.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-08-04 10:18:59 +00:00
dependabot[bot]
132eb6386d build(deps): bump serde_json from 1.0.141 to 1.0.142
Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.141 to 1.0.142.
- [Release notes](https://github.com/serde-rs/json/releases)
- [Commits](https://github.com/serde-rs/json/compare/v1.0.141...v1.0.142)

---
updated-dependencies:
- dependency-name: serde_json
  dependency-version: 1.0.142
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-08-04 10:18:20 +00:00
tuna2134
ee56e9591d Merge pull request #227 from neodyland/dependabot/cargo/tokio-1.47.0
build(deps): bump tokio from 1.46.1 to 1.47.0
2025-07-28 21:02:19 +09:00
dependabot[bot]
3194e599b2 build(deps): bump tokio from 1.46.1 to 1.47.0
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.46.1 to 1.47.0.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.46.1...tokio-1.47.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-version: 1.47.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-07-28 10:53:38 +00:00
tuna2134
00f4787f6e Merge pull request #225 from neodyland/dependabot/cargo/serde_json-1.0.141
build(deps): bump serde_json from 1.0.140 to 1.0.141
2025-07-22 14:28:18 +09:00
tuna2134
4b6c72aa51 Merge pull request #226 from neodyland/dependabot/cargo/ort-d28c835
build(deps): bump ort from `1e6f7ee` to `d28c835`
2025-07-22 14:28:09 +09:00
dependabot[bot]
7db6bb67a4 build(deps): bump ort from 1e6f7ee to d28c835
Bumps [ort](https://github.com/pykeio/ort) from `1e6f7ee` to `d28c835`.
- [Release notes](https://github.com/pykeio/ort/releases)
- [Commits](1e6f7ee1c8...d28c835c3c)

---
updated-dependencies:
- dependency-name: ort
  dependency-version: d28c835c3cc98bcbefc208dc26c8618ccbadec3f
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-07-22 05:24:07 +00:00
dependabot[bot]
b3c75f973e build(deps): bump serde_json from 1.0.140 to 1.0.141
Bumps [serde_json](https://github.com/serde-rs/json) from 1.0.140 to 1.0.141.
- [Release notes](https://github.com/serde-rs/json/releases)
- [Commits](https://github.com/serde-rs/json/compare/v1.0.140...v1.0.141)

---
updated-dependencies:
- dependency-name: serde_json
  dependency-version: 1.0.141
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-07-22 05:23:53 +00:00
tuna2134
e9529be559 Update dependabot.yml 2025-07-22 14:21:56 +09:00
tuna2134
a6694b5d81 Update dependabot.yml 2025-07-22 06:13:03 +09:00
tuna2134
096859de66 Merge pull request #209 from neodyland/dependabot/cargo/num_cpus-1.17.0
Bump num_cpus from 1.16.0 to 1.17.0
2025-07-21 12:27:18 +09:00
tuna2134
dabdc6712f Merge pull request #224 from neodyland/main
merge test
2025-07-21 12:27:08 +09:00
tuna2134
45c3255a91 Merge pull request #221 from neodyland/dependabot/cargo/ort-1e6f7ee
build(deps): bump ort from `af63cea` to `1e6f7ee`
2025-07-21 11:48:27 +09:00
tuna2134
bf39890b3d Merge pull request #223 from neodyland/main
Check fail
2025-07-21 11:48:10 +09:00
tuna2134
120bc608d7 Merge pull request #217 from neodyland/dependabot/cargo/tokenizers-0.21.2
build(deps): bump tokenizers from 0.21.1 to 0.21.2
2025-07-21 11:44:43 +09:00
tuna2134
2fc547e38b Merge pull request #222 from neodyland/main
merge
2025-07-21 11:42:01 +09:00
tuna2134
98ddaa3c58 format by cargo fmt 2025-07-21 11:38:11 +09:00
dependabot[bot]
656e405cd7 build(deps): bump ort from af63cea to 1e6f7ee
Bumps [ort](https://github.com/pykeio/ort) from `af63cea` to `1e6f7ee`.
- [Release notes](https://github.com/pykeio/ort/releases)
- [Commits](af63cea854...1e6f7ee1c8)

---
updated-dependencies:
- dependency-name: ort
  dependency-version: 1e6f7ee1c8b056b00d280167ba172c96e78fcd1c
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-07-14 09:44:51 +00:00
tuna2134
9d6aa46fdf Merge pull request #220 from neodyland/dependabot/cargo/tokio-1.46.1 2025-07-07 19:38:53 +09:00
dependabot[bot]
2fe90c6ede build(deps): bump tokio from 1.45.1 to 1.46.1
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.45.1 to 1.46.1.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.45.1...tokio-1.46.1)

---
updated-dependencies:
- dependency-name: tokio
  dependency-version: 1.46.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-07-07 09:11:15 +00:00
dependabot[bot]
7faba2447b build(deps): bump tokenizers from 0.21.1 to 0.21.2
Bumps [tokenizers](https://github.com/huggingface/tokenizers) from 0.21.1 to 0.21.2.
- [Release notes](https://github.com/huggingface/tokenizers/releases)
- [Changelog](https://github.com/huggingface/tokenizers/blob/main/RELEASE.md)
- [Commits](https://github.com/huggingface/tokenizers/compare/v0.21.1...v0.21.2)

---
updated-dependencies:
- dependency-name: tokenizers
  dependency-version: 0.21.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-06-30 09:33:46 +00:00
tuna2134
02ac0885e0 Merge pull request #216 from neodyland/dependabot/cargo/utoipa-5.4.0
Bump utoipa from 5.3.1 to 5.4.0
2025-06-23 18:13:42 +09:00
tuna2134
1f96b09f3b Merge pull request #215 from neodyland/dependabot/cargo/ort-af63cea
Bump ort from `fd73862` to `af63cea`
2025-06-23 18:13:31 +09:00
tuna2134
d583c1ca1c Merge pull request #214 from neodyland/dependabot/cargo/ureq-3.0.12
Bump ureq from 3.0.11 to 3.0.12
2025-06-23 18:13:19 +09:00
dependabot[bot]
c135aac852 Bump utoipa from 5.3.1 to 5.4.0
Bumps [utoipa](https://github.com/juhaku/utoipa) from 5.3.1 to 5.4.0.
- [Release notes](https://github.com/juhaku/utoipa/releases)
- [Changelog](https://github.com/juhaku/utoipa/blob/master/utoipa-rapidoc/CHANGELOG.md)
- [Commits](https://github.com/juhaku/utoipa/compare/utoipa-5.3.1...utoipa-5.4.0)

---
updated-dependencies:
- dependency-name: utoipa
  dependency-version: 5.4.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-06-23 09:01:28 +00:00
dependabot[bot]
f31fa1d4f9 Bump ort from fd73862 to af63cea
Bumps [ort](https://github.com/pykeio/ort) from `fd73862` to `af63cea`.
- [Release notes](https://github.com/pykeio/ort/releases)
- [Commits](fd738622d7...af63cea854)

---
updated-dependencies:
- dependency-name: ort
  dependency-version: af63cea8546438576f7fc32c935d779bf0882826
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-06-23 08:48:56 +00:00
dependabot[bot]
efec7cce14 Bump ureq from 3.0.11 to 3.0.12
Bumps [ureq](https://github.com/algesten/ureq) from 3.0.11 to 3.0.12.
- [Changelog](https://github.com/algesten/ureq/blob/main/CHANGELOG.md)
- [Commits](https://github.com/algesten/ureq/compare/3.0.11...3.0.12)

---
updated-dependencies:
- dependency-name: ureq
  dependency-version: 3.0.12
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-06-23 08:38:51 +00:00
tuna2134@コマリン親衛隊
61914129dc Merge pull request #212 from neodyland/dependabot/cargo/ort-fd73862
Bump ort from `2d49e05` to `fd73862`
2025-06-16 18:01:48 +09:00
tuna2134@コマリン親衛隊
97c63a2e23 Merge pull request #213 from neodyland/dependabot/cargo/pyo3-0.25.1
Bump pyo3 from 0.25.0 to 0.25.1
2025-06-16 18:01:38 +09:00
dependabot[bot]
3475f47305 Bump pyo3 from 0.25.0 to 0.25.1
Bumps [pyo3](https://github.com/pyo3/pyo3) from 0.25.0 to 0.25.1.
- [Release notes](https://github.com/pyo3/pyo3/releases)
- [Changelog](https://github.com/PyO3/pyo3/blob/v0.25.1/CHANGELOG.md)
- [Commits](https://github.com/pyo3/pyo3/compare/v0.25.0...v0.25.1)

---
updated-dependencies:
- dependency-name: pyo3
  dependency-version: 0.25.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-06-16 08:20:04 +00:00
dependabot[bot]
5493b91a84 Bump ort from 2d49e05 to fd73862
Bumps [ort](https://github.com/pykeio/ort) from `2d49e05` to `fd73862`.
- [Release notes](https://github.com/pykeio/ort/releases)
- [Commits](2d49e052dc...fd738622d7)

---
updated-dependencies:
- dependency-name: ort
  dependency-version: fd738622d708d0b7da536812e20a9e63adeaa70d
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-06-16 08:16:14 +00:00
tuna2134@コマリン親衛隊
bca6d04e7b Merge pull request #211 from neodyland/dependabot/cargo/ort-2d49e05 2025-06-09 20:13:11 +09:00
dependabot[bot]
d44ebe873e Bump ort from d1ebde9 to 2d49e05
Bumps [ort](https://github.com/pykeio/ort) from `d1ebde9` to `2d49e05`.
- [Release notes](https://github.com/pykeio/ort/releases)
- [Commits](d1ebde95d3...2d49e052dc)

---
updated-dependencies:
- dependency-name: ort
  dependency-version: 2d49e052dc14134e7c28a7b3e0878870710cd759
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-06-09 08:04:14 +00:00
tuna2134@コマリン親衛隊
96b53d42cd Merge pull request #210 from neodyland/dependabot/cargo/ort-d1ebde9 2025-06-02 17:24:06 +09:00
dependabot[bot]
9765ef51d2 Bump ort from 4745bb3 to d1ebde9
Bumps [ort](https://github.com/pykeio/ort) from `4745bb3` to `d1ebde9`.
- [Release notes](https://github.com/pykeio/ort/releases)
- [Commits](4745bb3a4a...d1ebde95d3)

---
updated-dependencies:
- dependency-name: ort
  dependency-version: d1ebde95d386513fea836593815e8f86f7b96a85
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-06-02 08:07:32 +00:00
dependabot[bot]
655be55605 Bump num_cpus from 1.16.0 to 1.17.0
Bumps [num_cpus](https://github.com/seanmonstar/num_cpus) from 1.16.0 to 1.17.0.
- [Release notes](https://github.com/seanmonstar/num_cpus/releases)
- [Changelog](https://github.com/seanmonstar/num_cpus/blob/master/CHANGELOG.md)
- [Commits](https://github.com/seanmonstar/num_cpus/compare/v1.16.0...v1.17.0)

---
updated-dependencies:
- dependency-name: num_cpus
  dependency-version: 1.17.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-06-02 08:06:11 +00:00
tuna2134@コマリン親衛隊
e68f58d698 Fix year 2025-05-27 07:25:04 +09:00
tuna2134@コマリン親衛隊
2124fe4650 Merge pull request #207 from neodyland/dependabot/cargo/ort-4745bb3 2025-05-27 06:57:31 +09:00
tuna2134@コマリン親衛隊
0217c0a4d5 Merge pull request #205 from neodyland/dependabot/cargo/tokio-1.45.1 2025-05-27 06:57:06 +09:00
tuna2134@コマリン親衛隊
1de09597f5 Merge pull request #206 from neodyland/dependabot/cargo/pyo3-0.25.0 2025-05-27 06:56:53 +09:00
tuna2134@コマリン親衛隊
38d86c9249 Merge pull request #208 from neodyland/dependabot/cargo/npyz-0.8.4 2025-05-27 06:56:34 +09:00
dependabot[bot]
ddc132b27b Bump npyz from 0.8.3 to 0.8.4
Bumps [npyz](https://github.com/ExpHP/npyz) from 0.8.3 to 0.8.4.
- [Changelog](https://github.com/ExpHP/npyz/blob/master/CHANGELOG.md)
- [Commits](https://github.com/ExpHP/npyz/compare/0.8.3...0.8.4)

---
updated-dependencies:
- dependency-name: npyz
  dependency-version: 0.8.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-05-26 21:53:38 +00:00
dependabot[bot]
558cd24677 Bump ort from 90afc70 to 4745bb3
Bumps [ort](https://github.com/pykeio/ort) from `90afc70` to `4745bb3`.
- [Release notes](https://github.com/pykeio/ort/releases)
- [Commits](90afc700d0...4745bb3a4a)

---
updated-dependencies:
- dependency-name: ort
  dependency-version: 4745bb3a4a1b5ab7f2c807b8989638627068cdf9
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-05-26 21:52:29 +00:00
dependabot[bot]
6657b06786 Bump pyo3 from 0.24.2 to 0.25.0
Bumps [pyo3](https://github.com/pyo3/pyo3) from 0.24.2 to 0.25.0.
- [Release notes](https://github.com/pyo3/pyo3/releases)
- [Changelog](https://github.com/PyO3/pyo3/blob/main/CHANGELOG.md)
- [Commits](https://github.com/pyo3/pyo3/compare/v0.24.2...v0.25.0)

---
updated-dependencies:
- dependency-name: pyo3
  dependency-version: 0.25.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-05-26 21:52:04 +00:00
dependabot[bot]
2a8c9bafde Bump tokio from 1.45.0 to 1.45.1
Bumps [tokio](https://github.com/tokio-rs/tokio) from 1.45.0 to 1.45.1.
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.45.0...tokio-1.45.1)

---
updated-dependencies:
- dependency-name: tokio
  dependency-version: 1.45.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-05-26 21:51:36 +00:00
tuna2134@コマリン親衛隊
d7065ac6eb Merge pull request #204 from neodyland/tuna2134-patch-2
add: dependabot
2025-05-27 06:48:32 +09:00
tuna2134@コマリン親衛隊
0b1dbe4991 Create dependabot.yml 2025-05-27 06:47:06 +09:00
googlefan256
1ad588bfcf hotfix: path miss 2025-05-10 11:33:07 +09:00
googlefan256
9733ba95fa fix: the unknown error 2025-05-09 20:28:04 +09:00
googlefan256
843c16995c fix: cuda typo 2025-05-09 20:19:30 +09:00
googlefan256
f0821ea957 Merge pull request #201 from neodyland/googlefan256/fix-lot
fix: lot of
2025-05-09 20:10:08 +09:00
googlefan256
abc9cec7c7 fix: reduced actions 2025-05-09 20:07:18 +09:00
googlefan256
19e6b7f0e6 fix: ci 2025-05-09 20:05:46 +09:00
googlefan256
451f4497b6 fix: lot of 2025-05-09 17:01:02 +09:00
tuna2134@コマリン親衛隊
e5e92f6211 Merge pull request #141 from neodyland/renovate/typescript-5.x-lockfile
chore(deps): update dependency typescript to v5.8.3
2025-04-28 16:10:28 +09:00
tuna2134@コマリン親衛隊
b835577325 Merge pull request #190 from neodyland/renovate/anyhow-1.x-lockfile
chore(deps): update rust crate anyhow to v1.0.98
2025-04-28 16:08:54 +09:00
tuna2134@コマリン親衛隊
3caf93441a Merge pull request #191 from neodyland/renovate/thiserror-2.x-lockfile
fix(deps): update rust crate thiserror to v2.0.12
2025-04-28 16:08:30 +09:00
tuna2134@コマリン親衛隊
4deefc596b Merge pull request #192 from neodyland/renovate/serde_json-1.x-lockfile
fix(deps): update rust crate serde_json to v1.0.140
2025-04-22 06:42:24 +09:00
tuna2134@コマリン親衛隊
9174aa9b11 Merge pull request #196 from neodyland/renovate/pyo3-0.x
fix(deps): update rust crate pyo3 to 0.24.0
2025-04-22 06:33:40 +09:00
tuna2134@コマリン親衛隊
6bccf0468b Merge pull request #199 from neodyland/renovate/env_logger-0.x-lockfile
chore(deps): update rust crate env_logger to v0.11.8
2025-04-22 06:33:22 +09:00
tuna2134@コマリン親衛隊
bbb3f0003b Merge pull request #200 from neodyland/renovate/ureq-3.x-lockfile
chore(deps): update rust crate ureq to v3.0.11
2025-04-22 06:33:02 +09:00
googlefan256
46de7a9d3f Update README.md for notice 2025-04-18 09:47:02 +09:00
renovate[bot]
252b27de48 chore(deps): update rust crate ureq to v3.0.11 2025-04-15 22:47:28 +00:00
renovate[bot]
1dd3e02562 chore(deps): update rust crate anyhow to v1.0.98 2025-04-14 03:33:49 +00:00
renovate[bot]
4990261ecd chore(deps): update dependency typescript to v5.8.3 2025-04-05 01:50:04 +00:00
renovate[bot]
e873892223 chore(deps): update rust crate env_logger to v0.11.8 2025-04-01 22:07:52 +00:00
tuna2134@コマリン親衛隊
f081b2ed22 Merge pull request #198 from tuna2134/voicevox
Add Editor API
2025-03-31 23:56:25 +09:00
Masato Kikuchi
103eb51ca8 delete 2025-03-31 23:39:44 +09:00
Masato Kikuchi
01541ff381 delete unimport 2025-03-31 23:36:10 +09:00
Masato Kikuchi
70c2341afd format 2025-03-31 23:35:51 +09:00
Masato Kikuchi
a5d783bd65 fix: bug 2025-03-31 23:35:39 +09:00
Masato Kikuchi
633dfc305e delete mut 2025-03-31 23:04:23 +09:00
Masato Kikuchi
53d7daf11a fix 2025-03-31 23:03:30 +09:00
Masato Kikuchi
5abfe732e4 fix bug 2025-03-31 22:45:55 +09:00
tuna2134@コマリン親衛隊
48aef6cef4 Update tts.rs 2025-03-29 11:02:23 +09:00
tuna2134@コマリン親衛隊
64fc74eee6 fix: bug 2025-03-29 10:58:24 +09:00
Masato Kikuchi
6e01103c5d format 2025-03-29 10:50:40 +09:00
Masato Kikuchi
00e95cd77c feat: synthesis 2025-03-29 10:50:30 +09:00
Masato Kikuchi
01f2aaa406 no voicevox 2025-03-28 20:14:51 +09:00
Masato Kikuchi
3785faf81e fix 2025-03-28 20:08:07 +09:00
Masato Kikuchi
70e16f95ad fix: making this VOICEVOX-compatible is too difficult, so we will develop our own editor instead. 2025-03-28 20:06:00 +09:00
Masato Kikuchi
a67df43fc7 fix 2025-03-27 14:42:43 +09:00
Masato Kikuchi
472d1c600f fix: add route 2025-03-27 13:59:00 +09:00
Masato Kikuchi
acf94a1283 format 2025-03-27 13:53:52 +09:00
Masato Kikuchi
dd5c536f39 feat: g2kana_tone 2025-03-27 13:53:27 +09:00
Masato Kikuchi
07637f587d fix: type 2025-03-27 13:23:53 +09:00
Masato Kikuchi
e8dbf956e1 fix: forget to give return 2025-03-27 13:21:07 +09:00
Masato Kikuchi
2687af1a9b clippy 2025-03-27 13:18:22 +09:00
Masato Kikuchi
e915e2bc84 feat: phone_tone_to_kana 2025-03-27 13:17:20 +09:00
Masato Kikuchi
22ed557395 oh 2025-03-27 09:59:08 +09:00
Masato Kikuchi
b8f0477318 feat: audio query request 2025-03-26 16:30:31 +09:00
Masato Kikuchi
f4de3e15ae initial commit: voicevox 2025-03-26 16:14:29 +09:00
Masato Kikuchi
fc944b9d33 split the code for support voicevox 2025-03-26 15:14:22 +09:00
renovate[bot]
4255e15748 fix(deps): update rust crate pyo3 to 0.24.0 2025-03-10 03:15:24 +00:00
renovate[bot]
8bf3906105 fix(deps): update rust crate serde_json to v1.0.140 2025-03-03 11:01:43 +00:00
renovate[bot]
1d80eda325 fix(deps): update rust crate thiserror to v2.0.12 2025-03-03 06:07:28 +00:00
tuna2134@コマリン親衛隊
99a4b130af Merge pull request #189 from tuna2134/renovate/ureq-3.x-lockfile 2025-03-01 16:59:08 +09:00
renovate[bot]
d430a6cb51 chore(deps): update rust crate ureq to v3.0.8 2025-02-28 18:57:35 +00:00
tuna2134@コマリン親衛隊
61aae68d2d Merge pull request #188 from tuna2134/tuna2134-patch-1
Fix example
2025-02-26 15:03:18 +09:00
tuna2134@コマリン親衛隊
abb40d4d2d Fix example 2025-02-26 14:38:09 +09:00
tuna2134@コマリン親衛隊
adb699efe7 Merge pull request #186 from tuna2134/renovate/pyo3-0.x-lockfile 2025-02-26 05:35:36 +09:00
renovate[bot]
00fa8025d7 fix(deps): update rust crate pyo3 to v0.23.5 2025-02-25 13:49:05 +00:00
googlefan256
38c5471dcc Merge pull request #184 from googlefan256/main
Fix the Python build
2025-02-23 17:07:05 +09:00
googlefan256
28e116e67d Update CI.yml 2025-02-23 16:19:20 +09:00
Googlefan
5127d48260 fix 2025-02-23 07:04:20 +00:00
Googlefan
f6e9a52b13 fix 2025-02-23 06:49:49 +00:00
Googlefan
9b7de85c46 fix 2025-02-23 06:36:22 +00:00
Googlefan
45a221af23 fix 2025-02-23 06:22:19 +00:00
Googlefan
97541d6a28 fix 2025-02-23 06:08:26 +00:00
Googlefan
640ef16c4b miss 2025-02-23 05:49:06 +00:00
Googlefan
2b5bc27db7 tell me that I can use arm64 runner, f*** microsoft, update the doc NOW 2025-02-23 05:45:23 +00:00
Googlefan
4d00fcd0bc fix 2025-02-23 05:14:56 +00:00
Googlefan
6fc0a47a78 fix 2025-02-23 05:08:13 +00:00
Googlefan
80e5ddee5b fix 2025-02-23 05:05:40 +00:00
Googlefan
143d05c068 fix 2025-02-23 05:02:36 +00:00
Googlefan
14d604091b fix: shortcut 2025-02-23 04:58:12 +00:00
Googlefan
6fc97b1f33 revert 2025-02-23 04:52:02 +00:00
Googlefan
6c5ea9adce fix: reduce download 2025-02-23 04:48:52 +00:00
Googlefan
e262694702 fix 2025-02-23 04:45:58 +00:00
Googlefan
554b82a504 fix 2025-02-23 04:34:48 +00:00
Googlefan
0a911105a3 fix 2025-02-23 04:32:22 +00:00
Googlefan
e5a4774e1a fix 2025-02-23 04:29:15 +00:00
Googlefan
f036417046 fix 2025-02-23 04:27:14 +00:00
Googlefan
3e0c24e0ec fix 2025-02-23 04:16:23 +00:00
Googlefan
17c1a3467a fix 2025-02-23 04:14:49 +00:00
Googlefan
db954ff710 fix 2025-02-23 04:13:51 +00:00
Googlefan
3d2f36a0bf fix 2025-02-23 04:01:47 +00:00
Googlefan
6a8b64208c fix 2025-02-23 04:00:08 +00:00
Googlefan
f33791cf67 fix 2025-02-23 03:56:22 +00:00
Googlefan
96eb51cf04 fix 2025-02-23 03:53:46 +00:00
Googlefan
0f11b9a192 fix 2025-02-23 03:49:53 +00:00
Googlefan
eefd5b723c fix 2025-02-23 03:46:48 +00:00
Googlefan
c86a79cce5 fix 2025-02-23 03:22:13 +00:00
Googlefan
fd7ba84eef fix 2025-02-23 03:19:36 +00:00
googlefan256
6afa667f2e Update and rename build.yml to CI.yml 2025-02-23 12:17:53 +09:00
googlefan256
70b5852d1b Update build.yml 2025-02-23 12:12:48 +09:00
googlefan256
208ac216b5 Update and rename CI.yml to build.yml 2025-02-23 12:10:06 +09:00
googlefan256
5f0c836a66 Update CI.yml 2025-02-23 11:55:59 +09:00
googlefan256
baebe4efd6 Update .env.sample 2025-02-23 11:11:25 +09:00
googlefan256
024751cb71 Update README.md 2025-02-23 11:11:16 +09:00
tuna2134@コマリン親衛隊
12269d9b86 Fix: syntax 2025-02-23 08:20:06 +09:00
tuna2134@コマリン親衛隊
d13b0f6952 Merge pull request #183 from googlefan256/main
Resolve the LGPL issue
2025-02-22 20:25:12 +09:00
Googlefan
ba40e89411 small fix 2025-02-22 10:50:32 +00:00
googlefan256
52f7a1779b Merge branch 'tuna2134:main' into main 2025-02-22 19:42:26 +09:00
Googlefan
e6932aeeae fix: agpl issue 2025-02-22 10:42:11 +00:00
tuna2134@コマリン親衛隊
881b431e7b Merge pull request #180 from googlefan256/main
A collection of small fixes
2025-02-22 19:23:19 +09:00
Googlefan
755876605e feat: wasm would work 2025-02-22 10:04:40 +00:00
Googlefan
9a5da399be fix 2025-02-22 09:42:19 +00:00
Googlefan
c1c1bbe69a fix 2025-02-22 09:40:47 +00:00
Googlefan
9654ca4781 fix 2025-02-22 09:39:17 +00:00
Googlefan
5fed7e8f41 fix 2025-02-22 09:37:41 +00:00
Googlefan
8ae456b20b fix 2025-02-22 09:35:54 +00:00
Googlefan
908e384cfa fix 2025-02-22 09:29:15 +00:00
Googlefan
932af5cbfb fix 2025-02-22 09:23:08 +00:00
Googlefan
635dacb653 fix 2025-02-22 09:08:49 +00:00
Googlefan
cb884297c7 fix: python ci 2025-02-22 09:01:35 +00:00
Googlefan
b9aa8d1b7c fix 2025-02-22 08:56:10 +00:00
Googlefan
5b364a3c10 fix: build script 2025-02-22 08:26:59 +00:00
Googlefan
a11e57d175 remove unused files 2025-02-22 08:15:35 +00:00
Googlefan
2ddc6e57eb fix: some errors 2025-02-22 08:11:49 +00:00
Googlefan
ddfaf7f28b merge 2025-02-22 08:07:04 +00:00
Googlefan
506ee4d883 refactor 2025-02-22 08:00:17 +00:00
tuna2134@コマリン親衛隊
a1198d6380 Update CI.yml 2025-01-22 23:18:34 +09:00
tuna2134@コマリン親衛隊
c2430fc794 Update CI.yml 2025-01-20 09:49:47 +09:00
tuna2134@コマリン親衛隊
2bfca72f41 Update CI.yml 2025-01-20 09:40:32 +09:00
tuna2134@コマリン親衛隊
95b84ca55b Update CI.yml 2025-01-20 09:26:59 +09:00
tuna2134@コマリン親衛隊
033dd99fb6 Update CI.yml 2025-01-20 09:21:38 +09:00
tuna2134@コマリン親衛隊
15aef30867 Update CI.yml 2025-01-20 09:10:07 +09:00
tuna2134@コマリン親衛隊
adca252272 Merge pull request #164 from tuna2134/renovate/tokio-1.x-lockfile
fix(deps): update rust crate tokio to v1.43.0
2025-01-20 08:55:49 +09:00
tuna2134@コマリン親衛隊
5c74773754 Merge pull request #166 from tuna2134/renovate/pyo3-0.x-lockfile
fix(deps): update rust crate pyo3 to v0.23.4
2025-01-20 08:55:38 +09:00
tuna2134@コマリン親衛隊
2e5edcfb32 Merge pull request #168 from tuna2134/renovate/serde_json-1.x-lockfile
fix(deps): update rust crate serde_json to v1.0.137
2025-01-20 08:55:25 +09:00
tuna2134@コマリン親衛隊
184080bec8 Merge pull request #169 from tuna2134/renovate/utoipa-monorepo
fix(deps): update rust crate utoipa-scalar to 0.3.0
2025-01-20 08:55:13 +09:00
renovate[bot]
d83fcb9f2c fix(deps): update rust crate utoipa-scalar to 0.3.0 2025-01-19 23:52:40 +00:00
renovate[bot]
7095e0ea89 fix(deps): update rust crate serde_json to v1.0.137 2025-01-19 23:51:54 +00:00
tuna2134@コマリン親衛隊
aa07496a08 Merge pull request #163 from tuna2134/renovate/axum-monorepo
fix(deps): update rust crate axum to 0.8.0
2025-01-20 08:51:27 +09:00
tuna2134@コマリン親衛隊
bf276f51e7 Merge pull request #167 from tuna2134/renovate/log-0.x-lockfile
fix(deps): update rust crate log to v0.4.25
2025-01-20 08:51:14 +09:00
tuna2134@コマリン親衛隊
cc664fae2d Update CI.yml 2025-01-20 08:43:05 +09:00
tuna2134@コマリン親衛隊
71ec658772 Update CI.yml 2025-01-20 07:55:03 +09:00
renovate[bot]
5ea2dcff0f fix(deps): update rust crate log to v0.4.25 2025-01-14 12:51:47 +00:00
renovate[bot]
a9ea47dc51 fix(deps): update rust crate pyo3 to v0.23.4 2025-01-11 22:40:29 +00:00
コマリン親衛隊
dff939091c Merge pull request #165 from tuna2134/renovate/log-0.x-lockfile
fix(deps): update rust crate log to v0.4.24
2025-01-11 11:55:02 +09:00
renovate[bot]
8a28a4e7a5 fix(deps): update rust crate log to v0.4.24 2025-01-11 02:50:24 +00:00
コマリン親衛隊
21f845a799 Merge pull request #162 from tuna2134/renovate/serde_json-1.x-lockfile
fix(deps): update rust crate serde_json to v1.0.135
2025-01-09 09:40:07 +09:00
renovate[bot]
69015bdf81 fix(deps): update rust crate tokio to v1.43.0 2025-01-08 18:20:47 +00:00
renovate[bot]
c6e5b73128 fix(deps): update rust crate axum to 0.8.0 2025-01-08 15:24:14 +00:00
renovate[bot]
4ff9a38a80 fix(deps): update rust crate serde_json to v1.0.135 2025-01-08 15:24:08 +00:00
コマリン親衛隊
20cc0573b5 Merge pull request #161 from tuna2134/renovate/utoipa-monorepo
fix(deps): update utoipa monorepo
2025-01-06 13:28:46 +09:00
renovate[bot]
4b932d568d fix(deps): update utoipa monorepo 2025-01-06 00:39:56 +00:00
コマリン親衛隊
6237cd0fec Merge pull request #159 from tuna2134/renovate/serde-monorepo
fix(deps): update rust crate serde to v1.0.217
2024-12-28 13:09:10 +09:00
renovate[bot]
35fabdf681 fix(deps): update rust crate serde to v1.0.217 2024-12-27 22:05:25 +00:00
コマリン親衛隊
f09343c97f Merge pull request #155 from tuna2134/renovate/env_logger-0.x-lockfile
chore(deps): update rust crate env_logger to v0.11.6
2024-12-23 23:38:00 +09:00
コマリン親衛隊
f2570d89d0 Merge pull request #158 from tuna2134/renovate/anyhow-1.x-lockfile
chore(deps): update rust crate anyhow to v1.0.95
2024-12-22 21:52:38 +09:00
renovate[bot]
ac2a09d6af chore(deps): update rust crate anyhow to v1.0.95 2024-12-22 12:39:36 +00:00
コマリン親衛隊
c6eaf9cb9f Merge pull request #156 from tuna2134/renovate/serde_json-1.x-lockfile
fix(deps): update rust crate serde_json to v1.0.134
2024-12-22 10:30:46 +09:00
renovate[bot]
f2395096ca fix(deps): update rust crate serde_json to v1.0.134 2024-12-21 21:03:32 +00:00
renovate[bot]
3f6f4ccb6f chore(deps): update rust crate env_logger to v0.11.6 2024-12-20 20:48:36 +00:00
コマリン親衛隊
67eba8ee6c Merge pull request #152 from tuna2134/renovate/utoipa-monorepo
fix(deps): update rust crate utoipa to v5.3.0
2024-12-20 09:03:53 +09:00
renovate[bot]
0aa1bc8733 fix(deps): update rust crate utoipa to v5.3.0 2024-12-19 18:06:44 +00:00
コマリン親衛隊
d1970d99be Merge pull request #151 from tuna2134/renovate/pyo3-0.x-lockfile
fix(deps): update rust crate pyo3 to v0.23.3
2024-12-19 08:30:37 +09:00
コマリン親衛隊
fddb35e592 Update CI.yml 2024-12-19 08:28:53 +09:00
renovate[bot]
e26715c809 fix(deps): update rust crate pyo3 to v0.23.3 2024-12-18 23:20:53 +00:00
コマリン親衛隊
26aa4b7df0 Merge pull request #150 from tuna2134/next
Next
2024-12-19 08:19:48 +09:00
コマリン親衛隊
de18846280 Update sbv2.rs 2024-12-19 07:50:13 +09:00
コマリン親衛隊
38c2e69648 Merge pull request #133 from tuna2134/renovate/pyo3-0.x
fix(deps): update rust crate pyo3 to 0.23.0
2024-12-19 07:47:03 +09:00
コマリン親衛隊
593dbaf19d Merge pull request #144 from tuna2134/renovate/tokenizers-0.x
fix(deps): update rust crate tokenizers to 0.21.0
2024-12-19 07:44:45 +09:00
コマリン親衛隊
bf44b07be1 Merge pull request #146 from tuna2134/renovate/tokio-1.x-lockfile
fix(deps): update rust crate tokio to v1.42.0
2024-12-19 07:44:30 +09:00
コマリン親衛隊
102a8eb065 Merge pull request #149 from tuna2134/renovate/serde-monorepo
fix(deps): update rust crate serde to v1.0.216
2024-12-19 07:44:21 +09:00
renovate[bot]
68edb3187f fix(deps): update rust crate serde to v1.0.216 2024-12-11 03:08:44 +00:00
コマリン親衛隊
4a81a06faf Merge pull request #147 from tuna2134/renovate/anyhow-1.x-lockfile
chore(deps): update rust crate anyhow to v1.0.94
2024-12-07 14:27:55 +09:00
renovate[bot]
caf541ef65 chore(deps): update rust crate anyhow to v1.0.94 2024-12-03 23:23:13 +00:00
renovate[bot]
05c3846b7b fix(deps): update rust crate tokio to v1.42.0 2024-12-03 18:00:26 +00:00
renovate[bot]
1b2054c4b8 fix(deps): update rust crate tokenizers to 0.21.0 2024-11-27 14:26:20 +00:00
コマリン親衛隊
a7fbfa2017 Merge pull request #138 from tuna2134/aivmx
support aivmx
2024-11-20 16:10:06 +09:00
tuna2134
db09b73b32 support aivmx 2024-11-20 07:01:43 +00:00
tuna2134
843ef36148 Merge branch 'main' of https://github.com/tuna2134/sbv2-api into aivmx 2024-11-20 04:15:44 +00:00
コマリン親衛隊
aa7fc2e3b0 Delete convert/LICENSE 2024-11-20 13:13:17 +09:00
コマリン親衛隊
fc4a79c111 Create LICENSE 2024-11-20 13:12:40 +09:00
コマリン親衛隊
4db7f49fa5 Update and rename LICENSE to convert/LICENSE 2024-11-20 13:11:58 +09:00
tuna2134
edee0710aa support noise_scale 2024-11-20 02:53:14 +00:00
tuna2134
9bcbd496e5 fix 2024-11-20 02:42:33 +00:00
tuna2134
90b3ba2e40 fix bug 2024-11-20 02:42:19 +00:00
tuna2134
9ceec03bd0 fix bug 2024-11-20 02:39:38 +00:00
tuna2134
5e9df65656 add aivmx test 2024-11-20 02:36:42 +00:00
tuna2134
2eda2fe9ca fix 2024-11-20 02:14:59 +00:00
tuna2134
9c9119a107 support aivmx 2024-11-20 01:42:04 +00:00
コマリン親衛隊
2c1a1dffc0 Merge pull request #135 from tuna2134/renovate/serde_json-1.x-lockfile
fix(deps): update rust crate serde_json to v1.0.133
2024-11-17 14:50:23 +09:00
renovate[bot]
ed7bf53b89 fix(deps): update rust crate serde_json to v1.0.133 2024-11-17 03:44:30 +00:00
コマリン親衛隊
4375df2689 Merge pull request #134 from tuna2134/renovate/axum-0.x-lockfile
fix(deps): update rust crate axum to v0.7.9
2024-11-17 12:43:38 +09:00
renovate[bot]
789cef74ce fix(deps): update rust crate axum to v0.7.9 2024-11-16 22:43:20 +00:00
コマリン親衛隊
5b403a2255 Merge pull request #132 from tuna2134/renovate/axum-0.x-lockfile
fix(deps): update rust crate axum to v0.7.8
2024-11-16 09:22:17 +09:00
renovate[bot]
62653ec1c3 fix(deps): update rust crate pyo3 to 0.23.0 2024-11-15 23:06:13 +00:00
renovate[bot]
83076227e7 fix(deps): update rust crate axum to v0.7.8 2024-11-15 18:22:03 +00:00
tuna2134
f90904a337 fix version 2024-11-13 12:02:36 +00:00
tuna2134
4e0c8591cd fix 2024-11-13 12:00:59 +00:00
コマリン親衛隊
997b562682 Merge pull request #131 from tuna2134/add-spealer
Add speaker specification
2024-11-13 20:58:11 +09:00
tuna2134
fbd62315d0 clippy 2024-11-13 11:46:47 +00:00
tuna2134
060af0c187 format 2024-11-13 11:43:52 +00:00
tuna2134
b76738f467 add speaker id code 2024-11-13 11:39:05 +00:00
コマリン親衛隊
8598167114 Merge pull request #130 from tuna2134/tuna2134-patch-1
Allow specifying a style ID
2024-11-13 11:46:17 +09:00
tuna2134
001f61bb6a fix types 2024-11-13 02:24:09 +00:00
コマリン親衛隊
9b9962ed29 Allow specifying a style ID 2024-11-13 11:16:24 +09:00
コマリン親衛隊
b414d22a3b Merge pull request #129 from tuna2134/renovate/serde-monorepo
fix(deps): update rust crate serde to v1.0.215
2024-11-13 11:09:25 +09:00
renovate[bot]
248363ae4a fix(deps): update rust crate serde to v1.0.215 2024-11-12 00:53:00 +00:00
コマリン親衛隊
c4b61a36db Merge pull request #128 from tuna2134/renovate/thiserror-1.x-lockfile
fix(deps): update rust crate thiserror to v1.0.69
2024-11-10 20:38:21 +09:00
renovate[bot]
35d16d88a8 fix(deps): update rust crate thiserror to v1.0.69 2024-11-10 07:11:07 +00:00
コマリン親衛隊
fe48d6a034 Merge pull request #127 from tuna2134/renovate/tokio-1.x-lockfile
fix(deps): update rust crate tokio to v1.41.1
2024-11-08 09:55:35 +09:00
renovate[bot]
bca4b2053f fix(deps): update rust crate tokio to v1.41.1 2024-11-07 13:40:20 +00:00
コマリン親衛隊
3330242cd8 Merge pull request #120 from tuna2134/renovate/tokenizers-0.x-lockfile 2024-11-07 00:17:47 +08:00
コマリン親衛隊
f10f71f29b Merge pull request #124 from tuna2134/renovate/anyhow-1.x-lockfile 2024-11-06 21:12:26 +08:00
renovate[bot]
7bd39b7182 chore(deps): update rust crate anyhow to v1.0.93 2024-11-06 13:01:27 +00:00
コマリン親衛隊
2d557fb0ee Merge pull request #123 from Googlefan256/main 2024-11-06 21:00:38 +08:00
Googlefan
14d631eeaa wip: max loaded models 2024-11-06 10:43:41 +00:00
コマリン親衛隊
380daf479c Merge pull request #122 from tuna2134/renovate/pyo3-0.x-lockfile 2024-11-06 09:57:35 +08:00
renovate[bot]
cb814a9952 fix(deps): update rust crate pyo3 to v0.22.6 2024-11-06 01:24:53 +00:00
renovate[bot]
795caf626c fix(deps): update rust crate tokenizers to v0.20.3 2024-11-05 18:02:17 +00:00
コマリン親衛隊
fb32357f31 Merge pull request #119 from tuna2134/renovate/thiserror-1.x-lockfile 2024-11-05 09:49:04 +08:00
renovate[bot]
e4010b3b83 fix(deps): update rust crate thiserror to v1.0.68 2024-11-04 19:39:16 +00:00
コマリン親衛隊
17244a9ede Merge pull request #118 from tuna2134/renovate/thiserror-1.x-lockfile
fix(deps): update rust crate thiserror to v1.0.67
2024-11-04 01:23:40 +09:00
renovate[bot]
61b04fd3d7 fix(deps): update rust crate thiserror to v1.0.67 2024-11-03 16:01:48 +00:00
コマリン親衛隊
4e57a22a40 Merge pull request #117 from tuna2134/renovate/utoipa-5.x-lockfile
fix(deps): update rust crate utoipa to v5.2.0
2024-11-03 08:01:41 +09:00
renovate[bot]
8e10057882 fix(deps): update rust crate utoipa to v5.2.0 2024-11-02 15:44:49 +00:00
コマリン親衛隊
0222b9a189 Merge pull request #116 from tuna2134/renovate/tar-0.x-lockfile
fix(deps): update rust crate tar to v0.4.43
2024-11-02 16:14:41 +09:00
renovate[bot]
5e96d5aef7 fix(deps): update rust crate tar to v0.4.43 2024-11-02 06:41:37 +00:00
コマリン親衛隊
234120f510 Merge pull request #115 from tuna2134/renovate/thiserror-1.x-lockfile 2024-11-02 07:08:57 +09:00
コマリン親衛隊
08f7ab88ec Merge pull request #114 from tuna2134/renovate/anyhow-1.x-lockfile 2024-11-02 07:08:40 +09:00
renovate[bot]
005c67c9b6 fix(deps): update rust crate thiserror to v1.0.66 2024-11-01 17:30:59 +00:00
renovate[bot]
cb08b5b582 chore(deps): update rust crate anyhow to v1.0.92 2024-11-01 17:30:55 +00:00
コマリン親衛隊
105b3ce8de Merge pull request #113 from tuna2134/renovate/onnxruntime-web-1.x-lockfile
fix(deps): update dependency onnxruntime-web to v1.20.0
2024-10-31 12:55:53 +09:00
renovate[bot]
78a5016abc fix(deps): update dependency onnxruntime-web to v1.20.0 2024-10-31 01:30:18 +00:00
コマリン親衛隊
7e6bd4ad0a Merge pull request #112 from tuna2134/renovate/serde-monorepo 2024-10-29 07:47:34 +09:00
renovate[bot]
e1c6cd04b7 fix(deps): update rust crate serde to v1.0.214 2024-10-28 19:40:13 +00:00
コマリン親衛隊
a15efdff09 Merge pull request #110 from tuna2134/renovate/node-22.x-lockfile
chore(deps): update dependency @types/node to v22.8.1
2024-10-28 15:33:34 +09:00
コマリン親衛隊
21823721d0 Merge pull request #111 from tuna2134/renovate/utoipa-5.x-lockfile
fix(deps): update rust crate utoipa to v5.1.3
2024-10-28 15:33:24 +09:00
renovate[bot]
aad978be4b fix(deps): update rust crate utoipa to v5.1.3 2024-10-27 15:20:25 +00:00
renovate[bot]
6dd2cbd991 chore(deps): update dependency @types/node to v22.8.0 2024-10-25 13:50:26 +00:00
コマリン親衛隊
d7b76cc207 Merge pull request #109 from tuna2134/renovate/regex-1.x-lockfile
fix(deps): update rust crate regex to v1.11.1
2024-10-25 02:00:29 +09:00
renovate[bot]
ae0ccb29d2 fix(deps): update rust crate regex to v1.11.1 2024-10-24 16:27:25 +00:00
tuna2134
4bcde2e4b4 bump library version 2024-10-24 08:01:09 +00:00
コマリン親衛隊
2356c896f6 Merge pull request #108 from tuna2134/renovate/utoipa-5.x-lockfile
fix(deps): update rust crate utoipa to v5.1.2
2024-10-24 00:55:42 +09:00
renovate[bot]
d5445abeee fix(deps): update rust crate utoipa to v5.1.2 2024-10-23 15:50:59 +00:00
コマリン親衛隊
673ec0067d Merge pull request #107 from tuna2134/renovate/node-22.x-lockfile
chore(deps): update dependency @types/node to v22.7.9
2024-10-23 19:05:43 +09:00
renovate[bot]
74f657cb33 chore(deps): update dependency @types/node to v22.7.9 2024-10-23 04:51:07 +00:00
コマリン親衛隊
08be778cc5 Merge pull request #105 from tuna2134/renovate/thiserror-1.x-lockfile
fix(deps): update rust crate thiserror to v1.0.65
2024-10-23 08:08:57 +09:00
コマリン親衛隊
6da2f5a0bb Merge pull request #104 from tuna2134/renovate/serde-monorepo
fix(deps): update rust crate serde to v1.0.213
2024-10-23 08:08:50 +09:00
コマリン親衛隊
107190765f Merge pull request #106 from tuna2134/renovate/anyhow-1.x-lockfile
chore(deps): update rust crate anyhow to v1.0.91
2024-10-23 08:08:41 +09:00
renovate[bot]
df726e6f7b fix(deps): update rust crate serde to v1.0.213 2024-10-22 22:07:24 +00:00
renovate[bot]
e5b1ccc36b chore(deps): update rust crate anyhow to v1.0.91 2024-10-22 22:07:14 +00:00
renovate[bot]
40cb604c57 fix(deps): update rust crate thiserror to v1.0.65 2024-10-22 18:02:37 +00:00
コマリン親衛隊
9152c80c76 Merge pull request #102 from tuna2134/renovate/serde-monorepo
fix(deps): update rust crate serde to v1.0.211
2024-10-22 19:30:19 +09:00
コマリン親衛隊
574092562e Merge pull request #103 from tuna2134/renovate/tokio-1.x-lockfile
fix(deps): update rust crate tokio to v1.41.0
2024-10-22 19:30:09 +09:00
renovate[bot]
2e931adce7 fix(deps): update rust crate tokio to v1.41.0 2024-10-22 10:09:19 +00:00
renovate[bot]
e36c395db1 fix(deps): update rust crate serde to v1.0.211 2024-10-22 10:09:13 +00:00
コマリン親衛隊
cfe88629ab Merge pull request #101 from tuna2134/renovate/node-22.x-lockfile
chore(deps): update dependency @types/node to v22.7.8
2024-10-22 13:18:01 +09:00
renovate[bot]
30a98f0968 chore(deps): update dependency @types/node to v22.7.8 2024-10-22 03:41:35 +00:00
コマリン親衛隊
92ae4bc300 Merge pull request #100 from tuna2134/renovate/serde_json-1.x-lockfile
fix(deps): update rust crate serde_json to v1.0.132
2024-10-20 18:30:12 +09:00
renovate[bot]
b6a9bea7ea fix(deps): update rust crate serde_json to v1.0.132 2024-10-19 19:02:47 +00:00
コマリン親衛隊
8c88dd7c87 Merge pull request #98 from Mofa-Xingche/patch-2
Create Colab-sbv2_bindings-CPU.ipynb
2024-10-19 15:56:16 +09:00
コマリン親衛隊
61760b8d7d Merge pull request #99 from tuna2134/renovate/node-22.x-lockfile
chore(deps): update dependency @types/node to v22.7.7
2024-10-19 13:19:28 +09:00
renovate[bot]
5bbc247a89 chore(deps): update dependency @types/node to v22.7.7 2024-10-19 03:48:15 +00:00
コマリン親衛隊
b6f36def58 Merge pull request #96 from tuna2134/renovate/anyhow-1.x-lockfile 2024-10-19 12:47:58 +09:00
コマリン親衛隊
664176a11b Merge pull request #97 from tuna2134/renovate/serde_json-1.x-lockfile 2024-10-19 12:47:51 +09:00
renovate[bot]
432b68590c fix(deps): update rust crate serde_json to v1.0.131 2024-10-19 00:47:24 +00:00
魔法星辰
6283cfedfe Create Colab-sbv2_bindings-CPU.ipynb
Sorry to intrude.
For now, this adds a Jupyter notebook for running sbv2_bindings on CPU in Colab.
2024-10-19 05:00:28 +09:00
renovate[bot]
df9c5d792d chore(deps): update rust crate anyhow to v1.0.90 2024-10-18 17:53:27 +00:00
コマリン親衛隊
d1cc8de976 Merge pull request #94 from tuna2134/refine
Refine the code
2024-10-18 22:49:35 +09:00
tuna2134
c7d911220b bump 2024-10-18 13:46:22 +00:00
tuna2134
e73514e5d3 bump cersion 2024-10-18 13:37:33 +00:00
tuna2134
45a671cf52 fix compile 2024-10-18 13:35:23 +00:00
tuna2134
c4005808bd fixed 2024-10-18 13:32:35 +00:00
コマリン親衛隊
c312fb0ce4 Merge pull request #89 from tuna2134/renovate/biomejs-biome-1.x-lockfile 2024-10-18 17:41:57 +09:00
コマリン親衛隊
4b4ce82654 Merge pull request #90 from tuna2134/renovate/serde_json-1.x-lockfile 2024-10-18 13:06:55 +09:00
renovate[bot]
3ff226659b fix(deps): update rust crate serde_json to v1.0.129 2024-10-17 20:25:12 +00:00
renovate[bot]
86d0e60eec chore(deps): update dependency @biomejs/biome to v1.9.4 2024-10-17 20:24:07 +00:00
コマリン親衛隊
d337d7caf8 Merge pull request #87 from tuna2134/renovate/node-22.x-lockfile
chore(deps): update dependency @types/node to v22.7.6
2024-10-17 16:21:41 +09:00
renovate[bot]
cbd12a369b chore(deps): update dependency @types/node to v22.7.6 2024-10-17 02:57:14 +00:00
コマリン親衛隊
4a09b50a59 Merge pull request #86 from tuna2134/renovate/utoipa-5.x-lockfile
fix(deps): update rust crate utoipa to v5.1.1
2024-10-17 06:54:35 +09:00
renovate[bot]
1c5863441c fix(deps): update rust crate utoipa to v5.1.1 2024-10-16 16:00:07 +00:00
コマリン親衛隊
42c5e32a5a Merge pull request #85 from tuna2134/renovate/pyo3-0.x-lockfile
fix(deps): update rust crate pyo3 to v0.22.5
2024-10-16 12:04:22 +09:00
renovate[bot]
76bdd8f025 fix(deps): update rust crate pyo3 to v0.22.5 2024-10-15 23:27:29 +00:00
コマリン親衛隊
8e14e0b942 Merge pull request #84 from tuna2134/renovate/utoipa-5.x
fix(deps): update rust crate utoipa to v5
2024-10-15 09:36:02 +09:00
renovate[bot]
378f7d7095 fix(deps): update rust crate utoipa to v5 2024-10-14 21:31:48 +00:00
コマリン親衛隊
b63a3ccf78 Merge pull request #83 from tuna2134/renovate/utoipa-scalar-0.x
fix(deps): update rust crate utoipa-scalar to 0.2.0
2024-10-15 06:30:57 +09:00
renovate[bot]
5238640144 fix(deps): update rust crate utoipa-scalar to 0.2.0 2024-10-14 18:51:30 +00:00
コマリン親衛隊
da3a61a5e7 Merge pull request #82 from tuna2134/renovate/pyo3-0.x-lockfile 2024-10-12 19:22:14 +09:00
renovate[bot]
74043c636f fix(deps): update rust crate pyo3 to v0.22.4 2024-10-12 09:44:38 +00:00
コマリン親衛隊
7663a754a6 Merge pull request #81 from tuna2134/renovate/rust-wasm-bindgen-monorepo
fix(deps): update rust-wasm-bindgen monorepo
2024-10-11 08:41:52 +09:00
renovate[bot]
cb2e52fb18 fix(deps): update rust-wasm-bindgen monorepo 2024-10-10 23:11:03 +00:00
コマリン親衛隊
ac3945748a Merge pull request #80 from tuna2134/renovate/tokenizers-0.x-lockfile 2024-10-10 20:46:36 +09:00
renovate[bot]
1e2cde365f fix(deps): update rust crate tokenizers to v0.20.1 2024-10-10 11:29:48 +00:00
コマリン親衛隊
eecf6d90f7 Merge pull request #79 from tuna2134/renovate/rust-wasm-bindgen-monorepo
fix(deps): update rust-wasm-bindgen monorepo
2024-10-10 09:47:47 +09:00
renovate[bot]
e154fbf493 fix(deps): update rust-wasm-bindgen monorepo 2024-10-09 22:56:05 +00:00
tuna2134
f5de643a21 Merge branch 'main' of https://github.com/tuna2134/sbv2-api 2024-10-09 11:54:07 +00:00
コマリン親衛隊
4b661e3b5f Merge pull request #78 from tuna2134/tuna2134-patch-3
Add sponsor button
2024-10-09 19:21:23 +09:00
コマリン親衛隊
055c08b5d0 Create FUNDING.yml 2024-10-09 19:20:51 +09:00
コマリン親衛隊
cdbcbde04c Merge pull request #77 from tuna2134/renovate/typescript-5.x-lockfile
chore(deps): update dependency typescript to v5.6.3
2024-10-09 19:19:21 +09:00
renovate[bot]
cfd30764d0 chore(deps): update dependency typescript to v5.6.3 2024-10-08 22:34:55 +00:00
コマリン親衛隊
3708d9fec3 Merge pull request #76 from tuna2134/renovate/node-22.x-lockfile 2024-10-08 15:45:09 +09:00
renovate[bot]
065a7b9215 chore(deps): update dependency @types/node to v22.7.5 2024-10-08 00:47:20 +00:00
コマリン親衛隊
dc88251d41 Update README.md 2024-10-06 21:27:27 +09:00
コマリン親衛隊
1550ce6ee4 Merge pull request #73 from tuna2134/renovate/once_cell-1.x-lockfile 2024-10-06 20:29:38 +09:00
renovate[bot]
c1bebea69b chore(deps): update rust crate once_cell to v1.20.2 2024-10-05 16:59:26 +00:00
コマリン親衛隊
af5a550b8f Merge pull request #72 from tuna2134/renovate/biomejs-biome-1.x-lockfile 2024-10-02 08:04:30 +09:00
renovate[bot]
febfd0d84f chore(deps): update dependency @biomejs/biome to v1.9.3 2024-10-01 15:57:52 +00:00
コマリン親衛隊
55698f4a61 Merge pull request #71 from tuna2134/tuna2134-patch-2
Change the pull request template
2024-10-01 14:50:28 +09:00
コマリン親衛隊
b0155f5ffa Merge pull request #70 from tuna2134/tuna2134-patch-1
Consolidate the GitHub Actions workflows
2024-10-01 14:50:17 +09:00
コマリン親衛隊
0e9c7b6522 Stop the workflow from running when pull requests and similar events are created 2024-10-01 01:52:09 +09:00
コマリン親衛隊
b0d8be32b6 Update pull_request_template.md 2024-10-01 01:49:30 +09:00
コマリン親衛隊
f76f5e6d1c Delete .github/workflows/build.yml 2024-10-01 01:45:16 +09:00
コマリン親衛隊
e8cc450693 Update CI.yml 2024-10-01 01:44:53 +09:00
コマリン親衛隊
6f0fcd491c Merge pull request #68 from Googlefan256/main
remove webgl support
2024-09-30 22:15:25 +09:00
Googlefan
5cf4149024 feat: web example 2024-09-30 11:53:58 +00:00
Googlefan
65303173a8 fix: wasm webgl 2024-09-30 10:35:37 +00:00
コマリン親衛隊
30e4cde3ed Merge pull request #66 from Googlefan256/main
WASM version finished
2024-09-30 19:29:10 +09:00
Googlefan
596eec654d feat: sbv2 wasm 2024-09-30 08:04:37 +00:00
コマリン親衛隊
ee292315e1 Merge pull request #65 from tuna2134/renovate/regex-1.x-lockfile
fix(deps): update rust crate regex to v1.11.0
2024-09-30 00:15:13 +09:00
コマリン親衛隊
731c751455 Merge pull request #64 from tuna2134/renovate/once_cell-1.x-lockfile
chore(deps): update rust crate once_cell to v1.20.1
2024-09-30 00:14:45 +09:00
renovate[bot]
497bdd79ea fix(deps): update rust crate regex to v1.11.0 2024-09-29 15:09:37 +00:00
renovate[bot]
b887fae47b chore(deps): update rust crate once_cell to v1.20.1 2024-09-29 15:09:32 +00:00
コマリン親衛隊
ca0b8553e4 Merge pull request #63 from tuna2134/renovate/axum-0.x-lockfile
fix(deps): update rust crate axum to v0.7.7
2024-09-28 07:58:06 +09:00
renovate[bot]
29b14895bb fix(deps): update rust crate axum to v0.7.7 2024-09-27 22:47:20 +00:00
tuna2134
c2910ad9e8 add content_type 2024-09-27 12:43:41 +00:00
tuna2134
5c092e8cbb format 2024-09-27 12:40:51 +00:00
tuna2134
d380e549c4 fix bug 2024-09-27 12:40:38 +00:00
tuna2134
395f5b0004 add scalar 2024-09-27 12:35:33 +00:00
tuna2134
f5609035b7 Merge branch 'main' of https://github.com/tuna2134/sbv2-api 2024-09-27 12:30:58 +00:00
tuna2134
1e9f25dcb1 add utoipa 2024-09-27 12:30:56 +00:00
コマリン親衛隊
321ca4e749 Merge pull request #62 from Googlefan256/main
WIP wasm support
2024-09-27 21:30:36 +09:00
Googlefan
bb23bd145b wip: wasm 2024-09-27 12:20:34 +00:00
tuna2134
30e79d0df6 delete 2024-09-27 10:32:39 +00:00
tuna2134
04c21aa97c bumped 2024-09-27 10:30:33 +00:00
tuna2134
6f388052ae bump version 2024-09-27 10:26:41 +00:00
tuna2134
04af3abad5 delete comment 2024-09-27 10:26:11 +00:00
tuna2134
414e42db50 format 2024-09-27 10:24:05 +00:00
tuna2134
b8b0198ca8 fix: bug 2024-09-27 10:23:44 +00:00
コマリン親衛隊
a99fd39834 Merge pull request #60 from tuna2134/label
Stop using regular expressions
2024-09-25 22:32:08 +09:00
tuna2134
886ab78eeb Merge branch 'label' of https://github.com/tuna2134/sbv2-api into label 2024-09-25 13:22:59 +00:00
コマリン親衛隊
c85f474dbf Update jtalk.rs 2024-09-25 22:22:52 +09:00
tuna2134
6d160d7ae8 remove 2024-09-25 13:16:09 +00:00
tuna2134
ee927d65cb remove e3 2024-09-25 12:59:12 +00:00
tuna2134
6e7d641ecb fix bug 2024-09-25 12:56:13 +00:00
tuna2134
eb249aad81 Merge branch 'main' of https://github.com/tuna2134/sbv2-api 2024-09-25 12:53:26 +00:00
tuna2134
f79a67138f fix stop to use re 2024-09-25 12:53:23 +00:00
コマリン親衛隊
09945e2c1c Merge pull request #59 from tuna2134/renovate/tar-0.x-lockfile
fix(deps): update rust crate tar to v0.4.42
2024-09-25 17:25:04 +09:00
renovate[bot]
821b4c7fb3 fix(deps): update rust crate tar to v0.4.42 2024-09-25 03:03:08 +00:00
コマリン親衛隊
ec06c35929 Merge pull request #56 from tuna2134/fix-coreml
fix: fix the CoreML build failure
2024-09-24 06:42:45 +09:00
コマリン親衛隊
1373aef4b2 Merge pull request #57 from tuna2134/renovate/thiserror-1.x-lockfile
fix(deps): update rust crate thiserror to v1.0.64
2024-09-23 07:43:50 +09:00
renovate[bot]
e2e49fd0e8 fix(deps): update rust crate thiserror to v1.0.64 2024-09-22 19:16:03 +00:00
tuna2134
0cf9f87cc9 fix build 2024-09-22 14:26:15 +00:00
コマリン親衛隊
5e500b2c42 Support arm64 2024-09-22 19:12:29 +09:00
コマリン親衛隊
136375e5b6 Merge pull request #48 from tuna2134/renovate/pyo3-0.x-lockfile
fix(deps): update rust crate pyo3 to v0.22.3
2024-09-22 18:56:40 +09:00
tuna2134
aade119ddb add stripe 2024-09-22 08:05:48 +00:00
tuna2134
55cedb2f6d fix dists path 2024-09-22 07:48:53 +00:00
tuna2134
f2940f4ebe bump version 2024-09-22 07:41:58 +00:00
tuna2134
96a5ab0672 fix returns type 2024-09-22 07:40:46 +00:00
renovate[bot]
0bb3c5b8ea Update Rust crate pyo3 to v0.22.3 2024-09-16 09:25:40 +00:00
83 changed files with 5517 additions and 1818 deletions


@@ -3,4 +3,5 @@ MODEL_PATH=models/tsukuyomi.sbv2
 MODELS_PATH=models
 TOKENIZER_PATH=models/tokenizer.json
 ADDR=localhost:3000
-RUST_LOG=warn
+RUST_LOG=warn
+HOLDER_MAX_LOADED_MODElS=20
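For context, this new variable presumably caps how many models the holder keeps loaded at once (compare the "wip: max loaded models" commit above). A hypothetical sketch of how such a knob could be read; the variable name is taken verbatim from the sample above (including its unusual capitalization), and everything else is invented for illustration:

```rust
// Hypothetical sketch; only the env var name comes from .env.sample above.
fn holder_max_loaded_models() -> usize {
    std::env::var("HOLDER_MAX_LOADED_MODElS")
        .ok()
        .and_then(|v| v.parse().ok())
        .unwrap_or(20) // falls back to the sample's default of 20
}
```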

.github/FUNDING.yml (new file, +3)

@@ -0,0 +1,3 @@
# These are supported funding model platforms
github: [tuna2134]

.github/dependabot.yml (new file, +15)

@@ -0,0 +1,15 @@
# To get started with Dependabot version updates, you'll need to specify which
# package ecosystems to update and where the package manifests are located.
# Please see the documentation for all configuration options:
# https://docs.github.com/code-security/dependabot/dependabot-version-updates/configuration-options-for-the-dependabot.yml-file
version: 2
updates:
  - package-ecosystem: "cargo" # See documentation for possible values
    directory: "/" # Location of package manifests
    schedule:
      interval: "weekly"
  - package-ecosystem: "npm" # See documentation for possible values
    directory: "/" # Location of package manifests
    schedule:
      interval: "weekly"

@@ -1,8 +1,13 @@
 ## 概要
-(ここに本PRの説明をしてください。)
+<!--
+ここに本PRの説明をしてください。
+-->
 ## 関連issue
-(ここに該当するissueの番号を書いてください。)
+<!--
+ここに該当するissueの番号を書いてください。
+#nの前にfixesを置くとプルリクが閉じた時に自動的に該当issueもクローズします、
+-->
 ## 確認
 - [ ] 動作確認しましたか?


@@ -1,132 +0,0 @@
# This file is autogenerated by maturin v1.7.1
# To update, run
#
#    maturin generate-ci github
#
name: CI

on:
  push:
    branches:
      - main
      - master
    tags:
      - '*'
  pull_request:
  workflow_dispatch:

permissions:
  contents: read
  id-token: write

jobs:
  linux:
    runs-on: ${{ matrix.platform.runner }}
    strategy:
      matrix:
        platform:
          - runner: ubuntu-latest
            target: x86_64
          - runner: ubuntu-latest
            target: aarch64
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: 3.x
      - name: Build wheels
        uses: PyO3/maturin-action@v1
        with:
          target: ${{ matrix.platform.target }}
          args: --release --out dist --find-interpreter
          sccache: 'true'
          manylinux: auto
          working-directory: sbv2_bindings
      - name: Upload wheels
        uses: actions/upload-artifact@v4
        with:
          name: wheels-linux-${{ matrix.platform.target }}
          path: dist

  windows:
    runs-on: ${{ matrix.platform.runner }}
    strategy:
      matrix:
        platform:
          - runner: windows-latest
            target: x64
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: 3.x
          architecture: ${{ matrix.platform.target }}
      - name: Build wheels
        uses: PyO3/maturin-action@v1
        with:
          target: ${{ matrix.platform.target }}
          args: --release --out dist --find-interpreter
          sccache: 'true'
          working-directory: sbv2_bindings
      - name: Upload wheels
        uses: actions/upload-artifact@v4
        with:
          name: wheels-windows-${{ matrix.platform.target }}
          path: dist

  macos:
    runs-on: ${{ matrix.platform.runner }}
    strategy:
      matrix:
        platform:
          - runner: macos-12
            target: x86_64
          - runner: macos-14
            target: aarch64
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: 3.x
      - name: Build wheels
        uses: PyO3/maturin-action@v1
        with:
          target: ${{ matrix.platform.target }}
          args: --release --out dist --find-interpreter
          sccache: 'true'
          working-directory: sbv2_bindings
      - name: Upload wheels
        uses: actions/upload-artifact@v4
        with:
          name: wheels-macos-${{ matrix.platform.target }}
          path: dist

  sdist:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build sdist
        uses: PyO3/maturin-action@v1
        with:
          command: sdist
          args: --out dist
          working-directory: sbv2_bindings
      - name: Upload sdist
        uses: actions/upload-artifact@v4
        with:
          name: wheels-sdist
          path: dist

  release:
    name: Release
    runs-on: ubuntu-latest
    if: "startsWith(github.ref, 'refs/tags/')"
    needs: [linux, windows, macos, sdist]
    environment: release
    steps:
      - uses: actions/download-artifact@v4
      - name: Publish to PyPI
        uses: PyO3/maturin-action@v1
        with:
          command: upload
          args: --non-interactive --skip-existing wheels-*/*

4
.github/workflows/build.Dockerfile vendored Normal file
View File

@@ -0,0 +1,4 @@
FROM ubuntu:latest
RUN apt update && apt install openssl libssl-dev curl pkg-config software-properties-common -y && add-apt-repository ppa:deadsnakes/ppa && apt update && apt install python3.7 python3.8 python3.9 python3.10 python3.11 python3.12 python3.13 python3-pip python3 -y
ENV PIP_BREAK_SYSTEM_PACKAGES=1
RUN mkdir -p /root/.cache/sbv2 && curl https://huggingface.co/neody/sbv2-api-assets/resolve/main/dic/all.bin -o /root/.cache/sbv2/all.bin -L

View File

@@ -1,40 +1,220 @@
name: Push to github container register
name: Build
on:
release:
types: [created]
push:
branches:
- main
tags:
- '*'
workflow_dispatch:
permissions:
contents: write
id-token: write
packages: write
jobs:
push-docker:
runs-on: ubuntu-latest
permissions:
contents: read
packages: write
python-linux:
runs-on: ${{ matrix.platform.runner }}
strategy:
matrix:
tag: [cpu, cuda]
platform:
- linux/amd64
- linux/arm64
- runner: ubuntu-latest
target: x86_64
- runner: ubuntu-24.04-arm
target: aarch64
steps:
- uses: actions/checkout@v4
- name: Set up QEMU
uses: docker/setup-qemu-action@v3
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Login to GitHub Container Registry
- uses: actions/setup-python@v5
with:
python-version: 3.x
- run: docker build . -f .github/workflows/build.Dockerfile --tag ci
- name: Build wheels
uses: PyO3/maturin-action@v1
with:
target: ${{ matrix.platform.target }}
args: --release --out dist --find-interpreter
sccache: 'true'
manylinux: auto
container: ci
working-directory: ./crates/sbv2_bindings
- name: Upload wheels
uses: actions/upload-artifact@v4
with:
name: wheels-linux-${{ matrix.platform.target }}
path: ./crates/sbv2_bindings/dist
python-windows:
runs-on: ${{ matrix.platform.runner }}
strategy:
matrix:
platform:
- runner: windows-latest
target: x64
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: 3.x
architecture: ${{ matrix.platform.target }}
- name: Build wheels
uses: PyO3/maturin-action@v1
with:
target: ${{ matrix.platform.target }}
args: --release --out dist --find-interpreter
sccache: 'true'
working-directory: ./crates/sbv2_bindings
- name: Upload wheels
uses: actions/upload-artifact@v4
with:
name: wheels-windows-${{ matrix.platform.target }}
path: ./crates/sbv2_bindings/dist
python-macos:
runs-on: ${{ matrix.platform.runner }}
strategy:
matrix:
platform:
- runner: macos-14
target: aarch64
steps:
- uses: actions/checkout@v4
- uses: actions/setup-python@v5
with:
python-version: 3.x
- name: Build wheels
uses: PyO3/maturin-action@v1
with:
target: ${{ matrix.platform.target }}
args: --release --out dist --find-interpreter
sccache: 'true'
working-directory: ./crates/sbv2_bindings
- name: Upload wheels
uses: actions/upload-artifact@v4
with:
name: wheels-macos-${{ matrix.platform.target }}
path: ./crates/sbv2_bindings/dist
python-sdist:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Build sdist
uses: PyO3/maturin-action@v1
with:
command: sdist
args: --out dist
working-directory: ./crates/sbv2_bindings
- name: Upload sdist
uses: actions/upload-artifact@v4
with:
name: wheels-sdist
path: ./crates/sbv2_bindings/dist
python-wheel:
name: Wheel Upload
runs-on: ubuntu-latest
needs: [python-linux, python-windows, python-macos, python-sdist]
env:
GH_TOKEN: ${{ github.token }}
steps:
- uses: actions/checkout@v4
- run: gh run download ${{ github.run_id }} -p wheels-*
- name: release
run: |
gh release create commit-${GITHUB_SHA:0:8} --prerelease wheels-*/*
python-release:
name: Release
runs-on: ubuntu-latest
if: "startsWith(github.ref, 'refs/tags/')"
needs: [python-linux, python-windows, python-macos, python-sdist]
environment: release
env:
GH_TOKEN: ${{ github.token }}
steps:
- uses: actions/checkout@v4
- run: gh run download ${{ github.run_id }} -p wheels-*
- name: Publish to PyPI
uses: PyO3/maturin-action@v1
with:
command: upload
args: --non-interactive --skip-existing wheels-*/*
docker:
runs-on: ${{ matrix.machine.runner }}
strategy:
fail-fast: false
matrix:
machine:
- platform: amd64
runner: ubuntu-latest
- platform: arm64
runner: ubuntu-24.04-arm
tag: [cpu, cuda]
steps:
- name: Prepare
run: |
platform=${{ matrix.machine.platform }}
echo "PLATFORM_PAIR=${platform//\//-}" >> $GITHUB_ENV
- name: Docker meta
id: meta
uses: docker/metadata-action@v5
with:
images: |
ghcr.io/${{ github.repository }}
- name: Login to GHCR
uses: docker/login-action@v3
with:
registry: ghcr.io
username: ${{ github.actor }}
username: ${{ github.repository_owner }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Build and push image
- name: Set up QEMU
uses: docker/setup-qemu-action@v3
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Build and push by digest
id: build
uses: docker/build-push-action@v6
with:
context: .
labels: ${{ steps.meta.outputs.labels }}
file: ./scripts/docker/${{ matrix.tag }}.Dockerfile
push: true
tags: |
ghcr.io/${{ github.repository }}:${{ matrix.tag }}
file: docker/${{ matrix.tag }}.Dockerfile
platforms: ${{ matrix.platform }}
ghcr.io/${{ github.repository }}:latest-${{ matrix.tag }}-${{ matrix.machine.platform }}
docker-merge:
runs-on: ubuntu-latest
needs:
- docker
steps:
- name: Download digests
uses: actions/download-artifact@v4
with:
path: ${{ runner.temp }}/digests
pattern: digests-*
merge-multiple: true
- name: Login to GHCR
uses: docker/login-action@v3
with:
registry: ghcr.io
username: ${{ github.repository_owner }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Merge
run: |
docker buildx imagetools create -t ghcr.io/${{ github.repository }}:cuda \
ghcr.io/${{ github.repository }}:latest-cuda-amd64 \
ghcr.io/${{ github.repository }}:latest-cuda-arm64
docker buildx imagetools create -t ghcr.io/${{ github.repository }}:cpu \
ghcr.io/${{ github.repository }}:latest-cpu-amd64 \
ghcr.io/${{ github.repository }}:latest-cpu-arm64

26
.github/workflows/lint.yml vendored Normal file
View File

@@ -0,0 +1,26 @@
name: Lint
on:
pull_request:
jobs:
check:
runs-on: ubuntu-latest
strategy:
fail-fast: false
matrix:
components:
- rustfmt
- clippy
steps:
- name: Setup
uses: actions/checkout@v4
- uses: actions-rust-lang/setup-rust-toolchain@v1
with:
components: ${{ matrix.components }}
- name: Format
if: ${{ matrix.components == 'rustfmt' }}
run: cargo fmt --all -- --check
- name: Lint
if: ${{ matrix.components == 'clippy' }}
run: cargo clippy --all-targets --all-features -- -D warnings

9
.gitignore vendored
View File

@@ -1,7 +1,10 @@
target
target/
models/
!models/.gitkeep
venv/
.env
output.wav
node_modules
*.wav
node_modules/
dist/
*.csv
*.bin

2278
Cargo.lock generated

File diff suppressed because it is too large

View File

@@ -1,14 +1,25 @@
[workspace]
resolver = "2"
members = ["sbv2_api", "sbv2_core", "sbv2_bindings"]
resolver = "3"
members = ["./crates/sbv2_api", "./crates/sbv2_core", "./crates/sbv2_bindings", "./crates/sbv2_wasm"]
[workspace.package]
version = "0.2.0-alpha6"
edition = "2021"
description = "An inference library for Style-Bert-VITS"
license = "MIT"
readme = "./README.md"
repository = "https://github.com/neodyland/sbv2-api"
documentation = "https://docs.rs/sbv2_core"
[workspace.dependencies]
anyhow = "1.0.86"
anyhow = "1.0.99"
dotenvy = "0.15.7"
env_logger = "0.11.5"
env_logger = "0.11.6"
ndarray = "0.16.1"
once_cell = "1.20.3"
[profile.release]
strip = true
opt-level = "z"
lto = true
debug = false
strip = true
codegen-units = 1

View File

@@ -1,6 +1,7 @@
MIT License
Copyright (c) 2024 tuna2134
Copyright (c) 2025- neodyland
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal

View File

@@ -1,8 +1,23 @@
# SBV2-API
> [!CAUTION]
> This version is an alpha release.
>
> If you want the stable release, see [here](https://github.com/neodyland/sbv2-api/tree/v0.1.x).
> [!CAUTION]
> The optional dictionary is licensed under the LGPL.
>
> If you use the optional dictionary, the LGPL applies to the dictionary portion inside the binary.
> [!NOTE]
> For maintenance reasons, this repository has moved to [Neodyland](https://neody.land/), which [tuna2134](https://github.com/tuna2134) belongs to.
>
> tuna2134 continues to manage it as the main maintainer.
## For people not familiar with programming
See [here](https://github.com/tuna2134/sbv2-gui?tab=readme-ov-file).
See [here](https://github.com/tuna2134/sbv2-gui).
This version is easy to use with no command-line or Python knowledge. (It can do almost the same things.)
@@ -14,7 +29,7 @@ Only JP-Extra is supported. (There are basically no plans to support anything else.)
## How to convert models
See [here](https://github.com/tuna2134/sbv2-api/tree/main/convert).
See [here](https://github.com/neodyland/sbv2-api/tree/main/scripts/convert).
## Todo
@@ -26,22 +41,27 @@ Only JP-Extra is supported. (There are basically no plans to support anything else.)
- [x] GPU support (CUDA)
- [x] GPU support (DirectML)
- [x] GPU support (CoreML)
- [ ] WASM conversion (currently impossible due to a dependency)
- [ ] arm64 docker support
- [x] WASM conversion
- [x] arm64 docker support
- [x] aivis format support
- [ ] Use MeCab
## Structure
- `sbv2_api` - REST API for inference
- `sbv2_core` - inference core
- `docker` - docker build scripts
- `convert` - scripts for converting to onnx and the sbv2 format
- `crates/sbv2_api` - REST API for inference
- `crates/sbv2_core` - inference core
- `scripts/docker` - docker build scripts
- `scripts/convert` - scripts for converting to onnx and the sbv2 format
## Starting the REST API (for people comfortable with programming)
### Install the models
https://huggingface.co/googlefan/sbv2_onnx_models/tree/main
Place `tokenizer.json`, `debert.onnx`, and `tsukuyomi.sbv2` in the models folder
Get `tokenizer.json` and `debert.onnx` from
https://huggingface.co/neody/sbv2-api-assets/tree/main/deberta
and `tsukuyomi.sbv2` from
https://huggingface.co/neody/sbv2-api-assets/tree/main/model
and place them in the models folder
### Create the .env file
@@ -55,7 +75,7 @@ For CPU:
```sh
docker run -it --rm -p 3000:3000 --name sbv2 \
-v ./models:/work/models --env-file .env \
ghcr.io/tuna2134/sbv2-api:cpu
ghcr.io/neodyland/sbv2-api:cpu
```
<details>
@@ -70,7 +90,7 @@ For CPU:
```bash
docker run --platform linux/amd64 -it --rm -p 3000:3000 --name sbv2 \
-v ./models:/work/models --env-file .env \
ghcr.io/tuna2134/sbv2-api:cpu
ghcr.io/neodyland/sbv2-api:cpu
```
</details>
@@ -79,7 +99,7 @@ For CUDA:
docker run -it --rm -p 3000:3000 --name sbv2 \
-v ./models:/work/models --env-file .env \
--gpus all \
ghcr.io/tuna2134/sbv2-api:cuda
ghcr.io/neodyland/sbv2-api:cuda
```
### Verify it is running
@@ -110,8 +130,10 @@ curl http://localhost:3000/models
- `ADDR` Controls the address the server binds to, e.g. `localhost:3000`.
- `MODELS_PATH` Specifies the folder that contains the sbv2 models.
- `RUST_LOG` The usual log level.
- `HOLDER_MAX_LOADED_MODElS` Specifies the maximum number of models loaded into RAM at once (a sample .env follows this diff).
## Acknowledgments
- [litagin02/Style-Bert-VITS2](https://github.com/litagin02/Style-Bert-VITS2) - We referenced this project as the basis when writing this code.
- [Googlefan](https://github.com/Googlefan256) - Taught us how to convert the models to ONNX and make them more efficient.
- [Aivis Project](https://github.com/Aivis-Project/AivisSpeech-Engine) - Dictionary portion
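A hypothetical .env assembled from the variables documented above and the ones the server code reads (`BERT_MODEL_PATH`, `TOKENIZER_PATH`); all paths and values are illustrative only:

```sh
# Hypothetical example; adjust paths to your setup.
BERT_MODEL_PATH=models/debert.onnx
TOKENIZER_PATH=models/tokenizer.json
MODELS_PATH=models
ADDR=localhost:3000
RUST_LOG=warn
# Note: the repository spells this variable with a lowercase "l".
HOLDER_MAX_LOADED_MODElS=20
```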

View File

@@ -1 +0,0 @@
10,000年前までコロナが流行っていました

View File

@@ -1,5 +0,0 @@
style-bert-vits2
onnxsim
numpy<2
zstandard
onnxruntime

View File

@@ -0,0 +1,29 @@
[package]
name = "sbv2_api"
version.workspace = true
edition.workspace = true
description.workspace = true
readme.workspace = true
repository.workspace = true
documentation.workspace = true
license.workspace = true
[dependencies]
anyhow.workspace = true
axum = "0.8.0"
dotenvy.workspace = true
env_logger.workspace = true
log = "0.4.22"
sbv2_core = { version = "0.2.0-alpha6", path = "../sbv2_core", features = ["aivmx"] }
serde = { version = "1.0.210", features = ["derive"] }
tokio = { version = "1.47.1", features = ["full"] }
utoipa = { version = "5.4.0", features = ["axum_extras"] }
utoipa-scalar = { version = "0.3.0", features = ["axum"] }
[features]
coreml = ["sbv2_core/coreml"]
cuda = ["sbv2_core/cuda"]
cuda_tf32 = ["sbv2_core/cuda_tf32"]
dynamic = ["sbv2_core/dynamic"]
directml = ["sbv2_core/directml"]
tensorrt = ["sbv2_core/tensorrt"]

5
crates/sbv2_api/build.rs Normal file
View File

@@ -0,0 +1,5 @@
fn main() {
if cfg!(feature = "coreml") {
println!("cargo:rustc-link-arg=-fapple-link-rtlib");
}
}

View File

@@ -11,10 +11,23 @@ use std::env;
use std::sync::Arc;
use tokio::fs;
use tokio::sync::Mutex;
use utoipa::{OpenApi, ToSchema};
use utoipa_scalar::{Scalar, Servable};
mod error;
use crate::error::AppResult;
#[derive(OpenApi)]
#[openapi(paths(models, synthesize), components(schemas(SynthesizeRequest)))]
struct ApiDoc;
#[utoipa::path(
get,
path = "/models",
responses(
(status = 200, description = "Return model list", body = Vec<String>),
)
)]
async fn models(State(state): State<AppState>) -> AppResult<impl IntoResponse> {
Ok(Json(state.tts_model.lock().await.models()))
}
@@ -27,16 +40,40 @@ fn length_default() -> f32 {
1.0
}
#[derive(Deserialize)]
fn style_id_default() -> i32 {
0
}
fn speaker_id_default() -> i64 {
0
}
#[derive(Deserialize, ToSchema)]
struct SynthesizeRequest {
text: String,
ident: String,
#[serde(default = "sdp_default")]
#[schema(example = 0.0_f32)]
sdp_ratio: f32,
#[serde(default = "length_default")]
#[schema(example = 1.0_f32)]
length_scale: f32,
#[serde(default = "style_id_default")]
#[schema(example = 0_i32)]
style_id: i32,
#[serde(default = "speaker_id_default")]
#[schema(example = 0_i64)]
speaker_id: i64,
}
#[utoipa::path(
post,
path = "/synthesize",
request_body = SynthesizeRequest,
responses(
(status = 200, description = "Return audio/wav", body = Vec<u8>, content_type = "audio/wav")
)
)]
async fn synthesize(
State(state): State<AppState>,
Json(SynthesizeRequest {
@@ -44,15 +81,18 @@ async fn synthesize(
ident,
sdp_ratio,
length_scale,
style_id,
speaker_id,
}): Json<SynthesizeRequest>,
) -> AppResult<impl IntoResponse> {
log::debug!("processing request: text={text}, ident={ident}, sdp_ratio={sdp_ratio}, length_scale={length_scale}");
let buffer = {
let tts_model = state.tts_model.lock().await;
let mut tts_model = state.tts_model.lock().await;
tts_model.easy_synthesize(
&ident,
&text,
0,
style_id,
speaker_id,
SynthesizeOptions {
sdp_ratio,
length_scale,
@@ -73,6 +113,9 @@ impl AppState {
let mut tts_model = TTSModelHolder::new(
&fs::read(env::var("BERT_MODEL_PATH")?).await?,
&fs::read(env::var("TOKENIZER_PATH")?).await?,
env::var("HOLDER_MAX_LOADED_MODElS")
.ok()
.and_then(|x| x.parse().ok()),
)?;
let models = env::var("MODELS_PATH").unwrap_or("models".to_string());
let mut f = fs::read_dir(&models).await?;
@@ -101,6 +144,20 @@ impl AppState {
log::warn!("Error loading {entry}: {e}");
};
log::info!("Loaded: {entry}");
} else if name.ends_with(".aivmx") {
let entry = &name[..name.len() - 6];
log::info!("Try loading: {entry}");
let aivmx_bytes = match fs::read(format!("{models}/{entry}.aivmx")).await {
Ok(b) => b,
Err(e) => {
log::warn!("Error loading aivmx bytes from file {entry}: {e}");
continue;
}
};
if let Err(e) = tts_model.load_aivmx(entry, aivmx_bytes) {
log::error!("Error loading {entry}: {e}");
}
log::info!("Loaded: {entry}");
}
}
for entry in entries {
@@ -139,7 +196,8 @@ async fn main() -> anyhow::Result<()> {
.route("/", get(|| async { "Hello, World!" }))
.route("/synthesize", post(synthesize))
.route("/models", get(models))
.with_state(AppState::new().await?);
.with_state(AppState::new().await?)
.merge(Scalar::with_url("/docs", ApiDoc::openapi()));
let addr = env::var("ADDR").unwrap_or("0.0.0.0:3000".to_string());
let listener = tokio::net::TcpListener::bind(&addr).await?;
log::info!("Listening on {addr}");
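With the serde defaults added above, a client only needs to send `text` and `ident`; the other fields fall back to their defaults. A self-contained sketch of that deserialization behavior (the struct is restated from this diff; depending on `serde`/`serde_json` directly is an assumption made for the demo):

```rust
use serde::Deserialize;

fn sdp_default() -> f32 { 0.0 }
fn length_default() -> f32 { 1.0 }
fn style_id_default() -> i32 { 0 }
fn speaker_id_default() -> i64 { 0 }

// Restated from the diff above so the sketch compiles on its own.
#[derive(Deserialize)]
struct SynthesizeRequest {
    text: String,
    ident: String,
    #[serde(default = "sdp_default")]
    sdp_ratio: f32,
    #[serde(default = "length_default")]
    length_scale: f32,
    #[serde(default = "style_id_default")]
    style_id: i32,
    #[serde(default = "speaker_id_default")]
    speaker_id: i64,
}

fn main() {
    // Only `text` and `ident` are supplied; everything else takes its default.
    let req: SynthesizeRequest =
        serde_json::from_str(r#"{"text":"こんにちは","ident":"tsukuyomi"}"#).unwrap();
    assert_eq!((req.style_id, req.speaker_id), (0, 0));
    assert_eq!((req.sdp_ratio, req.length_scale), (0.0, 1.0));
    println!("synthesize {:?} with model {:?}", req.text, req.ident);
}
```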

View File

@@ -0,0 +1,24 @@
[package]
name = "sbv2_bindings"
version.workspace = true
edition.workspace = true
description.workspace = true
readme.workspace = true
repository.workspace = true
documentation.workspace = true
license.workspace = true
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[lib]
name = "sbv2_bindings"
crate-type = ["cdylib"]
[dependencies]
anyhow.workspace = true
ndarray.workspace = true
pyo3 = { version = "0.25.1", features = ["anyhow"] }
sbv2_core = { path = "../sbv2_core", features = ["std"], default-features = false }
[features]
agpl_dict = ["sbv2_core/agpl_dict"]
default = ["agpl_dict"]

View File

@@ -11,5 +11,7 @@ classifiers = [
"Programming Language :: Python :: Implementation :: PyPy",
]
dynamic = ["version"]
[tool.maturin]
features = ["pyo3/extension-module"]
strip = true

View File

@@ -1,6 +1,6 @@
use pyo3::prelude::*;
use pyo3::types::PyBytes;
use sbv2_core::tts::{TTSModelHolder, SynthesizeOptions};
use sbv2_core::tts::{SynthesizeOptions, TTSModelHolder};
use crate::style::StyleVector;
@@ -23,10 +23,15 @@ pub struct TTSModel {
#[pymethods]
impl TTSModel {
#[pyo3(signature = (bert_model_bytes, tokenizer_bytes, max_loaded_models=None))]
#[new]
fn new(bert_model_bytes: Vec<u8>, tokenizer_bytes: Vec<u8>) -> anyhow::Result<Self> {
fn new(
bert_model_bytes: Vec<u8>,
tokenizer_bytes: Vec<u8>,
max_loaded_models: Option<usize>,
) -> anyhow::Result<Self> {
Ok(Self {
model: TTSModelHolder::new(bert_model_bytes, tokenizer_bytes)?,
model: TTSModelHolder::new(bert_model_bytes, tokenizer_bytes, max_loaded_models)?,
})
}
@@ -38,10 +43,21 @@ impl TTSModel {
/// Path to the BERT model
/// tokenizer_path : str
/// Path to the tokenizer
/// max_loaded_models: int | None
/// Number of models resident in VRAM at the same time
#[pyo3(signature = (bert_model_path, tokenizer_path, max_loaded_models=None))]
#[staticmethod]
fn from_path(bert_model_path: String, tokenizer_path: String) -> anyhow::Result<Self> {
fn from_path(
bert_model_path: String,
tokenizer_path: String,
max_loaded_models: Option<usize>,
) -> anyhow::Result<Self> {
Ok(Self {
model: TTSModelHolder::new(fs::read(bert_model_path)?, fs::read(tokenizer_path)?)?,
model: TTSModelHolder::new(
fs::read(bert_model_path)?,
fs::read(tokenizer_path)?,
max_loaded_models,
)?,
})
}
@@ -91,7 +107,7 @@ impl TTSModel {
/// style_vector : StyleVector
/// Style vector
fn get_style_vector(
&self,
&mut self,
ident: String,
style_id: i32,
weight: f32,
@@ -120,25 +136,32 @@ impl TTSModel {
/// -------
/// voice_data : bytes
/// Audio data
#[allow(clippy::too_many_arguments)]
fn synthesize<'p>(
&'p self,
&'p mut self,
py: Python<'p>,
text: String,
ident: String,
style_id: i32,
speaker_id: i64,
sdp_ratio: f32,
length_scale: f32,
) -> anyhow::Result<Bound<PyBytes>> {
) -> anyhow::Result<Bound<'p, PyBytes>> {
let data = self.model.easy_synthesize(
ident.as_str(),
&text,
style_id,
speaker_id,
SynthesizeOptions {
sdp_ratio,
length_scale,
..Default::default()
},
)?;
Ok(PyBytes::new_bound(py, &data))
Ok(PyBytes::new(py, &data))
}
fn unload(&mut self, ident: String) -> bool {
self.model.unload(ident)
}
}

View File

@@ -0,0 +1,47 @@
[package]
name = "sbv2_core"
version.workspace = true
edition.workspace = true
description.workspace = true
readme.workspace = true
repository.workspace = true
documentation.workspace = true
license.workspace = true
[dependencies]
anyhow.workspace = true
base64 = { version = "0.22.1", optional = true }
dotenvy.workspace = true
env_logger.workspace = true
hound = "3.5.1"
jpreprocess = { version = "0.12.0", features = ["naist-jdic"] }
ndarray.workspace = true
npyz = { version = "0.8.4", optional = true }
num_cpus = "1.17.0"
once_cell.workspace = true
ort = { git = "https://github.com/pykeio/ort.git", version = "2.0.0-rc.9", optional = true }
regex = "1.10.6"
serde = { version = "1.0.210", features = ["derive"] }
serde_json = "1.0.142"
tar = "0.4.41"
thiserror = "2.0.15"
tokenizers = { version = "0.21.4", default-features = false }
zstd = "0.13.2"
[features]
cuda = ["ort/cuda", "std"]
cuda_tf32 = ["std", "cuda"]
agpl_dict = []
std = ["dep:ort", "tokenizers/progressbar", "tokenizers/onig", "tokenizers/esaxx_fast"]
dynamic = ["ort/load-dynamic", "std"]
directml = ["ort/directml", "std"]
tensorrt = ["ort/tensorrt", "std"]
coreml = ["ort/coreml", "std"]
default = ["std", "agpl_dict"]
no_std = ["tokenizers/unstable_wasm"]
aivmx = ["npyz", "base64"]
base64 = ["dep:base64"]
[build-dependencies]
dirs = "6.0.0"
ureq = "3.1.0"

31
crates/sbv2_core/build.rs Normal file
View File

@@ -0,0 +1,31 @@
use dirs::home_dir;
use std::env;
use std::fs;
use std::io::copy;
use std::path::PathBuf;
fn main() -> Result<(), Box<dyn std::error::Error>> {
let static_dir = home_dir().unwrap().join(".cache/sbv2");
let static_path = static_dir.join("all.bin");
let out_path = PathBuf::from(&env::var("OUT_DIR").unwrap()).join("all.bin");
println!("cargo:rerun-if-changed=build.rs");
if static_path.exists() {
println!("cargo:info=Dictionary file already exists, skipping download.");
} else {
println!("cargo:warning=Downloading dictionary file...");
let mut response =
ureq::get("https://huggingface.co/neody/sbv2-api-assets/resolve/main/dic/all.bin")
.call()?;
let mut response = response.body_mut().as_reader();
if !static_dir.exists() {
fs::create_dir_all(static_dir)?;
}
let mut file = fs::File::create(&static_path)?;
copy(&mut response, &mut file)?;
}
if !out_path.exists() && fs::hard_link(&static_path, &out_path).is_err() {
println!("cargo:warning=Failed to create hard link, copying instead.");
fs::copy(static_path, out_path)?;
}
Ok(())
}

View File

@@ -0,0 +1,22 @@
use crate::error::Result;
use ndarray::{Array2, Ix2};
use ort::session::Session;
use ort::value::TensorRef;
pub fn predict(
session: &mut Session,
token_ids: Vec<i64>,
attention_masks: Vec<i64>,
) -> Result<Array2<f32>> {
let outputs = session.run(
ort::inputs! {
"input_ids" => TensorRef::from_array_view((vec![1, token_ids.len() as i64], token_ids.as_slice()))?,
"attention_mask" => TensorRef::from_array_view((vec![1, attention_masks.len() as i64], attention_masks.as_slice()))?,
}
)?;
let output = outputs["output"]
.try_extract_array::<f32>()?
.into_dimensionality::<Ix2>()?
.to_owned();
Ok(output)
}
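A minimal sketch of feeding this BERT session, assuming the crate's `std` feature and local `deberta.onnx` / `tokenizer.json` files (the file names and the `anyhow` dependency are assumptions, not part of this diff):

```rust
use sbv2_core::{bert, model, tokenizer};

fn main() -> anyhow::Result<()> {
    let mut session = model::load_model(std::fs::read("deberta.onnx")?, true)?;
    let tok = tokenizer::get_tokenizer(std::fs::read("tokenizer.json")?)?;
    // Tokenize, then widen ids/mask to i64 as predict() expects.
    let enc = tok.encode("こんにちは", true).map_err(anyhow::Error::msg)?;
    let ids: Vec<i64> = enc.get_ids().iter().map(|&x| x as i64).collect();
    let mask: Vec<i64> = enc.get_attention_mask().iter().map(|&x| x as i64).collect();
    let hidden = bert::predict(&mut session, ids, mask)?;
    println!("BERT output shape: {:?}", hidden.shape()); // Array2<f32>
    Ok(())
}
```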

View File

@@ -6,6 +6,9 @@ pub enum Error {
TokenizerError(#[from] tokenizers::Error),
#[error("JPreprocess error: {0}")]
JPreprocessError(#[from] jpreprocess::error::JPreprocessError),
#[error("Lindera error: {0}")]
LinderaError(String),
#[cfg(feature = "std")]
#[error("ONNX error: {0}")]
OrtError(#[from] ort::Error),
#[error("NDArray error: {0}")]
@@ -20,6 +23,13 @@ pub enum Error {
HoundError(#[from] hound::Error),
#[error("model not found error")]
ModelNotFoundError(String),
#[cfg(feature = "base64")]
#[error("base64 error")]
Base64Error(#[from] base64::DecodeError),
#[error("other")]
OtherError(String),
#[error("Style error: {0}")]
StyleError(String),
}
pub type Result<T> = std::result::Result<T, Error>;

View File

@@ -0,0 +1,621 @@
/*
The code in this file is a rewrite into Rust based on
https://github.com/litagin02/Style-Bert-VITS2/blob/master/style_bert_vits2/nlp/japanese/g2p.py
The license follows below.
GNU LESSER GENERAL PUBLIC LICENSE
Version 3, 29 June 2007
Copyright (C) 2007 Free Software Foundation, Inc. <https://fsf.org/>
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
This version of the GNU Lesser General Public License incorporates
the terms and conditions of version 3 of the GNU General Public
License, supplemented by the additional permissions listed below.
0. Additional Definitions.
As used herein, "this License" refers to version 3 of the GNU Lesser
General Public License, and the "GNU GPL" refers to version 3 of the GNU
General Public License.
"The Library" refers to a covered work governed by this License,
other than an Application or a Combined Work as defined below.
An "Application" is any work that makes use of an interface provided
by the Library, but which is not otherwise based on the Library.
Defining a subclass of a class defined by the Library is deemed a mode
of using an interface provided by the Library.
A "Combined Work" is a work produced by combining or linking an
Application with the Library. The particular version of the Library
with which the Combined Work was made is also called the "Linked
Version".
The "Minimal Corresponding Source" for a Combined Work means the
Corresponding Source for the Combined Work, excluding any source code
for portions of the Combined Work that, considered in isolation, are
based on the Application, and not on the Linked Version.
The "Corresponding Application Code" for a Combined Work means the
object code and/or source code for the Application, including any data
and utility programs needed for reproducing the Combined Work from the
Application, but excluding the System Libraries of the Combined Work.
1. Exception to Section 3 of the GNU GPL.
You may convey a covered work under sections 3 and 4 of this License
without being bound by section 3 of the GNU GPL.
2. Conveying Modified Versions.
If you modify a copy of the Library, and, in your modifications, a
facility refers to a function or data to be supplied by an Application
that uses the facility (other than as an argument passed when the
facility is invoked), then you may convey a copy of the modified
version:
a) under this License, provided that you make a good faith effort to
ensure that, in the event an Application does not supply the
function or data, the facility still operates, and performs
whatever part of its purpose remains meaningful, or
b) under the GNU GPL, with none of the additional permissions of
this License applicable to that copy.
3. Object Code Incorporating Material from Library Header Files.
The object code form of an Application may incorporate material from
a header file that is part of the Library. You may convey such object
code under terms of your choice, provided that, if the incorporated
material is not limited to numerical parameters, data structure
layouts and accessors, or small macros, inline functions and templates
(ten or fewer lines in length), you do both of the following:
a) Give prominent notice with each copy of the object code that the
Library is used in it and that the Library and its use are
covered by this License.
b) Accompany the object code with a copy of the GNU GPL and this license
document.
4. Combined Works.
You may convey a Combined Work under terms of your choice that,
taken together, effectively do not restrict modification of the
portions of the Library contained in the Combined Work and reverse
engineering for debugging such modifications, if you also do each of
the following:
a) Give prominent notice with each copy of the Combined Work that
the Library is used in it and that the Library and its use are
covered by this License.
b) Accompany the Combined Work with a copy of the GNU GPL and this license
document.
c) For a Combined Work that displays copyright notices during
execution, include the copyright notice for the Library among
these notices, as well as a reference directing the user to the
copies of the GNU GPL and this license document.
d) Do one of the following:
0) Convey the Minimal Corresponding Source under the terms of this
License, and the Corresponding Application Code in a form
suitable for, and under terms that permit, the user to
recombine or relink the Application with a modified version of
the Linked Version to produce a modified Combined Work, in the
manner specified by section 6 of the GNU GPL for conveying
Corresponding Source.
1) Use a suitable shared library mechanism for linking with the
Library. A suitable mechanism is one that (a) uses at run time
a copy of the Library already present on the user's computer
system, and (b) will operate properly with a modified version
of the Library that is interface-compatible with the Linked
Version.
e) Provide Installation Information, but only if you would otherwise
be required to provide such information under section 6 of the
GNU GPL, and only to the extent that such information is
necessary to install and execute a modified version of the
Combined Work produced by recombining or relinking the
Application with a modified version of the Linked Version. (If
you use option 4d0, the Installation Information must accompany
the Minimal Corresponding Source and Corresponding Application
Code. If you use option 4d1, you must provide the Installation
Information in the manner specified by section 6 of the GNU GPL
for conveying Corresponding Source.)
5. Combined Libraries.
You may place library facilities that are a work based on the
Library side by side in a single library together with other library
facilities that are not Applications and are not covered by this
License, and convey such a combined library under terms of your
choice, if you do both of the following:
a) Accompany the combined library with a copy of the same work based
on the Library, uncombined with any other library facilities,
conveyed under the terms of this License.
b) Give prominent notice with the combined library that part of it
is a work based on the Library, and explaining where to find the
accompanying uncombined form of the same work.
6. Revised Versions of the GNU Lesser General Public License.
The Free Software Foundation may publish revised and/or new versions
of the GNU Lesser General Public License from time to time. Such new
versions will be similar in spirit to the present version, but may
differ in detail to address new problems or concerns.
Each version is given a distinguishing version number. If the
Library as you received it specifies that a certain numbered version
of the GNU Lesser General Public License "or any later version"
applies to it, you have the option of following the terms and
conditions either of that published version or of any later version
published by the Free Software Foundation. If the Library as you
received it does not specify a version number of the GNU Lesser
General Public License, you may choose any version of the GNU Lesser
General Public License ever published by the Free Software Foundation.
If the Library as you received it specifies that a proxy can decide
whether future versions of the GNU Lesser General Public License shall
apply, that proxy's public statement of acceptance of any version is
permanent authorization for you to choose that version for the
Library.
*/
use crate::error::{Error, Result};
use crate::mora::{CONSONANTS, MORA_KATA_TO_MORA_PHONEMES, MORA_PHONEMES_TO_MORA_KATA, VOWELS};
use crate::norm::{replace_punctuation, PUNCTUATIONS};
use jpreprocess::{kind, DefaultTokenizer, JPreprocess, SystemDictionaryConfig, UserDictionary};
use once_cell::sync::Lazy;
use regex::Regex;
use std::cmp::Reverse;
use std::collections::HashSet;
use std::sync::Arc;
type JPreprocessType = JPreprocess<DefaultTokenizer>;
#[cfg(feature = "agpl_dict")]
fn agpl_dict() -> Result<Option<UserDictionary>> {
Ok(Some(
UserDictionary::load(include_bytes!(concat!(env!("OUT_DIR"), "/all.bin")))
.map_err(|e| Error::LinderaError(e.to_string()))?,
))
}
#[cfg(not(feature = "agpl_dict"))]
fn agpl_dict() -> Result<Option<UserDictionary>> {
Ok(None)
}
fn initialize_jtalk() -> Result<JPreprocessType> {
let sdic =
SystemDictionaryConfig::Bundled(kind::JPreprocessDictionaryKind::NaistJdic).load()?;
let jpreprocess = JPreprocess::with_dictionaries(sdic, agpl_dict()?);
Ok(jpreprocess)
}
macro_rules! hash_set {
($($elem:expr),* $(,)?) => {{
let mut set = HashSet::new();
$(
set.insert($elem);
)*
set
}};
}
pub struct JTalk {
pub jpreprocess: Arc<JPreprocessType>,
}
impl JTalk {
pub fn new() -> Result<Self> {
let jpreprocess = Arc::new(initialize_jtalk()?);
Ok(Self { jpreprocess })
}
pub fn num2word(&self, text: &str) -> Result<String> {
let mut parsed = self.jpreprocess.text_to_njd(text)?;
parsed.preprocess();
let texts: Vec<String> = parsed
.nodes
.iter()
.map(|x| x.get_string().to_string())
.collect();
Ok(texts.join(""))
}
pub fn process_text(&self, text: &str) -> Result<JTalkProcess> {
let parsed = self.jpreprocess.run_frontend(text)?;
let jtalk_process = JTalkProcess::new(Arc::clone(&self.jpreprocess), parsed);
Ok(jtalk_process)
}
}
static KATAKANA_PATTERN: Lazy<Regex> = Lazy::new(|| Regex::new(r"[\u30A0-\u30FF]+").unwrap());
static MORA_PATTERN: Lazy<Vec<String>> = Lazy::new(|| {
let mut sorted_keys: Vec<String> = MORA_KATA_TO_MORA_PHONEMES.keys().cloned().collect();
sorted_keys.sort_by_key(|b| Reverse(b.len()));
sorted_keys
});
static LONG_PATTERN: Lazy<Regex> = Lazy::new(|| Regex::new(r"(\w)(ー*)").unwrap());
fn phone_tone_to_kana(phones: Vec<String>, tones: Vec<i32>) -> Vec<(String, i32)> {
let phones = &phones[1..];
let tones = &tones[1..];
let mut results = Vec::new();
let mut current_mora = String::new();
for ((phone, _next_phone), (&tone, &next_tone)) in phones
.iter()
.zip(phones.iter().skip(1))
.zip(tones.iter().zip(tones.iter().skip(1)))
{
if PUNCTUATIONS.contains(&phone.clone().as_str()) {
results.push((phone.to_string(), tone));
continue;
}
if CONSONANTS.contains(&phone.clone()) {
assert_eq!(current_mora, "");
assert_eq!(tone, next_tone);
current_mora = phone.to_string()
} else {
current_mora += phone;
let kana = MORA_PHONEMES_TO_MORA_KATA.get(&current_mora).unwrap();
results.push((kana.to_string(), tone));
current_mora = String::new();
}
}
results
}
pub struct JTalkProcess {
jpreprocess: Arc<JPreprocessType>,
parsed: Vec<String>,
}
impl JTalkProcess {
fn new(jpreprocess: Arc<JPreprocessType>, parsed: Vec<String>) -> Self {
Self {
jpreprocess,
parsed,
}
}
fn fix_phone_tone(&self, phone_tone_list: Vec<(String, i32)>) -> Result<Vec<(String, i32)>> {
let tone_values: HashSet<i32> = phone_tone_list
.iter()
.map(|(_letter, tone)| *tone)
.collect();
if tone_values.len() == 1 {
assert!(tone_values == hash_set![0], "{tone_values:?}");
Ok(phone_tone_list)
} else if tone_values.len() == 2 {
if tone_values == hash_set![0, 1] {
Ok(phone_tone_list)
} else if tone_values == hash_set![-1, 0] {
Ok(phone_tone_list
.iter()
.map(|x| {
let new_tone = if x.1 == -1 { 0 } else { 1 };
(x.0.clone(), new_tone)
})
.collect())
} else {
Err(Error::ValueError("Invalid tone values 0".to_string()))
}
} else {
Err(Error::ValueError("Invalid tone values 1".to_string()))
}
}
pub fn g2p(&self) -> Result<(Vec<String>, Vec<i32>, Vec<i32>)> {
let phone_tone_list_wo_punct = self.g2phone_tone_wo_punct()?;
let (seq_text, seq_kata) = self.text_to_seq_kata()?;
let sep_phonemes = JTalkProcess::handle_long(
seq_kata
.iter()
.map(|x| JTalkProcess::kata_to_phoneme_list(x.clone()).unwrap())
.collect(),
);
let phone_w_punct: Vec<String> = sep_phonemes
.iter()
.flat_map(|x| x.iter())
.cloned()
.collect();
let mut phone_tone_list =
JTalkProcess::align_tones(phone_w_punct, phone_tone_list_wo_punct)?;
let mut sep_tokenized: Vec<Vec<String>> = Vec::new();
for seq_text_item in &seq_text {
let text = seq_text_item.clone();
if !PUNCTUATIONS.contains(&text.as_str()) {
sep_tokenized.push(text.chars().map(|x| x.to_string()).collect());
} else {
sep_tokenized.push(vec![text]);
}
}
let mut word2ph = Vec::new();
for (token, phoneme) in sep_tokenized.iter().zip(sep_phonemes.iter()) {
let phone_len = phoneme.len() as i32;
let word_len = token.len() as i32;
word2ph.append(&mut JTalkProcess::distribute_phone(phone_len, word_len));
}
let mut new_phone_tone_list = vec![("_".to_string(), 0)];
new_phone_tone_list.append(&mut phone_tone_list);
new_phone_tone_list.push(("_".to_string(), 0));
let mut new_word2ph = vec![1];
new_word2ph.extend(word2ph.clone());
new_word2ph.push(1);
let phones: Vec<String> = new_phone_tone_list.iter().map(|(x, _)| x.clone()).collect();
let tones: Vec<i32> = new_phone_tone_list.iter().map(|(_, x)| *x).collect();
Ok((phones, tones, new_word2ph))
}
pub fn g2kana_tone(&self) -> Result<Vec<(String, i32)>> {
let (phones, tones, _) = self.g2p()?;
Ok(phone_tone_to_kana(phones, tones))
}
fn distribute_phone(n_phone: i32, n_word: i32) -> Vec<i32> {
let mut phones_per_word = vec![0; n_word as usize];
for _ in 0..n_phone {
let min_task = phones_per_word.iter().min().unwrap();
let min_index = phones_per_word
.iter()
.position(|&x| x == *min_task)
.unwrap();
phones_per_word[min_index] += 1;
}
phones_per_word
}
fn align_tones(
phone_with_punct: Vec<String>,
phone_tone_list: Vec<(String, i32)>,
) -> Result<Vec<(String, i32)>> {
let mut result: Vec<(String, i32)> = Vec::new();
let mut tone_index = 0;
for phone in phone_with_punct.clone() {
if tone_index >= phone_tone_list.len() {
result.push((phone, 0));
} else if phone == phone_tone_list[tone_index].0 {
result.push((phone, phone_tone_list[tone_index].1));
tone_index += 1;
} else if PUNCTUATIONS.contains(&phone.as_str()) {
result.push((phone, 0));
} else {
println!("phones {phone_with_punct:?}");
println!("phone_tone_list: {phone_tone_list:?}");
println!("result: {result:?}");
println!("tone_index: {tone_index:?}");
println!("phone: {phone:?}");
return Err(Error::ValueError(format!("Mismatched phoneme: {phone}")));
}
}
Ok(result)
}
fn handle_long(mut sep_phonemes: Vec<Vec<String>>) -> Vec<Vec<String>> {
for i in 0..sep_phonemes.len() {
if sep_phonemes[i].is_empty() {
continue;
}
if sep_phonemes[i][0] == "ー" {
if i != 0 {
let prev_phoneme = sep_phonemes[i - 1].last().unwrap();
if VOWELS.contains(&prev_phoneme.as_str()) {
sep_phonemes[i][0] = prev_phoneme.clone();
} else {
sep_phonemes[i][0] = "ー".to_string();
}
} else {
sep_phonemes[i][0] = "ー".to_string();
}
}
if sep_phonemes[i].contains(&"ー".to_string()) {
for e in 0..sep_phonemes[i].len() {
if sep_phonemes[i][e] == "ー" {
sep_phonemes[i][e] =
sep_phonemes[i][e - 1].chars().last().unwrap().to_string();
}
}
}
}
sep_phonemes
}
fn kata_to_phoneme_list(mut text: String) -> Result<Vec<String>> {
let chars: HashSet<String> = text.chars().map(|x| x.to_string()).collect();
if chars.is_subset(&HashSet::from_iter(
PUNCTUATIONS.iter().map(|x| x.to_string()),
)) {
return Ok(text.chars().map(|x| x.to_string()).collect());
}
if !KATAKANA_PATTERN.is_match(&text) {
return Err(Error::ValueError(format!(
"Input must be katakana only: {text}"
)));
}
for mora in MORA_PATTERN.iter() {
let mora = mora.to_string();
let (consonant, vowel) = MORA_KATA_TO_MORA_PHONEMES.get(&mora).unwrap();
if consonant.is_none() {
text = text.replace(&mora, &format!(" {vowel}"));
} else {
text = text.replace(
&mora,
&format!(" {} {}", consonant.as_ref().unwrap(), vowel),
);
}
}
let long_replacement = |m: &regex::Captures| {
let result = m.get(1).unwrap().as_str().to_string();
let mut second = String::new();
for _ in 0..m.get(2).unwrap().as_str().char_indices().count() {
second += &format!(" {}", m.get(1).unwrap().as_str());
}
result + &second
};
text = LONG_PATTERN
.replace_all(&text, long_replacement)
.to_string();
let data = text.trim().split(' ').map(|x| x.to_string()).collect();
Ok(data)
}
pub fn text_to_seq_kata(&self) -> Result<(Vec<String>, Vec<String>)> {
let mut seq_kata = vec![];
let mut seq_text = vec![];
for parts in &self.parsed {
let (string, pron) = self.parse_to_string_and_pron(parts.clone());
let mut yomi = pron.replace('’', "");
let word = replace_punctuation(string);
assert!(!yomi.is_empty(), "Empty yomi: {word}");
if yomi == "、" {
if !word
.chars()
.all(|x| PUNCTUATIONS.contains(&x.to_string().as_str()))
{
yomi = "'".repeat(word.len());
} else {
yomi = word.clone();
}
} else if yomi == "？" {
assert!(word == "?", "yomi `？` comes from: {word}");
yomi = "?".to_string();
}
seq_text.push(word);
seq_kata.push(yomi);
}
Ok((seq_text, seq_kata))
}
fn parse_to_string_and_pron(&self, parts: String) -> (String, String) {
let part_lists: Vec<String> = parts.split(',').map(|x| x.to_string()).collect();
(part_lists[0].clone(), part_lists[9].clone())
}
fn g2phone_tone_wo_punct(&self) -> Result<Vec<(String, i32)>> {
let prosodies = self.g2p_prosody()?;
let mut results: Vec<(String, i32)> = Vec::new();
let mut current_phrase: Vec<(String, i32)> = Vec::new();
let mut current_tone = 0;
for (i, letter) in prosodies.iter().enumerate() {
if letter == "^" {
assert!(i == 0);
} else if ["$", "?", "_", "#"].contains(&letter.as_str()) {
results.extend(self.fix_phone_tone(current_phrase.clone())?);
if ["$", "?"].contains(&letter.as_str()) {
assert!(i == prosodies.len() - 1);
}
current_phrase = Vec::new();
current_tone = 0;
} else if letter == "[" {
current_tone += 1;
} else if letter == "]" {
current_tone -= 1;
} else {
let new_letter = if letter == "cl" {
"q".to_string()
} else {
letter.clone()
};
current_phrase.push((new_letter, current_tone));
}
}
Ok(results)
}
fn g2p_prosody(&self) -> Result<Vec<String>> {
let labels = self.jpreprocess.make_label(self.parsed.clone());
let mut phones: Vec<String> = Vec::new();
for (i, label) in labels.iter().enumerate() {
let mut p3 = label.phoneme.c.clone().unwrap();
if "AIUEO".contains(&p3) {
// Lowercase the character
p3 = p3.to_lowercase();
}
if p3 == "sil" {
assert!(i == 0 || i == labels.len() - 1);
if i == 0 {
phones.push("^".to_string());
} else if i == labels.len() - 1 {
let e3 = label.accent_phrase_prev.clone().unwrap().is_interrogative;
if e3 {
phones.push("$".to_string());
} else {
phones.push("?".to_string());
}
}
continue;
} else if p3 == "pau" {
phones.push("_".to_string());
continue;
} else {
phones.push(p3.clone());
}
let a1 = if let Some(mora) = &label.mora {
mora.relative_accent_position as i32
} else {
-50
};
let a2 = if let Some(mora) = &label.mora {
mora.position_forward as i32
} else {
-50
};
let a3 = if let Some(mora) = &label.mora {
mora.position_backward as i32
} else {
-50
};
let f1 = if let Some(accent_phrase) = &label.accent_phrase_curr {
accent_phrase.mora_count as i32
} else {
-50
};
let a2_next = if let Some(mora) = &labels[i + 1].mora {
mora.position_forward as i32
} else {
-50
};
if a3 == 1 && a2_next == 1 && "aeiouAEIOUNcl".contains(&p3) {
phones.push("#".to_string());
} else if a1 == 0 && a2_next == a2 + 1 && a2 != f1 {
phones.push("]".to_string());
} else if a2 == 1 && a2_next == 2 {
phones.push("[".to_string());
}
}
Ok(phones)
}
}
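A minimal sketch of driving the g2p pipeline above directly (assumes the crate's default features so the bundled dictionary is available; the input sentence is arbitrary):

```rust
use sbv2_core::jtalk::JTalk;

fn main() -> anyhow::Result<()> {
    let jtalk = JTalk::new()?;
    let process = jtalk.process_text("今日はいい天気ですね")?;
    // Phonemes, tones, and the word-to-phoneme distribution used by the TTS front end.
    let (phones, tones, word2ph) = process.g2p()?;
    println!("phones:  {phones:?}");
    println!("tones:   {tones:?}");
    println!("word2ph: {word2ph:?}");
    // The same information folded back into kana/tone pairs.
    let kana_tone = process.g2kana_tone()?;
    println!("kana:    {kana_tone:?}");
    Ok(())
}
```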

View File

@@ -1,11 +1,16 @@
#[cfg(feature = "std")]
pub mod bert;
pub mod error;
pub mod jtalk;
#[cfg(feature = "std")]
pub mod model;
pub mod mora;
pub mod nlp;
pub mod norm;
pub mod sbv2file;
pub mod style;
pub mod tokenizer;
#[cfg(feature = "std")]
pub mod tts;
pub mod tts_util;
pub mod utils;

View File

@@ -0,0 +1,48 @@
use std::env;
use std::fs;
#[cfg(feature = "std")]
fn main_inner() -> anyhow::Result<()> {
use sbv2_core::tts;
dotenvy::dotenv_override().ok();
env_logger::init();
let text = "今日の天気は快晴です。";
let ident = "aaa";
let mut tts_holder = tts::TTSModelHolder::new(
&fs::read(env::var("BERT_MODEL_PATH")?)?,
&fs::read(env::var("TOKENIZER_PATH")?)?,
env::var("HOLDER_MAX_LOADED_MODElS")
.ok()
.and_then(|x| x.parse().ok()),
)?;
let mp = env::var("MODEL_PATH")?;
let b = fs::read(&mp)?;
#[cfg(not(feature = "aivmx"))]
{
tts_holder.load_sbv2file(ident, b)?;
}
#[cfg(feature = "aivmx")]
{
if mp.ends_with(".sbv2") {
tts_holder.load_sbv2file(ident, b)?;
} else {
tts_holder.load_aivmx(ident, b)?;
}
}
let audio = tts_holder.easy_synthesize(ident, text, 0, 0, tts::SynthesizeOptions::default())?;
fs::write("output.wav", audio)?;
Ok(())
}
#[cfg(not(feature = "std"))]
fn main_inner() -> anyhow::Result<()> {
Ok(())
}
fn main() {
if let Err(e) = main_inner() {
println!("Error: {e}");
}
}

View File

@@ -0,0 +1,106 @@
use crate::error::Result;
use ndarray::{array, Array1, Array2, Array3, Axis, Ix3};
use ort::session::{builder::GraphOptimizationLevel, Session};
#[allow(clippy::vec_init_then_push, unused_variables)]
pub fn load_model<P: AsRef<[u8]>>(model_file: P, bert: bool) -> Result<Session> {
let mut exp = Vec::new();
#[cfg(feature = "tensorrt")]
{
if bert {
exp.push(
ort::execution_providers::TensorRTExecutionProvider::default()
.with_fp16(true)
.with_profile_min_shapes("input_ids:1x1,attention_mask:1x1")
.with_profile_max_shapes("input_ids:1x100,attention_mask:1x100")
.with_profile_opt_shapes("input_ids:1x25,attention_mask:1x25")
.build(),
);
}
}
#[cfg(feature = "cuda")]
{
#[allow(unused_mut)]
let mut cuda = ort::execution_providers::CUDAExecutionProvider::default();
#[cfg(feature = "cuda_tf32")]
{
cuda = cuda.with_tf32(true);
}
exp.push(cuda.build());
}
#[cfg(feature = "directml")]
{
exp.push(ort::execution_providers::DirectMLExecutionProvider::default().build());
}
#[cfg(feature = "coreml")]
{
exp.push(ort::execution_providers::CoreMLExecutionProvider::default().build());
}
exp.push(ort::execution_providers::CPUExecutionProvider::default().build());
Ok(Session::builder()?
.with_execution_providers(exp)?
.with_optimization_level(GraphOptimizationLevel::Level3)?
.with_intra_threads(num_cpus::get_physical())?
.with_parallel_execution(true)?
.with_inter_threads(num_cpus::get_physical())?
.commit_from_memory(model_file.as_ref())?)
}
#[allow(clippy::too_many_arguments)]
pub fn synthesize(
session: &mut Session,
bert_ori: Array2<f32>,
x_tst: Array1<i64>,
mut spk_ids: Array1<i64>,
tones: Array1<i64>,
lang_ids: Array1<i64>,
style_vector: Array1<f32>,
sdp_ratio: f32,
length_scale: f32,
noise_scale: f32,
noise_scale_w: f32,
) -> Result<Array3<f32>> {
let bert_ori = bert_ori.insert_axis(Axis(0));
let bert_ori = bert_ori.as_standard_layout();
let bert = ort::value::TensorRef::from_array_view(&bert_ori)?;
let mut x_tst_lengths = array![x_tst.shape()[0] as i64];
let x_tst_lengths = ort::value::TensorRef::from_array_view(&mut x_tst_lengths)?;
let mut x_tst = x_tst.insert_axis(Axis(0));
let x_tst = ort::value::TensorRef::from_array_view(&mut x_tst)?;
let mut lang_ids = lang_ids.insert_axis(Axis(0));
let lang_ids = ort::value::TensorRef::from_array_view(&mut lang_ids)?;
let mut tones = tones.insert_axis(Axis(0));
let tones = ort::value::TensorRef::from_array_view(&mut tones)?;
let mut style_vector = style_vector.insert_axis(Axis(0));
let style_vector = ort::value::TensorRef::from_array_view(&mut style_vector)?;
let sid = ort::value::TensorRef::from_array_view(&mut spk_ids)?;
let sdp_ratio = vec![sdp_ratio];
let sdp_ratio = ort::value::TensorRef::from_array_view((vec![1_i64], sdp_ratio.as_slice()))?;
let length_scale = vec![length_scale];
let length_scale =
ort::value::TensorRef::from_array_view((vec![1_i64], length_scale.as_slice()))?;
let noise_scale = vec![noise_scale];
let noise_scale =
ort::value::TensorRef::from_array_view((vec![1_i64], noise_scale.as_slice()))?;
let noise_scale_w = vec![noise_scale_w];
let noise_scale_w =
ort::value::TensorRef::from_array_view((vec![1_i64], noise_scale_w.as_slice()))?;
let outputs = session.run(ort::inputs! {
"x_tst" => x_tst,
"x_tst_lengths" => x_tst_lengths,
"sid" => sid,
"tones" => tones,
"language" => lang_ids,
"bert" => bert,
"style_vec" => style_vector,
"sdp_ratio" => sdp_ratio,
"length_scale" => length_scale,
"noise_scale" => noise_scale,
"noise_scale_w" => noise_scale_w,
})?;
let audio_array = outputs["output"]
.try_extract_array::<f32>()?
.into_dimensionality::<Ix3>()?
.to_owned();
Ok(audio_array)
}

View File

@@ -25,6 +25,21 @@ static MORA_LIST_ADDITIONAL: Lazy<Vec<Mora>> = Lazy::new(|| {
data.additional
});
pub static MORA_PHONEMES_TO_MORA_KATA: Lazy<HashMap<String, String>> = Lazy::new(|| {
let mut map = HashMap::new();
for mora in MORA_LIST_MINIMUM.iter() {
map.insert(
format!(
"{}{}",
mora.consonant.clone().unwrap_or("".to_string()),
mora.vowel
),
mora.mora.clone(),
);
}
map
});
pub static MORA_KATA_TO_MORA_PHONEMES: Lazy<HashMap<String, (Option<String>, String)>> =
Lazy::new(|| {
let mut map = HashMap::new();
@@ -37,4 +52,12 @@ pub static MORA_KATA_TO_MORA_PHONEMES: Lazy<HashMap<String, (Option<String>, Str
map
});
pub static CONSONANTS: Lazy<Vec<String>> = Lazy::new(|| {
let consonants = MORA_KATA_TO_MORA_PHONEMES
.values()
.filter_map(|(consonant, _)| consonant.clone())
.collect::<Vec<_>>();
consonants
});
pub const VOWELS: [&str; 6] = ["a", "i", "u", "e", "o", "N"];

View File

@@ -0,0 +1,37 @@
use std::io::{Cursor, Read};
use tar::Archive;
use zstd::decode_all;
use crate::error::{Error, Result};
/// Parse a .sbv2 file binary
///
/// # Examples
///
/// ```rs
/// parse_sbv2file("tsukuyomi", std::fs::read("tsukuyomi.sbv2")?)?;
/// ```
pub fn parse_sbv2file<P: AsRef<[u8]>>(sbv2_bytes: P) -> Result<(Vec<u8>, Vec<u8>)> {
let mut arc = Archive::new(Cursor::new(decode_all(Cursor::new(sbv2_bytes.as_ref()))?));
let mut vits2 = None;
let mut style_vectors = None;
let mut et = arc.entries()?;
while let Some(Ok(mut e)) = et.next() {
let pth = String::from_utf8_lossy(&e.path_bytes()).to_string();
let mut b = Vec::with_capacity(e.size() as usize);
e.read_to_end(&mut b)?;
match pth.as_str() {
"model.onnx" => vits2 = Some(b),
"style_vectors.json" => style_vectors = Some(b),
_ => continue,
}
}
if style_vectors.is_none() {
return Err(Error::ModelNotFoundError("style_vectors".to_string()));
}
if vits2.is_none() {
return Err(Error::ModelNotFoundError("vits2".to_string()));
}
Ok((style_vectors.unwrap(), vits2.unwrap()))
}
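A slightly fuller version of the doc-comment example above; the file name is hypothetical:

```rust
use sbv2_core::sbv2file::parse_sbv2file;

fn main() -> anyhow::Result<()> {
    let bytes = std::fs::read("tsukuyomi.sbv2")?;
    // A .sbv2 file is a zstd-compressed tar holding model.onnx and style_vectors.json.
    let (style_vectors, vits2) = parse_sbv2file(bytes)?;
    println!("style_vectors.json: {} bytes", style_vectors.len());
    println!("model.onnx: {} bytes", vits2.len());
    Ok(())
}
```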

View File

@@ -1,4 +1,4 @@
use crate::error::Result;
use crate::error::{Error, Result};
use ndarray::{s, Array1, Array2};
use serde::Deserialize;
@@ -21,6 +21,18 @@ pub fn get_style_vector(
style_id: i32,
weight: f32,
) -> Result<Array1<f32>> {
if style_vectors.shape().len() != 2 {
return Err(Error::StyleError(
"Invalid shape for style vectors".to_string(),
));
}
if style_id < 0 || style_id >= style_vectors.shape()[0] as i32 {
return Err(Error::StyleError(format!(
"Invalid style ID: {}. Max ID: {}",
style_id,
style_vectors.shape()[0] - 1
)));
}
let mean = style_vectors.slice(s![0, ..]).to_owned();
let style_vector = style_vectors.slice(s![style_id as usize, ..]).to_owned();
let diff = (style_vector - &mean) * weight;
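The hunk ends mid-function, but the arithmetic is visible: row 0 holds the mean style, and the result is presumably `mean + (style - mean) * weight`. A standalone sketch of that math under this assumption, with illustrative numbers:

```rust
use ndarray::{array, Array1, Array2};

fn main() {
    // Row 0 is the mean style vector; later rows are individual styles.
    let style_vectors: Array2<f32> = array![[0.0, 0.0], [1.0, 2.0]];
    let (style_id, weight) = (1usize, 0.5f32);
    let mean: Array1<f32> = style_vectors.row(0).to_owned();
    let style: Array1<f32> = style_vectors.row(style_id).to_owned();
    let diff = (style - &mean) * weight;
    // With weight 0.5, the result sits halfway between the mean and the style.
    assert_eq!(mean + diff, array![0.5, 1.0]);
}
```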

View File

@@ -1,5 +1,5 @@
use crate::error::Result;
use tokenizers::Tokenizer;
pub use tokenizers::Tokenizer;
pub fn get_tokenizer<P: AsRef<[u8]>>(p: P) -> Result<Tokenizer> {
let tokenizer = Tokenizer::from_bytes(p)?;

470
crates/sbv2_core/src/tts.rs Normal file
View File

@@ -0,0 +1,470 @@
use crate::error::{Error, Result};
use crate::{jtalk, model, style, tokenizer, tts_util};
#[cfg(feature = "aivmx")]
use base64::prelude::{Engine as _, BASE64_STANDARD};
#[cfg(feature = "aivmx")]
use ndarray::ShapeBuilder;
use ndarray::{concatenate, Array1, Array2, Array3, Axis};
use ort::session::Session;
#[cfg(feature = "aivmx")]
use std::io::Cursor;
use tokenizers::Tokenizer;
#[derive(PartialEq, Eq, Clone)]
pub struct TTSIdent(String);
impl std::fmt::Display for TTSIdent {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
f.write_str(&self.0)?;
Ok(())
}
}
impl<S> From<S> for TTSIdent
where
S: AsRef<str>,
{
fn from(value: S) -> Self {
TTSIdent(value.as_ref().to_string())
}
}
pub struct TTSModel {
vits2: Option<Session>,
style_vectors: Array2<f32>,
ident: TTSIdent,
bytes: Option<Vec<u8>>,
}
/// High-level Style-Bert-VITS2's API
pub struct TTSModelHolder {
tokenizer: Tokenizer,
bert: Session,
models: Vec<TTSModel>,
pub jtalk: jtalk::JTalk,
max_loaded_models: Option<usize>,
}
impl TTSModelHolder {
/// Initialize a new TTSModelHolder
///
/// # Examples
///
/// ```rs
/// let mut tts_holder = TTSModelHolder::new(std::fs::read("deberta.onnx")?, std::fs::read("tokenizer.json")?, None)?;
/// ```
pub fn new<P: AsRef<[u8]>>(
bert_model_bytes: P,
tokenizer_bytes: P,
max_loaded_models: Option<usize>,
) -> Result<Self> {
let bert = model::load_model(bert_model_bytes, true)?;
let jtalk = jtalk::JTalk::new()?;
let tokenizer = tokenizer::get_tokenizer(tokenizer_bytes)?;
Ok(TTSModelHolder {
bert,
models: vec![],
jtalk,
tokenizer,
max_loaded_models,
})
}
/// Return a list of model names
pub fn models(&self) -> Vec<String> {
self.models.iter().map(|m| m.ident.to_string()).collect()
}
#[cfg(feature = "aivmx")]
pub fn load_aivmx<I: Into<TTSIdent>, P: AsRef<[u8]>>(
&mut self,
ident: I,
aivmx_bytes: P,
) -> Result<()> {
let ident = ident.into();
if self.find_model(ident.clone()).is_err() {
let mut load = true;
if let Some(max) = self.max_loaded_models {
if self.models.iter().filter(|x| x.vits2.is_some()).count() >= max {
load = false;
}
}
let model = model::load_model(&aivmx_bytes, false)?;
let metadata = model.metadata()?;
if let Some(aivm_style_vectors) = metadata.custom("aivm_style_vectors")? {
let aivm_style_vectors = BASE64_STANDARD.decode(aivm_style_vectors)?;
let style_vectors = Cursor::new(&aivm_style_vectors);
let reader = npyz::NpyFile::new(style_vectors)?;
let style_vectors = {
let shape = reader.shape().to_vec();
let order = reader.order();
let data = reader.into_vec::<f32>()?;
let shape = match shape[..] {
[i1, i2] => [i1 as usize, i2 as usize],
_ => panic!("expected 2D array"),
};
let true_shape = shape.set_f(order == npyz::Order::Fortran);
ndarray::Array2::from_shape_vec(true_shape, data)?
};
drop(metadata);
self.models.push(TTSModel {
vits2: if load { Some(model) } else { None },
bytes: if self.max_loaded_models.is_some() {
Some(aivmx_bytes.as_ref().to_vec())
} else {
None
},
ident,
style_vectors,
})
}
}
Ok(())
}
/// Load a .sbv2 file binary
///
/// # Examples
///
/// ```rs
/// tts_holder.load_sbv2file("tsukuyomi", std::fs::read("tsukuyomi.sbv2")?)?;
/// ```
pub fn load_sbv2file<I: Into<TTSIdent>, P: AsRef<[u8]>>(
&mut self,
ident: I,
sbv2_bytes: P,
) -> Result<()> {
let (style_vectors, vits2) = crate::sbv2file::parse_sbv2file(sbv2_bytes)?;
self.load(ident, style_vectors, vits2)?;
Ok(())
}
/// Load a style vector and onnx model binary
///
/// # Examples
///
/// ```rs
/// tts_holder.load("tsukuyomi", std::fs::read("style_vectors.json")?, std::fs::read("model.onnx")?)?;
/// ```
pub fn load<I: Into<TTSIdent>, P: AsRef<[u8]>>(
&mut self,
ident: I,
style_vectors_bytes: P,
vits2_bytes: P,
) -> Result<()> {
let ident = ident.into();
if self.find_model(ident.clone()).is_err() {
let mut load = true;
if let Some(max) = self.max_loaded_models {
if self.models.iter().filter(|x| x.vits2.is_some()).count() >= max {
load = false;
}
}
self.models.push(TTSModel {
vits2: if load {
Some(model::load_model(&vits2_bytes, false)?)
} else {
None
},
style_vectors: style::load_style(style_vectors_bytes)?,
ident,
bytes: if self.max_loaded_models.is_some() {
Some(vits2_bytes.as_ref().to_vec())
} else {
None
},
})
}
Ok(())
}
/// Unload a model
pub fn unload<I: Into<TTSIdent>>(&mut self, ident: I) -> bool {
let ident = ident.into();
if let Some((i, _)) = self
.models
.iter()
.enumerate()
.find(|(_, m)| m.ident == ident)
{
self.models.remove(i);
true
} else {
false
}
}
/// Parse text and return the input for synthesize
///
/// # Note
/// This function is for low-level usage, use `easy_synthesize` for high-level usage.
#[allow(clippy::type_complexity)]
pub fn parse_text(
&mut self,
text: &str,
) -> Result<(Array2<f32>, Array1<i64>, Array1<i64>, Array1<i64>)> {
crate::tts_util::parse_text_blocking(
text,
None,
&self.jtalk,
&self.tokenizer,
|token_ids, attention_masks| {
crate::bert::predict(&mut self.bert, token_ids, attention_masks)
},
)
}
#[allow(clippy::type_complexity)]
pub fn parse_text_neo(
&mut self,
text: String,
given_tones: Option<Vec<i32>>,
) -> Result<(Array2<f32>, Array1<i64>, Array1<i64>, Array1<i64>)> {
crate::tts_util::parse_text_blocking(
&text,
given_tones,
&self.jtalk,
&self.tokenizer,
|token_ids, attention_masks| {
crate::bert::predict(&mut self.bert, token_ids, attention_masks)
},
)
}
fn find_model<I: Into<TTSIdent>>(&mut self, ident: I) -> Result<&mut TTSModel> {
let ident = ident.into();
self.models
.iter_mut()
.find(|m| m.ident == ident)
.ok_or(Error::ModelNotFoundError(ident.to_string()))
}
fn find_and_load_model<I: Into<TTSIdent>>(&mut self, ident: I) -> Result<bool> {
let ident = ident.into();
// Locate target model entry
let target_index = self
.models
.iter()
.position(|m| m.ident == ident)
.ok_or(Error::ModelNotFoundError(ident.to_string()))?;
// Already loaded
if self.models[target_index].vits2.is_some() {
return Ok(true);
}
// Get bytes to build a Session
let bytes = self.models[target_index]
.bytes
.clone()
.ok_or(Error::ModelNotFoundError(ident.to_string()))?;
// Enforce max loaded models by evicting a different loaded model's session, not removing the entry
if let Some(max) = self.max_loaded_models {
let loaded_count = self.models.iter().filter(|m| m.vits2.is_some()).count();
if loaded_count >= max {
if let Some(evict_index) = self
.models
.iter()
.position(|m| m.vits2.is_some() && m.ident != ident)
{
// Drop only the session to free memory; keep bytes/style for future reload
self.models[evict_index].vits2 = None;
}
}
}
// Build and set session in-place for the target model
let s = model::load_model(&bytes, false)?;
self.models[target_index].vits2 = Some(s);
Ok(true)
}
/// Get style vector by style id and weight
///
/// # Note
/// This function is for low-level usage, use `easy_synthesize` for high-level usage.
pub fn get_style_vector<I: Into<TTSIdent>>(
&mut self,
ident: I,
style_id: i32,
weight: f32,
) -> Result<Array1<f32>> {
style::get_style_vector(&self.find_model(ident)?.style_vectors, style_id, weight)
}
/// Synthesize text to audio
///
/// # Examples
///
/// ```rs
/// let audio = tts_holder.easy_synthesize("tsukuyomi", "こんにちは", 0, SynthesizeOptions::default())?;
/// ```
pub fn easy_synthesize<I: Into<TTSIdent> + Copy>(
&mut self,
ident: I,
text: &str,
style_id: i32,
speaker_id: i64,
options: SynthesizeOptions,
) -> Result<Vec<u8>> {
self.find_and_load_model(ident)?;
let style_vector = self.get_style_vector(ident, style_id, options.style_weight)?;
let audio_array = if options.split_sentences {
let texts: Vec<&str> = text.split('\n').collect();
let mut audios = vec![];
for (i, t) in texts.iter().enumerate() {
if t.is_empty() {
continue;
}
let (bert_ori, phones, tones, lang_ids) = self.parse_text(t)?;
let vits2 = self
.find_model(ident)?
.vits2
.as_mut()
.ok_or(Error::ModelNotFoundError(ident.into().to_string()))?;
let audio = model::synthesize(
vits2,
bert_ori.to_owned(),
phones,
Array1::from_vec(vec![speaker_id]),
tones,
lang_ids,
style_vector.clone(),
options.sdp_ratio,
options.length_scale,
0.677, // noise_scale
0.8,   // noise_scale_w
)?;
audios.push(audio.clone());
if i != texts.len() - 1 {
audios.push(Array3::zeros((1, 1, 22050))); // 0.5 s of silence (22050 samples at 44.1 kHz) between sentences
}
}
concatenate(
Axis(2),
&audios.iter().map(|x| x.view()).collect::<Vec<_>>(),
)?
} else {
let (bert_ori, phones, tones, lang_ids) = self.parse_text(text)?;
let vits2 = self
.find_model(ident)?
.vits2
.as_mut()
.ok_or(Error::ModelNotFoundError(ident.into().to_string()))?;
model::synthesize(
vits2,
bert_ori.to_owned(),
phones,
Array1::from_vec(vec![speaker_id]),
tones,
lang_ids,
style_vector,
options.sdp_ratio,
options.length_scale,
0.677, // noise_scale
0.8,   // noise_scale_w
)?
};
tts_util::array_to_vec(audio_array)
}
pub fn easy_synthesize_neo<I: Into<TTSIdent> + Copy>(
&mut self,
ident: I,
text: &str,
given_tones: Option<Vec<i32>>,
style_id: i32,
speaker_id: i64,
options: SynthesizeOptions,
) -> Result<Vec<u8>> {
self.find_and_load_model(ident)?;
let style_vector = self.get_style_vector(ident, style_id, options.style_weight)?;
let audio_array = if options.split_sentences {
let texts: Vec<&str> = text.split('\n').collect();
let mut audios = vec![];
for (i, t) in texts.iter().enumerate() {
if t.is_empty() {
continue;
}
let (bert_ori, phones, tones, lang_ids) =
self.parse_text_neo(t.to_string(), given_tones.clone())?;
let vits2 = self
.find_model(ident)?
.vits2
.as_mut()
.ok_or(Error::ModelNotFoundError(ident.into().to_string()))?;
let audio = model::synthesize(
vits2,
bert_ori.to_owned(),
phones,
Array1::from_vec(vec![speaker_id]),
tones,
lang_ids,
style_vector.clone(),
options.sdp_ratio,
options.length_scale,
0.677, // noise_scale
0.8,   // noise_scale_w
)?;
audios.push(audio.clone());
if i != texts.len() - 1 {
audios.push(Array3::zeros((1, 1, 22050))); // 0.5 s of silence (22050 samples at 44.1 kHz) between sentences
}
}
concatenate(
Axis(2),
&audios.iter().map(|x| x.view()).collect::<Vec<_>>(),
)?
} else {
// Use parse_text_neo here as well so given_tones still apply when split_sentences is false.
let (bert_ori, phones, tones, lang_ids) = self.parse_text_neo(text.to_string(), given_tones)?;
let vits2 = self
.find_model(ident)?
.vits2
.as_mut()
.ok_or(Error::ModelNotFoundError(ident.into().to_string()))?;
model::synthesize(
vits2,
bert_ori.to_owned(),
phones,
Array1::from_vec(vec![speaker_id]),
tones,
lang_ids,
style_vector,
options.sdp_ratio,
options.length_scale,
0.677, // noise_scale
0.8,   // noise_scale_w
)?
};
tts_util::array_to_vec(audio_array)
}
}
/// Synthesize options
///
/// # Fields
/// - `sdp_ratio`: SDP ratio
/// - `length_scale`: Length scale
/// - `style_weight`: Style weight
/// - `split_sentences`: Split sentences
pub struct SynthesizeOptions {
pub sdp_ratio: f32,
pub length_scale: f32,
pub style_weight: f32,
pub split_sentences: bool,
}
impl Default for SynthesizeOptions {
fn default() -> Self {
SynthesizeOptions {
sdp_ratio: 0.0,
length_scale: 1.0,
style_weight: 1.0,
split_sentences: true,
}
}
}
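// A minimal end-to-end sketch of the holder API above. Hedged: the file
// paths and the Some(2) cache size are assumptions for illustration only.
//
// let mut holder = TTSModelHolder::new(
//     &std::fs::read("models/deberta.onnx")?,
//     &std::fs::read("models/tokenizer.json")?,
//     Some(2), // keep at most two VITS2 sessions alive; others get evicted
// )?;
// holder.load("tsukuyomi", style_vectors_bytes, vits2_bytes)?;
// let wav = holder.easy_synthesize("tsukuyomi", "こんにちは", 0, 0, SynthesizeOptions::default())?;
// std::fs::write("output.wav", wav)?;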


@@ -0,0 +1,227 @@
use std::io::Cursor;
use crate::error::Result;
use crate::jtalk::JTalkProcess;
use crate::mora::MORA_KATA_TO_MORA_PHONEMES;
use crate::norm::PUNCTUATIONS;
use crate::{jtalk, nlp, norm, tokenizer, utils};
use hound::{SampleFormat, WavSpec, WavWriter};
use ndarray::{concatenate, s, Array, Array1, Array2, Array3, Axis};
use tokenizers::Tokenizer;
pub fn preprocess_parse_text(text: &str, jtalk: &jtalk::JTalk) -> Result<(String, JTalkProcess)> {
let text = jtalk.num2word(text)?;
let normalized_text = norm::normalize_text(&text);
let process = jtalk.process_text(&normalized_text)?;
Ok((normalized_text, process))
}
/// Parse text and return the input for synthesize
///
/// # Note
/// This function is for low-level usage, use `easy_synthesize` for high-level usage.
#[allow(clippy::type_complexity)]
pub async fn parse_text(
text: &str,
jtalk: &jtalk::JTalk,
tokenizer: &Tokenizer,
bert_predict: impl FnOnce(
Vec<i64>,
Vec<i64>,
) -> std::pin::Pin<
Box<dyn std::future::Future<Output = Result<ndarray::Array2<f32>>>>,
>,
) -> Result<(Array2<f32>, Array1<i64>, Array1<i64>, Array1<i64>)> {
let (normalized_text, process) = preprocess_parse_text(text, jtalk)?;
let (phones, tones, mut word2ph) = process.g2p()?;
let (phones, tones, lang_ids) = nlp::cleaned_text_to_sequence(phones, tones);
let phones = utils::intersperse(&phones, 0);
let tones = utils::intersperse(&tones, 0);
let lang_ids = utils::intersperse(&lang_ids, 0);
for item in &mut word2ph {
*item *= 2;
}
word2ph[0] += 1;
let text = {
let (seq_text, _) = process.text_to_seq_kata()?;
seq_text.join("")
};
let (token_ids, attention_masks) = tokenizer::tokenize(&text, tokenizer)?;
let bert_content = bert_predict(token_ids, attention_masks).await?;
assert!(
word2ph.len() == text.chars().count() + 2,
"word2ph length {} != text length + 2 ({})",
word2ph.len(),
text.chars().count() + 2
);
let mut phone_level_feature = vec![];
for (i, reps) in word2ph.iter().enumerate() {
let repeat_feature = {
let (reps_rows, reps_cols) = (*reps, 1);
let arr_len = bert_content.slice(s![i, ..]).len();
let mut results: Array2<f32> = Array::zeros((reps_rows as usize, arr_len * reps_cols));
for j in 0..reps_rows {
for k in 0..reps_cols {
let mut view = results.slice_mut(s![j, k * arr_len..(k + 1) * arr_len]);
view.assign(&bert_content.slice(s![i, ..]));
}
}
results
};
phone_level_feature.push(repeat_feature);
}
let phone_level_feature = concatenate(
Axis(0),
&phone_level_feature
.iter()
.map(|x| x.view())
.collect::<Vec<_>>(),
)?;
let bert_ori = phone_level_feature.t();
Ok((
bert_ori.to_owned(),
phones.into(),
tones.into(),
lang_ids.into(),
))
}
/// Parse text and return the input for synthesize
///
/// # Note
/// This function is for low-level usage, use `easy_synthesize` for high-level usage.
#[allow(clippy::type_complexity)]
pub fn parse_text_blocking(
text: &str,
given_tones: Option<Vec<i32>>,
jtalk: &jtalk::JTalk,
tokenizer: &Tokenizer,
bert_predict: impl FnOnce(Vec<i64>, Vec<i64>) -> Result<ndarray::Array2<f32>>,
) -> Result<(Array2<f32>, Array1<i64>, Array1<i64>, Array1<i64>)> {
let text = jtalk.num2word(text)?;
let normalized_text = norm::normalize_text(&text);
let process = jtalk.process_text(&normalized_text)?;
let (phones, mut tones, mut word2ph) = process.g2p()?;
if let Some(given_tones) = given_tones {
tones = given_tones;
}
let (phones, tones, lang_ids) = nlp::cleaned_text_to_sequence(phones, tones);
let phones = utils::intersperse(&phones, 0);
let tones = utils::intersperse(&tones, 0);
let lang_ids = utils::intersperse(&lang_ids, 0);
for item in &mut word2ph {
*item *= 2;
}
word2ph[0] += 1;
let text = {
let (seq_text, _) = process.text_to_seq_kata()?;
seq_text.join("")
};
let (token_ids, attention_masks) = tokenizer::tokenize(&text, tokenizer)?;
let bert_content = bert_predict(token_ids, attention_masks)?;
assert!(
word2ph.len() == text.chars().count() + 2,
"word2ph length {} != text length + 2 ({})",
word2ph.len(),
text.chars().count() + 2
);
let mut phone_level_feature = vec![];
for (i, reps) in word2ph.iter().enumerate() {
let repeat_feature = {
let (reps_rows, reps_cols) = (*reps, 1);
let arr_len = bert_content.slice(s![i, ..]).len();
let mut results: Array2<f32> = Array::zeros((reps_rows as usize, arr_len * reps_cols));
for j in 0..reps_rows {
for k in 0..reps_cols {
let mut view = results.slice_mut(s![j, k * arr_len..(k + 1) * arr_len]);
view.assign(&bert_content.slice(s![i, ..]));
}
}
results
};
phone_level_feature.push(repeat_feature);
}
let phone_level_feature = concatenate(
Axis(0),
&phone_level_feature
.iter()
.map(|x| x.view())
.collect::<Vec<_>>(),
)?;
let bert_ori = phone_level_feature.t();
Ok((
bert_ori.to_owned(),
phones.into(),
tones.into(),
lang_ids.into(),
))
}
pub fn array_to_vec(audio_array: Array3<f32>) -> Result<Vec<u8>> {
// If SBV2_FORCE_STEREO is set ("1"/"true"), duplicate mono to stereo
let force_stereo = std::env::var("SBV2_FORCE_STEREO")
.ok()
.map(|v| matches!(v.as_str(), "1" | "true" | "TRUE" | "True"))
.unwrap_or(false);
let channels: u16 = if force_stereo { 2 } else { 1 };
let spec = WavSpec {
channels,
sample_rate: 44100,
bits_per_sample: 32,
sample_format: SampleFormat::Float,
};
let mut cursor = Cursor::new(Vec::new());
let mut writer = WavWriter::new(&mut cursor, spec)?;
for i in 0..audio_array.shape()[0] {
let output = audio_array.slice(s![i, 0, ..]).to_vec();
if force_stereo {
for sample in output {
// Write to Left and Right channels
writer.write_sample(sample)?;
writer.write_sample(sample)?;
}
} else {
for sample in output {
writer.write_sample(sample)?;
}
}
}
writer.finalize()?;
Ok(cursor.into_inner())
}
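// Usage note for the switch above: export SBV2_FORCE_STEREO=1 (or "true")
// before launching the process and every WAV produced here becomes
// two-channel, each mono sample duplicated to left and right; any other
// value, or an unset variable, keeps the output mono.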
pub fn kata_tone2phone_tone(kata_tone: Vec<(String, i32)>) -> Vec<(String, i32)> {
let mut results = vec![("_".to_string(), 0)];
for (mora, tone) in kata_tone {
if PUNCTUATIONS.contains(&mora.as_str()) {
results.push((mora, 0));
continue;
} else {
let (consonant, vowel) = MORA_KATA_TO_MORA_PHONEMES.get(&mora).unwrap();
if let Some(consonant) = consonant {
results.push((consonant.to_string(), tone));
results.push((vowel.to_string(), tone));
} else {
results.push((vowel.to_string(), tone));
}
}
}
results.push(("_".to_string(), 0));
results
}
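// Worked example, assuming the standard mora table ("コ" -> k/o, "ン" -> N):
// kata_tone2phone_tone(vec![("コ".into(), 0), ("ン".into(), 1)])
// == vec![("_".into(), 0), ("k".into(), 0), ("o".into(), 0), ("N".into(), 1), ("_".into(), 0)]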


@@ -0,0 +1,19 @@
[package]
name = "sbv2_editor"
version.workspace = true
edition.workspace = true
description.workspace = true
license.workspace = true
readme.workspace = true
repository.workspace = true
documentation.workspace = true
[dependencies]
anyhow.workspace = true
axum = "0.8.1"
dotenvy.workspace = true
env_logger.workspace = true
log = "0.4.27"
sbv2_core = { version = "0.2.0-alpha6", path = "../sbv2_core", features = ["aivmx"] }
serde = { version = "1.0.219", features = ["derive"] }
tokio = { version = "1.44.1", features = ["full"] }


@@ -0,0 +1,2 @@
# sbv2-voicevox
Provides a VOICEVOX-compatible interface for sbv2-api.


@@ -0,0 +1,226 @@
{
"accent_phrases": [
{
"moras": [
{
"text": "コ",
"consonant": "k",
"consonant_length": 0.10002632439136505,
"vowel": "o",
"vowel_length": 0.15740256011486053,
"pitch": 5.749961853027344
},
{
"text": "ン",
"consonant": null,
"consonant_length": null,
"vowel": "N",
"vowel_length": 0.08265873789787292,
"pitch": 5.89122200012207
},
{
"text": "ニ",
"consonant": "n",
"consonant_length": 0.03657080978155136,
"vowel": "i",
"vowel_length": 0.1175866425037384,
"pitch": 5.969866752624512
},
{
"text": "チ",
"consonant": "ch",
"consonant_length": 0.09005842357873917,
"vowel": "i",
"vowel_length": 0.08666137605905533,
"pitch": 5.958892822265625
},
{
"text": "ワ",
"consonant": "w",
"consonant_length": 0.07833231985569,
"vowel": "a",
"vowel_length": 0.21250136196613312,
"pitch": 5.949411392211914
}
],
"accent": 5,
"pause_mora": {
"text": "、",
"consonant": null,
"consonant_length": null,
"vowel": "pau",
"vowel_length": 0.4723339378833771,
"pitch": 0.0
},
"is_interrogative": false
},
{
"moras": [
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0.22004225850105286,
"pitch": 5.6870927810668945
},
{
"text": "ン",
"consonant": null,
"consonant_length": null,
"vowel": "N",
"vowel_length": 0.09161105751991272,
"pitch": 5.93472957611084
},
{
"text": "セ",
"consonant": "s",
"consonant_length": 0.08924821764230728,
"vowel": "e",
"vowel_length": 0.14142127335071564,
"pitch": 6.121850490570068
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.10636933892965317,
"pitch": 6.157896041870117
},
{
"text": "ゴ",
"consonant": "g",
"consonant_length": 0.07600915431976318,
"vowel": "o",
"vowel_length": 0.09598273783922195,
"pitch": 6.188933849334717
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0.1079121008515358,
"pitch": 6.235202789306641
},
{
"text": "セ",
"consonant": "s",
"consonant_length": 0.09591838717460632,
"vowel": "e",
"vowel_length": 0.10286372154951096,
"pitch": 6.153214454650879
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.08992656320333481,
"pitch": 6.02571439743042
},
{
"text": "",
"consonant": "n",
"consonant_length": 0.05660202354192734,
"vowel": "o",
"vowel_length": 0.09676017612218857,
"pitch": 5.711844444274902
}
],
"accent": 5,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "セ",
"consonant": "s",
"consonant_length": 0.07805486768484116,
"vowel": "e",
"vowel_length": 0.09617523103952408,
"pitch": 5.774399280548096
},
{
"text": "カ",
"consonant": "k",
"consonant_length": 0.06712044775485992,
"vowel": "a",
"vowel_length": 0.148829385638237,
"pitch": 6.063965797424316
},
{
"text": "イ",
"consonant": null,
"consonant_length": null,
"vowel": "i",
"vowel_length": 0.11061104387044907,
"pitch": 6.040698051452637
},
{
"text": "エ",
"consonant": null,
"consonant_length": null,
"vowel": "e",
"vowel_length": 0.13046696782112122,
"pitch": 5.806027889251709
}
],
"accent": 1,
"pause_mora": null,
"is_interrogative": false
},
{
"moras": [
{
"text": "ヨ",
"consonant": "y",
"consonant_length": 0.07194744795560837,
"vowel": "o",
"vowel_length": 0.08622600883245468,
"pitch": 5.694094657897949
},
{
"text": "オ",
"consonant": null,
"consonant_length": null,
"vowel": "o",
"vowel_length": 0.10635452717542648,
"pitch": 5.787222385406494
},
{
"text": "コ",
"consonant": "k",
"consonant_length": 0.07077334076166153,
"vowel": "o",
"vowel_length": 0.09248624742031097,
"pitch": 5.793357849121094
},
{
"text": "ソ",
"consonant": "s",
"consonant_length": 0.08705667406320572,
"vowel": "o",
"vowel_length": 0.2238258570432663,
"pitch": 5.643765449523926
}
],
"accent": 1,
"pause_mora": null,
"is_interrogative": false
}
],
"speedScale": 1.0,
"pitchScale": 0.0,
"intonationScale": 1.0,
"volumeScale": 1.0,
"prePhonemeLength": 0.1,
"postPhonemeLength": 0.1,
"pauseLength": null,
"pauseLengthScale": 1.0,
"outputSamplingRate": 24000,
"outputStereo": false,
"kana": "コンニチワ'、オンセエゴ'オセエノ/セ'カイエ/ヨ'オコソ"
}


@@ -0,0 +1,27 @@
use axum::{
http::StatusCode,
response::{IntoResponse, Response},
};
pub type AppResult<T> = std::result::Result<T, AppError>;
pub struct AppError(anyhow::Error);
impl IntoResponse for AppError {
fn into_response(self) -> Response {
(
StatusCode::INTERNAL_SERVER_ERROR,
format!("Something went wrong: {}", self.0),
)
.into_response()
}
}
impl<E> From<E> for AppError
where
E: Into<anyhow::Error>,
{
fn from(err: E) -> Self {
Self(err.into())
}
}
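// With the blanket From impl above, handlers can use `?` on anything that
// converts into anyhow::Error; a hypothetical example:
//
// async fn read_greeting() -> AppResult<String> {
//     Ok(tokio::fs::read_to_string("greeting.txt").await?) // io::Error -> AppError -> 500
// }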


@@ -0,0 +1,197 @@
use axum::extract::State;
use axum::{
extract::Query,
http::header::CONTENT_TYPE,
response::IntoResponse,
routing::{get, post},
Json, Router,
};
use sbv2_core::tts_util::kata_tone2phone_tone;
use sbv2_core::{
tts::{SynthesizeOptions, TTSModelHolder},
tts_util::preprocess_parse_text,
};
use serde::{Deserialize, Serialize};
use tokio::{fs, net::TcpListener, sync::Mutex};
use std::env;
use std::sync::Arc;
use error::AppResult;
mod error;
#[derive(Deserialize)]
struct RequestCreateAudioQuery {
text: String,
}
#[derive(Serialize, Deserialize)]
struct AudioQuery {
kana: String,
tone: i32,
}
#[derive(Serialize)]
struct ResponseCreateAudioQuery {
audio_query: Vec<AudioQuery>,
text: String,
}
async fn create_audio_query(
State(state): State<AppState>,
Query(request): Query<RequestCreateAudioQuery>,
) -> AppResult<impl IntoResponse> {
let (text, process) = {
let tts_model = state.tts_model.lock().await;
preprocess_parse_text(&request.text, &tts_model.jtalk)?
};
let kana_tone_list = process.g2kana_tone()?;
let audio_query = kana_tone_list
.iter()
.map(|(kana, tone)| AudioQuery {
kana: kana.clone(),
tone: *tone,
})
.collect::<Vec<_>>();
Ok(Json(ResponseCreateAudioQuery { audio_query, text }))
}
#[derive(Deserialize)]
pub struct RequestSynthesis {
text: String,
speaker_id: i64,
sdp_ratio: f32,
length_scale: f32,
style_id: i32,
audio_query: Vec<AudioQuery>,
ident: String,
}
async fn synthesis(
State(state): State<AppState>,
Json(request): Json<RequestSynthesis>,
) -> AppResult<impl IntoResponse> {
let phone_tone = request
.audio_query
.iter()
.map(|query| (query.kana.clone(), query.tone))
.collect::<Vec<_>>();
let phone_tone = kata_tone2phone_tone(phone_tone);
let tones = phone_tone.iter().map(|(_, tone)| *tone).collect::<Vec<_>>();
let buffer = {
let mut tts_model = state.tts_model.lock().await;
tts_model.easy_synthesize_neo(
&request.ident,
&request.text,
Some(tones),
request.style_id,
request.speaker_id,
SynthesizeOptions {
sdp_ratio: request.sdp_ratio,
length_scale: request.length_scale,
..Default::default()
},
)?
};
Ok(([(CONTENT_TYPE, "audio/wav")], buffer))
}
#[derive(Clone)]
struct AppState {
tts_model: Arc<Mutex<TTSModelHolder>>,
}
impl AppState {
pub async fn new() -> anyhow::Result<Self> {
let mut tts_model = TTSModelHolder::new(
&fs::read(env::var("BERT_MODEL_PATH")?).await?,
&fs::read(env::var("TOKENIZER_PATH")?).await?,
env::var("HOLDER_MAX_LOADED_MODElS")
.ok()
.and_then(|x| x.parse().ok()),
)?;
let models = env::var("MODELS_PATH").unwrap_or("models".to_string());
let mut f = fs::read_dir(&models).await?;
let mut entries = vec![];
while let Ok(Some(e)) = f.next_entry().await {
let name = e.file_name().to_string_lossy().to_string();
if name.ends_with(".onnx") && name.starts_with("model_") {
let name_len = name.len();
let name = name.chars();
entries.push(
name.collect::<Vec<_>>()[6..name_len - 5]
.iter()
.collect::<String>(),
);
} else if name.ends_with(".sbv2") {
let entry = &name[..name.len() - 5];
log::info!("Try loading: {entry}");
let sbv2_bytes = match fs::read(format!("{models}/{entry}.sbv2")).await {
Ok(b) => b,
Err(e) => {
log::warn!("Error loading sbv2_bytes from file {entry}: {e}");
continue;
}
};
if let Err(e) = tts_model.load_sbv2file(entry, sbv2_bytes) {
log::warn!("Error loading {entry}: {e}");
};
log::info!("Loaded: {entry}");
} else if name.ends_with(".aivmx") {
let entry = &name[..name.len() - 6];
log::info!("Try loading: {entry}");
let aivmx_bytes = match fs::read(format!("{models}/{entry}.aivmx")).await {
Ok(b) => b,
Err(e) => {
log::warn!("Error loading aivmx bytes from file {entry}: {e}");
continue;
}
};
if let Err(e) = tts_model.load_aivmx(entry, aivmx_bytes) {
log::error!("Error loading {entry}: {e}");
}
log::info!("Loaded: {entry}");
}
}
for entry in entries {
log::info!("Try loading: {entry}");
let style_vectors_bytes =
match fs::read(format!("{models}/style_vectors_{entry}.json")).await {
Ok(b) => b,
Err(e) => {
log::warn!("Error loading style_vectors_bytes from file {entry}: {e}");
continue;
}
};
let vits2_bytes = match fs::read(format!("{models}/model_{entry}.onnx")).await {
Ok(b) => b,
Err(e) => {
log::warn!("Error loading vits2_bytes from file {entry}: {e}");
continue;
}
};
if let Err(e) = tts_model.load(&entry, style_vectors_bytes, vits2_bytes) {
log::warn!("Error loading {entry}: {e}");
};
log::info!("Loaded: {entry}");
}
Ok(Self {
tts_model: Arc::new(Mutex::new(tts_model)),
})
}
}
#[tokio::main]
async fn main() -> anyhow::Result<()> {
dotenvy::dotenv_override().ok();
env_logger::init();
let app = Router::new()
.route("/", get(|| async { "Hello, world!" }))
.route("/audio_query", get(create_audio_query))
.route("/synthesis", post(synthesis))
.with_state(AppState::new().await?);
let listener = TcpListener::bind("0.0.0.0:8080").await?;
axum::serve(listener, app).await?;
Ok(())
}
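// A hedged smoke test against the routes above (default bind is 0.0.0.0:8080;
// the JSON fields mirror RequestSynthesis, and audio_query comes from the
// first call):
//
// curl 'http://localhost:8080/audio_query?text=こんにちは'
// curl -X POST 'http://localhost:8080/synthesis' \
//   -H 'Content-Type: application/json' \
//   -d '{"text":"こんにちは","speaker_id":0,"sdp_ratio":0.0,"length_scale":1.0,"style_id":0,"audio_query":[...],"ident":"tsukuyomi"}' \
//   -o out.wav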


@@ -0,0 +1,20 @@
[package]
name = "sbv2_wasm"
version.workspace = true
edition.workspace = true
description.workspace = true
readme.workspace = true
repository.workspace = true
documentation.workspace = true
license.workspace = true
[lib]
crate-type = ["cdylib", "rlib"]
[dependencies]
wasm-bindgen = "0.2.93"
sbv2_core = { path = "../sbv2_core", default-features = false, features = ["no_std"] }
once_cell.workspace = true
js-sys = "0.3.70"
ndarray.workspace = true
wasm-bindgen-futures = "0.4.43"


@@ -0,0 +1,2 @@
# StyleBertVITS2 wasm
refer to https://github.com/neodyland/sbv2-api


@@ -0,0 +1,31 @@
{
"$schema": "https://biomejs.dev/schemas/1.9.2/schema.json",
"vcs": {
"enabled": false,
"clientKind": "git",
"useIgnoreFile": false
},
"files": {
"ignoreUnknown": false,
"ignore": []
},
"formatter": {
"enabled": true,
"indentStyle": "tab",
"ignore": ["dist/", "pkg/"]
},
"organizeImports": {
"enabled": true
},
"linter": {
"enabled": true,
"rules": {
"recommended": true
}
},
"javascript": {
"formatter": {
"quoteStyle": "double"
}
}
}

crates/sbv2_wasm/build.sh Executable file

@@ -0,0 +1,8 @@
#!/bin/sh
wasm-pack build --target web ./crates/sbv2_wasm --release
wasm-opt -O3 -o ./crates/sbv2_wasm/pkg/sbv2_wasm_bg.wasm ./crates/sbv2_wasm/pkg/sbv2_wasm_bg.wasm
wasm-strip ./crates/sbv2_wasm/pkg/sbv2_wasm_bg.wasm
mkdir -p ./crates/sbv2_wasm/dist
cp ./crates/sbv2_wasm/pkg/sbv2_wasm_bg.wasm ./crates/sbv2_wasm/dist/sbv2_wasm_bg.wasm
cd ./crates/sbv2_wasm
pnpm build
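# Assumed prerequisites, not installed by this script: wasm-pack,
# wasm-opt (binaryen), wasm-strip (wabt), and pnpm, all on PATH.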


@@ -0,0 +1,14 @@
import { ModelHolder } from "./dist/index.js";
import fs from "node:fs/promises";
ModelHolder.globalInit(await fs.readFile("./dist/sbv2_wasm_bg.wasm"));
const holder = await ModelHolder.create(
(await fs.readFile("../../models/tokenizer.json")).toString("utf-8"),
await fs.readFile("../../models/deberta.onnx"),
);
await holder.load(
"tsukuyomi",
await fs.readFile("../../models/tsukuyomi.sbv2"),
);
await fs.writeFile("out.wav", await holder.synthesize("tsukuyomi", "おはよう"));
holder.unload("tsukuyomi");


@@ -0,0 +1,26 @@
{
"name": "sbv2",
"version": "0.2.0-alpha6",
"description": "Style Bert VITS2 wasm",
"main": "dist/index.js",
"types": "dist/index.d.ts",
"type": "module",
"scripts": {
"build": "tsc && esbuild src-js/index.ts --outfile=dist/index.js --minify --format=esm --bundle --external:onnxruntime-web",
"format": "biome format --write ."
},
"keywords": [],
"author": "tuna2134",
"contributes": ["neodyland"],
"license": "MIT",
"devDependencies": {
"@biomejs/biome": "^1.9.4",
"@types/node": "^22.13.5",
"esbuild": "^0.25.0",
"typescript": "^5.7.3"
},
"dependencies": {
"onnxruntime-web": "^1.20.1"
},
"files": ["dist/*", "package.json", "README.md", "pkg/*.ts", "pkg/*.js"]
}

crates/sbv2_wasm/pnpm-lock.yaml generated Normal file

@@ -0,0 +1,504 @@
lockfileVersion: '9.0'
settings:
autoInstallPeers: true
excludeLinksFromLockfile: false
importers:
.:
dependencies:
onnxruntime-web:
specifier: ^1.20.1
version: 1.20.1
devDependencies:
'@biomejs/biome':
specifier: ^1.9.4
version: 1.9.4
'@types/node':
specifier: ^22.13.5
version: 22.13.5
esbuild:
specifier: ^0.25.0
version: 0.25.0
typescript:
specifier: ^5.7.3
version: 5.8.3
packages:
'@biomejs/biome@1.9.4':
resolution: {integrity: sha512-1rkd7G70+o9KkTn5KLmDYXihGoTaIGO9PIIN2ZB7UJxFrWw04CZHPYiMRjYsaDvVV7hP1dYNRLxSANLaBFGpog==}
engines: {node: '>=14.21.3'}
hasBin: true
'@biomejs/cli-darwin-arm64@1.9.4':
resolution: {integrity: sha512-bFBsPWrNvkdKrNCYeAp+xo2HecOGPAy9WyNyB/jKnnedgzl4W4Hb9ZMzYNbf8dMCGmUdSavlYHiR01QaYR58cw==}
engines: {node: '>=14.21.3'}
cpu: [arm64]
os: [darwin]
'@biomejs/cli-darwin-x64@1.9.4':
resolution: {integrity: sha512-ngYBh/+bEedqkSevPVhLP4QfVPCpb+4BBe2p7Xs32dBgs7rh9nY2AIYUL6BgLw1JVXV8GlpKmb/hNiuIxfPfZg==}
engines: {node: '>=14.21.3'}
cpu: [x64]
os: [darwin]
'@biomejs/cli-linux-arm64-musl@1.9.4':
resolution: {integrity: sha512-v665Ct9WCRjGa8+kTr0CzApU0+XXtRgwmzIf1SeKSGAv+2scAlW6JR5PMFo6FzqqZ64Po79cKODKf3/AAmECqA==}
engines: {node: '>=14.21.3'}
cpu: [arm64]
os: [linux]
'@biomejs/cli-linux-arm64@1.9.4':
resolution: {integrity: sha512-fJIW0+LYujdjUgJJuwesP4EjIBl/N/TcOX3IvIHJQNsAqvV2CHIogsmA94BPG6jZATS4Hi+xv4SkBBQSt1N4/g==}
engines: {node: '>=14.21.3'}
cpu: [arm64]
os: [linux]
'@biomejs/cli-linux-x64-musl@1.9.4':
resolution: {integrity: sha512-gEhi/jSBhZ2m6wjV530Yy8+fNqG8PAinM3oV7CyO+6c3CEh16Eizm21uHVsyVBEB6RIM8JHIl6AGYCv6Q6Q9Tg==}
engines: {node: '>=14.21.3'}
cpu: [x64]
os: [linux]
'@biomejs/cli-linux-x64@1.9.4':
resolution: {integrity: sha512-lRCJv/Vi3Vlwmbd6K+oQ0KhLHMAysN8lXoCI7XeHlxaajk06u7G+UsFSO01NAs5iYuWKmVZjmiOzJ0OJmGsMwg==}
engines: {node: '>=14.21.3'}
cpu: [x64]
os: [linux]
'@biomejs/cli-win32-arm64@1.9.4':
resolution: {integrity: sha512-tlbhLk+WXZmgwoIKwHIHEBZUwxml7bRJgk0X2sPyNR3S93cdRq6XulAZRQJ17FYGGzWne0fgrXBKpl7l4M87Hg==}
engines: {node: '>=14.21.3'}
cpu: [arm64]
os: [win32]
'@biomejs/cli-win32-x64@1.9.4':
resolution: {integrity: sha512-8Y5wMhVIPaWe6jw2H+KlEm4wP/f7EW3810ZLmDlrEEy5KvBsb9ECEfu/kMWD484ijfQ8+nIi0giMgu9g1UAuuA==}
engines: {node: '>=14.21.3'}
cpu: [x64]
os: [win32]
'@esbuild/aix-ppc64@0.25.0':
resolution: {integrity: sha512-O7vun9Sf8DFjH2UtqK8Ku3LkquL9SZL8OLY1T5NZkA34+wG3OQF7cl4Ql8vdNzM6fzBbYfLaiRLIOZ+2FOCgBQ==}
engines: {node: '>=18'}
cpu: [ppc64]
os: [aix]
'@esbuild/android-arm64@0.25.0':
resolution: {integrity: sha512-grvv8WncGjDSyUBjN9yHXNt+cq0snxXbDxy5pJtzMKGmmpPxeAmAhWxXI+01lU5rwZomDgD3kJwulEnhTRUd6g==}
engines: {node: '>=18'}
cpu: [arm64]
os: [android]
'@esbuild/android-arm@0.25.0':
resolution: {integrity: sha512-PTyWCYYiU0+1eJKmw21lWtC+d08JDZPQ5g+kFyxP0V+es6VPPSUhM6zk8iImp2jbV6GwjX4pap0JFbUQN65X1g==}
engines: {node: '>=18'}
cpu: [arm]
os: [android]
'@esbuild/android-x64@0.25.0':
resolution: {integrity: sha512-m/ix7SfKG5buCnxasr52+LI78SQ+wgdENi9CqyCXwjVR2X4Jkz+BpC3le3AoBPYTC9NHklwngVXvbJ9/Akhrfg==}
engines: {node: '>=18'}
cpu: [x64]
os: [android]
'@esbuild/darwin-arm64@0.25.0':
resolution: {integrity: sha512-mVwdUb5SRkPayVadIOI78K7aAnPamoeFR2bT5nszFUZ9P8UpK4ratOdYbZZXYSqPKMHfS1wdHCJk1P1EZpRdvw==}
engines: {node: '>=18'}
cpu: [arm64]
os: [darwin]
'@esbuild/darwin-x64@0.25.0':
resolution: {integrity: sha512-DgDaYsPWFTS4S3nWpFcMn/33ZZwAAeAFKNHNa1QN0rI4pUjgqf0f7ONmXf6d22tqTY+H9FNdgeaAa+YIFUn2Rg==}
engines: {node: '>=18'}
cpu: [x64]
os: [darwin]
'@esbuild/freebsd-arm64@0.25.0':
resolution: {integrity: sha512-VN4ocxy6dxefN1MepBx/iD1dH5K8qNtNe227I0mnTRjry8tj5MRk4zprLEdG8WPyAPb93/e4pSgi1SoHdgOa4w==}
engines: {node: '>=18'}
cpu: [arm64]
os: [freebsd]
'@esbuild/freebsd-x64@0.25.0':
resolution: {integrity: sha512-mrSgt7lCh07FY+hDD1TxiTyIHyttn6vnjesnPoVDNmDfOmggTLXRv8Id5fNZey1gl/V2dyVK1VXXqVsQIiAk+A==}
engines: {node: '>=18'}
cpu: [x64]
os: [freebsd]
'@esbuild/linux-arm64@0.25.0':
resolution: {integrity: sha512-9QAQjTWNDM/Vk2bgBl17yWuZxZNQIF0OUUuPZRKoDtqF2k4EtYbpyiG5/Dk7nqeK6kIJWPYldkOcBqjXjrUlmg==}
engines: {node: '>=18'}
cpu: [arm64]
os: [linux]
'@esbuild/linux-arm@0.25.0':
resolution: {integrity: sha512-vkB3IYj2IDo3g9xX7HqhPYxVkNQe8qTK55fraQyTzTX/fxaDtXiEnavv9geOsonh2Fd2RMB+i5cbhu2zMNWJwg==}
engines: {node: '>=18'}
cpu: [arm]
os: [linux]
'@esbuild/linux-ia32@0.25.0':
resolution: {integrity: sha512-43ET5bHbphBegyeqLb7I1eYn2P/JYGNmzzdidq/w0T8E2SsYL1U6un2NFROFRg1JZLTzdCoRomg8Rvf9M6W6Gg==}
engines: {node: '>=18'}
cpu: [ia32]
os: [linux]
'@esbuild/linux-loong64@0.25.0':
resolution: {integrity: sha512-fC95c/xyNFueMhClxJmeRIj2yrSMdDfmqJnyOY4ZqsALkDrrKJfIg5NTMSzVBr5YW1jf+l7/cndBfP3MSDpoHw==}
engines: {node: '>=18'}
cpu: [loong64]
os: [linux]
'@esbuild/linux-mips64el@0.25.0':
resolution: {integrity: sha512-nkAMFju7KDW73T1DdH7glcyIptm95a7Le8irTQNO/qtkoyypZAnjchQgooFUDQhNAy4iu08N79W4T4pMBwhPwQ==}
engines: {node: '>=18'}
cpu: [mips64el]
os: [linux]
'@esbuild/linux-ppc64@0.25.0':
resolution: {integrity: sha512-NhyOejdhRGS8Iwv+KKR2zTq2PpysF9XqY+Zk77vQHqNbo/PwZCzB5/h7VGuREZm1fixhs4Q/qWRSi5zmAiO4Fw==}
engines: {node: '>=18'}
cpu: [ppc64]
os: [linux]
'@esbuild/linux-riscv64@0.25.0':
resolution: {integrity: sha512-5S/rbP5OY+GHLC5qXp1y/Mx//e92L1YDqkiBbO9TQOvuFXM+iDqUNG5XopAnXoRH3FjIUDkeGcY1cgNvnXp/kA==}
engines: {node: '>=18'}
cpu: [riscv64]
os: [linux]
'@esbuild/linux-s390x@0.25.0':
resolution: {integrity: sha512-XM2BFsEBz0Fw37V0zU4CXfcfuACMrppsMFKdYY2WuTS3yi8O1nFOhil/xhKTmE1nPmVyvQJjJivgDT+xh8pXJA==}
engines: {node: '>=18'}
cpu: [s390x]
os: [linux]
'@esbuild/linux-x64@0.25.0':
resolution: {integrity: sha512-9yl91rHw/cpwMCNytUDxwj2XjFpxML0y9HAOH9pNVQDpQrBxHy01Dx+vaMu0N1CKa/RzBD2hB4u//nfc+Sd3Cw==}
engines: {node: '>=18'}
cpu: [x64]
os: [linux]
'@esbuild/netbsd-arm64@0.25.0':
resolution: {integrity: sha512-RuG4PSMPFfrkH6UwCAqBzauBWTygTvb1nxWasEJooGSJ/NwRw7b2HOwyRTQIU97Hq37l3npXoZGYMy3b3xYvPw==}
engines: {node: '>=18'}
cpu: [arm64]
os: [netbsd]
'@esbuild/netbsd-x64@0.25.0':
resolution: {integrity: sha512-jl+qisSB5jk01N5f7sPCsBENCOlPiS/xptD5yxOx2oqQfyourJwIKLRA2yqWdifj3owQZCL2sn6o08dBzZGQzA==}
engines: {node: '>=18'}
cpu: [x64]
os: [netbsd]
'@esbuild/openbsd-arm64@0.25.0':
resolution: {integrity: sha512-21sUNbq2r84YE+SJDfaQRvdgznTD8Xc0oc3p3iW/a1EVWeNj/SdUCbm5U0itZPQYRuRTW20fPMWMpcrciH2EJw==}
engines: {node: '>=18'}
cpu: [arm64]
os: [openbsd]
'@esbuild/openbsd-x64@0.25.0':
resolution: {integrity: sha512-2gwwriSMPcCFRlPlKx3zLQhfN/2WjJ2NSlg5TKLQOJdV0mSxIcYNTMhk3H3ulL/cak+Xj0lY1Ym9ysDV1igceg==}
engines: {node: '>=18'}
cpu: [x64]
os: [openbsd]
'@esbuild/sunos-x64@0.25.0':
resolution: {integrity: sha512-bxI7ThgLzPrPz484/S9jLlvUAHYMzy6I0XiU1ZMeAEOBcS0VePBFxh1JjTQt3Xiat5b6Oh4x7UC7IwKQKIJRIg==}
engines: {node: '>=18'}
cpu: [x64]
os: [sunos]
'@esbuild/win32-arm64@0.25.0':
resolution: {integrity: sha512-ZUAc2YK6JW89xTbXvftxdnYy3m4iHIkDtK3CLce8wg8M2L+YZhIvO1DKpxrd0Yr59AeNNkTiic9YLf6FTtXWMw==}
engines: {node: '>=18'}
cpu: [arm64]
os: [win32]
'@esbuild/win32-ia32@0.25.0':
resolution: {integrity: sha512-eSNxISBu8XweVEWG31/JzjkIGbGIJN/TrRoiSVZwZ6pkC6VX4Im/WV2cz559/TXLcYbcrDN8JtKgd9DJVIo8GA==}
engines: {node: '>=18'}
cpu: [ia32]
os: [win32]
'@esbuild/win32-x64@0.25.0':
resolution: {integrity: sha512-ZENoHJBxA20C2zFzh6AI4fT6RraMzjYw4xKWemRTRmRVtN9c5DcH9r/f2ihEkMjOW5eGgrwCslG/+Y/3bL+DHQ==}
engines: {node: '>=18'}
cpu: [x64]
os: [win32]
'@protobufjs/aspromise@1.1.2':
resolution: {integrity: sha512-j+gKExEuLmKwvz3OgROXtrJ2UG2x8Ch2YZUxahh+s1F2HZ+wAceUNLkvy6zKCPVRkU++ZWQrdxsUeQXmcg4uoQ==}
'@protobufjs/base64@1.1.2':
resolution: {integrity: sha512-AZkcAA5vnN/v4PDqKyMR5lx7hZttPDgClv83E//FMNhR2TMcLUhfRUBHCmSl0oi9zMgDDqRUJkSxO3wm85+XLg==}
'@protobufjs/codegen@2.0.4':
resolution: {integrity: sha512-YyFaikqM5sH0ziFZCN3xDC7zeGaB/d0IUb9CATugHWbd1FRFwWwt4ld4OYMPWu5a3Xe01mGAULCdqhMlPl29Jg==}
'@protobufjs/eventemitter@1.1.0':
resolution: {integrity: sha512-j9ednRT81vYJ9OfVuXG6ERSTdEL1xVsNgqpkxMsbIabzSo3goCjDIveeGv5d03om39ML71RdmrGNjG5SReBP/Q==}
'@protobufjs/fetch@1.1.0':
resolution: {integrity: sha512-lljVXpqXebpsijW71PZaCYeIcE5on1w5DlQy5WH6GLbFryLUrBD4932W/E2BSpfRJWseIL4v/KPgBFxDOIdKpQ==}
'@protobufjs/float@1.0.2':
resolution: {integrity: sha512-Ddb+kVXlXst9d+R9PfTIxh1EdNkgoRe5tOX6t01f1lYWOvJnSPDBlG241QLzcyPdoNTsblLUdujGSE4RzrTZGQ==}
'@protobufjs/inquire@1.1.0':
resolution: {integrity: sha512-kdSefcPdruJiFMVSbn801t4vFK7KB/5gd2fYvrxhuJYg8ILrmn9SKSX2tZdV6V+ksulWqS7aXjBcRXl3wHoD9Q==}
'@protobufjs/path@1.1.2':
resolution: {integrity: sha512-6JOcJ5Tm08dOHAbdR3GrvP+yUUfkjG5ePsHYczMFLq3ZmMkAD98cDgcT2iA1lJ9NVwFd4tH/iSSoe44YWkltEA==}
'@protobufjs/pool@1.1.0':
resolution: {integrity: sha512-0kELaGSIDBKvcgS4zkjz1PeddatrjYcmMWOlAuAPwAeccUrPHdUqo/J6LiymHHEiJT5NrF1UVwxY14f+fy4WQw==}
'@protobufjs/utf8@1.1.0':
resolution: {integrity: sha512-Vvn3zZrhQZkkBE8LSuW3em98c0FwgO4nxzv6OdSxPKJIEKY2bGbHn+mhGIPerzI4twdxaP8/0+06HBpwf345Lw==}
'@types/node@22.13.5':
resolution: {integrity: sha512-+lTU0PxZXn0Dr1NBtC7Y8cR21AJr87dLLU953CWA6pMxxv/UDc7jYAY90upcrie1nRcD6XNG5HOYEDtgW5TxAg==}
esbuild@0.25.0:
resolution: {integrity: sha512-BXq5mqc8ltbaN34cDqWuYKyNhX8D/Z0J1xdtdQ8UcIIIyJyz+ZMKUt58tF3SrZ85jcfN/PZYhjR5uDQAYNVbuw==}
engines: {node: '>=18'}
hasBin: true
flatbuffers@1.12.0:
resolution: {integrity: sha512-c7CZADjRcl6j0PlvFy0ZqXQ67qSEZfrVPynmnL+2zPc+NtMvrF8Y0QceMo7QqnSPc7+uWjUIAbvCQ5WIKlMVdQ==}
guid-typescript@1.0.9:
resolution: {integrity: sha512-Y8T4vYhEfwJOTbouREvG+3XDsjr8E3kIr7uf+JZ0BYloFsttiHU0WfvANVsR7TxNUJa/WpCnw/Ino/p+DeBhBQ==}
long@5.3.1:
resolution: {integrity: sha512-ka87Jz3gcx/I7Hal94xaN2tZEOPoUOEVftkQqZx2EeQRN7LGdfLlI3FvZ+7WDplm+vK2Urx9ULrvSowtdCieng==}
onnxruntime-common@1.20.1:
resolution: {integrity: sha512-YiU0s0IzYYC+gWvqD1HzLc46Du1sXpSiwzKb63PACIJr6LfL27VsXSXQvt68EzD3V0D5Bc0vyJTjmMxp0ylQiw==}
onnxruntime-web@1.20.1:
resolution: {integrity: sha512-TePF6XVpLL1rWVMIl5Y9ACBQcyCNFThZON/jgElNd9Txb73CIEGlklhYR3UEr1cp5r0rbGI6nDwwrs79g7WjoA==}
platform@1.3.6:
resolution: {integrity: sha512-fnWVljUchTro6RiCFvCXBbNhJc2NijN7oIQxbwsyL0buWJPG85v81ehlHI9fXrJsMNgTofEoWIQeClKpgxFLrg==}
protobufjs@7.4.0:
resolution: {integrity: sha512-mRUWCc3KUU4w1jU8sGxICXH/gNS94DvI1gxqDvBzhj1JpcsimQkYiOJfwsPUykUI5ZaspFbSgmBLER8IrQ3tqw==}
engines: {node: '>=12.0.0'}
typescript@5.8.3:
resolution: {integrity: sha512-p1diW6TqL9L07nNxvRMM7hMMw4c5XOo/1ibL4aAIGmSAt9slTE1Xgw5KWuof2uTOvCg9BY7ZRi+GaF+7sfgPeQ==}
engines: {node: '>=14.17'}
hasBin: true
undici-types@6.20.0:
resolution: {integrity: sha512-Ny6QZ2Nju20vw1SRHe3d9jVu6gJ+4e3+MMpqu7pqE5HT6WsTSlce++GQmK5UXS8mzV8DSYHrQH+Xrf2jVcuKNg==}
snapshots:
'@biomejs/biome@1.9.4':
optionalDependencies:
'@biomejs/cli-darwin-arm64': 1.9.4
'@biomejs/cli-darwin-x64': 1.9.4
'@biomejs/cli-linux-arm64': 1.9.4
'@biomejs/cli-linux-arm64-musl': 1.9.4
'@biomejs/cli-linux-x64': 1.9.4
'@biomejs/cli-linux-x64-musl': 1.9.4
'@biomejs/cli-win32-arm64': 1.9.4
'@biomejs/cli-win32-x64': 1.9.4
'@biomejs/cli-darwin-arm64@1.9.4':
optional: true
'@biomejs/cli-darwin-x64@1.9.4':
optional: true
'@biomejs/cli-linux-arm64-musl@1.9.4':
optional: true
'@biomejs/cli-linux-arm64@1.9.4':
optional: true
'@biomejs/cli-linux-x64-musl@1.9.4':
optional: true
'@biomejs/cli-linux-x64@1.9.4':
optional: true
'@biomejs/cli-win32-arm64@1.9.4':
optional: true
'@biomejs/cli-win32-x64@1.9.4':
optional: true
'@esbuild/aix-ppc64@0.25.0':
optional: true
'@esbuild/android-arm64@0.25.0':
optional: true
'@esbuild/android-arm@0.25.0':
optional: true
'@esbuild/android-x64@0.25.0':
optional: true
'@esbuild/darwin-arm64@0.25.0':
optional: true
'@esbuild/darwin-x64@0.25.0':
optional: true
'@esbuild/freebsd-arm64@0.25.0':
optional: true
'@esbuild/freebsd-x64@0.25.0':
optional: true
'@esbuild/linux-arm64@0.25.0':
optional: true
'@esbuild/linux-arm@0.25.0':
optional: true
'@esbuild/linux-ia32@0.25.0':
optional: true
'@esbuild/linux-loong64@0.25.0':
optional: true
'@esbuild/linux-mips64el@0.25.0':
optional: true
'@esbuild/linux-ppc64@0.25.0':
optional: true
'@esbuild/linux-riscv64@0.25.0':
optional: true
'@esbuild/linux-s390x@0.25.0':
optional: true
'@esbuild/linux-x64@0.25.0':
optional: true
'@esbuild/netbsd-arm64@0.25.0':
optional: true
'@esbuild/netbsd-x64@0.25.0':
optional: true
'@esbuild/openbsd-arm64@0.25.0':
optional: true
'@esbuild/openbsd-x64@0.25.0':
optional: true
'@esbuild/sunos-x64@0.25.0':
optional: true
'@esbuild/win32-arm64@0.25.0':
optional: true
'@esbuild/win32-ia32@0.25.0':
optional: true
'@esbuild/win32-x64@0.25.0':
optional: true
'@protobufjs/aspromise@1.1.2': {}
'@protobufjs/base64@1.1.2': {}
'@protobufjs/codegen@2.0.4': {}
'@protobufjs/eventemitter@1.1.0': {}
'@protobufjs/fetch@1.1.0':
dependencies:
'@protobufjs/aspromise': 1.1.2
'@protobufjs/inquire': 1.1.0
'@protobufjs/float@1.0.2': {}
'@protobufjs/inquire@1.1.0': {}
'@protobufjs/path@1.1.2': {}
'@protobufjs/pool@1.1.0': {}
'@protobufjs/utf8@1.1.0': {}
'@types/node@22.13.5':
dependencies:
undici-types: 6.20.0
esbuild@0.25.0:
optionalDependencies:
'@esbuild/aix-ppc64': 0.25.0
'@esbuild/android-arm': 0.25.0
'@esbuild/android-arm64': 0.25.0
'@esbuild/android-x64': 0.25.0
'@esbuild/darwin-arm64': 0.25.0
'@esbuild/darwin-x64': 0.25.0
'@esbuild/freebsd-arm64': 0.25.0
'@esbuild/freebsd-x64': 0.25.0
'@esbuild/linux-arm': 0.25.0
'@esbuild/linux-arm64': 0.25.0
'@esbuild/linux-ia32': 0.25.0
'@esbuild/linux-loong64': 0.25.0
'@esbuild/linux-mips64el': 0.25.0
'@esbuild/linux-ppc64': 0.25.0
'@esbuild/linux-riscv64': 0.25.0
'@esbuild/linux-s390x': 0.25.0
'@esbuild/linux-x64': 0.25.0
'@esbuild/netbsd-arm64': 0.25.0
'@esbuild/netbsd-x64': 0.25.0
'@esbuild/openbsd-arm64': 0.25.0
'@esbuild/openbsd-x64': 0.25.0
'@esbuild/sunos-x64': 0.25.0
'@esbuild/win32-arm64': 0.25.0
'@esbuild/win32-ia32': 0.25.0
'@esbuild/win32-x64': 0.25.0
flatbuffers@1.12.0: {}
guid-typescript@1.0.9: {}
long@5.3.1: {}
onnxruntime-common@1.20.1: {}
onnxruntime-web@1.20.1:
dependencies:
flatbuffers: 1.12.0
guid-typescript: 1.0.9
long: 5.3.1
onnxruntime-common: 1.20.1
platform: 1.3.6
protobufjs: 7.4.0
platform@1.3.6: {}
protobufjs@7.4.0:
dependencies:
'@protobufjs/aspromise': 1.1.2
'@protobufjs/base64': 1.1.2
'@protobufjs/codegen': 2.0.4
'@protobufjs/eventemitter': 1.1.0
'@protobufjs/fetch': 1.1.0
'@protobufjs/float': 1.0.2
'@protobufjs/inquire': 1.1.0
'@protobufjs/path': 1.1.2
'@protobufjs/pool': 1.1.0
'@protobufjs/utf8': 1.1.0
'@types/node': 22.13.5
long: 5.3.1
typescript@5.8.3: {}
undici-types@6.20.0: {}


@@ -0,0 +1,108 @@
import * as wasm from "../pkg/sbv2_wasm.js";
import { InferenceSession, Tensor } from "onnxruntime-web";
export class ModelHolder {
private models: Map<string, [InferenceSession, wasm.StyleVectorWrap]> =
new Map();
constructor(
private tok: wasm.TokenizerWrap,
private deberta: InferenceSession,
) {}
public static async globalInit(buf: ArrayBufferLike) {
await wasm.default(buf);
}
public static async create(tok: string, deberta: ArrayBufferLike) {
return new ModelHolder(
wasm.load_tokenizer(tok),
await InferenceSession.create(deberta, {
executionProviders: ["webnn", "webgpu", "wasm", "cpu"],
graphOptimizationLevel: "all",
}),
);
}
public async synthesize(
name: string,
text: string,
style_id: number = 0,
style_weight: number = 1.0,
sdp_ratio: number = 0.4,
speed: number = 1.0,
) {
const mod = this.models.get(name);
if (!mod) throw new Error(`No model named ${name}`);
const [vits2, style] = mod;
return wasm.synthesize(
text,
this.tok,
async (a: BigInt64Array, b: BigInt64Array) => {
try {
const res = (
await this.deberta.run({
input_ids: new Tensor("int64", a, [1, a.length]),
attention_mask: new Tensor("int64", b, [1, b.length]),
})
)["output"];
return [new Uint32Array(res.dims), await res.getData(true)];
} catch (e) {
console.warn(e);
throw e;
}
},
async (
[a_shape, a_array]: any,
b_d: any,
c_d: any,
d_d: any,
e_d: any,
f: number,
g: number,
) => {
try {
const a = new Tensor("float32", a_array, [1, ...a_shape]);
const b = new Tensor("int64", b_d, [1, b_d.length]);
const c = new Tensor("int64", c_d, [1, c_d.length]);
const d = new Tensor("int64", d_d, [1, d_d.length]);
const e = new Tensor("float32", e_d, [1, e_d.length]);
const res = (
await vits2.run({
x_tst: b,
x_tst_lengths: new Tensor("int64", [b_d.length]),
sid: new Tensor("int64", [0]),
tones: c,
language: d,
bert: a,
style_vec: e,
sdp_ratio: new Tensor("float32", [f]),
length_scale: new Tensor("float32", [g]),
noise_scale: new Tensor("float32", [0.677]),
noise_scale_w: new Tensor("float32", [0.8]),
})
).output;
return [new Uint32Array(res.dims), await res.getData(true)];
} catch (e) {
console.warn(e);
throw e;
}
},
sdp_ratio,
1.0 / speed,
style_id,
style_weight,
style,
);
}
public async load(name: string, b: Uint8Array) {
const [style, vits2_b] = wasm.load_sbv2file(b);
const vits2 = await InferenceSession.create(vits2_b as Uint8Array, {
executionProviders: ["webnn", "webgpu", "wasm", "cpu"],
graphOptimizationLevel: "all",
});
this.models.set(name, [vits2, style]);
}
public async unload(name: string) {
return this.models.delete(name);
}
public modelList() {
return this.models.keys();
}
}


@@ -0,0 +1,102 @@
pub fn vec8_to_array8(v: Vec<u8>) -> js_sys::Uint8Array {
let arr = js_sys::Uint8Array::new_with_length(v.len() as u32);
arr.copy_from(&v);
arr
}
pub fn vec_f32_to_array_f32(v: Vec<f32>) -> js_sys::Float32Array {
let arr = js_sys::Float32Array::new_with_length(v.len() as u32);
arr.copy_from(&v);
arr
}
pub fn array8_to_vec8(buf: js_sys::Uint8Array) -> Vec<u8> {
let mut body = vec![0; buf.length() as usize];
buf.copy_to(&mut body[..]);
body
}
pub fn vec64_to_array64(v: Vec<i64>) -> js_sys::BigInt64Array {
let arr = js_sys::BigInt64Array::new_with_length(v.len() as u32);
arr.copy_from(&v);
arr
}
pub fn vec_to_array(v: Vec<wasm_bindgen::JsValue>) -> js_sys::Array {
let arr = js_sys::Array::new_with_length(v.len() as u32);
for (i, v) in v.into_iter().enumerate() {
arr.set(i as u32, v);
}
arr
}
struct A {
shape: Vec<u32>,
data: Vec<f32>,
}
impl TryFrom<wasm_bindgen::JsValue> for A {
type Error = sbv2_core::error::Error;
fn try_from(value: wasm_bindgen::JsValue) -> Result<Self, Self::Error> {
let value: js_sys::Array = value.into();
let mut shape = vec![];
let mut data = vec![];
for (i, v) in value.iter().enumerate() {
match i {
0 => {
let v: js_sys::Uint32Array = v.into();
shape = vec![0; v.length() as usize];
v.copy_to(&mut shape);
}
1 => {
let v: js_sys::Float32Array = v.into();
data = vec![0.0; v.length() as usize];
v.copy_to(&mut data);
}
_ => {}
};
}
Ok(A { shape, data })
}
}
pub fn array_to_array2_f32(
a: wasm_bindgen::JsValue,
) -> sbv2_core::error::Result<ndarray::Array2<f32>> {
let a = A::try_from(a)?;
if a.shape.len() != 2 {
return Err(sbv2_core::error::Error::OtherError(
"Length mismatch".to_string(),
));
}
let shape = [a.shape[0] as usize, a.shape[1] as usize];
let arr = ndarray::Array2::from_shape_vec(shape, a.data.to_vec())
.map_err(|e| sbv2_core::error::Error::OtherError(e.to_string()))?;
Ok(arr)
}
pub fn array_to_array3_f32(
a: wasm_bindgen::JsValue,
) -> sbv2_core::error::Result<ndarray::Array3<f32>> {
let a = A::try_from(a)?;
if a.shape.len() != 3 {
return Err(sbv2_core::error::Error::OtherError(
"Length mismatch".to_string(),
));
}
let shape = [
a.shape[0] as usize,
a.shape[1] as usize,
a.shape[2] as usize,
];
let arr = ndarray::Array3::from_shape_vec(shape, a.data.to_vec())
.map_err(|e| sbv2_core::error::Error::OtherError(e.to_string()))?;
Ok(arr)
}
pub fn array2_f32_to_array(a: ndarray::Array2<f32>) -> js_sys::Array {
let shape: Vec<wasm_bindgen::JsValue> = a.shape().iter().map(|f| (*f as u32).into()).collect();
let typed_array = js_sys::Float32Array::new_with_length(a.len() as u32);
typed_array.copy_from(&a.into_flat().to_vec());
vec_to_array(vec![vec_to_array(shape).into(), typed_array.into()])
}
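// Both conversion directions share the JS-side convention assumed throughout
// this module: a two-element js_sys::Array of [Uint32Array shape, Float32Array data].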

crates/sbv2_wasm/src/lib.rs Normal file

@@ -0,0 +1,123 @@
use once_cell::sync::Lazy;
use sbv2_core::*;
use wasm_bindgen::prelude::*;
use wasm_bindgen_futures::JsFuture;
mod array_helper;
static JTALK: Lazy<jtalk::JTalk> = Lazy::new(|| jtalk::JTalk::new().unwrap());
#[wasm_bindgen]
pub struct TokenizerWrap {
tokenizer: tokenizer::Tokenizer,
}
#[wasm_bindgen]
pub fn load_tokenizer(s: js_sys::JsString) -> Result<TokenizerWrap, JsError> {
if let Some(s) = s.as_string() {
Ok(TokenizerWrap {
tokenizer: tokenizer::Tokenizer::from_bytes(s.as_bytes())
.map_err(|e| JsError::new(&e.to_string()))?,
})
} else {
Err(JsError::new("invalid utf8"))
}
}
#[wasm_bindgen]
pub struct StyleVectorWrap {
style_vector: ndarray::Array2<f32>,
}
#[wasm_bindgen]
pub fn load_sbv2file(buf: js_sys::Uint8Array) -> Result<js_sys::Array, JsError> {
let (style_vectors, vits2) = sbv2file::parse_sbv2file(array_helper::array8_to_vec8(buf))?;
let buf = array_helper::vec8_to_array8(vits2);
Ok(array_helper::vec_to_array(vec![
StyleVectorWrap {
style_vector: style::load_style(style_vectors)?,
}
.into(),
buf.into(),
]))
}
#[allow(clippy::too_many_arguments)]
#[wasm_bindgen]
pub async fn synthesize(
text: &str,
tokenizer: &TokenizerWrap,
bert_predict_fn: js_sys::Function,
synthesize_fn: js_sys::Function,
sdp_ratio: f32,
length_scale: f32,
style_id: i32,
style_weight: f32,
style_vectors: &StyleVectorWrap,
) -> Result<js_sys::Uint8Array, JsError> {
let synthesize_wrap = |bert_ori: ndarray::Array2<f32>,
x_tst: ndarray::Array1<i64>,
tones: ndarray::Array1<i64>,
lang_ids: ndarray::Array1<i64>,
style_vector: ndarray::Array1<f32>,
sdp_ratio: f32,
length_scale: f32| async move {
let arr = array_helper::vec_to_array(vec![
array_helper::array2_f32_to_array(bert_ori).into(),
array_helper::vec64_to_array64(x_tst.to_vec()).into(),
array_helper::vec64_to_array64(tones.to_vec()).into(),
array_helper::vec64_to_array64(lang_ids.to_vec()).into(),
array_helper::vec_f32_to_array_f32(style_vector.to_vec()).into(),
sdp_ratio.into(),
length_scale.into(),
]);
let res = synthesize_fn
.apply(&js_sys::Object::new().into(), &arr)
.map_err(|e| {
error::Error::OtherError(e.as_string().unwrap_or("unknown".to_string()))
})?;
let res = JsFuture::from(Into::<js_sys::Promise>::into(res))
.await
.map_err(|e| {
sbv2_core::error::Error::OtherError(e.as_string().unwrap_or("unknown".to_string()))
})?;
array_helper::array_to_array3_f32(res)
};
let (bert_ori, phones, tones, lang_ids) = tts_util::parse_text(
text,
&JTALK,
&tokenizer.tokenizer,
|token_ids: Vec<i64>, attention_masks: Vec<i64>| {
Box::pin(async move {
let arr = array_helper::vec_to_array(vec![
array_helper::vec64_to_array64(token_ids).into(),
array_helper::vec64_to_array64(attention_masks).into(),
]);
let res = bert_predict_fn
.apply(&js_sys::Object::new().into(), &arr)
.map_err(|e| {
error::Error::OtherError(e.as_string().unwrap_or("unknown".to_string()))
})?;
let res = JsFuture::from(Into::<js_sys::Promise>::into(res))
.await
.map_err(|e| {
sbv2_core::error::Error::OtherError(
e.as_string().unwrap_or("unknown".to_string()),
)
})?;
array_helper::array_to_array2_f32(res)
})
},
)
.await?;
let audio = synthesize_wrap(
bert_ori.to_owned(),
phones,
tones,
lang_ids,
style::get_style_vector(&style_vectors.style_vector, style_id, style_weight)?,
sdp_ratio,
length_scale,
)
.await?;
Ok(array_helper::vec8_to_array8(tts_util::array_to_vec(audio)?))
}


@@ -0,0 +1,15 @@
{
"compilerOptions": {
"target": "ESNext",
"module": "ESNext",
"rootDir": "./src-js",
"outDir": "./dist",
"moduleResolution": "node",
"esModuleInterop": true,
"forceConsistentCasingInFileNames": true,
"strict": true,
"skipLibCheck": true,
"declaration": true,
"emitDeclarationOnly": true
}
}


@@ -1,9 +0,0 @@
FROM rust AS builder
WORKDIR /work
COPY . .
RUN cargo build -r --bin sbv2_api
FROM gcr.io/distroless/cc-debian12
WORKDIR /work
COPY --from=builder /work/target/release/sbv2_api /work/main
COPY --from=builder /work/target/release/*.so /work
CMD ["/work/main"]


@@ -1,22 +0,0 @@
[package]
name = "sbv2_api"
version = "0.1.0"
edition = "2021"
[dependencies]
anyhow.workspace = true
axum = "0.7.5"
dotenvy.workspace = true
env_logger.workspace = true
log = "0.4.22"
sbv2_core = { version = "0.1.3", path = "../sbv2_core" }
serde = { version = "1.0.210", features = ["derive"] }
tokio = { version = "1.40.0", features = ["full"] }
[features]
coreml = ["sbv2_core/coreml"]
cuda = ["sbv2_core/cuda"]
cuda_tf32 = ["sbv2_core/cuda_tf32"]
dynamic = ["sbv2_core/dynamic"]
directml = ["sbv2_core/directml"]
tensorrt = ["sbv2_core/tensorrt"]


@@ -1,15 +0,0 @@
[package]
name = "sbv2_bindings"
version = "0.1.0"
edition = "2021"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[lib]
name = "sbv2_bindings"
crate-type = ["cdylib"]
[dependencies]
anyhow.workspace = true
ndarray.workspace = true
pyo3 = { version = "0.22.0", features = ["anyhow"] }
sbv2_core = { version = "0.1.4", path = "../sbv2_core" }


@@ -1,35 +0,0 @@
[package]
name = "sbv2_core"
description = "Style-Bert-VITSの推論ライブラリ"
version = "0.1.4"
edition = "2021"
license = "MIT"
readme = "../README.md"
repository = "https://github.com/tuna2134/sbv2-api"
documentation = "https://docs.rs/sbv2_core"
[dependencies]
anyhow.workspace = true
dotenvy.workspace = true
env_logger.workspace = true
hound = "3.5.1"
jpreprocess = { version = "0.10.0", features = ["naist-jdic"] }
ndarray.workspace = true
num_cpus = "1.16.0"
once_cell = "1.19.0"
ort = { git = "https://github.com/pykeio/ort.git", version = "2.0.0-rc.6" }
regex = "1.10.6"
serde = { version = "1.0.210", features = ["derive"] }
serde_json = "1.0.128"
tar = "0.4.41"
thiserror = "1.0.63"
tokenizers = "0.20.0"
zstd = "0.13.2"
[features]
cuda = ["ort/cuda"]
cuda_tf32 = []
dynamic = ["ort/load-dynamic"]
directml = ["ort/directml"]
tensorrt = ["ort/tensorrt"]
coreml = ["ort/coreml"]


@@ -1,23 +0,0 @@
use crate::error::Result;
use ndarray::Array2;
use ort::Session;
pub fn predict(
session: &Session,
token_ids: Vec<i64>,
attention_masks: Vec<i64>,
) -> Result<Array2<f32>> {
let outputs = session.run(
ort::inputs! {
"input_ids" => Array2::from_shape_vec((1, token_ids.len()), token_ids).unwrap(),
"attention_mask" => Array2::from_shape_vec((1, attention_masks.len()), attention_masks).unwrap(),
}?
)?;
let output = outputs.get("output").unwrap();
let content = output.try_extract_tensor::<f32>()?.to_owned();
let (data, _) = content.clone().into_raw_vec_and_offset();
Ok(Array2::from_shape_vec((content.shape()[0], content.shape()[1]), data).unwrap())
}


@@ -1,403 +0,0 @@
use crate::error::{Error, Result};
use crate::mora::{MORA_KATA_TO_MORA_PHONEMES, VOWELS};
use crate::norm::{replace_punctuation, PUNCTUATIONS};
use jpreprocess::*;
use once_cell::sync::Lazy;
use regex::Regex;
use std::cmp::Reverse;
use std::collections::HashSet;
use std::sync::Arc;
type JPreprocessType = JPreprocess<DefaultFetcher>;
fn initialize_jtalk() -> Result<JPreprocessType> {
let config = JPreprocessConfig {
dictionary: SystemDictionaryConfig::Bundled(kind::JPreprocessDictionaryKind::NaistJdic),
user_dictionary: None,
};
let jpreprocess = JPreprocess::from_config(config)?;
Ok(jpreprocess)
}
static JTALK_G2P_G_A1_PATTERN: Lazy<Regex> = Lazy::new(|| Regex::new(r"/A:([0-9\-]+)\+").unwrap());
static JTALK_G2P_G_A2_PATTERN: Lazy<Regex> = Lazy::new(|| Regex::new(r"\+(\d+)\+").unwrap());
static JTALK_G2P_G_A3_PATTERN: Lazy<Regex> = Lazy::new(|| Regex::new(r"\+(\d+)/").unwrap());
static JTALK_G2P_G_E3_PATTERN: Lazy<Regex> = Lazy::new(|| Regex::new(r"!(\d+)_").unwrap());
static JTALK_G2P_G_F1_PATTERN: Lazy<Regex> = Lazy::new(|| Regex::new(r"/F:(\d+)_").unwrap());
static JTALK_G2P_G_P3_PATTERN: Lazy<Regex> = Lazy::new(|| Regex::new(r"\-(.*?)\+").unwrap());
fn numeric_feature_by_regex(regex: &Regex, text: &str) -> i32 {
if let Some(mat) = regex.captures(text) {
mat[1].parse::<i32>().unwrap()
} else {
-50
}
}
macro_rules! hash_set {
($($elem:expr),* $(,)?) => {{
let mut set = HashSet::new();
$(
set.insert($elem);
)*
set
}};
}
pub struct JTalk {
pub jpreprocess: Arc<JPreprocessType>,
}
impl JTalk {
pub fn new() -> Result<Self> {
let jpreprocess = Arc::new(initialize_jtalk()?);
Ok(Self { jpreprocess })
}
pub fn num2word(&self, text: &str) -> Result<String> {
let mut parsed = self.jpreprocess.text_to_njd(text)?;
parsed.preprocess();
let texts: Vec<String> = parsed
.nodes
.iter()
.map(|x| x.get_string().to_string())
.collect();
Ok(texts.join(""))
}
pub fn process_text(&self, text: &str) -> Result<JTalkProcess> {
let parsed = self.jpreprocess.run_frontend(text)?;
let jtalk_process = JTalkProcess::new(Arc::clone(&self.jpreprocess), parsed);
Ok(jtalk_process)
}
}
static KATAKANA_PATTERN: Lazy<Regex> = Lazy::new(|| Regex::new(r"[\u30A0-\u30FF]+").unwrap());
static MORA_PATTERN: Lazy<Vec<String>> = Lazy::new(|| {
let mut sorted_keys: Vec<String> = MORA_KATA_TO_MORA_PHONEMES.keys().cloned().collect();
sorted_keys.sort_by_key(|b| Reverse(b.len()));
sorted_keys
});
static LONG_PATTERN: Lazy<Regex> = Lazy::new(|| Regex::new(r"(\w)(ー*)").unwrap());
pub struct JTalkProcess {
jpreprocess: Arc<JPreprocessType>,
parsed: Vec<String>,
}
impl JTalkProcess {
fn new(jpreprocess: Arc<JPreprocessType>, parsed: Vec<String>) -> Self {
Self {
jpreprocess,
parsed,
}
}
fn fix_phone_tone(&self, phone_tone_list: Vec<(String, i32)>) -> Result<Vec<(String, i32)>> {
let tone_values: HashSet<i32> = phone_tone_list
.iter()
.map(|(_letter, tone)| *tone)
.collect();
if tone_values.len() == 1 {
assert!(tone_values == hash_set![0], "{:?}", tone_values);
Ok(phone_tone_list)
} else if tone_values.len() == 2 {
if tone_values == hash_set![0, 1] {
return Ok(phone_tone_list);
} else if tone_values == hash_set![-1, 0] {
return Ok(phone_tone_list
.iter()
.map(|x| {
let new_tone = if x.1 == -1 { 0 } else { 1 };
(x.0.clone(), new_tone)
})
.collect());
} else {
return Err(Error::ValueError("Invalid tone values 0".to_string()));
}
} else {
return Err(Error::ValueError("Invalid tone values 1".to_string()));
}
}
pub fn g2p(&self) -> Result<(Vec<String>, Vec<i32>, Vec<i32>)> {
let phone_tone_list_wo_punct = self.g2phone_tone_wo_punct()?;
let (seq_text, seq_kata) = self.text_to_seq_kata()?;
let sep_phonemes = JTalkProcess::handle_long(
seq_kata
.iter()
.map(|x| JTalkProcess::kata_to_phoneme_list(x.clone()).unwrap())
.collect(),
);
let phone_w_punct: Vec<String> = sep_phonemes
.iter()
.flat_map(|x| x.iter())
.cloned()
.collect();
let mut phone_tone_list =
JTalkProcess::align_tones(phone_w_punct, phone_tone_list_wo_punct)?;
let mut sep_tokenized: Vec<Vec<String>> = Vec::new();
for seq_text_item in &seq_text {
let text = seq_text_item.clone();
if !PUNCTUATIONS.contains(&text.as_str()) {
sep_tokenized.push(text.chars().map(|x| x.to_string()).collect());
} else {
sep_tokenized.push(vec![text]);
}
}
let mut word2ph = Vec::new();
for (token, phoneme) in sep_tokenized.iter().zip(sep_phonemes.iter()) {
let phone_len = phoneme.len() as i32;
let word_len = token.len() as i32;
word2ph.append(&mut JTalkProcess::distribute_phone(phone_len, word_len));
}
let mut new_phone_tone_list = vec![("_".to_string(), 0)];
new_phone_tone_list.append(&mut phone_tone_list);
new_phone_tone_list.push(("_".to_string(), 0));
let mut new_word2ph = vec![1];
new_word2ph.extend(word2ph.clone());
new_word2ph.push(1);
let phones: Vec<String> = new_phone_tone_list.iter().map(|(x, _)| x.clone()).collect();
let tones: Vec<i32> = new_phone_tone_list.iter().map(|(_, x)| *x).collect();
Ok((phones, tones, new_word2ph))
}
fn distribute_phone(n_phone: i32, n_word: i32) -> Vec<i32> {
let mut phones_per_word = vec![0; n_word as usize];
for _ in 0..n_phone {
let min_task = phones_per_word.iter().min().unwrap();
let min_index = phones_per_word
.iter()
.position(|&x| x == *min_task)
.unwrap();
phones_per_word[min_index] += 1;
}
phones_per_word
}
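// Worked example: distribute_phone(5, 2) returns [3, 2]; each phone is
// assigned to whichever word currently has the fewest.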
fn align_tones(
phone_with_punct: Vec<String>,
phone_tone_list: Vec<(String, i32)>,
) -> Result<Vec<(String, i32)>> {
let mut result: Vec<(String, i32)> = Vec::new();
let mut tone_index = 0;
for phone in phone_with_punct.clone() {
if tone_index >= phone_tone_list.len() {
result.push((phone, 0));
} else if phone == phone_tone_list[tone_index].0 {
result.push((phone, phone_tone_list[tone_index].1));
tone_index += 1;
} else if PUNCTUATIONS.contains(&phone.as_str()) {
result.push((phone, 0));
} else {
println!("phones {:?}", phone_with_punct);
println!("phone_tone_list: {:?}", phone_tone_list);
println!("result: {:?}", result);
println!("tone_index: {:?}", tone_index);
println!("phone: {:?}", phone);
return Err(Error::ValueError(format!("Mismatched phoneme: {}", phone)));
}
}
Ok(result)
}
fn handle_long(mut sep_phonemes: Vec<Vec<String>>) -> Vec<Vec<String>> {
for i in 0..sep_phonemes.len() {
if sep_phonemes[i].is_empty() {
continue;
}
if sep_phonemes[i][0] == "ー" {
if i != 0 {
let prev_phoneme = sep_phonemes[i - 1].last().unwrap();
if VOWELS.contains(&prev_phoneme.as_str()) {
sep_phonemes[i][0] = prev_phoneme.clone();
} else {
sep_phonemes[i][0] = "ー".to_string();
}
} else {
sep_phonemes[i][0] = "ー".to_string();
}
}
if sep_phonemes[i].contains(&"ー".to_string()) {
for e in 0..sep_phonemes[i].len() {
if sep_phonemes[i][e] == "ー" {
sep_phonemes[i][e] =
sep_phonemes[i][e - 1].chars().last().unwrap().to_string();
}
}
}
}
sep_phonemes
}
fn kata_to_phoneme_list(mut text: String) -> Result<Vec<String>> {
if PUNCTUATIONS.contains(&text.as_str()) {
return Ok(text.chars().map(|x| x.to_string()).collect());
}
if !KATAKANA_PATTERN.is_match(&text) {
return Err(Error::ValueError(format!(
"Input must be katakana only: {}",
text
)));
}
for mora in MORA_PATTERN.iter() {
let mora = mora.to_string();
let (consonant, vowel) = MORA_KATA_TO_MORA_PHONEMES.get(&mora).unwrap();
if consonant.is_none() {
text = text.replace(&mora, &format!(" {}", vowel));
} else {
text = text.replace(
&mora,
&format!(" {} {}", consonant.as_ref().unwrap(), vowel),
);
}
}
let long_replacement = |m: &regex::Captures| {
let result = m.get(1).unwrap().as_str().to_string();
let mut second = String::new();
for _ in 0..m.get(2).unwrap().as_str().char_indices().count() {
second += &format!(" {}", m.get(1).unwrap().as_str());
}
result + &second
};
text = LONG_PATTERN
.replace_all(&text, long_replacement)
.to_string();
let data = text.trim().split(' ').map(|x| x.to_string()).collect();
Ok(data)
}
pub fn text_to_seq_kata(&self) -> Result<(Vec<String>, Vec<String>)> {
let mut seq_kata = vec![];
let mut seq_text = vec![];
for parts in &self.parsed {
let (string, pron) = self.parse_to_string_and_pron(parts.clone());
let mut yomi = pron.replace('’', "");
let word = replace_punctuation(string);
assert!(!yomi.is_empty(), "Empty yomi: {}", word);
if yomi == "、" {
if !word
.chars()
.all(|x| PUNCTUATIONS.contains(&x.to_string().as_str()))
{
yomi = "'".repeat(word.len());
} else {
yomi = word.clone();
}
} else if yomi == "？" {
assert!(word == "？", "yomi `？` comes from: {}", word);
yomi = "?".to_string();
}
seq_text.push(word);
seq_kata.push(yomi);
}
Ok((seq_text, seq_kata))
}
fn parse_to_string_and_pron(&self, parts: String) -> (String, String) {
let part_lists: Vec<String> = parts.split(',').map(|x| x.to_string()).collect();
(part_lists[0].clone(), part_lists[9].clone())
}
fn g2phone_tone_wo_punct(&self) -> Result<Vec<(String, i32)>> {
let prosodies = self.g2p_prosody()?;
let mut results: Vec<(String, i32)> = Vec::new();
let mut current_phrase: Vec<(String, i32)> = Vec::new();
let mut current_tone = 0;
for (i, letter) in prosodies.iter().enumerate() {
if letter == "^" {
assert!(i == 0);
} else if ["$", "?", "_", "#"].contains(&letter.as_str()) {
results.extend(self.fix_phone_tone(current_phrase.clone())?);
if ["$", "?"].contains(&letter.as_str()) {
assert!(i == prosodies.len() - 1);
}
current_phrase = Vec::new();
current_tone = 0;
} else if letter == "[" {
current_tone += 1;
} else if letter == "]" {
current_tone -= 1;
} else {
let new_letter = if letter == "cl" {
"q".to_string()
} else {
letter.clone()
};
current_phrase.push((new_letter, current_tone));
}
}
Ok(results)
}
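// [Editor's sketch, not in the original source] In the loop above, "[" raises
// and "]" lowers the running tone, while "$", "?", "_" and "#" flush the
// current phrase through fix_phone_tone (defined elsewhere in this impl). For
// a prosody stream like `^ k o [ N n i ch i w a $`, "k" and "o" carry tone 0
// and every phone after "[" carries tone 1, prior to fix_phone_tone's
// normalization.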
fn g2p_prosody(&self) -> Result<Vec<String>> {
let labels = self.jpreprocess.make_label(self.parsed.clone());
let mut phones: Vec<String> = Vec::new();
for (i, label) in labels.iter().enumerate() {
let mut p3 = {
let label_text = label.to_string();
let matched = JTALK_G2P_G_P3_PATTERN.captures(&label_text).unwrap();
matched[1].to_string()
};
if "AIUEO".contains(&p3) {
// lowercase the vowel (A/I/U/E/O -> a/i/u/e/o)
p3 = p3.to_lowercase();
}
if p3 == "sil" {
assert!(i == 0 || i == labels.len() - 1);
if i == 0 {
phones.push("^".to_string());
} else if i == labels.len() - 1 {
let e3 = numeric_feature_by_regex(&JTALK_G2P_G_E3_PATTERN, &label.to_string());
if e3 == 0 {
phones.push("$".to_string());
} else if e3 == 1 {
phones.push("?".to_string());
}
}
continue;
} else if p3 == "pau" {
phones.push("_".to_string());
continue;
} else {
phones.push(p3.clone());
}
// accent features from the full-context label:
// A1: relative accent position, A2: forward mora position in the accent
// phrase, A3: backward mora position, F1: accent phrase length in moras
let a1 = numeric_feature_by_regex(&JTALK_G2P_G_A1_PATTERN, &label.to_string());
let a2 = numeric_feature_by_regex(&JTALK_G2P_G_A2_PATTERN, &label.to_string());
let a3 = numeric_feature_by_regex(&JTALK_G2P_G_A3_PATTERN, &label.to_string());
let f1 = numeric_feature_by_regex(&JTALK_G2P_G_F1_PATTERN, &label.to_string());
let a2_next =
numeric_feature_by_regex(&JTALK_G2P_G_A2_PATTERN, &labels[i + 1].to_string());
if a3 == 1 && a2_next == 1 && "aeiouAEIOUNcl".contains(&p3) {
// accent phrase border
phones.push("#".to_string());
} else if a1 == 0 && a2_next == a2 + 1 && a2 != f1 {
// pitch falling
phones.push("]".to_string());
} else if a2 == 1 && a2_next == 2 {
// pitch rising
phones.push("[".to_string());
}
}
Ok(phones)
}
}


@@ -1,21 +0,0 @@
use std::fs;
use sbv2_core::tts;
use std::env;
fn main() -> anyhow::Result<()> {
dotenvy::dotenv_override().ok();
env_logger::init();
let text = fs::read_to_string("content.txt")?;
let ident = "aaa";
let mut tts_holder = tts::TTSModelHolder::new(
&fs::read(env::var("BERT_MODEL_PATH")?)?,
&fs::read(env::var("TOKENIZER_PATH")?)?,
)?;
tts_holder.load_sbv2file(ident, fs::read(env::var("MODEL_PATH")?)?)?;
let audio = tts_holder.easy_synthesize(ident, &text, 0, tts::SynthesizeOptions::default())?;
fs::write("output.wav", audio)?;
Ok(())
}


@@ -1,93 +0,0 @@
use crate::error::Result;
use ndarray::{array, Array1, Array2, Array3, Axis};
use ort::{GraphOptimizationLevel, Session};
#[allow(clippy::vec_init_then_push, unused_variables)]
pub fn load_model<P: AsRef<[u8]>>(model_file: P, bert: bool) -> Result<Session> {
let mut exp = Vec::new();
#[cfg(feature = "tensorrt")]
{
if bert {
exp.push(
ort::TensorRTExecutionProvider::default()
.with_fp16(true)
.with_profile_min_shapes("input_ids:1x1,attention_mask:1x1")
.with_profile_max_shapes("input_ids:1x100,attention_mask:1x100")
.with_profile_opt_shapes("input_ids:1x25,attention_mask:1x25")
.build(),
);
}
}
#[cfg(feature = "cuda")]
{
#[allow(unused_mut)]
let mut cuda = ort::CUDAExecutionProvider::default()
.with_conv_algorithm_search(ort::CUDAExecutionProviderCuDNNConvAlgoSearch::Default);
#[cfg(feature = "cuda_tf32")]
{
cuda = cuda.with_tf32(true);
}
exp.push(cuda.build());
}
#[cfg(feature = "directml")]
{
exp.push(ort::DirectMLExecutionProvider::default().build());
}
#[cfg(feature = "coreml")]
{
exp.push(ort::CoreMLExecutionProvider::default().build());
}
exp.push(ort::CPUExecutionProvider::default().build());
Ok(Session::builder()?
.with_execution_providers(exp)?
.with_optimization_level(GraphOptimizationLevel::Level3)?
.with_intra_threads(num_cpus::get_physical())?
.with_parallel_execution(true)?
.with_inter_threads(num_cpus::get_physical())?
.commit_from_memory(model_file.as_ref())?)
}
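// [Editor's sketch, not in the original source] Minimal usage, assuming the
// model bytes are already loaded; execution providers are picked by the
// enabled cargo features, falling back to CPU:
//
// let bytes = std::fs::read("models/deberta.onnx")?;
// let session = load_model(&bytes, true)?;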
#[allow(clippy::too_many_arguments)]
pub fn synthesize(
session: &Session,
bert_ori: Array2<f32>,
x_tst: Array1<i64>,
tones: Array1<i64>,
lang_ids: Array1<i64>,
style_vector: Array1<f32>,
sdp_ratio: f32,
length_scale: f32,
) -> Result<Array3<f32>> {
let bert = bert_ori.insert_axis(Axis(0));
let x_tst_lengths: Array1<i64> = array![x_tst.shape()[0] as i64];
let x_tst = x_tst.insert_axis(Axis(0));
let lang_ids = lang_ids.insert_axis(Axis(0));
let tones = tones.insert_axis(Axis(0));
let style_vector = style_vector.insert_axis(Axis(0));
let outputs = session.run(ort::inputs! {
"x_tst" => x_tst,
"x_tst_lengths" => x_tst_lengths,
"sid" => array![0_i64],
"tones" => tones,
"language" => lang_ids,
"bert" => bert,
"style_vec" => style_vector,
"sdp_ratio" => array![sdp_ratio],
"length_scale" => array![length_scale],
}?)?;
let audio_array = outputs
.get("output")
.unwrap()
.try_extract_tensor::<f32>()?
.to_owned();
Ok(Array3::from_shape_vec(
(
audio_array.shape()[0],
audio_array.shape()[1],
audio_array.shape()[2],
),
audio_array.into_raw_vec_and_offset().0,
)?)
}


@@ -1,370 +0,0 @@
use crate::error::{Error, Result};
use crate::{bert, jtalk, model, nlp, norm, style, tokenizer, utils};
use hound::{SampleFormat, WavSpec, WavWriter};
use ndarray::{concatenate, s, Array, Array1, Array2, Array3, Axis};
use ort::Session;
use std::io::{Cursor, Read};
use tar::Archive;
use tokenizers::Tokenizer;
use zstd::decode_all;
#[derive(PartialEq, Eq, Clone)]
pub struct TTSIdent(String);
impl std::fmt::Display for TTSIdent {
fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
f.write_str(&self.0)?;
Ok(())
}
}
impl<S> From<S> for TTSIdent
where
S: AsRef<str>,
{
fn from(value: S) -> Self {
TTSIdent(value.as_ref().to_string())
}
}
pub struct TTSModel {
vits2: Session,
style_vectors: Array2<f32>,
ident: TTSIdent,
}
/// High-level Style-Bert-VITS2's API
pub struct TTSModelHolder {
tokenizer: Tokenizer,
bert: Session,
models: Vec<TTSModel>,
jtalk: jtalk::JTalk,
}
impl TTSModelHolder {
/// Initialize a new TTSModelHolder
///
/// # Examples
///
/// ```rs
/// let mut tts_holder = TTSModelHolder::new(std::fs::read("deberta.onnx")?, std::fs::read("tokenizer.json")?)?;
/// ```
pub fn new<P: AsRef<[u8]>>(bert_model_bytes: P, tokenizer_bytes: P) -> Result<Self> {
let bert = model::load_model(bert_model_bytes, true)?;
let jtalk = jtalk::JTalk::new()?;
let tokenizer = tokenizer::get_tokenizer(tokenizer_bytes)?;
Ok(TTSModelHolder {
bert,
models: vec![],
jtalk,
tokenizer,
})
}
/// Return a list of model names
pub fn models(&self) -> Vec<String> {
self.models.iter().map(|m| m.ident.to_string()).collect()
}
/// Load a .sbv2 file binary
///
/// # Examples
///
/// ```rs
/// tts_holder.load_sbv2file("tsukuyomi", std::fs::read("tsukuyomi.sbv2")?)?;
/// ```
pub fn load_sbv2file<I: Into<TTSIdent>, P: AsRef<[u8]>>(
&mut self,
ident: I,
sbv2_bytes: P,
) -> Result<()> {
let mut arc = Archive::new(Cursor::new(decode_all(Cursor::new(sbv2_bytes.as_ref()))?));
let mut vits2 = None;
let mut style_vectors = None;
let mut et = arc.entries()?;
while let Some(Ok(mut e)) = et.next() {
let pth = String::from_utf8_lossy(&e.path_bytes()).to_string();
let mut b = Vec::with_capacity(e.size() as usize);
e.read_to_end(&mut b)?;
match pth.as_str() {
"model.onnx" => vits2 = Some(b),
"style_vectors.json" => style_vectors = Some(b),
_ => continue,
}
}
if style_vectors.is_none() {
return Err(Error::ModelNotFoundError("style_vectors".to_string()));
}
if vits2.is_none() {
return Err(Error::ModelNotFoundError("vits2".to_string()));
}
self.load(ident, style_vectors.unwrap(), vits2.unwrap())?;
Ok(())
}
/// Load a style vector and onnx model binary
///
/// # Examples
///
/// ```rs
/// tts_holder.load("tsukuyomi", std::fs::read("style_vectors.json")?, std::fs::read("model.onnx")?)?;
/// ```
pub fn load<I: Into<TTSIdent>, P: AsRef<[u8]>>(
&mut self,
ident: I,
style_vectors_bytes: P,
vits2_bytes: P,
) -> Result<()> {
let ident = ident.into();
if self.find_model(ident.clone()).is_err() {
self.models.push(TTSModel {
vits2: model::load_model(vits2_bytes, false)?,
style_vectors: style::load_style(style_vectors_bytes)?,
ident,
})
}
Ok(())
}
/// Unload a model
pub fn unload<I: Into<TTSIdent>>(&mut self, ident: I) -> bool {
let ident = ident.into();
if let Some((i, _)) = self
.models
.iter()
.enumerate()
.find(|(_, m)| m.ident == ident)
{
self.models.remove(i);
true
} else {
false
}
}
/// Parse text and return the input for synthesize
///
/// # Note
/// This function is for low-level usage, use `easy_synthesize` for high-level usage.
#[allow(clippy::type_complexity)]
pub fn parse_text(
&self,
text: &str,
) -> Result<(Array2<f32>, Array1<i64>, Array1<i64>, Array1<i64>)> {
let text = self.jtalk.num2word(text)?;
let normalized_text = norm::normalize_text(&text);
let process = self.jtalk.process_text(&normalized_text)?;
let (phones, tones, mut word2ph) = process.g2p()?;
let (phones, tones, lang_ids) = nlp::cleaned_text_to_sequence(phones, tones);
let phones = utils::intersperse(&phones, 0);
let tones = utils::intersperse(&tones, 0);
let lang_ids = utils::intersperse(&lang_ids, 0);
for item in &mut word2ph {
*item *= 2;
}
word2ph[0] += 1;
let text = {
let (seq_text, _) = process.text_to_seq_kata()?;
seq_text.join("")
};
let (token_ids, attention_masks) = tokenizer::tokenize(&text, &self.tokenizer)?;
let bert_content = bert::predict(&self.bert, token_ids, attention_masks)?;
assert!(
word2ph.len() == text.chars().count() + 2,
"{} {}",
word2ph.len(),
text.chars().count()
);
let mut phone_level_feature = vec![];
for (i, reps) in word2ph.iter().enumerate() {
let repeat_feature = {
let (reps_rows, reps_cols) = (*reps, 1);
let arr_len = bert_content.slice(s![i, ..]).len();
let mut results: Array2<f32> =
Array::zeros((reps_rows as usize, arr_len * reps_cols));
for j in 0..reps_rows {
for k in 0..reps_cols {
let mut view = results.slice_mut(s![j, k * arr_len..(k + 1) * arr_len]);
view.assign(&bert_content.slice(s![i, ..]));
}
}
results
};
phone_level_feature.push(repeat_feature);
}
let phone_level_feature = concatenate(
Axis(0),
&phone_level_feature
.iter()
.map(|x| x.view())
.collect::<Vec<_>>(),
)?;
let bert_ori = phone_level_feature.t();
Ok((
bert_ori.to_owned(),
phones.into(),
tones.into(),
lang_ids.into(),
))
}
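// [Editor's sketch, not in the original source] A low-level flow tying
// parse_text to the raw synthesize below; easy_synthesize wraps essentially
// this, plus sentence splitting and the style-vector lookup:
//
// let (bert_ori, phones, tones, lang_ids) = holder.parse_text("こんにちは")?;
// let style = holder.get_style_vector("tsukuyomi", 0, 1.0)?;
// let wav = holder.synthesize(
// "tsukuyomi", bert_ori, phones, tones, lang_ids, style, 0.0, 1.0,
// )?;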
fn find_model<I: Into<TTSIdent>>(&self, ident: I) -> Result<&TTSModel> {
let ident = ident.into();
self.models
.iter()
.find(|m| m.ident == ident)
.ok_or(Error::ModelNotFoundError(ident.to_string()))
}
/// Get style vector by style id and weight
///
/// # Note
/// This function is for low-level usage, use `easy_synthesize` for high-level usage.
pub fn get_style_vector<I: Into<TTSIdent>>(
&self,
ident: I,
style_id: i32,
weight: f32,
) -> Result<Array1<f32>> {
style::get_style_vector(&self.find_model(ident)?.style_vectors, style_id, weight)
}
/// Synthesize text to audio
///
/// # Examples
///
/// ```rs
/// let audio = tts_holder.easy_synthesize("tsukuyomi", "こんにちは", 0, SynthesizeOptions::default())?;
/// ```
pub fn easy_synthesize<I: Into<TTSIdent> + Copy>(
&self,
ident: I,
text: &str,
style_id: i32,
options: SynthesizeOptions,
) -> Result<Vec<u8>> {
let style_vector = self.get_style_vector(ident, style_id, options.style_weight)?;
let audio_array = if options.split_sentences {
let texts: Vec<&str> = text.split('\n').collect();
let mut audios = vec![];
for (i, t) in texts.iter().enumerate() {
if t.is_empty() {
continue;
}
let (bert_ori, phones, tones, lang_ids) = self.parse_text(t)?;
let audio = model::synthesize(
&self.find_model(ident)?.vits2,
bert_ori.to_owned(),
phones,
tones,
lang_ids,
style_vector.clone(),
options.sdp_ratio,
options.length_scale,
)?;
audios.push(audio.clone());
if i != texts.len() - 1 {
audios.push(Array3::zeros((1, 1, 22050)));
}
}
concatenate(
Axis(2),
&audios.iter().map(|x| x.view()).collect::<Vec<_>>(),
)?
} else {
let (bert_ori, phones, tones, lang_ids) = self.parse_text(text)?;
model::synthesize(
&self.find_model(ident)?.vits2,
bert_ori.to_owned(),
phones,
tones,
lang_ids,
style_vector,
options.sdp_ratio,
options.length_scale,
)?
};
Self::array_to_vec(audio_array)
}
fn array_to_vec(audio_array: Array3<f32>) -> Result<Vec<u8>> {
let spec = WavSpec {
channels: 1,
sample_rate: 44100,
bits_per_sample: 32,
sample_format: SampleFormat::Float,
};
let mut cursor = Cursor::new(Vec::new());
let mut writer = WavWriter::new(&mut cursor, spec)?;
for i in 0..audio_array.shape()[0] {
let output = audio_array.slice(s![i, 0, ..]).to_vec();
for sample in output {
writer.write_sample(sample)?;
}
}
writer.finalize()?;
Ok(cursor.into_inner())
}
/// Synthesize text to audio
///
/// # Note
/// This function is for low-level usage, use `easy_synthesize` for high-level usage.
#[allow(clippy::too_many_arguments)]
pub fn synthesize<I: Into<TTSIdent>>(
&self,
ident: I,
bert_ori: Array2<f32>,
phones: Array1<i64>,
tones: Array1<i64>,
lang_ids: Array1<i64>,
style_vector: Array1<f32>,
sdp_ratio: f32,
length_scale: f32,
) -> Result<Vec<u8>> {
let audio_array = model::synthesize(
&self.find_model(ident)?.vits2,
bert_ori.to_owned(),
phones,
tones,
lang_ids,
style_vector,
sdp_ratio,
length_scale,
)?;
Self::array_to_vec(audio_array)
}
}
/// Synthesize options
///
/// # Fields
/// - `sdp_ratio`: SDP ratio
/// - `length_scale`: Length scale
/// - `style_weight`: Style weight
/// - `split_sentences`: Split sentences
pub struct SynthesizeOptions {
pub sdp_ratio: f32,
pub length_scale: f32,
pub style_weight: f32,
pub split_sentences: bool,
}
impl Default for SynthesizeOptions {
fn default() -> Self {
SynthesizeOptions {
sdp_ratio: 0.0,
length_scale: 1.0,
style_weight: 1.0,
split_sentences: true,
}
}
}
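// [Editor's sketch, not in the original source] Overriding selected defaults
// via struct-update syntax; the values here are illustrative only:
//
// let options = SynthesizeOptions {
// sdp_ratio: 0.2,
// length_scale: 0.9,
// ..Default::default()
// };
// let audio = tts_holder.easy_synthesize("tsukuyomi", "こんにちは", 0, options)?;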

scripts/.gitignore vendored Normal file

@@ -0,0 +1,5 @@
*.json
venv/
tmp/
*.safetensors
*.npy


@@ -0,0 +1 @@
3.11


@@ -5,6 +5,7 @@ from transformers import AutoModelForMaskedLM, AutoTokenizer
 import torch
 from torch import nn
 from argparse import ArgumentParser
+import os
 parser = ArgumentParser()
 parser.add_argument("--model", default="ku-nlp/deberta-v2-large-japanese-char-wwm")
@@ -15,7 +16,7 @@ bert_models.load_tokenizer(Languages.JP, model_name)
 tokenizer = bert_models.load_tokenizer(Languages.JP)
 converter = BertConverter(tokenizer)
 tokenizer = converter.converted()
-tokenizer.save("../models/tokenizer.json")
+tokenizer.save("../../models/tokenizer.json")
 class ORTDeberta(nn.Module):
@@ -42,9 +43,10 @@ inputs = AutoTokenizer.from_pretrained(model_name)(
 torch.onnx.export(
 model,
 (inputs["input_ids"], inputs["token_type_ids"], inputs["attention_mask"]),
-"../models/deberta.onnx",
+"../../models/deberta.onnx",
 input_names=["input_ids", "token_type_ids", "attention_mask"],
 output_names=["output"],
 verbose=True,
 dynamic_axes={"input_ids": {1: "batch_size"}, "attention_mask": {1: "batch_size"}},
 )
+os.system("onnxsim ../../models/deberta.onnx ../../models/deberta.onnx")


@@ -36,7 +36,7 @@ data = array.tolist()
 hyper_parameters = HyperParameters.load_from_json(config_file)
 out_name = hyper_parameters.model_name
-with open(f"../models/style_vectors_{out_name}.json", "w") as f:
+with open(f"../../models/style_vectors_{out_name}.json", "w") as f:
 json.dump(
 {
 "data": data,
@@ -94,7 +94,7 @@ model = get_net_g(
 )
-def forward(x, x_len, sid, tone, lang, bert, style, length_scale, sdp_ratio):
+def forward(x, x_len, sid, tone, lang, bert, style, length_scale, sdp_ratio, noise_scale, noise_scale_w):
 return model.infer(
 x,
 x_len,
@@ -105,6 +105,8 @@ def forward(x, x_len, sid, tone, lang, bert, style, length_scale, sdp_ratio):
 style,
 sdp_ratio=sdp_ratio,
 length_scale=length_scale,
+noise_scale=noise_scale,
+noise_scale_w=noise_scale_w,
 )
@@ -122,8 +124,10 @@ torch.onnx.export(
 style_vec_tensor,
 torch.tensor(1.0),
 torch.tensor(0.0),
+torch.tensor(0.6777),
+torch.tensor(0.8),
 ),
-f"../models/model_{out_name}.onnx",
+f"../../models/model_{out_name}.onnx",
 verbose=True,
 dynamic_axes={
 "x_tst": {0: "batch_size", 1: "x_tst_max_length"},
@@ -144,14 +148,16 @@ torch.onnx.export(
 "style_vec",
 "length_scale",
 "sdp_ratio",
+"noise_scale",
+"noise_scale_w"
 ],
 output_names=["output"],
 )
-os.system(f"onnxsim ../models/model_{out_name}.onnx ../models/model_{out_name}.onnx")
-onnxfile = open(f"../models/model_{out_name}.onnx", "rb").read()
-stylefile = open(f"../models/style_vectors_{out_name}.json", "rb").read()
+os.system(f"onnxsim ../../models/model_{out_name}.onnx ../../models/model_{out_name}.onnx")
+onnxfile = open(f"../../models/model_{out_name}.onnx", "rb").read()
+stylefile = open(f"../../models/style_vectors_{out_name}.json", "rb").read()
 version = bytes("1", "utf8")
-with taropen(f"../models/tmp_{out_name}.sbv2tar", "w") as w:
+with taropen(f"../../models/tmp_{out_name}.sbv2tar", "w") as w:
 def add_tar(f, b):
 t = TarInfo(f)
@@ -161,9 +167,9 @@ with taropen(f"../models/tmp_{out_name}.sbv2tar", "w") as w:
 add_tar("version.txt", version)
 add_tar("model.onnx", onnxfile)
 add_tar("style_vectors.json", stylefile)
-open(f"../models/{out_name}.sbv2", "wb").write(
+open(f"../../models/{out_name}.sbv2", "wb").write(
 ZstdCompressor(threads=-1, level=22).compress(
-open(f"../models/tmp_{out_name}.sbv2tar", "rb").read()
+open(f"../../models/tmp_{out_name}.sbv2tar", "rb").read()
 )
 )
-os.unlink(f"../models/tmp_{out_name}.sbv2tar")
+os.unlink(f"../../models/tmp_{out_name}.sbv2tar")


@@ -0,0 +1,6 @@
git+https://github.com/neodyland/style-bert-vits2-ref
onnxsim
numpy<2
zstandard
onnxruntime
cmake<4


@@ -0,0 +1,17 @@
FROM rust AS builder
WORKDIR /work
COPY . .
RUN cargo build -r --bin sbv2_api
FROM ubuntu AS upx
WORKDIR /work
RUN apt update && apt-get install -y upx binutils
COPY --from=builder /work/target/release/sbv2_api /work/main
COPY --from=builder /work/target/release/*.so /work
RUN upx --best --lzma /work/main
RUN find /work -maxdepth 1 -name "*.so" -exec strip --strip-unneeded {} +
RUN find /work -maxdepth 1 -name "*.so" -exec upx --best --lzma {} +
FROM gcr.io/distroless/cc-debian12
WORKDIR /work
COPY --from=upx /work/main /work/main
COPY --from=upx /work/*.so /work
CMD ["/work/main"]


@@ -2,9 +2,16 @@ FROM rust AS builder
 WORKDIR /work
 COPY . .
 RUN cargo build -r --bin sbv2_api -F cuda,cuda_tf32
-FROM nvidia/cuda:12.3.2-cudnn9-runtime-ubuntu22.04
+FROM ubuntu AS upx
 WORKDIR /work
+RUN apt update && apt-get install -y upx binutils
 COPY --from=builder /work/target/release/sbv2_api /work/main
 COPY --from=builder /work/target/release/*.so /work
+RUN upx --best --lzma /work/main
+RUN find /work -maxdepth 1 -name "*.so" -exec strip --strip-unneeded {} +
+FROM nvidia/cuda:12.3.2-cudnn9-runtime-ubuntu22.04
+WORKDIR /work
+COPY --from=upx /work/main /work/main
+COPY --from=upx /work/*.so /work
 ENV LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/work
-CMD ["/work/main"]
+CMD ["/work/main"]


@@ -1,3 +1,3 @@
 docker run -it --rm -p 3000:3000 --name sbv2 \
 -v ./models:/work/models --env-file .env \
-ghcr.io/tuna2134/sbv2-api:cpu
+ghcr.io/neodyland/sbv2-api:cpu


@@ -1,4 +1,4 @@
 docker run -it --rm -p 3000:3000 --name sbv2 \
 -v ./models:/work/models --env-file .env \
 --gpus all \
-ghcr.io/tuna2134/sbv2-api:cuda
+ghcr.io/neodyland/sbv2-api:cuda

scripts/make_dict.sh Executable file

@@ -0,0 +1,14 @@
#!/bin/bash
set -e
git clone https://github.com/Aivis-Project/AivisSpeech-Engine ./scripts/tmp --filter=blob:none -n
cd ./scripts/tmp
git checkout 168b2a1144afe300b0490d9a6dd773ec6e927667 -- resources/dictionaries/*.csv
cd ../..
rm -rf ./crates/sbv2_core/src/dic
cp -r ./scripts/tmp/resources/dictionaries ./crates/sbv2_core/src/dic
rm -rf ./scripts/tmp
for file in ./crates/sbv2_core/src/dic/0*.csv; do
/usr/bin/cat "$file"
echo
done > ./crates/sbv2_core/src/all.csv
lindera build ./crates/sbv2_core/src/all.csv ./crates/sbv2_core/src/dic/all.dic -u -k ipadic


@@ -0,0 +1,180 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 音声合成プログラム\n",
"\n",
"このノートブックでは、`sbv2_bindings` パッケージを使用して音声合成を行います。必要なモデルをダウンロードし、ユーザーが入力したテキストから音声を生成します。音声合成が終わったら、再度テキストの入力を求め、ユーザーが終了するまで繰り返します。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# 必要なパッケージのインストール\n",
"%pip install sbv2_bindings\n",
"\n",
"# 必要なモジュールのインポート\n",
"import os\n",
"import urllib.request\n",
"import time\n",
"from sbv2_bindings import TTSModel"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## モデルのダウンロード\n",
"\n",
"モデルファイルとトークナイザーをダウンロードします。ユーザーが独自のモデルを使用したい場合は、該当するURLまたはローカルパスを指定してください。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# モデルの URL またはローカルパスの指定\n",
"user_sbv2_model_url = \"\" # カスタムモデルのURLがあればここに指定\n",
"user_sbv2_model_path = \"\" # カスタムモデルのローカルパスがあればここに指定\n",
"\n",
"# モデル用のディレクトリを作成\n",
"model_dir = 'models'\n",
"os.makedirs(model_dir, exist_ok=True)\n",
"\n",
"# ダウンロードするファイルの URL\n",
"file_urls = [\n",
" \"https://huggingface.co/googlefan/sbv2_onnx_models/resolve/main/tokenizer.json\",\n",
" \"https://huggingface.co/googlefan/sbv2_onnx_models/resolve/main/deberta.onnx\",\n",
"]\n",
"\n",
"# モデルのパス決定\n",
"if user_sbv2_model_path:\n",
" sbv2_model_path = user_sbv2_model_path # ローカルモデルのパスを使用\n",
"elif user_sbv2_model_url:\n",
" sbv2_model_filename = os.path.basename(user_sbv2_model_url)\n",
" sbv2_model_path = os.path.join(model_dir, sbv2_model_filename)\n",
" file_urls.append(user_sbv2_model_url)\n",
"else:\n",
" # デフォルトのモデルを使用\n",
" sbv2_model_filename = \"tsukuyomi.sbv2\"\n",
" sbv2_model_path = os.path.join(model_dir, sbv2_model_filename)\n",
" file_urls.append(\"https://huggingface.co/googlefan/sbv2_onnx_models/resolve/main/tsukuyomi.sbv2\")\n",
"\n",
"# ファイルをダウンロード\n",
"for url in file_urls:\n",
" file_name = os.path.join(model_dir, os.path.basename(url))\n",
" if not os.path.exists(file_name):\n",
" print(f\"{file_name} をダウンロードしています...\")\n",
" urllib.request.urlretrieve(url, file_name)\n",
" else:\n",
" print(f\"{file_name} は既に存在します。\")\n",
"\n",
"# ダウンロードまたは使用するファイルを確認\n",
"print(\"\\n使用するファイル:\")\n",
"for file in os.listdir(model_dir):\n",
" print(file)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## モデルの読み込みと音声合成\n",
"\n",
"モデルを読み込み、ユーザーが入力したテキストから音声を生成します。話者名は使用する `.sbv2` ファイル名から自動的に取得します。音声合成が終わったら、再度テキストの入力を求め、ユーザーが終了するまで繰り返します。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# 音声合成の実行\n",
"def main():\n",
" try:\n",
" print(\"\\nモデルを読み込んでいます...\")\n",
" model = TTSModel.from_path(\n",
" os.path.join(model_dir, \"deberta.onnx\"),\n",
" os.path.join(model_dir, \"tokenizer.json\")\n",
" )\n",
" print(\"モデルの読み込みが完了しました!\")\n",
" except Exception as e:\n",
" print(f\"モデルの読み込みに失敗しました: {e}\")\n",
" return\n",
"\n",
" # 話者名を取得(.sbv2 ファイル名の拡張子を除いた部分)\n",
" speaker_name = os.path.splitext(os.path.basename(sbv2_model_path))[0]\n",
" \n",
" # 指定されたモデルのパスを使用\n",
" try:\n",
" model.load_sbv2file_from_path(speaker_name, sbv2_model_path)\n",
" print(f\"話者 '{speaker_name}' のセットアップが完了しました!\")\n",
" except Exception as e:\n",
" print(f\"SBV2ファイルの読み込みに失敗しました: {e}\")\n",
" return\n",
"\n",
" # 音声合成を繰り返し実行\n",
" while True:\n",
" # 合成したいテキストをユーザーから入力\n",
" user_input = input(\"\\n音声合成したいテキストを入力してください終了するには 'exit' と入力): \")\n",
" \n",
" if user_input.strip().lower() == 'exit':\n",
" print(\"音声合成を終了します。\")\n",
" break\n",
"\n",
" # 出力ファイル名\n",
" output_file = \"output.wav\"\n",
"\n",
" # 音声合成を実行\n",
" try:\n",
" print(\"\\n音声合成を開始します...\")\n",
" start_time = time.time()\n",
"\n",
" audio_data = model.synthesize(user_input, speaker_name, 0, 0.0, 1)\n",
"\n",
" with open(output_file, \"wb\") as f:\n",
" f.write(audio_data)\n",
"\n",
" end_time = time.time()\n",
" elapsed_time = end_time - start_time\n",
"\n",
" print(f\"\\n音声が '{output_file}' に保存されました。\")\n",
" print(f\"音声合成にかかった時間: {elapsed_time:.2f} 秒\")\n",
" except Exception as e:\n",
" print(f\"音声合成に失敗しました: {e}\")\n",
"\n",
"if __name__ == \"__main__\":\n",
" main()"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.x"
}
},
"nbformat": 4,
"nbformat_minor": 4
}

scripts/sbv2-test-api.py Normal file

@@ -0,0 +1,8 @@
import requests
res = requests.post(
"http://localhost:3000/synthesize",
json={"text": "おはようございます", "ident": "tsukuyomi"},
)
with open("output.wav", "wb") as f:
f.write(res.content)


@@ -3,16 +3,16 @@ from sbv2_bindings import TTSModel
 def main():
 print("Loading models...")
-model = TTSModel.from_path("../models/debert.onnx", "../models/tokenizer.json")
+model = TTSModel.from_path("./models/debert.onnx", "./models/tokenizer.json")
 print("Models loaded!")
-model.load_sbv2file_from_path("amitaro", "../models/amitaro.sbv2")
+model.load_sbv2file_from_path("amitaro", "./models/amitaro.sbv2")
 print("All setup is done!")
-style_vector = model.get_style_vector("amitaro", 0, 1.0)
 with open("output.wav", "wb") as f:
 f.write(
-model.synthesize("おはようございます。", "amitaro", style_vector, 0.0, 0.5)
+model.synthesize("おはようございます。", "amitaro", 0, 0, 0.0, 0.5)
 )

test.py

@@ -1,8 +1,19 @@
 import requests
-res = requests.post(
-"http://localhost:3000/synthesize",
-json={"text": "おはようございます", "ident": "tsukuyomi"},
-)
-with open("output.wav", "wb") as f:
-f.write(res.content)
+data = (requests.get("http://localhost:8080/audio_query", params={
+"text": "こんにちは、今日はいい天気ですね。",
+})).json()
+print(data)
+data = (requests.post("http://localhost:8080/synthesis", json={
+"text": data["text"],
+"ident": "tsukuyomi",
+"speaker_id": 0,
+"style_id": 0,
+"sdp_ratio": 0.5,
+"length_scale": 0.5,
+"audio_query": data["audio_query"],
+})).content
+with open("test.wav", "wb") as f:
+f.write(data)