Provider specifics by technology
Below is an overview of the providers available for each technology and of the specifics that affect how you use the widgets.
Each provider has its own characteristics. Which one to use depends on your technical requirements and on your hardware and operating system.
The three technology providers available in VDK are:
Cerence
Sensory
Vivoka
Provider specifics for Audio Front-End
Provider specifics for Free-Speech
Provider | Cerence |
---|---|
Language count | 24 |
Free-speech customization | Limited |
Language list | |
SDK name | CSDK |
Resource size | 250 to 300 MB |
SDK code size | ~65 MB |
Platform supported | WINDOWS - X86_64; LINUX - X86_64, ARMV7HF, ARMV8; ANDROID 6.0 (API 23) |
Hardware supported | MPU |
Provider specifics for Grammar Editor
Provider | Cerence | Sensory | Vivoka |
---|---|---|---|
Language count | 41 | 7 | 2 |
Phonetic supported | Yes | Yes | Yes |
Dynamic data supported | Yes | Yes | Yes |
Language list | | | |
SDK name | CSDK | TNL | VASR |
SDK version | 4 | 6.17.0 | 1.0.0 |
Model + resource size | ~15 MB | ~6 MB | ~100 MB |
SDK code size | ~65 MB | ~10 MB | ~200 MB |
Platform supported | WINDOWS - X86_64; LINUX - X86_64, ARMV7HF, ARMV8; ANDROID 6.0 (API 23) | WINDOWS - X86_64; LINUX - X86_64, ARMV7HF, ARMV8; ANDROID 6.0 (API 23) | WINDOWS - X86_64; LINUX - X86_64, ARMV8 |
Hardware supported | MPU | MPU | MPU |
Provider specifics for Voice Biometrics
Feature | Sensory | ID R&D |
---|---|---|
Authentication from file | Yes | Yes |
Authentication from streaming (microphone) | Yes | Yes |
Identification from file | Yes | Yes |
Identification from streaming (microphone) | Yes | No |
Text dependent | Yes | Yes |
Text independent | Yes | Yes |
SDK name | TSSV | IdVoice |
Resource size | < 1 MB | ~230 MB |
Voice template size | ~50 kB / user | ~5 kB / user |
Platform supported | WINDOWS - X86_64; LINUX - X86_64, ARMV7HF, ARMV8; ANDROID 6.0 (API 23) | WINDOWS - X86_64; LINUX - X86_64, ARMV8; ANDROID 6.0 (API 23) |
Hardware supported | MPU | MPU |
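As a rough worked example of the trade-off between resource size and voice template size in the table above, the sketch below estimates the storage each provider would need for a given number of enrolled users. It is only an illustration: the function and dictionary names are hypothetical, "< 1 MB" is treated as an upper bound of 1 MB, decimal units (1 MB = 1000 kB) are assumed, and the "~" figures are taken at face value.

```python
# Figures copied from the Voice Biometrics table; all values are approximate.
# Assumption: "< 1 MB" is treated as an upper bound of 1 MB.
PROVIDERS = {
    "Sensory (TSSV)": {"resource_mb": 1.0, "template_kb": 50.0},
    "ID R&D (IdVoice)": {"resource_mb": 230.0, "template_kb": 5.0},
}


def footprint_mb(provider, users):
    """Approximate storage in MB: fixed resources plus one template per user."""
    spec = PROVIDERS[provider]
    # Assumption: decimal units, 1 MB = 1000 kB.
    return spec["resource_mb"] + users * spec["template_kb"] / 1000.0


if __name__ == "__main__":
    for name in PROVIDERS:
        print(name, footprint_mb(name, 1000), "MB for 1000 users")
```

Under these assumptions, 1000 enrolled users come to about 51 MB with Sensory and about 235 MB with ID R&D. The larger per-user template only outweighs the smaller fixed resources at around 5,100 users.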
Provider specifics for Voice Synthesis
Provider | Cerence | Voxygen | Readspeaker |
---|---|---|---|
Language count | 67 | 8 | 30 |
Voice count | | | |
Creation of custom voice possible [1] | Yes (in studio) | Yes (in studio) | Yes (in studio) |
Emotion simulation [2] | Yes (on request) | No | No |
Emotion presets [3] | No | Yes | No |
Voice quality choice | Not all voices are available in every quality. | | Not all voices are available in every quality. Not all qualities are available for every OS. |
Language list | | | |
SDK name | CSDK | Baratinoo | VTAPI |
Voice size | 800 KiB ↦ 30 MiB; 3.4 MiB ↦ 114 MiB; 27 MiB ↦ 316 MiB; 40 MiB ↦ 527 MiB; 191 MiB ↦ 266 MiB | 50 MiB ↦ 300 MiB | 4 MiB ↦ 36 MB; 126 MiB ↦ 450 MiB |
SDK code size | ~50 MiB | ~25 MiB | ~5 MiB |
Platform supported | WINDOWS - X86_64; LINUX - X86_64, ARMV7HF, ARMV8; ANDROID 6.0 (API 23) | WINDOWS - X86_64; LINUX - X86_64, ARMV7HF, ARMV8; ANDROID 6.0 (API 23) | WINDOWS - X86_64; LINUX - X86_64, ARMV7HF, ARMV8; ANDROID 6.0 (API 23) |
Hardware supported | MPU | MPU | MPU |
1. Creation of custom voice possible: You can modify the voice using SSML markup that changes the pitch, the rate, the timbre, the volume, …
2. Emotion simulation: You can change the voice style using SSML markup, e.g. lively, neutral, formal, conversational, apologetic, didactic, … You can check this page for more details about the styles supported by each voice.
3. Emotion presets: You can play a recorded emotion audio clip by writing the name of the recording in the text you synthesize. You can check Voxygen voices features for more details about the emotion presets supported by each voice.
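To illustrate the kind of SSML markup mentioned in the notes above, here is a minimal sketch that builds a fragment using the standard W3C SSML `<prosody>` element with its `pitch`, `rate` and `volume` attributes. The helper name `prosody_ssml` is hypothetical, and whether a given provider's engine honors each attribute (or needs extra attributes on the `<speak>` root) is an assumption to check against that provider's documentation. Timbre and voice styles are not covered here because they rely on provider-specific markup.

```python
from xml.sax.saxutils import escape, quoteattr


def prosody_ssml(text, pitch=None, rate=None, volume=None):
    """Wrap text in a W3C SSML <prosody> element inside a <speak> root."""
    attrs = {"pitch": pitch, "rate": rate, "volume": volume}
    # Keep only the attributes the caller actually set.
    attr_text = "".join(
        f" {name}={quoteattr(value)}"
        for name, value in attrs.items()
        if value is not None
    )
    # Escape &, < and > so the input text cannot break the markup.
    return f"<speak><prosody{attr_text}>{escape(text)}</prosody></speak>"


if __name__ == "__main__":
    print(prosody_ssml("Hello & welcome", pitch="+10%", rate="slow", volume="loud"))
```

The resulting string would then be passed to the voice synthesis engine in place of plain text, assuming the selected voice accepts SSML input.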