SDK specifics
Below is an overview of the different SDKs and their specifics as they relate to the widgets you use.
Each SDK has its own characteristics. Which one suits you depends on your technical requirements and on your hardware and operating system.
Dictation
SDK | VSDK-CSDK |
---|---|
Language count | 24 |
Free-speech customization | Limited |
Language list | |
Resource size | 250 to 300 MB |
SDK code size | ~65 MB |
Platform supported | WINDOWS - X86_64; LINUX - X86_64, ARMV7HF, ARMV8; ANDROID 6.0 (API 23) |
Hardware supported | MPU |
Grammar-Based Recognition
SDK | VSDK-CSDK | VSDK-TNL | VSDK-VASR |
---|---|---|---|
Language count | 41 | 7 | 5 |
Dynamic data | Yes | Yes | Yes |
Phonetic | Yes | No | Yes |
Phonetic in dynamic data | Yes | No | Yes |
Tag annotation | Yes | No | Yes |
Intermediate results | Yes | Yes | Yes |
Voice activity detection | Yes | Yes | Yes |
Confidence score | Yes | No | Yes |
Event detection | Yes | Yes | Yes |
Language list | | | |
SDK version | 3 | 6.17.0 | 3.0.5 |
Model + resource size | ~15 MB | ~6 MB | ~10 MB |
SDK code size | ~65 MB | ~10 MB | ~50 MB |
Platform supported | WINDOWS - X86_64; LINUX - X86_64, ARMV7HF, ARMV8; ANDROID 6.0 (API 23) | WINDOWS - X86_64; LINUX - X86_64, ARMV7HF, ARMV8; ANDROID 6.0 (API 23) | WINDOWS - X86_64; LINUX - X86_64, ARMV7HF, ARMV8; ANDROID 7.0 (API 24) |
Hardware supported | MPU | MPU | MPU |
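
To pick an engine from the grammar recognition table above, it can help to encode the rows as data and filter on the features you need. The following is a minimal sketch: the dictionary keys and the `engines_with` helper are hypothetical names chosen for illustration, not part of any SDK, and the values are copied from the table.

```python
# Feature matrix transcribed from the grammar recognition table.
# Key names are illustrative only; they are not SDK identifiers.
GRAMMAR_ENGINES = {
    "VSDK-CSDK": {
        "languages": 41, "dynamic_data": True, "phonetic": True,
        "tag_annotation": True, "confidence_score": True,
    },
    "VSDK-TNL": {
        "languages": 7, "dynamic_data": True, "phonetic": False,
        "tag_annotation": False, "confidence_score": False,
    },
    "VSDK-VASR": {
        "languages": 5, "dynamic_data": True, "phonetic": True,
        "tag_annotation": True, "confidence_score": True,
    },
}


def engines_with(*features):
    """Return the engines that support every requested feature."""
    return [
        name
        for name, spec in GRAMMAR_ENGINES.items()
        if all(spec.get(feature) for feature in features)
    ]


# Engines offering both a confidence score and phonetic input.
print(engines_with("confidence_score", "phonetic"))
```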
Voice Biometrics
Feature | VSDK-TSSV | VSDK-IDVOICE |
---|---|---|
Authentication from file | Yes | Yes |
Authentication from streaming (microphone) | Yes | Yes |
Identification from file | Yes | Yes |
Identification from streaming (microphone) | Yes | No |
Text dependent | Yes | Yes |
Text independent | Yes | Yes |
Resource size | < 1 MB | ~230 MB |
Voice template size | ~50 kB / user | ~5 kB / user |
Platform supported | WINDOWS - X86_64; LINUX - X86_64, ARMV7HF, ARMV8; ANDROID 6.0 (API 23) | WINDOWS - X86_64; LINUX - X86_64, ARMV8; ANDROID 6.0 (API 23) |
Hardware supported | MPU | MPU |
Enrollment feature (per utterance) | VSDK-TSSV | VSDK-IDVOICE |
---|---|---|
Is accepted | Yes | Yes |
SNR level | Yes | Yes |
SNR is acceptable | Yes | Yes |
Contains speech | Yes | Yes |
Speech begin time | Yes | No |
Speech end time | Yes | No |
Speech duration | Yes | Yes |
Speech duration is acceptable | Yes | Yes |
Is peak clipped | Yes | No |
Is band limited | Yes | No |
Is consistent (relative to previous ones, text dependent only) | Yes | No |
Is phrase verified (text dependent only) | Yes | No |
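
The resource and voice template sizes in the Voice Biometrics feature table trade off against each other: VSDK-TSSV has a small fixed resource but larger templates per user, while VSDK-IDVOICE is the opposite. The sketch below estimates the total storage for a given number of enrolled users. It assumes "< 1 MB" can be rounded up to 1 MB, that 1 MB = 1000 kB, and it ignores SDK code size, which the table does not list. The function names are hypothetical.

```python
# Approximate sizes from the Voice Biometrics table.
# "< 1 MB" is rounded up to 1 MB (an assumption, giving an upper bound).
BIOMETRIC_ENGINES = {
    "VSDK-TSSV": {"resource_mb": 1.0, "template_kb": 50.0},
    "VSDK-IDVOICE": {"resource_mb": 230.0, "template_kb": 5.0},
}


def footprint_mb(engine, users):
    """Fixed resource size plus one voice template per enrolled user, in MB."""
    spec = BIOMETRIC_ENGINES[engine]
    return spec["resource_mb"] + users * spec["template_kb"] / 1000.0


def break_even_users(engine_a, engine_b):
    """Number of users at which both engines need the same storage."""
    a, b = BIOMETRIC_ENGINES[engine_a], BIOMETRIC_ENGINES[engine_b]
    per_user_diff_mb = (a["template_kb"] - b["template_kb"]) / 1000.0
    return (b["resource_mb"] - a["resource_mb"]) / per_user_diff_mb


for name in BIOMETRIC_ENGINES:
    print(name, footprint_mb(name, 1000), "MB for 1000 users")
print("break-even at about",
      round(break_even_users("VSDK-TSSV", "VSDK-IDVOICE")), "users")
```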
Voice Synthesis
SDK | VSDK-CSDK | VSDK-BARATINOO | VSDK-VTAPI |
---|---|---|---|
Language count | 65 | 8 | 30 |
Voice count | | | |
Creation of custom voice possible¹ | Yes (in studio) | Yes (in studio) | Yes (in studio) |
Emotion simulation² | Yes (on request) | No | No |
Emotion presets³ | No | Yes | No |
Voice quality choice | Not all voices are available in every quality. | | Not all voices are available in every quality. Not all qualities are available for every OS. |
Language list | | | |
Voice size | 800 KiB ↦ 30 MiB; 3.4 MiB ↦ 114 MiB; 27 MiB ↦ 316 MiB; 40 MiB ↦ 527 MiB; 191 MiB ↦ 266 MiB | 50 MiB ↦ 300 MiB | 4 MiB ↦ 36 MB; 126 MiB ↦ 450 MiB |
SDK code size | ~50 MiB | ~25 MiB | ~5 MiB |
Platform supported | WINDOWS - X86_64; LINUX - X86_64, ARMV7HF, ARMV8; ANDROID 6.0 (API 23) | WINDOWS - X86_64; LINUX - X86_64, ARMV7HF, ARMV8; ANDROID 6.0 (API 23) | WINDOWS - X86_64; LINUX - X86_64, ARMV7HF, ARMV8; ANDROID 6.0 (API 23) |
Hardware supported | MPU | MPU | MPU |
1. Creation of custom voice possible: You can modify the voice using SSML markup that changes the pitch, rate, timbre, volume, …
2. Emotion simulation: You can change the voice style using SSML markup, e.g. lively, neutral, formal, conversational, apologetic, didactic, … You can check this page for more details about the supported styles of each voice.
3. Emotion presets: You can play recorded emotion audio by writing the name of the recording in the text you send for synthesis. You can check vsdk-baratinoo voices features for more details about the emotion presets supported by each voice.
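
The pitch, rate and volume adjustments mentioned in the first note can be illustrated with the standard W3C SSML `<prosody>` element. The sketch below only builds the markup as a string. Whether a given engine accepts exactly these attributes and values is an assumption to verify against its own documentation. Timbre and the style names from the second note are left out because they have no standard SSML attribute. `build_prosody_ssml` is a hypothetical helper, not an SDK function.

```python
import xml.etree.ElementTree as ET


def build_prosody_ssml(text, pitch="medium", rate="medium",
                       volume="medium", lang="en-US"):
    """Wrap text in a W3C SSML <prosody> element and return the markup."""
    speak = ET.Element("speak", {"version": "1.0", "xml:lang": lang})
    prosody = ET.SubElement(
        speak, "prosody", {"pitch": pitch, "rate": rate, "volume": volume}
    )
    # ElementTree escapes characters such as "<" and "&" on serialization.
    prosody.text = text
    return ET.tostring(speak, encoding="unicode")


# Higher pitch, slower rate, louder volume (standard SSML labels).
print(build_prosody_ssml("Welcome aboard.", pitch="high", rate="slow",
                         volume="loud"))
```

The resulting string would then be passed to the synthesis engine in place of plain text, assuming the engine is configured to interpret SSML input.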