SDK specifics | vivoka

Below is an overview of the available SDKs and what each one offers for the widgets you use.
Each SDK has its own characteristics. Which one suits you depends on the technical features you need, and on your hardware and operating system.

Dictation

SDK	VSDK-CSDK
Language count	24
Free-speech customization	Limited
Language list	`ces-CZ` `cmn-CN` `cmn-TW` `deu-DE` `eng-CN` `eng-GB` `eng-IN` `eng-US` `fra-FR` `hin-IN` `hun-HU` `ita-IT` `jpn-JP` `kor-KR` `nld-NL` `por-BR` `por-PT` `rus-RU` `spa-ES` `spa-MX` `tha-TH` `yue-CN` `yue-HK` `zho-CN-SC`
Resource size	250 to 300 MB
SDK code size	~65 MB
Platform supported	Windows (x86_64); Linux (x86_64, armv7hf, armv8); Android 6.0 (API 23)
Hardware supported	MPU

Grammar-Based Recognition

SDK	VSDK-CSDK	VSDK-TNL	VSDK-VASR
Language count	41	7	5
Dynamic data	Yes	Yes	Yes
Phonetic	Yes	No	Yes
Phonetic in dynamic data	Yes	No	Yes
Tag annotation	Yes	No	Yes
Intermediate results	Yes	Yes	Yes
Voice activity detection	Yes	Yes	Yes
Confidence score	Yes	No	Yes
Event detection	Yes	Yes	Yes
Language list	`afb-APG` `bul-BG` `ces-CZ` `cmn-CN` `cmn-TW` `dan-DK` `deu-DE` `ell-GR` `eng-AU` `eng-CN` `eng-GB` `eng-IN` `eng-US` `fas-APG` `fin-FI` `fra-CA` `fra-FR` `heb-IL` `hin-IN` `hun-HU` `ind-ID` `ita-IT` `jpn-JP` `kor-KR` `msa-MY` `nld-NL` `nor-NO` `pol-PL` `por-BR` `por-PT` `rus-RU` `slk-SK` `spa-ES` `spa-MX` `swe-SE` `tha-TH` `tur-TR` `yue-CN` `yue-HK` `zho-CN-SC` `zho-CN-SH`	`cmn-CN` `deu-DE` `eng-GB` `eng-US` `fra-FR` `jpn-JP` `spa-ES`	`deu-DE` `eng-US` `fra-FR` `ita-IT` `spa-ES`
SDK version	3	6.17.0	3.0.5
Model + resource size	~15 MB	~6 MB	~10 MB
SDK code size	~65 MB	~10 MB	~50 MB
Platform supported	Windows (x86_64); Linux (x86_64, armv7hf, armv8); Android 6.0 (API 23)	Windows (x86_64); Linux (x86_64, armv7hf, armv8); Android 6.0 (API 23)	Windows (x86_64); Linux (x86_64, armv7hf, armv8); Android 7.0 (API 24)
Hardware supported	MPU	MPU	MPU
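
As an illustration of how the table above can drive a choice, here is a minimal sketch that filters the three grammar SDKs by language and required features, then orders them by approximate footprint (model + resources + SDK code). The `GrammarSdk` class and the `candidates` function are hypothetical helpers written for this example; they are not part of any Vivoka API, and the values are simply copied from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GrammarSdk:
    name: str
    languages: frozenset
    phonetic: bool
    tag_annotation: bool
    confidence_score: bool
    footprint_mb: int  # model + resource size plus SDK code size, approximate


# Values copied from the comparison table; sizes are the "~" figures summed.
SDKS = [
    GrammarSdk(
        "VSDK-CSDK",
        frozenset("""
            afb-APG bul-BG ces-CZ cmn-CN cmn-TW dan-DK deu-DE ell-GR eng-AU
            eng-CN eng-GB eng-IN eng-US fas-APG fin-FI fra-CA fra-FR heb-IL
            hin-IN hun-HU ind-ID ita-IT jpn-JP kor-KR msa-MY nld-NL nor-NO
            pol-PL por-BR por-PT rus-RU slk-SK spa-ES spa-MX swe-SE tha-TH
            tur-TR yue-CN yue-HK zho-CN-SC zho-CN-SH
        """.split()),
        True, True, True, 15 + 65,
    ),
    GrammarSdk(
        "VSDK-TNL",
        frozenset("cmn-CN deu-DE eng-GB eng-US fra-FR jpn-JP spa-ES".split()),
        False, False, False, 6 + 10,
    ),
    GrammarSdk(
        "VSDK-VASR",
        frozenset("deu-DE eng-US fra-FR ita-IT spa-ES".split()),
        True, True, True, 10 + 50,
    ),
]


def candidates(language, *, phonetic=False, tag_annotation=False,
               confidence_score=False):
    """Return SDKs that support `language` and every requested feature,
    smallest approximate footprint first."""
    matches = [
        sdk for sdk in SDKS
        if language in sdk.languages
        and (sdk.phonetic or not phonetic)
        and (sdk.tag_annotation or not tag_annotation)
        and (sdk.confidence_score or not confidence_score)
    ]
    return sorted(matches, key=lambda sdk: sdk.footprint_mb)


if __name__ == "__main__":
    for sdk in candidates("jpn-JP", confidence_score=True):
        print(sdk.name, f"~{sdk.footprint_mb} MB")
```

Features that all three SDKs share (dynamic data, intermediate results, voice activity detection, event detection) are left out of the filter because they do not discriminate between candidates.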

Voice Biometrics

Feature	VSDK-TSSV	VSDK-IDVOICE
Authentication from file	Yes	Yes
Authentication from streaming (microphone)	Yes	Yes
Identification from file	Yes	Yes
Identification from streaming (microphone)	Yes	No
Text dependent	Yes	Yes
Text independent	Yes	Yes
Resource size	< 1 MB	~230 MB
Voice template size	~50 kB / user	~5 kB / user
Platform supported	Windows (x86_64); Linux (x86_64, armv7hf, armv8); Android 6.0 (API 23)	Windows (x86_64); Linux (x86_64, armv8); Android 6.0 (API 23)
Hardware supported	MPU	MPU
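
The resource and template sizes above pull in opposite directions: VSDK-TSSV ships small resources but larger per-user templates, while VSDK-IDVOICE is the reverse. The sketch below works through that arithmetic. It assumes 1 MB = 1000 kB, takes the "< 1 MB" figure as a 1 MB upper bound, and ignores SDK code size, which the table does not list for these two SDKs. The function names are hypothetical.

```python
# Figures from the table above. TSSV resources are listed as "< 1 MB",
# so 1.0 is used here as an upper bound (an assumption of this sketch).
PROFILES = {
    "VSDK-TSSV": {"resource_mb": 1.0, "template_kb": 50.0},
    "VSDK-IDVOICE": {"resource_mb": 230.0, "template_kb": 5.0},
}


def storage_mb(sdk, users):
    """Approximate storage in MB: fixed resources plus one template per user."""
    profile = PROFILES[sdk]
    return profile["resource_mb"] + users * profile["template_kb"] / 1000.0


def break_even_users():
    """Number of enrolled users at which both SDKs need the same storage."""
    tssv = PROFILES["VSDK-TSSV"]
    idvoice = PROFILES["VSDK-IDVOICE"]
    fixed_gap_kb = (idvoice["resource_mb"] - tssv["resource_mb"]) * 1000.0
    per_user_gap_kb = tssv["template_kb"] - idvoice["template_kb"]
    return fixed_gap_kb / per_user_gap_kb


if __name__ == "__main__":
    for sdk in PROFILES:
        print(sdk, storage_mb(sdk, 1000), "MB for 1000 users")
    print("break-even at about", round(break_even_users()), "users")
```

With these approximate figures, VSDK-TSSV needs less storage until the user base reaches roughly five thousand enrolled speakers. Storage is only one criterion, and the feature rows in both biometrics tables still apply.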

Enrollment feature (per utterance)	VSDK-TSSV	VSDK-IDVOICE
Is accepted	Yes	Yes
SNR level	Yes	Yes
SNR is acceptable	Yes	Yes
Contains speech	Yes	Yes
Speech begin time	Yes	No
Speech end time	Yes	No
Speech duration	Yes	Yes
Speech duration is acceptable	Yes	Yes
Is peak clipped	Yes	No
Is band limited	Yes	No
Is consistent (relative to previous ones, text dependent only)	Yes	No
Is phrase verified (text dependent only)	Yes	No

Voice Synthesis

SDK	VSDK-CSDK	VSDK-BARATINOO	VSDK-VTAPI
Language count	65	8	30
Voice count	181	15	85
Creation of custom voice possible¹	Yes (in studio)	Yes (in studio)	Yes (in studio)
Emotion simulation²	Yes (on request)	No	No
Emotion presets³	No	Yes	No
Voice quality choice	`embedded-compact`, `embedded-pro`, `embedded-high`, `embedded-premium`, `premium-high` (not all voices are available in every quality)	`default`	`D22`, `P22` (not all voices are available in every quality; not all qualities are available for every OS)
Language list	`afb-APG`, `arb-001`, `ben-IN`, `bho-IN-JH`, `bul-BG`, `cat-ES`, `cat-ES-VC`, `ces-CZ`, `cmn-CN`, `cmn-CND`, `cmn-TW`, `dan-DK`, `deu-DE`, `ell-GR`, `eng-AU`, `eng-GB`, `eng-GB-SCT`, `eng-IE`, `eng-IN`, `eng-US`, `eng-ZA`, `eus-ES`, `fas-APG`, `fin-FI`, `fra-BE`, `fra-CA`, `fra-FR`, `glg-ES-GA`, `heb-IL`, `hin-IN`, `hrv-HR`, `hun-HU`, `ind-ID`, `ita-IT`, `jpn-JP`, `kan-IN-KA`, `kor-KR`, `mar-IN`, `msa-MY`, `nld-BE`, `nld-NL`, `nor-NO`, `pol-PL`, `por-BR`, `por-PT`, `ron-RO`, `rus-RU`, `slk-SK`, `slv-SL`, `spa-AR`, `spa-CL`, `spa-CO`, `spa-ES`, `spa-MX`, `swe-SE`, `tam-IN-TN`, `tel-IN`, `tha-TH`, `tur-TR`, `ukr-UA`, `vie-VN`, `yue-HK`, `zho-CN-SC`, `zho-CN-SH`, `zho-CN-SN`	`arb-MA`, `deu-DE`, `eng-GB`, `eng-US`, `fra-FR`, `ita-IT`, `nld-NL`, `spa-ES`	`arb-001`, `ces-CZ`, `cmn-CN`, `cmn-TW`, `deu-DE`, `eng-AU`, `eng-GB`, `eng-IN`, `eng-US`, `fra-CA`, `fra-FR`, `hin-IN`, `hun-HU`, `ind-ID`, `ita-IT`, `jpn-JP`, `kor-KR`, `nor-NO`, `pol-PL`, `por-BR`, `por-PT`, `ron-RO`, `rus-RU`, `slk-SK`, `spa-AR`, `spa-ES`, `spa-MX`, `swe-SE`, `tha-TH`, `yue-CN`
Voice size	`embedded-compact`: 800 KiB to 30 MiB; `embedded-pro`: 3.4 MiB to 114 MiB; `embedded-high`: 27 MiB to 316 MiB; `embedded-premium`: 40 MiB to 527 MiB; `premium-high`: 191 MiB to 266 MiB	`default`: 50 MiB to 300 MiB	`D22`: 4 MiB to 36 MB; `P22`: 126 MiB to 450 MiB
SDK code size	~50 MiB	~25 MiB	~5 MiB
Platform supported	Windows (x86_64); Linux (x86_64, armv7hf, armv8); Android 6.0 (API 23)	Windows (x86_64); Linux (x86_64, armv7hf, armv8); Android 6.0 (API 23)	Windows (x86_64); Linux (x86_64, armv7hf, armv8); Android 6.0 (API 23)
Hardware supported	MPU	MPU	MPU

¹ Creation of custom voice possible: You can modify the voice using SSML markup, which changes the pitch, rate, timbre, volume, and so on.
² Emotion simulation: You can change the voice style using SSML markup, e.g. lively, neutral, formal, conversational, apologetic, or didactic. You can check this page for more details about the supported styles of each voice.
³ Emotion presets: You can play recorded emotion audio by writing the name of the recording in the text you send for synthesis. You can check vsdk-baratinoo voices features for more details about the emotion presets supported by each voice.
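
To make the first note concrete, the sketch below builds a small SSML fragment that adjusts pitch, rate, and volume with the standard W3C `<prosody>` element. The helper name `prosody_ssml` is hypothetical. Whether a given engine or voice honours each attribute, or requires extra attributes on `<speak>` such as `version` and `xml:lang`, is an assumption to check against that engine's documentation. Timbre and voice styles are not covered by `<prosody>` and rely on engine-specific markup, so they are not shown.

```python
import xml.etree.ElementTree as ET


def prosody_ssml(text, pitch=None, rate=None, volume=None):
    """Wrap `text` in <speak><prosody ...> using only the attributes given.

    Building the tree with ElementTree (rather than string formatting)
    ensures that characters such as '&' or '<' in `text` are escaped.
    """
    attributes = {
        name: value
        for name, value in (("pitch", pitch), ("rate", rate), ("volume", volume))
        if value is not None
    }
    speak = ET.Element("speak")
    prosody = ET.SubElement(speak, "prosody", attributes)
    prosody.text = text
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    # "high", "slow" and "soft" are label values defined by the SSML standard.
    print(prosody_ssml("Hello & welcome", pitch="high", rate="slow", volume="soft"))
```

The resulting string can then be passed wherever the chosen SDK accepts SSML input instead of plain text.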