Provider specifics by technology

Below is an overview of the available providers and their specifics for each widget you may use.

Each provider has its own specifics. Which one to use depends on the technical characteristics you are looking for, as well as on the characteristics of your computer and your operating system.

The three technology providers available on VDK are:

Cerence
Sensory
Vivoka

Provider specifics for Free-Speech

Provider	Cerence
Language count	24
Free-speech customization	Limited
Language list	`ces-CZ` `cmn-CN` `cmn-TW` `deu-DE` `eng-CN` `eng-GB` `eng-IN` `eng-US` `fra-FR` `hin-IN` `hun-HU` `ita-IT` `jpn-JP` `kor-KR` `nld-NL` `por-BR` `por-PT` `rus-RU` `spa-ES` `spa-MX` `tha-TH` `yue-CN` `yue-HK` `zho-CN-SC`
SDK name	CSDK
Resource size	Between 250 and 300 MB
SDK code size	~65 MB
Platform supported	Windows - x86_64; Linux - x86_64, armv7hf, armv8; Android 6.0 (API 23)
Hardware supported	MPU

Provider specifics for Grammar Editor

Provider	Cerence	Sensory	Vivoka
Language count	41	7	2
Phonetic supported	Yes	Yes	Yes
Dynamic data supported	Yes	Yes	Yes
Language list	`afb-APG` `bul-BG` `ces-CZ` `cmn-CN` `cmn-TW` `dan-DK` `deu-DE` `ell-GR` `eng-AU` `eng-CN` `eng-GB` `eng-IN` `eng-US` `fas-APG` `fin-FI` `fra-CA` `fra-FR` `heb-IL` `hin-IN` `hun-HU` `ind-ID` `ita-IT` `jpn-JP` `kor-KR` `msa-MY` `nld-NL` `nor-NO` `pol-PL` `por-BR` `por-PT` `rus-RU` `slk-SK` `spa-ES` `spa-MX` `swe-SE` `tha-TH` `tur-TR` `yue-CN` `yue-HK` `zho-CN-SC` `zho-CN-SH`	`cmn-CN` `deu-DE` `eng-GB` `eng-US` `fra-FR` `jpn-JP` `spa-ES`	`eng-US` `fra-FR`
SDK name	CSDK	TNL	VASR
SDK version	4	6.17.0	1.0.0
Model + resource size	~15 MB	~6 MB	~100 MB
SDK code size	~65 MB	~10 MB	~200 MB
Platform supported	Windows - x86_64; Linux - x86_64, armv7hf, armv8; Android 6.0 (API 23)	Windows - x86_64; Linux - x86_64, armv7hf, armv8; Android 6.0 (API 23)	Windows - x86_64; Linux - x86_64, armv8
Hardware supported	MPU	MPU	MPU
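
To narrow down a Grammar Editor provider for a given target, the language and platform rows above can be encoded as data and filtered. The following is only a minimal sketch: the data is transcribed from the table, while the platform identifiers (such as `linux-armv8`) and the function name are hypothetical labels chosen for this example and are not part of VDK.

```python
# Data transcribed from the Grammar Editor table above.
# The platform identifiers below are hypothetical labels for this sketch;
# they are not VDK names.
ALL_PLATFORMS = {
    "windows-x86_64", "linux-x86_64", "linux-armv7hf", "linux-armv8", "android-6.0",
}

GRAMMAR_PROVIDERS = {
    "Cerence": {
        "languages": set("""
            afb-APG bul-BG ces-CZ cmn-CN cmn-TW dan-DK deu-DE ell-GR eng-AU
            eng-CN eng-GB eng-IN eng-US fas-APG fin-FI fra-CA fra-FR heb-IL
            hin-IN hun-HU ind-ID ita-IT jpn-JP kor-KR msa-MY nld-NL nor-NO
            pol-PL por-BR por-PT rus-RU slk-SK spa-ES spa-MX swe-SE tha-TH
            tur-TR yue-CN yue-HK zho-CN-SC zho-CN-SH
        """.split()),
        "platforms": ALL_PLATFORMS,
    },
    "Sensory": {
        "languages": set("cmn-CN deu-DE eng-GB eng-US fra-FR jpn-JP spa-ES".split()),
        "platforms": ALL_PLATFORMS,
    },
    "Vivoka": {
        "languages": {"eng-US", "fra-FR"},
        "platforms": {"windows-x86_64", "linux-x86_64", "linux-armv8"},
    },
}


def grammar_providers_for(language, platform):
    """Return the providers whose table row lists both the language and the platform."""
    return sorted(
        name
        for name, info in GRAMMAR_PROVIDERS.items()
        if language in info["languages"] and platform in info["platforms"]
    )


print(grammar_providers_for("fra-FR", "linux-armv7hf"))  # ['Cerence', 'Sensory']
```

The filter only reflects what the table states; it says nothing about recognition quality or licensing, which you still need to weigh separately.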

Provider specifics for Voice Biometrics

Feature	Sensory	ID R&D
Authentication from file	Yes	Yes
Authentication from streaming (microphone)	Yes	Yes
Identification from file	Yes	Yes
Identification from streaming (microphone)	Yes	No
Text dependent	Yes	Yes
Text independent	Yes	Yes
SDK name	TSSV	IdVoice
Resource size	< 1 MB	~230 MB
Voice template size	~50 kB / user	~5 kB / user
Platform supported	Windows - x86_64; Linux - x86_64, armv7hf, armv8; Android 6.0 (API 23)	Windows - x86_64; Linux - x86_64, armv8; Android 6.0 (API 23)
Hardware supported	MPU	MPU
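
The resource and voice template sizes above pull in opposite directions: Sensory has a very small fixed footprint but larger templates, while ID R&D has a large fixed footprint but smaller templates. The rough estimate below illustrates where the two meet. It is a sketch under stated assumptions: the table values are approximate, "< 1 MB" is taken as 1 MB, units are treated as decimal (1 kB = 1000 bytes), and the names used are hypothetical.

```python
KB = 1_000        # assumption: decimal units
MB = 1_000_000

# Approximate figures from the Voice Biometrics table above.
# "< 1 MB" is taken as an upper bound of 1 MB for Sensory.
BIOMETRIC_FOOTPRINT = {
    "Sensory": {"resource_bytes": 1 * MB, "template_bytes": 50 * KB},
    "ID R&D": {"resource_bytes": 230 * MB, "template_bytes": 5 * KB},
}


def estimated_storage_bytes(provider, user_count):
    """Fixed resource size plus one voice template per enrolled user."""
    info = BIOMETRIC_FOOTPRINT[provider]
    return info["resource_bytes"] + user_count * info["template_bytes"]


def break_even_users():
    """User count at which both providers need the same estimated storage."""
    sensory = BIOMETRIC_FOOTPRINT["Sensory"]
    idrnd = BIOMETRIC_FOOTPRINT["ID R&D"]
    fixed_gap = idrnd["resource_bytes"] - sensory["resource_bytes"]
    per_user_gap = sensory["template_bytes"] - idrnd["template_bytes"]
    return fixed_gap / per_user_gap


for users in (100, 10_000):
    for provider in BIOMETRIC_FOOTPRINT:
        size_mb = estimated_storage_bytes(provider, users) / MB
        print(f"{provider}: {users} users -> about {size_mb:.1f} MB")
print(f"Break-even at roughly {break_even_users():.0f} users")
```

With these approximations, Sensory needs less storage up to roughly 5,000 enrolled users, after which the smaller ID R&D templates outweigh its larger resources. Storage is only one criterion; the feature rows above still apply.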

Provider specifics for Voice Synthesis

Provider	Cerence	Voxygen	ReadSpeaker
Language count	67	8	30
Voice count	676	15	168
Creation of custom voice possible¹	Yes (in studio)	Yes (in studio)	Yes (in studio)
Emotion simulation²	Yes (on request)	No	No
Emotion presets³	No	Yes	No
Voice quality choice	`embedded-compact`, `embedded-pro`, `embedded-high`, `embedded-premium`, `premium-high` (not all voices are available in every quality)	`default`	`D22`, `P22` (not all voices are available in every quality; not all qualities are available for every OS)
Language list	`afb-APG`, `arb-001`, `arb-MA`, `ben-IN`, `bho-IN-JH`, `bul-BG`, `cat-ES`, `cat-ES-VC`, `ces-CZ`, `cmn-CN`, `cmn-CND`, `cmn-TW`, `dan-DK`, `deu-DE`, `ell-GR`, `eng-AU`, `eng-GB`, `eng-GB-SCT`, `eng-IE`, `eng-IN`, `eng-US`, `eng-ZA`, `eus-ES`, `fas-APG`, `fin-FI`, `fra-BE`, `fra-CA`, `fra-FR`, `glg-ES-GA`, `heb-IL`, `hin-IN`, `hrv-HR`, `hun-HU`, `ind-ID`, `ita-IT`, `jpn-JP`, `kan-IN-KA`, `kor-KR`, `mar-IN`, `msa-MY`, `nld-BE`, `nld-NL`, `nor-NO`, `pol-PL`, `por-BR`, `por-PT`, `ron-RO`, `rus-RU`, `slk-SK`, `slv-SL`, `spa-AR`, `spa-CL`, `spa-CO`, `spa-ES`, `spa-MX`, `swe-SE`, `tam-IN-TN`, `tel-IN`, `tha-TH`, `tur-TR`, `ukr-UA`, `vie-VN`, `yue-CN`, `yue-HK`, `zho-CN-SC`, `zho-CN-SH`, `zho-CN-SN`	`arb-MA`, `deu-DE`, `eng-GB`, `eng-US`, `fra-FR`, `ita-IT`, `nld-NL`, `spa-ES`	`arb-001`, `ces-CZ`, `cmn-CN`, `cmn-TW`, `deu-DE`, `eng-AU`, `eng-GB`, `eng-IN`, `eng-US`, `fra-CA`, `fra-FR`, `hin-IN`, `hun-HU`, `ind-ID`, `ita-IT`, `jpn-JP`, `kor-KR`, `nor-NO`, `pol-PL`, `por-BR`, `por-PT`, `ron-RO`, `rus-RU`, `slk-SK`, `spa-AR`, `spa-ES`, `spa-MX`, `swe-SE`, `tha-TH`, `yue-CN`
SDK name	CSDK	Baratinoo	VTAPI
Voice size	`embedded-compact`: 800 KiB to 30 MiB; `embedded-pro`: 3.4 MiB to 114 MiB; `embedded-high`: 27 MiB to 316 MiB; `embedded-premium`: 40 MiB to 527 MiB; `premium-high`: 191 MiB to 266 MiB	`default`: 50 MiB to 300 MiB	`D22`: 4 MiB to 36 MB; `P22`: 126 MiB to 450 MiB
SDK code size	~50 MiB	~25 MiB	~5 MiB
Platform supported	Windows - x86_64; Linux - x86_64, armv7hf, armv8; Android 6.0 (API 23)	Windows - x86_64; Linux - x86_64, armv7hf, armv8; Android 6.0 (API 23)	Windows - x86_64; Linux - x86_64, armv7hf, armv8; Android 6.0 (API 23)
Hardware supported	MPU	MPU	MPU

¹ Creation of custom voice possible: You can modify the voice using SSML markup, which changes, among other things, the pitch, the rate, the timbre, and the volume.
² Emotion simulation: You can change the voice style using SSML markup, e.g. lively, neutral, formal, conversational, apologetic, or didactic. You can check this page for more details about the styles supported by each voice.
³ Emotion presets: You can play a recorded emotion sound by writing the name of the recording in the text you synthesize. You can check Voxygen voices features for more details about the emotion presets supported by each voice.
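
The markup-based voice adjustments mentioned in the notes above can be generated programmatically. The sketch below builds a small SSML string with the standard W3C `prosody` element and its `pitch`, `rate`, and `volume` attributes. It is illustrative only: the function name is hypothetical, timbre is left out because standard `prosody` has no attribute for it, and which elements and values each provider actually accepts is an assumption to verify against that provider's documentation.

```python
import xml.etree.ElementTree as ET


def build_prosody_ssml(text, pitch=None, rate=None, volume=None):
    """Wrap text in <speak><prosody ...> using only the attributes that were given."""
    speak = ET.Element("speak", {"version": "1.0"})
    attributes = {
        name: value
        for name, value in (("pitch", pitch), ("rate", rate), ("volume", volume))
        if value is not None
    }
    prosody = ET.SubElement(speak, "prosody", attributes)
    prosody.text = text  # special characters such as & are escaped on serialization
    return ET.tostring(speak, encoding="unicode")


ssml = build_prosody_ssml("Your order is ready.", pitch="+10%", rate="slow", volume="loud")
print(ssml)
# <speak version="1.0"><prosody pitch="+10%" rate="slow" volume="loud">Your order is ready.</prosody></speak>
```

Building the markup with an XML library rather than string concatenation keeps the result well-formed when the text contains characters such as `&` or `<`.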