
Provider specifics by technology

This page gives an overview of the available providers and how they differ for each widget you use.

Each provider has its own characteristics. Which one suits you depends on the technical features you need, and on your computer and operating system.

The three technology providers available on VDK are:

  • Cerence

  • Sensory

  • Vivoka

Provider specifics for Audio Front-End

Provider specifics for Free-Speech

Provider | Cerence
Language count | 24
Free-speech customization | Limited
Language list | ces-CZ, cmn-CN, cmn-TW, deu-DE, eng-CN, eng-GB, eng-IN, eng-US, fra-FR, hin-IN, hun-HU, ita-IT, jpn-JP, kor-KR, nld-NL, por-BR, por-PT, rus-RU, spa-ES, spa-MX, tha-TH, yue-CN, yue-HK, zho-CN-SC
SDK name | CSDK
Resource size | 250 to 300 MB
SDK code size | ~65 MB
Platform supported | WINDOWS: X86_64; LINUX: X86_64, ARMV7HF, ARMV8; ANDROID 6.0 (API 23)
Hardware supported | MPU

Provider specifics for Grammar Editor

Provider | Cerence | Sensory | Vivoka
Language count | 41 | 7 | 2
Phonetic supported | Yes | Yes | Yes
Dynamic data supported | Yes | Yes | Yes
SDK name | CSDK | TNL | VASR
SDK version | 4 | 6.17.0 | 1.0.0
Model + resource size | ~15 MB | ~6 MB | ~100 MB
SDK code size | ~65 MB | ~10 MB | ~200 MB
Platform supported | WINDOWS: X86_64; LINUX: X86_64, ARMV7HF, ARMV8; ANDROID 6.0 (API 23) | WINDOWS: X86_64; LINUX: X86_64, ARMV7HF, ARMV8; ANDROID 6.0 (API 23) | WINDOWS: X86_64; LINUX: X86_64, ARMV8
Hardware supported | MPU | MPU | MPU

Language list (Cerence): afb-APG, bul-BG, ces-CZ, cmn-CN, cmn-TW, dan-DK, deu-DE, ell-GR, eng-AU, eng-CN, eng-GB, eng-IN, eng-US, fas-APG, fin-FI, fra-CA, fra-FR, heb-IL, hin-IN, hun-HU, ind-ID, ita-IT, jpn-JP, kor-KR, msa-MY, nld-NL, nor-NO, pol-PL, por-BR, por-PT, rus-RU, slk-SK, spa-ES, spa-MX, swe-SE, tha-TH, tur-TR, yue-CN, yue-HK, zho-CN-SC, zho-CN-SH

Language list (Sensory): cmn-CN, deu-DE, eng-GB, eng-US, fra-FR, jpn-JP, spa-ES

Language list (Vivoka): eng-US, fra-FR
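As a rough illustration of how the Grammar Editor sizes combine, the sketch below adds the approximate model and resource size to the SDK code size for each provider. It assumes the two figures simply add up to an on-disk footprint, which this page does not state; the dictionary and function names are made up for the example.

```python
# Approximate sizes in MB, copied from the Grammar Editor figures on this page.
GRAMMAR_EDITOR_SIZES_MB = {
    "Cerence": {"model_and_resources": 15, "sdk_code": 65},
    "Sensory": {"model_and_resources": 6, "sdk_code": 10},
    "Vivoka": {"model_and_resources": 100, "sdk_code": 200},
}


def approx_footprint_mb(provider: str) -> int:
    """Sum the two approximate sizes (assumption: they are additive)."""
    sizes = GRAMMAR_EDITOR_SIZES_MB[provider]
    return sizes["model_and_resources"] + sizes["sdk_code"]


# List providers from the smallest to the largest estimated footprint.
for name in sorted(GRAMMAR_EDITOR_SIZES_MB, key=approx_footprint_mb):
    print(f"{name}: ~{approx_footprint_mb(name)} MB")
```

The figures are the approximate values given on this page, so the totals are only order-of-magnitude estimates and say nothing about runtime memory.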

Provider specifics for Voice Biometrics

Feature | Sensory | ID R&D
Authentication from file | Yes | Yes
Authentication from streaming (microphone) | Yes | Yes
Identification from file | Yes | Yes
Identification from streaming (microphone) | Yes | No
Text dependent | Yes | Yes
Text independent | Yes | Yes
SDK name | TSSV | IdVoice
Resource size | < 1 MB | ~230 MB
Voice template size | ~50 kB / user | ~5 kB / user
Platform supported | WINDOWS: X86_64; LINUX: X86_64, ARMV7HF, ARMV8; ANDROID 6.0 (API 23) | WINDOWS: X86_64; LINUX: X86_64, ARMV8; ANDROID 6.0 (API 23)
Hardware supported | MPU | MPU
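The resource size and the per-user voice template size pull in opposite directions, so the smaller option depends on how many users are enrolled. The sketch below estimates total storage for each provider under stated assumptions: "< 1 MB" is treated as an upper bound of 1 MB, 1 MB is taken as 1000 kB, and SDK code size is ignored. All names are hypothetical.

```python
# Approximate Voice Biometrics figures from this page, in kB (1 MB taken as 1000 kB).
SENSORY = {"resources_kb": 1_000, "template_kb": 50}  # "< 1 MB" used as an upper bound
ID_RD = {"resources_kb": 230_000, "template_kb": 5}


def storage_kb(provider: dict, users: int) -> int:
    """Fixed resources plus one voice template per enrolled user."""
    return provider["resources_kb"] + provider["template_kb"] * users


def break_even_users() -> int:
    """Smallest user count at which the Sensory total exceeds the ID R&D total."""
    users = 0
    while storage_kb(SENSORY, users) <= storage_kb(ID_RD, users):
        users += 1
    return users


for n in (100, 1_000, 10_000):
    print(n, storage_kb(SENSORY, n), storage_kb(ID_RD, n))
print("break-even:", break_even_users())
```

Under these assumptions the Sensory total stays below the ID R&D total until roughly five thousand enrolled users.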

Provider specifics for Voice Synthesis

Provider | Cerence | Voxygen | Readspeaker
Language count | 67 | 8 | 30
Voice count | 676 | 15 | 168
Creation of custom voice possible [1] | Yes (in studio) | Yes (in studio) | Yes (in studio)
Emotion simulation [2] | Yes (on request) | No | No
Emotion presets [3] | No | Yes | No
Voice quality choice | embedded-compact, embedded-pro, embedded-high, embedded-premium, premium-high. Not all voices are available in every quality. | default | D22, P22. Not all voices are available in every quality. Not all qualities are available for every OS.
SDK name | CSDK | Baratinoo | VTAPI
SDK code size | ~50 MiB | ~25 MiB | ~5 MiB
Platform supported | WINDOWS: X86_64; LINUX: X86_64, ARMV7HF, ARMV8; ANDROID 6.0 (API 23) | WINDOWS: X86_64; LINUX: X86_64, ARMV7HF, ARMV8; ANDROID 6.0 (API 23) | WINDOWS: X86_64; LINUX: X86_64, ARMV7HF, ARMV8; ANDROID 6.0 (API 23)
Hardware supported | MPU | MPU | MPU

Voice size (Cerence):

  • Embedded-compact: 800 KiB to 30 MiB

  • Embedded-pro: 3.4 MiB to 114 MiB

  • Embedded-high: 27 MiB to 316 MiB

  • Embedded-premium: 40 MiB to 527 MiB

  • Premium-high: 191 MiB to 266 MiB

Voice size (Voxygen):

  • default: 50 MiB to 300 MiB

Voice size (Readspeaker):

  • D22: 4 MiB to 36 MB

  • P22: 126 MiB to 450 MiB

Language list (Cerence): afb-APG, arb-001, arb-MA, ben-IN, bho-IN-JH, bul-BG, cat-ES, cat-ES-VC, ces-CZ, cmn-CN, cmn-CND, cmn-TW, dan-DK, deu-DE, ell-GR, eng-AU, eng-GB, eng-GB-SCT, eng-IE, eng-IN, eng-US, eng-ZA, eus-ES, fas-APG, fin-FI, fra-BE, fra-CA, fra-FR, glg-ES-GA, heb-IL, hin-IN, hrv-HR, hun-HU, ind-ID, ita-IT, jpn-JP, kan-IN-KA, kor-KR, mar-IN, msa-MY, nld-BE, nld-NL, nor-NO, pol-PL, por-BR, por-PT, ron-RO, rus-RU, slk-SK, slv-SL, spa-AR, spa-CL, spa-CO, spa-ES, spa-MX, swe-SE, tam-IN-TN, tel-IN, tha-TH, tur-TR, ukr-UA, vie-VN, yue-CN, yue-HK, zho-CN-SC, zho-CN-SH, zho-CN-SN

Language list (Voxygen): arb-MA, deu-DE, eng-GB, eng-US, fra-FR, ita-IT, nld-NL, spa-ES

Language list (Readspeaker): arb-001, ces-CZ, cmn-CN, cmn-TW, deu-DE, eng-AU, eng-GB, eng-IN, eng-US, fra-CA, fra-FR, hin-IN, hun-HU, ind-ID, ita-IT, jpn-JP, kor-KR, nor-NO, pol-PL, por-BR, por-PT, ron-RO, rus-RU, slk-SK, spa-AR, spa-ES, spa-MX, swe-SE, tha-TH, yue-CN


  1. Creation of custom voice possible: You can modify the voice using SSML markup that changes the pitch, rate, timbre, volume, and so on.

  2. Emotion simulation: You can change the voice style using SSML markup, e.g. lively, neutral, formal, conversational, apologetic, didactic. You can check this page for more details about the styles supported by each voice.

  3. Emotion presets: You can play recorded emotion audio by writing the name of the recording in the text you send to voice synthesis. You can check Voxygen voices features for more details about the emotion presets supported by each voice.
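A minimal sketch of the kind of markup the first note refers to, assuming the provider accepts the standard W3C SSML "prosody" element with "pitch", "rate" and "volume" attributes. This page does not confirm which tags each provider supports, timbre and voice styles are not covered here, and the helper name build_ssml is made up for the example.

```python
import xml.etree.ElementTree as ET


def build_ssml(text: str, pitch: str = "medium", rate: str = "medium",
               volume: str = "medium") -> str:
    """Wrap text in a <speak><prosody> pair (standard SSML; provider support is assumed)."""
    speak = ET.Element("speak", {"version": "1.1"})
    prosody = ET.SubElement(
        speak, "prosody", {"pitch": pitch, "rate": rate, "volume": volume}
    )
    # ElementTree escapes characters such as & and < when serializing the text.
    prosody.text = text
    return ET.tostring(speak, encoding="unicode")


# Slightly higher pitch, slower rate, louder volume.
print(build_ssml("Hello, how can I help you?", pitch="+10%", rate="slow", volume="loud"))
```

Building the markup with an XML library rather than string concatenation keeps user-supplied text from breaking the document structure.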
