
Speech Synthesis

Introduction

Speech synthesis (also known as text-to-speech or TTS) is the process of converting written text into spoken audio.

In VSDK, speech synthesis is powered by CSDK, which offers a wide range of voices across different languages, genders, and quality levels (see Voice quality availability).

Voice Format

A voice is identified by three comma-separated fields: <language>,<name>,<quality>.
For <language>, refer to the table and use the value from the Vsdk-csdk Code column.
For <name>, use the lowercase version of the name shown in VDK-Studio.
For <quality>, use the value shown in VDK-Studio under Resources → Voice.

Engine       Format                         Example
vsdk-csdk    <language>,<name>,<quality>    enu,evan,embedded-pro
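
As an illustration of the format above, the following sketch builds and validates a voice identifier string. The VoiceId class and parse_voice_id helper are hypothetical names introduced here and are not part of VSDK. The only rules applied are the ones stated above: three comma-separated fields and a lowercase voice name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceId:
    """Hypothetical helper type; not part of the VSDK API."""
    language: str  # value from the Vsdk-csdk Code column, e.g. "enu"
    name: str      # lowercase voice name as shown in VDK-Studio
    quality: str   # quality listed in VDK-Studio under Resources -> Voice

    def __str__(self) -> str:
        # Serialize back to the <language>,<name>,<quality> form.
        return f"{self.language},{self.name},{self.quality}"


def parse_voice_id(text: str) -> VoiceId:
    """Split '<language>,<name>,<quality>' and lowercase the name field."""
    parts = [part.strip() for part in text.split(",")]
    if len(parts) != 3 or not all(parts):
        raise ValueError(f"expected '<language>,<name>,<quality>', got {text!r}")
    language, name, quality = parts
    return VoiceId(language, name.lower(), quality)


if __name__ == "__main__":
    print(parse_voice_id("enu,Evan,embedded-pro"))
```

Whether a given combination actually exists still depends on the voices installed in your configuration, so a well-formed string is not a guarantee that the voice is available.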

SSML Support

VSDK also supports SSML (Speech Synthesis Markup Language), which gives you finer control over how the text is spoken. It allows adjustments such as:

  • Pronunciation

  • Pauses

  • Pitch

  • Rate

  • Emphasis

SSML is supported for embedded voices, but not for neural voices (if present in your configuration). Neural voices are more natural-sounding but behave as a black box and do not support markup-based control.
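
To make the adjustments listed above more concrete, here is a minimal sketch that assembles an SSML string with Python's standard library. The element names (speak, break, emphasis, prosody) come from the W3C SSML specification. Which elements and attribute values the CSDK engine honors is not stated here, so treat the markup as an assumption to verify against an embedded voice. The build_ssml function is a hypothetical helper, not a VSDK call.

```python
import xml.etree.ElementTree as ET


def build_ssml(lang: str = "en-US") -> str:
    """Return a small SSML document using a pause, emphasis, rate and pitch.

    Assumption: the target engine accepts W3C-style SSML without an
    explicit xmlns declaration. Add one if your engine requires it.
    """
    speak = ET.Element("speak", {"version": "1.0", "xml:lang": lang})
    speak.text = "Your order is ready."

    # Pause: a half-second break after the first sentence.
    pause = ET.SubElement(speak, "break", {"time": "500ms"})
    pause.tail = "Please collect it at "

    # Emphasis: stress a short phrase.
    emphasis = ET.SubElement(speak, "emphasis", {"level": "strong"})
    emphasis.text = "counter three"
    emphasis.tail = "."

    # Rate and pitch: slow down and lower the closing sentence.
    prosody = ET.SubElement(speak, "prosody", {"rate": "slow", "pitch": "low"})
    prosody.text = " Thank you."

    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml())
```

Building the markup through ElementTree rather than string concatenation means characters such as & and < in the spoken text are escaped automatically.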

Audio Format

The audio data is a 16-bit signed PCM buffer in Little-Endian format.
It is always mono (1 channel), and the sample rate depends on the engine being used.

Engine    Sample Rate (Hz)
csdk      22050
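
As a sketch of how a buffer in this format could be handled, the code below wraps raw PCM bytes in a WAV container and computes their duration, using only Python's standard wave module. The pcm_to_wav_bytes and duration_seconds helpers are hypothetical names, and how you obtain the buffer from VDK Service is outside the scope of this example. The parameters are the ones given above: 16-bit signed samples, little-endian, mono, 22050 Hz for csdk.

```python
import io
import wave

SAMPLE_RATE_HZ = 22050   # csdk engine
SAMPLE_WIDTH_BYTES = 2   # 16-bit signed PCM
CHANNELS = 1             # always mono


def pcm_to_wav_bytes(pcm: bytes) -> bytes:
    """Wrap a raw PCM buffer in a WAV header and return the file contents.

    Note: the wave module expects frames in native byte order, which
    matches the little-endian buffer on little-endian hosts (x86, most ARM).
    """
    if len(pcm) % SAMPLE_WIDTH_BYTES:
        raise ValueError("PCM buffer length must be a multiple of 2 bytes")
    out = io.BytesIO()
    with wave.open(out, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH_BYTES)
        wav.setframerate(SAMPLE_RATE_HZ)
        wav.writeframes(pcm)
    return out.getvalue()


def duration_seconds(pcm: bytes) -> float:
    """Duration = bytes / (bytes per sample * channels * samples per second)."""
    return len(pcm) / (SAMPLE_WIDTH_BYTES * CHANNELS * SAMPLE_RATE_HZ)


if __name__ == "__main__":
    silence = b"\x00\x00" * SAMPLE_RATE_HZ  # one second of silence
    with open("silence.wav", "wb") as f:
        f.write(pcm_to_wav_bytes(silence))
    print(duration_seconds(silence))
```

At these settings one second of audio occupies 22050 × 2 = 44100 bytes, which is a quick way to sanity-check the size of a received buffer.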

Sample project

A sample project showing how to use speech synthesis with VDK Service is available in C# and Python.

Python
  • Download and extract the zip below

  • Change into the extracted project directory

  • Create and activate a virtual environment (see the Python venv documentation)

  • Install the project: pip install -e .

  • Run the script: vdk-synthesis --help

If you see the list of options, the installation succeeded. You can then start your configured VDK Service and interact with it using those options. For example, vdk-synthesis --list lists the available voices.

VdkServiceSample-VoiceSynthesis (python).zip

C#
  • Download and extract the zip below

  • Open the project solution (.sln)

  • Set VoiceSynthesis as the startup project

  • Build and run the project with the argument --help

If you see the list of options, the build succeeded. You can then start your configured VDK Service and interact with it using those options. For example, --list lists the available voices.

VdkServiceSample-VoiceSynthesis (C#).zip
