
Speech Synthesis (TTS)

Introduction

Speech synthesis, or Text-to-Speech (TTS), converts digital text into spoken audio. Our TTS solution delivers customizable speech output that adapts to your application’s needs.

VDK’s engine supports up to 65 languages for TTS and provides numerous voice options with adjustable pitch, rate, volume, and timbre.

Key Features and Benefits

  • Natural-Sounding Speech:
    Produce lifelike audio output using models of different sizes and quality levels. Benefit from smooth intonation and flexible expressiveness.

  • Multilingual and Multidialect Support:
    Synthesize speech in up to 65 languages and multiple regional dialects. Choose from a wide variety of voices to match your target audience.

  • SSML Compatibility:
    Enhance your synthesized speech using the Speech Synthesis Markup Language (SSML) to control pronunciation, volume, pitch, rate, and other expressive features.

Note: SSML is not available for neural voices.
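
To illustrate the SSML controls mentioned above, here is a minimal Python sketch that wraps plain text in a W3C SSML 1.1 document with a `<prosody>` element for rate, pitch, and volume. The helper name `build_ssml` is hypothetical and is not part of VDK. Which SSML elements and attribute values a given VDK voice accepts is an assumption to check against the engine documentation.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the W3C SSML specification.
SSML_NS = "http://www.w3.org/2001/10/synthesis"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_ssml(text, lang="en-US", rate="medium", pitch="medium", volume="medium"):
    """Hypothetical helper: wrap plain text in <speak><prosody>...</prosody></speak>."""
    # Serialize the SSML namespace as the default namespace (no prefix).
    ET.register_namespace("", SSML_NS)
    speak = ET.Element(
        f"{{{SSML_NS}}}speak",
        {"version": "1.1", f"{{{XML_NS}}}lang": lang},
    )
    prosody = ET.SubElement(
        speak,
        f"{{{SSML_NS}}}prosody",
        {"rate": rate, "pitch": pitch, "volume": volume},
    )
    # ElementTree escapes characters such as & and < on serialization.
    prosody.text = text
    return ET.tostring(speak, encoding="unicode")


if __name__ == "__main__":
    print(build_ssml("Your order has shipped.", rate="slow", pitch="+2st", volume="loud"))
```

Building the markup with an XML library rather than string concatenation keeps the document well-formed when the input text contains reserved characters.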

Performance and Optimization

Real-Time Synthesis:
Our engine is optimized for low-latency synthesis, providing near-instantaneous feedback for interactive applications.

Resource Management:
Choose the right voice quality for your deployment—embedded voices are designed for devices with limited resources, while neural voices offer enhanced naturalness on more powerful systems.
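
The trade-off described above can be sketched as a small decision helper. This is only an illustration under stated assumptions: the function `choose_voice_quality` and the labels `"embedded"` and `"neural"` are hypothetical and are not VDK identifiers. The rule that SSML forces a non-neural voice follows from the note that SSML is not available for neural voices.

```python
def choose_voice_quality(constrained_device: bool, needs_ssml: bool) -> str:
    """Hypothetical helper: pick a voice family from deployment constraints.

    Returns "embedded" when the device has limited resources or when SSML
    markup is required (SSML is not available for neural voices);
    otherwise returns "neural" for enhanced naturalness.
    """
    if constrained_device or needs_ssml:
        return "embedded"
    return "neural"


if __name__ == "__main__":
    # A powerful system that sends plain text can use a neural voice.
    print(choose_voice_quality(constrained_device=False, needs_ssml=False))
```

What counts as a constrained device depends on the target hardware and the selected voice, so the boolean is left for the integrator to determine.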

Next pages to read

Introduction

If you want to start integrating technologies:

Integration

VDK-Service

VSDK
