
Voice Synthesis

Voice synthesis, also known as Text-to-Speech or Text-to-Voice, is a technology that generates speech in real time to read your text aloud. You can choose synthetic voices by language, gender and size.

https://www.youtube.com/watch?v=o65IX2350RY&list=PLxpkg3kmxJgii81jzA9lgohtwcxSm0SHB

Voice synthesis card

  1. Voice synthesis settings. This button opens the voice synthesis settings, which allow you to change a channel's SDK or delete a channel.

  2. Channel card. Click on this card to open the voice synthesis main screen.

  3. Channel name. The name of the channel was chosen in the creation step.

  4. Channel SDK. Indicates which SDK is used for the channel. Please refer to SDK specifics for Voice Synthesis for more details.

  5. Voices list. The list of voices used in the channel. Please refer to SDK specifics for Voice Synthesis to choose your channel’s voices.

  6. Add a channel. This button will open a new wizard which allows you to add a new voice synthesis channel to your current project.

Voice synthesis main screen

  1. Text input. Enter your text in this section.

  2. Fine tune voice (SSML). This button opens a new window that lets you customize your text input using SSML markup (SSML reference). Check out the SSML Markups screen to learn more about this section. You can uncheck the Enable SSML checkbox to play the input as raw text.

  3. Phonetic dictionary. If you have already chosen dictionaries for your channel, you can enable them by checking this combobox. This option is not available if the channel doesn’t have a dictionary.

  4. Available voices. Select the voices to use for voice synthesis in this section. Once you are satisfied with your entry, click on Save as audio files to export it to your computer (.wav file).

  5. Voices list. In this section, you will find all the available voices to test. To test one or several voices for your project, click on the voice name in the list. To select or deselect all the voices in the list, check or uncheck the Selection indicator box. To save your selection in your current project channel, click on the Save configuration button.

  6. Voice filtering. You can filter the voices by “Languages”, “Genders” and/or “size”. To show or hide the voice filter, click on the Filter icon. The list below shows the voices that match the filters you set. To reset all the current filters, click on the Reset icon.

  7. Play voices. Click on the Play icon to test your selected entries. To stop playback, click on the Stop icon.

The voices shown depend on the voices you have already downloaded in VDK Studio and on your channel’s SDK. To download voices, click on the Download more voices button.

You can only select downloaded voices from your channel's SDK.
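The way the voice list narrows down can be pictured with a small sketch. It is only an illustration, not VDK Studio's implementation: the `Voice` record, its field names, the voice names and the sample values ("compact", "standard") are all hypothetical, and it assumes that the active filters combine with a logical AND.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Voice:
    # Hypothetical record; the field names are assumptions for this sketch.
    name: str
    sdk: str
    language: str
    gender: str
    size: str
    downloaded: bool


def visible_voices(voices, channel_sdk, languages=None, genders=None, sizes=None):
    """Keep voices that are downloaded, match the channel SDK, and pass every active filter."""

    def passes(value, allowed):
        # An empty or missing filter lets every value through.
        return not allowed or value in allowed

    return [
        v for v in voices
        if v.downloaded
        and v.sdk == channel_sdk
        and passes(v.language, languages)
        and passes(v.gender, genders)
        and passes(v.size, sizes)
    ]


# Made-up catalog used only to exercise the function.
catalog = [
    Voice("voice_a", "vsdk-csdk", "en-US", "female", "compact", True),
    Voice("voice_b", "vsdk-csdk", "fr-FR", "male", "standard", True),
    Voice("voice_c", "vsdk-vtapi", "en-US", "male", "compact", True),
    Voice("voice_d", "vsdk-csdk", "en-US", "male", "compact", False),
]

selected = visible_voices(catalog, "vsdk-csdk", languages={"en-US"})
print([v.name for v in selected])
```

Here `voice_c` is dropped because it belongs to another SDK, and `voice_d` because it has not been downloaded. Resetting the filters corresponds to calling the function with no filter arguments.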

Not all SSML markups are supported by all SDKs.

SSML Markups screen

  1. Markups. Choose the markup you want to insert in your text input.

  2. Description. Once you click on a tag in the markup list, a description of what the markup is used for appears here.

  3. Parameters. For each markup, you can select the attributes and values to apply to it.

  4. Insert SSML tag. Once you have selected and configured your markup tag, you can click on Insert SSML tag to insert it in your text input.

Once you are done with the markup selection, you can close the current window to get back to the main screen. When you return to the main screen, your markup tag will appear in the text input.

By default, the markup is placed at the location of the text cursor. You can also select a specific part of the text you want to modify. In this case, the markup you insert will automatically be placed around this selected part.
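The placement rule above can be sketched in a few lines of Python. This is a rough illustration, not the Studio's own code: the `insert_markup` helper is hypothetical, and emitting a self-closing tag when nothing is selected is an assumption. The `break` and `emphasis` elements and their `time` and `level` attributes come from the W3C SSML specification; whether a given SDK supports them is not guaranteed.

```python
from xml.sax.saxutils import quoteattr


def insert_markup(text, tag, attrs=None, start=0, end=None):
    """Insert an SSML tag at `start`, or wrap text[start:end] when a selection is given."""
    end = start if end is None else end
    # quoteattr escapes the value and adds the surrounding quotes.
    attr_text = "".join(f" {key}={quoteattr(value)}" for key, value in (attrs or {}).items())
    selected = text[start:end]
    if selected:
        # A selection exists: the markup is placed around it.
        markup = f"<{tag}{attr_text}>{selected}</{tag}>"
    else:
        # No selection: the markup goes at the cursor (assumed self-closing here).
        markup = f"<{tag}{attr_text}/>"
    return text[:start] + markup + text[end:]


text = "Hello world"
at_cursor = insert_markup(text, "break", {"time": "500ms"}, start=5)
around_selection = insert_markup(text, "emphasis", {"level": "strong"}, start=6, end=11)
print(at_cursor)
print(around_selection)
```

The first call models a cursor placed after "Hello" with no selection; the second models selecting the word "world" before inserting the tag.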

Add a voice synthesis channel screen

  1. Channel name. The channel name must be unique in the voice synthesis technology.

  2. SDK. Three different SDKs (vsdk-csdk, vsdk-vtapi and vsdk-baratinoo) are available depending on your license. Please refer to chapter SDK specifics for more details about your SDK.

  3. Next. Once you have filled everything, you can add your channel to your current project by clicking on the Next button and then the Add to project button.

Voice synthesis settings screen

  1. Channels. The list of channels in your current project. From here, you can choose the channel to edit or delete.

  2. Delete. Once you have selected your channel, you can use this button to delete it.

  3. Channel name. The channel name is read-only and cannot be modified.

  4. SDK. Three different SDKs (vsdk-csdk, vsdk-vtapi and vsdk-baratinoo) are available depending on your license. Please refer to chapter SDK specifics for more details about your SDK.

  5. Save. Once you have chosen your new channel SDK, you can save your changes to your current project by clicking on the Save button.

How to create a channel?

  1. Go to Playground.

  2. In the voice synthesis card, click on Add a channel.

  3. In the opened wizard you have to enter:

    1. Your channel’s name: It must be unique within the voice synthesis technology.

    2. Your SDK: The SDK that will be used for voice synthesis. You can check the user guide for a full comparison between the SDKs.

  4. Finish by clicking on Add to project.

You can see your newly created channels in the voice synthesis card.

How to open the voice synthesis settings screen?

In the Voice synthesis card, click on the Settings button to open the settings window that allows you to:

  • Change a channel's SDK

  • Delete a channel

If your project does not include the voice synthesis technology yet, you can click on Add a technology to add it.

How to use the SSML Markups screen?

  1. Select some words in the text input if you want to add a markup with content.

  2. Click on the Fine tune voice (SSML) button to open the SSML markups window.

  3. Select and configure an SSML markup.

  4. Click on the Insert SSML tag button to insert the markup at the cursor position in the text input (any selected text will be moved inside the markup).

How to generate audio files from text input?

  1. In the Voice synthesis screen, enter your text in the text input.

  2. Select the voices to use in the Available voices section.

  3. Click on Save as audio files to save the synthesized text as ".wav" files.

One file is created for each selected voice, named with the format {your_prefix}_{voice_id}.wav.
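As a quick illustration of that naming format, the sketch below lists the file names you could expect for a given prefix and set of voices. The `audio_file_names` helper, the "greeting" prefix and the voice identifiers are hypothetical; real voice IDs depend on your channel's SDK.

```python
from pathlib import Path


def audio_file_names(prefix, voice_ids, directory="."):
    """Build one {prefix}_{voice_id}.wav path per selected voice."""
    return [Path(directory) / f"{prefix}_{voice_id}.wav" for voice_id in voice_ids]


# Made-up prefix and voice identifiers, for illustration only.
names = audio_file_names("greeting", ["voice_a", "voice_b"])
print([path.name for path in names])
```

A helper like this can be handy when a script needs to check that every expected file was exported.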

How to add Voice Synthesis technology to your project?

  1. Open or create a custom project.

  2. In the project editor (left side bar), click on the Add a technology button.

  3. Select Add voice synthesis and click on Next.

  4. Enter your channel name, select an SDK and click on Next.

  5. Click on the Add to project button.
