VDK features two TTS libraries: CSDK and Baratinoo.

Configuration

TTS engines must be configured before the program starts. The following is a complete configuration for the CSDK engine, with two channels that each target a different language:

{
    "version": "2.0",
    "csdk": {
        "tts": {
            "channels": {
                "channelFrf": {
                    "voices": ["frf,aurelie,embedded-compact", "enu,ava,embedded-compact"]
                },
                "channelEnu": {
                    "voices": ["enu,ava,embedded-compact"]
                }
            }
        }
    }
}
JSON

An empty channel list triggers an error, and so does an empty voice list inside a channel.
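To make the rule above concrete, here is a minimal sketch of the same check done on the application side, assuming the `channels` section has already been loaded into a map of channel name to voice list. The `ChannelConfigCheck` class is hypothetical and not part of VSDK, which performs its own validation when it reads the configuration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper, not part of VSDK: mirrors the "no empty list" rule. */
final class ChannelConfigCheck {

    /** Returns one message per problem found; an empty list means the map passes the check. */
    static List<String> findProblems(Map<String, List<String>> channels) {
        List<String> problems = new ArrayList<>();
        if (channels == null || channels.isEmpty()) {
            problems.add("no channel declared");
            return problems;
        }
        for (Map.Entry<String, List<String>> entry : channels.entrySet()) {
            List<String> voices = entry.getValue();
            if (voices == null || voices.isEmpty()) {
                problems.add("channel '" + entry.getKey() + "' has no voice");
            }
        }
        return problems;
    }
}
```

Running such a check before calling the engine only gives an earlier, more readable message; it does not replace the engine's own error reporting.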

Voice format

Each engine has its own voice format, described in the following table:

Engine            Format                         Example
vsdk-csdk         <language>,<name>,<quality>    enu,evan,embedded-pro
vsdk-baratinoo    <name>                         Arnaud_neutre
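As an illustration of the vsdk-csdk format, the sketch below splits a voice string into its three fields and rejects anything that does not have exactly three non-empty parts. The `CsdkVoice` class is hypothetical and not part of VSDK; it only shows how the `<language>,<name>,<quality>` string is structured.

```java
import java.util.Objects;

/** Hypothetical helper, not part of VSDK: the three fields of a vsdk-csdk voice string. */
final class CsdkVoice {
    final String language;
    final String name;
    final String quality;

    private CsdkVoice(String language, String name, String quality) {
        this.language = language;
        this.name = name;
        this.quality = quality;
    }

    /** Parses "<language>,<name>,<quality>", for example "enu,evan,embedded-pro". */
    static CsdkVoice parse(String voice) {
        Objects.requireNonNull(voice, "voice");
        // A limit of -1 keeps trailing empty fields so "enu,evan," is rejected below.
        String[] parts = voice.split(",", -1);
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected 3 comma-separated fields: " + voice);
        }
        for (String part : parts) {
            if (part.trim().isEmpty()) {
                throw new IllegalArgumentException("empty field in voice: " + voice);
            }
        }
        return new CsdkVoice(parts[0].trim(), parts[1].trim(), parts[2].trim());
    }

    /** Rebuilds the comma-separated form used in the configuration file. */
    @Override
    public String toString() {
        return language + "," + name + "," + quality;
    }
}
```

A vsdk-baratinoo voice such as Arnaud_neutre is a single name, so it needs no parsing and would be rejected by this helper.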

Starting the engine

com.vivoka.vsdk.Vsdk.init(mContext, "config/main.json", vsdkSuccess -> {
    if (vsdkSuccess)
    {
        com.vivoka.csdk.tts.Engine.getInstance().init(mContext, engineSuccess -> {
            if (engineSuccess)
            {
                // at this point the TtsEngine has been correctly initialized   
            }    
        });
    }
});
JAVA

Creating a channel

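Remember: a channel must be declared in the configuration before it can be created, and the requested voice must be one of the voices listed for that channel.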

Channel channelFrf = com.vivoka.csdk.tts.Engine.getInstance().makeChannel("channelFrf", "frf,aurelie,embedded-compact");
JAVA

Speech Synthesis

Speech synthesis is asynchronous: the call returns immediately and does not block the calling thread while synthesis runs. The result is available once the callback is invoked.

channelFrf.synthesisFromText("Bonjour ! Je suis une voix synthétique", () -> {
    // channelFrf.synthesisResult contains the audioData to play
});

// Also works with SSML input
final String ssml = "<speak version=\"1.0\" xmlns=\"http://www.w3.org/2001/10/synthesis\" xml:lang=\"fr-FR\">Bonjour Vivoka</speak>";
channelFrf.synthesisFromSSML(ssml, () -> {
    // channelFrf.synthesisResult contains the audioData to play
});
JAVA
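Because the result only becomes available in the callback, follow-up work has to be chained rather than written on the next line. One possible way to do this is to adapt the callback to a `CompletableFuture`, as in the minimal sketch below. The `AsyncBridge` class is hypothetical and not part of VSDK; the sketch assumes the synthesis callback takes no arguments, as in the calls above, and `CompletableFuture` requires Android API level 24 or later.

```java
import java.util.concurrent.CompletableFuture;
import java.util.function.Consumer;

/** Hypothetical helper, not part of VSDK: adapts a completion callback to a future. */
final class AsyncBridge {

    /**
     * Calls start with a Runnable that completes the returned future when run.
     * Assumed usage (based on the callback shape shown in this section):
     *   AsyncBridge.toFuture(done -> channelFrf.synthesisFromText("Bonjour", done::run))
     *              .thenRun(() -> { ... use channelFrf.synthesisResult here ... });
     */
    static CompletableFuture<Void> toFuture(Consumer<Runnable> start) {
        CompletableFuture<Void> done = new CompletableFuture<>();
        try {
            start.accept(() -> done.complete(null));
        } catch (RuntimeException e) {
            // The asynchronous call failed before it could even be started.
            done.completeExceptionally(e);
        }
        return done;
    }
}
```

Avoid blocking on the returned future from the Android main thread: if the callback happens to be delivered on that same thread, waiting for it would never finish.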

Playing the result

VSDK provides an audio player that can play the synthesis result directly:

AudioPlayer.play(channel.synthesisResult.getAudioData(),
                 channel.synthesisResult.getSampleRate(),
                 new AudioTrack.OnPlaybackPositionUpdateListener()
                 {
                     @Override
                     public void onMarkerReached(AudioTrack track) {}

                     @Override
                     public void onPeriodicNotification(AudioTrack track) {}
                 });
JAVA

The audio data is a 16-bit signed little-endian PCM buffer. The channel count is always 1 (mono), and the sample rate depends on the engine:

Engine       Sample rate (Hz)
csdk         22050
baratinoo    24000
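The format described above (16-bit, mono, little-endian) is enough to work out the duration of a buffer and to decode it into samples. The sketch below assumes the audio data is available as a `byte[]`; the `PcmInfo` class is hypothetical and not part of VSDK.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper, not part of VSDK: basic arithmetic on 16-bit mono PCM. */
final class PcmInfo {
    static final int BYTES_PER_SAMPLE = 2; // 16-bit samples
    static final int CHANNEL_COUNT = 1;    // always mono

    /** Duration in seconds of a raw PCM buffer of byteCount bytes. */
    static double durationSeconds(int byteCount, int sampleRateHz) {
        if (sampleRateHz <= 0) {
            throw new IllegalArgumentException("sample rate must be positive");
        }
        return (double) byteCount / (BYTES_PER_SAMPLE * CHANNEL_COUNT) / sampleRateHz;
    }

    /** Decodes little-endian byte pairs into signed 16-bit samples. */
    static short[] toSamples(byte[] pcm) {
        short[] samples = new short[pcm.length / BYTES_PER_SAMPLE];
        ByteBuffer.wrap(pcm)
                  .order(ByteOrder.LITTLE_ENDIAN)
                  .asShortBuffer()
                  .get(samples);
        return samples;
    }
}
```

For example, 44100 bytes at 22050 Hz hold 22050 samples, which is exactly one second of audio.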

Storing the result on disk

channel.synthesisResult.saveToFile("directory", "filename", new ICreateAudioFileListener(){});
JAVA

Only the raw PCM format is available, which means the file contains sample data only and has no audio header of any sort.
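Since a headerless file cannot be opened by most audio players, it can be useful to wrap the data in a WAV container. The sketch below prepends the standard 44-byte RIFF/WAVE header for 16-bit mono PCM. The `WavWrapper` class is hypothetical and not part of VSDK, and the sample rate passed in is assumed to match the engine that produced the data.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper, not part of VSDK: wraps raw 16-bit mono PCM in a WAV container. */
final class WavWrapper {

    static byte[] wrap(byte[] pcm, int sampleRateHz) {
        final int channels = 1;
        final int bitsPerSample = 16;
        final int blockAlign = channels * bitsPerSample / 8; // bytes per sample frame
        final int byteRate = sampleRateHz * blockAlign;      // bytes per second

        ByteBuffer out = ByteBuffer.allocate(44 + pcm.length).order(ByteOrder.LITTLE_ENDIAN);
        out.put("RIFF".getBytes(StandardCharsets.US_ASCII));
        out.putInt(36 + pcm.length);          // size of everything after this field
        out.put("WAVE".getBytes(StandardCharsets.US_ASCII));
        out.put("fmt ".getBytes(StandardCharsets.US_ASCII));
        out.putInt(16);                       // size of the fmt chunk for PCM
        out.putShort((short) 1);              // audio format 1 = uncompressed PCM
        out.putShort((short) channels);
        out.putInt(sampleRateHz);
        out.putInt(byteRate);
        out.putShort((short) blockAlign);
        out.putShort((short) bitsPerSample);
        out.put("data".getBytes(StandardCharsets.US_ASCII));
        out.putInt(pcm.length);               // size of the sample data
        out.put(pcm);
        return out.array();
    }
}
```

The returned array can then be written to a `.wav` file with any standard file API, for instance `java.nio.file.Files.write`.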