Several samples are available. Choose the one that matches your use case (ASR, TTS, ASR+TTS, ...) and build it to see how to use our SDK.
These samples are easy to follow, and integrating them into an existing project should be straightforward.
We rely on Conan to fetch the dependencies and CMake to build. All of our samples contain the required files to do it.
We support Speech Synthesis Markup Language (SSML) tagging to customize the flow of the speech with any voice.
Yes, our voice synthesis technology supports the Speech Synthesis Markup Language (SSML), which provides a standard way to control aspects of speech such as pronunciation, volume, pitch, and rate across different voices.
See the Speech Synthesis Markup Language (SSML) page for the list of supported markups.
There are two ways to use multiple languages with the voice synthesis technology:
Choose a multi-language voice that supports both German and English, e.g. Anna-ml or Petra-ml, and use the lang SSML markup to select the language of your choice.
English, <lang xml:lang="de">Deutsch</lang>, English.
Switch between voices when you want to say a word in a different language, using the voice SSML markup.
English, <voice xml:lang="de">Deutsch</voice>, English
See the Providers specifics for Voice Synthesis page for more details on the features supported by each provider.
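The two approaches above can be sketched with a small helper that builds the markup and checks that it is well-formed before it is sent for synthesis. This is only an illustration using the Python standard library: the function names `lang_span`, `voice_span`, and `is_well_formed` are hypothetical and not part of the SDK, and the `<speak>` wrapper is an assumption used solely so that the fragment can be parsed as XML.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def lang_span(text: str, lang: str) -> str:
    """Wrap text in a <lang> element (for a multi-language voice)."""
    return f"<lang xml:lang={quoteattr(lang)}>{escape(text)}</lang>"


def voice_span(text: str, lang: str) -> str:
    """Wrap text in a <voice> element (to switch voice by language)."""
    return f"<voice xml:lang={quoteattr(lang)}>{escape(text)}</voice>"


def is_well_formed(fragment: str) -> bool:
    """Return True if the fragment parses as XML inside a root element.

    The <speak> root is an assumption made for parsing only.
    """
    try:
        ET.fromstring(f"<speak>{fragment}</speak>")
        return True
    except ET.ParseError:
        return False


# Same sentences as in the two examples above.
ssml_lang = f"English, {lang_span('Deutsch', 'de')}, English."
ssml_voice = f"English, {voice_span('Deutsch', 'de')}, English"

print(ssml_lang)
print(ssml_voice)
print(is_well_formed(ssml_lang), is_well_formed(ssml_voice))
```

Escaping the text with `escape` keeps characters such as `&` or `<` in user-provided input from breaking the markup, and the well-formedness check catches truncated or mismatched tags early.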
If you want a word to be pronounced differently, you can use the sub or the phoneme SSML markup.
The sub markup is used to substitute text for the purposes of pronunciation.
<sub alias="Voice Development Kit">VDK</sub>
The phoneme markup is used to provide a phonetic pronunciation for the contained text.
<phoneme alphabet="ipa" ph="vivo͡ʊkə">Vivoka</phoneme>
See the Speech Synthesis Markup Language (SSML) page for more details on how to use the sub and phoneme markups.
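When several words need a custom pronunciation, the sub and phoneme markups can be generated rather than typed by hand. The sketch below is a hypothetical helper, not SDK functionality: `apply_aliases`, `phoneme`, and the `ALIASES` table are names invented for this example, and it assumes whole-word, case-sensitive matching is what you want.

```python
import re
from xml.sax.saxutils import escape, quoteattr

# Hypothetical pronunciation table: written form -> spoken form.
ALIASES = {"VDK": "Voice Development Kit"}


def apply_aliases(text: str, aliases: dict) -> str:
    """Wrap every known word in a <sub alias="..."> element, in one pass."""
    if not aliases:
        return escape(text)
    # Longest words first so that overlapping entries match greedily.
    words = sorted(aliases, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(w) for w in words) + r")\b")
    parts = []
    last = 0
    for match in pattern.finditer(text):
        word = match.group(0)
        parts.append(escape(text[last:match.start()]))
        parts.append(f"<sub alias={quoteattr(aliases[word])}>{escape(word)}</sub>")
        last = match.end()
    parts.append(escape(text[last:]))
    return "".join(parts)


def phoneme(text: str, ph: str, alphabet: str = "ipa") -> str:
    """Wrap text in a <phoneme> element with the given transcription."""
    return (
        f"<phoneme alphabet={quoteattr(alphabet)} ph={quoteattr(ph)}>"
        f"{escape(text)}</phoneme>"
    )


print(apply_aliases("The VDK is ready & tested.", ALIASES))
print(phoneme("Vivoka", "vivo͡ʊkə"))
```

Matching against the original text in a single pass means an alias that itself contains a listed word is never substituted twice.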
Our solution can run on a Raspberry Pi. On our side, we use both the Raspberry Pi 3B+ and the Raspberry Pi 4 (32-bit and 64-bit) as test devices.