Frequently Asked Questions - FAQ

How complex is integration into an existing C/C++ application?

Several samples are available. Choose the one that matches your use case (ASR, TTS, ASR+TTS, ...) and build it to see how the SDK is used.
The samples are short and easy to follow, so integrating the SDK into an existing project should be straightforward.
We rely on Conan to fetch the dependencies and CMake to build. Every sample contains the files required for both.

Can your system alter/customize voices (pitch, speed, distortion, etc.)?

We support Speech Synthesis Markup Language (SSML) tagging to customize the flow of speech with any voice.

Does your system make use of any markup languages such as SSML?

Yes, our voice synthesis technology supports the Speech Synthesis Markup Language (SSML), which provides a standard way to control aspects of speech such as pronunciation, volume, pitch, and rate across different voices.
See the Speech Synthesis Markup Language (SSML) page for the list of supported markups.
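
As an illustration of the kind of control SSML offers, here is a minimal sketch using the standard W3C prosody element. The attribute values shown are taken from the SSML specification. Whether a given voice or provider honors each of them is an assumption to verify against the list of supported markups.

```xml
This sentence uses the default settings.
<prosody rate="slow" pitch="high">This sentence is spoken more slowly and at a higher pitch.</prosody>
<prosody volume="loud">This sentence is spoken louder.</prosody>
```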

Is it possible to use multiple languages in voice synthesis, for example a mix of German and English?

There are two ways to use multiple languages with the voice synthesis technology:

Choose a multi-language voice that supports both German and English, e.g. Anna-ml or Petra-ml, and use the SSML lang markup to select the language of your choice.
XML
```
English, <lang xml:lang="de">Deutsch</lang>, English.
```
Switch between voices when you want to say a word in a different language, using the SSML voice markup.
XML
```
English, <voice xml:lang="de">Deutsch</voice>, English
```

See the Providers specifics for Voice Synthesis page for more details about the features supported by each provider.

Would it be possible to pronounce certain words differently?

If you want a word to be pronounced differently, you can use the sub or the phoneme SSML markup.

The sub markup is used to substitute text for the purposes of pronunciation.

XML

  <sub alias="Voice Development Kit">VDK</sub>

The phoneme markup is used to provide a phonetic pronunciation for the contained text.

XML

  <phoneme alphabet="ipa" ph="vivo͡ʊkə">Vivoka</phoneme>

See the Speech Synthesis Markup Language (SSML) page for more details on how to use the sub and phoneme markups.
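
As a sketch of how the two markups might appear together in a single utterance, the snippet below reuses the two examples from this answer. That a given voice accepts both in the same input is an assumption to check against the SSML page.

```xml
Welcome to the <sub alias="Voice Development Kit">VDK</sub> by <phoneme alphabet="ipa" ph="vivo͡ʊkə">Vivoka</phoneme>.
```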

Can your solution run on a Raspberry Pi / Banana Pi with a Linux operating system?

Our solution can run on a Raspberry Pi. We use both the Raspberry Pi 3B+ and the Raspberry Pi 4 (32-bit and 64-bit) as test devices on our side.