VSDK - Android
Introduction
The Vivoka Software Development Kit (VSDK) for Android delivers a complete suite of speech technologies, including automatic speech recognition (ASR), text-to-speech (TTS) synthesis, voice biometrics, and speech enhancement, all optimized for mobile performance, easy to integrate, and running completely offline.
Get Started Options
Create a new Android project using the provided sample code – Ideal for quickly getting up and running with a working example.
Integrate VSDK into an existing Android project – Best if you're adding voice capabilities to an already developed app.
Option 1: Start from Sample Project
Quickly get up and running with a preconfigured Android project showcasing VSDK usage.
Review the descriptions below to choose the sample(s) that best match your use case. Then, use the corresponding package name to download the sample directly from the Vivoka Console.
Open the Vivoka Console and navigate to your Project Settings.
Go to the Downloads section.
In the search bar, enter the package name from the table below.
| Sample code name | Description | Package name |
|---|---|---|
| Simple application | A basic end-to-end demo combining ASR input with a TTS voice response, using the VSDK. | |
| Chained grammars | Demonstrates how to recognize a Wake Word followed immediately by a command, without requiring a pause between them. | |
| Dynamic grammar | Shows how to add slot values at runtime to a dynamic ASR model and compile it on device. | |
| Speech Enhancement | Demonstrates how to use the Speech Enhancer to clean and process audio in real time. | |
| TTS (text-to-speech) | A simple example showing how to integrate TTS functionality using VSDK. | |
| Voice biometrics | Provides examples for using Voice Biometrics in both authentication and identification modes. | |
| Natural Language Understanding (NLU) | Demonstrates how to use the NLU module to extract slots and intent from text. | |
You should now be able to open the sample code in Android Studio and run it directly.
Option 2: Integrate into Existing Project
This guide requires VSDK version 6 or later
Add VSDK to your current Android app by following the setup and integration steps.
1. Creating and Exporting VDK-Studio Project
First, create your voice project in VDK Studio and export it for the Android platform; the export produces the assets (models and configuration) your app will load at runtime. For details, see:
Managing Vsdk Assets - Android
2. Install Libraries in Android Project
To run your exported project on Android, you need to include the appropriate VSDK libraries.
Integrating Vsdk Libraries - Android
3. Initializing VSDK
Before you can use any voice technology (e.g., ASR, TTS), you must initialize the Vivoka SDK (VSDK).
3.1. Extracting Assets to Internal Storage
Android cannot directly access resources from the assets/ directory at runtime in the way native code expects.
You need to extract the content of .../main/assets/vsdk/ to a writable internal directory using the AssetsExtractor utility provided by VSDK:
```java
import android.content.Context;

import com.vivoka.vsdk.Constants;
import com.vivoka.vsdk.Exception;
import com.vivoka.vsdk.util.AssetsExtractor;

final String vsdkDataPath = getFilesDir().getAbsolutePath() + Constants.vsdkPath;
try {
    // Copy the contents of .../main/assets/vsdk/ into the app's internal storage.
    AssetsExtractor.extract((Context) this, "vsdk", vsdkDataPath);
} catch (Exception e) {
    e.printFormattedMessage();
}
```
For more detailed guide read this page: Managing Vsdk Assets - Android.
3.2. Initializing the VSDK
After the assets are extracted, initialize the VSDK by passing the path to the configuration file:
```java
import android.content.Context;
import android.util.Log;

import com.vivoka.vsdk.Vsdk;

try {
    Vsdk.init((Context) this, vsdkDataPath + "config/vsdk.json", success -> {
        if (!success) {
            // Handle initialization failure.
            return;
        }
        // VSDK is successfully initialized; engines can now be started.
        Log.i("VSDK", "VSDK is successfully initialized.");
    });
} catch (Exception e) {
    e.printFormattedMessage();
}
```
Vsdk.init(...) must be called before any engine (ASR, TTS, etc.) is created or started.
You only need to initialize VSDK once per application lifecycle (e.g., in your Application class or the main Activity’s onCreate() method).
Once you've called Vsdk.init(...), check your logs for the confirmation message "VSDK is successfully initialized.".
If you see it, then everything has been set up correctly. You can now proceed with integrating specific technologies.
Otherwise:
Review all the previous steps (asset extraction, paths, Gradle setup).
Try running our sample code.
If the issue persists, feel free to raise a support ticket.
Engine
Each technology-provider pair in VSDK has its own dedicated engine, which must be initialized once, and only after VSDK has been initialized.
Here’s an example initializing both Speech Enhancement and ASR engines:
```java
import com.vivoka.vsdk.Vsdk;

Vsdk.init(context, "config/main.json", vsdkSuccess -> {
    if (!vsdkSuccess) {
        return; // VSDK initialization failed.
    }
    com.vivoka.vsdk.speechenhancement.s2c.Engine.getInstance().init(engineSuccess -> {
        if (!engineSuccess) {
            return;
        }
        // The Speech Enhancement engine is now ready!
    });
    com.vivoka.vsdk.asr.csdk.Engine.getInstance().init(engineSuccess -> {
        if (!engineSuccess) {
            return;
        }
        // The ASR engine is now ready!
    });
});
```
Once an engine is initialized, you can build your audio pipeline using the corresponding components for that technology.
The following Audio Pipeline example does not require any engine initialization, as it doesn’t rely on any specific technology.
Audio Pipeline
What is a Pipeline?
A pipeline is a processing chain that handles audio flow through three types of components:
Producer: Captures or generates audio (e.g., microphone input, or TTS channel).
Modifiers (optional): Process or alter the audio (e.g., filters, noise reduction).
Consumers: Use or analyze the audio (e.g., speaker, ASR recognizer).
Flow
Producer → [Modifiers] → [Consumers]
Examples:
TTS channel (Producer) → AudioPlayer (Consumer)
AudioRecorder (Producer) → ASR Recognizer (Consumer)
AudioRecorder (Producer) → Speech Enhancer (Modifier) → ASR Recognizer (Consumer)
This modular design allows you to plug and play components based on your use case.
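To make the data flow concrete, here is a minimal sketch of the producer → modifier → consumer pattern in plain Java. The interfaces below are hypothetical stand-ins invented for illustration; they are not VSDK's actual classes, whose signatures differ.

```java
import java.util.ArrayList;
import java.util.List;

public class PipelineSketch {
    // Hypothetical stand-ins for illustration only; VSDK's real interfaces differ.
    interface Producer { short[] produce(); }
    interface Modifier { short[] modify(short[] buffer); }
    interface Consumer { void consume(short[] buffer); }

    static List<Short> run() {
        // Producer: emits a fixed buffer (stand-in for microphone capture).
        Producer producer = () -> new short[] {100, -200, 300, -400};

        // Modifier: halves each sample (stand-in for an enhancement stage).
        Modifier modifier = buffer -> {
            short[] out = new short[buffer.length];
            for (int i = 0; i < buffer.length; i++) {
                out[i] = (short) (buffer[i] / 2);
            }
            return out;
        };

        // Consumer: collects the processed samples (stand-in for ASR or a file sink).
        List<Short> received = new ArrayList<>();
        Consumer consumer = buffer -> {
            for (short sample : buffer) {
                received.add(sample);
            }
        };

        // The pipeline pushes each produced buffer through the chain.
        consumer.consume(modifier.modify(producer.produce()));
        return received;
    }

    public static void main(String[] args) {
        System.out.println(run()); // [50, -100, 150, -200]
    }
}
```

In a real VSDK pipeline the same roles are filled by concrete components such as AudioRecorder, the Speech Enhancer, and an ASR recognizer.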
Pipeline class
```java
Pipeline pipeline = new Pipeline();
pipeline.setProducer(producer);
pipeline.pushBackModifier(modifier); // Several modifiers may be chained.
pipeline.pushBackConsumer(consumer); // Several consumers may be attached.
```
The usage of .start(), .run(), and .stop() may vary depending on the technology you’re using (e.g., ASR, TTS). Always refer to the specific guide for each module.
However, some behaviors are consistent:
`.start()` runs the pipeline in a new thread (non-blocking).
`.run()` runs the pipeline and waits until it is finished (blocking).
`.stop()` terminates the pipeline execution.
A pipeline can be stopped and safely restarted by calling .start() again when needed.
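The contract above can be illustrated with a toy, self-contained Java class. Note that MiniPipeline is entirely hypothetical and shares nothing with VSDK's actual Pipeline implementation; it only demonstrates the threaded start / blocking run / stop semantics.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class LifecycleSketch {
    // Hypothetical mini-pipeline illustrating the start/run/stop contract only.
    static class MiniPipeline {
        private final AtomicBoolean running = new AtomicBoolean(false);
        final AtomicInteger processedBuffers = new AtomicInteger(0);
        private Thread worker;

        // run(): processes buffers on the calling thread until stopped (blocking).
        void run() {
            running.set(true);
            while (running.get()) {
                processedBuffers.incrementAndGet(); // stand-in for one audio buffer
                if (processedBuffers.get() >= 5) {
                    running.set(false); // demo-only exit condition
                }
            }
        }

        // start(): runs the pipeline in a new thread (non-blocking).
        void start() {
            worker = new Thread(this::run);
            worker.start();
        }

        // stop(): terminates execution; start() may be called again afterwards.
        void stop() throws InterruptedException {
            running.set(false);
            if (worker != null) {
                worker.join();
            }
        }
    }

    public static void main(String[] args) {
        MiniPipeline pipeline = new MiniPipeline();
        pipeline.run(); // blocks until the demo exit condition is reached
        System.out.println(pipeline.processedBuffers.get()); // 5
    }
}
```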
Custom Modules
You can implement your own audio modules. This is particularly useful for custom pre-processing or post-processing stages in your voice workflow.
Types of modules
ProducerModule
IModifierModule
IConsumerModule
Implementation
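As a sketch of what a custom processing stage can look like, here is a self-contained gain modifier in plain Java. The ModifierModule interface below is a hypothetical stand-in; refer to the Pipeline - Android page for the real VSDK module interfaces and their signatures.

```java
public class CustomModuleSketch {
    // Hypothetical modifier-module interface for illustration only.
    interface ModifierModule {
        short[] process(short[] buffer);
    }

    // A custom modifier applying a fixed gain, clamped to the 16-bit PCM range.
    static class GainModifier implements ModifierModule {
        private final double gain;

        GainModifier(double gain) {
            this.gain = gain;
        }

        @Override
        public short[] process(short[] buffer) {
            short[] out = new short[buffer.length];
            for (int i = 0; i < buffer.length; i++) {
                int amplified = (int) Math.round(buffer[i] * gain);
                // Clamp to avoid wrap-around distortion on overflow.
                out[i] = (short) Math.max(Short.MIN_VALUE,
                         Math.min(Short.MAX_VALUE, amplified));
            }
            return out;
        }
    }

    public static void main(String[] args) {
        GainModifier doubler = new GainModifier(2.0);
        short[] result = doubler.process(new short[] {1000, -2000, 30000});
        // 30000 * 2 = 60000 clamps to Short.MAX_VALUE (32767).
        System.out.println(result[0] + " " + result[1] + " " + result[2]); // 2000 -4000 32767
    }
}
```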
Example: Creating a basic Audio Recording Pipeline
```java
import com.vivoka.vsdk.audio.Pipeline;
import com.vivoka.vsdk.audio.consumers.File;
import com.vivoka.vsdk.audio.producers.AudioRecorder;

Pipeline pipeline = new Pipeline();
try {
    AudioRecorder audioRecorder = new AudioRecorder(); // Microphone input.
    // The recording will be saved in your app's internal storage.
    File file = new File(getFilesDir().getAbsolutePath() + "/audio.pcm", true);
    pipeline.setProducer(audioRecorder);
    pipeline.pushBackConsumer(file);
    pipeline.start();
} catch (Exception e) {
    e.printFormattedMessage();
}
```
AudioRecorder requires the Android RECORD_AUDIO permission.
If everything is configured correctly, you should see a new file created in your app's internal storage.
To learn more about the Pipeline implementation, see Pipeline - Android.
Read more
You can probably go directly to the technology guides, but if you’re missing some details, you can find them on these pages:
Integrating Vsdk Libraries - Android
Managing Vsdk Assets - Android