VSDK - Android
Introduction
The Vivoka Software Development Kit (VSDK) for Android delivers a complete suite of speech technologies, including automatic speech recognition (ASR), text-to-speech (TTS) synthesis, voice biometrics, and speech enhancement, all optimized for mobile performance, easy to integrate, and running completely offline.
Get Started Options
Create a new Android project using the provided sample code – Ideal for quickly getting up and running with a working example.
Integrate VSDK into an existing Android project – Best if you're adding voice capabilities to an already developed app.
Option 1: Start from Sample Project
Quickly get up and running with a preconfigured Android project showcasing VSDK usage.
Review the descriptions below to choose the sample(s) that best match your use case. Then, use the corresponding package name to download the sample directly from the Vivoka Console.
Open the Vivoka Console and navigate to your Project Settings.
Go to the Downloads section.
In the search bar, enter the package name from the table below.
| Sample code name | Description | Package name |
|---|---|---|
| Simple application | A basic end-to-end demo combining ASR input with a TTS voice response, using the VSDK. | |
| Chained grammars | Demonstrates how to recognize a Wake Word followed immediately by a command, without requiring a pause between them. | |
| Dynamic grammar | Shows how to add slot values at runtime to a dynamic ASR model and compile it on device. | |
| Speech Enhancement | Demonstrates how to use the Speech Enhancer to clean and process audio in real time. | |
| TTS (text-to-speech) | A simple example showing how to integrate TTS functionality using VSDK. | |
| Voice biometrics | Provides examples for using Voice Biometrics in both authentication and identification modes. | |
| Natural Language Understanding (NLU) | Demonstrates how to use the NLU module to extract slots and intent from text. | |
You should now be able to open the sample code in Android Studio and run it directly.
Option 2: Integrate into Existing Project
This guide requires VSDK version 6 or later
Add VSDK to your current Android app by following the setup and integration steps.
1. Creating and Exporting VDK-Studio Project
First, create your voice project in VDK Studio and export it for the Android platform; the export produces the assets (models and configuration) your app will load at runtime. For details, see:
Managing Vsdk Assets - Android
2. Install Libraries in Android Project
To run your exported project on Android, you need to include the appropriate VSDK libraries.
Integrating Vsdk Libraries - Android
3. Initializing VSDK
Before you can use any voice technology (e.g., ASR, TTS), you must initialize the Vivoka SDK (VSDK).
3.1. Extracting Assets to Internal Storage
Android cannot directly access resources from the assets/ directory at runtime in the way native code expects.
You need to extract the content of .../main/assets/vsdk/ to a writable internal directory using the AssetsExtractor utility provided by VSDK:
```java
import android.content.Context;

import com.vivoka.vsdk.Constants;
import com.vivoka.vsdk.Exception;
import com.vivoka.vsdk.util.AssetsExtractor;

final String vsdkDataPath = getFilesDir().getAbsolutePath() + Constants.vsdkPath;
try {
    // Copy the contents of .../main/assets/vsdk/ into the app's internal storage.
    AssetsExtractor.extract((Context) this, "vsdk", vsdkDataPath);
} catch (Exception e) {
    e.printFormattedMessage();
}
```
For more detailed guide read this page: Managing Vsdk Assets - Android.
3.2. Initializing the VSDK
After the assets are extracted, initialize the VSDK by passing the path to the configuration file:
```java
import android.content.Context;
import android.util.Log;

import com.vivoka.vsdk.Vsdk;

try {
    Vsdk.init((Context) this, vsdkDataPath + "config/vsdk.json", success -> {
        if (!success) {
            // Handle initialization failure.
            return;
        }
        // VSDK is successfully initialized; engines can now be started.
        Log.i("VSDK", "VSDK is successfully initialized.");
    });
} catch (Exception e) {
    e.printFormattedMessage();
}
```
Vsdk.init(...) must be called before any engine (ASR, TTS, etc.) is created or started.
You only need to initialize VSDK once per application lifecycle (e.g., in your Application class or the main Activity’s onCreate() method).
Once you've called Vsdk.init(...), check your logs for the confirmation message "VSDK is successfully initialized.".
If you see it, then everything has been set up correctly. You can now proceed with integrating specific technologies.
Otherwise:
Review all the previous steps (asset extraction, paths, Gradle setup).
Try running our sample code.
If the issue persists, feel free to raise a support ticket.
Engine
Each technology-provider pair in VSDK has its own dedicated engine, which must be initialized once, and only after VSDK has been initialized.
Here’s an example initializing both Speech Enhancement and ASR engines:
```java
import com.vivoka.vsdk.Vsdk;

Vsdk.init(context, "config/main.json", vsdkSuccess -> {
    if (!vsdkSuccess) {
        return; // VSDK initialization failed.
    }
    com.vivoka.vsdk.speechenhancement.s2c.Engine.getInstance().init(engineSuccess -> {
        if (!engineSuccess) {
            return;
        }
        // The Speech Enhancement engine is now ready!
    });
    com.vivoka.vsdk.asr.csdk.Engine.getInstance().init(engineSuccess -> {
        if (!engineSuccess) {
            return;
        }
        // The ASR engine is now ready!
    });
});
```
Once an engine is initialized, you can build your audio pipeline using the corresponding components for that technology.
The following Audio Pipeline example does not require any engine initialization, as it doesn’t rely on any specific technology.
Audio Pipeline
What is a Pipeline?
A pipeline is a processing chain that handles audio flow through three types of components:
Producer: Captures or generates audio (e.g., microphone input, or TTS channel).
Modifiers (optional): Process or alter the audio (e.g., filters, noise reduction).
Consumers: Use or analyze the audio (e.g., speaker, ASR recognizer).
Flow
Producer → [Modifiers] → [Consumers]
Examples:
TTS channel (Producer) → AudioPlayer (Consumer)
AudioRecorder (Producer) → ASR Recognizer (Consumer)
AudioRecorder (Producer) → Speech Enhancer (Modifier) → ASR Recognizer (Consumer)
This modular design allows you to plug and play components based on your use case.
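To make the data flow concrete, here is a minimal sketch of the producer → modifier → consumer pattern in plain Java. The interfaces below are hypothetical stand-ins invented for illustration; they are not VSDK's actual classes, whose signatures differ.

```java
import java.util.ArrayList;
import java.util.List;

public class PipelineSketch {
    // Hypothetical stand-ins for illustration only; VSDK's real interfaces differ.
    interface Producer { short[] produce(); }
    interface Modifier { short[] modify(short[] buffer); }
    interface Consumer { void consume(short[] buffer); }

    static List<Short> run() {
        // Producer: emits a fixed buffer (stand-in for microphone capture).
        Producer producer = () -> new short[] {100, -200, 300, -400};

        // Modifier: halves each sample (stand-in for an enhancement stage).
        Modifier modifier = buffer -> {
            short[] out = new short[buffer.length];
            for (int i = 0; i < buffer.length; i++) {
                out[i] = (short) (buffer[i] / 2);
            }
            return out;
        };

        // Consumer: collects the processed samples (stand-in for ASR or a file sink).
        List<Short> received = new ArrayList<>();
        Consumer consumer = buffer -> {
            for (short sample : buffer) {
                received.add(sample);
            }
        };

        // The pipeline pushes each produced buffer through the chain.
        consumer.consume(modifier.modify(producer.produce()));
        return received;
    }

    public static void main(String[] args) {
        System.out.println(run()); // [50, -100, 150, -200]
    }
}
```

In a real VSDK pipeline the same roles are filled by concrete components such as AudioRecorder, the Speech Enhancer, and an ASR recognizer.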
Pipeline class
```java
Pipeline pipeline = new Pipeline();
pipeline.setProducer(producer);
pipeline.pushBackModifier(modifier); // Several modifiers may be chained.
pipeline.pushBackConsumer(consumer); // Several consumers may be attached.
```
The usage of .start(), .run(), and .stop() may vary depending on the technology you’re using (e.g., ASR, TTS). Always refer to the specific guide for each module.
However, some behaviors are consistent:
`.start()` runs the pipeline in a new thread (non-blocking).
`.run()` runs the pipeline and waits until it is finished (blocking).
`.stop()` terminates the pipeline execution.
A pipeline can be stopped and safely restarted by calling .start() again when needed.
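The contract above can be illustrated with a toy, self-contained Java class. Note that MiniPipeline is entirely hypothetical and shares nothing with VSDK's actual Pipeline implementation; it only demonstrates the threaded start / blocking run / stop semantics.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

public class LifecycleSketch {
    // Hypothetical mini-pipeline illustrating the start/run/stop contract only.
    static class MiniPipeline {
        private final AtomicBoolean running = new AtomicBoolean(false);
        final AtomicInteger processedBuffers = new AtomicInteger(0);
        private Thread worker;

        // run(): processes buffers on the calling thread until stopped (blocking).
        void run() {
            running.set(true);
            while (running.get()) {
                processedBuffers.incrementAndGet(); // stand-in for one audio buffer
                if (processedBuffers.get() >= 5) {
                    running.set(false); // demo-only exit condition
                }
            }
        }

        // start(): runs the pipeline in a new thread (non-blocking).
        void start() {
            worker = new Thread(this::run);
            worker.start();
        }

        // stop(): terminates execution; start() may be called again afterwards.
        void stop() throws InterruptedException {
            running.set(false);
            if (worker != null) {
                worker.join();
            }
        }
    }

    public static void main(String[] args) {
        MiniPipeline pipeline = new MiniPipeline();
        pipeline.run(); // blocks until the demo exit condition is reached
        System.out.println(pipeline.processedBuffers.get()); // 5
    }
}
```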
Custom Modules
You can implement your own audio modules. This is particularly useful for custom pre-processing or post-processing stages in your voice workflow.
Types of modules
ProducerModule
IModifierModule
IConsumerModule
Implementation
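As a sketch of what a custom processing stage can look like, here is a self-contained gain modifier in plain Java. The ModifierModule interface below is a hypothetical stand-in; refer to the Pipeline - Android page for the real VSDK module interfaces and their signatures.

```java
public class CustomModuleSketch {
    // Hypothetical modifier-module interface for illustration only.
    interface ModifierModule {
        short[] process(short[] buffer);
    }

    // A custom modifier applying a fixed gain, clamped to the 16-bit PCM range.
    static class GainModifier implements ModifierModule {
        private final double gain;

        GainModifier(double gain) {
            this.gain = gain;
        }

        @Override
        public short[] process(short[] buffer) {
            short[] out = new short[buffer.length];
            for (int i = 0; i < buffer.length; i++) {
                int amplified = (int) Math.round(buffer[i] * gain);
                // Clamp to avoid wrap-around distortion on overflow.
                out[i] = (short) Math.max(Short.MIN_VALUE,
                         Math.min(Short.MAX_VALUE, amplified));
            }
            return out;
        }
    }

    public static void main(String[] args) {
        GainModifier doubler = new GainModifier(2.0);
        short[] result = doubler.process(new short[] {1000, -2000, 30000});
        // 30000 * 2 = 60000 clamps to Short.MAX_VALUE (32767).
        System.out.println(result[0] + " " + result[1] + " " + result[2]); // 2000 -4000 32767
    }
}
```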
Example: Creating a basic Audio Recording Pipeline
```java
import com.vivoka.vsdk.audio.Pipeline;
import com.vivoka.vsdk.audio.consumers.File;
import com.vivoka.vsdk.audio.producers.AudioRecorder;

Pipeline pipeline = new Pipeline();
try {
    AudioRecorder audioRecorder = new AudioRecorder(); // Microphone input.
    // The recording will be saved in your app's internal storage.
    File file = new File(getFilesDir().getAbsolutePath() + "/audio.pcm", true);
    pipeline.setProducer(audioRecorder);
    pipeline.pushBackConsumer(file);
    pipeline.start();
} catch (Exception e) {
    e.printFormattedMessage();
}
```
AudioRecorder requires the Android RECORD_AUDIO permission.
If everything is configured correctly, you should see a new file created in your app's internal storage.
To learn more about the Pipeline implementation, see Pipeline - Android.
Read more
You can probably go directly to the technology guides, but if you’re missing some details, you can find them on these pages:
Integrating Vsdk Libraries - Android
Managing Vsdk Assets - Android