VASR is the name of Vivoka's own ASR engine. It is provided through the VSDK framework under the designation VSDK-VASR.
It is designed to run completely offline and on small devices (such as a Raspberry Pi 3 or 4).

Supported languages

Currently, 5 languages are supported:

  • 🇫🇷 fra-FR → French (France)

  • 🇺🇸 eng-US → English (United States)

  • 🇮🇹 ita-IT → Italian (Italy)

  • 🇪🇸 spa-ES → Spanish (Spain)

  • 🇩🇪 deu-DE → German (Germany)

Supported features

Here is a list of all the features currently supported by the engine:

  • Ability to compile / load BNF-formatted grammars

  • Dynamic content (a.k.a. dynamic slots): ability to declare a rule in a grammar whose values will be provided later

  • Custom phonetics: ability to specify any phonetic transcription for a given word or expression in a grammar

  • Custom phonetics in dynamic content: ability to specify one or more custom phonetic transcriptions along with the values of a particular slot

  • Tag annotations: ability to specify tags in the grammar, used to easily retrieve information from the result

  • Intermediate results: ability to return non-final results while the user is still speaking

  • VAD (Voice Activity Detection): ability to automatically detect when a user speaks

  • Confidence score: ability to return a confidence score with the result

  • Event detection: ability to send feedback when speech or silence is detected

Engine configuration

Like any other engine in VSDK, this one also has its own configuration, which must be provided as a JSON file when instantiating the engine.
Here is a sample of the configuration for VASR:

  "version": "2.0",
  "vasr": {
    "paths": {
      "data_root": "../data"
    "asr": {
      "recognizers": {
        "rec": {
          "acmods": ["eng-US.vam"]
      "models": {
        "cmd": {
          "type": "static",
          "file": "eng-US.vgg"

This configuration represents the minimum required to operate the engine. The complete configuration, with a description of every field, is available on the Configuration file page.

Required resources

In order to run correctly, the engine requires at least 2 files: an acoustic model and a compiled grammar.

The acoustic model file ends with a .vam extension (Vivoka Acoustic Model) and the compiled grammar ends with .vgg (Vivoka Grammar Graph). For more details about these files, see the Acoustic model file page for the acoustic model and the Compiled grammar page for the compiled grammar.
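Since the engine cannot start without these two files, a pre-flight check can make a missing resource easier to diagnose than a later engine error. Below is a minimal sketch; the helper name is an illustrative assumption (not part of the VSDK API), and the file paths merely follow the sample configuration above:

```cpp
#include <filesystem>
#include <iostream>
#include <vector>

// Hypothetical helper: returns true only when every required resource exists.
bool resourcesPresent(std::vector<std::filesystem::path> const & paths)
{
    for (auto const & path : paths)
    {
        if (!std::filesystem::exists(path))
        {
            std::cerr << "Missing resource: " << path << '\n';
            return false;
        }
    }
    return true;
}
```

For the sample configuration above, you would call it with `{ "../data/eng-US.vam", "../data/eng-US.vgg" }` before instantiating the engine.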

Sample code

This sample code assumes that the content of the vsdk.json configuration file is similar to the one shown above.

#include <vasr/Engine.hpp>
#include <vasr/GrammarComposer.hpp>
#include <vasr/LanguageModel.hpp>
#include <vasr/Recognizer.hpp>

#include <fmt/core.h>

#include <cstdlib>
#include <exception>
#include <vector>

std::vector<float> audioData(); // Provides the audio samples to recognize

int main() try
{
    Vasr::Engine engine("vsdk.json");
    auto & rec = engine.recognizer("rec");      // String taken from the vsdk.json file at vasr/asr/recognizers
    auto & grm = engine.grammarComposer("cmd"); // String taken from the vsdk.json file at vasr/asr/models

    if (grm.hasSlots())
    {
        // Fill slots
    }

    auto model = grm.compose();
    rec.setModels({ model });

    rec.installResultCallback([](Vasr::AsrResult result)
    {
        if (result.isFinal())
        {
            // Process final result
        }
        else
        {
            // Process intermediate result
        }
    });

    rec.processAudioBuffer(audioData(), true);

    return EXIT_SUCCESS;
}
catch (std::exception const & e)
{
    fmt::print(stderr, "A fatal error occurred: {}\n", e.what());
    return EXIT_FAILURE;
}
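The sample only declares audioData(); for a first compile-and-run test you can stub it out. Below is a minimal sketch that returns one second of silence, assuming the engine accepts mono float samples at 16 kHz (both the sample rate and the format are assumptions, not confirmed by this page — in a real application the buffer would come from a microphone or an audio file):

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stub: one second of silence at an assumed 16 kHz mono sample rate.
std::vector<float> audioData()
{
    constexpr std::size_t sampleRate = 16000; // assumption: engine expects 16 kHz audio
    return std::vector<float>(sampleRate, 0.0f);
}
```

With this stub the sample builds and runs end to end, although the recognizer will naturally report no speech.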