
VSDK Voice Recognition - C++

VDK features three different ASR libraries: CSDK, TNL, and our very own VASR.

Basics

You will work with two concepts: Recognizers and Models. Both need to be configured, but first let's explain what each one does.
Models are fed to the Recognizer and describe the range of words and utterances that can be recognized. They will either be pre-compiled by the provider (like “free speech” models), or compiled from a grammar that you've written beforehand in the VDK Studio.

There are 3 types of models:

Type         Description
static       Static models embed all possible vocabulary inside a single file or folder.
dynamic      Dynamic models have “holes” where you can plug in new vocabulary at runtime. They need to be prepared and compiled at runtime before being installed on a recognizer.
free-speech  Free-speech models are very large vocabulary static models. They often require additional files and are not supported by all engines.

Recognizers inherit from Audio::ConsumerModule. As they receive audio, they compare it to the current model's data and report results.

Configuration

Each engine has its own configuration quirks and tweaks, but here is a common (though incomplete) pattern using VSDK-CSDK, which supports all 3 types of models:

JSON
{
    "version": "2.0",
    "csdk": {
        "paths": {
            "data_root": "../data"
        },
        "asr": {
            "recognizers": {
                "rec": { ... }
            },
            "models": {
                "static_example": {
                    "type": "static",
                    "file": "<model_name>.fcf"
                },
                "dynamic_example": {
                    "type": "dynamic",
                    "file": "<base_model_name>.fcf",
                    "slots": {
                        "firstname": { ... },
                        "lastname": { ... }
                    },
                    ...
                },
                "free-speech_example": {
                    "type": "free-speech",
                    "file": "<base_model_name>.fcf",
                    "extra_models": { ... }
                }
            }
        }
    }
}

Starting the engine

CPP
#include <vsdk/asr/csdk.hpp> // underlying ASR engine, here we choose CSDK
using AsrEngine = Vsdk::Asr::Csdk::Engine;
Vsdk::Asr::EnginePtr const engine = Vsdk::Asr::Engine::make<AsrEngine>("config/vsdk.json");
// engine is a std::shared_ptr, copy it around as needed but don't let it go out of scope while you
// need it!
// const here means the pointer is const, not the pointee (the Engine)

You can't create two separate instances of the same engine! Attempting to create a second one returns another pointer to the existing engine. To make a new instance, first terminate the existing engine (i.e. let all pointers to it go out of scope).
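To make the single-instance behavior concrete, here is a minimal, self-contained sketch of one way such a factory could behave, using a std::weak_ptr. It is not VSDK's actual implementation: DemoEngine and makeDemoEngine are hypothetical names invented for this illustration, and the real engine may work differently internally.

```cpp
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for an ASR engine; NOT the real Vsdk::Asr::Engine.
class DemoEngine {
public:
    explicit DemoEngine(std::string configPath) : configPath_(std::move(configPath)) {}
    std::string const & configPath() const { return configPath_; }
private:
    std::string configPath_;
};

// Returns the live engine if one exists, otherwise creates a new one.
// The factory only keeps a weak reference, so the engine is destroyed as soon
// as the last shared_ptr held by callers goes out of scope.
inline std::shared_ptr<DemoEngine> makeDemoEngine(std::string const & configPath)
{
    static std::mutex mutex;
    static std::weak_ptr<DemoEngine> current;

    std::lock_guard<std::mutex> lock(mutex);
    if (auto existing = current.lock())
        return existing; // an engine is already alive: configPath is ignored
    auto created = std::make_shared<DemoEngine>(configPath);
    current = created;
    return created;
}
```

With this sketch, two consecutive calls to makeDemoEngine return the same object, and a fresh engine is only built once every pointer to the previous one has been released.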

That's it! If no exception was thrown, your engine is ready to be used. Each engine has its own configuration document; check it for further details, and see the ASR samples to get started with actual, production-ready code.

Creating a Recognizer

CPP
auto const rec = engine->recognizer("rec"); // Instantiate the recognizer we configured above

You can then subscribe to the reporting mechanism:

CPP
rec->subscribe([] (Vsdk::Asr::Recognizer::Event const & e) { ... });
rec->subscribe([] (Vsdk::Asr::Recognizer::Error const & e) { ... });
rec->subscribe([] (Vsdk::Asr::Recognizer::Result const & r) { ... });
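The three calls above all use the same function name, and the callback's parameter type decides which kind of notification it receives. The sketch below shows how such overloads can be modeled with std::function. DemoRecognizer, DemoEvent, DemoError, DemoResult and emit are hypothetical names for this illustration only; the fields of the real Event, Error and Result types are not described here.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical payloads standing in for Recognizer::Event / Error / Result.
struct DemoEvent  { std::string name; };
struct DemoError  { std::string message; };
struct DemoResult { std::string text; };

// Hypothetical recognizer: one subscribe() overload per payload type.
class DemoRecognizer {
public:
    void subscribe(std::function<void(DemoEvent const &)> cb)  { onEvent_.push_back(std::move(cb)); }
    void subscribe(std::function<void(DemoError const &)> cb)  { onError_.push_back(std::move(cb)); }
    void subscribe(std::function<void(DemoResult const &)> cb) { onResult_.push_back(std::move(cb)); }

    // Deliver a payload to every callback registered for its type.
    void emit(DemoEvent const & e) const  { for (auto const & cb : onEvent_)  cb(e); }
    void emit(DemoError const & e) const  { for (auto const & cb : onError_)  cb(e); }
    void emit(DemoResult const & r) const { for (auto const & cb : onResult_) cb(r); }

private:
    std::vector<std::function<void(DemoEvent const &)>>  onEvent_;
    std::vector<std::function<void(DemoError const &)>>  onError_;
    std::vector<std::function<void(DemoResult const &)>> onResult_;
};
```

Note that the lambda's parameter must be spelled out explicitly for this to work: a generic lambda taking auto const & would match all three overloads and be ambiguous.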

And finally, apply a model to actually recognize vocabulary:

CPP
rec->setModel("static_example"); // same call whether the model is static, dynamic or free-speech!

Also, don't forget to insert the recognizer into the pipeline (p below), or nothing will happen:

CPP
p.pushBackConsumer(rec);

Dynamic Models

Only dynamic models need to be manipulated explicitly to add the missing data at runtime:

CPP
auto const model = engine->dynamicModel("dynamic_example");
model->addData("firstname", "André");
model->addData("lastname", "Lemoine");
model->compile();
// We can now apply it to a recognizer!
rec->setModel("dynamic_example"); // Or use setModel(model->name())
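The fill-then-compile workflow can be pictured with the toy model below, where each slot is a named list of entries and compile() marks the model as ready. This is only a sketch: DemoDynamicModel, isCompiled and entryCount are hypothetical names, and both rejecting unknown slots and requiring a recompile after new data are assumptions made for the illustration, not documented VSDK behavior.

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical toy model; NOT the real VSDK dynamic model class.
class DemoDynamicModel {
public:
    // Declare the "holes" (slots) that can be filled at runtime.
    explicit DemoDynamicModel(std::vector<std::string> const & slotNames) {
        for (auto const & name : slotNames)
            slots_[name]; // creates an empty entry list for this slot
    }

    // Add one vocabulary entry to a slot (assumption: unknown slots are rejected).
    void addData(std::string const & slot, std::string const & value) {
        auto it = slots_.find(slot);
        if (it == slots_.end())
            throw std::invalid_argument("unknown slot: " + slot);
        it->second.push_back(value);
        compiled_ = false; // assumption: new data requires compiling again
    }

    void compile() { compiled_ = true; }
    bool isCompiled() const { return compiled_; }
    std::size_t entryCount(std::string const & slot) const { return slots_.at(slot).size(); }

private:
    std::map<std::string, std::vector<std::string>> slots_;
    bool compiled_ = false;
};
```

In this sketch, a caller would check isCompiled() before handing the model to a recognizer, mirroring the rule that a dynamic model must be compiled before it is applied.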
