
Voice Biometrics - C++

Introduction

Voice biometrics is a technology that uses the unique characteristics of a person’s voice to identify or authenticate them.

Use cases

  • Authentication: Verifies if the speaker matches a specific enrolled identity.

  • Identification: Determines which enrolled user is speaking.

Providers

| Feature | TSSV | IDVoice |
|---|---|---|
| Accuracy & Performance | Faster, but less accurate | Slower, but more accurate |
| Result Behavior | Returns results only if confidence ≥ threshold | Returns all results, regardless of confidence |
| Language Dependency | Language-agnostic | Language-agnostic |
| Enrollment Flow | Identical for both providers | Identical for both providers |
| Supported Modes | Text-dependent and text-independent | Text-dependent and text-independent |

Audio Format

The input audio data for enrollment and recognition is a 16-bit signed PCM buffer in little-endian format. It is always mono (1 channel), with a sample rate of 16 kHz.
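If your audio source produces normalized floating-point samples, they have to be converted before being handed to the engine. The following is a minimal sketch of that conversion, assuming samples in the range [-1.0, 1.0]; `toPcm16Le` and `durationSeconds` are hypothetical helper names and are not part of the SDK.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not an SDK function): converts normalized float samples
// in [-1.0, 1.0] to 16-bit signed PCM, serialized as little-endian bytes.
std::vector<std::uint8_t> toPcm16Le(std::vector<float> const & samples)
{
    std::vector<std::uint8_t> bytes;
    bytes.reserve(samples.size() * 2);
    for (float const s : samples)
    {
        float const clamped = std::clamp(s, -1.0f, 1.0f); // avoid overflow on loud input
        auto const value = static_cast<std::int16_t>(std::lround(clamped * 32767.0f));
        auto const raw   = static_cast<std::uint16_t>(value);
        bytes.push_back(static_cast<std::uint8_t>(raw & 0xFF));        // low byte first
        bytes.push_back(static_cast<std::uint8_t>((raw >> 8) & 0xFF)); // then high byte
    }
    return bytes;
}

// Duration in seconds of a mono, 16 kHz, 16-bit buffer of the given size in bytes.
double durationSeconds(std::size_t byteCount)
{
    constexpr double sampleRate     = 16000.0;
    constexpr double bytesPerSample = 2.0;
    return static_cast<double>(byteCount) / (sampleRate * bytesPerSample);
}
```

With this format, one second of audio always occupies 32,000 bytes, which makes it easy to estimate how much speech a buffer contains.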

Getting Started

Before you begin, make sure you’ve completed all the necessary preparation steps.
There are two ways to prepare your project for Voice Biometrics:

  1. Using sample code

  2. Starting from scratch

From Sample Code

To download the sample code, you'll need Conan. All the necessary steps are outlined in the general Getting Started guide.

📦 voice-biometrics

CODE
conan search -r vivoka-customer voice-biometrics  # To get the latest version.
conan inspect -r vivoka-customer -a options voice-biometrics/<version>@vivoka/customer
conan install -if voice-biometrics voice-biometrics/<version>@vivoka/customer -o voice_bio_engine=tssv
  • Open project.vdk in VDK-Studio

  • Export the assets from VDK-Studio into the same directory (optionally, you can enroll users through VDK-Studio)

CODE
conan install . -if build
conan build . -if build
./build/Release/voice-biometrics

From Scratch

Before proceeding, make sure you’ve completed the following steps:

1. Prepare your VDK Studio project
  • Create a new project in VDK Studio

  • Add the Voice Biometrics technology

  • Add a model (you can optionally enroll users now, or handle enrollment later within your app)

  • Export the project to generate the required assets and configuration

2. Set up your project
  • Install the necessary libraries

    • vsdk-audio-portaudio/<version>@vivoka/customer

    • vsdk-samples-utils/<version>@vivoka/customer

    • vsdk-tssv/<version>@vivoka/customer

    • vsdk-idvoice/<version>@vivoka/customer

These steps are explained in more detail in the Getting Started guide.

Start Recognition

1. Initialize Engine

Start by initializing the Voice Biometrics engine and model:

You cannot create two instances of the same engine.

CPP
#include <vsdk/global.hpp>
#include <vsdk/Exception.hpp>
#include <vsdk/utils/samples/EventLoop.hpp>

#include <vsdk/biometrics/tssv.hpp>
// #include <vsdk/biometrics/idvoice.hpp>
using namespace Vsdk::Biometrics;
using BioEngine = Vsdk::Biometrics::Tssv::Engine;
// using BioEngine = Vsdk::Biometrics::Idvoice::Engine; // Use idvoice include if you prefer IDRD engine
auto const engine = Vsdk::Biometrics::Engine::make<BioEngine>("config/vsdk.json");
CPP
int confidenceThreshold = 5;

// Create ONE model, choosing the type that matches your use case:
auto model = engine->makeModel(modelName, ModelType::TextDependant, confidenceThreshold);
// or: auto model = engine->makeModel(modelName, ModelType::TextIndependant, confidenceThreshold);

The third parameter is the required confidence level. It ranges from 0 to 10 and behaves differently depending on your provider. A value of 10 makes the recognizer as strict as possible.

We recommend testing the application in real-world conditions to determine the minimum score that best fits your needs. This helps you balance between two types of errors:

  • False rejection: when a valid user is incorrectly rejected.

  • False acceptance: when an invalid user is incorrectly accepted.

By default, you can simply check whether the score is greater than 0, but fine-tuning the minimum score for your use case will give you better accuracy and security.
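One way to choose a minimum score is to collect scores from real-world test attempts and measure both error rates at several candidate values. The sketch below assumes you have logged, for each attempt, the reported score and whether the speaker really was the claimed user; `Trial`, `ErrorRates` and `errorRatesAt` are illustrative names, not SDK types.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical record of one test attempt (not an SDK type).
struct Trial
{
    float score;   // score reported by the recognizer
    bool  genuine; // true if the speaker really was the claimed user
};

struct ErrorRates
{
    double falseRejection;  // share of genuine attempts that were rejected
    double falseAcceptance; // share of impostor attempts that were accepted
};

// Computes both error rates when an attempt is accepted only if score > minScore.
ErrorRates errorRatesAt(std::vector<Trial> const & trials, float minScore)
{
    std::size_t genuine = 0, impostor = 0, rejected = 0, accepted = 0;
    for (auto const & t : trials)
    {
        bool const pass = t.score > minScore;
        if (t.genuine) { ++genuine;  if (!pass) ++rejected; }
        else           { ++impostor; if (pass)  ++accepted; }
    }
    return {
        genuine  ? static_cast<double>(rejected) / static_cast<double>(genuine)  : 0.0,
        impostor ? static_cast<double>(accepted) / static_cast<double>(impostor) : 0.0,
    };
}
```

Raising the minimum score lowers the false acceptance rate at the cost of more false rejections, so evaluate a few values and keep the one whose trade-off suits your application.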

2. Enroll users

You can enroll users either through the VDK-Studio interface or directly within your application. In the app, enrollment can be done using a file or a buffer.

Enrollment Requirements
  • Text-dependent: Requires at least 4 recordings of the same phrase.

  • Text-independent: Requires at least 13 seconds of speech.
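These requirements can be checked in the application before enrolling a user. The following sketch assumes raw recordings in the format described under Audio Format (16 kHz, mono, 16-bit PCM); `EnrollmentMode` and `hasEnoughEnrollmentData` are hypothetical names introduced for illustration only.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical enum, unrelated to the SDK's ModelType.
enum class EnrollmentMode { TextDependent, TextIndependent };

// Each entry of recordingSizes is the size in bytes of one raw recording
// (16 kHz, mono, 16-bit PCM, i.e. 32000 bytes per second).
bool hasEnoughEnrollmentData(EnrollmentMode mode, std::vector<std::size_t> const & recordingSizes)
{
    constexpr std::size_t minRecordings  = 4;              // text-dependent minimum
    constexpr double      minSeconds     = 13.0;           // text-independent minimum
    constexpr double      bytesPerSecond = 16000.0 * 2.0;

    if (mode == EnrollmentMode::TextDependent)
        return recordingSizes.size() >= minRecordings;

    auto const totalBytes =
        std::accumulate(recordingSizes.begin(), recordingSizes.end(), std::size_t{0});
    return static_cast<double>(totalBytes) / bytesPerSecond >= minSeconds;
}
```

Note that this only checks the amount of audio. It cannot tell whether the recordings actually contain speech or, in text-dependent mode, the same phrase.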

CPP
model->addRecord("user-name", filePath1);
model->addRecord("user-name", filePath2);
// ... add the remaining recordings for this user
model->compile(); // call once all records have been added
fmt::print("Enrolled users: '{}'\n", fmt::join(model->users(), "', '"));

The more data you provide, the better the model's performance will be. For best results, record the data under conditions that match the model's intended use case.

The preferred recording format is 16 kHz, mono, 16-bit signed PCM.
You can create such a WAV file as follows:

On Linux: arecord -c 1 -f S16_LE -r 16000 filename.wav

On Windows:
Use Audacity or any audio recorder that allows manual format selection.
In Audacity:

  1. Set the Project Rate (Hz) (bottom left) to 16000.

  2. Set the recording to Mono (1 channel).

  3. Record your audio.

  4. Export it as WAV (Signed 16-bit PCM).
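Before enrolling from a file, it can be useful to confirm that the recording really has the expected format. The sketch below reads the `fmt ` chunk of a WAV file that has already been loaded into memory. It relies only on the standard RIFF/WAVE layout; `WavFormat`, `parseWavFormat` and `isEnrollmentReady` are hypothetical helpers, not SDK functions.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <vector>

struct WavFormat
{
    std::uint16_t audioFormat;   // 1 means uncompressed PCM
    std::uint16_t channels;
    std::uint32_t sampleRate;
    std::uint16_t bitsPerSample;
};

// WAV fields are stored in little-endian byte order.
static std::uint16_t readU16(std::uint8_t const * p)
{
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

static std::uint32_t readU32(std::uint8_t const * p)
{
    return static_cast<std::uint32_t>(p[0])         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16) | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Walks the RIFF chunks and returns the content of the "fmt " chunk, if present.
std::optional<WavFormat> parseWavFormat(std::vector<std::uint8_t> const & data)
{
    if (data.size() < 12 || std::memcmp(data.data(), "RIFF", 4) != 0
                         || std::memcmp(data.data() + 8, "WAVE", 4) != 0)
        return std::nullopt;

    std::size_t pos = 12;
    while (pos + 8 <= data.size())
    {
        std::uint32_t const size = readU32(data.data() + pos + 4);
        if (std::memcmp(data.data() + pos, "fmt ", 4) == 0)
        {
            if (size < 16 || pos + 8 + 16 > data.size())
                return std::nullopt;
            std::uint8_t const * f = data.data() + pos + 8;
            return WavFormat{readU16(f), readU16(f + 2), readU32(f + 4), readU16(f + 14)};
        }
        pos += 8 + static_cast<std::size_t>(size) + (size & 1u); // chunks are padded to an even size
    }
    return std::nullopt;
}

// True if the file matches the preferred format: PCM, mono, 16 kHz, 16-bit.
bool isEnrollmentReady(WavFormat const & f)
{
    return f.audioFormat == 1 && f.channels == 1 && f.sampleRate == 16000 && f.bitsPerSample == 16;
}
```

A file that fails this check can be re-exported with the settings listed above rather than being passed to enrollment as-is.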

3. Recognition

For an authenticator, you must tell the recognizer which user to verify. You only need to do this once, and you can change the user later whenever needed.

CPP
void onVoiceBioResult(Vsdk::details::StatusResult const & r)
{
    auto const id    = r.json["id"   ].get<std::string>();
    auto const score = r.json["score"].get<float>();
    fmt::print("[{}] Result: '{}' (score: {})\n", gRecognizerName, id, score);
}
CPP
auto authenticator = engine->makeAuthenticator(gRecognizerName, model, gConfidence);
authenticator->subscribe([] (Authenticator::Result const & r) { onVoiceBioResult(r); });
authenticator->setUserToRecognize("user-name");

auto identificator = engine->makeIdentificator(gRecognizerName, model, gConfidence);
identificator->subscribe([] (Identificator::Result const & r) { onVoiceBioResult(r); });

We’ll implement a simple pipeline that records audio from the microphone and sends it to the recognizer:

CPP
#include <vsdk/audio/producers/PaMicrophone.hpp>
#include <vsdk/utils/PortAudio.hpp>

auto rec = std::move(identificator);
// or: auto rec = std::move(authenticator);

auto const mic = Vsdk::Audio::Producer::PaMicrophone::make();

Vsdk::Audio::Pipeline pipeline;
pipeline.setProducer(mic);
pipeline.pushBackConsumer(rec);
pipeline.start();
CODE
pipeline.start();
pipeline.stop();
pipeline.run();
  • .start() runs the pipeline in a new thread

  • .run() runs the pipeline and waits until it finishes (blocking)

  • .stop() is used to terminate the pipeline execution

Once a pipeline has been stopped, you can restart it at any time by simply calling .start() again.
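The difference between the blocking and non-blocking calls can be illustrated with a small stand-in class. This is only a sketch of the semantics described above, written with the standard library; `MiniPipeline` is a hypothetical class and does not reflect how Vsdk::Audio::Pipeline is implemented.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical stand-in for an audio pipeline; NOT the SDK implementation.
class MiniPipeline
{
public:
    ~MiniPipeline() { stop(); }

    // Non-blocking: the processing loop runs on a worker thread.
    void start()
    {
        stop();          // join a previous worker so that restarting is safe
        _running = true; // set before spawning so an early stop() is not lost
        _worker = std::thread([this] { loop(); });
    }

    // Blocking: the loop runs on the calling thread until another thread calls stop().
    void run()
    {
        _running = true;
        loop();
    }

    // Asks the loop to finish and waits for the worker thread, if there is one.
    void stop()
    {
        _running = false;
        if (_worker.joinable())
            _worker.join();
    }

    bool isRunning() const { return _running; }
    int  processedBuffers() const { return _processed; }

private:
    void loop()
    {
        while (_running)
        {
            ++_processed; // stand-in for "pull a buffer from the producer, feed the consumers"
            std::this_thread::sleep_for(std::chrono::milliseconds(1));
        }
    }

    std::atomic<bool> _running{false};
    std::atomic<int>  _processed{0};
    std::thread       _worker;
};
```

In this sketch, calling start() returns immediately and leaves the caller free to do other work, whereas run() only returns after stop() has been called from another thread. In both cases the object can be started again after it has been stopped.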

4. Events and errors

CPP
#include <vsdk/biometrics/tssv/Constants.hpp>
// #include <vsdk/biometrics/idvoice/Constants.hpp>

void onVoiceBioEvent(Model::Event const & e)
{
    namespace Key = Vsdk::Constants::Tssv::IdentResult;
    // or namespace Key = Vsdk::Constants::Idvoice::IdentResult;
    // Read the payload from the callback argument `e`.
    auto const user  = e.json[Key::id   ].get<std::string>();
    auto const score = e.json[Key::score].get<float>();
    fmt::print("Ident Result: '{}' (score: {})\n", user, score);
}

void onVoiceBioError(Model::Error const & e)
{
    namespace Key = Vsdk::Constants::Tssv::AuthResult;
    // or namespace Key = Vsdk::Constants::Idvoice::AuthResult;
    // Read the payload from the callback argument `e`.
    auto const user  = e.json[Key::id   ].get<std::string>();
    auto const score = e.json[Key::score].get<float>();
    fmt::print("Auth Result: '{}' (score: {})\n", user, score);
}

model->subscribe(&onVoiceBioEvent);
model->subscribe(&onVoiceBioError);