Voice Biometrics - C++
VDK features two Voice Biometrics libraries: TSSV and IDVoice.
Configuration
Voice biometrics engines must be configured before the program starts. Here is a complete setup for the TSSV provider:
{
"version": "2.0",
"tssv": { // Contrary to other technologies,
"version": "2.0",
"biometrics": { // biometrics paths are relative to the program's working directory!
"generated_models_path": "data/models",
"background_model_TI": "data/text-independent-16kHz.ubm",
"background_model_TD": "data/text-dependent-16kHz.ubm"
}
}
}
Here is the one for IDVoice:
{
"version": "2.0",
"idvoice": {
"version": "2.0",
"biometrics": { // path to the engine model
"background_model": "../data/voice_bio_background_model_idrnd"
},
"paths": { // path to the saved models (users' templates)
"model_generation": "../data/models"
}
}
}
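Since the TSSV comment above notes that biometrics paths are resolved against the program's working directory, a quick existence check before creating the engine can catch a wrong launch directory early. The sketch below uses only the C++17 standard library. `findMissingPaths` is a hypothetical helper, not part of VSDK, and the paths passed to it are whatever your own configuration lists.

```cpp
#include <filesystem>
#include <string>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical helper (not a VSDK function).
// Returns every entry of `relativePaths` that does not exist once resolved
// against `baseDir`. Pass fs::current_path() as `baseDir` to mimic paths that
// are relative to the program's working directory.
std::vector<std::string> findMissingPaths(fs::path const & baseDir,
                                          std::vector<std::string> const & relativePaths)
{
    std::vector<std::string> missing;
    for (auto const & p : relativePaths)
    {
        std::error_code ec; // non-throwing overload: an unreadable path counts as missing
        if (!fs::exists(baseDir / p, ec))
            missing.push_back(p);
    }
    return missing;
}
```

For instance, calling `findMissingPaths(std::filesystem::current_path(), {"data/text-independent-16kHz.ubm", "data/text-dependent-16kHz.ubm"})` should return an empty list when the program is launched from the expected directory.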
Starting the engine
#include <vsdk/biometrics/tssv.hpp>
#include <vsdk/biometrics/idvoice.hpp>
using BioEngine = Vsdk::Biometrics::Tssv::Engine;
//using BioEngine = Vsdk::Biometrics::Idvoice::Engine; // Use the idvoice include if you prefer the IDVoice engine
auto const engine = Vsdk::Biometrics::Engine::make<BioEngine>("./config/vsdk.json");
Creating a model
Models contain enrollment data that recognition operations need.
// To create a text-independent model
auto const model = engine->makeModel("test_ti", Vsdk::Biometrics::ModelType::TEXT_INDEPENDANT);
// Or, to create a text-dependent model instead
//auto const model = engine->makeModel("test_td", Vsdk::Biometrics::ModelType::TEXT_DEPENDANT);
Checking users in the model
If a model with the same name was created previously, it will be loaded. You can check the enrolled users with:
fmt::print("Enrolled users: '{}'", fmt::join(model->users(), "', '"));
Adding a user to the model
You can either add raw audio data directly or from a file. After adding all the data for a given user, compile the model to finalize the enrollment process:
model->addRecord("victorien", "data/victorienti.wav");
model->compile();
The more data you give the model, the better the results will be. Prefer to record the enrollment data in the same conditions as the model's intended use case.
The preferred format is 16 kHz, mono-channel audio. You can create such a WAV file with the command: arecord -c 1 -f S16_LE -r 16000 [filename].wav
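To check that a recording matches the preferred format before enrolling it, the header of the WAV file can be inspected. The sketch below is a minimal, standard-library-only illustration and not part of VSDK: `WavFormat`, `parseWavFormat` and `isPreferredFormat` are hypothetical names, and the parser only looks at the "fmt " chunk of a RIFF/WAVE buffer that has already been loaded into memory.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <vector>

struct WavFormat
{
    std::uint16_t audioFormat;   // 1 means uncompressed PCM
    std::uint16_t channels;
    std::uint32_t sampleRate;    // in Hz
    std::uint16_t bitsPerSample;
};

// Little-endian readers (WAV files store integers in little-endian order).
static std::uint16_t readU16(std::uint8_t const * p)
{
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}

static std::uint32_t readU32(std::uint8_t const * p)
{
    return static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Walks the RIFF chunks of an in-memory WAV file and returns the contents of
// its "fmt " chunk, or std::nullopt if no valid "fmt " chunk can be found.
std::optional<WavFormat> parseWavFormat(std::vector<std::uint8_t> const & bytes)
{
    if (bytes.size() < 12
     || std::memcmp(bytes.data(), "RIFF", 4) != 0
     || std::memcmp(bytes.data() + 8, "WAVE", 4) != 0)
        return std::nullopt;

    std::size_t pos = 12; // first chunk starts right after "RIFF<size>WAVE"
    while (pos + 8 <= bytes.size())
    {
        std::uint32_t const chunkSize = readU32(bytes.data() + pos + 4);
        if (std::memcmp(bytes.data() + pos, "fmt ", 4) == 0)
        {
            if (chunkSize < 16 || pos + 8 + 16 > bytes.size())
                return std::nullopt;
            std::uint8_t const * f = bytes.data() + pos + 8;
            return WavFormat{readU16(f), readU16(f + 2), readU32(f + 4), readU16(f + 14)};
        }
        // Skip this chunk; chunk bodies are padded to an even number of bytes.
        pos += 8 + static_cast<std::size_t>(chunkSize) + (chunkSize & 1u);
    }
    return std::nullopt;
}

// True for 16 kHz, mono, 16-bit PCM (the format requested by
// `arecord -c 1 -f S16_LE -r 16000`).
bool isPreferredFormat(WavFormat const & f)
{
    return f.audioFormat == 1 && f.channels == 1
        && f.sampleRate == 16000 && f.bitsPerSample == 16;
}
```

In practice the file would first be read into a `std::vector<std::uint8_t>` (for example with `std::ifstream` in binary mode), and a recording for which `isPreferredFormat` returns false could be re-recorded or converted before it is passed to `addRecord`.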
Checking model info
Models report the status of the enrollment:
model->subscribe([] (Vsdk::Biometrics::Model::Event const & e) { ... });
model->subscribe([] (Vsdk::Biometrics::Model::Error const & e) { ... });
Performing authentication or identification
Both are covered in the same chapter because they are very similar:
// Identificator
auto identificator = engine->makeIdentificator("ident", model, 5);
// Authenticator
auto authenticator = engine->makeAuthenticator("auth", model, 5);
authenticator->setUserToRecognize("victorien");
The only difference is that the authenticator can only recognize the user “victorien”, whereas the identificator can recognize every user enrolled in the model. Note the third parameter: it is the confidence level you require. It ranges from 0 to 10 and acts differently depending on your provider; 10 means you want the recognizer to be as strict as possible (you will get the fewest false positives but also the most false negatives). Both inherit Audio::ConsumerModule and must be inserted into the pipeline.
Getting the result
#include <vsdk/biometrics/tssv/Constants.hpp>
identificator->subscribe([] (Vsdk::Biometrics::Identificator::Result const & result)
{
namespace Key = Vsdk::Constants::Tssv::IdentResult;
auto const user = result.json[Key::id ].get<std::string>();
auto const score = result.json[Key::score].get<float>();
fmt::print("Ident Result: '{}' (score: {})\n", user, score);
});
authenticator->subscribe([] (Vsdk::Biometrics::Authenticator::Result const & result)
{
namespace Key = Vsdk::Constants::Tssv::AuthResult;
auto const user = result.json[Key::id ].get<std::string>();
auto const score = result.json[Key::score].get<float>();
fmt::print("Auth Result: '{}' (score: {})\n", user, score);
});
Different providers report results differently. For example, IDVoice reports varying results as it analyzes the audio, while TSSV only sends a result if the engine considers it acceptable (depending on the confidence level you set). We recommend trying the application in real conditions to select the minimum score that meets your false rejection and false acceptance requirements. By default, you can simply check whether the score is above 0.
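One way to pick a minimum score is to collect scores from trial runs, separated into attempts by the enrolled user and attempts by other speakers, and to measure the false acceptance and false rejection rates for each candidate threshold. The sketch below is standard-library-only and independent of VSDK: `ErrorRates` and `evaluateThreshold` are hypothetical names, and the scores are assumed to have been gathered by your own application from the result callbacks.

```cpp
#include <cstddef>
#include <vector>

struct ErrorRates
{
    double falseAcceptance; // share of impostor attempts scoring above the threshold
    double falseRejection;  // share of genuine attempts scoring at or below it
};

// Fraction of `scores` that are strictly above `threshold` (0 for an empty list).
static double shareAbove(std::vector<float> const & scores, float threshold)
{
    if (scores.empty())
        return 0.0;
    std::size_t count = 0;
    for (float s : scores)
        if (s > threshold)
            ++count;
    return static_cast<double>(count) / static_cast<double>(scores.size());
}

// An attempt is considered accepted when its score is strictly above `threshold`,
// which matches the default "score above 0" check when `threshold` is 0.
ErrorRates evaluateThreshold(std::vector<float> const & genuineScores,
                             std::vector<float> const & impostorScores,
                             float threshold)
{
    double const acceptance = shareAbove(impostorScores, threshold);
    double const rejection  = genuineScores.empty()
                            ? 0.0
                            : 1.0 - shareAbove(genuineScores, threshold);
    return {acceptance, rejection};
}
```

Evaluating a few thresholds over the same recorded scores shows the trade-off directly: raising the threshold lowers `falseAcceptance` and raises `falseRejection`, so the value to keep is the one whose two rates best fit the use case.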