Speech Enhancement - C++
VDK currently features one Speech Enhancement library: vsdk-s2c.
Summary
Includes
#include <vsdk/speech-enhancement/s2c.hpp>
Creating and using a SpeechEnhancer
A SpeechEnhancer takes audio data in and produces new audio data, so it works as an audio modifier in the Pipeline.
Let’s say you want to read a file, clean its audio and write the output into another file:
using S2cEngine = Vsdk::SpeechEnhancer::S2c::Engine;
// Create the engine from the exported configuration file
auto engine = Vsdk::SpeechEnhancer::Engine::make<S2cEngine>("config/vsdk.json");
// Retrieve the enhancer declared under "<name>" in vsdk.json
auto enhancer = engine->speechEnhancer("<name>");
Vsdk::Audio::Pipeline pipeline;
// Reads a 1- or 2-channel, 16-bit signed little-endian, 16 kHz PCM file
// Speech Enhancers only work with 16 kHz audio!
pipeline.setProducer<Vsdk::Audio::Producer::File>("file.raw", 16000, enhancer->inputChannelCount(), 0);
// The enhancer sits between the producer and the consumer as a modifier
pipeline.pushBackModifier(enhancer);
// Writes the cleaned audio to another file
pipeline.pushBackConsumer<Vsdk::Audio::Consumer::File>("file-cleaned.raw", true);
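To try the pipeline without a recording at hand, an input file in the expected format (headerless, interleaved, 16-bit signed little-endian, 16 kHz) can be generated with the C++ standard library alone. The sketch below is not part of VDK: makeSine and writeRawPcm16le are hypothetical helper names chosen for illustration, and the 440 Hz tone at half amplitude is an arbitrary choice. A sine tone is not speech, so it only helps check that files are read and written as expected, not how well the enhancer performs.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not a VDK function): generates `seconds` of a sine tone
// as interleaved signed 16-bit samples. Every channel carries the same signal.
std::vector<std::int16_t> makeSine(double seconds, int channels,
                                   int sampleRate = 16000, double frequency = 440.0)
{
    if (channels != 1 && channels != 2)
        throw std::invalid_argument("channels must be 1 or 2");
    if (seconds < 0.0 || sampleRate <= 0)
        throw std::invalid_argument("invalid duration or sample rate");

    const double pi = std::acos(-1.0);
    const auto frames = static_cast<std::size_t>(seconds * sampleRate);
    std::vector<std::int16_t> samples;
    samples.reserve(frames * static_cast<std::size_t>(channels));
    for (std::size_t i = 0; i < frames; ++i)
    {
        // Half of full scale, to leave some headroom.
        const double value = 0.5 * std::sin(2.0 * pi * frequency * i / sampleRate);
        const auto sample = static_cast<std::int16_t>(std::lround(value * 32767.0));
        for (int c = 0; c < channels; ++c)
            samples.push_back(sample);
    }
    assert(samples.size() == frames * static_cast<std::size_t>(channels));
    return samples;
}

// Hypothetical helper (not a VDK function): writes the samples as headerless PCM,
// low byte first, whatever the endianness of the host.
void writeRawPcm16le(std::string const & path, std::vector<std::int16_t> const & samples)
{
    std::ofstream out(path, std::ios::binary);
    if (!out)
        throw std::runtime_error("cannot open " + path);
    for (std::int16_t s : samples)
    {
        const auto u = static_cast<std::uint16_t>(s);
        const char bytes[2] = { static_cast<char>(u & 0xFF), static_cast<char>(u >> 8) };
        out.write(bytes, 2);
    }
    if (!out)
        throw std::runtime_error("write failed: " + path);
}
```

For instance, writeRawPcm16le("file.raw", makeSine(2.0, 1)) would produce a 2-second mono file. The channel count passed to makeSine should match the value returned by enhancer->inputChannelCount().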
Configuration
The Speech Enhancer Engine must be configured using VDK Studio.
The following template is added automatically to the vsdk.json configuration file during export:
{
  "s2c": {
    "speech_enhancement": {
      "speech_enhancers": {
        "<name>": {
          "configuration": "<configuration>"
        }
      }
    }
  }
}