Speech To Text

Speech-to-Text is a voice technology, based on deep learning language models, that transcribes audio signals into text. Results are determined statistically: given the identified context, the engine favors the most frequent sentence structures and word occurrences.

Main Screen

Audio Recording. Starts recognition using the microphone as input.
Audio File. Starts recognition using an audio file as input.
Result Panel. Displays the previous records and their hypotheses. The number of hypotheses can be set in the model settings dialog (accessible through the Modify button). Records can be removed individually with the Delete button.
Hypothesis Explorer. Displays the selected hypothesis. Its text can be selected and copied.

Create a model

Go to the Playground.
In the Voice recognition card, click on Add a model.
In the wizard that opens, select dictation.
Choose a name and a language for your model.
Finish by clicking on Add to project.

Test

Click on the model inside the Voice recognition card to open the widget.
You can now import an audio file or record from the microphone to have the audio fully transcribed.
To change the number of results, add the key LH_SEARCH_PARAM_MAXNBEST to the model settings. These settings are found in the ASR settings, under advanced settings.
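
To make the effect of LH_SEARCH_PARAM_MAXNBEST more concrete, the following is a minimal Python sketch of what limiting the number of results amounts to. It is not the engine's implementation or API: the Hypothesis class, its confidence field, the keep_n_best function, the dictionary form of the settings, and the fallback value of 1 are all hypothetical and only serve as an illustration. Only the key name comes from this page.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    # Hypothetical shape of one recognition result.
    text: str
    confidence: float  # assumed score, higher means more likely


def keep_n_best(hypotheses, settings):
    """Return the highest-scoring hypotheses, capped by the N-best setting."""
    # Fallback of 1 is an assumption for this sketch, not a documented default.
    max_n_best = int(settings.get("LH_SEARCH_PARAM_MAXNBEST", 1))
    ranked = sorted(hypotheses, key=lambda h: h.confidence, reverse=True)
    return ranked[:max_n_best]


# Illustrative values only; a real engine produces its own candidates and scores.
settings = {"LH_SEARCH_PARAM_MAXNBEST": 3}
candidates = [
    Hypothesis("recognize speech", 0.91),
    Hypothesis("wreck a nice beach", 0.42),
    Hypothesis("recognise peach", 0.17),
    Hypothesis("wreck an ice beach", 0.55),
]

for h in keep_n_best(candidates, settings):
    print(f"{h.confidence:.2f}  {h.text}")
```

With the key set to 3, the sketch keeps the three best-scoring candidates. In the widget, this corresponds to the number of hypotheses listed for each record in the Result Panel.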