Voice Recognition
Route Recognize
/v1/voice-recognition/recognize
Messages
SEND Audio Chunk Message
{
"data": "data:audio/pcm;base64,<base64_audio>",
"last": <bool>
}
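A minimal sketch of building this message in Python, assuming the audio is already available as raw PCM bytes. The helper name `make_audio_chunk` is hypothetical and not part of the API.

```python
import base64
import json


def make_audio_chunk(pcm: bytes, last: bool = False) -> str:
    """Serialize raw PCM bytes as an Audio Chunk Message (hypothetical helper)."""
    encoded = base64.b64encode(pcm).decode("ascii")
    return json.dumps({
        # The data URI prefix is taken verbatim from the message schema.
        "data": "data:audio/pcm;base64," + encoded,
        "last": last,
    })


message = make_audio_chunk(b"\x00\x01\x02\x03", last=True)
```

The resulting string can be sent as-is over whatever connection the route is served on.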
RECEIVE Result Message
{
"result": {
"type": <int>,
"type_string": <string>,
"is_final": <bool>,
"begin_time": <int>,
"end_time": <int>,
"hypotheses": [ <hypothesis>, ... ]
}
}
| Fields | Possible values | Description |
|---|---|---|
| `type` | [ | The result type as an int value. |
| `type_string` | [ | The result type as a string value. |
| `is_final` | [ | Indicates whether this result is final. If true, this is the last time this result will be returned; if false, this is an interim result and may be updated later. |
| `begin_time` | [ | The system time in milliseconds at the start of the hypothesis recognition operation. |
| `end_time` | [ | The system time in milliseconds at the end of the hypothesis recognition operation. |
| `confidence` | [ | Indicates the likelihood that the recognized words are correct. |
| `hypotheses` | - | A JSON array containing all the hypotheses of the recognized speech content. |
DETAILS Result Hypothesis
"hypotheses": [
{
"confidence": <int>,
"begin_time": <int>,
"end_time": <int>,
"start_rule": <string>,
"items": [ <item>, ... ]
},
...
]
| Fields | Description |
|---|---|
| `start_rule` | Represents the entry point of the grammar (<main>). |
| `items` | A JSON array containing all the matched tokens. An item object can be either of type `terminal` or of type `tag`. |
DETAILS Result Item (Orthography)
"items": [
{
"confidence": <int>,
"begin_time": <int>,
"end_time": <int>,
"type": "terminal",
"orthography": <string>
},
...
]
| Fields | Description |
|---|---|
| `orthography` | The matched terminal token. |
DETAILS Result Item (Tag)
"items": [
{
"confidence": <int>,
"begin_time": <int>,
"end_time": <int>,
"type": "tag",
"name": <string>,
"items": [ <item>, ... ]
},
...
]
| Fields | Description |
|---|---|
| `name` | Represents the name given as an attribute to the |
| `items` | Same as |
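Because tag items can nest further items, reading the recognized text out of a result takes a small recursive walk. The sketch below is one way to do it in Python. The function names are hypothetical, joining words with a single space is an assumption, and so is treating a larger `confidence` as a better hypothesis.

```python
from typing import Any, Optional


def items_to_text(items: list[dict[str, Any]]) -> str:
    """Join the orthography of every terminal item, descending into tag items."""
    words: list[str] = []
    for item in items:
        if item.get("type") == "terminal":
            words.append(item["orthography"])
        elif item.get("type") == "tag":
            nested = items_to_text(item.get("items", []))
            if nested:
                words.append(nested)
    return " ".join(words)


def best_hypothesis(result: dict[str, Any]) -> Optional[dict[str, Any]]:
    """Return the hypothesis with the highest confidence, or None if there is none."""
    # Assumption: a larger confidence value means a more likely hypothesis.
    return max(result.get("hypotheses", []), key=lambda h: h["confidence"], default=None)


# Illustrative data only; the values are made up for this example.
sample_result = {
    "hypotheses": [
        {"confidence": 1200, "items": [{"type": "terminal", "orthography": "turn"}]},
        {"confidence": 4200, "items": [
            {"type": "terminal", "orthography": "turn"},
            {"type": "tag", "name": "device", "items": [
                {"type": "terminal", "orthography": "the"},
                {"type": "terminal", "orthography": "lights"},
            ]},
            {"type": "terminal", "orthography": "on"},
        ]},
    ]
}
best = best_hypothesis(sample_result)
text = items_to_text(best["items"])
```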
RECEIVE Event Message
{
"event": {
"code": <int>,
"code_string": <string>,
"message": <string>,
"timestamp": <int>
}
}
| Events | Code | Description |
|---|---|---|
| | 0 | Indicates that the recognizer has started processing speech input. |
| | 1 | Indicates that the recognizer is no longer processing speech input. |
| | 2 | Indicates that the recognizer detects input that it can identify as speech. |
| | 3 | Indicates that the recognizer is receiving silence or non-speech. |
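A client will typically branch on the integer `code` of an event. The sketch below maps the four codes listed above onto a Python enum. The member names are illustrative labels derived from the descriptions, not the API's own event names, and `parse_event` is a hypothetical helper.

```python
import json
from enum import IntEnum


class RecognizerEvent(IntEnum):
    # Only the integer codes come from the table; the names are illustrative.
    PROCESSING_STARTED = 0
    PROCESSING_STOPPED = 1
    SPEECH_DETECTED = 2
    SILENCE_DETECTED = 3


def parse_event(raw: str) -> RecognizerEvent:
    """Extract the event code from an Event Message; raises ValueError on unknown codes."""
    event = json.loads(raw)["event"]
    return RecognizerEvent(event["code"])


raw_event = '{"event": {"code": 2, "code_string": "", "message": "", "timestamp": 0}}'
event = parse_event(raw_event)
```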
RECEIVE Error Message
{
"error": {
"type": <string>,
"code": <int>,
"code_string": <string>,
"message": <string>
}
}
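Since result, event, and error messages each use a different top-level key, a client can detect an error before handling anything else. A minimal sketch, assuming each message arrives as one JSON string; `VoiceApiError` and `parse_message` are hypothetical names.

```python
import json
from typing import Any


class VoiceApiError(Exception):
    """Carries the fields of an Error Message (hypothetical class name)."""

    def __init__(self, error: dict[str, Any]) -> None:
        self.type = error.get("type")
        self.code = error.get("code")
        self.code_string = error.get("code_string")
        self.message = error.get("message")
        super().__init__(f"{self.code_string} ({self.code}): {self.message}")


def parse_message(raw: str) -> dict[str, Any]:
    """Decode a message, raising VoiceApiError if it is an Error Message."""
    payload = json.loads(raw)
    if "error" in payload:
        raise VoiceApiError(payload["error"])
    return payload
```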
Voice Synthesis
Route Synthesize
/v1/voice-synthesis/synthesize
Messages
RECEIVE Audio Chunk Message
{
"data": "data:audio/pcm;base64,<base64_audio>",
"last": <bool>
}
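On the receiving side, the audio has to be recovered from the data URI and accumulated until a chunk arrives with `last` set to true. The sketch below shows one way to do this in Python. It assumes the messages are available as an iterable of JSON strings and that any message without a `data` key (for example an event) can be skipped. Both function names are hypothetical.

```python
import base64
import json
from typing import Any, Iterable

PREFIX = "data:audio/pcm;base64,"


def decode_audio_chunk(payload: dict[str, Any]) -> tuple[bytes, bool]:
    """Return the raw PCM bytes and the "last" flag of one Audio Chunk Message."""
    data = payload["data"]
    if not data.startswith(PREFIX):
        raise ValueError("unexpected data URI prefix")
    return base64.b64decode(data[len(PREFIX):]), bool(payload["last"])


def collect_audio(messages: Iterable[str]) -> bytes:
    """Concatenate PCM from audio chunks until the chunk marked as last."""
    buffer = bytearray()
    for raw in messages:
        payload = json.loads(raw)
        if "data" not in payload:
            continue  # Not an audio chunk; ignored in this sketch.
        pcm, last = decode_audio_chunk(payload)
        buffer.extend(pcm)
        if last:
            break
    return bytes(buffer)
```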
RECEIVE Event Message
{
"event": {
"code": <int>,
"code_string": <string>,
"message": <string>,
"timestamp": <int>
}
}
| Fields | Possible values |
|---|---|
| | [ |
| | [ |
RECEIVE Error Message
{
"error": {
"type": <string>,
"code": <int>,
"code_string": <string>,
"message": <string>
}
}
Voice Biometrics
Route Authenticate
/v1/voice-biometrics/authenticate
Messages
SEND Audio Chunk Message
{
"data": "data:audio/pcm;base64,<base64_audio>",
"last": <bool>
}
RECEIVE Result Message
{
"id": <string>,
"probability": <double>,
"score": <double>
}
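The result can be parsed into a small typed record before an application decides what to do with it. The sketch below is illustrative only: the class and function names are hypothetical, and the acceptance rule assumes that `probability` lies in the range 0 to 1 and that a higher value means a better match. Neither the range nor a recommended threshold is stated in this reference.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthenticationResult:
    id: str
    probability: float
    score: float


def parse_authentication(raw: str) -> AuthenticationResult:
    """Build a typed record from an authenticate Result Message."""
    payload = json.loads(raw)
    return AuthenticationResult(
        id=payload["id"],
        probability=float(payload["probability"]),
        score=float(payload["score"]),
    )


def is_accepted(result: AuthenticationResult, min_probability: float = 0.9) -> bool:
    # Assumption: probability is in [0, 1]; the 0.9 default is illustrative only.
    return result.probability >= min_probability


# Made-up values for demonstration.
result = parse_authentication('{"id": "alice", "probability": 0.97, "score": 3.5}')
```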
RECEIVE Error Message
{
"error": {
"type": <string>,
"code": <int>,
"code_string": <string>,
"message": <string>
}
}
Route Identify
/v1/voice-biometrics/identify
Messages
SEND Audio Chunk Message
{
"data": "data:audio/pcm;base64,<base64_audio>",
"last": <bool>
}
RECEIVE Result Message
{
"id": <string>,
"probability": <double>,
"type": <string>
}
Route Enroll
/v1/voice-biometrics/enroll
Messages
SEND Audio Chunk Message
{
"data": "data:audio/pcm;base64,<base64_audio>",
"last": <bool>
}
RECEIVE Result Message
{
"accepted": <bool>,
"progress": <int>,
"speech_duration": <double>,
"utterances": [ <utterance>, ... ]
}
DETAILS Result Utterance
"utterances": [
{
"accepted": <bool>,
"contains_speech": <bool>,
"enough_speech": <bool>,
"is_band_limited": <bool>,
"is_consistent": <bool>,
"is_peak_clipped": <bool>,
"is_snr_ok": <bool>,
"snr": <double>,
"speech_duration": <double>
},
...
]
| Fields | Description |
|---|---|
| `accepted` | Indicates that the utterance is valid and can be added to the enrollment profile. |
| `contains_speech` | Indicates whether the audio contains speech. |
| `enough_speech` | Indicates whether the speech duration is long enough to pass the enrollment checks. |
| `is_band_limited` | Indicates whether the utterance is band-limited. |
| `is_consistent` | Indicates whether the utterance is consistent with the previous utterances. |
| `is_peak_clipped` | Indicates whether the degree of peak clipping is below a certain threshold. |
| `is_snr_ok` | Indicates whether the SNR value is sufficiently high. |
| `snr` | The signal-to-noise ratio of the enrollment utterance, measured in dB. |
| `speech_duration` | The speech duration within the audio input. |
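When an utterance is rejected, the boolean checks can be turned into feedback for the user. The sketch below is a hypothetical helper that reports only the flags whose polarity is unambiguous from their descriptions (`contains_speech`, `enough_speech`, `is_consistent`, `is_snr_ok`). It deliberately leaves out `is_band_limited` and `is_peak_clipped`, since this reference does not make clear which value indicates a problem. The reason strings are invented for illustration.

```python
from typing import Any

# For these flags, False is assumed to signal a failed check.
REQUIRED_FLAGS = {
    "contains_speech": "no speech detected",
    "enough_speech": "not enough speech",
    "is_consistent": "inconsistent with previous utterances",
    "is_snr_ok": "signal-to-noise ratio too low",
}


def rejection_reasons(utterance: dict[str, Any]) -> list[str]:
    """List human-readable reasons for a rejected utterance (empty if accepted)."""
    if utterance.get("accepted"):
        return []
    return [
        reason
        for flag, reason in REQUIRED_FLAGS.items()
        if utterance.get(flag) is False
    ]


# Made-up utterance for demonstration.
sample_utterance = {
    "accepted": False,
    "contains_speech": True,
    "enough_speech": False,
    "is_consistent": True,
    "is_snr_ok": False,
    "snr": 4.2,
    "speech_duration": 0.8,
}
reasons = rejection_reasons(sample_utterance)
```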
RECEIVE Event Message
{
"event": {
"code": 0,
"code_string": "INFO",
"message": <string>,
"timestamp": <int>
}
}
RECEIVE Error Message
{
"error": {
"type": <string>,
"code": <int>,
"code_string": <string>,
"message": <string>
}
}
Speech Enhancement
Route Enhance
/v1/speech-enhancement/enhance
Messages
SEND Audio Chunk Message
{
"data": "data:audio/pcm;base64,<base64_audio>",
"last": <bool>
}
RECEIVE Audio Chunk Message
{
"data": "data:audio/pcm;base64,<base64_audio>",
"last": <bool>
}
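Every route that accepts audio uses the same chunk message, so a longer recording is naturally sent as a sequence of chunks with `last` set to true only on the final one. A minimal sketch in Python follows. The name `iter_audio_chunks` and the default `chunk_size` are assumptions, as this reference does not specify a required or recommended chunk size, and sending one empty final chunk for empty input is a choice made by this sketch.

```python
import base64
import json
from typing import Iterator


def iter_audio_chunks(pcm: bytes, chunk_size: int = 3200) -> Iterator[str]:
    """Yield Audio Chunk Messages, setting "last" only on the final one."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    # max(..., 1) makes empty input still produce one (empty) final chunk.
    total = max(len(pcm), 1)
    for start in range(0, total, chunk_size):
        piece = pcm[start:start + chunk_size]
        yield json.dumps({
            "data": "data:audio/pcm;base64," + base64.b64encode(piece).decode("ascii"),
            "last": start + chunk_size >= len(pcm),
        })


chunks = list(iter_audio_chunks(bytes(range(10)), chunk_size=4))
```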