@@ -51,6 +51,8 @@ def recognize_using_websocket(self,
                                   processing_metrics=None,
                                   processing_metrics_interval=None,
                                   audio_metrics=None,
+                                  end_of_phrase_silence_time=None,
+                                  split_transcript_at_phrase_end=None,
                                   **kwargs):
         """
         Sends audio for speech recognition using web sockets.
@@ -188,6 +190,31 @@ def recognize_using_websocket(self,
         :param bool audio_metrics: If `true`, requests detailed information about the
         signal characteristics of the input audio. The service returns audio metrics with
         the final transcription results. By default, the service returns no audio metrics.
+        :param float end_of_phrase_silence_time: (optional) Specifies
+        the duration of the pause interval at which the service splits a transcript
+        into multiple final results. If the service detects pauses or extended
+        silence before it reaches the end of the audio stream, its response can
+        include multiple final results. Silence indicates a point at which the
+        speaker pauses between spoken words or phrases.
+        Specify a value for the pause interval in the range of 0.0 to 120.0.
+        * A value greater than 0 specifies the interval that the service is to use
+        for speech recognition.
+        * A value of 0 indicates that the service is to use the default interval.
+        It is equivalent to omitting the parameter.
+        The default pause interval for most languages is 0.8 seconds; the default
+        for Chinese is 0.6 seconds.
+        See [End of phrase silence
+        time](https://cloud.ibm.com/docs/services/speech-to-text?topic=speech-to-text-output#silence_time).
+        :param bool split_transcript_at_phrase_end: (optional) If `true`, directs
+        the service to split the transcript into multiple final results based on
+        semantic features of the input, for example, at the conclusion of
+        meaningful phrases such as sentences. The service bases its understanding
+        of semantic features on the base language model that you use with a
+        request. Custom language models and grammars can also influence how and
+        where the service splits a transcript. By default, the service splits
+        transcripts based solely on the pause interval.
+        See [Split transcript at phrase
+        end](https://cloud.ibm.com/docs/services/speech-to-text?topic=speech-to-text-output#split_transcript).
         :param dict headers: A `dict` containing the request headers
         :return: A `dict` containing the `SpeechRecognitionResults` response.
         :rtype: dict
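
For reference, a minimal sketch of how the two new parameters would be passed to `recognize_using_websocket`, modeled on the SDK's other websocket examples. The API key, model choice, and audio file are placeholders, and `PhraseSplitCallback` is a hypothetical name for an ordinary `RecognizeCallback` subclass:

```python
import json

from ibm_cloud_sdk_core.authenticators import IAMAuthenticator
from ibm_watson import SpeechToTextV1
from ibm_watson.websocket import AudioSource, RecognizeCallback


class PhraseSplitCallback(RecognizeCallback):
    """Print each response as the service splits the transcript."""

    def on_data(self, data):
        # Each element of `results` can now be a separate final result,
        # split at the pause interval or at a phrase boundary.
        print(json.dumps(data, indent=2))

    def on_error(self, error):
        print('Error received: {}'.format(error))


speech_to_text = SpeechToTextV1(
    authenticator=IAMAuthenticator('{apikey}'))  # placeholder credentials

with open('audio.flac', 'rb') as audio_file:
    speech_to_text.recognize_using_websocket(
        audio=AudioSource(audio_file),
        content_type='audio/flac',
        recognize_callback=PhraseSplitCallback(),
        # Finalize a result after 0.4 s of silence instead of the
        # 0.8 s default ...
        end_of_phrase_silence_time=0.4,
        # ... and also split at semantic phrase boundaries.
        split_transcript_at_phrase_end=True)
```

With `end_of_phrase_silence_time=0.4`, the service closes out a final result after 0.4 seconds of silence rather than the 0.8-second default, and `split_transcript_at_phrase_end=True` additionally splits at sentence-like boundaries derived from the base language model.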