OpenVidu 2.25.0: AWS and Vosk support for Speech to Text

OpenVidu
Dec 22, 2022

NEW FEATURES

Speech to Text support for AWS and Vosk

The Speech to Text service introduced in release 2.24.0 has been extended with support for more engines. Now you can use:

  • Azure: available since OpenVidu 2.24.0, it uses the Speech to Text service from the Azure stack.
  • AWS: uses the Amazon Transcribe service from the AWS stack.
  • Vosk: uses the Vosk engine, an open source alternative that incurs no extra cost from cloud providers.

Check out the Speech to Text documentation to learn more about all of these alternatives.

Native support for private Docker registries

The new configuration property OPENVIDU_PRO_DOCKER_REGISTRIES lets you configure private Docker registries to be used by your Media Nodes. Your custom images of OpenVidu services can now be private while remaining easily accessible to your nodes. The services that can currently take advantage of this feature are kurento-media-server (configured with property KMS_IMAGE) and speech-to-text-service (configured with property OPENVIDU_PRO_SPEECH_TO_TEXT_IMAGE).

OpenVidu Components new Speech to Text features

  • The Speech to Text capabilities of OpenVidu Components now include automatic reconnection to the service in case of failure. See Reconnecting to Speech to Text module in the case of a crash.
  • A new directive, captionsLangOptions, has been added. It allows overriding the default Speech to Text language options, so you can configure your own custom languages when using Vosk as the Speech to Text engine.
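
As a rough illustration of the captionsLangOptions directive mentioned above, the sketch below builds a custom language list in TypeScript. The option shape (a name plus an ISO code), the validateLangOptions helper, the language codes, and the template binding shown in the comment are all assumptions made for this example and are not confirmed by this post, so check the OpenVidu Components reference for the exact types.

```typescript
// Hypothetical shape of one caption language option (an assumption for this sketch).
interface CaptionsLangOption {
  name: string; // label shown in the captions language selector
  ISO: string;  // language code handed to the Speech to Text engine
}

// Languages for which Vosk models are assumed to be available in your deployment.
const captionsLangOptions: CaptionsLangOption[] = [
  { name: "Spanish", ISO: "es-ES" },
  { name: "English", ISO: "en-US" },
];

// Hypothetical helper: rejects empty values and duplicate codes before the
// list is handed to the component.
function validateLangOptions(options: CaptionsLangOption[]): CaptionsLangOption[] {
  const seen = new Set<string>();
  for (const { name, ISO } of options) {
    if (name.trim() === "" || ISO.trim() === "") {
      throw new Error("Each option needs a non-empty name and ISO code");
    }
    if (seen.has(ISO)) {
      throw new Error(`Duplicate language code: ${ISO}`);
    }
    seen.add(ISO);
  }
  return options;
}

// In an Angular template the list would be bound roughly like this (assumed usage):
//   <ov-videoconference [captionsLangOptions]="langOptions"></ov-videoconference>
const langOptions = validateLangOptions(captionsLangOptions);
```

Validating the list up front is only a design suggestion: it surfaces a misconfigured language list at startup instead of at the moment a user opens the captions selector.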

BUG FIXES

  • OpenVidu Components: Fixed a typo in the recordingActivity directive that prevented it from working as expected.

Stay tuned for the next iterations! You can follow us on Twitter, and a star on GitHub is always welcome :)
