---

Audio Analysis

Autonomy Virage speech recognition technology enables security organizations to instantly search files from video, radio and telephony systems. Its approach is fundamentally different: whereas other technologies take a simple phonetic approach that uses only acoustic information, Autonomy Virage achieves a higher level of understanding through language modeling.

Language modeling combines concept extraction with acoustic-phonetic methods to achieve significantly greater accuracy. Acoustic-phonetic methods alone do not produce good speech-to-text transcription, because they cannot differentiate between phrases that sound alike, for example “can I” and “can eye”. Where the intended phrase is “can I”, Autonomy Virage’s speech technologies use probabilistic language modeling to understand the context of what is being said and so select the appropriate option.
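The idea of choosing between acoustically identical candidates can be sketched in a few lines of Python. This is a minimal illustration, not Autonomy Virage's implementation: it swaps in a toy bigram model with add-one smoothing, and the counts, the acoustic scores and the function names are all invented for the example.

```python
import math

# Hypothetical toy counts; a real model would be estimated from a large corpus.
BIGRAM_COUNTS = {
    ("can", "i"): 50,
    ("can", "eye"): 1,
    ("i", "help"): 30,
    ("eye", "help"): 1,
}
UNIGRAM_COUNTS = {"can": 60, "i": 80, "eye": 5, "help": 40}
VOCAB_SIZE = len(UNIGRAM_COUNTS)


def bigram_log_prob(words):
    """Log-probability of a word sequence under an add-one-smoothed bigram model."""
    total = 0.0
    for prev, cur in zip(words, words[1:]):
        count = BIGRAM_COUNTS.get((prev, cur), 0)
        total += math.log((count + 1) / (UNIGRAM_COUNTS.get(prev, 0) + VOCAB_SIZE))
    return total


def pick_transcription(candidates):
    """Choose the candidate with the best combined acoustic and language score."""
    return max(
        candidates,
        key=lambda c: c["acoustic_log_prob"] + bigram_log_prob(c["words"]),
    )


# Both candidates sound the same, so their (assumed) acoustic scores are equal.
candidates = [
    {"words": ["can", "eye", "help"], "acoustic_log_prob": -4.0},
    {"words": ["can", "i", "help"], "acoustic_log_prob": -4.0},
]
best = pick_transcription(candidates)
print(" ".join(best["words"]))  # can i help
```

Because the acoustic scores tie, the language model alone decides the outcome: "can i" is far more likely than "can eye" in the toy counts, so the first-person reading wins.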

Autonomy Virage’s audio recognition functionality includes:

  • Speaker independence
  • Extensive vocabularies
  • Speaker identification
  • Word spotting and phrase recognition

Audio Segmentation

The powerful audio segmentation plug-in identifies distinctive sounds in the audio signal and registers exactly where they occur. The system can be configured to recognize specific audio types such as a gunshot, a car engine starting, or unusual crowd noise such as screaming.
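As a rough illustration of registering where a sound occurs, the sketch below marks the time spans in which short-time energy exceeds a threshold. It is a deliberately simplified stand-in: it detects only loudness, not the type of sound, and the function name, frame size and threshold are assumptions made for the example rather than anything taken from the plug-in.

```python
def find_loud_segments(samples, sample_rate, frame_size, threshold):
    """Return (start_seconds, end_seconds) spans whose mean frame energy exceeds threshold."""
    segments = []
    start = None  # sample offset where the current loud span began, if any
    for offset in range(0, len(samples), frame_size):
        frame = samples[offset:offset + frame_size]
        energy = sum(s * s for s in frame) / len(frame)
        if energy > threshold and start is None:
            start = offset
        elif energy <= threshold and start is not None:
            segments.append((start / sample_rate, offset / sample_rate))
            start = None
    if start is not None:
        # The signal ended while still loud; close the span at the last sample.
        segments.append((start / sample_rate, len(samples) / sample_rate))
    return segments


# Synthetic signal at 1000 samples per second: 1 s quiet, 0.5 s loud, 1 s quiet.
signal = [0.01] * 1000 + [0.9] * 500 + [0.01] * 1000
events = find_loud_segments(signal, sample_rate=1000, frame_size=100, threshold=0.1)
print(events)  # [(1.0, 1.5)]
```

A system that distinguishes a gunshot from an engine would need a classifier on top of spectral features; the sketch covers only the "where" part of the description.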

Speaker Recognition

The speaker identification plug-in recognizes voices from a user-defined library, regardless of the words or even the language spoken. By simply providing a short speech sample, users can easily add new speakers to the library. Speaker identification makes it possible to associate blocks of text with a particular speaker, enhancing rich media navigation and retrieval.
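One common way to picture a user-defined speaker library is as a set of stored voiceprints that new audio is compared against. The sketch below assumes each speech sample has already been reduced to a fixed-length numeric vector; that step, the `SpeakerLibrary` class, the cosine-similarity matching and the 0.8 threshold are all hypothetical choices for illustration, not a description of how the plug-in works.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class SpeakerLibrary:
    """Hypothetical library mapping speaker names to enrolled voiceprints."""

    def __init__(self, threshold=0.8):
        self.voiceprints = {}
        self.threshold = threshold  # minimum similarity to accept a match

    def enroll(self, name, voiceprint):
        """Add a new speaker from one short sample's voiceprint."""
        self.voiceprints[name] = voiceprint

    def identify(self, voiceprint):
        """Return the best-matching enrolled speaker, or None if no one is close enough."""
        best_name, best_score = None, self.threshold
        for name, known in self.voiceprints.items():
            score = cosine_similarity(voiceprint, known)
            if score >= best_score:
                best_name, best_score = name, score
        return best_name


library = SpeakerLibrary()
library.enroll("alice", [0.9, 0.1, 0.3])
library.enroll("bob", [0.1, 0.8, 0.5])
print(library.identify([0.85, 0.15, 0.35]))  # alice
print(library.identify([-0.5, 0.2, -0.7]))   # None
```

Because the comparison works on voice characteristics rather than on recognized words, a scheme of this kind is indifferent to what is said or in which language, which matches the behaviour described above.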

Speech-to-Text

The speech-to-text audio analysis plug-in detects speech and performs real-time speech-to-text transcription with approximately 90% accuracy. The resulting text track is then synchronized with the video track, allowing for accurate video search.
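To show why a synchronized text track makes video searchable, the sketch below stores each transcribed word with its start time and returns the timestamps at which a phrase occurs. The `TimedWord` structure and `find_phrase` function are hypothetical names chosen for this example; the actual track format is not described here.

```python
from dataclasses import dataclass


@dataclass
class TimedWord:
    text: str
    start_seconds: float  # position of the word on the video timeline


def find_phrase(track, phrase):
    """Return the start time of every occurrence of phrase in the text track."""
    target = phrase.lower().split()
    if not target:
        return []
    words = [w.text.lower() for w in track]
    hits = []
    for i in range(len(words) - len(target) + 1):
        if words[i:i + len(target)] == target:
            hits.append(track[i].start_seconds)
    return hits


track = [
    TimedWord("can", 12.0), TimedWord("I", 12.3), TimedWord("help", 12.5),
    TimedWord("you", 12.9), TimedWord("can", 40.1), TimedWord("I", 40.4),
]
print(find_phrase(track, "can I"))  # [12.0, 40.1]
```

Each returned timestamp can be used to seek the video player directly to the moment the phrase is spoken.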