Marcus P. Zillman, M.S., A.M.H.A. Author/Speaker/Consultant
Internet Happenings, Events and Sources


Tuesday, June 17, 2008  



VoxForge - Open Source Speech Recognition Engines Transcribed Speech Collector
http://www.voxforge.org/home

VoxForge was set up to collect transcribed speech for use in Open Source Speech Recognition Engines ("SREs") such as ISIP, HTK, Julius and Sphinx. We will categorize and make available all submitted audio files (also called a "Speech Corpus") and Acoustic Models under the GPL.

In order to recognize speech, Speech Recognition Engines require two types of files. The first, called an Acoustic Model, is created by taking a very large number of transcribed speech recordings (the Speech Corpus) and 'compiling' them into statistical representations of the sounds that make up each word. The second is a Grammar or a Language Model. A Grammar is a relatively small file containing sets of predefined combinations of words. A Language Model is a much larger file containing the probabilities of certain sequences of words.

Currently, you can create a single-user Acoustic Model trained to recognize your own voice using open source speech recognition software - it just takes time and patience. VoxForge's main objective is to create multi-user Acoustic Models that can be used without training for: a) telephony IVR (8kHz Acoustic Models); b) desktop command and control (16-48kHz Acoustic Models); and c) dictation (in the future). To achieve this, VoxForge will serve as a repository for transcribed speech audio files that will be used to create continuously improving Acoustic Models, as user contributions are merged into the VoxForge Multi-User Acoustic Model.

As more transcribed speech data is collected, creating single-user Acoustic Models will become easier, because users will be able to adapt the VoxForge Multi-User Acoustic Model to their voice rather than trying to create one from scratch. As even more speech data is obtained, the VoxForge Multi-User Acoustic Model will be able to recognize speech without needing to be adapted to a particular user's voice.
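The difference between a Grammar and a Language Model described above can be illustrated with a small Python sketch. This is only a toy illustration, not the file format or training procedure used by ISIP, HTK, Julius, Sphinx or VoxForge: the transcripts, the function name `train_bigram_model` and the unsmoothed bigram estimate are all assumptions made for the example.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Estimate P(next_word | word) from raw bigram counts (no smoothing)."""
    pair_counts = defaultdict(Counter)
    for sentence in sentences:
        # <s> and </s> mark the start and end of each utterance.
        words = ["<s>"] + sentence.lower().split() + ["</s>"]
        for prev, nxt in zip(words, words[1:]):
            pair_counts[prev][nxt] += 1

    model = {}
    for prev, counter in pair_counts.items():
        total = sum(counter.values())
        model[prev] = {word: count / total for word, count in counter.items()}
    return model


# Hypothetical command-and-control transcripts (not taken from VoxForge).
transcripts = ["open the file", "open the window", "close the file"]

# A Grammar, in this toy form, is just the set of allowed word combinations.
grammar = set(transcripts)

# A Language Model instead stores probabilities of word sequences.
model = train_bigram_model(transcripts)

print("open the door" in grammar)  # not one of the predefined combinations
print(model["the"])                # probabilities of the words seen after "the"
```

The grammar can only accept or reject a phrase outright, while the language model assigns a probability to each word given the one before it. That is why a language model needs far more transcribed text and produces a much larger file.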
This has been added to World Wide Web Reference Subject Tracer™ Information Blog. This has been added to the tools section of Research Resources Subject Tracer™ Information Blog.

posted by Marcus Zillman | 4:53 AM