Sphinx-4 : a Java speech recognizer

Sphinx-4 is a state-of-the-art speech recognition system written in Java. It was created via a joint collaboration between the Sphinx group at Carnegie Mellon University, Sun Microsystems Laboratories, Mitsubishi Electric Research Labs (MERL), and Hewlett Packard (HP), with contributions from the University of California at Santa Cruz (UCSC) and the Massachusetts Institute of Technology (MIT).

Sphinx-4 contains the following demo programs :

  • Hello World Demo: a command line application that recognizes simple phrases
  • Hello Digits Demo: a command line application that recognizes connected digits
  • Hello N-Gram Demo: a command line application using an N-gram language model for speech recognition
  • ZipCity Demo: a Java Web Start technology application that recognizes spoken zip codes and locates the associated city and state
  • WavFile Demo: a simple demo program to show how to decode audio files (e.g., .wav, .au files)
  • Transcriber Demo: a simple demo program showing how to transcribe a continuous audio file that has multiple utterances separated by silences
  • JSGF Demo: a simple demo program showing how a program can swap between multiple JSGF grammars
  • Dialog Demo: a demo program showing how a program can swap between multiple JSGF and dictation grammars
  • Action Tags Demo: a demo program showing how to use action tags for post-processing of RuleParse objects obtained from JSGF grammars
  • Confidence Demo: a simple demo program showing how to obtain confidence scores for result
  • Lattice Demo: a simple demo program showing how to extract lattices from recognition results

A number of tests and demos rely on having JSAPI installed. Sphinx-4 can be combined wit FreeTTS to set up a complete voice interface or a VoiceXML server.