Hello everyone,

I’m back with a very big question.

How do you improve accuracy?!

It’s been a week, and I’m getting 2-3 mails every day asking “How to improve accuracy”. So this post is my answer to all of those, and to the mails still to come. First of all, the problem is really big and complex, and to understand it you first need to understand the very basics of speech recognition and what happens in the background during recognition.

The basics of speech recognition are not the Java coding part, or what a gram file is, or why we use the Recognizer and Microphone classes. NO!

Remember! Changes to the Java code will not have any major effect on accuracy. You need to focus on the config.xml file.

To solve this problem, all you need to do is understand automatic speech recognition and the search and language-processing algorithms behind it. For Sphinx specifically, you need an understanding of the Hidden Markov Model (HMM) approach to speech recognition, beam search, and language configuration and modeling. It’s highly recommended to read a good book on the subject before you continue working on it.

Before we apply configs to our application to get more accurate results, we must know how to track results.

Click here to learn how to apply the Accuracy Tracker to your application.

The next part is knowing what kind of SR application you are creating. Basically, SR applications are divided into 2 major categories: Dictation and Command.

A Command application is one where the words are very few, and a Dictation application is one where you have a large vocabulary.

Each type needs its own kind of file: a Dictation application needs a language model, and a Command application needs a grammar file. A language model is a file containing the probabilities of sequences of words. A grammar is a much smaller file containing sets of predefined combinations of words.
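To make “probabilities of sequences of words” concrete, here is a minimal Java sketch of a bigram model built from raw counts. It is only an illustration: the class name BigramModel and its methods are made up for this post and are not part of Sphinx, and real language models are trained on far more text and use smoothing and back-off rather than raw counts.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration class, not a Sphinx API.
final class BigramModel {
    // How often each word appears with another word right after it.
    private final Map<String, Integer> contextCounts = new HashMap<>();
    // How often each ordered pair "w1 w2" appears.
    private final Map<String, Integer> bigramCounts = new HashMap<>();

    // Adds the word pairs of one training sentence to the counts.
    void train(String sentence) {
        String[] words = sentence.trim().toLowerCase(Locale.ROOT).split("\\s+");
        for (int i = 0; i + 1 < words.length; i++) {
            contextCounts.merge(words[i], 1, Integer::sum);
            bigramCounts.merge(words[i] + " " + words[i + 1], 1, Integer::sum);
        }
    }

    // Estimates P(next | previous) as count(previous next) / count(previous).
    // Returns 0.0 for a word that was never seen as a context (no smoothing).
    double probability(String previous, String next) {
        int context = contextCounts.getOrDefault(previous, 0);
        if (context == 0) {
            return 0.0;
        }
        int pair = bigramCounts.getOrDefault(previous + " " + next, 0);
        return (double) pair / context;
    }
}
```

For example, after training on “good morning paul”, “good morning rita” and “good evening paul”, probability("good", "morning") comes out as 2/3, because “good” is followed by “morning” in two of its three occurrences.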

Now, back to the most important part: accuracy tracking. Find as many resources, examples, and notes as you can. Then collect test databases; a few are also available on the Sphinx website. Work on them. Run recognition on them. Try to get your best results.

Expected results in a quiet environment, as word error rate (WER):

Command : 5% WER

Medium Vocabulary : 15% WER

Large Vocabulary : 30% WER

If the environment is noisy, multiply these figures by 2.
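For reference, WER is normally computed as (substitutions + deletions + insertions) divided by the number of words in the reference transcript. The sketch below does that with a word-level edit distance in plain Java. The class WerCalculator and its method names are hypothetical and not part of Sphinx; it simply compares what was actually said against what the recognizer returned.

```java
import java.util.Locale;

// Hypothetical helper for illustration, not a Sphinx API.
final class WerCalculator {

    // Splits a transcript into lower-case words; an empty string gives no words.
    static String[] tokenize(String text) {
        String trimmed = text.trim().toLowerCase(Locale.ROOT);
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    // Minimum number of word substitutions, deletions and insertions
    // needed to turn the reference into the hypothesis (Levenshtein distance).
    static int wordEditDistance(String[] ref, String[] hyp) {
        int[] prev = new int[hyp.length + 1];
        int[] curr = new int[hyp.length + 1];
        for (int j = 0; j <= hyp.length; j++) {
            prev[j] = j; // empty reference: insert every hypothesis word
        }
        for (int i = 1; i <= ref.length; i++) {
            curr[0] = i; // empty hypothesis: delete every reference word
            for (int j = 1; j <= hyp.length; j++) {
                int substitute = prev[j - 1] + (ref[i - 1].equals(hyp[j - 1]) ? 0 : 1);
                int delete = prev[j] + 1;
                int insert = curr[j - 1] + 1;
                curr[j] = Math.min(substitute, Math.min(delete, insert));
            }
            int[] swap = prev;
            prev = curr;
            curr = swap;
        }
        return prev[hyp.length];
    }

    // WER = edit distance / number of reference words (1.0 means 100%).
    static double wordErrorRate(String reference, String hypothesis) {
        String[] ref = tokenize(reference);
        String[] hyp = tokenize(hypothesis);
        if (ref.length == 0) {
            throw new IllegalArgumentException("reference must contain at least one word");
        }
        return (double) wordEditDistance(ref, hyp) / ref.length;
    }
}
```

With the reference “good morning philip” and the hypothesis “hello rita”, this gives a WER of 1.0 (100%): two substitutions and one deletion over three reference words. Note that WER can go above 100% when the hypothesis contains many extra words.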

If you are not getting the expected results, keep trying, and look for bugs in the config file and the run-time reports.

If you are still not happy with the results and have some extra time, open the config.xml file, start playing with the configs, and keep experimenting. Also, make sure you take notes on every config change and its output.

If you have any better ideas or suggestions, please feel free to post a comment.

That’s it for now! Signing off.

11 Responses

  1. Hello Puneet,

    How is WER calculated in Sphinx? How can we get it from Sphinx through our application? Will it provide WER if we give it the original audio and the recorded speech?

    In my project I call Sphinx from a PHP page via a shell command. Is there any other way to call Sphinx?

    Expecting a reply from you… so thanks in advance :)

  2. Can you please tell me how to increase the accuracy of the Transcriber demo? It’s very urgent! I get nothing every time I execute it.

  3. Hi!

    I’m working with the Lattice demo, and I’ve created my own language model, but my accuracy is really low… :S even though I’m testing with the training text, so it should work…

    Could you give me some ideas? 🙂

    // My test wave files are from an audio book, so they are of great quality, and my model is OK, I think.

    1. What if I’m developing sphinx4 code in an IDE such as NetBeans? How can I improve the accuracy?

      Thanks in advance

  4. Hi Puneet,

    I have run the Sphinx Transcriber and Confidence modules, and both run successfully, but I’m facing a problem with accuracy, mostly in Transcriber, which gives an accurate result only a few times. When I spoke slowly, the Confidence module gave about 90% accurate results, but at normal speed it gives horrible results. How can I improve its accuracy? I think it’s possible by changing “digits.gnxml”, but I don’t know how.


  5. Hello, I’m trying to use Sphinx, but I think the microphone on my computer is the problem. With the classic “HelloWorld”, when I say “Good morning Philip”, the program seems to understand “Hello Rita” –‘ very strange. I think the program is very sensitive to the noise my computer makes, so I need to reduce that noise. How can I do that?



  6. Hello Puneet, I watched your tutorial on Java speech recognition on Ubuntu,

    but my problem is that my HelloWorld program runs without giving any output:

    Say: (Good morning | Hello) ( Bhiksha | Evandro | Paul | Philip | Rita | Will )

    Start speaking. Press Ctrl-C to quit.

    You said:

    Start speaking. Press Ctrl-C to quit.

    You said:

    Start speaking. Press Ctrl-C to quit.

    You said:

    Start speaking. Press Ctrl-C to quit.

    1. Hello, I got the same error, but I found on another website that it may be a problem with your microphone. So please check the microphone, or just say “hello paul” or any other 2 words and wait. The first time I got the same error; now some words are recognized correctly, and I’m trying to improve the clarity further.

      ** chance to be the slang problem maybe not like “U.S.A”
