Puneet Kalra - www.Puneetk.com

“Developing real time human a like robotics system with extra ordinary artificial intelligence, Not only artificial intelligence. A system that can learn new things itself” ~Puneet Kalra

Posts Tagged ‘Sphinx 4’

And the research continues ..

August 31st, 2010

Hey everyone,

I hope you guys are doing well.
First of all, thanks for being so supportive, appreciating my work, and posting such nice comments. Also, sorry for not updating my blog; I’m really busy these days.

And yes, a few updates from my research. Yeah, yeah! I know I’m busy, but still, I can’t stop; I’m addicted to it now. New things, new problems, new ways to think, and finally the new SOLUTIONS! That’s how it goes!

Let’s talk about Sphinx. Firstly, I got a partner to work on it: Puneet Jindal, another stubborn guy like me *Lol*, always ready to burn up his mind and a die-hard algorithms lover. He’s pursuing a B.Tech (final year) at NIT Kurukshetra. We have got 85-90% accuracy on hundreds of words (as the Accuracy Tracker says), and now we are working with thousands of words to get the same accuracy level on them.

The second major topic is HTML5, and I’m really loving it! Not much to share about it. I just want to say, “HTML5 is just SO AWESOME”!

Now the upcoming topics: Optical Character Recognition (OCR), 3D painting, and gaming/artificial intelligence algorithms. I haven’t really started working on these topics; you could say I have one eye on them.

That’s it for now !
~Puneet Kalra

How To Improve Accuracy ( Sphinx 4 )

March 18th, 2010

Hello everyone,

I’m back with a very big question.

How to improve accuracy ?!?!?

It’s been a week, and I’m getting 2-3 mails every day regarding “How to improve accuracy”. So this post is the answer to all those mails and the upcoming ones. First of all, the problem is really big and complex, and to understand it you first need to understand the very basics of speech recognition and what happens in the background during recognition.

The basics of speech recognition are not the Java coding part, or what a gram file is, or why we use the Recognizer and Microphone classes... NO!

Remember! Changes in the Java code will not have any major effect on accuracy. You need to focus on the config.xml file.

To solve this problem, all you need to do is understand Automatic Speech Recognition, search, and language processing algorithms. Specifically for Sphinx, you need an understanding of Hidden Markov Model (HMM) methods of speech recognition, beam search, and language configuration and modeling. It’s highly recommended to read a good book on the subject before you continue working on it.

Before we apply configs to our application to get more accurate results, we must know how to track results.
Click here to learn how to apply the Accuracy Tracker to your application.

The next part is knowing what kind of SR application you are creating. Basically, SR applications are divided into 2 major categories: Dictation and Command.

A Command application is one where the words are very few, and a Dictation application is one where you have a large vocabulary.

Each type of application needs its own resource: a Dictation application needs a language model, and a Command application needs a grammar file. A language model is a file containing the probabilities of sequences of words. A grammar is a much smaller file containing sets of predefined combinations of words.
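
To make “probabilities of sequences of words” a bit more concrete, here is a toy sketch that estimates two-word (bigram) probabilities from a handful of sentences. This is only an illustration of the idea, not how Sphinx builds or loads its language models (it reads prebuilt model files). The class name BigramSketch, its methods, and the sample sentences are all made up for this example, and there is no smoothing, which a real model would need.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical toy class: counts word pairs and turns them into probabilities.
public class BigramSketch {
    // How often each word appears as the first word of a pair.
    private final Map<String, Integer> contextCounts = new HashMap<>();
    // How often each "previous next" pair appears.
    private final Map<String, Integer> pairCounts = new HashMap<>();

    /** Adds one training sentence, padded with start and end markers. */
    public void addSentence(String sentence) {
        String[] words = ("<s> " + sentence.trim().toLowerCase() + " </s>").split("\\s+");
        for (int i = 0; i + 1 < words.length; i++) {
            contextCounts.merge(words[i], 1, Integer::sum);
            pairCounts.merge(words[i] + " " + words[i + 1], 1, Integer::sum);
        }
    }

    /** P(next | previous) as a plain relative frequency (no smoothing). */
    public double probability(String previous, String next) {
        int context = contextCounts.getOrDefault(previous, 0);
        if (context == 0) {
            return 0.0; // unseen context: a real model would back off instead
        }
        int pair = pairCounts.getOrDefault(previous + " " + next, 0);
        return (double) pair / context;
    }

    public static void main(String[] args) {
        BigramSketch lm = new BigramSketch();
        lm.addSentence("book a flight to delhi");
        lm.addSentence("book a flight to mumbai");
        lm.addSentence("cancel a flight to delhi");
        // "to" is followed by "delhi" in 2 of its 3 occurrences.
        System.out.println("P(delhi | to)  = " + lm.probability("to", "delhi"));
        System.out.println("P(mumbai | to) = " + lm.probability("to", "mumbai"));
    }
}
```

A grammar, by contrast, would simply list the allowed word combinations without attaching counts like these, which is why it stays so much smaller.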

Now, back to the most important part: accuracy tracking. Find as many resources, examples, and notes as you can. Then collect databases; a few are also available on the Sphinx website. Work on them. Recognize them. Try to get your best results.

Expected results in a silent environment (WER = word error rate) :
Command : 5% WER
Medium Vocabulary : 15% WER
Large Vocabulary : 30% WER

If the environment is noisy, multiply the results by 2.
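
In case WER is new to you: it is the number of word substitutions, deletions, and insertions needed to turn the recognizer’s output into the reference transcript, divided by the number of words in the reference. Here is a minimal sketch of that calculation, so you can compare your own numbers against the figures above. The class name WerCalculator and the sample sentences are made up for this example; it is not the code of the Sphinx Accuracy Tracker.

```java
// Hypothetical helper: computes word error rate with a word-level edit distance.
public class WerCalculator {

    /** Minimum number of substitutions, deletions, and insertions between two word sequences. */
    public static int editDistance(String[] ref, String[] hyp) {
        int[][] d = new int[ref.length + 1][hyp.length + 1];
        for (int i = 0; i <= ref.length; i++) {
            d[i][0] = i; // delete every reference word
        }
        for (int j = 0; j <= hyp.length; j++) {
            d[0][j] = j; // insert every hypothesis word
        }
        for (int i = 1; i <= ref.length; i++) {
            for (int j = 1; j <= hyp.length; j++) {
                int cost = ref[i - 1].equals(hyp[j - 1]) ? 0 : 1;
                int substitute = d[i - 1][j - 1] + cost;
                int delete = d[i - 1][j] + 1;
                int insert = d[i][j - 1] + 1;
                d[i][j] = Math.min(substitute, Math.min(delete, insert));
            }
        }
        return d[ref.length][hyp.length];
    }

    /** Splits on whitespace and lowercases, so case differences are not counted as errors. */
    public static String[] tokenize(String text) {
        String trimmed = text.trim().toLowerCase();
        return trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
    }

    /** WER = edit distance / number of reference words. */
    public static double wordErrorRate(String reference, String hypothesis) {
        String[] ref = tokenize(reference);
        String[] hyp = tokenize(hypothesis);
        if (ref.length == 0) {
            throw new IllegalArgumentException("reference must contain at least one word");
        }
        return (double) editDistance(ref, hyp) / ref.length;
    }

    public static void main(String[] args) {
        // One substitution (delhi -> deli) plus one insertion (today):
        // 2 errors / 5 reference words = 0.4, i.e. 40% WER.
        double wer = wordErrorRate("fly from delhi to mumbai", "fly from deli to mumbai today");
        System.out.println("WER = " + wer);
    }
}
```

Note that WER can go above 100% when the recognizer inserts a lot of extra words, because the divisor is the reference length only.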

If you are not getting the expected results, keep trying, and look for bugs in the config file and in the run-time reports.

If you are still not happy with the results and have some extra time, open the config.xml file, start playing with the configs, and keep experimenting. Also, make sure you take notes on every config change and its output.

If you have any better ideas or suggestions, please feel free to post comments.

That’s it for now ! Signing off.

Expanding Dictionary Of Acoustic Model

January 5th, 2010

Hello Everyone,

Today I’m going to tell you how to expand the dictionary of the acoustic model for Sphinx4. In simple words, this tutorial will tell you how you can add more words to Sphinx’s word database (dictionary) and let it recognize words that are not available in the default acoustic models provided by CMU Sphinx. This tutorial is based on the “HelloWorld” example provided by CMU Sphinx.

Important Files in this example :
1) HelloWorld.java
2) hello.gram
3) helloworld.config.xml

Acoustic Model used in this example :
WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.jar

Let’s say we are creating an SR system for ABC National Airlines. Everything will go fine, and Sphinx will recognize most of the words except the names of the cities and states of India. Now I will tell you how to add the names of cities and states to the dictionary.

PART ONE
Step 1 : Create a text file “words.txt”, write all the names of the cities and states in it, and save it.
Step 2 : Open this link : http://www.speech.cs.cmu.edu/tools/lmtool.html
Step 3 : On that page, go to the “Sentence corpus file:” section, browse to the “words.txt” file, and click “Compile Knowledge Base”.
Step 4 : On the next page, click on the “Dictionary” link and save that .DIC file.

PART TWO
Step 1 : Extract the WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.jar file.
Step 2 : Go to the edu\cmu\sphinx\model\acoustic\WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz\dict folder.
Step 3 : Open the “cmudict.0.6d” file in that folder.
Step 4 : Copy the data from the .DIC file you downloaded in PART ONE, paste it into the “cmudict.0.6d” file, and save.
Step 5 : Zip the extracted hierarchy back as it was; the ZIP file should have the same name as the JAR file.
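
If you would rather not copy and paste by hand, Step 4 can be sketched in a few lines of Java. This is only a rough sketch under some assumptions: each dictionary line starts with the word followed by whitespace and its phonemes, words are compared case-insensitively, and a word that is already in “cmudict.0.6d” should not be added a second time. The class name DictionaryMerge and its methods are made up for this example. It overwrites the base dictionary file, so keep a backup copy first.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: appends new pronunciation entries to an existing dictionary.
public class DictionaryMerge {

    /** The headword is the first whitespace-delimited token on a dictionary line. */
    public static String headword(String line) {
        return line.trim().split("\\s+")[0];
    }

    /**
     * Returns the base lines followed by every extra line whose headword is not
     * already present. Variants such as WORD(2) count as separate headwords.
     */
    public static List<String> merge(List<String> base, List<String> extra) {
        Set<String> seen = new HashSet<>();
        List<String> merged = new ArrayList<>();
        for (String line : base) {
            merged.add(line);
            if (!line.trim().isEmpty()) {
                seen.add(headword(line).toUpperCase()); // assumption: case-insensitive match
            }
        }
        for (String line : extra) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines in the downloaded file
            }
            if (seen.add(headword(line).toUpperCase())) {
                merged.add(line.trim());
            }
        }
        return merged;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java DictionaryMerge <cmudict.0.6d> <downloaded .dic file>");
            return;
        }
        Path base = Paths.get(args[0]);
        Path extra = Paths.get(args[1]);
        // ISO-8859-1 maps every byte to a character, so reading never fails on odd bytes.
        Charset charset = StandardCharsets.ISO_8859_1;
        List<String> merged = merge(Files.readAllLines(base, charset),
                                    Files.readAllLines(extra, charset));
        Files.write(base, merged, charset); // overwrites the base dictionary in place
    }
}
```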

Now, remove the “WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.jar” file from the project’s CLASSPATH and add “WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.zip” in its place.
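
Any zip tool will do for Step 5 of PART TWO, but if you want to do the re-zipping from Java, here is a minimal sketch using the standard java.util.zip classes. The class name ModelZipper is made up for this example. The important detail is that entry names inside the archive are relative to the extracted folder and use forward slashes, so the edu/cmu/sphinx/... hierarchy comes out the same as it was in the JAR. Write the ZIP file outside the folder you are zipping so that it does not end up inside itself.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: zips a directory tree, keeping paths relative to its root.
public class ModelZipper {

    public static void zipDirectory(Path sourceDir, Path zipFile) throws IOException {
        // Collect the regular files first, before the archive itself is created.
        List<Path> files;
        try (Stream<Path> walk = Files.walk(sourceDir)) {
            files = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        try (OutputStream out = Files.newOutputStream(zipFile);
             ZipOutputStream zip = new ZipOutputStream(out)) {
            for (Path file : files) {
                // ZIP entry names use '/' as the separator on every platform.
                String name = sourceDir.relativize(file).toString().replace('\\', '/');
                zip.putNextEntry(new ZipEntry(name));
                Files.copy(file, zip);
                zip.closeEntry();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java ModelZipper <extracted folder> <output .zip file>");
            return;
        }
        zipDirectory(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```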

That’s it! We are done. Now Sphinx will also recognize all the names of cities and states that we wrote in the “words.txt” file.
Now, FAQ time. I will be posting an FAQ and a few important notes in the comments. :)

If you have any queries, please feel free to ask.
Regards,