Hello Everyone,

Today I’m going to show you how to expand the dictionary of an acoustic model for Sphinx4. In simple words, this tutorial shows how to add more words to Sphinx’s word database (the dictionary) so that it can recognize words that are not available in the default acoustic models provided by CMU Sphinx. This tutorial is based on the “HelloWorld” example provided by CMU Sphinx.

Important Files in this example :

1) HelloWorld.java

2) hello.gram

3) helloworld.config.xml

Acoustic Model used in this example :

WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.jar

Let’s say we are creating a speech recognition (SR) system for ABC National Airlines. Everything will go fine and Sphinx will recognize most of the words, except the names of cities and states of India. Here is how to add the names of cities and states to the dictionary.

PART ONE

Step 1 : Create a text file “words.txt”, write all the names of cities and states in it, and save it.

Step 2 : Open this link : http://www.speech.cs.cmu.edu/tools/lmtool.html

Step 3 : On that page, go to the “Sentence corpus file:” section, browse to the “words.txt” file and click “Compile Knowledge Base”.

Step 4 : On the next page, click the “Dictionary” link and save that .DIC file.

PART TWO

Step 1 : Extract the WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.jar file.

Step 2 : Go to the edu\cmu\sphinx\model\acoustic\WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz\dict folder.

Step 3 : Open the “cmudict.0.6d” file in that folder.

Step 4 : Copy the contents of the .DIC file you downloaded in PART ONE, paste them into the “cmudict.0.6d” file and save.

Step 5 : Zip the extracted hierarchy back exactly as it was. The ZIP file should have the same name as the JAR file.
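The copy and paste in Step 4 can also be done with a small program. What follows is only a minimal Java sketch, not something that ships with Sphinx4: the class name DictionaryMerger and its methods are made up for illustration. It assumes both files hold one entry per line, with the word as the first whitespace-separated token. It also assumes that a word already present in “cmudict.0.6d” should be kept as is rather than added twice, and it compares words ignoring case.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: appends new dictionary entries to an existing dictionary.
class DictionaryMerger {

    // The word is assumed to be the first whitespace-separated token of a line.
    static String headword(String line) {
        return line.trim().split("\\s+")[0].toLowerCase(Locale.ROOT);
    }

    // Returns all base lines, followed by the extra lines whose word is not yet present.
    static List<String> merge(List<String> base, List<String> extra) {
        Set<String> seen = new HashSet<>();
        List<String> merged = new ArrayList<>();
        for (String line : base) {
            merged.add(line);
            if (!line.trim().isEmpty()) {
                seen.add(headword(line));
            }
        }
        for (String line : extra) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines in the downloaded file
            }
            if (seen.add(headword(line))) {
                merged.add(line.trim());
            }
        }
        return merged;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java DictionaryMerger <cmudict.0.6d> <downloaded .DIC file>");
            return;
        }
        Path dictionary = Paths.get(args[0]);
        Path downloaded = Paths.get(args[1]);
        List<String> merged = merge(
                Files.readAllLines(dictionary, StandardCharsets.UTF_8),
                Files.readAllLines(downloaded, StandardCharsets.UTF_8));
        // Overwrites the dictionary in place, so keep a backup copy first.
        Files.write(dictionary, merged, StandardCharsets.UTF_8);
    }
}
```

Run it with the path to “cmudict.0.6d” as the first argument and the path to the downloaded .DIC file as the second.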

Now remove the “WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.jar” file from the project’s CLASSPATH and add “WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.zip” in its place.
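If you would rather not use an archive tool for the zipping step, the same thing can be sketched in Java with the standard java.util.zip classes. This is only an illustration under assumptions: the class name ModelZipper is made up, and the sketch writes a plain ZIP without any manifest, which is all the steps above ask for. Entry names are taken relative to the extracted folder, so the hierarchy inside the archive stays as it was.

```java
import java.io.File;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: zips an extracted model folder back into one archive.
class ModelZipper {

    static void zipDirectory(Path root, Path zipFile) throws IOException {
        // Collect the files first, so a new archive created under root is not picked up.
        List<Path> files;
        try (Stream<Path> walk = Files.walk(root)) {
            files = walk.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        try (OutputStream out = Files.newOutputStream(zipFile);
             ZipOutputStream zip = new ZipOutputStream(out)) {
            for (Path file : files) {
                // ZIP entry names use forward slashes on every platform.
                String name = root.relativize(file).toString().replace(File.separatorChar, '/');
                zip.putNextEntry(new ZipEntry(name));
                Files.copy(file, zip);
                zip.closeEntry();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java ModelZipper <extracted folder> <output .zip>");
            return;
        }
        zipDirectory(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

Pass the folder you extracted the JAR into as the first argument, so that the archive starts at the same top-level entries as the original JAR.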

That’s it! We are done. Now Sphinx will also recognize all the names of cities and states that we wrote in the “words.txt” file, as long as those names are also listed in the grammar file (hello.gram).
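Before running the demo, it can help to confirm that every name in “words.txt” really has an entry in the modified dictionary. Here is a minimal Java sketch of such a check. The class name MissingWordChecker is made up for illustration and is not part of Sphinx4. The sketch assumes the dictionary has one entry per line with the word as the first token, and that matching may ignore case.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: lists words that have no entry in the dictionary.
class MissingWordChecker {

    static List<String> missingWords(List<String> wordLines, List<String> dictionaryLines) {
        // Collect the first token of every dictionary line, lowercased.
        Set<String> known = new HashSet<>();
        for (String line : dictionaryLines) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                known.add(trimmed.split("\\s+")[0].toLowerCase(Locale.ROOT));
            }
        }
        // A line such as "Tamil Nadu" is checked word by word.
        List<String> missing = new ArrayList<>();
        Set<String> reported = new HashSet<>();
        for (String line : wordLines) {
            for (String word : line.trim().split("\\s+")) {
                String key = word.toLowerCase(Locale.ROOT);
                if (!key.isEmpty() && !known.contains(key) && reported.add(key)) {
                    missing.add(word);
                }
            }
        }
        return missing;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java MissingWordChecker <words.txt> <cmudict.0.6d>");
            return;
        }
        List<String> missing = missingWords(
                Files.readAllLines(Paths.get(args[0]), StandardCharsets.UTF_8),
                Files.readAllLines(Paths.get(args[1]), StandardCharsets.UTF_8));
        for (String word : missing) {
            System.out.println("No dictionary entry for: " + word);
        }
    }
}
```

If the program prints nothing, every word in “words.txt” was found in the dictionary file you pointed it at.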

Now, FAQ time. I will be posting an FAQ and a few important notes in the comments. 🙂

If you have any queries, please feel free to ask.

Regards,

81 Responses

  1. Hello sir,

    Thanks a lot for such a nice tut…i have one request..please provide a tut for changing the grammers in Helloworld program..i mean how do i add/change few more names other than what is provided.

    Like i want to add words like “please jhon” etc…

  2. Because of reading your blog, I decided to start my own. I had never been interested in keeping a blog until I saw how interesting yours was, then I was inspired!

  3. hello Puneet,

    we tried to extend the acoustic dictionary as you hav said in the tutorial.

    But we get some error like

    16:38:40.740 WARNING dictionary Missing word: pune

    16:38:40.748 WARNING jsgfGrammar Can’t find pronunciation for Pune

    note : we’re working in the Eclipse IDE.

  4. Hello,

    Thanks for posting, sorry for late reply. Was little busy.

    Well, ..

    16:38:40.740 WARNING dictionary Missing word: pune

    16:38:40.748 WARNING jsgfGrammar Can’t find pronunciation for Pune

    ..

    This means Sphinx is unable to find the pronunciation of “Pune” in the dictionary file of the acoustic model.

    Follow the steps correctly 🙂 and it will work for sure 🙂

    If you still face any problem, Please feel free to post here.

    Regards,

  5. Hello Shahid,

    Hmm, well i never tried it.

    I’d suggest you use the Eclipse IDE.

    OR

    You’ll have to follow the complete process of packing the files into a JAR.

    Hope it works! 🙂

    If you have any queries, please feel free to ask 🙂

    Regards,

  6. hey thanx frnd…i didnid expect reply 4m u…actually dat was my first comment to one website:)…but i really want a tool for English not Hindi:( can u help me…

  7. Hii..

    step 2 in PART ONE of your post talked about one dictionary tool…right??

    I think , that tool will only give pronunciation of US Accent…For my project I need a tool which gives pronunciation of Indian English Accent …hope u can understand this:(

  8. Hello Nita,

    Hmm, I searched on Google. I don’t think there is any free acoustic model available for Indian English. There are a few paid ones from HP, IBM and a few others.

    If you need any kind of help, please feel free to post 🙂

    Regards

  9. Hi Puneet Karla,

    I need to do speech analysis using sphinx. is it possible or do we need to try with any other alternate software? Please let me know your kind suggestion on this. Also, is Sphinx returns only text as output?

  10. Pingback: SCOTT
  11. Hi,

    I am working on a project on Speech to text conversion and I was previously using the Sphinx Train, but as you have suggested that Sphinx Train is a very tedious process, which indeed it is, and also with an addition that I am stuck at a place because of error.

    I am not able to move on.

    I wanted to know develop it for a set of my own 50 discrete words (not sentences). So will the above method work in doing that.

    And also please tell me what all changes in the Java Files do I have to make to make it recognize.

  12. Hello Dipesh,

    Go for the “HelloWorld” demo provided by Sphinx, as you are going to do command-based recognition.

    Follow the steps of this post. Once you are done, try tuning the config file, as that can increase accuracy. Again, changes in the Java file have nothing to do with recognition accuracy.

    Hope that works ! Feel free to ask any questions.

    Regards,

  13. Hello Puneeth,

    I wanted to know, if there is any way to move grammer file out of the src folder.I need this because i am exporting the java project as a jar and i want to update the grammer file from outside.

  14. Hey guys,

    Sorry for late reply. Was busy with my assignments.

    @Salma, it’s the Java CLASSPATH variable. It is used to add (or refer to) the jar/zip libraries that a class needs in order to run.

    On the IDE side, you just need to import those libraries into your project.

    @Puneeth, yes, of course it is possible. There’s one example for linking a resource outside the jar file. Try to find that demo on the CMU site, or just try something like “resource:file://DRIVE:/folder/” .

    Hope it works! If not, just let me know.

    Regards,

    1. Hello Puneet,

      Grammar file can be moved out of the src folder as shown below.

      “file:///DRIVE:/FOLDER/GRAMMAR_FILE”

      Actually i have ported entire Java APP into an eclipse plug-in that can be plugged into any of the eclipse IDE or any RCP and control the app with voice.

      I have already mapped the words with the eclipse commands so that user can invoke any command in eclipse using his/her voice.

      I wanted to get the dictation mode into the application meaning if user goes on dictating the code, then the application should write it in the editor.To achieve this lot of accuracy is req, how can i improve upon that.

      And do you think it makes any sense??

  15. Hello Sir,

    Iam a final year B.Tech student.It was my dream to do a project in speech recognition.After spending a lot amount of time for searching I found sphinx.I tested the “HelloWorld” example and edited to recognise my own words.But I want something more accurate.If I said “ON”,is it possible to recognise exactly as “ON”?

    Now the problem is if I said “PON” or “TON” it will recognise as “ON” exactly the same way as when I spoke “ON”.

    My project is to control home appliances using voice.

    So accuracy is important.

    I want to recognise these words,

    “ALL ON”

    “ALL OFF”

    “BULB ONE ON”

    “BULB TWO ON”

    “BULB ONE OFF”

    “BULB TWO OFF”

    “FAN ON”

    “FAN OFF”

    “ROOM ONE”

    “ROOM TWO”

    “EXTERNAL”.

    Everything else should be displayed as “i don’t know”.

    Can you help me?

    What should I do for it?

  16. Hii

    I want to ask that how can we make our grammar to recognize eveything a user says? Can we add words at runtime or make it adaptive??

  17. Hi puneet

    you have done really a splendid job to create such a nice tutorial. I am running the helloworld application according to your tutorial. It run without any error but I got only

    Say: (Good morning | Hello) ( Bhiksha | Evandro | Paul | Philip | Rita | Will )

    Start speaking. Press Ctrl-C to quit.

    on screen, after that nothing happen. I have checked this application in both Eclipse and My Eclipse but the result was same. when running with the source code and debugging I found program hangs at the line

    Result result = recognizer.recognize();

    Thanks & Regards

    Samrat

  18. hello sir, i successfully did everything to include different states and cities in india. thanks 🙂

    the problem is how can i make sphinx recognize the additional words?i work in ubuntu and i know only the commands to run the demo programs available along with the application like ‘HelloWorld’.

  19. Hi sir,

    I have an urgent query. I got to do an application using sphinx. For this I am using eclipse. I have an idea of editing the HelloWorld demo for it. But it was not possible. I do not have the source jar. From where should I download them?? Could you please enlighten me on doing it??

  20. Sir,

    I followed the tutorial. But I get struck at the very first step.

    When I click compile knowledge base, the following error shows up:

    Error in FORM header! []

    FORM error in form block (formtype)

    FORM error in form file (corpus)

    FORM error in form file (handdict)

    FORM error in form file (extrawords)

    FORM error in form block (phoneset)

    FORM error in form block (bracket)

    FORM error in form block (model)

    FORM error in form block (class)

    FORM error in form block (discount)

    Terminating process.

    Don’t know why!!!

    Really confused!!

  21. Hello, Puneet.

    I just want to recoginze about 10 discrete words in my languge, Vietnamese. I have learnt about SphinxTrain but it’s too complex.

    Do you have and trick to solve this problem?I don’t have enough time!

  22. sir,i’m expecting your reply for my earlier post:

    “hello sir, i successfully did everything to include different states and cities in india. thanks 🙂

    the problem is how……”

    please do reply.thank you.

  23. Hey Athul,

    If you have followed the tutorial, you are almost done. Just write a grammar file with names. Use the modified acoustic model. That’s it !

    Hope that helps 🙂

    Regards,

  24. Thnx puneet for previous reply!!!!

    i have 2 more doubts as follows:

    Is it possible to make sphinx recognize all words from its vocabulary without including it in the grammer?

    I want to ask that how can we make our grammar to recognize eveything a user says? Can we add words at runtime or make it adaptive??

    If thr is ne tutorial plz provide me link

  25. i am B.E. student my final year project is speech recognition. will sphinx help for accessing g_mail account using voice command

    as if i said inbox then inbox should open

    plz help me

      1. Hey Ashish,

        Thanks for reply,

        I believe it will not be as simple as you have mentioned. You will be interacting with a third-party application, and the OS will also play a major role here. Simply opening or redirecting will not be enough, as opening it in a new window every time will make it messy. So basically you will have to gain control over the web browser, no matter which one you use. That can be done through processes or through the browser’s specific API.

        Hope that works! 🙂

        ~ Puneet Kalra

  26. we followed your procedure for expanding d accoustic model

    and performed the following :

    1) copy pasted the .DIC file contents in cmudict.0.6d

    2) v did d extraction process

    3) then we used the HelloWorld demo code

    a) changed the grammar from (Good Morning | Hello) to

    (Mumbai | Bangalore)

    b) the classpath in helloworld.manifest was changed from .jar to .zip

    On running We are getting NoClassDefFound Error

    Can you please help us out in this:-

    Please elaborate that removing of WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.jar to WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.zip.

    We are usind Eclpise as our IDE

  27. Can you please explain more in detail how to do the “(Now, remove “WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.jar” file from Project’s CLASSPATH and add “WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.zip” instead of it)” which you have mentioned in the last part …… please thanks alot for your postings… it helped me alot on my project ….

    1. Procedure you requested :

      1. Go to the folder sphinx4-1.0beta5\lib

      2. Extract WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.jar file using winrar

      3. You will get one folder named as WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz

      4. go to sphinx4-1.0beta5\lib\WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz\dict folder

      5. open cmudict.0.6d file & add your own words

      6 . Go to command prompt & travel to sphinx4-1.0beta5\lib folder

      7. run following command jar -cf WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz.jar WSJ_8gau_13dCep_16k_40mel_130Hz_6800Hz

      8. Refresh Project in eclipse & run project

  28. hey puneet i followed all the above given steps and when i am trying to run the helloworld.java file i am getting a launch error which says that “selection does not contain the contain the main type” what might be the possible thing which causes this and can u please explain me more about the class path step .!!

  29. Hi you are very good, 🙂 i have 1 doubt, how can i change the language english for spanish??? i have been trying for a while but i cant find the way to get it, thank for your time and sorry for my english 😀

  30. Hey,

    @Samitha, we are using our modified acoustic model instead of the one that comes with CMUSphinx. Only then will Sphinx be able to recognize the newly added words.

    @Alma, acoustic model! Check CMUSphinx’s website to see if there is one in Spanish. They have 19 different acoustic models, I guess.

    Regards,

    Puneet Kalra

    1. Hello,

      I was wondering the same you previously asked, Ashish. Maybe you’ve got the answers now and you can help me.

      Is it possible to make sphinx recognize all the words from its vocabulary without including them in the grammar?

      Or if I have always to use a grammar then how can I make the grammar to recognize eveything a user says (all the words in my dictionary)?

      I’m using a spanish acoustic model with Sphinx, but it only recognizes what it is on my gram archive.

      Thanks for your help!

      Mar

  31. hai puneet,

    u done a awesome job,

    ur website tutorials are very useful to me….

    In Acoustic Model ,i made changes with some city names in “cmudict.0.6d” file….what u have told in part 1 and 2..till now ok..

    but i cant understand..how it(cmudict.0.6d) would be run…

  32. Dear Puneet thks a lot for your tutorial.

    I want to expand this dictionary of acoustic model for regional Indian language(Gujarati). what i have to do for that??

    plz guide
