The goal of this assignment is to gain first-hand experience in training an HMM-based speech recogniser. We will use the freely available Sphinx recogniser and train it on some Australian speech to get a localised recognition engine. The project is involved enough that you will work in pairs to complete it. I will give each pair a different set of training data so that we can compare results at the end. The tasks that will need doing are:
Pre-process the speech data to fit into the formats required by the training tools.
Develop a pronunciation dictionary for Australian English in the appropriate format for Sphinx. There are a few sources of pronunciations that we might use here.
Establish an HMM architecture (for example, how many states per model) and generate initial models.
Train the HMMs.
Develop a language model for a chosen test application (e.g. route planning!).
Deploy and evaluate the recogniser.
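To make the HMM architecture task above a little more concrete, here is a minimal sketch of a left-to-right topology with a configurable number of emitting states. It is an illustration only, not what the Sphinx training tools do internally: the function name, the 0.5 self-loop value, and the choice of three states in the example are all assumptions made for this sketch.

```python
def left_to_right_transitions(n_states, self_loop=0.5):
    """Build a transition matrix for a left-to-right HMM (illustrative sketch).

    Each of the n_states emitting states can either stay where it is
    (probability self_loop) or move on to the next state. The extra last
    column stands for a non-emitting exit state, so the result has
    n_states rows and n_states + 1 columns.
    """
    if n_states < 1:
        raise ValueError("n_states must be at least 1")
    if not 0.0 < self_loop < 1.0:
        raise ValueError("self_loop must be strictly between 0 and 1")

    rows = []
    for i in range(n_states):
        row = [0.0] * (n_states + 1)
        row[i] = self_loop            # stay in the same state
        row[i + 1] = 1.0 - self_loop  # advance to the next state (or exit)
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Three emitting states per model is an assumed starting point here.
    for row in left_to_right_transitions(3):
        print(row)
```

Changing the number of states only changes the size of this matrix; training then re-estimates the probabilities from the speech data.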
Many of the details will be worked out as we go, but we are helped by a good set of tools and documentation from CMU at the Sphinx website.
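As a starting point for the pronunciation dictionary task, the sketch below writes entries in the plain-text style used by CMU-style dictionaries: one word per line followed by its phones, with alternate pronunciations marked by a parenthesised index. The helper name, the example word, and the phone symbols are assumptions for illustration; check the Sphinx documentation for the exact conventions and the phone set we settle on.

```python
def format_dictionary(pronunciations):
    """Format pronunciations as CMU-style dictionary lines (illustrative sketch).

    pronunciations maps each word to a list of pronunciations, where each
    pronunciation is a list of phone symbols. The first pronunciation is
    written under the bare word; later ones get a suffix such as "(2)".
    """
    lines = []
    for word in sorted(pronunciations):
        for index, phones in enumerate(pronunciations[word], start=1):
            head = word if index == 1 else f"{word}({index})"
            lines.append(f"{head} {' '.join(phones)}")
    return lines


if __name__ == "__main__":
    # Hypothetical entry: two variants of one word, using assumed phone symbols.
    example = {"DATA": [["D", "EY", "T", "AH"], ["D", "AA", "T", "AH"]]}
    for line in format_dictionary(example):
        print(line)
```

Keeping the pronunciations in a simple mapping like this makes it easy to merge entries from several sources before writing the final file.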