Hidden Markov models
(HMMs)
consist of
states
connected by directional arcs or
transitions
containing probability information. A. A. Markov formulated the basic structure in 1913 to describe letter sequences in Russian. Each state in Markov's model corresponded to a single letter, while the transition linking
A
to
B
represented the probability that
B
would follow
A
. James Baker of Carnegie Mellon University first applied HMMs to speech recognition in the 1970s.
In Markov's original model, a state emits its unique letter, making the path easily discernible. What makes an HMM "hidden" is that it's impossible to determine the path taken through the model on the b
asis of the intermediate outputs. Outputs of HMMs are simply the result of applying probabilities to the input and don't necessarily tell you what state produced them.
Speech recognition constructs the HMM for a word from spoken samples of that word. Each state contains acoustic information about a segment of the word, including acoustic variability. Transitions contain probabilities to determine the likelihood that one state will follow another state. Because they allow a recognition algorithm to move from one state to another based on the input data, HMMs are "nondeterministic" systems.
illustration_link (16 Kbytes)

This diagram illustrates how a speech-recognition algorithm might identify a word by comparing a series of input vectors (i.e., speech samples) with a five-state stored HMM. Here we see three possible paths, all starting at the same state. The orange line indicates the "best path" through this HMM, the one that most closely matches the characteristics of the HMM. For this solution, the first two inputs keep the path in the first state of the HMM. In physical terms, if this HMM represented the word six, the orange path might suggest that the speaker lengthened the
s
sound at the start of the word.