
Hidden Markov Model

A Hidden Markov Model is a Bayes network with these assumptions:

  • $Y_t$ depends only on $Y_{t-1}$
  • $X_t$ depends only on $Y_t$
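
Concretely, an HMM is specified by an initial state distribution, a transition table, and an emission table. Here is a minimal sketch in Python; the names pi, A, B, the two-state / three-observation sizes, and all numbers are illustrative assumptions, not part of these notes:

```python
import numpy as np

# Illustrative HMM parameters (2 hidden states, 3 possible observations).
pi = np.array([0.6, 0.4])            # P(Y_1): initial state distribution
A  = np.array([[0.7, 0.3],           # A[i, j] = P(Y_t = j | Y_{t-1} = i)
               [0.4, 0.6]])
B  = np.array([[0.5, 0.4, 0.1],      # B[i, k] = P(X_t = k | Y_t = i)
               [0.1, 0.3, 0.6]])

# The two assumptions above are encoded in how these tables are indexed:
# Y_t is drawn from a row of A chosen only by Y_{t-1}, and
# X_t is drawn from a row of B chosen only by Y_t.
```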

The belief network encodes the independence assumptions above; in addition, the transition distribution is assumed to be stationary, i.e. the same at every time step:
$$
\forall i \geq 0:\quad P(S_{i+1} \mid S_i) = P(S_1 \mid S_0)
$$

$$
P(S_i = s) = \sum_{s'} P(S_{i+1} = s \mid S_i = s') \, P(S_i = s')
$$

In this equation, s and s' both denote states of the Markov chain. Here s is the state whose probability we want, while s' (read "s prime") ranges over the states the chain may occupy one step earlier, from which it can transition into s.

The summation over s' adds up the probability of transitioning into state s from every possible previous state s', weighted by the probability of being in state s' at the previous time step. This is exactly the condition that defines a stationary distribution of a Markov chain: after one more transition, the probability of being in state s is unchanged.
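
As a quick sanity check, the update in this equation is just a vector-matrix product. A minimal sketch in Python with made-up numbers (the matrix T and both distributions are illustrative assumptions, not values from these notes):

```python
import numpy as np

# T[s_prev, s] = P(S_{i+1} = s | S_i = s_prev); a made-up 2-state chain.
T = np.array([[0.9, 0.1],
              [0.5, 0.5]])

p_now = np.array([0.8, 0.2])             # P(S_i = s') for each state s'

# Law of total probability: P(S_{i+1} = s) = sum_{s'} P(S_{i+1} = s | S_i = s') * P(S_i = s')
p_next = p_now @ T
print(p_next)                            # [0.82 0.18]

# A stationary distribution is left unchanged by this update; here it is [5/6, 1/6].
p_stat = np.array([5/6, 1/6])
print(np.allclose(p_stat @ T, p_stat))   # True
```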

Key advantage of a hidden Markov model: Polynomial-time complexity

  • Suppose there are |y| different speech sounds in English, and the length of the utterance is d centiseconds (|y| ≈ 50, d ≈ 100)
  • Without the HMM assumptions, computing f(x) = argmax P(y_1, …, y_d | x_1, …, x_d) requires a time complexity of
    O(|y|^d) ≈ 50^100
  • With an HMM, each variable has only one parent, so inference is O(d·|y|^2) ≈ 100·50^2 (the two magnitudes are compared just after this list)
  • The computationally efficient algorithm that we use to compute f(x) = argmax P(y_1, …, y_d | x_1, …, x_d) is called the Viterbi algorithm, named after Andrew Viterbi, the electrical engineer who first applied it to error-correction coding.
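
To make the gap concrete with the ballpark figures quoted above (|y| ≈ 50, d ≈ 100):

$$
|y|^d = 50^{100} \approx 10^{170}, \qquad d \cdot |y|^2 = 100 \cdot 50^2 = 2.5 \times 10^{5}
$$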

An HMM works much better than a naïve Bayes model, as the generated text below illustrates.

Text generated by a naïve Bayes model (unigram model):

  • Representing and speedily is an good apt or come can different natural here he the a in came the to of to expert gray come to furnishes the line message had be these…

Text generated by an HMM (bigram model; a toy sampler sketch follows the example):

  • The head and in frontal attack on an English writer that the character of this point is therefore another for the letters that the time of who ever told the problem for an unexpected…
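
For intuition about why the bigram sample is locally more coherent, here is a toy bigram sampler. It is purely illustrative: the tiny corpus is lifted from the sample sentence above, and the fallback to a uniform word choice is an assumption of this sketch, not part of the notes.

```python
import random
from collections import defaultdict

# Build bigram successor lists from a tiny corpus (words taken from the sample above).
corpus = "the head and in frontal attack on an english writer".split()
successors = defaultdict(list)
for prev, nxt in zip(corpus, corpus[1:]):
    successors[prev].append(nxt)

# Generate text word by word: each word depends only on the previous word,
# which is what gives the bigram/HMM sample its local coherence.
word, output = "the", ["the"]
for _ in range(8):
    word = random.choice(successors.get(word, corpus))  # fall back to any word if unseen
    output.append(word)
print(" ".join(output))
```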

Applications of HMMs

  • Speech recognition HMMs:
    • Observations are acoustic signals (continuous valued)
    • States are specific positions in specific words (so, tens of thousands)
  • Machine translation HMMs:
    • Observations are words (tens of thousands)
    • States are cross-lingual alignments
  • Robot tracking:
    • Observations are range readings (continuous)
    • States are positions on a map

Viterbi Algorithm

The Viterbi algorithm is a computationally efficient dynamic-programming method for computing the maximum a posteriori (MAP) state sequence:
$$
f(x) = \arg\max_{y_1, \dots, y_d} P(y_1, \dots, y_d \mid x_1, \dots, x_d)
$$
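
A compact sketch of the dynamic program in Python follows; it runs in O(d·|y|^2), matching the complexity argument above. The parameters pi, A, B and the observation sequence are toy values for illustration, not a reference implementation:

```python
import numpy as np

def viterbi(obs, pi, A, B):
    """Return the MAP state sequence argmax_y P(y_1..y_d | x_1..x_d)."""
    n_states, d = A.shape[0], len(obs)
    # delta[t, s] = best log-probability of any state path ending in state s at time t
    delta = np.full((d, n_states), -np.inf)
    back = np.zeros((d, n_states), dtype=int)   # back[t, s] = best predecessor of s at time t

    delta[0] = np.log(pi) + np.log(B[:, obs[0]])
    for t in range(1, d):
        for s in range(n_states):
            scores = delta[t - 1] + np.log(A[:, s])   # extend every path by one transition
            back[t, s] = np.argmax(scores)
            delta[t, s] = scores[back[t, s]] + np.log(B[s, obs[t]])

    # Backtrack from the most probable final state.
    path = [int(np.argmax(delta[-1]))]
    for t in range(d - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Toy usage (2 hidden states, 3 observation symbols; numbers are made up).
pi = np.array([0.6, 0.4])
A  = np.array([[0.7, 0.3],
               [0.4, 0.6]])
B  = np.array([[0.5, 0.4, 0.1],
               [0.1, 0.3, 0.6]])
print(viterbi([0, 1, 2, 2], pi, A, B))   # -> [0, 0, 1, 1]
```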
