Viterbi Algorithm

Finding most probable sequence of hidden states

We often wish to take a particular HMM, and determine from an observation sequence the most likely sequence of underlying hidden states that might have generated it.

1. Exhaustive search for a solution

We can use a picture of the execution trellis to visualise the relationship between states and observations. 数据挖掘实验室

We can find the most probable sequence of hidden states by listing all possible sequences of hidden states and finding the probability of the observed sequence for each of the combinations. The most probable sequence of hidden states is that combination that maximises

Pr(observed sequence | hidden state combination). 数据挖掘实验室

For example, for the observation sequence in the trellis shown, the most probable sequence of hidden states is the sequence that maximises :

数据挖掘工具

Pr(dry,damp,soggy | sunny,sunny,sunny), Pr(dry,damp,soggy | sunny,sunny,cloudy), Pr(dry,damp,soggy | sunny,sunny,rainy), . . . . Pr(dry,damp,soggy | rainy,rainy,rainy) 数据挖掘论坛

This approach is viable, but to find the most probable sequence by exhaustively calculating each combination is computationally expensive. As with the forward algorithm, we can use the time invariance of the probabilities to reduce the complexity of the calculation. 数据挖掘研究院

2. Reducing complexity using recursion

We will consider recursively finding the most probable sequence of hidden states given an observation sequence and a HMM. We will first define the partial probability d, which is the probability of reaching a particular intermediate state in the trellis. We then show how these partial probabilities are calculated at t=1 and at t=n (> 1). 数据挖掘工具

These partial probabilities differ from those calculated in the forward algorithm since they represent the probability of the most probable path to a state at time t, and not a total.

数据挖掘论坛

2a. Partial probabilities ( d′s) and partial best paths

Consider the trellis below showing the states and first order transitions for the observation sequence dry,damp,soggy; 数据挖掘实验室

[Picture of trellis]

For each intermediate and terminating state in the trellis there is a most probable path to that state. So, for example, each of the three states at t = 3 will have a most probable path to it, perhaps like this;

数据挖掘工具

[Picture]
We will call these paths partial best paths. Each of these partial best paths has an associated probability, the partial probability or d. Unlike the partial probabilities in the forward algorithm, d is the probablity of the one (most probable) path to the state.

Thus d (i,t) is the maximum probability of all sequences ending at state i at time t, and the partial best path is the sequence which achieves this maximal probability. Such a probability (and partial path) exists for each possible value of i and t. 数据挖掘研究院

In particular, each state at time t = T will have a partial probability and a partial best path. We find the overall best path by choosing the state with the maximum partial probability and choosing its partial best path.
2b. Calculating d ′s at time t = 1
We calculate the d partial probabilities as the most probable route to our current position (given particular knowledge such as observation and probabilities of the previous state). When t = 1 the most probable path to a state does not sensibly exist; however we use the probability of being in that state given t = 1 and the observable state k1 ; i.e.

  数据挖掘论坛

[Formula]

 

数据挖掘实验室

- as in the forward algorithm, this quantity is compounded by the appropriate observation probability.

2c. Calculating d ′s at time t ( > 1 )
We now show that the partial probabilities d at time t can be calculated in terms of the d′s at time t-1.

Consider the trellis below :

数据挖掘工具

[Picture]

We consider calculating the most probable path to the state X at time t; this path to X will have to pass through one of the states A, B or C at time (t-1). 数据挖掘实验室

Therefore the most probable path to X will be one of

  (sequence of states), . . ., A, X
  (sequence of states), . . ., B, X
or (sequence of states), . . ., C, X
数据挖掘论坛

We want to find the path ending AX, BX or CX which has the maximum probability. 数据挖掘工具

Recall that the Markov assumption says that the probability of a state occurring given a previous state sequence depends only on the previous n states. In particular, with a first order Markov assumption, the probability of X occurring after a sequence depends only on the previous state, i.e.

Pr (most probable path to A) . Pr (X | A) . Pr (observation | X)
Following this, the most probable path ending AX will be the most probable path to A followed by X. Similarly, the probability of this path will be
Pr (most probable path to A) . Pr (X | A) . Pr (observation | X)
So, the probability of the most probable path to X is :
[Formula]

where the first term is given by d at t-1, the second by the transition probabilities and the third by the observation probabilities.

数据挖掘交友

Generalising the above expression, the probability of the partial best path to a state i at time t when the observation kt is seen, is :
[Formula]

Here, we are assuming knowledge of the previous state, using the transition probabilites and multiplying by the appropriate observation probability. We then select the maximum such. 数据挖掘研究院

2d. Back pointers, f′s
Consider the trellis
[Trellis]

At each intermediate and end state we know the partial probability, d (i,t). However the aim is to find the most probable sequence of states through the trellis given an observation sequence - therefore we need some way of remembering the partial best paths through the trellis. 数据挖掘工具

Recall that to calculate the partial probability, d at time t we only need the d′s for time t-1. Having calculated this partial probability, it is thus possible to record which preceding state was the one to generate d(i,t) - that is, in what state the system must have been at time t-1 if it is to arrive optimally at state i at time t. This recording (remembering) is done by holding for each state a back pointer f which points to the predecessor that optimally provokes the current state.

Formally, we can write

  数据挖掘研究院

[Formula]

Here, the argmax operator selects the index j which maximises the bracketed expression. 数据挖掘工具

Notice that this expression is calculated from the d′s of the preceding time step and the transition probabilites, and does not include the obervation probability (unlike the calculation of the d′s themselves). This is because we want these f′s to answer the question `If I am here, by what route is it most likely I arrived?′ - this question relates to the hidden states, and therefore confusing factors due to the observations can be overlooked.
2e. Advantages of the approach
Using the Viterbi algorithm to decode an observation sequence carries two important advantages:

 

数据挖掘论坛

  1. There is a reduction in computational complexity by using the recursion - this argument is exactly analogous to that used in justifying the forward algorithm.
  1. The Viterbi algorithm has the very useful property of providing the best interpretation given the entire context of the observations. An alternative to it might be, for example, to decide on the execution sequence

      数据挖掘实验室

    [Formula]

    where 数据挖掘研究院

      数据挖掘论坛

    [Formula]

    Here, decisions are taken about a likely interpretation in a `left-to-right′ manner, with an interpretaion being guessed given an interpretation of the preceding stage (with initialisation from the p vector).

    数据挖掘论坛

  1. continued...
    This approach, in the event of a noise garble half way through the sequence, will wander away from the correct answer.

    Conversely, the Viterbi algorithm will look at the whole sequence before deciding on the most likely final state, and then `backtracking′ through the f pointers to indicate how it might have arisen. This is very useful in `reading through′ isolated noise garbles, which are very common in live data. 数据挖掘论坛

3. Section Summary

The Viterbi algorithm provides a computationally efficient way of analysing observations of HMMs to recapture the most likely underlying state sequence. It exploits recursion to reduce comuputational load, and uses the context of the entire sequence to make judgements, thereby allowing good analysis of noise. 数据挖掘研究院

In use, the algorithm proceeds through an execution trellis calculating a partial probability for each cell, together with a back-pointer indicating how that cell could most probably be reached. On completion, the most likely final state is taken as correct, and the path to it traced back to t=1 via the back pointers.

资料全文下载

[数据挖掘专家] [数据挖掘研究院] [数据挖掘论坛] [数据挖掘实验室]
上一篇:Viterbi算法定义( Viterbi algorithm definition)
下一篇:前向算法使用实例(Forward Algorithm)
最新评论共有 0 位网友发表了评论 , 查看所有评论
发表评论( 不能超过250字,需审核,请自觉遵守互联网相关政策法规。 )
匿名?
数据挖掘网站导航 数据挖掘论坛导航
  • 数据挖掘工具
  • 数据挖掘论坛
  • DataCruncher - Cognos
  • MineSet - MathSoft
  • Intelligent Miner - GainSmarts
  • Sqlserver - SAS - Clementine
  • CART - Weka - WizSoft
  • NeuroShell - ModelQuest
  • data mining tools - Darwin
  • 数据挖掘交友
  • 数据挖掘博客
  • 数据挖掘工具
  • 数据挖掘资源
  • 数据挖掘技术算法
  • 数据挖掘相关期刊、会议
  • 研究院联盟合作专区
  • 数据挖掘基础与相关技术
  • 数据挖掘厂商与就业
  • 数据挖掘研究者乐园
  • 知名厂商数据挖掘工具资料
  • 国内数据挖掘实验室
  • Foreign Data Mining Lab
  • 热点关注
  • 方差分析软件下载
  • 因子分析
  • 第七章 主成分与因子分析
  • 第五章 相关与回归分析
  • 第八章 聚类分析与判别分析
  • 一段求极值的matlab代码 SGA
  • 第十三章 时间序列分析
  • 利用Excel进行医学统计t检验分析
  • 第六章 试验设计与方差分析 (1)
  • 第九章 典型相关分析
  • 论坛最新话题
  • Foundations of Statistical Natural Langu
  • Game Theory meet Data Mining: A Recent P
  • System Building: How does it help or hin
  • 数据挖掘与Clementine培训
  • 新手报到
  • 求 SASEM 客户流失预测分析
  • 数据挖掘工程师/搜索研究院—北京——无线
  • 数据挖掘入门介绍(如何着手数据挖掘)
  • Information Overload Survey Results
  • The INEX 2005 Workshop on Element Retrie
  • 相关资讯
  • JASA中一组经典的统计学文章
  • 中国文化与中国的统计科学
  • “显著性”的关系和这种关系中的陷阱
  • 有关标准化回归系数的误用
  • 描述性回归与预测性回归
  • 论文撰写中常见的统计学问题及其处理
  • 医学论文中常见的统计学处理问题
  • 心理统计学(Psychological Statistics)
  • 统计学习笔记——因素分析
  • 统计学习笔记-判别分析
  • 数据挖掘实验室资料
  • 数据挖掘博客地址
  • 数据挖掘实验室网站地址
  • Prepare for Medicare audits by using dat
  • 注册成为SAS用户与爱好者俱乐部会员
  • 水南梅
  • 明日烟
  • 新人报道
  • 下载
  • 厦门服务器托管,450元/月—0592-5177319 高
  • 买空间送域名--0592-5177319 高静