ML-4-1 期望最大算法－金锄头文库

资源描述

《ML-4-1 期望最大算法》由会员分享，可在线阅读，更多相关《ML-4-1 期望最大算法（54页珍藏版）》请在金锄头文库上搜索。

1、1,机器学习,图像中心,第3章期望最大算法,2,内容提要,一、期望最大(Expectation Maximization, EM) 算法及应用二、GMM及其应用,3,一、EM算法及应用,1 最大似然估计(Maximization Likelihood),设总体的密度函数是参数或参数向量，是该总体的样本，对给定的一组观测值，其联合密度是的函数，又称似然函数，记为：,4,其中为参数集，若存在使就称是的最大似然估计值，而是的最大似然估计量。,求最大似然估计法的步骤,第一步:写出似然函数 ; 计算 ; 第二步:如果似然函数关于参数是可微的,求 ; 第三步:解方程 , 从中得到

2、使取得极大值的 , 就是参数的最大似然估计量.,5,6,解方程,得到,7,Problem 1,Fitting points to one line,Solution: LSE,Fitting points to two lines,How to determine which line generates each point? Solution: ?,2 实际问题,8,Problem 2: Parameter Estimation for Mixture Models,Single Gaussian,Solution: MLE by maximizing,9,Multiple Gauss

3、ians,Which component does each point belong to? Solution: ?,10,Incompleteness Analytically hard,Common Feature,Likelihood of parameter given data X: Maximize expectation of L by “tweaking” ,11,Common Feature - Incompleteness,Not Known,A paradox: Interdependency between hidden values and parameters g

4、overning the distribution,12,A more general problem: estimating the parameters of a p.d.f.,No direct access to the needed data (missed) Outcome is an accumulation of simpler outcomes(grouped) The number of underlying data is unknown (truncated),13,A.P.Dempster, et al 1977: “Maximum likelihood from i

5、ncomplete data via the EM algorithm” General statement of the algorithm Prove convergence Coin the term ”EM algorithm”,2 EM Algorithm Creative work,14,在许多实际的学习问题框架中，相关实例特征中只有一部分可观察到已有许多方法被提出来处理存在未观察到变量的问题比如，如果某些变量有时能观察到，有时不能，那么可以用观察到该变量的实例去预测未观察到的实例中的变量的值 EM算法是存在隐含变量时广泛使用的一种学习方法，可用于变量的值从来没有被直接观察到的

6、情形，只要这些变量所遵循的概率分布的一般形式已知用于GMM的训练用于马尔可夫模型的训练,15,Stochastically independent Bayes rule Logarithm Expectation,Review,16,Maximize expectation of L by tweaking : Analytically hard,Idea：,Introduce nonexistent data Y Incomplete data X Stochastically independent Y Complete data Z := (X, Y) - Easier!,17,So

7、lution ： Introducing additional variables to make it complete,Z=(X,Y),Incomplete data likelihood,Complete data likelihood,18,Define,19,Initialize with random/guess, set n=1 E-step: use current parameters to estimate M-step: compute maximum likelihood estimation of using set n = n+1 repeat until conv

8、ergence,EM Algorithm步骤,20,Solution to Line Fitting,Parameters: = a1, b1, a2, b2,Posterior Probability:,where, l=1,2,21,Expectation:,Maximization: Taking the derivative of Q with respect to al, bl and setting the result to zero, we obtain,Where l=1,2,22,Solution to Mixture Modelling,Parameters:,Poste

9、rior Probability:,where,is a single Gaussian,centered at i with covariance matrixi ai is the mixing weight, i=1,2,M,23,Expectation:,Maximization:,Taking the derivative of Q with respect to al, ul and and setting the result to zero, we obtain the following update rules,24,4 EM Algorithm应用举例,例1,假设我们观察

10、到随机序列,求：,（这里用EM算法来求解该问题，当然，也可以用优化算法如Newtons method.）,25,对数似然函数为：,引入辅助变量,26,X是完整的随机序列，Y是观察到的随机序列。则，完整的随机序列的对数似然函数为：,27,E Step:,注：,28,M Step:,29,EM Iteration Results:,Iteration,Theta,Error,1 2 3 4 5 6 7 8 9 10 11,0.6082474 0.624321 0.6264889 0.6267773 0.6268156 0.6268207 0.6268214 0.6268215 0.6268215

11、 0.6268215 0.6268215,0.1082474 0.01607363 0.002167829 0.0002884433 3.830976e-05 5.086909e-06 6.754367e-07 8.968368e-08 1.190809e-08 1.581141e-09 2.099418e-10,30,例2,现在有来自一个概率分布的一些样本,需要使用这些样本来估计概率密度函数。,我们使用高斯混合模型来近似表示该概率密度函数,其中，是均值和协方差阵分别为,，,的正态分布。,31,（1）假设所有中的均可表示为如下形式：,则，可表示为：,求偏导数：,32,最大似然函数的

12、对数为：,33,其中,34,由,可得,（a）,（b）,35,在,的条件下，估计,需用拉格朗日乘子法,由,可得,36,对于j1,m将各式相加，得：,由,可得,带入上式可得,（c）,37,EM Iteration :,设定一个起始参数值（j=1,m（可用K-means方法设定一个较好的起始参数值）使用起始参数计算利用公式（a）（b）（c）来计算令若小于某一個极小的容忍值，则停止。否则令并跳回步骤2。,38,（2）中的不能表示为形式,推导公式类似(1),请自己推导!,39,5 Generalized EM,Assume and function are differentiabl

13、e in .The EM likelihood converges to a point where GEM: Instead of setting (n) = argmax Q (, (n-1) Just find (n) such that Q (, (n) Q (, (n-1) GEM also is guaranteed to converge,40,J.Bilmes, A Gentle Tutorial of the EM algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hi

14、dden Markov Models. TR-CS-Berkeley, 1998 A.Dempster, Laird, Rubin. Maximum Likelihood from Incomplete Data via the EM Algorithm. Journal of the Royal Statistical Society, 1977 Neal, Hinton. A view of the EM Algorithm that Justifies Incremental, Sparse and other Variants. Learning in Graphical Models

15、,1998 A.Moore, Very Fast Mixture Model based Clustering using Multi-resolution k-d trees. NIPS 1999 N.Friedman, The Bayesian Structural EM Algorithm. UAI 1998 Belongie et.al, Color- and Texture-Based Image Segmentation Using EM and Its Application to Content-Based Image Retrieval, ICCV 98,Recommended Further Readings,41,三、GMM及其应用,1 基于GMMs的简单图像分割方法,基本思想：把图像象素的R、G、B颜色分量分别看作两个高斯模型的混叠，利用整个图像颜色信息通过EM算法分别求出对应于每一颜色分量的两个高斯模型的参数，根据所得高斯模型，即可将所有象素的每一颜色分量分成两类，并分别赋以对应高斯模型的平均值。由于每个颜色分量都可分成两类，则所有象素可分成8类。上述过程已将8类象素赋予不同的值，从而将整幅图分割成8个区域。,42,43,2 Skin color segmentation,Yang, Ahuja 1999: GMMs

展开阅读全文