Statistical Machine Learning (Chen Ming): myi2ml2e-chap4-v1-0


Lecture Slides for
INTRODUCTION TO Machine Learning, 2nd Edition
ETHEM ALPAYDIN
The MIT Press, 2010
alpaydin@boun.edu.tr
http://www.cmpe.boun.edu.tr/~ethem/i2ml2e

CHAPTER 4: Parametric Methods

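Earlier we discussed how to use probabilistic modeling to make optimal decisions under uncertainty. We now turn to estimating those probabilities. This chapter covers parametric methods, which assume that the probability model is determined by a small number of parameters. It also introduces bias and variance, and model selection methods for balancing model complexity against empirical error. Throughout, x is assumed to be univariate.

Lecture Notes for E Alpaydın 2010 Introduction to Machine Learning 2e, The MIT Press (V1.0)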

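4.1 Introduction

Assume the data $X = \{x^t\}_t$ follow some distribution, $x^t \sim p(x)$.

Parametric methods:
- Assume the sample is drawn from a known model determined by a few parameters, e.g. $p(x|\theta)$ is $N(\mu, \sigma^2)$ with sufficient statistics $\theta = \{\mu, \sigma^2\}$.
- Estimate these statistics to obtain the distribution.
- Use the estimated distributions $p(x)$, $p(C_i)$, $p(C_i|x)$ for decision making.

How to estimate?
- Maximum likelihood estimation (MLE): uses no prior knowledge about $\theta$.
- Maximum a posteriori (MAP) estimation: uses prior knowledge about $\theta$.
- Bayes estimation: treats $\theta$ as a random variable and takes its posterior expectation.

4.2 Maximum Likelihood Estimation

Likelihood of $\theta$ given the sample $X$: $l(\theta|X) = p(X|\theta) = \prod_t p(x^t|\theta)$
Log likelihood: $L(\theta|X) = \log l(\theta|X) = \sum_t \log p(x^t|\theta)$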

Procedure for solving the maximum likelihood estimate

Examples: Bernoulli/Multinomial

Bernoulli: two states, failure/success, $x \in \{0, 1\}$
$P(x) = p_0^{x} (1 - p_0)^{1 - x}$
$L(p_0|X) = \log \prod_t p_0^{x^t} (1 - p_0)^{1 - x^t}$
MLE: $\hat{p}_0 = \sum_t x^t / N$

Multinomial: $K > 2$ states, $x_i \in \{0, 1\}$
$P(x_1, x_2, \ldots, x_K) = \prod_i p_i^{x_i}$
$L(p_1, p_2, \ldots, p_K|X) = \log \prod_t \prod_i p_i^{x_i^t}$
MLE: $\hat{p}_i = \sum_t x_i^t / N$ (the fraction of samples that fall in state i)

Gaussian (Normal) Distribution (Pattern Recognition, p. 36)

$p(x) = N(\mu, \sigma^2)$, i.e. $p(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left[-\frac{(x - \mu)^2}{2\sigma^2}\right]$
MLE for $\mu$ and $\sigma^2$: $m = \sum_t x^t / N$, $s^2 = \sum_t (x^t - m)^2 / N$

Estimating the variance (see the reference derivation)
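The closed-form estimates above translate almost line for line into code. The following is a minimal NumPy sketch; the function names and the toy data are illustrative assumptions, not part of the slides.

```python
import numpy as np

def bernoulli_mle(x):
    """MLE of p0 for 0/1 observations: sum_t x^t / N."""
    x = np.asarray(x, dtype=float)
    return x.sum() / x.size

def multinomial_mle(X):
    """MLE of (p_1, ..., p_K) from one-hot rows: sum_t x_i^t / N."""
    X = np.asarray(X, dtype=float)
    return X.sum(axis=0) / X.shape[0]

def gaussian_mle(x):
    """MLE of the mean m and variance s^2 (divides by N, not N - 1)."""
    x = np.asarray(x, dtype=float)
    m = x.sum() / x.size
    s2 = ((x - m) ** 2).sum() / x.size
    return m, s2

# Toy data, made up for illustration.
p0 = bernoulli_mle([1, 0, 1, 1])
p = multinomial_mle([[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 0, 1]])
m, s2 = gaussian_mle([1.0, 2.0, 3.0, 4.0])
```

Note that `gaussian_mle` divides by N, so it is the ML variance estimate rather than the unbiased sample variance.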

Multivariate normal distribution (Pattern Recognition, p. 37)

See the reference derivation.

Gaussian Mixture Model (GMM)

- Assume there are K components.
- Each component generates data from a Gaussian with mean $\mu_k$ and covariance matrix $\Sigma_k$.
- Assume each data point is generated by the following rule: pick a component at random, choosing the k-th component with probability $\pi_k$, then draw the point from the k-th component, $x \sim N(\mu_k, \Sigma_k)$.
- That is, $p(x) = \sum_k \pi_k N(x|\mu_k, \Sigma_k)$.

Gaussian mixture model: the estimation problem

Given IID data, find the parameters $\{\pi_k, \mu_k, \Sigma_k\}$. The MLE cannot be obtained analytically, so we solve for it numerically (e.g. with the EM algorithm). The complete data $(x, z)$ are treated as incomplete (missing) data $x$, where $z$ is the component each point belongs to.
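The generative rule and the numerical (EM) route mentioned above can be sketched for the one-dimensional case, in keeping with the chapter's univariate assumption. This is a minimal illustration, not the slides' own implementation: the quantile-based initialisation, the fixed iteration count, and all names are assumptions.

```python
import numpy as np

def sample_gmm(n, pi, mu, sigma, rng):
    """Pick component k with probability pi[k], then draw x ~ N(mu[k], sigma[k]^2)."""
    z = rng.choice(len(pi), size=n, p=pi)
    return rng.normal(mu[z], sigma[z]), z

def normal_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (np.sqrt(2 * np.pi) * sigma)

def em_gmm_1d(x, k, n_iter=200):
    """Fit a 1-D Gaussian mixture by EM; returns (pi, mu, sigma, log-likelihood trace)."""
    # Crude initialisation (an assumption): spread the means over data quantiles.
    mu = np.quantile(x, (np.arange(k) + 0.5) / k)
    sigma = np.full(k, x.std())
    pi = np.full(k, 1.0 / k)
    trace = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        dens = pi * normal_pdf(x[:, None], mu, sigma)      # shape (N, k)
        trace.append(np.log(dens.sum(axis=1)).sum())       # log-likelihood before update
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: weighted re-estimates of proportions, means and spreads.
        nk = resp.sum(axis=0)
        pi = nk / x.size
        mu = (resp * x[:, None]).sum(axis=0) / nk
        sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk)
    return pi, mu, sigma, np.array(trace)

rng = np.random.default_rng(0)
true_pi = np.array([0.3, 0.7])
true_mu = np.array([-2.0, 3.0])
true_sigma = np.array([0.5, 1.0])
x, _ = sample_gmm(2000, true_pi, true_mu, true_sigma, rng)
pi_hat, mu_hat, sigma_hat, ll = em_gmm_1d(x, k=2)
```

The log-likelihood trace `ll` should never decrease from one iteration to the next, which is a convenient sanity check; it says nothing about whether the optimum reached is global.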

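EM for GMM

Given the estimate at iteration t, the estimate at iteration t+1 is obtained in two steps:
- E-step: under the current parameters, compute the posterior probability that each data point belongs to each component.
- M-step: re-estimate the mixing proportions, means and covariances, using those posterior probabilities as weights.

Maximizing the likelihood function

- Find the extrema of the likelihood function by differentiation:
  - analytically (as in the Gaussian example above);
  - numerically, with optimization algorithms such as gradient descent, or with EM. EM converges to a local optimum but is not guaranteed to reach the global one.
- Heuristic algorithms: genetic algorithms (GA), etc.

Points to note:
- The goal is the global maximum of the likelihood function.
- A zero first derivative is only a necessary condition, not a sufficient one.
- Moreover, a zero first derivative only locates local extrema in the interior of the function's domain. If the extremum lies on the boundary, the first derivative there may be nonzero, so the boundary must be checked as well.

Supplementary material: MLE and MAP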


Maximum a Posteriori (MAP)

$\theta_{MAP} = \arg\max_\theta p(\theta|X)$
Treat $\theta$ as a random variable with prior $p(\theta)$.
Bayes' rule: $p(\theta|X) = p(X|\theta)\, p(\theta) / p(X)$, i.e. posterior = likelihood × prior / evidence.
- The prior tells us what is known about $\theta$ before the sample is observed.
- The posterior tells us how $\theta$ is distributed in light of the sample observed so far.

MAP: example

Univariate normal distribution (textbook Section 4.4). What, then, is the MAP estimate of $\theta$?

MAP: example

See Pattern Recognition, p. 39. What, then, is the MAP estimate of $\mu$?
When the variance of the prior is much larger than the variance of the sample, the prior is nearly flat and the MAP estimate is approximately the ML estimate.

MLE and MAP

[Figure, panels (a) and (b)] ML and MAP estimates of $\theta$ will be approximately the same in (a) and different in (b).

4.4 Bayes estimator

$\theta_{Bayes} = E[\theta|X] = \int \theta\, p(\theta|X)\, d\theta$
Treat $\theta$ as a random variable with prior $p(\theta)$, and take the posterior mean as the estimate.
The expectation is used because, under mean squared error, the best estimate of a random variable is its mean.

The expectation minimizes the distance

Suppose we measure the distance between a random variable $X$ and a constant $b$ with the L2 distance, $(X - b)^2$. The closer $b$ is to $X$, the smaller this quantity. We can therefore choose $b$ to minimize $E[(X - b)^2]$, and that $b$ can be regarded as a good prediction of $X$. (We cannot minimize $(X - b)^2$ directly, because the result would depend on $X$ and be useless for predicting it.)

Under the L2 distance, $E[X]$ is the best estimate of $X$.

When a random variable is estimated by a constant, the distance from the constant to the random variable is measured by (4.17):
$E[(X - b)^2] = E[(X - E[X])^2] + (E[X] - b)^2$
The cross term vanishes because $E[X] - b$ is a constant.
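A quick numerical check of the claim that the mean is the best constant predictor under squared distance. This is only a sketch: the exponential distribution and the candidate grid are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.exponential(scale=2.0, size=5000)      # any distribution would do

candidates = np.linspace(0.0, 6.0, 301)        # candidate constants b, step 0.02
# Empirical E[(X - b)^2] for every candidate b.
mean_sq_dist = ((x[:, None] - candidates) ** 2).mean(axis=0)
best_b = candidates[np.argmin(mean_sq_dist)]

# Decomposition E[(X - b)^2] = E[(X - E[X])^2] + (E[X] - b)^2, checked at b = 1.
b = 1.0
lhs = ((x - b) ** 2).mean()
rhs = x.var() + (x.mean() - b) ** 2
```

`best_b` lands on the grid point nearest to `x.mean()`, and `lhs` equals `rhs` up to floating-point rounding.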

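Bayes Estimator: Example (Pattern Recognition, p. 41)

$x^t \sim N(\theta, \sigma_0^2)$ and $\theta \sim N(\mu, \sigma^2)$
$\theta_{ML} = m$
$\theta_{MAP} = \theta_{Bayes} = E[\theta|X] = \frac{N/\sigma_0^2}{N/\sigma_0^2 + 1/\sigma^2}\, m + \frac{1/\sigma^2}{N/\sigma_0^2 + 1/\sigma^2}\, \mu$

For a normal density the mode equals the expected value. Hence, if $p(\theta|X)$ is normal, $\theta_{MAP} = \theta_{Bayes}$.

Verification (with different variable names)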
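To see how the prior pulls the estimate, here is a small sketch of the posterior mean for the model $x^t \sim N(\theta, \sigma_0^2)$ with prior $\theta \sim N(\mu, \sigma^2)$, using the standard result for this conjugate pair. The function name, argument names and data are illustrative assumptions.

```python
import numpy as np

def posterior_mean(x, sigma0_sq, mu, sigma_sq):
    """E[theta | X] for x^t ~ N(theta, sigma0_sq) with prior theta ~ N(mu, sigma_sq)."""
    x = np.asarray(x, dtype=float)
    n, m = x.size, x.mean()
    w_data = n / sigma0_sq        # weight on the sample mean m
    w_prior = 1.0 / sigma_sq      # weight on the prior mean mu
    return (w_data * m + w_prior * mu) / (w_data + w_prior)

x = np.array([2.1, 1.9, 2.4, 2.0])   # made-up observations
theta_ml = x.mean()
theta_tight = posterior_mean(x, sigma0_sq=1.0, mu=0.0, sigma_sq=0.1)   # confident prior
theta_flat = posterior_mean(x, sigma0_sq=1.0, mu=0.0, sigma_sq=1e6)    # nearly flat prior
```

With the tight prior the estimate is shrunk strongly toward the prior mean 0; with the nearly flat prior it is practically the ML estimate, matching the remark that MAP approaches ML when the prior variance is large.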

The result above generalizes to the multivariate Gaussian distribution.

Remarks

- In some special cases, the Bayes parameter estimate coincides with the MAP estimate.
- If $p(\theta)$ is sufficiently broad around the peak of $p(\theta|X)$, the MAP estimate is approximately the ML estimate.
- When the number of training samples N is small, the three estimates differ.
- When the sample is large enough, MLE and MAP are the simpler choices.
- The Bayes approach uses more information. Provided that information is reliable, it can give better results, at a higher computational cost.

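Example: uniform distribution

Let $x^1, \ldots, x^N$ be IID from $U(0, \theta)$, so that the density is $p(x|\theta) = 1/\theta$ for $0 \le x \le \theta$ and 0 otherwise. Estimate $\theta$.

Consider a fixed value of $\theta$. If $x^i > \theta$ for some i, then $p(x^i|\theta) = 0$ and therefore $l(\theta|X) = 0$. So let $\theta \ge \max_t x^t$; then $l(\theta|X) = 1/\theta^N$, which is a decreasing function of $\theta$. Hence $\theta_{ML} = \max_t x^t$.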
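A small numerical illustration of this boundary case, assuming the example refers to the standard setting $x^t \sim U(0, \theta)$ with $\theta$ unknown (the slide formulas did not survive extraction, so this setting is an assumption). The likelihood has no interior stationary point, so its maximum has to be found by inspecting the boundary.

```python
import numpy as np

def uniform_likelihood(theta, x):
    """l(theta | X) for x^t ~ U(0, theta): theta^(-N) if theta >= max(x), else 0."""
    x = np.asarray(x, dtype=float)
    return theta ** (-x.size) if theta >= x.max() else 0.0

x = np.array([0.8, 2.9, 1.7, 2.2])          # made-up sample
theta_ml = x.max()                          # boundary solution

grid = np.linspace(0.5, 6.0, 56)            # candidate values of theta
lik = np.array([uniform_likelihood(t, x) for t in grid])
grid_argmax = grid[np.argmax(lik)]          # first grid value not below max(x)
```

On the grid, the likelihood is zero below `x.max()` and strictly decreasing above it, so the largest value sits at the first admissible grid point.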

Parameter estimation: summary

Maximum a Posteriori (MAP): $\theta_{MAP} = \arg\max_\theta p(\theta|X)$
Maximum Likelihood (ML): $\theta_{ML} = \arg\max_\theta p(X|\theta)$
Bayes: $\theta_{Bayes} = E[\theta|X] = \int \theta\, p(\theta|X)\, d\theta$

Criteria for judging an estimator
- Unbiased vs. biased estimators
- Consistent estimators
- Asymptotically unbiased estimators

4.3 Bias and Variance

Unknown parameter $\theta$
Estimator $d_i = d(X_i)$ on sample $X_i$
Bias: $b_\theta(d) = E[d] - \theta$
Variance: $E[(d - E[d])^2]$
Mean square error, which measures how far the estimator is from the true value:
$r(d, \theta) = E[(d - \theta)^2]$   (4.11)
$= (E[d] - \theta)^2 + E[(d - E[d])^2]$
$= \text{Bias}^2 + \text{Variance}$

Bias: the gap between the average value of the estimator and the true value.
Variance: how widely the estimator scatters around its own average.
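The decomposition can be checked by simulation. The sketch below uses the ML variance estimate (which divides by N) as the estimator d and draws many samples $X_i$ from a normal distribution; the true variance, sample size and number of trials are arbitrary assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
true_var = 4.0                 # theta: the parameter being estimated
n, trials = 10, 20_000         # sample size N and number of samples X_i

samples = rng.normal(0.0, np.sqrt(true_var), size=(trials, n))
d = samples.var(axis=1)        # d_i = d(X_i): ML variance estimate on each sample

bias = d.mean() - true_var                     # E[d] - theta
variance = ((d - d.mean()) ** 2).mean()        # E[(d - E[d])^2]
mse = ((d - true_var) ** 2).mean()             # E[(d - theta)^2]
```

Here `mse` matches `bias**2 + variance` up to rounding, and the bias comes out negative (close to $-\sigma^2/N$), which shows that the ML variance estimate is biased for small N.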

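Parametric Classification

Assume the class-conditional likelihoods are normal and the classes follow a multinomial distribution; below we consider the MLE of the parameters. The discriminant function corresponds to maximum-posterior classification:
$g_i(x) = p(x|C_i)\, P(C_i)$, or equivalently $g_i(x) = \log p(x|C_i) + \log P(C_i)$
$p(x|C_i) = \frac{1}{\sqrt{2\pi}\,\sigma_i} \exp\left[-\frac{(x - \mu_i)^2}{2\sigma_i^2}\right]$
$g_i(x) = -\frac{1}{2}\log 2\pi - \log \sigma_i - \frac{(x - \mu_i)^2}{2\sigma_i^2} + \log P(C_i)$

Given the sample $X = \{x^t, r^t\}_{t=1}^N$, where $r_i^t = 1$ if $x^t \in C_i$ and $r_i^t = 0$ if $x^t \in C_j$, $j \ne i$,
ML estimates are
$\hat{P}(C_i) = \frac{\sum_t r_i^t}{N}$, $m_i = \frac{\sum_t x^t r_i^t}{\sum_t r_i^t}$, $s_i^2 = \frac{\sum_t (x^t - m_i)^2 r_i^t}{\sum_t r_i^t}$
Discriminant becomes
$g_i(x) = -\frac{1}{2}\log 2\pi - \log s_i - \frac{(x - m_i)^2}{2 s_i^2} + \log \hat{P}(C_i)$

If we assume $s_i = s_j$ and equal priors, the discriminant reduces to (4.29); for an example see Figure 4.2.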
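A compact sketch of this classifier for one-dimensional inputs: estimate a prior, mean and standard deviation per class by ML, then pick the class with the largest discriminant. The function names and the synthetic two-class data are assumptions made for illustration, and the constant $-\frac{1}{2}\log 2\pi$ is dropped because it is shared by all classes.

```python
import numpy as np

def fit_gaussian_classes(x, r):
    """ML estimates per class: prior P(C_i), mean m_i, std s_i. r holds integer labels."""
    classes = np.unique(r)
    priors = np.array([(r == c).mean() for c in classes])
    means = np.array([x[r == c].mean() for c in classes])
    stds = np.array([x[r == c].std() for c in classes])    # divides by N_i (ML)
    return classes, priors, means, stds

def discriminants(x, priors, means, stds):
    """g_i(x) = -log s_i - (x - m_i)^2 / (2 s_i^2) + log P(C_i), one column per class."""
    x = np.atleast_1d(x)[:, None]
    return -np.log(stds) - (x - means) ** 2 / (2 * stds ** 2) + np.log(priors)

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2.0, 1.0, 500), rng.normal(2.0, 1.0, 500)])
r = np.repeat([0, 1], 500)

classes, priors, means, stds = fit_gaussian_classes(x, r)
queries = np.array([-3.0, 0.5, 3.0])
pred = classes[np.argmax(discriminants(queries, priors, means, stds), axis=1)]
```

With equal priors and nearly equal estimated variances, the decision boundary falls close to the midpoint between the two class means.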

[Figure] Equal variances: single boundary at halfway between means.

[Figure] Variances are different: two boundaries.

For unequal risks and minimum-risk decisions, see Chapter 3, Section 3.3.

Regression

Regression: From LogL to Error

Linear Regression

Polynomial Regression

Other Error Measures

Square error: $E(\theta|X) = \frac{1}{2} \sum_t [r^t - g(x^t|\theta)]^2$
Relative square error: $E(\theta|X) = \frac{\sum_t [r^t - g(x^t|\theta)]^2}{\sum_t [r^t - \bar{r}]^2}$
Absolute error: $E(\theta|X) = \sum_t |r^t - g(x^t|\theta)|$
ε-sensitive error: $E(\theta|X) = \sum_t 1(|r^t - g(x^t|\theta)| > \varepsilon)\, (|r^t - g(x^t|\theta)| - \varepsilon)$

Bias and Variance

$E[(r - g(x))^2 \mid x] = E[(r - E[r|x])^2 \mid x] + (E[r|x] - g(x))^2$
The first term is the noise, the second the squared error.
- Given a sample set, we evaluate the function g at some point x.
- Given the sample set, g(x) is a constant, while r is a random variable.
- Estimating a random variable by a constant is assessed with Eq. (4.17).

In the notation of the previous section, $E[r|x]$ is in fact $f(x)$, the true value at x. $g(x)$ is itself a random variable: it changes as the sample set changes. Consider the distance between the random variable $g(x)$ and the true value $E[r|x]$, following Eq. (4.11):
$E_X[(E[r|x] - g(x))^2 \mid x] = (E[r|x] - E_X[g(x)])^2 + E_X[(g(x) - E_X[g(x)])^2]$
The first term is the squared bias, the second the variance. This expression is the expectation, over the distribution of sample sets, of the second term in the previous equation.

Estimating Bias and Variance

M samples $X_i = \{x_i^t, r_i^t\}$, $i = 1, \ldots, M$, are used to fit $g_i(x)$, $i = 1, \ldots, M$

Bias/Variance Dilemma

Example: $g_i(x) = 2$ has no variance and high bias; $g_i(x) = \sum_t r_i^t / N$ has lower bias with variance.
As we increase complexity, bias decreases (a better fit to data) and variance increases (fit varies more with data).
Bias/Variance dilemma (Geman et al., 1992)

[Figure: bias and variance, illustrated with $f$, $g_i$ and $\bar{g}$]
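The dilemma can be made concrete by fitting polynomials of increasing order to M independently drawn samples and measuring how the average fit $\bar{g}$ deviates from the truth (squared bias) and how individual fits $g_i$ scatter around $\bar{g}$ (variance). The following is a sketch under stated assumptions: the true function, noise level, sample sizes and the choice to evaluate at the training inputs are all illustrative, not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    return 2 * np.sin(1.5 * x)          # assumed "true" function

N, M, noise_sd = 20, 100, 1.0           # points per sample, number of samples, noise
x = np.linspace(0, 5, N)                # the same inputs are reused in every sample

def bias_variance(order):
    """Estimate bias^2 and variance of order-`order` polynomial fits at the inputs x."""
    fits = np.empty((M, N))
    for i in range(M):
        r = f(x) + rng.normal(0, noise_sd, N)             # sample X_i
        fits[i] = np.polyval(np.polyfit(x, r, order), x)  # g_i evaluated at x
    g_bar = fits.mean(axis=0)                             # average fit
    bias_sq = ((g_bar - f(x)) ** 2).mean()
    variance = ((fits - g_bar) ** 2).mean()
    return bias_sq, variance

results = {order: bias_variance(order) for order in (0, 1, 3, 5)}
```

In `results`, the variance column grows with the polynomial order while the squared bias of the order-5 fit is far below that of the constant fit, which is the trade-off described above.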

Polynomial Regression

[Figure] Best fit: “min error”

[Figure] Best fit, “elbow”

Model Selection

- Cross-validation: measure generalization accuracy by testing on data unused during training.
- Regularization: penalize complex models, $E' = \text{error on data} + \lambda \cdot \text{model complexity}$.
- Akaike's information criterion (AIC), Bayesian information criterion (BIC).
- Minimum description length (MDL): Kolmogorov complexity, shortest description of data.
- Structural risk minimization (SRM).

Bayesian Model Selection

Prior on models, $p(\text{model})$:
$p(\text{model}|\text{data}) = \frac{p(\text{data}|\text{model})\, p(\text{model})}{p(\text{data})}$
- Regularization, when the prior favors simpler models.
- Bayes, MAP of the posterior, $p(\text{model}|\text{data})$.
- Average over a number of models with high posterior (voting, ensembles: Chapter 17).
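Of the approaches listed, cross-validation is the easiest to sketch: fit each candidate polynomial order on part of the data and score it on the held-out part. The example below is a minimal illustration with an assumed true function, noise level and 5-fold split; none of these come from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    return 2 * np.sin(1.5 * x)          # assumed true function, for illustration only

x = rng.uniform(0, 5, 100)
r = f(x) + rng.normal(0, 1.0, x.size)

def kfold_error(order, k=5):
    """Mean squared validation error of an order-`order` polynomial under k-fold CV."""
    fold_of = np.arange(x.size) % k     # fold label per point (x is already in random order)
    errs = []
    for fold in range(k):
        train, val = fold_of != fold, fold_of == fold
        w = np.polyfit(x[train], r[train], order)          # fit on the training folds
        errs.append(((np.polyval(w, x[val]) - r[val]) ** 2).mean())
    return float(np.mean(errs))

orders = range(0, 9)
cv_error = {order: kfold_error(order) for order in orders}
best_order = min(cv_error, key=cv_error.get)
```

Training error alone would keep falling as the order grows; the validation error in `cv_error` instead stops improving once the extra flexibility only fits noise, and `best_order` picks the order at that point.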

Regression example

Coefficients increase in magnitude as order increases:
1: [-0.0769, 0.0016]
2: [0.1682, -0.6657, 0.0080]
3: [0.4238, -2.5778, 3.4675, -0.0002]
4: [-0.1093, 1.4356, -5.5007, 6.0454, -0.0019]

Chapter 5, Multivariate Methods: self-study.

Summary

- Parametric density estimation: parameter estimation, density estimation
- Parametric classification
- Parametric regression
- Model selection
