Fast Deep Learning, Big Data, and Extreme Learning Machines (Introduction)


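Neural Network: BP

For a single output unit, the error is

$$E = \frac{1}{2}(d_i - y_i)^2$$

where $d_i$ is the desired output and $y_i$ is the NN output. With $y_i = f(W_i^T X)$:

$$E = \frac{1}{2}\left(d_i - f(W_i^T X)\right)^2$$

$$\frac{\partial E}{\partial w_{ij}} = -(d_i - y_i)\, f'(W_i^T X)\, x_j, \quad j = 1, 2, \ldots, n$$

$$\Delta w_{ij} = -\eta\, \frac{\partial E}{\partial w_{ij}}$$

where $\eta$ is the learning rate.

Drawbacks of BP:
gradient-based learning algorithms are slow;
all the parameters of the network are tuned iteratively;
it is prone to local minima, an improper learning rate, and overfitting;
it works only for differentiable activation functions.

ELM

Training set: $(x_j, y_j),\ j = 1, 2, 3, \ldots, N$

Activation function (sigmoid): $g(x) = \dfrac{1}{1 + e^{-x}}$

The ELM model is as follows:

$$\sum_{i=1}^{L} \beta_i\, g_i(x_j) = \sum_{i=1}^{L} \beta_i\, g(w_i \cdot x_j + b_i) = y_j, \quad j = 1, 2, 3, \ldots, N$$

Input layer weight: $w_i$
Output layer weight: $\beta_i$
Number of hidden layer nodes: $L$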
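The ELM model above is a single hidden layer followed by a linear output layer. The following is a minimal NumPy sketch of that forward computation, not code from the slides. The function names, array shapes, random seed, and toy sizes are assumptions chosen for illustration only.

```python
import numpy as np


def sigmoid(z):
    """Logistic activation g(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))


def hidden_outputs(X, W, b):
    """Return the N x L matrix whose (j, i) entry is g(w_i . x_j + b_i)."""
    return sigmoid(X @ W.T + b)


def elm_forward(X, W, b, beta):
    """Network output sum_i beta_i * g(w_i . x_j + b_i) for every sample x_j."""
    return hidden_outputs(X, W, b) @ beta


rng = np.random.default_rng(0)      # fixed seed, only for reproducibility
N, n_inputs, L, M = 5, 3, 4, 2      # illustrative sizes, not from the source
X = rng.normal(size=(N, n_inputs))  # one sample x_j per row
W = rng.normal(size=(L, n_inputs))  # input-layer weights w_i, one row per hidden node
b = rng.normal(size=L)              # hidden-node biases b_i
beta = rng.normal(size=(L, M))      # output-layer weights beta_i, one row per hidden node
Y_hat = elm_forward(X, W, b, beta)  # shape (N, M)
```

Here `beta` is random only to exercise the forward pass; in ELM it is the one quantity that is fitted, while `W` and `b` stay at their random values.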

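In matrix form, the ELM model is written $H\beta = Y$, where

$$H(w_1, \ldots, w_L;\ b_1, \ldots, b_L;\ x_1, \ldots, x_N) = \begin{bmatrix} g(w_1 \cdot x_1 + b_1) & g(w_2 \cdot x_1 + b_2) & \cdots & g(w_L \cdot x_1 + b_L) \\ g(w_1 \cdot x_2 + b_1) & g(w_2 \cdot x_2 + b_2) & \cdots & g(w_L \cdot x_2 + b_L) \\ \vdots & \vdots & & \vdots \\ g(w_1 \cdot x_N + b_1) & g(w_2 \cdot x_N + b_2) & \cdots & g(w_L \cdot x_N + b_L) \end{bmatrix}_{N \times L}$$

$$\beta = \begin{bmatrix} \beta_1^T \\ \beta_2^T \\ \vdots \\ \beta_L^T \end{bmatrix}_{L \times M}, \qquad Y = \begin{bmatrix} y_1^T \\ y_2^T \\ \vdots \\ y_N^T \end{bmatrix}_{N \times M}$$

If $w$ and $b$ are given randomly, the output weights can be determined analytically. When $H$ is square and invertible,

$$\beta = H^{-1} Y$$

and in general, using the Moore-Penrose generalized inverse $H^{+}$,

$$\beta = H^{+} Y$$

The only setting chosen by hand is the number of hidden layer nodes, $L$.

ELM advantages:
batch training with extremely fast learning speed;
better generalization performance;
it adopts the simplest method to overcome local minima, an improper learning rate, and overfitting;
it works for both differentiable and non-differentiable activation functions;
it has a mathematical foundation.

ELM disadvantages:
the number of hidden layer nodes is set by hand;
$H$ is generally a non-square matrix.

Geoffrey Hinton | Deep learning: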

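many hidden layers, layer-wise initialization, unsupervised learning

Deep Learning

Problems with the BP algorithm:
(1) Gradients become increasingly sparse: moving down from the top layer, the error-correction signal gets smaller and smaller.
(2) Convergence to a local minimum, especially when starting far from the optimal region (random initialization can cause this).
(3) In general, training is possible only with labeled data, yet most data are unlabeled, whereas the brain can learn from unlabeled data.

Auto Encoder

An autoencoder is a neural network that reproduces its input signal as closely as possible.

[Figure labels: single-layer forward training; Auto Encoder; other deep learning network structures; layer-wise forward training; one-step feedback training; step-by-step feedback training]

ELM-AE

The target output is the same as the input $x$: $y = x$.
The hidden node parameters are made orthogonal after being randomly generated: $w^T w = I$, $b^T b = 1$.

ELM-AE output weights:

$$\beta = H^{-1} Y$$

$$\beta = \left(\frac{I}{C} + H^T H\right)^{-1} H^T Y$$

$$\beta^T \beta = I, \qquad H\beta = X, \qquad X = Y$$

ELM-AE and PCA: $H = X\beta^T$ (ELM-AE), $T = P^T X$ (PCA).

[Figure labels: PCA; Deep Learning; $\beta = H^{-1} Y$; layer sizes 784, 20, 784]

Is deep learning this simple? | Deep learning is not this simple: large and complex models, heavy training workloads, convergence, parallel computing

Hadoop | Divide and conquer: making the elephant dance

Hadoop architecture | Quite a big project!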
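The ELM-AE recipe described earlier has three parts: hidden parameters that are randomly generated and then made orthogonal, a target equal to the input, and output weights obtained in closed form. It can be sketched in NumPy as below. This is a minimal illustration under stated assumptions, not the slides' implementation: the function names, the QR-based orthogonalization, the default `C`, the seeds, and the toy data sizes are all hypothetical, and the sketch assumes the number of hidden nodes does not exceed the input dimension.

```python
import numpy as np


def sigmoid(z):
    """Logistic activation g(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))


def output_weights(H, T, C=None):
    """Least-squares output weights for H @ beta = T.

    C=None uses the Moore-Penrose pseudoinverse (beta = H^+ T);
    otherwise the regularized form beta = (I/C + H^T H)^-1 H^T T.
    """
    if C is None:
        return np.linalg.pinv(H) @ T
    L = H.shape[1]
    return np.linalg.solve(np.eye(L) / C + H.T @ H, H.T @ T)


def elm_ae_fit(X, L, C=1e3, seed=0):
    """Fit an ELM autoencoder with L hidden nodes (assumes L <= X.shape[1])."""
    rng = np.random.default_rng(seed)
    n = X.shape[1]
    # Random weights, orthonormalized with QR so that W @ W.T = I_L.
    Q, _ = np.linalg.qr(rng.normal(size=(n, L)))
    W = Q.T
    b = rng.normal(size=L)
    b /= np.linalg.norm(b)          # enforce b^T b = 1
    H = sigmoid(X @ W.T + b)        # hidden-layer output matrix, N x L
    beta = output_weights(H, X, C)  # target output equals the input
    return W, b, beta


rng = np.random.default_rng(1)
X = rng.normal(size=(200, 8))        # toy data; sizes are illustrative only
W, b, beta = elm_ae_fit(X, L=4)
X_rec = sigmoid(X @ W.T + b) @ beta  # reconstruction of the input
```

Calling `output_weights(H, Y)` without `C` gives the plain ELM solution $\beta = H^{+} Y$ for a supervised target `Y`. No iterative tuning is involved in either case, which is the source of the speed advantage claimed for ELM.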
