Deep Learning Applications in Natural Language Processing


Word Embedding: An Introduction and Its Application in Sentence Parsing
Yong Jiang
School of Information Science and Technology, ShanghaiTech University
May 15, 2015

Outline
1. Traditional Word Representation
   - One-hot Vector
   - Class-based Word Representations
2. SVD Based Methods
3. Iteration Based Methods: Representation
   - Language Models
   - Simple Neural Network Model
   - CBOW
   - Skip-Gram Model
   - SENNA
4. Iteration Based Methods: Learning
5. Parsing with Word Vectors
   - Parsing with Recursive Neural Network
   - Parsing with Compositional Vector Grammar (CVG)
6. Possible Research Topics

Traditional Word Representation: One-hot Vector

In traditional NLP tasks, the one-hot vector is the most widely used representation.
"I" = [1, 0, 0, ..., 0, 0]
"love" = [0, 0, 1, ..., 0, 0]
"ShanghaiTech" = [0, 0, ..., 1, 0]
"University" = [0, 0, ..., 0, 1]
Advantage: each dimension denotes a distinct word.
Disadvantage: the dimensionality becomes very high for a large corpus, and the representation cannot capture word similarity:
V_I · V_love = 0 = V_ShanghaiTech · V_University
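The zero-similarity problem can be seen directly in code. A minimal sketch with NumPy, assuming a toy four-word vocabulary (the slides' real vocabulary would be the whole corpus dictionary):

```python
import numpy as np

# Toy vocabulary (an assumption for illustration).
vocab = ["I", "love", "ShanghaiTech", "University"]
index = {w: i for i, w in enumerate(vocab)}

def one_hot(word):
    """Return the one-hot vector for a word: all zeros except
    a 1 at the word's index in the vocabulary."""
    v = np.zeros(len(vocab))
    v[index[word]] = 1.0
    return v

# Any two distinct words have dot product 0, so one-hot vectors
# carry no similarity information at all.
print(one_hot("I") @ one_hot("love"))                   # 0.0
print(one_hot("ShanghaiTech") @ one_hot("University"))  # 0.0
```

The vector length equals the vocabulary size, which is why the dimensionality explodes on a large corpus.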

Traditional Word Representation: Class-based Word Representations

Class-based word representations often refer to methods such as LSA and LDA.

Figure: Latent Dirichlet Allocation (topic model). α and β are hyper-parameters, fixed during training; θ and Z are hidden variables that we want to infer; we only observe the words W.

Blei, David M., Andrew Y. Ng, and Michael I. Jordan. "Latent Dirichlet Allocation." JMLR (2003).
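The split between fixed hyper-parameters, hidden variables, and observed words can be illustrated by sampling from LDA's generative process. A sketch with NumPy; the sizes and Dirichlet values below are assumptions for illustration, not from the slides:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes (assumptions): K topics, V vocabulary words, one
# document of doc_len words.
K, V, doc_len = 2, 5, 8
alpha = np.full(K, 0.5)  # hyper-parameter: prior on topic mixtures
beta = np.full(V, 0.1)   # hyper-parameter: prior on topic-word dists

# alpha and beta are fixed; theta and z are hidden; only w is observed.
phi = rng.dirichlet(beta, size=K)      # per-topic word distribution
theta = rng.dirichlet(alpha)           # document's topic mixture (hidden)
z = rng.choice(K, size=doc_len, p=theta)             # hidden topic per word
w = np.array([rng.choice(V, p=phi[t]) for t in z])   # observed word ids
print(w)
```

Inference reverses this process: given only w, recover posterior estimates of theta and z.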

SVD Based Methods

The intuition is that we want the dimensionality of each word vector to be much smaller than the size of the dictionary of the entire corpus.
1. Use the word co-occurrence counts of a dataset to build a matrix X.
2. Perform Singular Value Decomposition on X to get X = U S V^T.
3. Select the first k columns of U to get a k-dimensional vector for each word.

SVD Based Methods: An Example

Example corpus:
1. I like you.
2. I love you.
3. I hate her.

Figure: Co-occurrence matrix.
Figure: Visualization of the SVD method.
Reference: cs224d.stanford.edu
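The three steps above can be sketched with NumPy, assuming the example corpus and a co-occurrence window of one neighboring word (the slide's exact windowing is not shown):

```python
import numpy as np

# Step 0: the slide's toy corpus.
corpus = ["I like you", "I love you", "I hate her"]
vocab = sorted({w for s in corpus for w in s.split()})
idx = {w: i for i, w in enumerate(vocab)}

# Step 1: symmetric co-occurrence counts -> matrix X (window = 1).
X = np.zeros((len(vocab), len(vocab)))
for s in corpus:
    words = s.split()
    for a, b in zip(words, words[1:]):
        X[idx[a], idx[b]] += 1
        X[idx[b], idx[a]] += 1

# Step 2: SVD, X = U S V^T.
U, S, Vt = np.linalg.svd(X)

# Step 3: the first k columns of U give each word a k-dim vector.
k = 2
embeddings = U[:, :k]
print({w: embeddings[idx[w]] for w in vocab})
```

Truncating to the top-k singular directions keeps the dimensions that explain most of the co-occurrence structure, which is what makes the vectors compact.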

Iteration Based Methods: Representation

Language Models
Recall: N-gram language models. For a sentence like: I a
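The slide text is cut off here, but the core computation of an N-gram language model is a conditional word probability estimated from counts. A minimal bigram sketch, assuming maximum-likelihood estimation over a toy corpus (both are assumptions, not from the slides):

```python
from collections import Counter

# Toy corpus (an assumption for illustration).
sentences = [["I", "like", "you"],
             ["I", "love", "you"],
             ["I", "hate", "her"]]

unigram = Counter(w for s in sentences for w in s)
bigram = Counter(p for s in sentences for p in zip(s, s[1:]))

def p_bigram(w2, w1):
    """Maximum-likelihood bigram probability P(w2 | w1)."""
    return bigram[(w1, w2)] / unigram[w1]

# P("love" | "I") = count(I love) / count(I) = 1/3
print(p_bigram("love", "I"))
```

A full model would multiply these conditionals across a sentence and smooth the counts to handle unseen bigrams.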
