Some Natural Language Processing (NLP) Tasks

Uploaded by: xh****66 | Document ID: 61935895 | Uploaded: 2018-12-15 | Format: PPT | Pages: 13 | Size: 112.50 KB

KDD Cup Task 2
Mark Craven
Department of Biostatistics & Medical Informatics
Department of Computer Sciences
University of Wisconsin
craven@biostat.wisc.edu
www.biostat.wisc.edu/craven

Task Motivation
molecular biology has entered a new era in which experimentation can be done in a high-throughput manner
  microarrays can simultaneously measure the “activity” of thousands of genes under some set of conditions
  yeast deletion arrays can measure the activity of some “reporter” system when each of 5k genes is knocked out
key problem: it is difficult for biologists to assimilate and interpret thousands of measurements per experiment

The Problem Domain: Characterizing the Regulatome of the AHR Signaling Pathway
experimental data kindly provided by Guang Yao and Prof. Chris Bradfield, McArdle Laboratory for Cancer Research, University of Wisconsin
the Aryl Hydrocarbon Receptor (AHR) is a member of the protein family that mediates the biological response to dioxin, hypoxia, circadian rhythm, etc.
focus of project: determine which proteins affect the activity of AHR

The AHR Signaling Pathway
when a cell is exposed to, say, dioxin, AHR acts to turn on/off various genes
experiment motivation: which proteins (gene products) in the cell regulate how AHR does this?

Characterizing the Regulatome of the AHR Signaling Pathway
a high-throughput experiment using the Yeast Deletion Array (5k strains of yeast, each with a specified gene knocked out)
for each strain
  insert a specially engineered AHR gene
  insert a “reporter” system that is activated by AHR signaling
  prod the AHR signaling pathway with a dose of agonist
  see if the reporter lights up
result: we can see which genes encode proteins that affect AHR signaling

The KDD Cup Task
key computational task: help annotate/explain the results of the experiment, using available data sources
a proxy task for KDD Cup: develop models that can predict the experimental result for a given gene from available data sources
rationale:
  annotation/explanation task not amenable to objective evaluation
  prediction task, like annotation/explanation task, involves eliciting patterns from available data that explain why individual genes behave as they do in the experiment

The KDD Cup Task
given: data describing a gene
  hierarchical (functional/localization annotation)
  relational (protein-protein interactions)
  text (scientific abstracts from MEDLINE)
do: predict if knocking out the gene will have a significant effect on AHR signaling

Characteristics of the Problem
rich data sources
much missing data
  function/localization annotations
  protein-protein interactions
  abstracts
few positive instances (127 pos, 4380 neg)
very “disjunctive”

Task Evaluation
evaluated as a two-class problem
  positive: knockout has significant effect on AHR signaling
but two different definitions of positive class
  narrow: knockout has an AHR-specific effect
  broad: knockout also affects a control pathway
the scoring metric was the sum of the area under the ROC curve (AROC) for the two class partitions

AROC Scores for All Teams

Task 2 Winning Teams
winner
  Adam Kowalczyk and Bhavani Raskutti, Telstra Research Laboratories
honorable mention
  David Vogel and Randy Axelrod, A.I. Insight Inc. and Sentara Healthcare
  Marcus Denecke, Mark-A. Krogel, Marco Landwehr and Tobias Scheffer, Magdeburg University
  George Forman, Hewlett Packard Labs
  Amal Perera, Bill Jockheck, Willy Valdivia Granda, Anne Denton, Pratap Kotala and William Perrizo, North Dakota State University

Current and Future Activity
figure out what lessons have been learned
  value of text?
  which algorithms learned most accurate models?
  etc.
determine if learned models can provide insight into the domain
write articles (task overview, descriptions of winning teams' methods) for SIGKDD Explorations
maintain public access to data set (do Google search on KDD Cup)

Acknowledgements
the experimental data was generated by Guang Yao and Prof. Chris Bradfield, McArdle Laboratory for Cancer Research, University of Wisconsin
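
A note on the scoring metric from the Task Evaluation slide: the sum of the AROC over the narrow and broad class partitions can be sketched as below. This is a minimal illustration and not the organizers' scoring code. The class labels "specific", "control" and "none", the function names, and the choice to pass one score list per partition are all assumptions made for the example. Each AROC lies between 0 and 1, so the summed score lies between 0 and 2, and a random ranking is expected to score about 1.

```python
from typing import Sequence


def auc(labels: Sequence[bool], scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the fraction of
    (positive, negative) pairs ranked correctly; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative instance")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def summed_aroc(
    classes: Sequence[str],
    narrow_scores: Sequence[float],
    broad_scores: Sequence[float],
) -> float:
    """Sum of the AROC under the two definitions of the positive class.

    Hypothetical labels (not taken from the slides):
      "specific" - knockout has an AHR-specific effect
      "control"  - knockout also affects a control pathway
      "none"     - no significant effect
    """
    narrow = [c == "specific" for c in classes]               # narrow partition
    broad = [c in ("specific", "control") for c in classes]   # broad partition
    return auc(narrow, narrow_scores) + auc(broad, broad_scores)


# Toy data: six genes, with the same scores used for both partitions.
classes = ["specific", "control", "none", "none", "specific", "none"]
scores = [0.9, 0.6, 0.2, 0.7, 0.4, 0.1]
total = summed_aroc(classes, scores, scores)
print(f"summed AROC = {total:.4f}")
```

The pairwise form of the AROC is used here because it needs no threshold sweep and stays well defined with very few positives; at the data set size given on the slides (127 positives, 4380 negatives) the double loop is still cheap.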
