Information: The Language of Biology (English bioinformatics slide template)


Information: The Language of Biology
Gary Strong
NSF, ITR Program

Suggestive Biology-Language Homologies
Cell
Human Language

Goals Leading Toward Predictive Biology
Gene Sequence Data
Gene Identification
Protein Circuit & Regulatory Network Discovery
Biosimulation
Structure Prediction

Natural Language Processing and Bioinformatics are Already Related
Both NL and biology are faced with data mining over massive amounts of data
Applying NLP tools to biology
  Hidden Markov gene finders
  Protein grammars to predict function from sequence
  Protein circuit extraction from scientific literature
Convergence of biological and language mining
  PSI-BLAST homology searching in genome augmented by medical literature

Model-based GENSCAN Is Best Among HMM Gene Finders
Sn = Sensitivity
Sp = Specificity
Ac = Approximate Correlation
ME = Missing Exons
WE = Wrong Exons
GENSCAN Performance Data, http://genes.mit.edu/Accuracy.html

The Chomsky Hierarchy
Regular Languages: Automaton: Finite-State Machine; Grammar: Regular, A → cA; Recognition: Linear; Dependency: Strictly Local; Biology: Central Dogma
Context-Free Languages: Automaton: Pushdown (stack); Grammar: Context-Free, S → gSc; Recognition: Polynomial; Dependency: Nested; Biology: Orthodox 2° Structure
Context-Sensitive Languages: Automaton: Linear-Bounded; Grammar: Context-Sensitive, At → aA; Recognition: NP-Complete; Dependency: Crossing; Biology: Pseudoknots, etc.
Recursively Enumerable Languages: Automaton: Turing Machine; Grammar: Unrestricted, Baa → A; Recognition: Undecidable; Dependency: Arbitrary; Biology: Unknown
From D. Searls

Mildly CSGs for Structure Modeling
Yasuo UEMURA et al.
Tree Adjunct Grammars (TAG) have been applied to modeling RNA secondary structures including pseudoknots. An efficient parsing algorithm for this grammar was developed, and applied to some computational problems concerning RNA secondary structures. Further, a (-1) frame shift grammar is constructed based on a biological observation that a (-1) frame shift might be caused from some structural features of RNA sequences. The proposed grammar was used to find candidate sequences for (-1) frame shift in Human spumaretrovirus gag and pol genes.

Phenylalanine tRNA of Yeast
secondary structure: Complementary base-pair interactions are shown by short gray bars.
tertiary structure: Complementary base-pair interactions are shown by long gray bars.
Unusual base-pair interactions shown in red (which are crossed dependencies) lead to complex structures common in proteins
Structural similarity of unusual base-pair interactions to long-distance relations in NL

Data Mining for Proteins and Biological Function
Regulatory pathway research is growing exponentially, generating huge amounts of information. “Cell cycle” and “apoptosis” produced 169,293 and 29,961 hits respectively on PubMed.
Going through articles manually is prohibitively time-consuming.
Investigators can benefit from systematic compilation and integration of 1000s of pieces of discrete information on signal transduction.
Comparisons across species can be useful for cross validation and as an aid in hypothesis generation about yet undiscovered genes.
Automatic techniques can be applied to the literature with enormous benefit - qualitative model construction, for example.

automatic information extraction
Input text:
  ... staurosporine activates a JNK isoform ...
  ... STAT4 is not activated by IL-2 ...
Extracted ACTIVATION records:
  staurosporine + a JNK isoform
  IL-2 not STAT4

Some chemical reactions in a cell:
As many as 10,000 proteins act as regulatory enzymes in mammals.

Including Biological Literature Improves Homology Search
Jeffrey T Chang, Soumya Raychaudhuri, and Russ B Altman
Stanford Medical Informatics, Stanford University

NLP Approaches to Biological Data Modeling
Gene Identification:
  Identification of exons, introns, and non-coding regions (approximately 10 exons per gene and 30,000 genes from Human Genome data)
  Sequence analysis and annotation of homologies, e.g. sequences that lead to similar 3D structural features that have implications for activity of the protein
  Evolutionary relationships across species
Structure prediction:
  Context sensitive grammars to predict structure from coding sequences
  Potential for modeling dynamic proteins, such as those that exhibit antigenic variability
  Computational models of protein evolution, e.g. for structure prediction from SNP sequences
  Combinatorial models for non-coding DNA role in regulatory circuits, such as timing of gene co-expression
Circuit Discovery:
  Tools for assigning meaning to protein interaction patterns
  Template extraction of protein-protein interactions within cells from scientific literature
  Construction of signal transduction networks
  Prediction of function effects from network disruption

Important Programmatic Dimensions from DARPA NLP Program POV
Components of a Successful Program
Data need to be shared across research community, implying annotation standards.
Objective, community-wide evaluations on common tasks are needed to gauge progress.
Technology transfer is not just an end-product of research, but a constant driver.
Large, challenge problems bring government resources from multiple agencies.
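The automatic information extraction example in the deck, which turns "staurosporine activates a JNK isoform" and "STAT4 is not activated by IL-2" into ACTIVATION records, can be illustrated with a small sketch. This is only an illustration under stated assumptions, not the method behind the slides: the `Activation` record, the `extract_activation` function, and the two regular-expression patterns are hypothetical names introduced here, and a real system would rely on a full NLP pipeline rather than two hand-written patterns.

```python
import re
from typing import NamedTuple, Optional


class Activation(NamedTuple):
    """Hypothetical ACTIVATION record (field names are assumptions)."""
    agent: str     # entity doing the activating
    target: str    # entity that is (or is not) activated
    negated: bool  # True when the sentence denies the activation


# Passive voice: "<target> is [not] activated by <agent>"
PASSIVE = re.compile(
    r"^(?P<target>.+?)\s+is\s+(?P<neg>not\s+)?activated\s+by\s+(?P<agent>.+)$",
    re.IGNORECASE,
)
# Active voice: "<agent> activates <target>" or "<agent> does not activate <target>"
ACTIVE = re.compile(
    r"^(?P<agent>.+?)\s+(?:(?P<neg>does\s+not)\s+activate|activates)\s+(?P<target>.+)$",
    re.IGNORECASE,
)


def extract_activation(sentence: str) -> Optional[Activation]:
    """Return an Activation record for one sentence, or None if no pattern fits."""
    # Drop surrounding whitespace, periods, and ellipses left over from the excerpt.
    cleaned = sentence.strip(" .…")
    for pattern in (PASSIVE, ACTIVE):
        match = pattern.match(cleaned)
        if match:
            return Activation(
                agent=match.group("agent").strip(),
                target=match.group("target").strip(),
                negated=match.group("neg") is not None,
            )
    return None


if __name__ == "__main__":
    for text in ("... staurosporine activates a JNK isoform ...",
                 "... STAT4 is not activated by IL-2 ..."):
        print(extract_activation(text))
```

Each sentence yields one record with an agent, a target, and a negation flag, mirroring the "staurosporine + a JNK isoform" and "IL-2 not STAT4" rows on the slide. Sentences that match neither pattern return `None`, which is why hand-written patterns alone would miss most of the phrasing found in real literature.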
