A Must-Have Tutorial on the Protein Structure Analysis Workflow

Uploaded by: m**** | Document ID: 499964476 | Uploaded: 2023-01-29 | Format: DOC | Pages: 24 | Size: 87.01 KB

o TMAP (EMBL)
o PredictProtein (EMBL/Columbia)
o TMHMM (CBS, Denmark)
o TMpred (Baylor College)
o DAS (Stockholm)

Does the protein contain coiled coils? You can predict coiled coils on the COILS server, or download the COILS program (it has recently been rewritten; note that the GCG package includes a version of COILS).

Does the protein contain low-complexity regions? Proteins often contain several polyglutamine or polyserine stretches, and these regions are hard to predict. You can check for them with SEG (the GCG package includes a version of SEG).

If any of the above applies, you should break the sequence into fragments, or ignore particular regions of the sequence, and so on. This problem is related to that of locating domains.

Searching sequence databases

The obvious first step in analysing any new sequence is to search the sequence databases for homologous sequences. Such a search can be run anywhere, on any computer. In addition, many web servers offer these searches: you type or paste a sequence into the server and receive the results interactively.

There are many methods for sequence searching; the best known at present is the BLAST program. Versions that run locally are easy to obtain (from NCBI or Washington University), and many web pages let you compare a protein or DNA sequence against databases of gene or protein sequences. To name just a few:

o National Center for Biotechnology Information (USA) Searches
o European Bioinformatics Institute (UK) Searches
o BLAST search through SBASE (domain database; ICGEB, Trieste)
o Many more sites

An important recent advance in sequence comparison is the development of gapped BLAST and PSI-BLAST (position-specific iterated BLAST), both of which make BLAST more sensitive. PSI-BLAST takes the results of one search, builds a profile from them, and then searches the database again with that profile to find further homologues (the process can be repeated until no new sequences are found). This lets it detect homologues that are very distant in evolutionary terms. It is very important to compare your protein sequence against the database with PSI-BLAST, to see whether a known structure exists, before applying the methods in the sections below.

Other methods for comparing a single sequence to a database include:

o FASTA package (William Pearson, University of Virginia, USA)
o SCANPS (Geoff Barton, European Bioinformatics Institute, UK)
o BLITZ (Compugen's fast Smith-Waterman search)
o Other methods

It is also possible to use multiple sequence information to perform more sensitive searches. Essentially, this involves building a profile from some kind of multiple sequence alignment. A profile gives a score for each type of amino acid at each position in the sequence, and generally makes searches more sensitive. Tools for doing this include:

o PSI-BLAST (NCBI, Washington)
o ProfileScan Server (ISREC, Geneva)
o HMMER, hidden Markov models (Sean Eddy, Washington University)
o Wise package (Ewan Birney, Sanger Centre; for comparing proteins to DNA)
o Other methods

A different approach for incorporating multiple sequence information into a database search is to use a motif. Instead of giving every amino acid some kind of score at every position in an alignment, a motif ignores all but the most invariant positions in an alignment, and just describes the key residues that are conserved and define the family. Sometimes this is called a signature. For example, H-[FW]-x-[LIVM]-x-G-x(5)-H-x(3)- describes a family of DNA-binding proteins. It can be translated as: histidine, followed by either phenylalanine or tryptophan, followed by any amino acid (x), followed by leucine, isoleucine, valine or methionine, followed by any amino acid (x), followed by glycine, ...

PROSITE (ExPASy Geneva) contains a huge number of such patterns, and several sites allow you to search these data:

o ExPASy
o EBI

It is best to search a few different databases in order to find as many homologues as possible. A very important step, and one that is sometimes overlooked, is to compare any new sequence to a database of sequences for which 3D structure information is available. Whether or not your sequence is homologous to a protein of known 3D structure is not obvious in the output from many searches of large sequence databases. Moreover, if the homology is weak, the similarity may not be apparent at all during the search through a larger database.

One last thing to remember is that you can save a lot of time by making use of pre-prepared protein alignments. Many of these alignments are hand-edited by experts on the particular protein families, and thus probably represent the best alignment one can get given the data they contain (i.e. they are not always as up to date as the most recent sequence databases). These databases include:

o SMART (Oxford/EMBL)
o PFAM (Sanger Centre/Wash-U/Karolinska Institutet)
o COGS (NCBI)
o PRINTS (UCL/Manchester)
o BLOCKS (Fred Hutchinson Cancer Research Centre, Seattle)
o SBASE (ICGEB, Trieste)

In general, there are many ways to compare a protein sequence with these databases, and they are very useful for identifying domains.

Determining domains

If you have a sequence of more than about 500 amino acids, you can be nearly certain that it will be divided into discrete functional domains. If possible, it is preferable to split such large proteins up and consider each domain separately. You can predict the location of domains in a few different ways. The methods below are given (approximately) from most to least confident.

If homology to other sequences occurs only over a portion of the probe sequence and the other sequences are whole (i.e. not partial sequences), then this provides the strongest evidence for domain structure.
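The partial-homology criterion for domains can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it supposes you have already extracted, for each database hit, the start and end of the aligned region on the probe sequence (1-based, inclusive) and a flag saying whether the hit is a whole sequence. The function name `candidate_domains`, the tuple layout, and the 80% coverage cutoff are hypothetical choices, not part of any tool named in this tutorial.

```python
from typing import List, Tuple

# (probe_start, probe_end, hit_is_whole_sequence); coordinates are 1-based, inclusive.
# This tuple layout is an assumption made for the sketch.
Hit = Tuple[int, int, bool]


def candidate_domains(probe_len: int, hits: List[Hit],
                      max_coverage: float = 0.8) -> List[Tuple[int, int]]:
    """Merge probe regions matched by whole-sequence hits that cover only part of the probe."""
    partial = []
    for start, end, is_whole in hits:
        if not is_whole:
            continue  # fragments in the database say little about domain limits
        if (end - start + 1) / probe_len >= max_coverage:
            continue  # hit spans (nearly) the whole probe: no domain evidence
        partial.append((start, end))

    merged: List[Tuple[int, int]] = []
    for start, end in sorted(partial):
        if merged and start <= merged[-1][1] + 1:
            # Overlapping or adjacent to the previous region: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


hits = [(10, 120, True), (15, 130, True), (300, 420, True),
        (1, 600, True), (200, 250, False)]
print(candidate_domains(600, hits))  # [(10, 130), (300, 420)]
```

The merged intervals are only candidate domain regions; the boundaries are as rough as the alignment ends they come from and should be checked against the other lines of evidence.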
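The motif notation described in the database-search section maps naturally onto regular expressions. The sketch below translates a small subset of PROSITE-style syntax (single residues, `x`, `[...]`, `{...}` and repeat counts) into a Python regex and scans a sequence with it. The function name `prosite_to_regex` and the test sequence are made up for illustration, and the pattern used is only the first six positions of the example motif, exactly as spelled out in the verbal translation (histidine, then phenylalanine or tryptophan, and so on). Real PROSITE patterns use further syntax (for example anchors) that this sketch does not handle.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a small subset of PROSITE pattern syntax into a Python regex.

    Handled: single residues, x (any residue), [ABC] (one of),
    {ABC} (none of), and repeat counts such as x(5) or x(2,4).
    """
    element_re = r"(\[[A-Z]+\]|\{[A-Z]+\}|[A-Zx])(?:\((\d+)(?:,(\d+))?\))?"
    parts = []
    for element in pattern.strip("-.").split("-"):
        m = re.fullmatch(element_re, element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        core, low, high = m.groups()
        if core == "x":
            core = "."                        # any amino acid
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"    # any residue except those listed
        if low is not None:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(core)
    return "".join(parts)


regex = prosite_to_regex("H-[FW]-x-[LIVM]-x-G")
print(regex)                                   # H[FW].[LIVM].G
print(re.search(regex, "AAHWKLQGTT").group())  # HWKLQG
```

A regex match is all-or-nothing, which is precisely the difference from a profile: a motif reports whether the key residues are present, whereas a profile assigns a graded score to every position.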
