Inner Mongolia University: Gene Cloning and DNA Analysis, 12 Studying Genomes

Uploaded by: 东*** | Document ID: 270893588 | Uploaded: 2022-03-27 | Format: PDF | Pages: 18 | Size: 548.80 KB


Chapter 12
Studying Genomes

Gene Cloning and DNA Analysis: An Introduction. 6th edition. By T.A. Brown. Published 2010 by Blackwell Publishing.

Chapter contents
12.1 Genome annotation
12.2 Studies of the transcriptome and proteome

At the start of the 21st century the emphasis in molecular biology shifted from the study of individual genes to the study of entire genomes. This change in emphasis was prompted by the development during the 1990s of methods for sequencing large genomes. Genome sequencing predates the 1990s (we saw in Chapter 10 how the first genome, that of the phage ΦX174, was completed in 1975), but it was not until 20 years later, in 1995, that the first genome of a free-living organism, the bacterium Haemophilus influenzae, was completely sequenced. The next five years were a watershed, with the genome sequences of almost 50 other bacteria published, along with complete sequences for the much larger genomes of yeast, fruit fly, Caenorhabditis elegans (a nematode worm), Arabidopsis thaliana (a plant), and humans. Today, the sequencing of bacterial genomes has become routine, with over 900 completed, and almost 100 eukaryotic genomes have also been sequenced.

Genome sequencing has led to the development of a new area of DNA research, loosely called post-genomics or functional genomics. Post-genomics includes the use of computer systems in genome annotation, the process by which the genes, control sequences, and other interesting features are identified in a genome sequence, as well as computer-based and experimental techniques aimed at determining the functions of any unknown genes that are discovered. Post-genomics also encompasses techniques designed to identify which genes are expressed in a particular type of cell or tissue, and how this pattern of genome expression changes over time.

12.1 Genome annotation

Once a genome sequence has been completed, the next step is to locate all the genes and determine their functions. It is in this area that bioinformatics, sometimes referred to as molecular biology in silico, is proving of major value as an adjunct to conventional experiments.

Genome annotation is a far from trivial process, even with genomes that have been extensively studied by genetic analysis and gene cloning techniques prior to complete sequencing. For example, the sequence of the yeast Saccharomyces cerevisiae, one of the best studied of all organisms, revealed that this genome contains about 6000 genes. Of these, some 3600 could be assigned a function either on the basis of previous studies that had been carried out with yeast or because the yeast gene had a similar sequence to a gene that had been studied in another organism. This left 2400 genes whose functions were not known. Despite a massive amount of work since the yeast genome was completed in 1996, the functions of many of these orphans have still not been determined.

12.1.1 Identifying the genes in a genome sequence

Locating a gene in a genome sequence is easy if the amino acid sequence of the protein product is known, allowing the nucleotide sequence of the gene to be predicted, or if the corresponding cDNA has been previously sequenced. But for many genes there is no prior information that enables the correct DNA sequence to be recognized. How can these genes be located in a genome sequence?

Searching for open reading frames

The DNA sequence of a gene is an open reading frame (ORF), a series of nucleotide triplets beginning with an initiation codon (usually but not always ATG) and ending in a termination codon (TAA, TAG, or TGA in most genomes). Searching a genome sequence for ORFs, by eye or more usually by computer, is therefore the first step in gene location. When carrying out the search it is important to remember that each DNA sequence has six reading frames, three in one direction and three in the reverse direction on the complementary strand (Figure 12.1).

Figure 12.1 A double-stranded DNA molecule has six reading frames.

The key to the success of ORF scanning is the frequency with which termination codons appear in the DNA sequence. If the DNA has a random sequence and a GC content of 50%, then each of the three termination codons will appear, on average, once every 4³ = 64 bp. This means that there should not be many ORFs longer than 30–40 codons in random DNA, and not all of these ORFs will start with ATG. Most genes are much longer than this: the average lengths are 317 codons for Escherichia coli, 483 codons for S. cerevisiae, and approximately 450 codons for humans. ORF scanning, in its simplest form, therefore takes a figure of 100 codons as the shortest length of a putative gene and records positive hits for all ORFs longer than this.

With bacterial genomes, simple ORF scanning is an effective way of locating most of the genes in a DNA sequence. Most bacterial genes are much longer than 100 codons …
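The simple ORF scan described above can be sketched in a few lines of Python. This is an illustrative sketch only, not code from the textbook: the function names (`reverse_complement`, `orfs_in_frame`, `scan_orfs`) and the coordinate convention are assumptions made for this example. It treats only ATG as an initiation codon, counts ORF length in codons from the ATG up to but not including the termination codon, and reports hits of at least `min_codons` (100 by default, the threshold mentioned in the text).

```python
# Minimal six-frame ORF scan (illustrative; all names are hypothetical).
STOP_CODONS = {"TAA", "TAG", "TGA"}   # termination codons in most genomes
START_CODON = "ATG"                   # assumption: only ATG starts an ORF
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, read in its own 5'->3' direction."""
    return seq.translate(_COMPLEMENT)[::-1]


def orfs_in_frame(seq, frame, min_codons):
    """Yield (start, end) for each ORF in one reading frame of one strand.

    start is the index of the ATG; end is the index just past the stop codon.
    Only the first ATG after the previous stop is used, so each stop codon
    contributes at most one (the longest) ORF in this frame.
    """
    start = None
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None:
            if codon == START_CODON:
                start = i
        elif codon in STOP_CODONS:
            n_codons = (i - start) // 3  # codons before the stop, ATG included
            if n_codons >= min_codons:
                yield start, i + 3
            start = None


def scan_orfs(seq, min_codons=100):
    """Scan all six reading frames; return (strand, frame, start, end) hits.

    Coordinates for the "-" strand are relative to the reverse complement,
    not to the input sequence.
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            for start, end in orfs_in_frame(s, frame, min_codons):
                hits.append((strand, frame, start, end))
    return hits
```

For example, `scan_orfs(genome_sequence)` would list every ORF of 100 codons or more in a sequence held in a (hypothetical) string `genome_sequence`; lowering `min_codons` shows how quickly short, probably spurious ORFs accumulate. Real gene finders also handle alternative initiation codons and, for eukaryotes, introns, which this sketch ignores.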
