William Stallings
Computer Organization and Architecture
7th Edition

Chapter 2
Computer Evolution and Performance

ENIAC - background
- Electronic Numerical Integrator And Computer
- Eckert and Mauchly
- University of Pennsylvania
- Trajectory tables for weapons
- Started 1943
- Finished 1946
  - Too late for war effort
- Used until 1955

ENIAC - The first electronic computer (1946)

ENIAC - details
- Decimal (not binary)
- 20 accumulators of 10 digits
- Programmed manually by switches
- 18,000 vacuum tubes
- 30 tons
- 1,500 square feet
- 140 kW power consumption
- 5,000 additions per second

von Neumann/Turing
- Stored program concept
- Main memory storing programs and data
- ALU operating on binary data
- Control unit interpreting instructions from memory and executing them
- Input and output equipment operated by control unit
- Princeton Institute for Advanced Study
  - IAS
- Completed 1952

Structure of von Neumann machine

IAS - details
- 1000 x 40 bit words
  - Binary number
  - 2 x 20 bit instructions
- Set of registers (storage in CPU)
  - Memory Buffer Register
  - Memory Address Register
  - Instruction Register
  - Instruction Buffer Register
  - Program Counter
  - Accumulator
  - Multiplier Quotient

Structure of IAS - detail

Commercial Computers
- 1947 - Eckert-Mauchly Computer Corporation
- UNIVAC I (Universal Automatic Computer)
- US Bureau of Census 1950 calculations
- Became part of Sperry-Rand Corporation
- Late 1950s - UNIVAC II
  - Faster
  - More memory

IBM
- Punched-card processing equipment
- 1953 - the 701
  - IBM's first stored program computer
  - Scientific calculations
- 1955 - the 702
  - Business applications
- Led to 700/7000 series

Transistors
- Replaced vacuum tubes
- Smaller
- Cheaper
- Less heat dissipation
- Solid state device
- Made from silicon (sand)
- Invented 1947 at Bell Labs
- William Shockley et al.

Transistor Based Computers
- Second generation machines
- NCR & RCA produced small transistor machines
- IBM 7000
- DEC - 1957
  - Produced PDP-1

Microelectronics
- Literally - “small electronics”
- A computer is made up of gates, memory cells and interconnections
- These can be manufactured on a semiconductor
- e.g. silicon wafer

Generations of Computer
- Vacuum tube - 1946-1957
- Transistor - 1958-1964
- Small scale integration - 1965 on
  - Up to 100 devices on a chip
- Medium scale integration - to 1971
  - 100-3,000 devices on a chip
- Large scale integration - 1971-1977
  - 3,000-100,000 devices on a chip
- Very large scale integration - 1978-1991
  - 100,000-100,000,000 devices on a chip
- Ultra large scale integration - 1991 on
  - Over 100,000,000 devices on a chip

Moore's Law
- Increased density of components on chip
- Gordon Moore - co-founder of Intel
- Number of transistors on a chip will double every year
- Since 1970s development has slowed a little
  - Number of transistors doubles every 18 months
- Cost of a chip has remained almost unchanged
- Higher packing density means shorter electrical paths, giving higher performance
- Smaller size gives increased flexibility
- Reduced power and cooling requirements
- Fewer interconnections increases reliability

Growth in CPU Transistor Count

IBM 360 series
- 1964
- Replaced (& not compatible with) 7000 series
- First planned “family” of computers
  - Similar or identical instruction sets
  - Similar or identical O/S
  - Increasing speed
  - Increasing number of I/O ports (i.e. more terminals)
  - Increased memory size
  - Increased cost
- Multiplexed switch structure

DEC PDP-8
- 1964
- First minicomputer (after miniskirt!)
- Did not need air conditioned room
- Small enough to sit on a lab bench
- $16,000
  - $100k+ for IBM 360
- Embedded applications & OEM
- Bus structure

DEC - PDP-8 Bus Structure

Semiconductor Memory
- 1970
- Fairchild
- Size of a single core
  - i.e. 1 bit of magnetic core storage
- Holds 256 bits
- Non-destructive read
- Much faster than core
- Capacity approximately doubles each year

Intel
- 1971 - 4004
  - First microprocessor
  - All CPU components on a single chip
  - 4 bit
- Followed in 1972 by 8008
  - 8 bit
  - Both designed for specific applications
- 1974 - 8080
  - Intel's first general purpose microprocessor

Speeding it up
- Pipelining
- On board cache
- On board L1 & L2 cache
- Branch prediction
- Data flow analysis
- Speculative execution

Performance Balance
- Processor speed increased
- Memory capacity increased
- Memory speed lags behind processor speed

Logic and Memory Performance Gap

Solutions
- Increase number of bits retrieved at one time
  - Make DRAM “wider” rather than “deeper”
- Change DRAM interface
  - Cache
- Reduce frequency of memory access
  - More complex cache and cache on chip
- Increase interconnection bandwidth
  - High speed buses
  - Hierarchy of buses

I/O Devices
- Peripherals with intensive I/O demands
- Large data throughput demands
- Processors can handle this
- Problem moving data
- Solutions:
  - Caching
  - Buffering
  - Higher-speed interconnection buses
  - More elaborate bus structures
  - Multiple-processor configurations

Typical I/O Device Data Rates

Key is Balance
- Processor components
- Main memory
- I/O devices
- Interconnection structures

Improvements in Chip Organization and Architecture
- Increase hardware speed of processor
  - Fundamentally due to shrinking logic gate size
    - More gates, packed more tightly, increasing clock rate
    - Propagation time for signals reduced
- Increase size and speed of caches
  - Dedicating part of processor chip
    - Cache access times drop significantly
- Change processor organization and architecture
  - Increase effective speed of execution
  - Parallelism

Problems with Clock Speed and Logic Density
- Power
  - Power density increases with density of logic and clock speed
  - Dissipating heat
- RC delay
  - Speed at which electrons flow limited by resistance and capacitance of metal wires connecting them
  - Delay increases as RC product increases
  - Wire interconnects thinner, increasing resistance
  - Wires closer together, increasing capacitance
- Memory latency
  - Memory speeds lag processor speeds
- Solution:
  - More emphasis on organizational and architectural approaches

Intel Microprocessor Performance

Increased Cache Capacity
- Typically two or three levels of cache between processor and main memory
- Chip density increased
  - More cache memory on chip
    - Faster cache access
- Pentium chip devoted about 10% of chip area to cache
- Pentium 4 devotes about 50%

More Complex Execution Logic
- Enable parallel execution of instructions
- Pipeline works like assembly line
  - Different stages of execution of different instructions at same time along pipeline
- Superscalar allows multiple pipelines within single processor
  - Instructions that do not depend on one another can be executed in parallel

Diminishing Returns
- Internal organization of processors complex
  - Can get a great deal of parallelism
  - Further significant increases likely to be relatively modest
- Benefits from cache are reaching limit
- Increasing clock rate runs into power dissipation problem
  - Some fundamental physical limits are being reached

New Approach - Multiple Cores
- Multiple processors on single chip
  - Large shared cache
- Within a processor, increase in performance proportional to square root of increase in complexity
- If software can use multiple processors, doubling number of processors almost doubles performance
- So, use two simpler processors on the chip rather than one more complex processor
- With two processors, larger caches are justified
  - Power consumption of memory logic less than processing logic
- Example: IBM POWER4
  - Two cores based on PowerPC

POWER4 Chip Organization

Pentium Evolution (1)
- 8080
  - first general purpose microprocessor
  - 8 bit data path
  - Used in first personal computer - Altair
- 8086
  - much more powerful
  - 16 bit
  - instruction cache, prefetch few instructions
  - 8088 (8 bit external bus) used in first IBM PC
- 80286
  - 16 Mbyte memory addressable
  - up from 1 Mbyte
- 80386
  - 32 bit
  - Support for multitasking

Pentium Evolution (2)
- 80486
  - sophisticated powerful cache and instruction pipelining
  - built in maths co-processor
- Pentium
  - Superscalar
  - Multiple instructions executed in parallel
- Pentium Pro
  - Increased superscalar organization
  - Aggressive register renaming
  - branch prediction
  - data flow analysis
  - speculative execution

Pentium Evolution (3)
- Pentium II
  - MMX technology
  - graphics, video & audio processing
- Pentium III
  - Additional floating point instructions for 3D graphics
- Pentium 4
  - Note Arabic rather than Roman numerals
  - Further floating point and multimedia enhancements
- Itanium
  - 64 bit
  - see chapter 15
- Itanium 2
  - Hardware enhancements to increase speed
- See Intel web pages for detailed information on processors

PowerPC
- 1975, 801 minicomputer project (IBM) RISC
- Berkeley RISC I processor
- 1986, IBM commercial RISC workstation product, RT PC
  - Not commercial success
  - Many rivals with comparable or better performance
- 1990, IBM RISC System/6000
  - RISC-like superscalar machine
  - POWER architecture
- IBM alliance with Motorola (68000 microprocessors), and Apple (used 68000 in Macintosh)
- Result is PowerPC architecture
  - Derived from the POWER architecture
  - Superscalar RISC
  - Apple Macintosh
  - Embedded chip applications

PowerPC Family (1)
- 601:
  - Quickly to market. 32-bit machine
- 603:
  - Low-end desktop and portable
  - 32-bit
  - Comparable performance with 601
  - Lower cost and more efficient implementation
- 604:
  - Desktop and low-end servers
  - 32-bit machine
  - Much more advanced superscalar design
  - Greater performance
- 620:
  - High-end servers
  - 64-bit architecture

PowerPC Family (2)
- 740/750:
  - Also known as G3
  - Two levels of cache on chip
- G4:
  - Increases parallelism and internal speed
- G5:
  - Improvements in parallelism and internal speed
  - 64-bit organization

Internet Resources
- http:/
  - Search for the Intel Museum
- http:/
- http:/
- Charles Babbage Institute
- PowerPC
- Intel Developer Home
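
Worked example - Multiple Cores
The multiple-cores slide argues that, within one processor, performance grows roughly with the square root of complexity, while doubling the number of processors almost doubles performance if software can use them. The sketch below only puts numbers on that rule of thumb. The function names and the parallel_efficiency value of 0.9 are illustrative assumptions, not figures from the slides.

```python
import math


def single_core_speedup(complexity_factor: float) -> float:
    """Rule of thumb from the slides: performance is proportional to
    the square root of the increase in complexity."""
    return math.sqrt(complexity_factor)


def multi_core_speedup(cores: int, parallel_efficiency: float = 1.0) -> float:
    """Idealized model (an assumption, not from the slides): every core after
    the first adds `parallel_efficiency` of one core's throughput."""
    return 1.0 + (cores - 1) * parallel_efficiency


# Spend a doubled complexity budget in two different ways.
one_complex_core = single_core_speedup(2.0)
two_simple_cores = multi_core_speedup(2, parallel_efficiency=0.9)  # hypothetical 0.9

print(f"one core, 2x complexity: {one_complex_core:.2f}x")
print(f"two simpler cores:       {two_simple_cores:.2f}x")
```

Under these assumptions, two simpler cores come out ahead of one core that is twice as complex. The gap narrows as parallel_efficiency falls, which is why the slide adds the condition "if software can use multiple processors".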