Computer Architecture (Graduate) Course Key Points and Review_10_Exam Slides

Slide deck, 77 pages, shared by a member of 金锄头文库.

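Advanced Computer Architecture: Course Key Points

Chapter 1: Introduction

Reasons for the rapid progress of computer technology
  Technology progress: Moore's law
    Development trend: steady, rapid growth that follows Moore's law. Microprocessor performance (defined by the number of transistors on a chip) doubles every 18 months, i.e. it improves by roughly 58% per year.
  Architecture development
    Evolution of computer architecture
    Components of modern computer architecture
    Computer architecture is the study and design of the combination of functional structure and instruction structure that best suits the target applications.
    Research scope of advanced computer architecture

Classification of computer architectures
  Flynn's taxonomy: qualitative
  Feng's taxonomy: quantitative

Chapter 3: Instruction-Level Parallelism and Its Dynamic Exploitation

What is pipelining?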

  A pipeline is like an auto assembly line.
  A pipeline has many stages.
  Each stage carries out a different part of an instruction or operation.
  The stages, which cooperate under a synchronized clock, are connected to form a pipe.

How is pipelining implemented?

What makes pipelining hard to implement?
  1. There is a conflict over the memory; allow WRITE-then-READ in one clock cycle.
  2. A conflict occurs when the PC is updated.
  3. Ensure that instructions in different stages do not interfere with one another.
  4. Does the code affect how the pipeline runs?

How does CPI decrease?
  CPI = 1, CPI > 1, CPI < 1
  Ways to reduce the pipelined CPI:
    Reduce the stall cycles caused by the various hazards.
    Reduce the ideal CPI.

Ideal Performance for Pipelining
  Ideal speedup equals the number of pipe stages.

MIPS instruction format

Works in the MIPS 5-stage pipeline (1)
  IF (instruction fetch cycle)
    IR ← Mem[PC]; NPC ← PC ← PC + 4;
  ID (instruction decode/register fetch cycle)
    A ← Regs[rs]; B ← Regs[rt]; Imm ← sign-extended immediate field of IR;
  Note: the first two stages of the MIPS pipeline perform the same functions for all kinds of instructions.
  EX (execution/effective address cycle)
    Memory reference: ALUoutput ← A + Imm;
    Register-Register ALU instruction: ALUoutput ← A func B;
    Register-Immediate ALU instruction: ALUoutput ← A op Imm;
    Branch: ALUoutput ← NPC + (Imm << 2); Cond ← (A == 0)

Works in the MIPS 5-stage pipeline (2)
  MEM (memory access/branch completion cycle)
    Memory reference: LMD ← Mem[ALUoutput] or Mem[ALUoutput] ← B
    Branch: if (cond) PC ← ALUoutput
  WB (write-back cycle)
    Register-Register ALU instruction: Regs[rd] ← ALUoutput;
    Register-Immediate ALU instruction: Regs[rt] ← ALUoutput;
    Load instruction: Regs[rt] ← LMD;

The MIPS pipelining

Pipeline hazard: the major hurdle
  Structural hazards: conflicts over hardware resources.
  Data hazards: an instruction depends on the result of a prior computation that is not ready (computed or stored) yet.
  Control hazards: the branch condition and the branch PC are not available in time to fetch an instruction on the next clock.
  Hazards can be resolved by a stall. The stall delays all instructions issued after the instruction that was stalled.

Possible solutions for structural hazards
  "Double bump"
  Insert a stall
  Provide another memory port
  Split instruction memory and data memory
  Use an instruction buffer
  Fully pipelined functional units

Why allow a machine with structural hazards?
  To reduce cost. For example, adding split caches requires twice the memory bandwidth, and fully pipelined floating-point units cost a lot of gates. It is not worth the cost if the hazard does not occur very often.
  To reduce the latency of the unit. Pipelining a functional unit adds delay (pipeline overhead from latches), so an unpipelined version may require fewer clocks per operation. Reducing latency also helps with other hazards, as we will see.

Possible solutions for data hazards
  Interlock: insert stalls
  Detect: data hazard logic
  Forwarding: reduce data hazard stalls
  Compiler scheduling to avoid load stalls

Possible solutions for control hazards
  Move the branch computation forward
  Simple solutions
    Freeze or flush the pipeline
    Predict-not-taken (predict-untaken): treat every branch as not taken
    Predict-taken: treat every branch as taken
  Delayed branch
    Good case: it takes just 1 cycle to figure out the right branch address, so there are not 2 or 3 cycles of potential NOPs or stalls.
    Strange case: it is always 1 cycle, and we always have to wait. On MIPS, the instruction in the delay slot always executes, whether the branch is taken or not taken.
  Cancelling function (hardware scheme)
    If the branch is predicted incorrectly, the CPU turns the instruction in the branch delay slot into a no-op.
    The branch includes the direction in which it is predicted to go.
    This can reduce the complexity for the compiler of selecting useful instructions for the delay slot.

Extending the MIPS Pipeline to Handle

Complex pipeline structure

Pipelining time parameters
  Latency: the number of intervening cycles between an instruction that produces a result and an instruction that uses the result.
  Initiation interval: the number of cycles that must elapse between issuing instructions to the same unit.
  Out of order
  New types of data hazards
    RAW (read after write): stalls arising
    WAW (write after write)

Instruction-Level Parallelism
  CPI_pipelined = Ideal pipeline CPI + pipeline stall cycles per instruction
                = 1 + Structural stalls + RAW stalls + WAR stalls + WAW stalls + Control stalls
  ILP within a basic block is quite small.

Data Dependence and Hazards
  True data dependence: RAW (read after write)
  Name dependence
    Anti-dependence: WAR (write after read)
    Output dependence: WAW (write after write)

Some Property about
  Dependences are a property of programs.
  The presence of a dependence indicates the potential for a hazard, but the actual hazard and the length of any stall are properties of the pipeline.
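To make the point that dependences belong to the program concrete, here is a minimal Python sketch that classifies the dependences between two instructions from their register usage alone, with no pipeline model at all. The `Instr` record and the `classify` function are hypothetical names introduced for illustration, and the example instructions are made up; whether any reported dependence actually causes a stall would still depend on the pipeline.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical record: an instruction reduced to the registers it touches."""
    text: str
    dest: Optional[str]       # register written, or None if nothing is written
    srcs: Tuple[str, ...]     # registers read


def classify(first: Instr, second: Instr) -> Set[str]:
    """Dependences of `second` on `first`, where `first` comes earlier in program order."""
    found = set()
    if first.dest is not None and first.dest in second.srcs:
        found.add("RAW")      # true data dependence
    if second.dest is not None and second.dest in first.srcs:
        found.add("WAR")      # anti-dependence (a name dependence)
    if first.dest is not None and first.dest == second.dest:
        found.add("WAW")      # output dependence (a name dependence)
    return found


# Made-up instruction sequence used only for illustration.
add = Instr("ADD R1, R2, R3", "R1", ("R2", "R3"))
sub = Instr("SUB R4, R1, R5", "R4", ("R1", "R5"))
mul = Instr("MUL R2, R6, R7", "R2", ("R6", "R7"))
orr = Instr("OR  R1, R8, R9", "R1", ("R8", "R9"))

for later in (sub, mul, orr):
    print(add.text, "->", later.text, ":", sorted(classify(add, later)))
```

The classification uses register names only, which is why WAR and WAW show up as name dependences: renaming the destination register would remove them, while a RAW dependence reflects a real flow of data.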

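The CPI decomposition listed under Instruction-Level Parallelism can be worked through numerically. The sketch below adds per-instruction stall contributions to an ideal CPI and then applies the common textbook simplification speedup = pipeline depth / (1 + stall cycles per instruction), which assumes an ideal CPI of 1, perfectly balanced stages, and no clock overhead. That speedup formula and all stall values are assumptions for illustration, not figures from the slides.

```python
def pipelined_cpi(ideal_cpi=1.0, structural=0.0, raw=0.0, war=0.0, waw=0.0, control=0.0):
    """CPI_pipelined = ideal CPI + sum of stall cycles per instruction, by cause."""
    return ideal_cpi + structural + raw + war + waw + control


def speedup(depth, stall_cycles_per_instr):
    """Speedup over the unpipelined machine, assuming ideal CPI = 1 and balanced stages."""
    return depth / (1.0 + stall_cycles_per_instr)


# Hypothetical stall cycles per instruction for a 5-stage pipeline.
stalls = {"structural": 0.05, "raw": 0.20, "control": 0.15}

cpi = pipelined_cpi(**stalls)
total_stalls = sum(stalls.values())

print(f"CPI_pipelined = {cpi:.2f}")
print(f"speedup with stalls = {speedup(5, total_stalls):.2f}")
print(f"ideal speedup       = {speedup(5, 0.0):.2f}")
```

With no stalls the function returns the pipeline depth, matching the statement that the ideal speedup equals the number of pipe stages; every stall cycle per instruction pulls the result below that bound.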