Computer Architecture: WinDLX Examples - Data Forwarding and Structural Hazards
WinDLX Examples: Data Forwarding and Structural Hazards

• Figure 3.10 in textbook

;-----------------------------------------------------------------
; Example to illustrate RAW hazards with and without data
; forwarding, similar to Figure 3.10 in textbook
;-----------------------------------------------------------------
main:   add   r1,r2,r3      ;store a new value in r1
        sub   r4,r1,r5      ;use r1
        and   r6,r1,r7      ;use r1
        or    r8,r1,r9      ;use r1
        xor   r10,r1,r11    ;use r1
        nop
        nop
        nop
        nop
        add   r1,r2,r3      ;store a new value in r1
        sub   r4,r1,r5      ;use r1 & store new r4
        and   r6,r4,r7      ;use r4
        or    r8,r6,r9      ;use r6
        xor   r10,r8,r11    ;use r8
Finish: trap  0

• Figures 3.11 and 3.12 in textbook

;-----------------------------------------------------------------
; Example to illustrate RAW hazards with and without data
; forwarding, similar to Figures 3.11 and 3.12 in textbook
;-----------------------------------------------------------------
main:   add   r1,r2,r3
        lw    r4,0(r1)      ;load into r4 using r1
        sw    12(r1),r4     ;store r4 using r1
        nop
        nop
        nop
        nop
        lw    r1,0(r2)
        sub   r4,r1,r5
        and   r6,r1,r7
        or    r8,r1,r9
Finish: trap  0

• Example to illustrate multiple data and structural hazards

;-----------------------------------------------------------------
; Example to illustrate data and structural hazards
;-----------------------------------------------------------------
main:   addf  f1,f2,f3
        multf f2,f4,f5
        addf  f3,f3,f4
        multf f6,f6,f6
        addf  f1,f3,f5
        addf  f2,f3,f4
Finish: trap  0

WinDLX Examples: Instruction Reordering

To run these programs, select the floating point stages in the configuration menu, set the number of division units to 3, and set the delay of the addition, multiplication, and division units to 3 cycles.

• Original program

;-----------------------------------------------------------------
; Example to illustrate instruction scheduling
;-----------------------------------------------------------------
        .data
        .global ONE
ONE:    .word   1
        .text
        .global main
main:   lf      f1,ONE      ;turn divf into a move
        cvti2f  f7,f1       ;by storing in f7 1 in
        nop                 ;floating-point format
        divf    f1,f8,f7    ;move Y=(f8) into f1
        divf    f2,f9,f7    ;move Z=(f9) into f2
        addf    f3,f1,f2
        divf    f10,f3,f7   ;move f3 into X=(f10)
        divf    f4,f11,f7   ;move B=(f11) into f4
        divf    f5,f12,f7   ;move C=(f12) into f5
        multf   f6,f4,f5
        divf    f13,f6,f7   ;move f6 into A=(f13)
Finish: trap    0

• Reordered program

;-----------------------------------------------------------------
; Example to illustrate instruction scheduling - reordered
; instructions
;-----------------------------------------------------------------
        .data
        .global ONE
ONE:    .word   1
        .text
        .global main
main:   lf      f1,ONE      ;turn divf into a move
        cvti2f  f7,f1       ;by storing in f7 1 in
        nop                 ;floating-point format
        divf    f1,f8,f7    ;move Y=(f8) into f1
        divf    f2,f9,f7    ;move Z=(f9) into f2
        divf    f4,f11,f7   ;move B=(f11) into f4
        divf    f5,f12,f7   ;move C=(f12) into f5
        addf    f3,f1,f2
        multf   f6,f4,f5
        divf    f10,f3,f7   ;move f3 into X=(f10)
        divf    f13,f6,f7   ;move f6 into A=(f13)
Finish: trap    0
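
The effect of the reordering can be checked outside the simulator. The following Python sketch is not part of the WinDLX distribution; the names `ORIGINAL`, `REORDERED`, `parse`, `raw_distances` and `min_distance` are hypothetical helpers introduced here for illustration. It takes the eight floating-point instructions that differ between the two listings (the lf/cvti2f/nop prologue is left out as a simplifying assumption) and reports, for each RAW dependence, how many instructions separate the consumer from the most recent producer of that register. It measures instruction distance only; the actual stall cycles depend on the WinDLX configuration and on whether forwarding is enabled.

```python
# Instruction bodies copied from the two listings (prologue omitted).
ORIGINAL = """
divf f1,f8,f7
divf f2,f9,f7
addf f3,f1,f2
divf f10,f3,f7
divf f4,f11,f7
divf f5,f12,f7
multf f6,f4,f5
divf f13,f6,f7
"""

REORDERED = """
divf f1,f8,f7
divf f2,f9,f7
divf f4,f11,f7
divf f5,f12,f7
addf f3,f1,f2
multf f6,f4,f5
divf f10,f3,f7
divf f13,f6,f7
"""


def parse(listing):
    """Return a list of (dest, [sources]) for three-operand instructions."""
    instrs = []
    for line in listing.strip().splitlines():
        _opcode, operands = line.split(None, 1)
        dest, *sources = [reg.strip() for reg in operands.split(",")]
        instrs.append((dest, sources))
    return instrs


def raw_distances(listing):
    """Return (consumer_index, register, distance) for every RAW dependence.

    distance is the number of instruction slots between the consumer and
    the most recent earlier instruction that wrote the register.
    """
    last_writer = {}
    result = []
    for i, (dest, sources) in enumerate(parse(listing)):
        for reg in sources:
            if reg in last_writer:
                result.append((i, reg, i - last_writer[reg]))
        last_writer[dest] = i
    return result


def min_distance(listing):
    """Smallest producer-to-consumer distance in the listing."""
    return min(distance for _, _, distance in raw_distances(listing))


if __name__ == "__main__":
    for name, listing in (("original", ORIGINAL), ("reordered", REORDERED)):
        print(name, [d for _, _, d in raw_distances(listing)])
```

In the original order several consumers sit directly behind their producers, whereas in the reordered program every consumer is at least two instructions away from the instruction that produces its operand, which gives the pipeline more independent work to overlap with the multi-cycle floating-point operations.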