Beijing Jiaotong University Mock Examination Paper


Beijing Jiaotong University Final Examination

Course: Computer Architecture
Lecturers: Ai Lihua, Wang Bing
(NOTE: 5 parts in 2 hours; total 100 points)

| Part No. | Part 1 | Part 2 | Part 3 | Part 4 | Part 5 | Total |
| Score    |        |        |        |        |        |       |
| Examiner |        |        |        |        |        |       |

Please show your work CLEARLY for all problems. I hope you enjoy the test!

Part 1. Mark only one answer for each question (10 points)

1. A superscalar processor has ( )
   (a) multiple functional units
   (b) a high clock speed
   (c) a large amount of RAM
   (d) many I/O ports

2. On-chip cache has ( )
   (a) lower access time than RAM
   (b) larger capacity than off-chip cache
   (c) its own data bus
   (d) become obsolete

3. ( ) data hazards are not possible in the DLX in-order instruction issue and in-order execution multicycle pipeline.
   (a) WAR
   (b) WAW
   (c) RAW
   (d) RAR

4. Pipelining improves CPU performance due to ( )
   (a) reduced memory access time
   (b) increased clock speed
   (c) the introduction of parallelism
   (d) additional functional units

5. Cache memory enhances ( )
   (a) memory capacity
   (b) memory access time
   (c) secondary storage capacity
   (d) secondary storage access time

6. RISC machines typically ( )
   (a) have high-capacity on-chip cache memories
   (b) have fewer registers than CISC machines
   (c) are less reliable than CISC machines
   (d) execute 1 instruction per clock cycle

7. Which of the following is NOT a computer performance metric? ( )
   (a) MIPS
   (b) FLOPS
   (c) SPEC benchmark
   (d) RISC

8. Given a 5-stage pipeline with stages taking 1, 2, 3, 1, 1 units of time, the clock period of the pipeline is ( )
   (a) 8
   (b) 1/8
   (c) 1/3
   (d) 3

9. The average memory access time for a machine with a cache hit rate of 90%, where the cache access time is 10ns and the memory access time is 100ns, is ( )
   (a) 55ns
   (b) 45ns
   (c) 90ns
   (d) 19ns

10. Delayed branching is used ( )
   (a) to introduce delays in program execution
   (b) in pipelining
   (c) in cache memory
   (d) in decoding instructions

Part 2. Fundamentals of Computer Design (10 points)

1. (10 points) In many practical applications that demand a real-time response, the computational workload W is often fixed. As the number of processors increases in a parallel computer, the fixed workload is distributed to more processors for parallel execution. Assume 20 percent of W must be executed sequentially, and 80 percent can be executed by 4 nodes simultaneously. What is the fixed-load speedup?

Part 3. Instruction Set Architecture (20 points)

2. (8 points) Suppose the variable x of type int at address 0x100 has the hexadecimal value 0x01234567. The ordering of the bytes within the address range 0x100 through 0x103 depends on the type of machine. How are the bytes arranged in memory at 0x100 through 0x103 under Little Endian and under Big Endian?

3. (12 points) A model machine has 7 instructions, whose frequencies are 43%, 21%, 12%, 8%, 6%, 6%, and 4% respectively.
   3.1 Encode the opcodes so that the average code length is minimal.
   3.2 According to 3.1, give the value of the minimum average code length.

Part 4. Pipelining (35 points)

4. (6 points) Why would a designer sometimes allow structural hazards?

5. (21 points) Use the following code fragment:

   LOOP: LW   R1, 0(R2)     ; load R1 from address 0+R2
         ADDI R1, R1, #1    ; R1 = R1+1
         SW   0(R2), R1     ; store R1 at address 0+R2
         ADDI R2, R2, #4    ; R2 = R2+4
         SUB  R4, R3, R2    ; R4 = R3-R2
         BNEZ R4, LOOP      ; branch to LOOP if R4 != 0

   Assume the initial value of R3 is R2+200. The code runs on a pipelined machine like DLX.

   5.1 Using the following format, show the timing of this instruction sequence with normal forwarding and bypassing hardware. Also assume that a register read and a write in the same clock cycle "forward" through the register file.

   | Instruction \ Clock | 1  | 2  | 3   | 4   | 5  | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |
   | LW R1, 0(R2)        | IF | ID | EXE | MEM | WB |   |   |   |   |    |    |    |    |    |    |
   | ADDI R1, R1, #1     |    |    |     |     |    |   |   |   |   |    |    |    |    |    |    |
   | SW 0(R2), R1        |    |    |     |     |    |   |   |   |   |    |    |    |    |    |    |
   | ADDI R2, R2, #4     |    |    |     |     |    |   |   |   |   |    |    |    |    |    |    |
   | SUB R4, R3, R2      |    |    |     |     |    |   |   |   |   |    |    |    |    |    |    |
   | BNEZ R4, LOOP       |    |    |     |     |    |   |   |   |   |    |    |    |    |    |    |

   5.2 Assume that the branch is handled by predicting it as not taken. If all memory references take 1 cycle, how many cycles does this loop take to execute?
   5.3 In order to reduce the total cycles of 5.2, what measures could be taken? Give an explanation.

6. (8 points) For a two-level branch prediction strategy with a (2,2) predictor, how many bits does the branch prediction buffer need for 2K branch instructions?

Part 5. Memory Hierarchy (25 points)

7. (7 points) Cache design: Give short answers to the following questions.
   7.1 Cache miss rates decrease with larger cache block sizes due to what kind of locality?
   7.2 How many sets in fully associative cache with 64 cache bloc
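A note on question 6 of Part 4: the storage of an (m, n) correlating predictor is commonly counted as 2^m × n × (number of entries). The sketch below applies that count under one assumption that the exam does not state: that "2K branch instructions" means 2 × 1024 = 2048 buffer entries. The function name `predictor_bits` is illustrative only and is not part of the exam.

```python
def predictor_bits(m: int, n: int, entries: int) -> int:
    """Total storage bits of an (m, n) correlating branch predictor.

    Each of the `entries` rows holds 2**m counters (one per pattern of
    the last m branch outcomes), and each counter is n bits wide.
    """
    return (2 ** m) * n * entries


# Assumption: "2K branch instructions" is read as 2 * 1024 buffer entries.
entries = 2 * 1024
total = predictor_bits(2, 2, entries)
print(total)  # 16384, i.e. 16K bits
```

Under that reading, a (2,2) predictor with 2K entries needs 4 × 2 × 2048 = 16384 bits. A different interpretation of "2K" would scale the result proportionally.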
