Computer Architecture, Assignment 1 (计算机体系结构_第一次作业)


Chapter 1

1.11 Availability is the most important consideration for designing servers, followed closely by scalability and throughput.
a. We have a single processor with a failures in time (FIT) of 100. What is the mean time to failure (MTTF) for this system?
b. If it takes 1 day to get the system running again, what is the availability of the system?
c. Imagine that the government, to cut costs, is going to build a supercomputer out of inexpensive computers rather than expensive, reliable computers. What is the MTTF for a system with 1000 processors? Assume that if one fails, they all fail.

Answer:
a. Mean time to failure (MTTF) is a reliability measure whose reciprocal is the failure rate; FIT expresses that rate in failures per 10^9 hours of operation. By this definition 1/MTTF = FIT/10^9, so MTTF = 10^9/100 = 10^7 hours.
b. Availability = MTTF/(MTTF + MTTR), where MTTR is the mean time to repair, which in this problem is the one-day restart time (24 hours). 10^7/(10^7 + 24) ≈ 1, so the availability is essentially 100%.
c. Because one processor failing brings down all of them, the failure rate is 1000 times that of a single processor, and the MTTF is 1/1000 of a single processor's: 10^7/1000 = 10^4 hours.

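The arithmetic in 1.11 can be checked with a short script. This is a minimal sketch assuming only the relationships used above (FIT counted per 10^9 hours, availability = MTTF/(MTTF + MTTR)); the variable names are illustrative, not from the original document.

# Sketch: verify the MTTF/availability arithmetic of exercise 1.11 (Python).
FIT = 100                       # failures per 10^9 hours for one processor
mttf_single = 1e9 / FIT         # a. MTTF = 10^9 / FIT = 1e7 hours

mttr = 24                       # b. one day to get the system running again, in hours
availability = mttf_single / (mttf_single + mttr)   # ~0.9999976, effectively 1

n = 1000                        # c. any of 1000 processors failing stops the system,
mttf_cluster = 1e9 / (n * FIT)  #    so failure rates add and MTTF drops to 1e4 hours

print(mttf_single, availability, mttf_cluster)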
1.14 In this exercise, assume that we are considering enhancing a machine by adding vector hardware to it. When a computation is run in vector mode on the vector hardware, it is 10 times faster than the normal mode of execution. We call the percentage of time that could be spent using vector mode the percentage of vectorization.
a. Draw a graph that plots the speedup as a percentage of the computation performed in vector mode. Label the y-axis "Net speedup" and label the x-axis "Percent vectorization".
b. What percentage of vectorization is needed to achieve a speedup of 2?
c. What percentage of the computation run time is spent in vector mode if a speedup of 2 is achieved?
d. What percentage of vectorization is needed to achieve one-half the maximum speedup attainable from using vector mode?
e. Suppose you have measured the percentage of vectorization of the program to be 70%. The hardware design group estimates it can speed up the vector hardware even more with significant additional investment. You wonder whether the compiler team could increase the percentage of vectorization instead. What percentage of vectorization would the compiler team need to achieve in order to equal an additional 2x speedup in the vector unit (beyond the initial 10x)?

Answer:
a. By the definition of speedup (Amdahl's law), with an enhancement speedup of 10 and a vectorized fraction x, the net speedup is y = 1/((1 - x) + x/10). Here x ranges over [0, 1] and the corresponding y ranges from 1 to 10. The curve (reproduced by the plotting sketch below) rises slowly at low vectorization and steeply as x approaches 1.

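The graph for part a did not survive extraction; it can be regenerated with a few lines of matplotlib. This is a minimal sketch based only on the formula above, not a figure from the original document.

# Sketch: net speedup versus percent vectorization (exercise 1.14a), in Python.
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(0.0, 1.0, 101)          # fraction of time spent in vector mode
speedup = 1.0 / ((1.0 - x) + x / 10.0)  # Amdahl's law with a 10x vector unit

plt.plot(x * 100, speedup)
plt.xlabel("Percent vectorization")
plt.ylabel("Net speedup")
plt.grid(True)
plt.show()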
b. Setting y = 1/((1 - x) + x/10) = 2 gives x = 5/9 ≈ 55.6%.
c. With a speedup of 2, the new run time is 1/2 of the original, and the vectorized portion now takes x/10 = (5/9)/10 of the original time, so the fraction of the new run time spent in vector mode is ((5/9)/10)/(1/2) = 1/9 ≈ 11.1%.
d. The maximum attainable speedup is 10 (at x = 1), so half the maximum is 5. Solving 1/((1 - x) + x/10) = 5 gives x = 8/9 ≈ 88.9%.
e. At x = 70% the current speedup is y = 1/(0.3 + 0.7/10) ≈ 2.7. An additional 2x speedup in the vector unit makes it 20x, which at 70% vectorization would give y = 1/(0.3 + 0.7/20) ≈ 2.99. To obtain the same speedup with the original 10x vector unit, solve 1/((1 - x) + x/10) = 2.99, which gives x ≈ 0.74: the compiler team would have to raise the vectorization to about 74%.

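Parts b, d, and e all amount to inverting the speedup formula for a target value. The helper below is an illustration added here (not part of the original answer); it assumes the same formula y = 1/((1 - x) + x/s).

# Sketch: required vectorized fraction x for a target speedup (Python).
def required_vectorization(target_speedup, vector_speedup=10.0):
    # From 1/((1 - x) + x/s) = y it follows that x = (1 - 1/y) / (1 - 1/s).
    return (1.0 - 1.0 / target_speedup) / (1.0 - 1.0 / vector_speedup)

print(required_vectorization(2))       # part b: ~0.556
print(required_vectorization(5))       # part d: ~0.889

# part e: match the speedup of a 20x vector unit at 70% vectorization
target = 1.0 / (0.3 + 0.7 / 20.0)      # ~2.99
print(required_vectorization(target))  # ~0.74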
Chapter 2

2.8 The following questions investigate the impact of small and simple caches using CACTI, and assume a 65 nm (0.065 μm) technology.
a. Compare the access times of 64 KB caches with 64-byte blocks and a single bank. What are the relative access times of two-way and four-way set associative caches in comparison to a direct-mapped organization?
b. Compare the access times of four-way set associative caches with 64-byte blocks and a single bank. What are the relative access times of 32 KB and 64 KB caches in comparison to a 16 KB cache?
c. For a 64 KB cache, find the cache associativity between 1 and 8 with the lowest average memory access time, given that misses per instruction for a certain workload suite are 0.00664 for direct mapped, 0.00366 for two-way set associative, 0.000987 for four-way set associative, and 0.000266 for eight-way set associative. Overall, there are 0.3 data references per instruction. Assume cache misses take 10 ns in all models. To calculate the hit time in cycles, assume the cycle time output by CACTI, which corresponds to the maximum frequency a cache can operate at without any bubbles in the pipeline.

Answer:
a. Direct mapped: 0.86 ns; two-way set associative: 1.12 ns; four-way set associative: 1.37 ns. The two-way access time is 1.12/0.86 = 1.30 times that of direct mapped, and the four-way access time is 1.37/0.86 = 1.59 times that of direct mapped.
b. The 16 KB cache has an access time of 1.27 ns, the 32 KB cache 1.35 ns, and the 64 KB cache 1.37 ns. The 32 KB access time is 1.35/1.27 = 1.06 times that of the 16 KB cache; the 64 KB access time is 1.37/1.27 = 1.08 times that of the 16 KB cache.
c. Average memory access time = hit rate × hit time + miss rate × miss penalty.
Miss rates (misses per instruction divided by 0.3 references per instruction): direct mapped 0.00664/0.3 = 2.2%; two-way 0.00366/0.3 = 1.2%; four-way 0.000987/0.3 = 0.33%; eight-way 0.000266/0.3 = 0.09%.
Hit times in cycles (access time divided by the CACTI cycle time, rounded up): direct mapped 0.86 ns / 0.5 ns → 2; two-way 1.12 ns / 0.5 ns → 3; four-way 1.37 ns / 0.83 ns → 2; eight-way 2.03 ns / 0.79 ns → 3.
Miss penalties in cycles: direct mapped 10 ns / 0.5 ns = 20; two-way 10 ns / 0.5 ns = 20; four-way 10 ns / 0.83 ns ≈ 13; eight-way 10 ns / 0.79 ns ≈ 13.
Average memory access times:
direct mapped: ((1 - 0.022) × 2 + 0.022 × 20) × 0.5 ns = 2.396 × 0.5 ns ≈ 1.2 ns;
two-way: ((1 - 0.012) × 3 + 0.012 × 20) × 0.5 ns = 3.2 × 0.5 ns = 1.6 ns;
four-way: ((1 - 0.0033) × 2 + 0.0033 × 13) × 0.83 ns = 2.036 × 0.83 ns ≈ 1.69 ns;
eight-way: ((1 - 0.0009) × 3 + 0.0009 × 13) × 0.79 ns ≈ 3 × 0.79 ns = 2.37 ns.
The direct-mapped organization therefore gives the lowest average memory access time, about 1.2 ns.

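The part c comparison is easy to tabulate with a short script. The sketch below simply re-runs the arithmetic above; the access times, cycle times, and miss statistics are the figures quoted in the answer, and the round-up conventions are assumptions of this sketch.

# Sketch: average memory access time per associativity (exercise 2.8c), in Python.
import math

# (access time ns, CACTI cycle time ns, misses per instruction), as quoted above
configs = {
    "1-way": (0.86, 0.50, 0.00664),
    "2-way": (1.12, 0.50, 0.00366),
    "4-way": (1.37, 0.83, 0.000987),
    "8-way": (2.03, 0.79, 0.000266),
}
refs_per_instr = 0.3     # data references per instruction
miss_time_ns = 10.0      # cache misses take 10 ns in all models

for name, (access_ns, cycle_ns, mpi) in configs.items():
    miss_rate = mpi / refs_per_instr
    hit_cycles = math.ceil(access_ns / cycle_ns)      # hit time in cycles
    miss_cycles = math.ceil(miss_time_ns / cycle_ns)  # miss penalty in cycles
    amat_ns = ((1 - miss_rate) * hit_cycles + miss_rate * miss_cycles) * cycle_ns
    print(f"{name}: AMAT = {amat_ns:.2f} ns")   # 1-way comes out lowest (~1.2 ns)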
2.11 Consider the usage of critical word first and early restart on L2 cache misses. Assume a 1 MB L2 cache with 64-byte blocks and a refill path that is 16 bytes wide. Assume that the L2 can be written with 16 bytes every 4 processor cycles, the time to receive the first 16-byte block from the memory controller is 120 cycles, each additional 16-byte block from main memory requires 16 cycles, and data can be bypassed directly into the read port of the L2 cache. Ignore any cycles to transfer the miss request to the L2 cache and the requested data to the L1 cache.
a. How many c

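The rest of the exercise is cut off above, but the refill-path timing it describes can be modelled directly. The sketch below only encodes the stated parameters (120 cycles for the first 16-byte beat, 16 cycles for each additional beat, 16 bytes written to the L2 every 4 cycles, with bypass into the L2 read port); it is an illustration of the setup, not the exercise's solution.

# Sketch: arrival times of the four 16-byte beats of a 64-byte L2 refill (Python),
# using only the parameters stated in exercise 2.11.
BLOCK_BYTES = 64
BEAT_BYTES = 16
FIRST_BEAT_CYCLES = 120   # first 16 bytes from the memory controller
NEXT_BEAT_CYCLES = 16     # each additional 16 bytes from main memory
L2_WRITE_CYCLES = 4       # the L2 accepts 16 bytes every 4 cycles

beats = BLOCK_BYTES // BEAT_BYTES
arrival = [FIRST_BEAT_CYCLES + i * NEXT_BEAT_CYCLES for i in range(beats)]
print("beat arrival cycles:", arrival)   # [120, 136, 152, 168]

# The L2 write port (16 bytes per 4 cycles) keeps pace with the 16-cycle
# spacing between beats, so writing the refilled data never stalls the transfer.
assert L2_WRITE_CYCLES <= NEXT_BEAT_CYCLES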