Chapter 2: Reliability and Fault Tolerance

Uploaded by: 飞*** | Document ID: 51617747 | Uploaded: 2018-08-15 | Format: PPT | Pages: 41 | Size: 242KB

Alan Burns and Andy Wellings
Real-Time Systems and Programming Languages: Alan Burns and Andy Wellings

Aims
- To understand the factors which affect the reliability of a system and introduce how software design faults can be tolerated
- To introduce:
  Safety and Dependability
  Reliability, failure and faults
  Failure modes
  Fault prevention and fault tolerance
  N-Version programming
  Dynamic Redundancy

Scope
Four sources of faults which can result in system failure:
- Inadequate specification (not covered)
- Design errors in software (covered now)
- Processor failure (not covered)
- Interference on the communication subsystem (not covered)

Safety and Reliability
- Safety: freedom from those conditions that can cause death, injury, occupational illness, damage to (or loss of) equipment (or property), or environmental harm
  By this definition, most systems which have an element of risk associated with their use are unsafe
- Reliability: a measure of the success with which a system conforms to some authoritative specification of its behaviour
- Safety is the probability that conditions that can lead to mishaps do not occur, whether or not the intended function is performed

Safety
- E.g., measures which increase the likelihood of a weapon firing when required may well increase the possibility of its accidental detonation
- In many ways, the only safe airplane is one that never takes off; however, it is not very reliable
- As with reliability, to ensure the safety requirements of an embedded system, system safety analysis must be performed throughout all stages of its life cycle development

Aspects of Dependability
Dependability
  Available: Readiness for usage
  Reliable: Continuity of service delivery
  Safe: Non-occurrence of catastrophic consequences
  Confidential: Non-occurrence of unauthorized disclosure of information
  Integral: Non-occurrence of improper alteration of information
  Maintainable: Aptitude to undergo repairs and evolutions

Dependability Terminology
Dependability
  Attributes: Availability, Confidentiality, Reliability, Safety, Integrity, Maintainability
  Means: Fault Prevention, Fault Tolerance, Fault Removal, Fault Forecasting
  Impairments: Faults, Errors, Failures

Reliability, Failure and Faults
- The reliability of a system is a measure of the success with which it conforms to an authoritative specification of its behaviour
- When the behaviour of a system deviates from that which is specified for it, this is called a failure
- Failures result from unexpected problems internal to the system that eventually manifest themselves in the system's external behaviour
- These problems are called errors, and their mechanical or algorithmic causes are termed faults
- Systems are composed of components which are themselves systems: hence failure -> fault -> error -> failure -> fault

Fault Types
- A transient fault starts at a particular time, remains in the system for some period and then disappears
- E.g. hardware components which have an adverse reaction to radioactivity
- Many faults in communication systems are transient
- Permanent faults remain in the system until they are repaired; e.g., a broken wire or a software design error
- Intermittent faults are transient faults that occur from time to time
- E.g. a hardware component that is heat sensitive: it works for a time, stops working, cools down and then starts to work again

Software Faults
- Called bugs
  Bohrbugs: reproducible and identifiable
  Heisenbugs: only active under rare conditions, e.g. race conditions
- Software doesn't deteriorate with age: it is either correct or incorrect, but faults can remain dormant for long periods
  Usually related to resource usage, e.g. memory leaks

Failure Modes
Failure mode
  Value domain: Constraint error, Value error
  Timing domain: Early, Omission, Late
  Arbitrary (Fail uncontrolled)
  Fail silent, Fail stop, Fail controlled

Approaches to Achieving Reliable Systems
- Fault prevention attempts to eliminate any possibility of faults creeping into a system before it goes operational
- Fault tolerance enables a system to continue functioning even in the presence of faults
- Both approaches attempt to produce systems which have well-defined failure modes
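
The value-domain and timing-domain entries listed under Failure Modes can be made concrete with a small sketch. The slides only name the modes, so the readings used here are assumptions: a constraint error is taken to mean a value outside its permitted range, a value error a value inside the range but wrong, and early, late and omission a service delivered too soon, too late, or never. The names ServiceSpec and classify, and all of the fields, are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class FailureMode(Enum):
    CONSTRAINT_ERROR = "value domain: constraint error"
    VALUE_ERROR = "value domain: value error"
    EARLY = "timing domain: early"
    OMISSION = "timing domain: omission"
    LATE = "timing domain: late"


@dataclass(frozen=True)
class ServiceSpec:
    """Hypothetical specification of one service delivery."""
    expected: float   # value the specification calls for
    tolerance: float  # allowed deviation from the expected value
    low: float        # smallest value the result type permits
    high: float       # largest value the result type permits
    earliest: float   # earliest acceptable delivery time
    deadline: float   # latest acceptable delivery time


def classify(spec: ServiceSpec,
             value: Optional[float],
             delivered_at: Optional[float]) -> List[FailureMode]:
    """Return the failure modes shown by one delivery (empty list: correct)."""
    # Assumption: a service that is never delivered is an omission failure.
    if value is None or delivered_at is None:
        return [FailureMode.OMISSION]

    modes: List[FailureMode] = []

    # Value domain: out of range first, then in range but wrong.
    if not (spec.low <= value <= spec.high):
        modes.append(FailureMode.CONSTRAINT_ERROR)
    elif abs(value - spec.expected) > spec.tolerance:
        modes.append(FailureMode.VALUE_ERROR)

    # Timing domain: too soon or too late.
    if delivered_at < spec.earliest:
        modes.append(FailureMode.EARLY)
    elif delivered_at > spec.deadline:
        modes.append(FailureMode.LATE)

    return modes


if __name__ == "__main__":
    spec = ServiceSpec(expected=5.0, tolerance=0.1, low=0.0, high=10.0,
                       earliest=0.5, deadline=2.0)
    for value, when in [(5.0, 1.0), (50.0, 1.0), (7.0, 3.0), (None, None)]:
        print(value, when, [m.value for m in classify(spec, value, when)])
```

A single delivery can fail in both domains at once, which is why classify returns a list. The fail silent, fail stop and fail controlled entries describe behaviour over a sequence of deliveries rather than a single one, so this per-delivery sketch does not attempt to model them.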
