- Doug Smith - Mentor, A Siemens Business
The ISO 26262 automotive safety standard requires evaluating safety goal violations due to random hardware faults in order to determine the diagnostic coverage (DC) used in calculating safety metrics. Injecting faults in simulation can be time-consuming and tedious, and the stimulus may not exercise the design in a way that propagates the faults. With formal verification, however, faults are easily categorized by structural analysis, which provides a worst-case DC. The faults are then analyzed to verify propagation, safety goal violations, and fault detection, yielding accurate, high-confidence results. This paper describes in detail how to run a better fault campaign using formal.
The automotive functional safety standard, ISO 26262, defines two verification requirements for integrated circuits used in automotive applications: systematic failure verification and random hardware failure verification. Systematic failures are design bugs, which are found with typical functional verification. Random hardware failure verification injects faults into a design to test whether the design’s safety mechanisms work as assumed. Just as functional verification uses coverage to determine completeness, the ISO standard defines a form of coverage for random hardware failures called diagnostic coverage, which represents the percentage of the safety element that is covered by the safety mechanism.
Diagnostic coverage is computed by first determining how many faults of each type the design contains. If an injected fault has no impact on the safety-critical goal (function) or output, it is considered safe. Likewise, any injected fault that is detected by a safety mechanism or reported to the driver is also considered safe. What we are really looking for are the faults that are either unprotected by any safety mechanism (known as single-point faults) or supposed to be covered by a safety mechanism that fails to detect them (referred to as residual faults). The ISO standard also addresses potential faults in the safety mechanism itself. Typically, an automotive design will have a secondary safety mechanism, such as a power-on self-test (POST), to check the behavior of the primary safety mechanism. To test that the primary safety mechanism is correct, a fault must be injected into the design to activate the primary safety mechanism, and then a second fault injected into the primary safety mechanism to see if it still works. This is referred to as a dual-point fault, which falls into the multi-point fault category (though the standard requires testing at most two faults unless there is an architectural reason to test more). Any dual-point fault not covered by the secondary safety mechanism is considered latent.
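The classification rules described above can be sketched as a small decision function. This is only an illustration of the decision order, not the output format of any fault-injection or formal tool: the `FaultResult` fields, the `FaultClass` labels, and both function names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class FaultClass(Enum):
    SAFE = "safe"
    SINGLE_POINT = "single-point"
    RESIDUAL = "residual"
    DUAL_POINT_COVERED = "dual-point, covered"  # hypothetical label
    LATENT = "latent"


@dataclass(frozen=True)
class FaultResult:
    """Hypothetical per-fault record produced by a fault campaign."""
    violates_safety_goal: bool  # fault reaches the safety-critical function or output
    has_safety_mechanism: bool  # a safety mechanism is meant to cover this fault
    detected: bool              # that safety mechanism flags the fault


def classify_single_fault(fault: FaultResult) -> FaultClass:
    """Classify one injected fault using the rules in the text."""
    if not fault.violates_safety_goal:
        return FaultClass.SAFE          # no impact on the safety goal
    if not fault.has_safety_mechanism:
        return FaultClass.SINGLE_POINT  # unprotected by any safety mechanism
    if fault.detected:
        return FaultClass.SAFE          # detected faults are treated as safe
    return FaultClass.RESIDUAL          # covered on paper, but not detected


def classify_dual_point_fault(covered_by_secondary: bool) -> FaultClass:
    """Classify a second fault injected into the primary safety mechanism."""
    if covered_by_secondary:
        return FaultClass.DUAL_POINT_COVERED
    return FaultClass.LATENT
```

Checking for a safety goal violation first reflects the text's ordering: a fault that never reaches the safety-critical output is safe regardless of whether any mechanism would have caught it.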
Once all the faults in a design are classified, the ISO 26262 metrics are easy to compute. Often, fault counts are rolled up in a failure modes, effects, and diagnostic analysis (FMEDA) to compute the single-point fault metric (SPFM) or latent fault metric (LFM). Alternatively, the diagnostic coverage can be computed as a percentage and then used in the FMEDA in combination with process FIT rates (λFIT) for specific failure modes. (See , sections C.1-C.3 for details on the ISO metric calculations).
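As a rough illustration of that roll-up, the sketch below computes SPFM and LFM from summed failure rates, using the commonly cited ISO 26262 forms: SPFM = 1 - (λSPF + λRF) / λ and LFM = 1 - λMPF,latent / (λ - λSPF - λRF). The function names and the FIT values in the usage lines are made up for the example; a real FMEDA sums these rates per failure mode over the safety-related hardware elements.

```python
def single_point_fault_metric(total_fit: float, spf_fit: float, rf_fit: float) -> float:
    """SPFM = 1 - (single-point + residual failure rate) / total failure rate."""
    if total_fit <= 0:
        raise ValueError("total failure rate must be positive")
    return 1.0 - (spf_fit + rf_fit) / total_fit


def latent_fault_metric(total_fit: float, spf_fit: float, rf_fit: float,
                        latent_fit: float) -> float:
    """LFM = 1 - latent multi-point rate / (total - single-point - residual)."""
    remaining_fit = total_fit - spf_fit - rf_fit
    if remaining_fit <= 0:
        raise ValueError("no failure rate left after removing SPF and RF")
    return 1.0 - latent_fit / remaining_fit


# Hypothetical failure rates in FIT, chosen only to make the arithmetic easy.
spfm = single_point_fault_metric(total_fit=100.0, spf_fit=0.5, rf_fit=0.5)
lfm = latent_fault_metric(total_fit=100.0, spf_fit=0.5, rf_fit=0.5, latent_fit=9.9)
print(f"SPFM = {spfm:.1%}, LFM = {lfm:.1%}")
```

Note that the LFM denominator excludes the single-point and residual rates, since those faults are already accounted for in the SPFM.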
What's In A Name?
Electronics in cars have always been treated as black boxes: when a part fails, it is swapped for another. So when the ISO standard refers to a safety element and a safety mechanism, these have traditionally been discrete components, which made the safety components rather obvious.
Technology, however, allows us to squeeze everything into an IC, blurring the lines between safety-related and non-safety-related components. For example, registers may be shared between safety-related and non-safety-related parts of the design. This complicates matters because only safety-related elements should be used in calculating fault metrics, yet once safety and non-safety logic are mixed together, the entire design becomes a safety component. The only way to justify removing non-safety-related logic is to show that the safety and non-safety logic are independent, which can be done using a dependent failure analysis (DFA). Proving independence, however, is typically not easy, and including all of the design logic means injecting more faults than necessary, which will most likely dilute the diagnostic coverage.
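A small back-of-the-envelope calculation shows how that dilution can happen. The fault counts are invented, and the sketch assumes the pessimistic case in which none of the faults in the non-safety logic are detected by the safety mechanism and none can be shown to be safe.

```python
def diagnostic_coverage(detected_faults: int, total_faults: int) -> float:
    """Fraction of the injected faults that the safety mechanism detects."""
    if total_faults <= 0:
        raise ValueError("total_faults must be positive")
    return detected_faults / total_faults


# Hypothetical fault counts, for illustration only.
SAFETY_FAULTS = 1000      # faults in the safety-related logic
DETECTED_FAULTS = 990     # of those, detected by the safety mechanism
NON_SAFETY_FAULTS = 500   # faults in non-safety logic (assumed undetected)

safety_only = diagnostic_coverage(DETECTED_FAULTS, SAFETY_FAULTS)
whole_design = diagnostic_coverage(DETECTED_FAULTS, SAFETY_FAULTS + NON_SAFETY_FAULTS)
print(f"safety logic only: {safety_only:.1%}, whole design: {whole_design:.1%}")
```

With these numbers the coverage falls from 99% to 66% even though the safety mechanism itself is unchanged, which is why showing independence and excluding the non-safety logic is worth the effort.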
Read the entire technical paper, "It’s Not My Fault! How to Run a Better Fault Campaign Using Formal."