Failure Distribution Function
probability a device will have failed by time t
Failure Probability Density
rate at which devices are failing at time t
Hazard Function/Failure Rate
rate at which devices fail, normalised to the number of devices yet to fail (i.e. the rate at which the remaining devices are failing)
Mean Time Between Failures
average operating time between one failure of a device and the next
Reliability Function
probability a device has not failed at time t
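The quantities above are linked by a few standard identities. The symbols are a conventional choice assumed here rather than taken from the text: $F(t)$ for the failure distribution function, $f(t)$ for the failure probability density, $h(t)$ for the hazard function, and $R(t)$ for the probability that a device has not failed at time t.

```latex
\begin{align}
R(t) &= 1 - F(t) \\
f(t) &= \frac{dF(t)}{dt} \\
h(t) &= \frac{f(t)}{R(t)} = \frac{f(t)}{1 - F(t)}
\end{align}
```

In the special case of a constant hazard, $h(t) = \lambda$, these give $R(t) = e^{-\lambda t}$ and a mean time between failures of $1/\lambda$.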
Failure rates are often quoted in FIT (Failures In Time), where 1 FIT is one failure per billion device operating hours:
$1\ \text{FIT} = 10^{-9}\ \text{failures per hour}$
A good value for non-critical systems is around 10 FIT.
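The FIT definition can be turned into a couple of small conversions. This is a minimal sketch, not a standard library: the function names are hypothetical, and both the MTBF conversion and the fleet estimate assume a constant failure rate.

```python
HOURS_PER_BILLION = 1e9  # 1 FIT = 1 failure per 1e9 device-hours


def fit_to_failures_per_hour(fit):
    """Convert a failure rate in FIT to failures per device-hour."""
    return fit / HOURS_PER_BILLION


def fit_to_mtbf_hours(fit):
    """MTBF in hours, assuming a constant failure rate (MTBF = 1 / rate)."""
    return HOURS_PER_BILLION / fit


def expected_failures(fit, devices, hours):
    """Expected failures in a fleet of devices, assuming a constant rate."""
    return fit_to_failures_per_hour(fit) * devices * hours


if __name__ == "__main__":
    # A 10 FIT part: MTBF in hours, and failures expected per year
    # across one million devices running continuously (8760 hours).
    print(fit_to_mtbf_hours(10))
    print(expected_failures(10, 1_000_000, 8760))
```

A low FIT figure describes a population: even at 10 FIT, a large fleet running continuously will see some failures every year.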
Parameters Affecting Failure Rates
As described in the material on Circuit Failure Mechanisms, failure rates depend on factors such as voltage, current, and temperature. Failure rates are normally related exponentially to these parameters. The maximum values given on datasheets for these parameters are selected to allow a reasonable failure rate.
Mean Time To Failure
As it is impractical to measure the full lifetimes of a sample of products before release (most consumer electronics are designed to last 10 years or more), the Mean Time To Failure (MTTF) is a good way of estimating the lifetime of a product. The MTTF is measured at a much higher temperature than the device was designed to operate at. This accelerates the failure rate dramatically, and the MTTF at the design temperature can then be calculated from the measured value using an acceleration factor.
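The text does not say which acceleration model to use. One common choice for temperature-driven failures is the Arrhenius model, which is assumed in the sketch below. The activation energy is an illustrative input, not a value from the text, and the function names are hypothetical.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def acceleration_factor(t_use_c, t_stress_c, activation_energy_ev):
    """Arrhenius acceleration factor between use and stress temperatures.

    Temperatures are in degrees Celsius and are converted to kelvin.
    """
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    exponent = (activation_energy_ev / BOLTZMANN_EV_PER_K) * (
        1.0 / t_use_k - 1.0 / t_stress_k
    )
    return math.exp(exponent)


def mttf_at_use(mttf_stress_hours, t_use_c, t_stress_c, activation_energy_ev):
    """Scale an MTTF measured under stress back to the use temperature."""
    factor = acceleration_factor(t_use_c, t_stress_c, activation_energy_ev)
    return mttf_stress_hours * factor


if __name__ == "__main__":
    # Assumed example: test at 125 C, use at 55 C, activation energy 0.7 eV.
    print(acceleration_factor(55.0, 125.0, 0.7))
    print(mttf_at_use(1000.0, 55.0, 125.0, 0.7))
```

The result is very sensitive to the assumed activation energy, so the extrapolated MTTF is only as good as that estimate.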
The testing method for MTTF can also be used to identify poorly manufactured components before sale. By running components for a short time at higher temperatures, currents, and voltages than they were designed for, the poorly made devices quickly fail, and the components that survive the test should have a much lower infant mortality rate. The survivors will also have used up part of their lifespan, and this must be considered when designing a Burn-in Test.
Devices with Multiple Components
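Assuming all components need to be operating correctly for the device to work:
$R_{\text{device}}(t) = \prod_{i=1}^{n} R_i(t)$
where pi notation is like sigma notation, but involves multiplication instead, and $R_i(t)$ is the probability that component $i$ has not failed at time t.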
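The multiplication rule for a device that needs every component working can be sketched as follows. The function names are hypothetical, and the sketch assumes that components fail independently. Adding failure rates relies on the further assumption that each component's rate is constant.

```python
import math


def series_reliability(component_reliabilities):
    """Probability that every component is still working.

    Assumes independent failures, so the probabilities multiply.
    """
    return math.prod(component_reliabilities)


def series_failure_rate_fit(component_fits):
    """Device failure rate in FIT when any component failure is fatal.

    Assumes each component has a constant failure rate, so rates add.
    """
    return sum(component_fits)


if __name__ == "__main__":
    # Three components with reliabilities 0.99, 0.98 and 0.95 at some time t.
    print(series_reliability([0.99, 0.98, 0.95]))
    # The same three components rated at 10, 20 and 50 FIT.
    print(series_failure_rate_fit([10, 20, 50]))
```

The device is always less reliable than its weakest component, which is why reducing the component count improves reliability.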
The following methods can be used to improve reliability:
- Thermal Management (analyse power components, use heat sinks)
- De-rating (use higher-rated components than necessary)
- Over-specify (use more accuracy, demand better tolerances, require more head-room)
- Review designs periodically
- Failure Analysis (using FMEA)
- Simplicity (the less there is, the less that can go wrong)
- Use good quality components
- Where possible, design failure-prone components such that they can work in parallel. A setup requiring ALL parallel components to fail for the section to fail is far more reliable.
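The benefit of the parallel arrangement in the last item can be illustrated with a short calculation. This is a sketch under the assumption that the parallel components fail independently, and the function name is hypothetical.

```python
def parallel_reliability(component_reliabilities):
    """Probability that at least one parallel component is still working.

    The section fails only if ALL components fail, so the failure
    probabilities multiply (assuming independent failures).
    """
    prob_all_fail = 1.0
    for reliability in component_reliabilities:
        prob_all_fail *= 1.0 - reliability
    return 1.0 - prob_all_fail


if __name__ == "__main__":
    # One component with reliability 0.9 versus two or three in parallel.
    print(parallel_reliability([0.9]))
    print(parallel_reliability([0.9, 0.9]))
    print(parallel_reliability([0.9, 0.9, 0.9]))
```

With independent failures, each extra parallel component multiplies the section's failure probability by that component's own failure probability, so two 90% reliable parts together reach about 99%.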