Reliability requirements for products and systems are constantly rising due to stricter consumer expectations. Manufacturers who are able to manage the reliability of their products have a significant competitive advantage. The most important issue they face in the early stages of product design and development is predicting reliability in the initial stages of a product's life.
The development of a domain-specific software architecture for intelligent
(adaptive) guidance, navigation and control for aerospace applications was discussed
by Agrawala et al. (1992). In that study, the
need to adapt to a variety of specialized target hardware systems, requirements
for high reliability and system certification and increasing demands for functional
integration and high performance computing were presented. The approach used
by Agrawala et al. (1992) exhibited three major
themes: an extensive reliance on formal models, a provision of multiple views
corresponding to multiple areas of skills and requirements and an open toolset
and layered architecture. The importance of reliability is stressed by
considering the importance of fault tolerance, security, testing and plant model
verification. However, their software approach to the problem is not useful
in the system under study, since the present system is already operational and
no changes in its software or structure are possible.
Reliability improvement or growth is one of the main objectives in any system
development effort, especially in sensitive medical equipment, aerospace and
military applications. Reliability improvement is an ongoing effort in many
industries. For example, Braem et al. (2008)
modeled probabilistic connectivity in multi-hop body sensor networks
in order to determine ways to improve reliability. They give results for two reliability
improvements: randomization of the schemes and repeating the schemes
received from a parent node. Moreover, Qi et al.
(2009) present results of experimental work carried out in order to find
the reliability of plastic ball grid array packages under various manufacturing
and multiple environmental loading conditions. They performed board-level temperature
cycling, vibration and combined temperature cycling and vibration testing to
quantify reliability and find ways to improve it. Todinov
(2009) addressed the issue of reliability improvement in a product using
a comparative method for improving the resistance to failure initiated by flaws.
The advantage of his proposed method is that it does not rely on a Monte Carlo simulation and
does not depend on knowledge of the size distribution of the flaws and the material
The aim of this study is to improve the overall reliability of an electronic navigation and guidance system. High reliability parts are used to improve the reliability of the main computer of the system, since the other alternatives were much more expensive to realize. The first step in any reliability improvement program is reliability prediction.
METHODS TO PREDICT RELIABILITY OF ELECTRONIC SYSTEMS
Reliability prediction is concerned with assigning a numeric value to a reliability
indicator, usually either the failure rate (λ) or the Mean Time To Failure
(MTTF), in the initial stages of design and development of a product. Reliability
prediction can be carried out using various techniques such as past experience
with similar items, expert estimates, etc. Denson
and Keene (1998) presented a reliability assessment methodology for electronic
systems whereby an initial reliability prediction was first derived and then
a model was used for data fusion or integration of all reliability data. Their
model introduced a variance measure into the Mean Time Between Failures (MTBF)
However, the most credible approach to reliability prediction is to use existing international reliability databases and reliability prediction methods. Such databases provide numeric values of reliability indicators for specific types of items. Reliability prediction methods provide, for separate groups of items, models that account for specific operating conditions through various factors and allow a numeric value to be calculated for the part's failure rate.
Stresses which parts experience during operation play a vital role in their expected
lifetime. Klinger et al. (1989) presented a
detailed analysis of how to include the effect of various stresses on the
reliability of parts in the AT&T approach. The Arrhenius equation, with a different
activation energy for each mode of failure, is used in this approach.
The main difficulty with this approach is determining the exact value of the activation
energy to be used for each mode of failure.
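The sensitivity to the chosen energy value can be illustrated with the textbook Arrhenius acceleration factor. The sketch below is not the AT&T procedure itself; the function name, the temperatures and the three energy values are assumptions chosen only for illustration.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration(ea_ev, t_use_c, t_stress_c):
    """Ratio of the failure rate at t_stress_c to the failure rate at t_use_c.

    ea_ev is the activation energy in eV (hypothetical values below);
    temperatures are given in degrees Celsius and converted to kelvin.
    """
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    exponent = (ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_use_k - 1.0 / t_stress_k)
    return math.exp(exponent)


# Illustrative energies only; real values depend on the failure mechanism.
for ea in (0.3, 0.7, 1.0):
    factor = arrhenius_acceleration(ea, t_use_c=40.0, t_stress_c=85.0)
    print(f"Ea = {ea:.1f} eV -> acceleration factor {factor:.1f}")
```

Because the energy sits in the exponent, a modest error in its value changes the predicted failure rate by a large factor, which is the difficulty noted above.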
A simple and widely used source of failure rate data is MIL-HDBK-217F
(1995), which was developed by the US Department of Defense. This standard
was primarily developed for reliability prediction of military electronic components.
Nowadays, the standard is also used in many non-military areas and
it is the most widely used reliability prediction method for electronic components.
The values included in the standard are based on statistical analysis of actual
field failures and are used to calculate failure rates. The standard contains
predictions for generic types of electronic components such as microcircuits,
semiconductors, tubes, lasers, resistors, capacitors, inductive devices, rotating
devices, relays, switches, connectors, interconnection assemblies, meters, quartz
crystals, lamps, electronic filters and fuses. It contains two prediction methods,
a parts count method and a part stress method. The parts count method may be
used early in the design and development of the product, while the part stress
method requires a great amount of detailed information about the
operating conditions of the parts used in the product and is applicable only later,
when stresses and other environmental and quality factors are known for each
component. Other databases were developed later.
One may also refer to the EPRD-97 (1997) database which
contains failure rate data on electronic components such as capacitors, diodes,
integrated circuits, optoelectronic devices, resistors, thyristors, transformers
and transistors. The data included in the database has been gathered from the
early 1970s up to 1996 by long-term monitoring of the components in the
field with primary emphasis on obtaining data on relatively new component types,
different sources, application environments and quality levels. The purpose
of this database is to provide failure rate data on commercial quality components
and to complement MIL-HDBK-217F (1995) by providing data
on the component types not addressed by it. For failure
rate data on non-electronic parts, the NPRD-95 (1995)
was developed using data collected from the early 1970s up to 1994. It
contains failure rate data on a wide variety of electrical, electromechanical
and mechanical components obtained by long-term monitoring of the components
in the field. The data collection was focused on obtaining data on relatively
new component types, data on many different sources, application environments
and quality levels.
SPIDR was released in 2006 by the Alion System Reliability Center as an integrated system and parts reliability database. It contains reliability data intended to replace the Non-electronic Part Reliability Data (NPRD-95), the Electronic Part Reliability Data (EPRD-97), the Failure Mode and Mechanism Distributions (FMD-97) and the Electrostatic Discharge Susceptibility Data (VZAP, 1995). It contains more than twice the amount of data in the previous two databases, covering more than 6000 electronic, electric, electromechanical and mechanical component types. The database is based on nearly 40 years of experience and data collection.
A similar effort was carried out in other countries to compile failure rate data and reliability prediction techniques. One may cite FIDES, which was developed by a consortium of French defense and commercial aeronautical companies and published under the supervision of the French Ministry of Defense in 2004. The advantage of FIDES over earlier methods is that it is based on the physics
of failures method and supported by the analysis of test data, field returns
and existing modeling, as expressed by Martin and Robert (2005).
FIDES was developed using practical failure data from the aeronautical and military
area and from manufacturers with the objective of making realistic reliability
predictions for electronic equipment, including systems operating in severe
environments such as military defense and aeronautics. The method takes into
account the failures due to manufacturing, development and stresses related
to the application field, e.g., electrical, mechanical and thermal. The method
is focused on electric, electronic and electromechanical items including integrated
circuits, discrete semiconductors, capacitors, thermistors, resistors, potentiometers,
inductors, transformers, relays, printed circuit boards, connectors and piezoelectric
parts. FIDES provides models for components and printed wiring assemblies,
considers technological and physical factors, considers the mission profile,
considers mechanical and thermal overstresses and considers the failure rates
of a specific supplier of a component. Moreover, FIDES takes into account failures
linked to development, production, field operation or maintenance processes.
As Martin and Robert (2005) concluded, the estimates of
failure rates provided by the FIDES methodology compare closely to the observed
failure rates, while the MIL-HDBK-217 predictions are more conservative. However,
there are not yet sufficient studies using FIDES to support the validity of
the approach, and further evaluation of different systems using FIDES
is needed in order to verify its consistency and accuracy.
Another difficulty in reliability prediction is rooted in the fact that in
actual system reliability prediction, one often needs to rely on several different
data sources. Using data from various sources each with differing degrees of
estimation uncertainty poses several problems as addressed by Coit
and Jin (2001), who proposed to prioritize system-reliability prediction
activities and defined a Reliability-Prediction Prioritization Index (RPPI)
to rank components based on their potential for improving the accuracy of a
system-level reliability prediction by decreasing the variance of the system-reliability
estimate. Coit and Jin (2001) provided several examples
and proposed additional testing or analysis for components with a high RPPI
to reduce the variance of the component reliability estimate. This is similar
to the sensitivity analysis which is a common approach in reliability studies.
Ramirez-Marques and Levitin (2008) proposed an approach
for the estimation of reliability confidence bounds based on component reliability
and uncertainty data. Their proposed approach is based on the universal generating
function technique. They showed that this approach is even more effective than
pure Monte Carlo simulations due to more precise reliability estimation and
less computational effort.
RBD SYSTEM RELIABILITY PREDICTION
One may either use analytic methods for the prediction of system reliability
or computer simulations. In this study, the Reliability Block Diagram (RBD)
approach was adopted. The system was first studied in detail to discover the
way each part affected the reliability of the modules in which it was used and
then the role of each module in the reliability of the subsystem was studied.
An RBD was developed. Next, the failure rate of the system was computed using
Excel spreadsheets. For n parts in series, the reliability may be computed
as in Eq. 1:

$$R_s = \prod_{i=1}^{n} R_i \qquad (1)$$

where R_i is the reliability of the i-th part.
Or, for nr redundant parts in parallel, the reliability may be computed as in Eq. 2:

$$R_p = 1 - \prod_{i=1}^{n_r} \left(1 - R_i\right) \qquad (2)$$
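The system was assumed to be operating in its useful period of life, where the exponential probability distribution function may be assumed for the reliability of the parts as in Eq. 3:

$$R_i(t) = e^{-\lambda_i t} \qquad (3)$$

where λ_i is the constant failure rate of the i-th part and t is the operating time.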
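The series, parallel and exponential relations used in an RBD can be sketched in a few lines of Python. This is only an illustration of the standard formulas, not the spreadsheet used in the study; the function names, the three-board layout and the failure rates are hypothetical.

```python
import math


def exponential_reliability(failure_rate_per_hour, hours):
    """Reliability of a part with a constant failure rate (useful-life period)."""
    return math.exp(-failure_rate_per_hour * hours)


def series(reliabilities):
    """A series block works only if every part works."""
    return math.prod(reliabilities)


def parallel(reliabilities):
    """A redundant block fails only if every parallel part fails."""
    return 1.0 - math.prod(1.0 - r for r in reliabilities)


# Hypothetical failure rates (failures per hour) for three boards in series.
mission_hours = 1000.0
processor = exponential_reliability(20e-6, mission_hours)
interface = exponential_reliability(10e-6, mission_hours)
power_supply = exponential_reliability(30e-6, mission_hours)

plain = series([processor, interface, power_supply])
# Same chain, but with the power supply duplicated as a redundant pair.
redundant = series([processor, interface, parallel([power_supply, power_supply])])

print(f"series only:           {plain:.5f}")
print(f"redundant power supply: {redundant:.5f}")
```

Nesting `series` and `parallel` calls in this way mirrors how blocks are nested in the diagram, so a larger RBD can be evaluated from the innermost blocks outward.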
In this study, the failure rate data were mostly obtained from the most common
of these data sources, that is MIL-HDBK-217F (1995). Reliability
data for parts whose failure rate data were not available in MIL-HDBK-217F were
extrapolated from NPRD-95 (1995) to adjust for the conditions
in which the system operates.
The failure rate of electric and electronic parts is a function of their quality as well as other factors such as temperature, operating environment and other stresses such as pressure, voltage, etc., as in Eq. 4:

$$\lambda_p = \lambda_b \, \pi_Q \, \pi_T \, \pi_E \, \pi_S \qquad (4)$$

where λp is the part failure rate, λb is the base failure rate, πQ is the quality factor and πT, πE and πS indicate the various other factors which affect the reliability.
In most common failure rate databases, the basic procedure for calculating the failure rate is to multiply a base failure rate by operational and environmental stress factors. An example of a part stress model for a semiconductor component is given in Eq. 5:

$$\lambda_p = \lambda_g \, \pi_T \, \pi_A \, \pi_R \, \pi_S \, \pi_C \, \pi_Q \, \pi_E \qquad (5)$$

where λp is the part failure rate, λg is the generic base failure rate for the part, πT is the correction factor to consider the effect of temperature on the failure rate, πA is the application factor, πR is the power rating factor, πS is the power stress factor, πc is the contact construction factor, πQ is the quality factor and πE is the environment factor.
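A multiplicative part stress model of this kind reduces to one product over the correction factors. The sketch below assumes that form; the function name and every numeric value are hypothetical placeholders, not figures taken from MIL-HDBK-217F.

```python
import math


def part_stress_failure_rate(base_rate, pi_factors):
    """Multiply a base failure rate by all correction (pi) factors.

    base_rate is in failures per million hours; pi_factors maps a factor
    name to its dimensionless value.
    """
    return base_rate * math.prod(pi_factors.values())


# Hypothetical factor values for one semiconductor part.
factors = {
    "temperature": 2.0,
    "application": 1.5,
    "power_rating": 1.0,
    "power_stress": 0.5,
    "contact_construction": 1.0,
    "quality": 5.0,
    "environment": 6.0,
}

low_quality = part_stress_failure_rate(0.002, factors)
# Changing only the quality factor scales the result by the same ratio.
high_quality = part_stress_failure_rate(0.002, {**factors, "quality": 1.0})

print(f"low quality part:  {low_quality:.4f} failures per million hours")
print(f"high quality part: {high_quality:.4f} failures per million hours")
```

Since the model is a pure product, the ratio of two predictions that differ only in quality equals the ratio of their quality factors, regardless of the other stresses.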
The various operating conditions such as ground, ground mobile, naval, air or missile launch also have an effect on part reliability as shown in Table 1.
Table: The various coefficients of operating conditions for various ... (N/A: Not available)
Table: The various classes and quality factors for various semiconductor parts per MIL-HDBK-217F (1995)
Table: The various classes and quality factors for resistors and capacitors per MIL-HDBK-217F (1995)
Table: The various classes and quality factors for various other devices per MIL-HDBK-217F (1995) (N/A: Not available)
The quality of the parts being used in the system and the type of part screening also affect the reliability of a system as shown in Table 2.
The effect of quality on semiconductor parts is shown in Table 3.
Table 4 shows the various quality levels used for resistors and capacitors, whereas the effect of quality on other device types is shown in Table 5.
It is easy to see that the failure rates for various parts are strongly dependent on part quality. For example, a low quality resistor has a failure rate that is 333.3% more than one with class S quality, and the failure rate of a plastic high frequency diode is 100 times higher than that of a JANTXV equivalent.
THE SYSTEM BEING STUDIED
The system under study is only a small portion of the overall system in which
there are data communication links, power generation and distribution, navigation
and guidance computer, actuators and drives as well as the physical plant itself
as shown in Fig. 1.
Fig. 1: The various subsystems in the total system
Fig. 2: The navigation and guidance computer subsystem
Fig. 3: The physical interconnections between the various subsystems of the navigation and guidance computer
The subsystem studied in this project is mainly responsible for the navigation and guidance of the vehicle. It also monitors all sensitive devices by receiving their status and issues proper warning signals in case any problems arise. A more detailed description of the navigation and guidance computer is shown in Fig. 2.
This subsystem is itself composed of many smaller units which are briefly shown
in Fig. 3.
Fig. 4: The reliability block diagram for the navigation and guidance ...
Table: The range of the expected failure rate for the various parts used in the main navigation and guidance computer in terms of quality per ...
Table: A part count summary of the various electronic parts and their average failure rates under different operating conditions per MIL-HDBK-217F
It consists of a digital processor board, an interface
board, a converter board, a power supply board, a base board and a panel which
provides the interface between the system and the operator. The electrical schematics and the wiring tables are not reproduced here, even
though they were studied to see how each subsystem's functioning affected the
overall system's function in order to determine the reliability block diagram
of each board, each subsystem and the overall system.
The navigation and guidance computer was studied in detail to establish the functional role of its various subsystems in its overall functionality, and its reliability block diagram is shown in Fig. 4.
In this study, the effect of part quality is taken into consideration by indicating the minimum, average and maximum failure rate for parts used in the system as shown in Table 6.
The components making up the various modules of the system are shown in Table 7.
The expected minimum, average and maximum failure rates for the parts making
up the system may be computed using the parts count approach per MIL-HDBK-217F
(1995) as shown in Table 8. The quality of the parts used
in the existing system has also been recorded in the table, showing that an
improvement in reliability may be gained by employing high quality parts in
the various subsystems of the navigation and guidance computer.
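The parts count method sums, over all part types, the quantity times the generic failure rate times the quality factor. A minimal sketch of that summation follows; the class name, the part list and all numeric values are hypothetical and are not the contents of Tables 6 to 8.

```python
from dataclasses import dataclass


@dataclass
class PartType:
    name: str
    quantity: int
    generic_rate: float    # failures per million hours, for one part
    quality_factor: float  # dimensionless pi_Q


def parts_count_failure_rate(parts):
    """Sum of quantity * generic failure rate * quality factor over part types."""
    return sum(p.quantity * p.generic_rate * p.quality_factor for p in parts)


# Hypothetical bill of materials for one board.
board = [
    PartType("resistor", 40, 0.002, 3.0),
    PartType("capacitor", 25, 0.004, 3.0),
    PartType("integrated circuit", 10, 0.02, 2.0),
]

total = parts_count_failure_rate(board)
print(f"board failure rate: {total:.2f} failures per million hours")
```

Repeating the sum with the lowest and highest quality factor of each part type gives the kind of minimum and maximum failure rate range discussed above.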
Table: The failure rate for the various modules of the main navigation and control computer in failures per million hours
Table: Failure rate and the MTTF of the navigation and guidance computer for various part qualities
Fig. 5: Reliability as a function of operating time indicating substantial improvement in reliability using higher quality parts
The total failure rate and MTTF of the navigation and guidance computer are computed using the obtained data and are shown in Table 9.
The system failure rate may be drastically reduced by using high quality parts. Table 9 shows that an 18.1-fold improvement in MTTF may be gained by using high quality parts instead of low quality parts. The system reliability is also computed and plotted in Fig. 5, which indicates a drastic reliability improvement with medium quality parts (QF = med) and an even greater improvement with very high quality parts (QF = min).
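Under the constant failure rate assumption, the MTTF is the reciprocal of the failure rate, so the MTTF improvement factor equals the ratio of the two failure rates. The sketch below illustrates this relation; the function names and the two failure rates are hypothetical and are not the values of Table 9.

```python
import math


def mttf_hours(failures_per_million_hours):
    """MTTF in hours for a constant failure rate given per million hours."""
    return 1.0e6 / failures_per_million_hours


def reliability(failures_per_million_hours, hours):
    """Probability of surviving the given operating time without failure."""
    return math.exp(-failures_per_million_hours * 1.0e-6 * hours)


# Hypothetical system failure rates for two part quality levels.
low_quality_rate = 50.0
high_quality_rate = 5.0

improvement = mttf_hours(high_quality_rate) / mttf_hours(low_quality_rate)
print(f"MTTF improvement factor: {improvement:.1f}")

for hours in (1000.0, 5000.0, 20000.0):
    print(
        f"t = {hours:>7.0f} h: "
        f"low quality R = {reliability(low_quality_rate, hours):.3f}, "
        f"high quality R = {reliability(high_quality_rate, hours):.3f}"
    )
```

At an operating time equal to the MTTF, the reliability is exp(-1), about 0.37, whatever the failure rate, which is why a longer MTTF shifts the whole reliability curve to the right.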
Since the quality of the parts used in the design of a system plays a vital role in the overall reliability of that system, it is important to adopt ways to quantify the effect of part quality on reliability. Standards used for this purpose were reviewed and the MIL-HDBK-217F approach for quantifying the effect of part quality on system reliability was adopted in this study. The reliability of the computer of the navigation and guidance system was modeled and its reliability prediction was performed.
The use of high quality parts to increase the reliability of an electronic navigation and guidance system was shown to be very effective in this system, where we could not rely upon integration of parts, derating or increased stress testing. A substantial gain in reliability was achieved when assuming the use of high quality parts. Of course, this comes at a price, one that is paid in sensitive or military applications.
The results presented in this study were obtained using Excel spreadsheets and Borland C++ programming on an IBM PC with a 2.8 GHz Celeron processor. Finally, it is suggested that a new standard be developed for the quantification of the reliability of parts in the developing countries, either through grants from UNESCO, the non-aligned nations or the Islamic nations, for the development of products in these countries.
The author hereby acknowledges the support of the Office of International Cooperation, the Office of Applied Research for the military research grant and the Office of the Vice Chancellor of Research and Technology of the Ferdowsi University of Mashhad.