
Asian Journal of Scientific Research

Year: 2017 | Volume: 10 | Issue: 3 | Page No.: 128-138
DOI: 10.3923/ajsr.2017.128.138
Estimation the Failure Rate and Reliability of the Triple Modular Redundancy System Using Weibull Distribution
Mohammed Mohammed El Genidy and Esraa Ahmed Hebeshy

Abstract: Background and Objective: In engineering, redundancy is the duplication of critical components or functions of a system with the intention of increasing reliability and decreasing the failure rate of the system, usually as a fail-safe. This study aimed to create a mathematical model to find the optimum values of the distribution parameters of the triple modular redundant system, so that the system achieves higher safety, a longer lifetime and higher reliability with a smaller failure rate. Methodology: Maximum likelihood estimation and order statistics were applied to estimate the optimum values of the failure rate, reliability and lifetime of the Triple Modular Redundancy (TMR) system, which contains three redundant electronic units with Weibull-distributed lifetimes. Results: Numerical values of the failure rate, reliability and lifetime of the triple modular redundant system were obtained using the Weibull distribution. Conclusion: A TMR system can thus be designed for a longer lifetime and higher safety, which decreases the risks of sudden failure and economic losses.


How to cite this article
Mohammed Mohammed El Genidy and Esraa Ahmed Hebeshy, 2017. Estimation the Failure Rate and Reliability of the Triple Modular Redundancy System Using Weibull Distribution. Asian Journal of Scientific Research, 10: 128-138.

Keywords: majority gate, order statistics, Triple modular redundant system and Weibull distribution

INTRODUCTION

In many safety-critical systems, such as hydraulic systems in aircraft, some parts of the control system may be triplicated, which is formally termed Triple Modular Redundancy (TMR). An error in one component may then be out-voted by the other two. A triply redundant system has three sub-components; in a purely parallel arrangement all three must fail before the system fails, whereas a majority-voted arrangement tolerates the failure of any one of them. Since each one rarely fails and the sub-components are expected to fail independently, the probability of all three failing is calculated to be extraordinarily small and is often outweighed by other risk factors, e.g., human error. This form of redundancy is also known as majority voting or voting logic. TMR can be applied to many forms of redundancy, such as software redundancy in the form of N-version programming. Some ECC memory uses TMR hardware rather than the more common Hamming code, because TMR hardware is faster than Hamming error correction hardware. Space satellite systems often use TMR hardware, although satellite RAM usually uses Hamming error correction. Al Mutairi Alya and Low [1] estimated the central tendency measures of the random-sum Poisson-Weibull distribution using the saddlepoint approximation. A simple probabilistic model was proposed for the analysis of the fault masking performance of hierarchical TMR networks [2]. The statistical problem of parametric treatment comparison for partly interval-censored failure time data under the Weibull distribution was addressed via multiple imputation [3]. El Damsesy et al. [4] studied the reliability and failure rate of an electronic system using the mixture Lindley distribution. The estimation of some parameters of the K-station series model was presented by El Genidy [5]. The joint probability distribution function of the repairing time for the defective machines in a multiple queues system was studied by El Genidy [6]. Pan et al. [7] studied heat dissipation and thermo-mechanical reliability for multi-chip module high power LED integrated packaging with through silicon vias. Reshid and Abd Majid [8] dealt with a multi-state reliability model for a gas fueled cogenerated power plant. Saat et al. [9] considered the ratio of the maximized likelihood and the Vuong test for choosing between the Weibull and Gamma distributions in an application to sleep apnea, where the probabilities of correct selection were obtained by Monte Carlo simulation. Zhang et al. [10] studied a triple modular redundancy dynamic fault-tolerant system model.

In this study, order statistics and the maximum likelihood method were applied to estimate the optimum values of the scale parameter for three different cases of the shape parameter of the Weibull distribution. Moreover, graphical representations show the failure rate and reliability of the TMR system at different values of the shape and scale parameters of the Weibull distribution. Thus, the values of the shape and scale parameters that make the TMR system more efficient and safer can be determined.

MATERIALS AND METHODS

Triple Modular Redundant (TMR) system: It is a form of N-modular redundancy in which three systems perform a process and the result is processed by a majority voting system to produce a single output. If any one of the three systems fails, the other two can correct and mask the fault.

Majority gate: In the TMR system, three identical logic circuits (logic gates) are used to compute the same specified Boolean function (Fig. 1). If there are no circuit failures, the outputs of the three circuits are identical. However, due to circuit failures, the outputs of the three circuits may differ. The output of the majority gate is 1 if two or more of its inputs are 1 and 0 if two or more of its inputs are 0. The majority gate is a simple AND-OR circuit: if the inputs to the majority gate are denoted by x, y and z, then its output is xy or yz or xz (Table 1). For this reason, the majority gate output is the same as the carry output of a full adder, and the gate acts as a voting machine.
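The voting rule described above can be checked with a few lines of Python. This is a minimal sketch rather than a circuit model; the function names `majority` and `full_adder_carry` are illustrative and do not come from the article.

```python
from itertools import product


def majority(x: int, y: int, z: int) -> int:
    """Majority vote of three bits, written as the AND-OR form xy + yz + xz."""
    return (x & y) | (y & z) | (x & z)


def full_adder_carry(x: int, y: int, z: int) -> int:
    """Carry-out of a one-bit full adder: 1 when the bit sum is 2 or 3."""
    return (x + y + z) >> 1


# Print the eight rows of the majority-gate truth table.
print("x y z | output")
for x, y, z in product((0, 1), repeat=3):
    print(f"{x} {y} {z} | {majority(x, y, z)}")
```

The two functions agree on all eight input combinations, which is the sense in which the majority gate equals the carry output of a full adder.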

Order statistics: Let X1, X2, . . . , Xn be mutually independent, identically distributed continuous random variables, each having the distribution function F(x) and probability density function f(x), this means that:

Table 1: Truth table of the majority gate in the triple modular redundant system

Fig. 1: Majority gate of TMR system

Let Y1, Y2, . . . , Yn be the random variables obtained by arranging the set X1, X2, . . . , Xn in increasing order. The random variable Ym is called the mth-order statistic, where Y1 = min{X1, X2, . . . , Xn} and Yn = max{X1, X2, . . . , Xn}. Since X1, X2, . . . , Xn are continuous random variables, it follows that Y1 < Y2 < . . . < Yn with probability one.

Order statistics are used in three cases. Let Xi be the lifetime of the ith component in a system of n independent components:

•  Case 1: The system is a series system containing n parts or components that differ in their type of service or function and do not have equal lifetimes, as in Fig. 2.

Here Y1 is the overall lifetime of the series system, Y1 = min{X1, X2, . . . , Xn}.

•  Case 2: The system is a parallel system containing n parts or components that have the same type of service or function and do not have equal lifetimes, as in Fig. 3.

Here Yn is the overall lifetime of the parallel system, Yn = max{X1, X2, . . . , Xn}.

•  Case 3: If the system is an m-out-of-n system (the so-called N-Tuple Modular Redundant or NMR system), then Yn-m+1 is the system lifetime. In the TMR system (n = 3, m = 2), Y2 is the lifetime of the system: the system has failed by time y when at least two of the electric gates have lifetimes located in (0, y], because the output of the majority gate equals one only if at least two of its inputs equal one (Table 1). The probability that exactly j of the components Xi have lifetimes located in (0, y] and (n-j) have lifetimes located in (y, ∞) is defined as:

$$P\{\text{exactly } j \text{ components have lifetimes in } (0, y]\} = \binom{n}{j}\,[F(y)]^{j}\,[1-F(y)]^{n-j}$$

That is, the number of components with lifetimes in (0, y] has a binomial distribution with success probability F(y) and n trials. The distribution function of Ym is then defined in Eq. 1:

Fig. 2: Series system of n components

Fig. 3: Parallel system of n components

(1)
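The binomial argument above leads to the standard distribution function of the mth-order statistic: Ym ≤ y exactly when at least m of the n lifetimes fall in (0, y]. The sketch below evaluates that sum numerically; the function name `order_stat_cdf` is an assumption made for illustration, and the form used is the textbook one, which may differ in notation from the article's Eq. 1.

```python
from math import comb


def order_stat_cdf(F: float, m: int, n: int) -> float:
    """P(Y_m <= y) given F = F(y): at least m of n i.i.d. lifetimes lie in (0, y]."""
    return sum(comb(n, j) * F**j * (1 - F) ** (n - j) for j in range(m, n + 1))


F = 0.3  # value of the component distribution function at some time y
print("series   (m = 1, n = 3):", order_stat_cdf(F, 1, 3))  # equals 1 - (1 - F)^3
print("TMR      (m = 2, n = 3):", order_stat_cdf(F, 2, 3))  # equals 3F^2 - 2F^3
print("parallel (m = 3, n = 3):", order_stat_cdf(F, 3, 3))  # equals F^3
```

Setting m = 1 gives the series system, m = n the parallel system, and m = 2 with n = 3 the TMR system, so one function covers all three cases discussed in this section.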

In Eq. 1, if the system is a series system, then Y1 is a random variable representing the lifetime of the series system, and the cumulative distribution function of Y1 is obtained in Eq. 2:

Put m = 1, F(y) = p and 1-F(y) = q in Eq. 1, thus:

(2)

On the other hand, the reliability and the failure function of the series system are defined in Eq. 3 and 4:

(3)

(4)

In the case of the parallel system, Yn is a random variable representing the lifetime of the parallel system. Putting m = n in Eq. 1, the cumulative distribution function of Yn is defined in Eq. 5:

(5)

Then, the reliability and the failure function of the parallel system are defined in Eq. 6 and 7:

(6)

(7)

If there are n different components, then from Eq. 3 and 6, the reliability is defined in Eq. 8 and 9:

(8)

(9)

If there are n identical components, then from Eq. 3 and 6, the reliability is defined in Eq. 10 and 11:

(10)

(11)

If the lifetimes of the n different components are exponentially distributed with different parameters λi (i = 1, 2, …, n), then from Eq. 8 and 9 the reliability and failure function are defined in Eq. 12-15:

(12)

(13)

(14)

(15)

If there are L different types of components in the series system and, in addition, there are ni components of type i with parameter λi, then from Eq. 13 the failure function is obtained in Eq. 16:

(16)

If the lifetimes of the n identical components are exponentially distributed with the same parameter λ, then from Eq. 10 and 11 the reliability and failure function are defined in Eq. 17-20:

(17)

(18)

(19)

(20)

In the case of the TMR system, Y2 is a random variable representing the lifetime of the TMR system. Putting m = 2 in Eq. 1 (with n = 3), the cumulative distribution function of Y2 is defined in Eq. 21-24:

(21)

(22)

(23)

(24)
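For orientation, the m = 2, n = 3 case can be worked out directly from the binomial form of the order-statistic distribution. The lines below are a standard derivation under that assumption and are not claimed to match the article's Eq. 21-24 step by step; R(y) = 1 - F(y) denotes the reliability of a single unit.

```latex
\begin{aligned}
F_{Y_2}(y) &= \sum_{j=2}^{3} \binom{3}{j}\,[F(y)]^{j}\,[1-F(y)]^{3-j} \\
           &= 3\,[F(y)]^{2}\,[1-F(y)] + [F(y)]^{3} \\
           &= 3\,[F(y)]^{2} - 2\,[F(y)]^{3}, \\
R_{TMR}(y) &= 1 - F_{Y_2}(y) = 3\,[R(y)]^{2} - 2\,[R(y)]^{3}.
\end{aligned}
```

The last identity follows by substituting F(y) = 1 - R(y) and expanding.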

Triple modular redundancy system with exponential distribution: Suppose the lifetimes T of the three identical components in the TMR system follow the same exponential distribution with rate parameter λ (Fig. 4). The reliability and failure function of the TMR system then follow from Eq. 24. Since R(t) = e^(-λt) for 0 < t < ∞, they are as shown in Eq. 25-27:

(25)

(26)

(27)
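A short numerical sketch of the exponential case follows. It assumes the standard 2-out-of-3 result R_TMR = 3R^2 - 2R^3 with R(t) = e^(-λt), and takes the failure function to be the hazard rate f_TMR/R_TMR; the function names are illustrative, and the expressions are not guaranteed to be written in the same form as the article's Eq. 25-27.

```python
from math import exp


def tmr_reliability_exp(t: float, lam: float) -> float:
    """R_TMR(t) = 3R^2 - 2R^3 with single-unit reliability R = exp(-lam * t)."""
    r = exp(-lam * t)
    return 3 * r**2 - 2 * r**3


def tmr_pdf_exp(t: float, lam: float) -> float:
    """f_TMR(t) = 6 F (1 - F) f with F = 1 - r and f = lam * r."""
    r = exp(-lam * t)
    return 6 * lam * (1 - r) * r**2


def tmr_hazard_exp(t: float, lam: float) -> float:
    """Failure (hazard) rate h_TMR(t) = f_TMR(t) / R_TMR(t)."""
    return tmr_pdf_exp(t, lam) / tmr_reliability_exp(t, lam)


lam = 0.1  # assumed failure rate of one unit, per year
for t in (1.0, 5.0, 10.0, 20.0):
    print(t, tmr_reliability_exp(t, lam), tmr_hazard_exp(t, lam))
```

Under these assumptions the hazard starts at zero, because a single failure is masked by the voter, and approaches 2λ for large t, when the system is most likely running on two surviving units.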

Triple modular redundancy system with Weibull distribution: Suppose the random variable T follows the Weibull distribution with shape parameter k > 0 and scale parameter λ > 0 (Fig. 5). The probability density function is defined in Eq. 28:

(28)

The Weibull distribution is related to a number of other probability distributions; in particular, if k = 1 then the probability density function in Eq. 28 reduces to that of the exponential distribution.

Fig. 4: Triple modular redundancy system with exponential distribution

Fig. 5: Triple modular redundancy system with Weibull distribution

The cumulative distribution function, reliability and failure function of Weibull distribution are defined in Eq. 29-31:

(29)

(30)

(31)
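The four Weibull functions can be sketched as follows. The parametrization is an assumption, since the equations themselves are not reproduced here: the common form with density (k/λ)(t/λ)^(k-1) exp(-(t/λ)^k) is used, so that k = 1 gives an exponential distribution with mean λ. Function names are illustrative.

```python
from math import exp


def weibull_pdf(t: float, k: float, lam: float) -> float:
    """Density f(t) = (k/lam) (t/lam)^(k-1) exp(-(t/lam)^k), for t > 0."""
    return (k / lam) * (t / lam) ** (k - 1) * exp(-((t / lam) ** k))


def weibull_cdf(t: float, k: float, lam: float) -> float:
    """Distribution function F(t) = 1 - exp(-(t/lam)^k)."""
    return 1.0 - exp(-((t / lam) ** k))


def weibull_reliability(t: float, k: float, lam: float) -> float:
    """Reliability R(t) = 1 - F(t) = exp(-(t/lam)^k)."""
    return exp(-((t / lam) ** k))


def weibull_hazard(t: float, k: float, lam: float) -> float:
    """Failure rate h(t) = f(t) / R(t) = (k/lam) (t/lam)^(k-1), for t > 0."""
    return (k / lam) * (t / lam) ** (k - 1)


# The hazard decreases for k < 1, is constant for k = 1 and increases for k > 1.
for k in (0.5, 1.0, 1.5):
    print(k, [round(weibull_hazard(t, k, 5.0), 4) for t in (1.0, 5.0, 10.0)])
```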

Statistical analysis and estimation method: The main aim of this study was to obtain the optimum values of the failure rate, lifetime and reliability of TMR system using Weibull distribution. Maximum likelihood method was applied to estimate the scale parameter λ at the different values of the shape parameter k of Weibull distribution.

Mathematical model and simulation study: In this study, a mathematical model was created using order statistics and the Weibull distribution to represent the TMR system. Mathematica software was used to find the values of the scale parameter for three different cases of the shape parameter of the Weibull distribution. Moreover, graphical representations of the failure rate, lifetime and reliability of the TMR system were plotted to determine the optimum values of the shape and scale parameters of the Weibull distribution, so that the TMR system achieves higher safety, a longer lifetime and higher reliability with a smaller failure rate. To estimate the scale parameter λ, a simulation study was carried out to find its optimum value by generating lifetimes (measured in years) of the TMR system.

RESULTS AND DISCUSSION

The TMR and double dual modular redundancy control systems for a subsea blowout preventer were studied by applying a Bayesian network model; the results showed that the double dual modular redundancy control system has slightly higher reliability than the triple modular redundancy system [11]. Romain, a framework that provides transparent redundant multithreading as an operating system service for hardware error detection and recovery, was introduced and applied to a standard benchmark suite [12]. As an extension for a possible increase of hard errors in future technology, an energy-efficient coverage of hard errors by dynamically adapting the redundancy between a dual and a triple module was also included in a processor [13].

An analytic approach was presented to compute the mean failure time of m-out-of-n systems with a single cold standby unit for the wide class of lifetime distributions that can be captured by the Pearson distribution [14]. The most prominent reliability concerns from today's point of view were surveyed, with a rough recapitulation of the community's progress so far on reliable on-chip systems in the nano era [15]. A low area overhead single event upset recovery mechanism was presented, and its application in different self-recoverable architectures was described and experimentally evaluated using a specially designed fault emulation environment [16].

The failure rates of High Performance Computing (HPC) systems were examined, together with a survey of the fault tolerance approaches for HPC systems and the issues with these approaches, where rollback-recovery techniques were used for long-running applications on HPC clusters [17]. Redundancy techniques are widely used to increase the reliability of combinational logic circuits; soft error reliability was improved by using such techniques based on the probability of occurrence of combinations at the outputs of circuits [18]. A methodology for applying Bayesian networks to quantitative risk assessment of operations in the offshore oil and gas industry was studied, which involves translating a flowchart of operations directly into a Bayesian network [19]. Fault injection models for TMR systems and Dual Modular Redundant (DMR) circuits were developed in order to simulate fault tolerant systems [20].

From Eq. 24 and 30, the reliability and cumulative distribution function for TMR system with Weibull distribution are defined as in Eq. 32-35:

(32)

(33)

(34)

(35)
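The combination of the 2-out-of-3 structure with Weibull units can be sketched as below. Two assumptions are made and should be kept in mind: the standard Weibull form R(t) = exp(-(t/λ)^k) and the standard TMR results F_TMR = 3F^2 - 2F^3 and f_TMR = 6F(1 - F)f. Because the article's own expressions in Eq. 32-35 are not reproduced here, values from this sketch need not coincide with those reported in the tables. The function name `tmr_weibull` is illustrative.

```python
from math import exp


def tmr_weibull(t: float, k: float, lam: float):
    """Return (F_TMR, R_TMR, f_TMR, h_TMR) at time t > 0 for Weibull(k, lam) units."""
    z = (t / lam) ** k
    R = exp(-z)                       # single-unit reliability
    F = 1.0 - R                       # single-unit distribution function
    f = (k / lam) * (t / lam) ** (k - 1) * R  # single-unit density
    F_tmr = 3 * F**2 - 2 * F**3       # at least two of three units failed
    R_tmr = 3 * R**2 - 2 * R**3       # equals 1 - F_tmr
    f_tmr = 6 * F * (1 - F) * f       # derivative of F_tmr with respect to t
    h_tmr = f_tmr / R_tmr             # undefined once R_tmr underflows to 0
    return F_tmr, R_tmr, f_tmr, h_tmr


F_tmr, R_tmr, f_tmr, h_tmr = tmr_weibull(10.0, 1.5, 9.0)
print("F_TMR =", F_tmr, "R_TMR =", R_tmr, "h_TMR =", h_tmr)
```

The limits stated in Results (1) and (2) can be observed by evaluating the function at very small and moderately large t.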

•  Result (1): If t → 0 , then FTMR (t; k, λ) → 0 and RTMR (t; k, λ)→1
•  Result (2): If t→∞, then FTMR (t; k, λ)→1 and RTMR (t; k, λ)→0

Then, the probability density function of TMR system using Weibull distribution was obtained as follows in Eq. 36:

(36)

The likelihood function was defined as follows in Eq. 37:

(37)

Let l(λ, k) = Ln [L(λ, k)], then in Eq. 38:

(38)

Put ∂l(λ, k)/∂λ = 0, thus in Eq. 39:

(39)

The Mathematica command Table[Random[Integer, {5, 20}], {4}] was used to generate four lifetimes. Let t1 = 5, t2 = 10, t3 = 15 and t4 = 20. The solutions of Eq. 39 for the scale parameter λ in the three cases of the shape parameter k (k < 1, k = 1, k > 1) are shown in Table 2, where the Mathematica command FindRoot was applied to Eq. 39 at k = 0.5, k = 1 and k = 1.5.
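The estimation step can be illustrated in Python without Mathematica. The sketch below is not the article's procedure: instead of solving the score equation with FindRoot, it maximizes the log-likelihood over λ directly with a golden-section search, and it assumes the standard TMR density f_TMR = 6F(1 - F)f with the common Weibull parametrization. The search bracket and the function names `tmr_loglik` and `fit_scale` are hypothetical choices, and the search presumes a single maximum inside the bracket. Since the article's Eq. 37-39 are not reproduced here, the estimates need not agree with Table 2.

```python
from math import expm1, log, sqrt


def tmr_loglik(lam: float, k: float, times) -> float:
    """Log-likelihood of observed TMR lifetimes under f_TMR = 6 F (1 - F) f."""
    total = 0.0
    for t in times:
        z = (t / lam) ** k
        log_F = log(-expm1(-z))  # log(1 - exp(-z)), stable for small z
        log_f = log(k / lam) + (k - 1) * log(t / lam) - z
        total += log(6.0) + log_F - z + log_f  # log(1 - F) = -z
    return total


def fit_scale(times, k: float, lo: float = 0.5, hi: float = 200.0, tol: float = 1e-8) -> float:
    """Golden-section search for the lam in [lo, hi] that maximizes the log-likelihood."""
    g = (sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - g * (b - a), a + g * (b - a)
    fc, fd = tmr_loglik(c, k, times), tmr_loglik(d, k, times)
    while b - a > tol:
        if fc > fd:  # the maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - g * (b - a)
            fc = tmr_loglik(c, k, times)
        else:        # the maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + g * (b - a)
            fd = tmr_loglik(d, k, times)
    return (a + b) / 2.0


times = [5.0, 10.0, 15.0, 20.0]  # the four lifetimes used in the text
for k in (0.5, 1.0, 1.5):
    print("k =", k, "estimated scale =", fit_scale(times, k))
```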

Therefore, the cumulative distribution function in Eq. 34 takes the forms in Eq. 40-42:

(40)


(41)

(42)

Let U = FTMR (ti; k, λ), where U∈[0,1] and i = 1, 2, …, 10. Ten values (u1, u2, …, u10) of the random variable U were generated using the Mathematica command Random[ ], and the values of U are given in Table 3.

Eq. 40-42 were solved for i = 1, 2, …, 10 using the Mathematica command FindRoot; the resulting lifetimes ti and the corresponding probability density function, cumulative distribution function, reliability and failure function of the TMR system are given in Tables 4-6.
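Generating a lifetime from a uniform value u by solving F_TMR(t) = u is the inverse-transform method. A minimal Python sketch is given below, using bisection in place of Mathematica's FindRoot. It assumes the standard forms F(t) = 1 - exp(-(t/λ)^k) and F_TMR = 3F^2 - 2F^3; the function names and the random seed are illustrative, so the generated values will not match Table 3 or Tables 4-6.

```python
import random
from math import exp


def tmr_cdf(t: float, k: float, lam: float) -> float:
    """F_TMR(t) = 3F^2 - 2F^3 with the Weibull F(t) = 1 - exp(-(t/lam)^k)."""
    F = 1.0 - exp(-((t / lam) ** k))
    return 3 * F**2 - 2 * F**3


def tmr_quantile(u: float, k: float, lam: float, tol: float = 1e-10) -> float:
    """Solve F_TMR(t) = u for t by bisection, for 0 < u < 1."""
    lo, hi = 0.0, lam
    while tmr_cdf(hi, k, lam) < u:  # expand the bracket until it contains the root
        hi *= 2.0
    while hi - lo > tol * max(1.0, hi):
        mid = (lo + hi) / 2.0
        if tmr_cdf(mid, k, lam) < u:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


rng = random.Random(0)  # fixed seed so the run is repeatable
k, lam = 1.5, 9.39632   # one of the (k, lambda) pairs quoted in the text
for _ in range(10):
    u = rng.random()
    print(round(u, 6), round(tmr_quantile(u, k, lam), 6))
```

Bisection is used because F_TMR is monotone increasing in t, so the root is unique and a bracket is easy to construct.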

Comparison of results: Applying Eq. 32 and 35 to the three cases of the shape parameter k and the corresponding scale parameter λ gave Fig. 6-11. The failure rate is relatively high and the reliability is low in the case k = 0.5, λ = 3.76473, while the case k = 1, λ = 4.82139 has the highest failure rate and the lowest reliability of the three cases. The last case, k = 1.5, λ = 9.39632, gives the best values: its failure rate is lower and its reliability is higher than in the other two cases.

The symbols in Fig. 6-11 represent the following:

•  TMR is the triple modular redundant system, a fault-tolerant form of N-modular redundancy in which three systems perform a process and the result is processed by a majority-voting system to produce a single output
•  k is the shape parameter of the Weibull distribution, a numerical parameter of a parametric family of probability distributions that affects the shape of the distribution
•  λ is the scale parameter of the Weibull distribution, a numerical parameter of a parametric family of probability distributions; the larger the scale parameter, the more spread out the distribution
•  t is the lifetime of the TMR system, the duration of the existence of the system, measured in years
•  hTMR(t) is the failure rate of the TMR system, the frequency with which the triple modular redundant system or a component fails, expressed in failures per unit of time
•  RTMR(t) is the reliability of the TMR system, the ability of the triple modular redundant system or a component to perform its required functions under stated conditions for a specified time

Table 2: Estimator values of the scale parameter λ at the different cases of the shape parameter (k = 0.5, k = 1 and k = 1.5)

Table 3: Values of the random variable U where, U = FTMR (t;k, λ)

Table 4: Values of the lifetime, probability density function, reliability and failure rate of TMR system at k = 0.5 and λ = 3.76473

Table 5: Values of the lifetime, probability density function, reliability and failure rate of TMR system at k = 1 and λ = 4.82139

Table 6: Values of the lifetime, probability density function, reliability and failure rate of TMR system at k = 1.5 and λ = 9.39632

Fig. 6: Failure rate of TMR system at the shape parameter k = 0.5 and the scale parameter λ = 3.76473

Fig. 7: Reliability of TMR system at the shape parameter k = 0.5 and the scale parameter λ = 3.76473

Fig. 8: Failure rate of TMR system at the shape parameter k = 1 and the scale parameter λ = 4.82139

Fig. 9: Reliability of TMR system at the shape parameter k = 1 and the scale parameter λ = 4.82139

Fig. 10: Failure rate of TMR system at the shape parameter k = 1.5 and the scale parameter λ = 9.39632

Fig. 11: Reliability of TMR system at the shape parameter k = 1.5 and the scale parameter λ = 9.39632

Validation results: To confirm these results, the lifetime t = 10 years was examined in the above three cases; from Eq. 32 and 35 the failure rate and reliability were obtained as follows:

If k = 0.5, λ = 3.76473 then:

•  hTMR (10; 0.5; 3.76473) = 0.210734
•  RTMR (10; 0.5 ; 3.76473) = 0.184278

If k = 1, λ = 4.82139 then:

•  hTMR (10; 1; 4.82139) = 0.848368
•  RTMR (10; 1; 4.82139) = 0.0403636

If k = 1.5, λ = 9.39632 then:

•  hTMR (10; 1.5; 9.39632) = 0.021569
•  RTMR (10; 1.5; 9.39632) = 0.366269

Therefore, at k = 0.5 and λ = 3.76473 a high failure rate was obtained at t = 10 years, while at k = 1.5 and λ = 9.39632 a low failure rate and the highest reliability of the three cases were obtained.

On the other hand, to reduce the failure rate and increase the reliability at the same lifetime (t = 10 years), the shape parameter k must increase. For example, with k = 17 the scale parameter estimated from Eq. 39 using the command FindRoot is λ = 19.1732.

Fig. 12: Failure rate of TMR system at the shape parameter k = 17 and the scale parameter λ = 19.1732

Fig. 13: Reliability of TMR system at the shape parameter k = 17 and the scale parameter λ = 19.1732

The corresponding graphical representations of the failure rate and reliability are shown in Fig. 12 and 13.

In this case the failure rate is very small and the reliability is very high: hTMR (10; 17; 19.1732) = 2.4944 × 10^-9 and RTMR (10; 17; 19.1732) = 1.

CONCLUSION

This study helps improve the efficiency of the Triple Modular Redundant (TMR) system, reduce the risk of sudden failure and minimize the financial and economic losses that affect human life. Therefore, the design of this type of system must be improved; the system contains three units in a fail-safe arrangement and depends on the voting gate, whose output determines the level of danger. Order statistics and the maximum likelihood method were applied to find the failure rate, lifetime and reliability of the TMR system using the Weibull distribution. If the shape parameter increases above 1, the failure rate decreases, while both the lifetime and the reliability of the TMR system increase. Thus a TMR system can be designed to have a longer lifetime and to be safer. Furthermore, this approach can be applied to other electronic systems to make them more efficient and safer.

SIGNIFICANT STATEMENT

The triple modular redundant system is considered one of the most important systems in the aircraft, computer and electronic device industries, where it affects human life and the economy. This study will enable engineers and factories to obtain the best design of electronic systems with more safety and efficiency. Therefore, the risks of sudden failure and economic losses will be reduced.

ACKNOWLEDGMENTS

The authors would like to thank the Department of Mathematics and Computer Science, Faculty of Science, Port Said University, for its support of this study.

REFERENCES

  1. Al Mutairi Alya, O. and H.C. Low, 2014. Estimations of the central tendency measures of the random-sum Poisson-Weibull distribution using saddlepoint approximation. J. Applied Sci., 14: 1889-1893.

  2. Alagoz, B.B., 2008. Hierarchical Triple-Modular Redundancy (H-TMR) network for digital systems. OncuBilim Algorithm Syst. Labs. Vol. 8.

  3. Alharpy, A.M. and N.A. Ibrahim, 2013. Parametric tests for partly interval-censored failure time data under Weibull distribution via multiple imputation. J. Applied Sci., 13: 621-626.

  4. El Damsesy, M., M. El Genidy and A. El Gazar, 2015. Reliability and failure rate of the electronic system by using mixture Lindley distribution. J. Applied Sci., 15: 524-530.

  5. El Genidy, M.M., 2012. Estimation some parameters of K-station series model. Asian J. Applied Sci., 5: 307-313.

  6. El Genidy, M.M., 2014. Joint probability distribution function of the repairing time for the defective machines in the multiple queues system. Asian J. Math. Stat., 7: 35-39.

  7. Pan, K., Y. Guo, G. Ren and H. Lin, 2014. Heat dissipation and thermo-mechanical reliability study for multi-chip module high power LED integrated packaging with through silicon vias. Inform. Technol. J., 13: 1316-1322.

  8. Reshid, M.N. and M.A. Abd Majid, 2011. A multi-state reliability model for a gas fueled cogenerated power plant. J. Applied Sci., 11: 1945-1951.

  9. Saat, N.Z.M., A.A. Jemain and S.H.A. Al-Mashoor, 2008. A comparison of Weibull and Gamma distribution in application of sleep apnea. Asian J. Math. Stat., 1: 132-138.

  10. Zhang, Z., D. Liu, Z. Wei and C. Sun, 2006. Research on triple modular redundancy dynamic fault-tolerant system model. Proceedings of the 1st International Multi-Symposiums on Computer and Computational Sciences, Volume 1, June 20-24, 2006, IEEE., pp: 572-576.

  11. Cai, B., Y. Liu, Z. Liu, X. Tian, X. Dong and S. Yu, 2012. Using Bayesian networks in reliability evaluation for subsea blowout preventer control system. Reliab. Eng. Syst. Safety, 108: 32-41.

  12. Dobel, B., H. Hartig and M. Engel, 2012. Operating system support for redundant multithreading. Proceedings of the 10th ACM International Conference on Embedded Software, October 7-12, 2012, Finland, pp: 83-92.

  13. Yao, J., S. Okada, M. Masuda, K. Kobayashi and Y. Nakashima, 2012. DARA: A low-cost reliable architecture based on unhardened devices and its case study of radiation stress test. IEEE Trans. Nucl. Sci., 59: 2852-2858.

  14. Van Gemund, A.J.C. and G.L. Reijns, 2012. Reliability analysis of k-out-of-n systems with single cold standby using Pearson distributions. IEEE Trans. Reliab., 61: 526-532.

  15. Henkel, J., L. Bauer, N. Dutt, P. Gupta and S. Nassif et al., 2013. Reliable on-chip systems in the nano-era: Lessons learnt and future trends. Proceedings of the 50th Annual Design Automation Conference, May 29-June 7, 2013, Austin, TX., USA., pp: 99-.

  16. Legat, U., A. Biasizzo and F. Novak, 2012. SEU recovery mechanism for SRAM-based FPGAs. IEEE Trans. Nucl. Sci., 59: 2562-2571.

  17. Egwutuoha, I.P., D. Levy, B. Selic and S. Chen, 2013. A survey of fault tolerance mechanisms and checkpoint/restart implementations for high performance computing systems. J. Supercomput., 65: 1302-1326.

  18. El-Maleh, A.H. and F.C. Oughali, 2014. A generalized modular redundancy scheme for enhancing fault tolerance of combinational circuits. Microelect. Reliab., 54: 316-326.

  19. Cai, B., Y. Liu, Z. Liu, X. Tian, Y. Zhang and R. Ji, 2013. Application of Bayesian networks in quantitative risk assessment of subsea blowout preventer operations. Risk Anal., 33: 1293-1311.

  20. Petrovic, V., M. Ilic, G. Schoof and Z. Stamenkovic, 2012. SEU and SET fault injection models for fault tolerant circuits. Proceedings of the 13th Biennial Baltic Electronics Conference, October 3-5, 2012, Tallinn, Estonia, pp: 73-76.
