Research Article

Reliability and Availability Evaluation for a Multi-state System Subject to Minimal Repair

Masdi Muhammad and M. Amin Abd Majid

Effective maintenance management is essential to reduce the adverse effect of equipment failure on operations. This can be accomplished by accurately predicting equipment failure so that appropriate actions can be planned and taken to minimize its impact. This study presents the development of a model, based on a continuous-time Markov process, for evaluating the performance of a degraded multi-state system. The system degradation was quantified by five distinct levels of system production output, ranging from the perfect functioning state to complete failure with zero output. At any point in time, the system can experience a Poisson failure from any state, upon which minimal repair is performed. This research explored a method of estimating the transition rates, as well as defining the states of the Markov process, from historical production data. The results indicate the applicability of the Markov process in estimating the reliability and availability of multi-state systems.


  How to cite this article:

Masdi Muhammad and M. Amin Abd Majid, 2011. Reliability and Availability Evaluation for a Multi-state System Subject to Minimal Repair. Journal of Applied Sciences, 11: 2036-2041.

DOI: 10.3923/jas.2011.2036.2041

Received: October 22, 2010; Accepted: February 21, 2011; Published: April 18, 2011


Effective maintenance management is essential for reducing the adverse effect of equipment failures and maximizing equipment availability. An increase in equipment availability means higher productivity and thus higher profitability, provided that the maintenance optimization includes the cost factor. This has led to increased research interest in optimizing maintenance management. It is estimated that 15 to 45% of total production cost is attributed to maintenance, with 30% of total manpower involved (Al-Najjar and Alsyouf, 2003). This is significant; however, the consequences of inefficient maintenance management go far beyond the direct cost of maintenance, although they are not easily quantifiable. The high cost and low efficiency of maintenance is one of the last cost-saving frontiers for companies seeking to improve profitability (Lofsten, 2000).

The current research focuses on the development of a performance evaluation model for repairable equipment subject to degradation which, in time, reduces the ability of the system to perform its intended function. In other words, the system performance degrades through discrete states prior to total failure. A repairable system is defined as a system which can be restored to satisfactory working condition by repairing or replacing the damaged components that caused the failure, rather than replacing the whole system (Weckman et al., 2001). Performance evaluation of the model includes the evaluation of system reliability as well as system availability with respect to time. The degradation process, if left unattended, will often lead to degradation failure (Moustafa et al., 2004). The degradation can be caused by a myriad of factors including a variable operating environment, fatigue, failures of non-essential components and random shocks on the system (Ramirez-Marquez and Coit, 2005).

Traditionally, reliability or availability analysis of a repairable system depends upon the assumption that the system is in one of two states: fully working or completely failed. Under this assumption, numerous approaches, methodologies and models have emerged to predict the reliability of repairable systems corresponding to different repair assumptions. The models include variations of the perfect renewal process, which assumes perfect repair, and the Non-homogeneous Poisson Process (NHPP) for the minimal repair assumption, as discussed by Krivtsov (2007), Mettas and Wenbiao (2005) and Feingold and Ascher (1984). Another model, the Generalized Renewal Process (GRP), assumes that the repair falls between perfect repair and minimal repair; it was proposed by Kijima and Sumita (1986) and further researched by Wolstenholme (1999), Krivtsov (2000) and Weckman et al. (2001), as cited in Tomasevicz and Asgarpoor (2009), to name a few.

However, researchers such as Soro et al. (2010), Ramirez-Marquez and Coit (2005), Donat et al. (2009) and Pham et al. (1997) have noted cases in which the binary assumption fails to characterize actual system reliability behavior. In these cases, analysis using the Multi-state System (MSS) assumption is found to be more appropriate. An MSS is defined as a system that can have a finite number of performance rates with various distinguishable levels of efficiency (Lisnianski and Levitin, 2003). Typical systems where MSS has been applied successfully are in the areas of water distribution (Micevski et al., 2002), telecommunication, oil and gas supply and power generation and transmission (Pham et al., 1997). This is because such systems pass through a number of distinct degradation phases prior to complete failure, which is evident from the different levels of production output. Common methods for assessing the performance of an MSS are based on four different approaches: extension of Boolean models to the multi-valued case, stochastic processes (Markov and semi-Markov), the universal generating function and Monte-Carlo simulation techniques (Lisnianski and Levitin, 2003). Each approach has advantages and disadvantages depending on the system under study.

The present research focuses on applying a continuous-time Markov process to an absorption chiller system subject to minimal repair. Minimal repair is defined as a repair action that restores the system to the state it was in just before failure, so that the failure rate of the system remains the same (Pham et al., 1997). The Markov process was chosen to model the system because of its versatility: it can be used for any finite number of states and for different repair assumptions. The primary advantage of the Markov process lies in its ability to describe the time-dependent transitions between the system states in a graphically and mathematically convenient form.


Markov chain process: The Markov chain is a discrete-time stochastic process where the conditional probability of any future events is only dependent upon current state and is independent of past history. This can be expressed mathematically (Micevski et al., 2002):


And as Markov chain assumes that the conditional probability does not change with time and is independent of t for all states i and j:


where, pij = transitional probability from state i at time t to state j at time t+1:
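In standard discrete-time notation, the relations described above are commonly written as below. This is the textbook form, offered as a sketch rather than a transcription of the article's own equations; the symbols X_t (the state at step t) and i_0, ..., i_{t-1} (the earlier states) are notation assumed here.

```latex
% Markov property: the next state depends only on the current state
P\left(X_{t+1}=j \mid X_t=i,\; X_{t-1}=i_{t-1},\; \dots,\; X_0=i_0\right)
  = P\left(X_{t+1}=j \mid X_t=i\right)

% Time homogeneity: the one-step probability does not depend on t
p_{ij} = P\left(X_{t+1}=j \mid X_t=i\right) \quad \text{for all } t

% Each row of the transition matrix P = [p_{ij}] is a probability distribution
p_{ij} \ge 0, \qquad \sum_{j} p_{ij} = 1
```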


In many situations, a change of state for a Markov process does not occur at fixed discrete times but at a continuous random time, as in the case of reliability analysis. When the time spent in any state (the sojourn time) is assumed to be exponentially distributed and λi, i = 1, …, n, is the constant transition rate from state i to state i+1, the process is a continuous-time Markov chain (Lisnianski and Levitin, 2003). With μi representing the minor repair rate that returns the element to state i after failure, the general Chapman-Kolmogorov differential equations for the state probabilities can be written as follows:
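For a continuous-time Markov chain with constant rates, the differential equations for the state probabilities are commonly written in the generic form below. This is the standard textbook form and may differ in layout from the article's Eq. 4; q_ij, the constant rate from state i to state j (here a degradation, failure or repair rate), is notation assumed for this sketch.

```latex
% Rate of change of P_j(t) = probability inflow minus probability outflow
\frac{dP_j(t)}{dt} = \sum_{i \ne j} q_{ij}\, P_i(t) \;-\; P_j(t) \sum_{k \ne j} q_{jk},
\qquad j = 1, \dots, n

% Normalisation, which holds at every time t
\sum_{j=1}^{n} P_j(t) = 1
```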


Given the initial state conditions, Eq. 4 can be solved to obtain the state probabilities. The availability and reliability can subsequently be calculated once the probability of each state is determined. Assuming G(t) is the performance rate at time t and W is the constant demand on the system, the availability is the probability that the system is in a state whose performance rate is greater than the demand, that is:


The reliability function, R(t), on the other hand, can be calculated by assuming that states 2, 4 and 6, which are the complete failure states, are absorbing states; once the system enters one of those states, it never leaves (Ramirez-Marquez and Coit, 2005). Thus:


where, Pi(t) is the probability of the system to be in state i.
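The two measures defined above can be sketched in a few lines once the state probabilities are known. The example below is illustrative only: the per-state performance rates and the probability snapshots are hypothetical values, the comparison uses "at least the demand" (which gives the same result here because the failure states have zero output), and reliability is read as one minus the probability mass in the absorbing failure states.

```python
def availability(state_probs, performance, demand):
    """Probability of being in any state whose output meets the demand."""
    return sum(p for state, p in state_probs.items()
               if performance[state] >= demand)


def reliability(absorbing_probs, failure_states):
    """One minus the probability already absorbed in the failure states.

    `absorbing_probs` must come from the chain solved with repair switched
    off, so that the failure states cannot be left.
    """
    return 1.0 - sum(absorbing_probs[s] for s in failure_states)


# Hypothetical performance rates (RTh per day); states 2, 4, 6 produce nothing.
performance = {1: 900.0, 2: 0.0, 3: 600.0, 4: 0.0, 5: 300.0, 6: 0.0}

# Hypothetical P_i(t) at one time instant, with repair allowed.
with_repair = {1: 0.50, 2: 0.02, 3: 0.30, 4: 0.03, 5: 0.13, 6: 0.02}

# Hypothetical P_i(t) at the same instant, with states 2, 4, 6 absorbing.
no_repair = {1: 0.40, 2: 0.10, 3: 0.25, 4: 0.10, 5: 0.10, 6: 0.05}

a_t = availability(with_repair, performance, demand=222.6)
r_t = reliability(no_repair, failure_states=(2, 4, 6))
print(f"A(t) = {a_t:.3f}, R(t) = {r_t:.3f}")
```

Keeping the demand as an argument makes it easy to see how availability changes if a degraded state no longer meets a higher demand.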

Table 1: Limits for each state

Fig. 1: Daily performance trend for absorption chiller

Fig. 2: Absorption chiller discrete states based on daily performance

Model: The Markov chain model for the multi-state system was developed from performance data for a series of steam absorption chillers that supply chilled water to an airport and its surrounding facilities. The absorption chillers utilize the excess heat generated by two gas turbines, as well as heat from auxiliary boilers, to produce the chilled water. The model considers the performance of an absorption chiller in the form of total Refrigeration Ton Hours (RTh) per day, as shown in Fig. 1. Based on the RTh output, and as observed in the time series plot, there are a number of distinct states that the system can occupy at any one time, depending on the level of daily production. This is further validated by the one-way analysis describing the five different states shown in Fig. 2. The result further substantiates the need to analyze the system performance as an MSS rather than within the traditional binary reliability modelling framework.

The limit values for each state are as given in Table 1.

Based on the analysis, there are five distinct states for the selected system, with state 1 being the highest performing state; states 2, 4 and 6 are assumed to be total failure states with zero output.

Fig. 3: State transition diagram

Model assumptions:

The system may have a number of discrete states depending upon the performance rates which can vary from perfect functionality to complete failure
The system may fail randomly from any operational state and will be immediately repaired. The repairs are assumed to be minimal repairs
All transition rates are constant and the sojourn time in each state is exponentially distributed
The demand for each system is constant

System description: The system state transition diagram is shown in Fig. 3. The system is assumed to start in the highest performing state, state 1, which has the highest output; this is called the nominal state. As time progresses, the system can either degrade to the output performance of state 3 with transition rate α1, or fail randomly with failure rate λ1, following a Poisson failure which occurs at an instant. If the system fails, it is repaired minimally, returning it to the state it was in just before failure. The rate of minimal repair is μ1. When the system reaches its next degraded state (state 3), it can either move to the subsequent degradation state (state 5) or fail abruptly, with rates α2 and λ2, respectively. If failure does occur, the system is again minimally repaired and returned to its previous state. This process continues until the system reaches the last acceptable state (state 5), at which point preventive maintenance is performed.

Estimation of transition rates: The estimation of transition rates is based on performance data. The first step is to determine which state the system is in on each day, based on the maximum and minimum output for each state as shown in Table 1. The limits for each state are determined from quantile plots as shown in Fig. 4.

Table 2: Transition rates

Fig. 4: Probability plot for each state

If the output on a given day falls between the minimum and maximum for, say, state 3, the system is taken to be in that state for the whole day. This classification was applied to all 975 days of available production output data. The next step is to calculate the yearly transition rates between states:


This method of calculation is applied for all the states and the results for all transition rates are given in Table 2.
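The estimation procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' exact formula: each rate is estimated as the number of observed transitions from state i to state j divided by the total time spent in state i (a common estimator for constant rates), a zero-output day is assigned to the failure state paired with the last operating state, and the band limits and daily outputs are made-up numbers standing in for Table 1 and the plant data.

```python
from collections import Counter

# Failure state paired with each operating state (1 -> 2, 3 -> 4, 5 -> 6).
FAILED_TWIN = {1: 2, 3: 4, 5: 6}


def band_of(output, limits):
    """Return the operating state whose (low, high] band contains the output."""
    for state, (low, high) in limits.items():
        if low < output <= high:
            return state
    raise ValueError(f"output {output} is outside every band")


def label_days(outputs, limits):
    """Assign one state per day; zero output maps to the paired failure state."""
    labels, last_up = [], 1
    for y in outputs:
        if y <= 0:
            labels.append(FAILED_TWIN[last_up])
        else:
            last_up = band_of(y, limits)
            labels.append(last_up)
    return labels


def transition_rates(labels, days_per_year=365.0):
    """Yearly rate i -> j = (number of i -> j jumps) / (years spent in i)."""
    days_in, jumps = Counter(), Counter()
    for today, tomorrow in zip(labels, labels[1:]):
        days_in[today] += 1
        if tomorrow != today:
            jumps[(today, tomorrow)] += 1
    return {(i, j): n * days_per_year / days_in[i]
            for (i, j), n in jumps.items()}


# Hypothetical band limits (RTh per day) and a short made-up output record.
limits = {1: (700.0, float("inf")), 3: (450.0, 700.0), 5: (0.0, 450.0)}
outputs = [800, 820, 0, 790, 600, 610, 0, 0, 590, 300]

labels = label_days(outputs, limits)
rates = transition_rates(labels)
for (i, j), rate in sorted(rates.items()):
    print(f"state {i} -> state {j}: {rate:.1f} per year")
```

With real data, the same counting would run over the full daily record, and the resulting dictionary would play the role of Table 2.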

Governing equations: Based on the state transition diagram shown in Fig. 3, the Chapman-Kolmogorov equations obtained are as follows:







The system of Eq. 8-13 can be solved for each state probability by using the Laplace transform and its inverse. Detailed derivations and solutions can be found in Pham et al. (1997).
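As an alternative to the Laplace-transform route, the same kind of six-state system can be integrated numerically. The sketch below is an illustration under assumptions, not the article's solution: the transition structure is read from the system description (1→3 and 3→5 by degradation, 1→2, 3→4 and 5→6 by failure, 2→1, 4→3 and 6→5 by minimal repair), no preventive-maintenance transition out of state 5 is modelled, a classical fourth-order Runge-Kutta scheme is swapped in for the analytical solution, and all rate values are hypothetical placeholders for Table 2.

```python
def build_generator(alpha, lam, mu):
    """Return the 6x6 generator Q, where Q[i][j] is the rate i+1 -> j+1."""
    q = [[0.0] * 6 for _ in range(6)]

    def add(src, dst, rate):
        q[src - 1][dst - 1] += rate   # flow into the destination state
        q[src - 1][src - 1] -= rate   # diagonal keeps each row sum at zero

    add(1, 3, alpha[0]); add(3, 5, alpha[1])                # degradation
    add(1, 2, lam[0]); add(3, 4, lam[1]); add(5, 6, lam[2])  # failures
    add(2, 1, mu[0]); add(4, 3, mu[1]); add(6, 5, mu[2])     # minimal repair
    return q


def derivative(p, q):
    """Forward equations dP/dt = P Q for a row vector P."""
    return [sum(p[i] * q[i][j] for i in range(6)) for j in range(6)]


def solve(q, p0, t_end, steps):
    """Integrate the state probabilities with fixed-step RK4."""
    h = t_end / steps
    p = list(p0)
    for _ in range(steps):
        k1 = derivative(p, q)
        k2 = derivative([x + 0.5 * h * k for x, k in zip(p, k1)], q)
        k3 = derivative([x + 0.5 * h * k for x, k in zip(p, k2)], q)
        k4 = derivative([x + h * k for x, k in zip(p, k3)], q)
        p = [x + h / 6.0 * (a + 2 * b + 2 * c + d)
             for x, a, b, c, d in zip(p, k1, k2, k3, k4)]
    return p


# Hypothetical rates per day, NOT the values of Table 2.
ALPHA = (0.010, 0.005)
LAM = (0.02, 0.03, 0.05)
MU = (0.8, 0.8, 0.8)
P0 = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]   # system starts in state 1
UP = (0, 2, 4)                         # list indices of states 1, 3 and 5

q_repair = build_generator(ALPHA, LAM, MU)
p_repair = solve(q_repair, P0, t_end=370.0, steps=3700)
availability = sum(p_repair[i] for i in UP)

# Reliability: same chain with repair rates set to zero (absorbing failures).
q_absorb = build_generator(ALPHA, LAM, (0.0, 0.0, 0.0))
p_absorb = solve(q_absorb, P0, t_end=370.0, steps=3700)
reliability = sum(p_absorb[i] for i in UP)

print(f"A(370) = {availability:.4f}, R(370) = {reliability:.3e}")
```

A fixed step of 0.1 day is small relative to the fastest assumed rate, which keeps the explicit scheme stable; a stiff solver or a matrix exponential would be a reasonable substitute for rates that differ by several orders of magnitude.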


State probabilities: One of the objectives of multi-state system analysis is to predict the probability of the system being in each state. With the assumption that the system is in state 1 at time t = 0, solving the system of Eq. 8-13 gives the state probabilities. The resulting state probabilities for 370 days of operation are depicted in Fig. 4.

As the highest transition rate is from state 1 to state 3, it can be observed that there is a low probability that the system will be in state 1 in the long run. This is consistent with the fact that the system performance degrades with time. The system degradation from one state to another can also be seen from the increasing failure rates (λ1≤λ2≤λ3), while the repair rates remain almost the same. The near-constant repair rate could be due to the repair actions being carried out by a similar group of maintenance personnel.

Reliability: The reliability function of the multi-state system under study can be determined from the probability that the system enters the failure states, in this case states 2, 4 and 6, with the assumption that these are absorbing states. This means that once the system enters one of these states it never leaves, which is achieved by setting all the rates of leaving those states to zero. In other words, the repair rates (μ1, μ2 and μ3) are all assumed to be zero. The reliability is then calculated based on Eq. 6 and is equal to A(t) with all repair rates set to zero.

The system failure data were also modelled with a Weibull distribution with shape factor β = 0.8868 and scale factor η = 43.58 days. The cumulative distribution function is as shown in Eq. 14. Comparisons of the reliability results from both methods are shown in Fig. 5.

As observed from Fig. 5, the reliability calculated using the MSS model differs from that of the traditional binary model. This is due to the difference in failure rate assumptions: in the MSS model, the failure rates are lower in the earlier stages of degradation.

Fig. 5: System reliability comparison

Fig. 6: Availability plot with repair rates

On the other hand, for Weibull distribution the failure rate is decreasing with time as the value for β is calculated to be less than one (β<1).
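The Weibull side of the comparison can be reproduced from the two fitted parameters alone. The sketch below assumes the standard two-parameter Weibull form, with reliability R(t) = exp(-(t/η)^β), cumulative distribution F(t) = 1 - R(t) and hazard h(t) = (β/η)(t/η)^(β-1); the function names are illustrative.

```python
import math

BETA = 0.8868   # shape factor reported in the text
ETA = 43.58     # scale factor in days reported in the text


def weibull_reliability(t, beta=BETA, eta=ETA):
    """Probability of surviving beyond t days: exp(-(t/eta)**beta)."""
    return math.exp(-((t / eta) ** beta))


def weibull_hazard(t, beta=BETA, eta=ETA):
    """Instantaneous failure rate per day; requires t > 0 when beta < 1."""
    return (beta / eta) * (t / eta) ** (beta - 1.0)


for day in (10, 50, 100, 370):
    print(f"t = {day:3d} d  R = {weibull_reliability(day):.4f}  "
          f"h = {weibull_hazard(day):.5f} per day")
```

Because β is below one, the hazard printed in the last column falls as t grows, which is the decreasing failure rate noted above.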

Availability: The system availability is defined as the probability that the system is in functioning states at time t. As the MSS described in Fig. 3 can be functioning even at degraded state as long as the performance rates are more than the demand, the availability is then the sum of probabilities of being in the operating states. Assuming the demand, W, is 222.6 RTh per day, the availability is then:


The calculated A(t) as a function of operation days is as shown in Fig. 6.

As observed, the availability decreases with time, reflecting the system's progression through the degradation states. It can also be seen that the availability improves as maintenance improves, since the repair rate is the inverse of the mean time to repair.


The current study demonstrated the applicability of the Markov process in determining the performance of an MSS, in this case an absorption chiller subject to minimal repair. A methodology for calculating transition rates from daily production data is also provided. Based on the results, the proposed model could be used in practical situations where there is a need to assess the impact of different repair actions on MSS performance. The proposed model can be further improved by allowing for a variable distribution of sojourn time. This is because of the inherent constraint of the Markov process that the state sojourn time must be exponentially distributed, which is not necessarily true. If the distribution is far from exponential, the results from the model will be biased. A more appropriate process would be a semi-Markov process, which allows any distribution of sojourn time (Donat et al., 2009; Tian et al., 2009); this will be investigated further.


The authors wish to thank Universiti Teknologi Petronas for providing the necessary support for this research.

1:  Al-Najjar, B. and I. Alsyouf, 2003. Selecting the most efficient maintenance approach using fuzzy multiple criteria decision making. Int. J. Prod. Econ., 84: 85-100.

2:  Donat, R., L. Bouillaut, A. Neji and P. Aknin, 2009. Comparison of two graphical models approach for modeling of multi-component system's reliability. Proceedings of the International Conference on Computers and Industrial Engineering, July 6-9, Troyes, pp: 1261-1266.

3:  Feingold, H. and H. Ascher, 1984. Repairable Systems Reliability: Modeling, Inference, Misconceptions and their Causes. Marcel Dekker, New York.

4:  Kijima, M. and N. Sumita, 1986. A useful generalization of renewal theory: Counting process governed by non-negative markovian increments. J. Applied Prob., 23: 71-88.

5:  Krivtsov, V., 2000. A monte-carlo approach to modeling and estimation of generalized renewal process in repairable system reliability analysis. Ph.D. Thesis, University of Maryland.

6:  Krivtsov, V.V., 2007. Practical extensions to NHPP application in repairable system reliability analysis. Reliability Eng. Syst. Safety, 92: 560-562.

7:  Lisnianski, A. and G. Levitin, 2003. Multi-State System Reliability: Assessment, Optimization and Applications. World Scientific Publishing Co. Pte. Ltd., Singapore.

8:  Lofsten, H., 2000. Measuring maintenance performance-in search for a maintenance productivity index. Int. J. Product. Econ., 63: 47-58.

9:  Mettas, A. and Z. Wenbiao, 2005. Modeling and analysis of repairable system with general repair. Proceedings of the Annual Reliability and Maintainability Symposium, January 24-27, 2005, Alexandria, Virginia, pp: 176-182.

10:  Micevski, T., G. Kuczera and P. Coombes, 2002. Markov model for storm water pipe deterioration. J. Infrastructure Syst., 8: 49-56.

11:  Moustafa, M.S., E.Y.A. Maksoud and S. Sadek, 2004. Optimal major and minimal maintenance policies for deteriorating system. Reliability Eng. Syst. Safety, 83: 363-368.

12:  Pham, H., A. Suprasad and R.B. Misra, 1997. Availability and mean life prediction of multistage degraded system with partial repairs. Reliability Eng. Syst. Safety, 56: 169-173.

13:  Ramirez-Marquez, J.E. and D.W. Coit, 2005. A monte-carlo simulation approach for approximating multi-state two-terminal reliability. Reliability Eng. Syst. Safety, 87: 253-264.

14:  Soro, I.W., M. Nourelfath and D. Ait-Kadi, 2010. Performance evaluation of multi-state degraded systems with minimal repairs and imperfect preventive maintenance. Reliability Eng. Syst. Safety, 95: 65-69.

15:  Tian, Z., G. Levitin and M.J. Zuo, 2009. A joint reliability-redundancy optimization approach for multi-state series-parallel systems. Reliability Eng. Syst. Safety, 94: 1568-1576.

16:  Tomasevicz, C.L. and S. Asgarpoor, 2009. Optimum maintenance policy using semi-Markov decision processes. Electr. Power Syst. Res., 79: 1286-1291.

17:  Weckman, G.R., R.L. Shell and J.H. Marvel, 2001. Modeling the reliability of repairable systems in aviation industry. Comput. Ind. Eng., 40: 51-63.

18:  Wolstenholme, L.C., 1999. Reliability Modeling: A Statistical Approach. Chapman and Hall, New York.
