INTRODUCTION
Effective maintenance management is essential for reducing the adverse effects
of equipment failures and maximizing equipment availability. Higher equipment
availability means higher productivity and thus higher profitability, provided
that the maintenance optimization accounts for cost. This has led to increased
research interest in optimizing maintenance management. It is estimated that
15 to 45% of total production cost is attributable to maintenance, which also
involves 30% of total manpower (Al-Najjar and Alsyouf, 2003). This is
significant; however, the consequences of inefficient maintenance management
extend far beyond the direct cost of maintenance, although they are not easily
quantified. The high cost and low efficiency of maintenance make it one of the
last cost-saving frontiers for companies seeking to improve profitability
(Lofsten, 2000).
This research focuses on developing a performance evaluation model for
repairable equipment subject to degradation which, over time, reduces the
ability of the system to perform its intended function. In other words, the
system performance degrades through discrete states prior to total failure.
A repairable system is defined as a system which can be restored to satisfactory
working condition by repairing or replacing the damaged components that caused
the failure, rather than by replacing the whole system (Weckman et al., 2001).
Performance evaluation of the model includes the evaluation of system
reliability as well as system availability with respect to time. The degradation
process, if left unattended, will often lead to degradation failure (Moustafa
et al., 2004). The degradation can be caused by a myriad of factors, including
a variable operating environment, fatigue, failures of nonessential components
and random shocks on the system (Ramirez-Marquez and Coit, 2005).
Traditionally, reliability or availability analysis of a repairable system rests
on the assumption that the system is in one of two states: fully working or
completely failed. Under this assumption, numerous approaches, methodologies
and models have emerged to predict the reliability of repairable systems under
different repair assumptions. The models include variations of the perfect
renewal process, which assumes perfect repair, and the Nonhomogeneous Poisson
Process (NHPP) for the minimal repair assumption, as discussed by Krivtsov
(2007), Mettas and Wenbiao (2005) and Feingold and Ascher (1984). Another model,
the Generalized Renewal Process (GRP), assumes that the repair falls between
perfect repair and minimal repair; it was proposed by Kijima and Sumita (1986)
and further researched by Wolstenholme (1999), Krivtsov (2000) and Weckman
et al. (2001), as cited in Tomaservics and Asgarpoor (2009), to name a few.
However, researchers such as Soro et al. (2010), Ramirez-Marquez and Coit
(2005), Donat et al. (2009) and Pham et al. (1997) have noted cases in which
the binary assumption fails to characterize actual system reliability behavior.
In these cases, analysis under the Multistate System (MSS) assumption is found
to be more appropriate. An MSS is defined as a system that can have a finite
number of performance rates with various distinguishable levels of efficiency
(Lisnianski and Levitin, 2003). Typical areas where MSS analysis has been
applied successfully are water distribution (Micevski et al., 2002),
telecommunication, oil and gas supply systems, and power generation and
transmission (Pham et al., 1997). In such systems there are a number of distinct
degradation phases prior to complete failure, which is evident from the
different levels of production output. Common methods for assessing the
performance of an MSS are based on four approaches: extension of Boolean models
to the multivalued case, stochastic processes (Markov and semi-Markov), the
universal generating function and Monte Carlo simulation techniques (Lisnianski
and Levitin, 2003). Each approach has advantages and disadvantages depending on
the system under study.
The present research focuses on applying a continuous-time Markov process to an
absorption chiller system subject to minimal repair. Minimal repair is defined
as a repair action that restores the system to the state it was in just before
failure, so that the failure rate of the system remains the same (Pham et al.,
1997). The Markov process was chosen to model the system because of its
versatility: it can be used for any finite number of states and for different
repair assumptions. Its primary advantage is its ability to describe the
time-dependent transitions between the system states in a graphically and
mathematically convenient form.
MODEL DEVELOPMENT
Markov chain process: The Markov chain is a discrete-time stochastic
process in which the conditional probability of any future event depends only
on the current state and is independent of past history. This can be expressed
mathematically (Micevski et al., 2002):
The Markov chain also assumes that the conditional probability does not change with time and is independent of t for all states i and j:
where, p_{ij} = transition probability from state i at time t to state j at time t+1:
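In standard notation, the Markov property and the time-homogeneity assumption described above can be written as follows. The symbol X_t for the state at step t is introduced here for illustration and is an assumption about notation; the row-sum condition in the last line is a standard consequence of p_{ij} being conditional probabilities.

```latex
\begin{align*}
P\{X_{t+1}=j \mid X_t=i,\; X_{t-1}=i_{t-1},\;\dots,\;X_0=i_0\}
  &= P\{X_{t+1}=j \mid X_t=i\} \\
P\{X_{t+1}=j \mid X_t=i\} &= p_{ij} \quad \text{for all } t \\
\sum_{j} p_{ij} &= 1, \qquad p_{ij} \ge 0
\end{align*}
```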
In many situations, a change of state for a Markov process does not occur at
fixed discrete times but at a continuous random time, as in the case of
reliability analysis. When the time spent in any state (sojourn time) is
assumed to be exponentially distributed and λ_{i}, i = 1, …, n, are constant
transition rates from state i to state i+1, the process is a continuous-time
Markov chain (Lisnianski and Levitin, 2003). With μ_{i} representing the minor
repair rate that returns the element to state i after failure, the general
Chapman-Kolmogorov differential equations for the state probabilities can be
written as follows:
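For a continuous-time Markov chain with constant rates, these equations are commonly written in the generic form below. Here a_{ij} is an assumed shorthand for the constant transition rate from state i to state j (covering the degradation, failure and repair rates alike), so the published Eq. 4 may use different symbols.

```latex
\[
\frac{dP_i(t)}{dt}
  = \sum_{j \neq i} a_{ji}\, P_j(t) \;-\; P_i(t) \sum_{j \neq i} a_{ij},
  \qquad i = 1, \dots, n
\]
```

The first sum collects the probability flowing into state i from every other state, and the second term is the probability flowing out of state i.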
By knowing the initial state conditions, Eq. 4 can be solved to obtain the state probabilities. The availability and reliability can subsequently be calculated once the probability of each state is determined. Assuming G(t) is the performance rate with respect to time and W is the constant demand on the system, the availability of the system is the probability of being in a state where the performance rate is greater than the demand, that is:
The reliability function, R(t), on the other hand, can be calculated by assuming
that states 2, 4 and 6, which are the complete failure states, are absorbing
states; thus, when the system enters those states, it never leaves
(Ramirez-Marquez and Coit, 2005). Thus:
where, P_{i}(t) is the probability that the system is in state i.
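The two definitions above can be sketched in the following form. The symbol g_i for the performance rate in state i is an assumption introduced for illustration, and the reliability expression assumes the six-state numbering used in this paper, with states 2, 4 and 6 made absorbing.

```latex
\[
A(t) = \Pr\{G(t) > W\} = \sum_{i:\; g_i > W} P_i(t)
\]
\[
R(t) = 1 - \bigl[P_2(t) + P_4(t) + P_6(t)\bigr]
\quad \text{(states 2, 4 and 6 absorbing)}
\]
```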
Table 1: Limits for each state
Fig. 1: Daily performance trend for absorption chiller
Fig. 2: Absorption chiller discrete states based on daily performance
Model: The Markov chain model for the multistate system was developed
from performance data for a series of steam absorption chillers that supply
chilled water to an airport and its surrounding facilities. The absorption
chillers use the excess heat generated by two gas turbines, as well as heat
from auxiliary boilers, to produce the chilled water. The model considers the
performance of an absorption chiller in the form of total Refrigeration Ton
Hours (RTh) per day, as shown in Fig. 1. Based on the RTh output, and as
observed in the time series plot, there are a number of distinct states that
the system can be in at any one time, depending on the level of daily
production. This is further validated by a one-way analysis describing the five
different states, as shown in Fig. 2. The result substantiates the need to
analyze the system performance as an MSS rather than with the traditional
binary reliability modelling framework.
The limit values for each state are given in Table 1.
Based on the analysis, there are five distinct states for the selected system, with state 1 being the highest performing state; states 2, 4 and 6 are assumed to be total failure states with zero output.

Fig. 3: State transition diagram
Model assumptions:
• The system may have a number of discrete states, depending on the performance
rate, which can vary from perfect functionality to complete failure
• The system may fail randomly from any operational state and will be repaired
immediately. The repairs are assumed to be minimal repairs
• All transition rates are constant and the sojourn time in each state is
exponentially distributed
• The demand on the system is constant
System description: The system state transition diagram is shown
in Fig. 3. The system is assumed to be initially in the highest performing
state, state 1, which has the highest output. This is called the nominal state.
As time progresses, the system can either degrade to the output performance of
state 3 with transition rate α_{1}, or fail randomly with failure rate λ_{1},
following a Poisson failure process in which failures occur at an instant. If
the system fails, it is minimally repaired, which returns it to the state it
was in just before failure. The rate of minimal repair is μ_{1}. When the
system reaches its second degraded state (state 3), it can either move to the
subsequent degradation state (state 5) or fail abruptly, with rates α_{2} and
λ_{2}, respectively. If failure does occur, the system is again minimally
repaired and returned to its previous state. This process continues until the
system reaches the last acceptable state (state 5), at which point preventive
maintenance is performed.
Estimation of transition rates: The estimation of transition rates is
based on performance data. The first step is to assess which state the system
is in on each day, based on the maximum and minimum output for each state as
shown in Table 1. The limits for each state are determined from quantile plots,
as shown in Fig. 4.
Table 2: Transition rates
Fig. 4: Probability plot for each state
If the output of that day falls between the maximum and minimum of, say, state
3, then the system is taken to be in that state for the whole day. This was
done for all 975 days of available production output data. The next step is to
calculate the yearly transition rates between states:
This method of calculation is applied for all the states and the results for all transition rates are given in Table 2.
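The estimation steps above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the estimator used (number of observed i to j transitions divided by the time spent in state i) is a common choice for continuous-time Markov chains and is assumed here. The band limits in `LIMITS`, the sample outputs and the rule that a zero-output day maps to the failure state paired with the last operating state are all hypothetical and are not the values of Table 1.

```python
from collections import Counter

# Hypothetical operating-state bands in RTh/day (NOT the limits of Table 1).
LIMITS = {1: (400.0, float("inf")), 3: (300.0, 400.0), 5: (0.0, 300.0)}


def classify_days(outputs, limits):
    """Map each day's output to a state number.

    Assumption: a zero-output day is a failure, assigned to the failure state
    paired with the last operating state (1 -> 2, 3 -> 4, 5 -> 6).
    """
    states = []
    last_operating = 1  # assumption: the record starts in the nominal state
    for rth in outputs:
        if rth <= 0.0:
            states.append(last_operating + 1)
            continue
        for state, (low, high) in limits.items():
            if low <= rth < high:
                last_operating = state
                states.append(state)
                break
        else:
            raise ValueError(f"output {rth} is outside every state band")
    return states


def estimate_rates(daily_states, days_per_year=365.0):
    """Yearly rate i -> j = (count of i -> j transitions) / (years spent in i)."""
    days_in_state = Counter(daily_states[:-1])  # last day has no observed successor
    jumps = Counter(
        (a, b) for a, b in zip(daily_states, daily_states[1:]) if a != b
    )
    return {
        (a, b): count / (days_in_state[a] / days_per_year)
        for (a, b), count in jumps.items()
    }


# Ten made-up days of output, for illustration only.
sample_outputs = [450, 430, 0, 440, 350, 320, 0, 310, 250, 240]
sample_states = classify_days(sample_outputs, LIMITS)
sample_rates = estimate_rates(sample_states)

for (origin, target), rate in sorted(sample_rates.items()):
    print(f"{origin} -> {target}: {rate:.1f} per year")
```

With only a few transitions per state the estimates are very noisy, which is why a long record such as the 975 days used in this study is needed.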
Governing equations: Based on the state transition diagram shown in Fig. 3, the Chapman-Kolmogorov equations obtained are as follows:
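Using only the transitions named in the text (degradation 1 to 3 and 3 to 5 at rates α_1 and α_2; failures 1 to 2, 3 to 4 and 5 to 6 at rates λ_1, λ_2 and λ_3; minimal repairs 2 to 1, 4 to 3 and 6 to 5 at rates μ_1, μ_2 and μ_3), and assuming no separate preventive-maintenance transition out of state 5, the state equations would take the form below. This is a reconstruction from the description, so the published Eq. 8-13 may differ in detail.

```latex
\begin{align*}
\frac{dP_1(t)}{dt} &= -(\alpha_1 + \lambda_1)\,P_1(t) + \mu_1 P_2(t) \\
\frac{dP_2(t)}{dt} &= \lambda_1 P_1(t) - \mu_1 P_2(t) \\
\frac{dP_3(t)}{dt} &= \alpha_1 P_1(t) - (\alpha_2 + \lambda_2)\,P_3(t) + \mu_2 P_4(t) \\
\frac{dP_4(t)}{dt} &= \lambda_2 P_3(t) - \mu_2 P_4(t) \\
\frac{dP_5(t)}{dt} &= \alpha_2 P_3(t) - \lambda_3 P_5(t) + \mu_3 P_6(t) \\
\frac{dP_6(t)}{dt} &= \lambda_3 P_5(t) - \mu_3 P_6(t)
\end{align*}
```

The right-hand sides sum to zero, so the total probability remains equal to one at all times.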
The system of Eq. 8-13 can be solved for each state probability using the
Laplace transform and its inverse. Detailed derivations and solutions can be
found in Pham et al. (1997).
RESULTS AND DISCUSSION
State probabilities: One of the objectives of multistate system analysis
is to predict the probability of the system being in each state. Assuming that
the system is in state 1 at time t = 0 and solving the system of Eq. 8-13, the
state probabilities can be calculated. The state probabilities for 370 days of
operation are depicted in Fig. 4.
As the highest transition rate is from state 1 to state 3, it can be observed
that there is a low probability that the system will be in state 1 in the long
run. This is consistent with the fact that the system performance degrades with
time. The system degradation from one state to another can also be seen from
the increasing failure rates (λ_{1}≤λ_{2}≤λ_{3}), while the repair rates remain
almost the same. The near-constant repair rates could be due to the repair
actions being carried out by a similar maintenance crew.
Reliability: The reliability function of the multistate system under study can be determined from the probability that the system enters the failure states, which in this case are states 2, 4 and 6, under the assumption that these are absorbing states. This means that whenever the system enters these states it never leaves, which is achieved by setting all the rates of leaving those states to zero. In other words, the repair rates (μ_{1}, μ_{2} and μ_{3}) are all assumed to be zero. The reliability is then calculated based on Eq. 6 and is equal to A(t) with all repair rates set to zero.
The system failure data were also modelled with a Weibull distribution with shape parameter β = 0.8868 and scale parameter η = 43.58 days. The cumulative distribution function is shown in (14). Comparisons of the reliability results from both methods are shown in Fig. 5.
As observed from Fig. 5, the reliability calculated using the MSS model differs
from that of the traditional binary model. This is due to the difference in
failure rate assumptions: in the MSS, the failure rates are lower in the
earlier stages of degradation.

Fig. 5: System reliability comparison
Fig. 6: Availability plot with repair rates
On the other hand, for the Weibull distribution the failure rate decreases
with time, as the value of β is calculated to be less than one (β<1).
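The Weibull behaviour described above can be checked numerically with a short sketch. It assumes the standard two-parameter Weibull form, F(t) = 1 - exp(-(t/η)^β), which may or may not match the exact expression in (14); the parameter values are those reported in the text, while the function names and the evaluation times are chosen here for illustration.

```python
import math

BETA = 0.8868  # shape parameter reported in the text
ETA = 43.58    # scale parameter in days, reported in the text


def weibull_cdf(t):
    """Probability of failure by time t (two-parameter Weibull, assumed form)."""
    return 1.0 - math.exp(-((t / ETA) ** BETA))


def weibull_reliability(t):
    """Probability of surviving beyond time t, R(t) = 1 - F(t)."""
    return math.exp(-((t / ETA) ** BETA))


def weibull_hazard(t):
    """Instantaneous failure rate h(t) = (beta/eta) * (t/eta)**(beta - 1), t > 0."""
    return (BETA / ETA) * (t / ETA) ** (BETA - 1.0)


for days in (1.0, 10.0, ETA, 100.0, 370.0):
    print(
        f"t = {days:7.2f} d  R(t) = {weibull_reliability(days):.4f}  "
        f"h(t) = {weibull_hazard(days):.5f} per day"
    )
```

Because β - 1 is negative, the hazard is a decreasing power of t, which is the decreasing failure rate noted in the text; at t = η the reliability equals exp(-1) regardless of β.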
Availability: The system availability is defined as the probability
that the system is in a functioning state at time t. As the MSS described in
Fig. 3 can function even in a degraded state as long as the performance rate
exceeds the demand, the availability is the sum of the probabilities of being
in the operating states. Assuming the demand, W, is 222.6 RTh per day, the
availability is then:
The calculated A(t) as a function of operating days is shown in Fig. 6.
As observed, the availability decreases with time, reflecting the system degradation. It can also be seen that the availability improves as maintenance improves, since the repair rate is the inverse of the mean time to repair.
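As a numerical alternative to the Laplace-transform solution, the state probabilities, availability and reliability discussed in this section can be sketched by integrating the state equations directly. The sketch below is illustrative only: the rate values are hypothetical (they are not the values of Table 2), the transition structure is the one described in the text with no separate preventive-maintenance transition, states 1, 3 and 5 are all assumed to exceed the demand W, and a fixed-step fourth-order Runge-Kutta scheme is used for simplicity.

```python
import math

# Hypothetical rates per day (NOT the values of Table 2).
ALPHA = (0.02, 0.01)          # degradation 1->3, 3->5
LAMBDA = (0.005, 0.01, 0.02)  # failures 1->2, 3->4, 5->6
MU = (0.5, 0.5, 0.5)          # minimal repairs 2->1, 4->3, 6->5
T_DAYS = 100.0


def build_generator(alpha, lam, mu):
    """Generator matrix Q for states 1..6 (indices 0..5); each row sums to zero."""
    q = [[0.0] * 6 for _ in range(6)]
    q[0][2], q[0][1], q[1][0] = alpha[0], lam[0], mu[0]
    q[2][4], q[2][3], q[3][2] = alpha[1], lam[1], mu[1]
    q[4][5], q[5][4] = lam[2], mu[2]
    for i in range(6):
        q[i][i] = -sum(q[i])  # diagonal is still zero here, so this sums the exits
    return q


def derivative(p, q):
    """Forward equations: dP_j/dt = sum over i of P_i * Q[i][j]."""
    return [sum(p[i] * q[i][j] for i in range(6)) for j in range(6)]


def solve(q, t_end, dt=0.1):
    """Integrate from state 1 at t = 0 with classical fourth-order Runge-Kutta."""
    p = [1.0, 0.0, 0.0, 0.0, 0.0, 0.0]
    for _ in range(int(round(t_end / dt))):
        k1 = derivative(p, q)
        k2 = derivative([x + 0.5 * dt * k for x, k in zip(p, k1)], q)
        k3 = derivative([x + 0.5 * dt * k for x, k in zip(p, k2)], q)
        k4 = derivative([x + dt * k for x, k in zip(p, k3)], q)
        p = [
            x + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for x, a, b, c, d in zip(p, k1, k2, k3, k4)
        ]
    return p


def operating_probability(p):
    """Sum of the probabilities of the operating states 1, 3 and 5."""
    return p[0] + p[2] + p[4]


p_repair = solve(build_generator(ALPHA, LAMBDA, MU), T_DAYS)
p_absorb = solve(build_generator(ALPHA, LAMBDA, (0.0, 0.0, 0.0)), T_DAYS)
availability = operating_probability(p_repair)  # A(t) with minimal repair
reliability = operating_probability(p_absorb)   # R(t): failure states absorbing

print("state probabilities with repair:", [round(x, 4) for x in p_repair])
print(f"A({T_DAYS:.0f}) = {availability:.4f}, R({T_DAYS:.0f}) = {reliability:.4f}")
# Sanity check: without repair, P_1(t) has the closed form exp(-(alpha_1 + lambda_1) t).
print("P_1 closed form:", round(math.exp(-(ALPHA[0] + LAMBDA[0]) * T_DAYS), 6))
```

Setting the repair rates to zero turns states 2, 4 and 6 into absorbing states, so the same routine yields the reliability; rerunning it with larger values in `MU` is one way to explore how shorter repair times affect the availability curve.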
CONCLUSION
The current study demonstrated the applicability of the Markov process in
determining the performance of an MSS, in this case an absorption chiller
subject to minimal repair. A methodology for calculating transition rates from
daily production data is also provided. Based on the results, the proposed
model could be used in practical situations where there is a need to assess the
impact of different repair actions on MSS performance. The proposed model can
be further improved by allowing a general distribution of sojourn time. This is
because of the inherent constraint of the Markov process that the state sojourn
time must be exponentially distributed, which is not necessarily true. If the
distribution is far from exponential, the results from the model will be
biased. A more appropriate process would be a semi-Markov process, which allows
any distribution of sojourn time (Donat et al., 2009; Tian et al., 2009); this
will be investigated further.
ACKNOWLEDGMENT
The authors wish to thank Universiti Teknologi Petronas for providing the necessary support for this research.