Maintainability analysis provides a quantifiable assessment of the performance and effectiveness of the maintenance and support system so that further improvement actions can be taken. This study presents a systematic and practical approach for conducting maintainability analysis at the operational phase. The proposed approach is demonstrated via a case study of a gas compression train system. The results indicate that the approach is effective in identifying key contributors to system downtime and in estimating maintainability measures for future maintenance system improvement and planning.
INTRODUCTION
Maintainability analysis of plant maintenance field data is less widely applied by industrial practitioners than reliability analysis. Nevertheless, maintainability analysis is critical to ongoing efforts to reduce operations and maintenance costs and should therefore be considered in every phase of a system life cycle (Blanchard et al., 1995). At the operation and support phase, ongoing maintainability analysis provides a quantifiable assessment of the performance and effectiveness of the maintenance and support system, identifies the equipment, systems and processes that drive high cost and downtime, and supports the evaluation and prediction of maintainability measures. The results give operation, maintenance and design personnel the information needed to make the maintenance system more effective, plan logistic support requirements (i.e., workers, tools and materials), carry out improvement actions to reduce operating costs and achieve current system performance targets, which keep changing as plant profit margins decrease and operating costs escalate. Only a limited number of studies in the literature apply maintainability analysis at the operation and support phase; examples include Thangamani et al. (1995), Alvi (1997) and Hajeeh and Chaudhuri (2000). Most of these studies, however, are not specifically on maintainability but form part of larger studies such as availability and RAM analysis. Consequently, the analyses are generally not comprehensive and give little detail on the methodology used. Moreover, many assume a constant maintenance downtime or repair rate in the analysis model, mainly for ease of calculation. Some analyses, nevertheless, adopt the assumption on the basis of statistical data analysis (Barabady and Kumar, 2008; Elevli et al., 2008).
In real-world applications, however, some maintenance data do not exhibit such a steady state condition, so any prediction based on the constant repair rate assumption is likely to produce incorrect results. A trend in the data can be due to deterioration or improvement in the maintenance system. An improvement trend in maintainability is generally a direct result of effective improvement actions carried out in the maintenance and support system to reduce downtime. In some systems, such as an offshore system, the improvement trend may become prominent only a few years after the system starts operating, once the learning curve and the period needed to reach stable operation are taken into account. Many studies on system maintainability do not consider this data pattern and instead use all the data acquired since the beginning of system operation, without further examination, to estimate maintainability measures.
This study aims to present a general framework for a practical, systematic and detailed approach to analyzing the maintainability of a system at the operational phase. The proposed approach is demonstrated by a case study of a gas compression train system on an offshore platform. This study also addresses the issue of maintenance data that exhibit an improvement trend and proposes a simple method for estimating their maintainability measures more effectively. The scope of the study is the analysis of Corrective Maintenance (CM) downtime.
MATERIALS AND METHODS
The proposed approach to maintainability analysis of a plant system at the operation and support phase is illustrated by the generic framework in Fig. 1. In general, it involves six major steps:
|Step 1:||Setting objectives: The most important factor for a successful maintainability study is a clear definition of the specific purpose to be achieved at the end of the analysis (Denson, 2006). Only by setting unambiguous objectives at the beginning and consistently adhering to them throughout the whole analysis process can a proper and effective analysis be accomplished (Ansell and Phillips, 1989). The objective of the maintainability study strongly influences the approach and the method of modeling and analysis used (Aven and Jensen, 1999)|
|Step 2:||Definition of system, failure and downtime: The system under study, its boundary and operating states, and the failure events and modes need to be clearly specified to put the subsequent analysis steps in the right perspective and to minimize uncertainties associated with the data. A distinct system boundary identifies which components are within the system and which are excluded from it. The boundary also defines what data are to be collected. Other system information, such as its description, applications, operating mode and environmental conditions, also has to be clearly specified. At this stage, it is also important to state all assumptions made in the maintainability model and to determine the hierarchical level (system, subsystem, component, etc.) at which the data will be collected and the analysis conducted|
|Step 3:||Data gathering: The quality and accuracy of a maintainability analysis are highly correlated with the quality of the data collected. Attributes of high quality data include completeness, compliance with data formats and reliable data sources (ISO 14224, 1999). The primary source of data in this research is in-house plant maintenance data. Data gathering is usually the most time- and effort-consuming activity due to the nature of the data and their sources. Many data are available in a plant, such as those from maintenance, engineering, vendor reports, SAP (CMMS), etc. They also exist in various forms, so choosing the relevant ones and translating them into distribution and failure statistics can be a challenging task that normally requires considerable engineering judgment. To overcome these issues, good cooperation and constant feedback from plant personnel are essential|
|Step 4:||Exploratory analysis: Exploratory data analysis, first introduced by Tukey (1997), is the process of using statistical tools and techniques to investigate data sets in order to gain insight into the data, understand their important characteristics, identify outliers or errors, disclose underlying structure and extract important factors (NIST/SEMATECH, 2003) and assist in model formulation (Chatfield, 1985). Because of this significance, many researchers propose the use of exploratory analysis at the beginning of any plant reliability data analysis process (Ansell et al., 1994; Blischke and Murthy, 2000; Andrews and Moss, 2002; O'Connor et al., 2002; Todinov, 2005; Muhammad and Majid, 2011). Common tools include simple plots such as histograms, stem-and-leaf plots, box-and-whisker plots, Pareto charts, scatter diagrams and time series trend plots. These methods are useful for getting a feel for the data and for identifying possible errors in the data and key factors affecting system downtime performance|
|Step 5:||Inferential analysis: The purpose of this step is to determine the statistical model that best represents the data. According to Knezevic (2009), two commonly used methods for analyzing empirical downtime data are the parametric and distribution approaches. In the parametric approach, the main interest is the mean downtime, computed by dividing the sum of all downtime hours by the total number of downtime events. In the distribution approach, downtime is expressed in terms of a probability distribution and treated as a random variable, since each failure event results in a different downtime duration depending on the failure mode, the component that failed and the skill level of the maintenance personnel (Ebeling, 1996). The distribution approach therefore offers more information than the parametric approach (Knezevic, 2009) and is the preferred method for evaluating maintainability measures. The probability distributions most commonly used to describe maintenance downtime are the exponential, normal and log-normal (Blanchard et al., 1995). For downtime data of non-repairable items, the assumption that the data are Independent and Identically Distributed (IID) generally holds, so the data can be modeled directly by a statistical distribution. For repairable data from a single piece of equipment or system, however, the IID assumption should be tested before the data are fitted to any distribution. The importance of ensuring that the data are IID before they are used in a prediction model cannot be emphasized enough. A trend shows that the data are not in a steady state and therefore cannot justifiably be fitted to a statistical probability distribution. When the data have a monotonically increasing trend, a more suitable model is the non-homogeneous Poisson process (NHPP). To estimate the best parameters for the statistical distribution, a method such as Maximum Likelihood Estimation (MLE) can be employed. Subsequently, an analytical test such as the one-sample Kolmogorov-Smirnov (KS) test can be used to determine the best-fit distribution for the data|
|Step 6:||Estimation of maintainability measures: Based on the selected distribution and its associated parameters, the maintainability measures can be determined. These include the maintainability function, Mean Downtime (MDT), Mean Time to Repair (MTTR) and percentage restoration time. The measures obtained are then interpreted to provide a basis for suitable recommendations for system improvement (e.g., which equipment is critical and hence should be the focus of management attention).|
|Fig. 1:||Proposed generic methodology for maintainability analysis at operational phase|
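The inferential step (Step 5) can be sketched roughly as follows. This is a minimal illustration, not the procedure used in the study: the sample downtimes are hypothetical, the function names `fit_mle` and `ks_statistic` are invented for this example, and the three candidate distributions are fitted with their closed-form maximum likelihood estimates before the one-sample KS statistic is computed for each.

```python
import math

def norm_cdf(x, mu, sigma):
    """CDF of a normal distribution, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def fit_mle(downtimes):
    """Return fitted CDFs for exponential, normal and log-normal models.

    Closed-form MLEs: the exponential mean is the sample mean; the normal
    and log-normal parameters are the mean and (1/n) standard deviation
    of the data and of the log-data, respectively.
    """
    n = len(downtimes)
    mean = sum(downtimes) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in downtimes) / n)
    logs = [math.log(x) for x in downtimes]
    mu_log = sum(logs) / n
    sd_log = math.sqrt(sum((v - mu_log) ** 2 for v in logs) / n)
    return {
        "exponential": lambda t: 1.0 - math.exp(-t / mean),
        "normal": lambda t: norm_cdf(t, mean, sd),
        "log-normal": lambda t: norm_cdf(math.log(t), mu_log, sd_log),
    }

def ks_statistic(downtimes, cdf):
    """One-sample KS statistic: largest gap between empirical and fitted CDF."""
    xs = sorted(downtimes)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

# Hypothetical CM downtimes in hours (not taken from Table 2).
sample = [2.0, 3.5, 5.0, 8.0, 12.0, 20.0, 36.0, 75.0]
cdfs = fit_mle(sample)
ks = {name: ks_statistic(sample, cdf) for name, cdf in cdfs.items()}
best = min(ks, key=ks.get)  # smallest KS statistic = closest fit
print(best, ks)
```

A smaller KS statistic indicates a closer fit. Because the parameters are estimated from the same sample, standard KS critical values are only approximate here, so the statistic is best read as a ranking aid rather than a formal test.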
CASE STUDY: A GAS COMPRESSION TRAIN SYSTEM
The system under study is a parallel Gas Compression Train (GCT) system consisting of two trains, which forms part of a gas compression system on an offshore installation. In this system, raw gas from the wells undergoes various treatment processes and is then compressed to a higher pressure by a centrifugal compressor driven by a gas turbine before it is transferred to onshore facilities via pipeline. The main objectives of this case study are as follows:
|•||To demonstrate the application of the proposed approach for effective maintainability analysis|
|•||To identify critical factors/subsystems affecting the system CM downtime|
|•||To assess the maintainability measures of the system, which are useful for predicting future maintenance system and resource requirements|
Plant maintenance data: System downtime for the GCT arises from external events (emergency shutdown (ESD), plant shutdown, turnaround and system standby) and internal events (corrective maintenance and planned preventive maintenance). All downtime data should be distinctly identified and categorized so that the appropriate data are captured and used for the analysis.
The maintenance data can be categorized into 10 subsystems, as described in Table 1. The data for the study cover the period from 2002, when the offshore platform was first commissioned, to 2008. Table 2 shows the CM downtime data for both trains, combined and arranged chronologically.
Exploratory analysis: The availability of the system since it began operation is shown in Fig. 2. The plot indicates a deterioration in system performance in 2003 and 2004, followed by a rebound in 2005 and consistently good performance from 2006 onwards. To understand what causes this variation in availability, one needs to look at the rate of occurrence of failures (ROCOF) and the downtime duration, since system availability is a function of these two factors.
A closer look at the plot of ROCOF and downtime duration per CM event (Fig. 3) shows that availability is driven mainly by the variation in downtime duration rather than by ROCOF, which varies little over the period.
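The dependence of availability on these two factors can be illustrated with a small sketch. It assumes, purely for illustration, that annual availability counts only CM downtime against calendar hours; the source does not state the exact formula used, and the function name and the event counts and durations below are hypothetical.

```python
HOURS_PER_YEAR = 8760.0

def annual_availability(cm_events, mean_downtime_h, period_h=HOURS_PER_YEAR):
    """Availability = 1 - (number of CM events x mean downtime) / period.

    Assumption: only CM downtime is counted and events do not overlap.
    """
    total_downtime_h = cm_events * mean_downtime_h
    return 1.0 - total_downtime_h / period_h

# Same hypothetical ROCOF (10 events per year), different downtime per event.
long_repairs = annual_availability(10, 100.0)
short_repairs = annual_availability(10, 25.0)
print(round(long_repairs, 4), round(short_repairs, 4))
```

With the failure rate held fixed, shortening the downtime per event is the only lever that moves availability, which mirrors the pattern described for Fig. 3.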
|Table 1:||Subsystems of gas compression train system|
|Table 2:||Downtime data of gas compression train system|
|Fig. 2:||GCT annual availability|
|Fig. 3:||ROCOF and downtime per CM event trend|
The improvement in system availability since 2005 is hence mainly due to a significant reduction in downtime duration. Many factors contribute to this trend, but the most influential is the set of improvement actions carried out by the engineering, maintenance and production teams in the plant. Based on input from engineering personnel, the important initiatives include:
Spare parts and technician logistics enhancement: Many critical spares, previously stored at onshore warehouses or supplier bases or with OEM vendors overseas, have been placed on site. Turbo-machinery technicians are also stationed at the platform to advise material personnel on spare part requirements. A pit crew concept, which focuses on team effort, early planning and streamlining work during shutdown, was also implemented (Hasnan et al., 2004).
Engine and compressor change-out policy: The turbine engine and gas compressor failures that caused high downtime during the 2003-2005 period are suspected to have been caused by over-utilization of the equipment. A prudent approach has been taken to ensure that equipment change-out is carried out effectively in accordance with standard industrial practice.
Supplier contract procedure improvement: A Long Term Service Agreement (LTSA) with major OEM suppliers replaced the old bidding process, resulting in improved maintenance services and part delivery by the suppliers.
Pareto analysis: A Pareto analysis of total CM downtime over the study period for all subsystems (Fig. 4) shows that the major downtime contributors are the gas compressor (63.0%), gas turbine (27.7%), starter system (5.0%) and lube oil system (3.0%). Further investigation over the seven years of operation shows that high gas compressor downtime occurred in 2003 and 2004 and has since fallen drastically, indicating that the improvement activities carried out by the team have paid off. Nevertheless, downtime due to the lube oil system shows an increasing trend towards the end of the observation period. This is one of the areas that management needs to investigate and focus on.
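The ranking behind such a Pareto chart can be sketched as below. The four named shares are the percentages quoted above, used here as downtime in arbitrary units; the "other subsystems" entry is only the inferred remainder to 100%, and the function name `pareto_table` is invented for this illustration.

```python
def pareto_table(downtime_by_subsystem):
    """Rank subsystems by downtime and add share and cumulative share (%)."""
    total = sum(downtime_by_subsystem.values())
    ranked = sorted(downtime_by_subsystem.items(),
                    key=lambda item: item[1], reverse=True)
    rows = []
    cumulative = 0.0
    for name, downtime in ranked:
        share = 100.0 * downtime / total
        cumulative += share
        rows.append((name, downtime, share, cumulative))
    return rows

# Shares quoted in the text; "other subsystems" is the inferred remainder.
downtime = {
    "lube oil system": 3.0,
    "gas compressor": 63.0,
    "other subsystems": 1.3,
    "starter system": 5.0,
    "gas turbine": 27.7,
}
rows = pareto_table(downtime)
for name, hours, share, cumulative in rows:
    print(f"{name:18s} {share:5.1f}% {cumulative:6.1f}%")
```

Read this way, the first two subsystems account for roughly nine tenths of CM downtime, which is why they dominate the discussion of improvement actions.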
Trend analysis: The plot of the cumulative number of downtime events against cumulative downtime hours shows a clear improvement trend since 2006, indicated by its concave-up shape. The calculated Laplace test value, U = 6.04, is larger than the critical value of 1.95 at the 95% confidence level, which confirms that downtime is improving. The serial correlation test, however, indicates that the data are independent, since the plotted points are randomly scattered.
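A Laplace trend statistic of this kind can be sketched as follows. The source does not give the exact form used, so this assumes the common centroid version of the test, applied to the position of each downtime event on the cumulative downtime-hours axis; the downtimes are hypothetical and `laplace_statistic` is an illustrative name.

```python
import math
from itertools import accumulate

def laplace_statistic(event_positions, end):
    """Laplace statistic for events located on the interval (0, end].

    Under the no-trend hypothesis, U is approximately standard normal.
    U > 0 means events cluster towards the end of the interval.
    """
    n = len(event_positions)
    centroid = sum(event_positions) / n
    return (centroid - end / 2.0) / (end * math.sqrt(1.0 / (12.0 * n)))

# Hypothetical downtime per CM event (hours), shrinking over time.
downtimes = [120, 90, 60, 40, 20, 15, 10, 8, 6, 5]
cumulative = list(accumulate(downtimes))

# The last event ends the observation, so it defines `end` and is left
# out of the centroid (the usual event-truncated form of the test).
u = laplace_statistic(cumulative[:-1], cumulative[-1])
print(round(u, 2))
```

On this axis a large positive value means later events are packed closer together in downtime hours, that is, each event consumes less downtime, which matches the interpretation of the positive U as an improving trend.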
|Fig. 4:||Pareto of downtime by subsystems|
|Fig. 5:||Steady state region in the data plot|
|Table 3:||KS goodness-of-fit test|
|Significance <0.005 indicates not a good fit|
|Table 4:||GCT CM maintainability measures|
The trend test indicates that a trend exists; the data as a whole are therefore not in a steady state (IID) and are not appropriate for the subsequent analysis with either the distribution or the parametric approach. The trend, however, is not monotonic. A closer look at the cumulative plot shows that the data appear to level off in the last four years of operation (Fig. 5). This steady state region can be demonstrated by fitting a simple linear regression line to those data using the least-squares method. The resulting line has a large coefficient of determination, R² = 0.903, which indicates that the regression line fits the data well. Whether the relationship is significant can be tested using an F test (Anderson et al., 2002), with the null hypothesis that there is no significant relationship between the two variables. The calculated F value of 300 is greater than the critical value of 7.5 at a Type I error of α = 0.01, so the null hypothesis can be rejected. Given this statistically significant relationship, the data from the most recent four years of operation can be taken as representative of the current downtime performance and used as the basis for evaluating maintainability or downtime measures. The constant downtime rate predicted from the slope of the regression line is 24.2 h per downtime event.
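The regression check on the steady state region can be sketched as below. The data points are hypothetical, and the sketch assumes the cumulative number of downtime events is the explanatory variable and cumulative downtime hours the response, so that the slope reads as hours per downtime event; `linear_fit` is an illustrative name.

```python
def linear_fit(x, y):
    """Least-squares line with R^2 and the F statistic for the slope.

    For simple regression, F = SSR / (SSE / (n - 2)) with 1 and n - 2
    degrees of freedom.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sst = sum((yi - mean_y) ** 2 for yi in y)
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ssr = sst - sse
    r_squared = ssr / sst
    f_value = ssr / (sse / (n - 2))
    return slope, intercept, r_squared, f_value

# Hypothetical steady state region: event count vs cumulative hours.
events = [1, 2, 3, 4, 5, 6, 7, 8]
cum_hours = [20.0, 50.0, 70.0, 100.0, 118.0, 150.0, 165.0, 196.0]
slope, intercept, r_squared, f_value = linear_fit(events, cum_hours)
print(f"slope = {slope:.1f} h per event, R^2 = {r_squared:.3f}, F = {f_value:.0f}")
```

The computed F value would then be compared with the tabulated critical value for 1 and n - 2 degrees of freedom at the chosen significance level.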
Inferential analysis: Three commonly used statistical probability distributions (exponential, normal and log-normal) are chosen to model the downtime data of the steady state region. Table 3 shows the distribution parameters calculated using MLE and the KS test values. The MLE and KS test calculations were performed using the statistical software packages Weibull++7 and SPSS. Based on the results, the log-normal distribution is found to be the best-fit distribution.
Maintainability measures analysis: Table 4 lists the maintainability measures of the GCT based on the steady state region and the log-normal distribution. Besides the mean downtime, the downtime within which a given percentage (10, 50 and 90%) of maintenance tasks is completed can also be determined. This information helps management with maintenance system planning, costing, maintenance scheduling, technical and non-technical manpower planning and availability projection. The mean downtime is 28.7 h and the estimated downtime at the 90% maintenance task completion rate is 59.3 h. In contrast, fitting all seven years of operational data (for which the log-normal is also the best-fit distribution) gives 89.1 and 150.6 h for the mean and the 90% completion rate, respectively, which is rather pessimistic. For comparison, a set of available downtime data for 2009 was examined; based on the log-normal distribution (the best-fit distribution), the mean downtime is 6.4 h with a standard deviation of 12.75 h. This estimate is closer to the steady state region result than to the all-data result, which indicates that the proposed method is practical for establishing a proper downtime distribution. Estimation using the NHPP model, on the other hand, results in a higher mean downtime of 120 h, due to poor data fitting.
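Measures of this kind follow directly from the log-normal parameters, as the sketch below shows. The parameter values are hypothetical (the fitted values in Table 3 are not reproduced here), the function names are illustrative, and `mu` and `sigma` denote the mean and standard deviation of the natural logarithm of downtime.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1

def lognormal_measures(mu, sigma, percentiles=(0.10, 0.50, 0.90)):
    """Mean downtime and the downtime by which a fraction p of tasks is done."""
    mean_downtime = math.exp(mu + sigma ** 2 / 2.0)
    times = {p: math.exp(mu + STD_NORMAL.inv_cdf(p) * sigma)
             for p in percentiles}
    return mean_downtime, times

def maintainability(t, mu, sigma):
    """M(t): probability that a maintenance task is completed within t hours."""
    return STD_NORMAL.cdf((math.log(t) - mu) / sigma)

# Hypothetical log-normal parameters, not the values fitted in the study.
mu, sigma = 3.0, 0.8
mean_downtime, times = lognormal_measures(mu, sigma)
print(f"mean downtime = {mean_downtime:.1f} h")
for p, t in times.items():
    print(f"{int(p * 100)}% of tasks completed within {t:.1f} h")
```

Because the log-normal is right-skewed, the mean exceeds the median, so a planner who budgets on the mean alone would still see a sizeable fraction of repairs run longer; the 90% figure is the more conservative planning value.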
CONCLUSION
This study has demonstrated, through a case study, a systematic approach to maintainability analysis at the operation phase. The analysis provides insight into the current performance of the system and should therefore be performed comprehensively by plant management on a regular basis. The proposed approach is found to be effective in highlighting the factors affecting system downtime that require further feedback and improvement actions. In the case study presented, the availability of the system is highly influenced by downtime performance. The major contributors to downtime are gas compressor and gas turbine failures. While downtime from both has been reduced significantly by effective improvement actions, downtime due to the lube oil system shows an increasing trend. The proposed steady state condition approach is shown to be a practical way to estimate the maintainability measures of a system with a non-monotonic downtime improvement trend.
ACKNOWLEDGMENT
The authors wish to thank Universiti Teknologi PETRONAS for providing the necessary support for this research.
REFERENCES
- Alvi, J.S., 1997. Availability analysis of integrated gasification combined cycle power plants with backup fuel and NOMx reduction. Reliabil. Eng. Syst. Safety, 55: 85-94.
- Barabady, J. and U. Kumar, 2008. Reliability analysis of mining equipment: A case study of a crushing plant at Jajarm Bauxite Mine in Iran. Reliabil. Eng. Syst. Safety, 93: 647-653.
- Elevli, S., N. Uzgoren and M. Taksuk, 2008. Maintainability analysis of mechanical systems of electric cable shovels. J. Sci. Ind. Res., 67: 267-271.
- Hajeeh, M. and D. Chaudhuri, 2000. Reliability and availability assessment of reverse osmosis. Desalination, 130: 185-192.
- Thangamani, G., T.T. Narendran and R. Subramanian, 1995. Assessment of availability of a Fluid Catalytic Cracking Unit through simulation. Reliabil. Eng. Syst. Safety, 47: 207-220.