ABSTRACT
Artificial intelligence modeling of nonstationary rainfall-runoff is limited in simulation accuracy by the complexity and nonlinearity of the training patterns. Statistical preprocessing of the training data can determine the homogeneity of rainfall-runoff patterns before they are modeled with artificial intelligence. In this study, a new hybrid model that combines artificial intelligence with statistical clustering is introduced. The effects of statistical preprocessing of 360 rainfall-runoff patterns were considered before modeling with Radial Basis Function Neural Networks (RBFNNs). In the first step, all 360 monthly rainfall-runoff patterns were classified by cluster analysis into 4 groups and each class was modeled by a different RBFNN topology. The results of the 4 Cluster base-RBFNNs were compared with those of the model without preprocessing, and the optimized structures of the hybrid Cluster base-RBFNN models of Nazloochaei river flow are presented. The results show that clustering the rainfall-runoff patterns and modeling each dataset with a different RBFNN gives higher accuracy in the prediction and modeling of river flow than modeling without preprocessing of the patterns.
How to cite this article
DOI: 10.3923/ajaps.2009.150.159
URL: https://scialert.net/abstract/?doi=ajaps.2009.150.159
INTRODUCTION
The basis of risk management is the modeling and prediction of natural hazards. Intelligence models are distributed parallel processors that learn the relationship between input and output signals and present an optimized topology for the simulation of systems. In practice, many real-world dynamical system signals exhibit two distinct characteristics: nonlinearity and nonstationarity, in the sense that their statistical characteristics change over time due to either internal or external nonlinear dynamics (Coulibaly and Baldwin, 2005). Rainfall-runoff is a nonlinear process because of the temporal and spatial distribution of precipitation and other parameters. A major concern in the prediction of hydrological events is whether a given process should be modeled as linear or as nonlinear. The evidence of nonstationarity in some existing long hydrological records has raised a number of questions about the adequacy of conventional statistical methods. There are different kinds of hydrological models for simulating the discharge of a river basin, but the complexity of the rainfall-runoff process leads to restrictions and problems in river flow modeling (Nouri and Abghari, 2007). Because this complexity causes a nonlinear relation between flow and rainfall, researchers have tried to find better methodologies to develop more accurate models. Artificial Neural Networks (ANNs) have been found to be well suited to nonlinear input-output mapping. Recent reviews reveal that more than 90% of the applications of ANNs to the modeling of water resources variables use standard feedforward neural networks (Coulibaly and Baldwin, 2005), but Radial Basis Function Neural Networks (RBFNNs) also have a high capability for modeling hydrological processes (Mason et al., 1996; Dibik and Solomatine, 2001; Nouri and Abghari, 2007). Preprocessing the dataset before training an artificial intelligence model improves training time and modeling accuracy (Nouri and Abghari, 2007).
Hu et al. (2001) developed Range-Dependent Hybrid Neural Networks (RDNN), which are virtually threshold ANNs, to forecast annual and daily streamflows. Pal et al. (2003) proposed a hybrid ANN model that combines the Self-Organizing Feature Map (SOFM) and the MLP network for temperature prediction, where the SOFM serves to partition the training data. Models proposed for time series prediction include feedforward NNs (Hsu et al., 1995; Lorrai and Sechi, 1995), recurrent NNs (Gao and Joo, 2005), Cluster Base Multi-Layer Perceptron (Wang et al., 2006), RBFNNs (Mason et al., 1996), fuzzy NNs (Jang, 1993; Jang et al., 1997), fuzzy-logic based hybrid modeling (See and Openshaw, 1999), SOM-cluster based hybrid modeling (Abrahart and See, 2000) and Bayesian-concept based modular ANN (Zhang and Govindaraju, 2000; Nayak et al., 2004; Dogan, 2005; Kisi, 2005, 2006). Further models were proposed by Cigizoglu (2004), Baratti et al. (2003), Yurekli et al. (2004), Nilsson et al. (2005), Dulkashi et al. (2006), Chen et al. (2006), Bhattacharya and Solomatine (2006) and Jain and Srinivasulu (2006). Wang et al. (2006) developed the hybrid Threshold Base Multi-Layer Perceptron and Cluster Base-MLP for the prediction of daily streamflow. Mason et al. (1996) and Dibik and Solomatine (2001) showed that river flow prediction using RBFNNs is more accurate than prediction using MLP and the deterministic model HEC-HMS. Samani et al. (2007) demonstrated that using Principal Component Analysis to preprocess the dataset and reduce the data dimension improves the ability of the MLP to model pump tests in wells. Ceylan and Ozbay (2007) likewise demonstrated that classification of ECG with an ANN combined with data preprocessing such as FCM, wavelet transformation and principal component analysis is better than MLP modeling without preprocessing. In this study, cluster analysis is used to determine the homogeneity of the training patterns, and each class of rainfall-runoff patterns is trained and tested in a different RBF network.
The main aim of this study is to investigate monthly homogeneous training patterns for river flow modeling and to compare the results with RBF modeling of the nonstationary time series without preprocessing.
MATERIALS AND METHODS
Study Area
One of the major tributaries of the Nazloochaei River Basin in the north-west of Iran was selected for the current study (Fig. 1). The basin area is 2014 km² and the main river drains to Urmia Lake. Streamflow processes are always severely influenced by irregular rainfall events. Three hundred and sixty monthly data records from 5 precipitation gauges and a hydrometric station for 1976-2006 were considered for RBF and hybrid RBF modeling.
Radial Basis Function Neural Networks
The activation function of the hidden layer in RBF neural networks is defined as a radially symmetric basis function such as the Gaussian function, and the output layer is a linear function (Wasserman, 1993). RBFNNs are recommended by researchers because of their fast training time and their generalization in hydrological processes (Mason et al., 1996; Dibik and Solomatine, 2001). Figure 2 shows the structure of the RBFNN model. In each iteration, the learning process in RBFNNs updates the weight matrix based on the model output, comparing the estimated discharge with the observed one to determine the error.
Fig. 1: Nazloochaei River basin in the North West of Iran
Fig. 2: The basic structure of radial base function neural networks
The output of the neurons in the hidden layer of the RBFNN (Eq. 3) is a function of the radial difference (Eq. 2) between the precipitation gauge data matrix, taken as the inputs pi = (p1i, p2i,..., pki), and the weight vectors wj = (w1j, w2j,..., wmj), chosen to minimize the error of the discharge estimated by the model (Mason et al., 1996). The RBFNN topologies were programmed and computed in the MATLAB R2008a software package.
![]() | (1) |
Fig. 3: Effect of different spread coefficient in Gaussian function shape
![]() | (2) |
There are different choices for Yout = f(δj), but the usual one is Eq. 3:
![]() | (3) |
where δj is the radial difference between the precipitation pi and the weights wij, and σ is defined as the spread coefficient.
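The role of the radial difference and of the spread coefficient can be illustrated with a minimal Python sketch. It assumes a Euclidean radial difference and the common Gaussian form exp(-δ²/(2σ²)); the exact scaling used in the study is not recoverable from the text and may differ. The function names and the gauge values are hypothetical.

```python
import numpy as np

def radial_distance(p, w):
    """Euclidean distance between one input vector p and each row of w."""
    p = np.asarray(p, dtype=float)
    w = np.asarray(w, dtype=float)
    return np.sqrt(((w - p) ** 2).sum(axis=1))

def gaussian_activation(delta, sigma):
    """Assumed Gaussian form exp(-delta^2 / (2 sigma^2)); scaling is an assumption."""
    return np.exp(-(delta ** 2) / (2.0 * sigma ** 2))

# Hypothetical example: one pattern of 5 gauge readings and 3 hidden neurons.
p = np.array([0.2, 0.4, 0.1, 0.3, 0.5])
w = np.array([
    [0.2, 0.4, 0.1, 0.3, 0.5],  # weight vector identical to the input
    [0.3, 0.4, 0.1, 0.3, 0.5],  # weight vector close to the input
    [0.9, 0.9, 0.9, 0.9, 0.9],  # weight vector far from the input
])

delta = radial_distance(p, w)
narrow = gaussian_activation(delta, sigma=0.1)  # small spread: selective neurons
wide = gaussian_activation(delta, sigma=1.0)    # large spread: several neurons respond

print("radial differences:", delta)
print("outputs, sigma=0.1:", narrow)
print("outputs, sigma=1.0:", wide)
```

With the small spread only the neuron whose weights match the input responds strongly, whereas with the large spread all three neurons give a noticeable output.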
Neurons are added to the RBF network until the sum-squared error falls beneath an error goal. These networks tend to need more neurons than feedforward NNs with tansig or logsig neurons in the hidden layer, because sigmoid neurons can have outputs over a large region of the input space, while RBF neurons respond only to relatively small regions of the input space. In addition to the usual network optimization parameters, such as the number of neurons, the training algorithm and the improvement of the weights, RBFNNs have an extra parameter for the optimization and flexibility of the Gaussian activation function, named the spread coefficient. The spread coefficient (σ in Eq. 3) determines the selectivity of the neurons. With different amounts of spread, the Gaussian activation function can contract and expand (Fig. 3). If the spread is small, the radial basis function is very steep and the neuron with the weight vector closest to the input will have a much larger output than the other neurons. As the spread becomes larger, the slope of the radial basis function becomes smoother and several neurons can respond to an input vector. Thus the training process of RBFNNs consists of deciding how many hidden neurons there should be for modeling and how sharp the Gaussians should be, using the spread parameter. The optimized weights, estimated with a simple backpropagation algorithm as in Multi-Layer Perceptron approximation, are kept fixed for RBF network modeling.
Model Performance Measures
There is an extensive literature on model forecasting evaluation indices (Wang et al., 2006). R squared (R2) and the Root Mean Squared Error (RMSE) are popular measures of model performance because they are very sensitive to even small errors, which is useful for comparing small differences between the estimated and observed discharge across models.
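The idea of adding neurons until the sum-squared error falls beneath a goal can be sketched as follows. This is a simplified stand-in, not the MATLAB routine used in the study: it assumes that each new centre is placed on the training pattern with the largest current residual and that the linear output layer is refitted by least squares after every addition. The names `design_matrix`, `fit_rbf_greedy` and `predict_rbf`, the Gaussian scaling and the synthetic data are all hypothetical.

```python
import numpy as np

def design_matrix(X, centers, sigma):
    """Gaussian hidden-layer outputs for every pattern, plus a bias column."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    phi = np.exp(-d2 / (2.0 * sigma ** 2))  # assumed Gaussian scaling
    return np.hstack([phi, np.ones((X.shape[0], 1))])

def fit_rbf_greedy(X, y, sigma, sse_goal, max_neurons=None):
    """Add one neuron at a time until the sum-squared error meets the goal."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = X.shape[0]
    max_neurons = n if max_neurons is None else min(n, max_neurons)
    chosen = []
    residual = y - y.mean()
    weights = None
    while len(chosen) < max_neurons:
        # Place the next centre on the worst-fitted pattern not yet used.
        order = np.argsort(-np.abs(residual))
        idx = next(int(i) for i in order if int(i) not in chosen)
        chosen.append(idx)
        H = design_matrix(X, X[chosen], sigma)
        # Linear output layer: least-squares fit of the output weights.
        weights, *_ = np.linalg.lstsq(H, y, rcond=None)
        residual = y - H @ weights
        if (residual ** 2).sum() <= sse_goal:
            break
    return X[chosen], weights

def predict_rbf(X, centers, weights, sigma):
    return design_matrix(np.asarray(X, dtype=float), centers, sigma) @ weights

# Synthetic one-dimensional example (not data from the study).
X = np.linspace(0.0, 1.0, 20).reshape(-1, 1)
y = np.sin(2.0 * np.pi * X[:, 0])
centers, weights = fit_rbf_greedy(X, y, sigma=0.2, sse_goal=1e-3)
sse = float(((y - predict_rbf(X, centers, weights, sigma=0.2)) ** 2).sum())
print("neurons used:", len(centers), "SSE:", sse)
```

Repeating the fit for several values of `sigma` and comparing the validation error is one way to select the spread, in the spirit of the monitoring of spread coefficients reported in Fig. 6.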
![]() | (4) |
![]() | (5) |
where n is the number of observations, $\bar{Q}$ is the average discharge of the river, Qi is the observed discharge at the hydrometric station and $\hat{Q}_i$ is the estimated discharge of the RBF model.
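A minimal Python sketch of the two measures is given below. It assumes the coefficient-of-determination form of R squared (one minus the ratio of residual to total sum of squares) and the usual definition of RMSE; the study may instead have used the squared correlation coefficient. The function names and discharge values are hypothetical.

```python
import numpy as np

def r_squared(q_obs, q_est):
    """Coefficient of determination (assumed form of the R squared measure)."""
    q_obs = np.asarray(q_obs, dtype=float)
    q_est = np.asarray(q_est, dtype=float)
    ss_res = ((q_obs - q_est) ** 2).sum()
    ss_tot = ((q_obs - q_obs.mean()) ** 2).sum()
    return float(1.0 - ss_res / ss_tot)

def rmse(q_obs, q_est):
    """Root mean squared error between observed and estimated discharge."""
    q_obs = np.asarray(q_obs, dtype=float)
    q_est = np.asarray(q_est, dtype=float)
    return float(np.sqrt(((q_obs - q_est) ** 2).mean()))

# Hypothetical discharge values, for illustration only.
q_obs = [1.0, 2.0, 3.0, 4.0]
q_est = [1.1, 1.9, 3.2, 3.8]
r2_value = r_squared(q_obs, q_est)
rmse_value = rmse(q_obs, q_est)
print("R2:", r2_value, "RMSE:", rmse_value)
```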
Cluster Analysis of Monthly Rainfall-Runoff Patterns
Seasonality within the water year leads to inconsistencies in the dataset used for training neural networks. A common problem for hydrological models is overcoming seasonality in simulation. The objective of cluster analysis is to classify objects according to the similarities among them and to organize the data into groups. Different classifications can result from the algorithmic approach of the clustering technique. Variables that have high pairwise correlations are assigned to the same cluster, whereas those having low pairwise correlations are assigned to different clusters (Kamel and Selim, 1994). Generally, cluster analysis is based on two ingredients: a distance measure and a cluster algorithm. Distance can be measured among the data vectors themselves, or as the distance from a data vector to some prototypical object of the cluster.
One of the recommended distance measures for quantifying the (dis)similarity of objects is the Euclidean distance (Eq. 6). As a cluster algorithm, K-means (Eq. 7) can be seen as an optimization problem that minimizes the sum of squared within-cluster distances. Each cluster in the partition is defined by its member objects and by its centroid. The centroid of each cluster is the point for which the sum of distances from all objects in that cluster is minimized. In the presented hybrid model, K-means uses an iterative algorithm that minimizes the sum of distances from each rainfall-runoff training pattern to its cluster centroid, over all clusters. This algorithm moves training patterns between clusters until the sum cannot be decreased further. The result is a set of clusters that are as compact and well separated as possible.
![]() | (6) |
![]() | (7) |
where de(i, j) is the Euclidean distance between the precipitation patterns pik and pjk, and W(C) is the sum of squared within-cluster distances to be minimized.
Using this method, the 360 rainfall-runoff patterns were separated into 4 clusters: 124 patterns in cluster 1, 116 in cluster 2, 52 in cluster 3 and 68 in cluster 4. The rainfall-runoff patterns of each cluster were divided into two sections for the training and validation of the hybrid models: 70% of the patterns of each class were used for training and the remainder for model validation. Four hybrid models, Cl 1-RBFNNs, Cl 2-RBFNNs, Cl 3-RBFNNs and Cl 4-RBFNNs, were developed. Figure 4 shows the structure of the hybrid Cluster base-RBFNNs.
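The clustering step can be sketched in Python as a plain Lloyd-style K-means with squared Euclidean distances. This is an illustrative implementation under stated assumptions (random initial centroids drawn from the data, a fixed iteration cap), not the routine used in the study; the function name `kmeans` and the toy patterns are hypothetical.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Lloyd-style K-means; returns labels, centroids and the within-cluster sum W."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Assumption: initial centroids are k distinct patterns chosen at random.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every pattern to every centroid.
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Move each centroid to the mean of its members (keep it if empty).
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    w_total = float(d2[np.arange(len(X)), labels].sum())
    return labels, centroids, w_total

# Toy patterns forming two well-separated groups (not data from the study).
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
labels, centroids, w_total = kmeans(X, k=2)
print("labels:", labels)
print("within-cluster sum of squared distances:", w_total)
```

For the rainfall-runoff application each row of `X` would hold one monthly pattern and `k` would be set to 4.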
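The per-cluster division into training and validation sets can be sketched as below. The text does not state whether the 70% were drawn at random or chronologically, so the random draw here is an assumption, as are the function name `split_by_cluster` and the contiguous ordering of the cluster labels, which is used only to reproduce the cluster sizes reported above.

```python
import numpy as np

def split_by_cluster(labels, train_fraction=0.7, seed=0):
    """Return {cluster: (train_indices, validation_indices)} for each cluster."""
    labels = np.asarray(labels)
    rng = np.random.default_rng(seed)
    splits = {}
    for c in np.unique(labels):
        # Assumption: patterns are shuffled within the cluster before splitting.
        idx = rng.permutation(np.flatnonzero(labels == c))
        n_train = int(round(train_fraction * len(idx)))
        splits[int(c)] = (np.sort(idx[:n_train]), np.sort(idx[n_train:]))
    return splits

# Cluster sizes reported in the text; the ordering of the labels is hypothetical.
labels = np.repeat([1, 2, 3, 4], [124, 116, 52, 68])
splits = split_by_cluster(labels)
sizes = {c: (len(train), len(valid)) for c, (train, valid) in splits.items()}
print(sizes)
```

Each pair of index arrays would then be used to train and validate the RBF network of the corresponding cluster.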
Fig. 4: The structure of hybrid model (cluster base-RBFNNs)
Table 1: Rainfall-runoff RBFNN optimized topology without any preprocessing of patterns
RESULTS AND DISCUSSION
Rainfall-runoff processes usually have pronounced seasonal means, variances and dependence structures, and the underlying mechanisms of streamflow are likely to be quite different during low, medium and high flow periods, especially when extreme events occur. The results of RBFNN modeling of all 360 rainfall-runoff training patterns without any preprocessing are shown in Table 1. The optimized topology for training and testing was obtained with a spread coefficient of 0.4. Table 1 shows that the model accuracy in terms of R2 is 76.21 and 68.94% for the training and validation phases, respectively, and the RMSE is 0.8802 and 0.9808. This result shows the seasonality problem in time series modeling: even the RBFNN is restricted in learning the relation between the precipitation input vector and the discharge output.
After all datasets were clustered into 4 classes, each class was trained and tested in a different Radial Basis Function Neural Network (Fig. 5) and the Cluster base-RBFNNs were developed. Because of the homogeneity within each training cluster and the reduction of seasonality in the hybrid model, the accuracy for each class improved to some extent in comparison with no action on the dataset. The results of the hybrid models Cl 1-RBFNNs, Cl 2-RBFNNs, Cl 3-RBFNNs and Cl 4-RBFNNs are shown in Tables 2-5. The optimized topology of Cl 1-RBFNNs in training and testing was obtained with a spread of 0.3. Table 2 shows that the model accuracy in terms of R2 is 90.95 and 82.37% and the RMSE is 0.0058 and 0.0987 for training and testing, respectively.
Fig. 5: Dendrogram of cluster analysis of training dataset
Table 2: Optimized topology of model 1 (cluster 1-RBFNN)
Table 3: Optimized topology of model 2 (cluster 2-RBFNN)
It is recognized that data preprocessing can have a significant effect on model performance (Maier and Dandy, 2000). The results show that clustering the rainfall-runoff patterns and modeling each dataset with a different RBFNN gives higher accuracy in the prediction and modeling of river flow than modeling without preprocessing of the patterns. Table 6 shows the capability of preprocessing in rainfall-runoff modeling with Cluster base-RBFNNs, in line with what has been demonstrated for other preprocessing methods such as data reduction using PCA by Samani et al. (2007).
Table 4: Optimized topology of model 3 (cluster 3-RBFNN)
Table 5: Optimized topology of model 4 (cluster 4-RBFNN)
Fig. 6: Monitoring of models accuracy in Training and testing process using different spread coefficients
Table 6: Topology of all 5 models RBFNN and Cluster base-RBFNNs
Figure 6 shows the monitoring of the spread coefficient of the Gaussian function in the 5 developed models. This result demonstrates that, at the selected spread, the R squared of each cluster-based model is better than that of the model without preprocessing in both training and testing.
CONCLUSION
There are large differences between extreme values in seasonal hydrological time series, and clustering can separate these into homogeneous groups. Modeling with homogeneous rainfall-runoff training patterns is much easier than modeling without preprocessing. Cluster base-RBFNNs can be used as a hybrid model to overcome the nonlinearity of river flow modeling. This comparison shows that preprocessing the training dataset, as in the hybrid RBFNN model, gives a better optimization than RBFNNs alone for simulating the rainfall-runoff process in the Nazloochaei river basin. The RBFNN has an efficient training algorithm (vs. multi-layer NNs), but a large training set is a problem. The Cluster base-RBFNN hybrid model shows a high capability for prediction. It would be interesting to further compare the hybrid Cluster-based RBFNN approach with other hybrid techniques, such as Bayesian-concept based MLP, fuzzy-logic NN modeling and SOM-cluster based hybrid modeling.
REFERENCES
- Abrahart, R.J. and L. See, 2000. Comparing neural network and autoregressive moving average techniques for the provision of continuous river flow forecasts in two contrasting catchments. Hydrol. Process., 14: 2157-2172.
- Baratti, R., B. Cannas, A. Fanni, M. Pintus, G.M. Sechi and N. Toreno, 2003. River flow forecast for reservoir management through neural networks. J. Neurocomput., 55: 421-437.
- Bhattacharya, B. and D.P. Solomatine, 2006. Machine learning in sedimentation modeling. J. Neural Networks, 19: 208-214.
- Ceylan, R. and Y. Ozbay, 2007. Comparison of FCM, PCA and WT techniques for classification ECG arrhythmias using artificial neural network. Expert Syst. Applic., 33: 286-295.
- Chen, Y., B. Yang and J. Dong, 2006. Time-series prediction using a local linear wavelet neural network. Neurocomputing, 69: 449-465.
- Coulibaly, P. and C.K. Baldwin, 2005. Nonstationary hydrological time series forecasting using nonlinear dynamic methods. J. Hydrol., 307: 164-174.
- Dibik, Y.B. and D.P. Solomatine, 2001. River flow forecasting using artificial neural networks. Hydrol. Oceans Atmos., 26: 1-7.
- Dogan, E., 2005. Suspended sediment load estimation in Lower Sakarya River by using artificial neural networks. Fuzzy logic neuro-fuzzy models. Elect. Lett. Sci. Eng., 1: 22-32.
- Gao, Y. and E.M. Joo, 2005. NARMAX time series model prediction: Feedforward and recurrent fuzzy neural network approaches. Fuzzy Sets Syst., 150: 331-350.
- Hu, T.S., K.C. Lam and S.T. Ng, 2001. River flow time series prediction with a range-dependent neural network. Hydrol. Sci. J., 46: 729-745.
- Hsu, K.I., H.V. Gupta and S. Sorooshian, 1995. Artificial neural network modeling of the rainfall-runoff process. Water Resour. Res., 31: 2517-2530.
- Jain, A. and S. Srinivasulu, 2006. Integrated approach to model decomposed flow hydrograph using artificial neural network and conceptual techniques. J. Hydrol., 317: 291-306.
- Jang, J.S.R., 1993. ANFIS: Adaptive-network-based fuzzy inference system. IEEE Trans. Syst. Man Cybern., 23: 665-685.
- Kisi, O., 2005. Suspended sediment estimation using neuro-fuzzy and neural network approaches. Hydrol. Sci. J., 50: 683-696.
- Kisi, O., 2006. Daily pan evaporation modelling using a neuro-fuzzy computing technique. J. Hydrol., 329: 636-646.
- Kamel, M.S. and S.Z. Selim, 1994. New algorithm for solving the fuzzy clustering problem. Pattern Recog., 27: 421-428.
- Lorrai, M. and H.M. Sechi, 1995. Neural nets for modeling rainfall-runoff transformations. Water Resour. Manage., 9: 299-313.
- Maier, H.R. and G.C. Dandy, 2000. Neural networks for the prediction and forecasting of water resources variables: A review of modeling issues and applications. Environ. Model. Software, 15: 101-124.
- Mason, J.C., R.K. Price and A. Ternme, 1996. A neural network model of rainfall-runoff using radial basis functions. J. Hydraulic Res., 34: 537-548.
- Nayak, P.C., K.P. Sudheer, D.M. Rangan and K.S. Ramasastri, 2004. A neuro-fuzzy computing technique for modeling hydrological time series. J. Hydrol., 291: 52-66.
- Nilsson, P., C.B. Uvo and R. Bentsen, 2005. Monthly runoff simulation: Comparing and combining conceptual and neural network models. J. Hydrol., 317: 1-20.
- Nouri, M. and H. Abghari, 2007. Simulation of rainfall-runoff using RBFNNs base on probabilistic neural network classification. Proceedings of the 3rd Conference on Watershed Management Kerman and Water Resources Management, May 10-11, 2007, Iranian Society of Irrigation and Water Engineering Press, pp: 1108-1113.
- Pal, N.R., S. Pal, J. Das and K. Majumdar, 2003. SOFM-MLP: A hybrid neural network for atmospheric temperature prediction. IEEE Trans. Geosci. Remote Sensing, 41: 2783-2791.
- Samani, N., M. Gohari-Moghadam and A. Safavi, 2007. A simple neural network model for the determination of aquifer parameters. J. Hydrol., 340: 1-11.
- See, L. and S. Openshaw, 1999. Applying soft computing approaches to river level forecasting. Hydrol. Sci. J., 44: 763-778.
- Wang, W., P. Van Gelder and J.K. Vrijling, 2006. Forecasting daily stream flow using hybrid ANN models. J. Hydrol., 324: 383-399.
- Wasserman, P.D., 1993. Advanced Methods in Neural Computing. 1st Edn. John Wiley and Sons, Inc., New York, USA., ISBN: 0442004613, pp: 250.
- Yurekli, K., K. Kurunc and H. Simsek, 2004. Prediction of daily maximum streamflow based on stochastic approaches. J. Spatial Hydrol., 4: 1-12.
- Zhang, B. and S. Govindaraju, 2000. Prediction of watershed runoff using Bayesian concepts and modular neural networks. Water Resour. Res., 36: 753-762.
- Karunasinghe, D.S.K. and S.Y. Liong, 2006. Chaotic time series prediction with a global model artificial neural network. J. Hydrol., 323: 92-105.