INTRODUCTION
Although conceptual and physicallybased models are the main tool for depicting
hydrological variables and understanding the physical processes taking place
in a system, they do have practical limitations. When data is not sufficient
and getting accurate predictions is more important than conceiving the actual
physics, empirical models remain a good alternative method and can provide useful
results without a costly calibration time. ANN models are such black box models
with particular properties which are greatly suited to dynamic nonlinear system
modeling. The advantages of ANN models over conventional simulation methods
have been discussed in detail by French et al. (1992). ANN applications
in hydrology vary from realtime to event based modeling. This model has been
used for rainfallrunoff modeling, precipitation forecasting and water quality
modeling (Govindaraju and Rao, 2000). One of the most important features of
ANN models is their ability to adapt to recurrent changes and detect patterns
in a complex natural system. More concepts and applications of ANN models in
hydrology have been discussed by the ASCE (2000). Neural networks have also
been previously applied with success in temporal groundwater level prediction
(Coulibaly et al., 2001ac). The ANN methodology has been applied also
to forecast rainfall (Luk et al., 2001). Parkin et al. (2001)
used ANNs, coupled with a 3D numerical model, to model riveraquifer interactions.
In the geotechnical domain, Kurup and Dudani (2002) used ANN to profile the
stress history of clays from piezocone penetration tests. In chalky media, some
researches can be mentioned for example, forecasting of turbid floods in a karstic
media (Beaudeau et al., 2001) and determination of aquifer outflow influential
parameters, simulation and forecasting of aquifer outflow in a fissured chalky
media (Lallahem and Mania, 2002, 2003a,b). Recently, ANNs have been successfully
used for identifying the temporal data necessary to calculate groundwater level
in only one piezometer (Lallahem et al., 2005). In this study, several
different neural networks are evaluated in order to reach conclusions regarding
the efficiency of this forecasting technique in groundwater level prediction
and finding a good technique for presenting spatiotemporal ANN model for Tabriz
aquifer.
An ANN is a computational approach inspired by the human nervous system. Its data processing paradigm is made up of highly interconnected nodes (neurons) that map a complex input pattern with a corresponding output pattern (Kohonen, 1988; Hagan et al., 1996). ANN is used to define the network topology as well as to simulate the learning, validation and testing phases without imposing any functional relationships between the independent and dependent variables. With this architecture, ANN methodology has proven to be a powerful black box model, which excels at function approximation and pattern recognition. Added to that, it is more robust and flexible than other types of Black box models. Several artificial neural networks are applied in order to reach the best structure for study area. Two popular neural network models are the Feedforward Neural Network (FNN) and Recurrent Neural Network (RNN). For training these networks different algorithms can be used. In present research Gradient descent with momentum and adaptive learning rate backpropagation (GDX), LevenbergMarquardt (LM) and Bayesian Regularization (BR) are used. Some researches explain the details of the used networks and algorithms in ANNs modeling (Coulibaly et al., 2001a; Nadiri et al., 2006).
Several aspects of the architecture of neural networks that focus on the prediction of variables associated with hydrology are covered by Maier and Dandy (2000). Their suggestions were followed in the development of the current model. The structure of the network is determined by trial and error. The size of the input and hidden layer of the network has been variable depending on the prediction horizon, whereas the output layer has a single node. The number of nodes in the hidden layer and the stopping criteria were optimized in terms of obtaining precise and accurate output. Finally, the activation function of the hidden layer was set to a hyperbolic tangent sigmoid function as this proved by trial and error to be the best in depicting the nonlinearity of the modeled natural system, among a set of other options (linear and log sigmoid). It is noteworthy that there is no well established direct method for selecting the number of hidden nodes for an ANN model for a given problem. Thus the common trialanderror approach remains the most widely used method.
Two different criteria are used in order to evaluate the effectiveness of each network and its ability to make precise predictions. The Root Mean Square Error (RMSE) calculated by:
where, y_{i}, ŷ_{i} and N are the observed data, the calculated data and the number of observations, respectively. RMSE indicates the discrepancy between the observed and calculated values. The lowest the RMSE, the most accurate the prediction is. Also, the R^{2} efficiency criterion (Determination coefficient), given by:
Which representing the percentage of the initial uncertainty explained by the model. The best fit between observed and calculated values, which is unlikely to occur, would have RMSE = 0 and R^{2} = 1.
MATERIALS AND METHODS
The study area is located in the Tabriz plain (Fig. 1), northwestern
Iran. The Tabriz area lies in East Azarbaijan province, which is structurally
part of the central Iran unit. It is wedged between the Zagros and Alborz mountain
systems. The area includes formations of Devonian to Quaternary age affected
by various geologic movements, most strongly those of Alpine origin. The mean
elevation is 1340 m above sea level. The prevailing climate of the Tabriz area
has semiarid characteristics. During the wet season, the area is under the
influence of middlelatitude westerlies and most of the rain that occurs over
the region during this period is caused by depressions moving over the area,
after forming in the Mediterranean Sea on a branch of the polar jet stream in
the upper troposphere. The average annual rainfall is 261.1 mm, during 1950
until 2005 years. Mean daily temperatures vary from 22°C in January up
to 40°C in July with a yearly average of 9°C. The dominant winds over
the area blow from the Northeast and the Southwest. In general, mean monthly
relative humidity at the Tabriz Airport meteorological station is relatively
high during the NovemberFebruary period, ranging from 75 to 80% and lower during
July and August, when it is about 35 to 45%. Pan evaporation measured during
the water year 20052006 was 1294 mm, (the water year has been fixed from the
20th September to 19th September of the following year and is used in all hydrologic
discussions). The Ajichay River is the only perennial river in the study area
and other temporal drainages join this main river. Tabriz area formations compose
of Miocene faces that it has covered alluvial sediments unevenness and has formed
steep strata east to west. Miocene bedrock in this area is high, so alluvial
sediment is very thin.

Fig. 1: 
Geologic map and localization of study zone 

Fig. 2: 
Groundwater flow direction, piezometric map and positions
of piezometers 
Tabriz city in the southeast and south has been over lied Miocene beds and
PlioQuaternary faces that consist of semi compressed conglomerate beds with
sandstone, carbonate, tuff, agglomerate and marl. These formations have formed
Tabriz aquifer.
The quantitative study of the piezometric fluctuations accounts for the general tendencies of the seasonal or the interannual variations that are translated in a very different way according to the hydrogeologic context. Annual piezometric fluctuation depends on the aquifer hydraulics characteristics, the position of the basins upstream and downstream limits, the groundwater depth, the replenishment time and essentially the internal aquifer geometry. These characteristics are the major parameters intervening in the mode and the chronology of the piezometric events. By analyzing the piezometric map (lower water bodies) (Fig. 2), we can distinguish despite existence some anomalies in direction of groundwater flow, general direction of groundwater in study area is east to west. Cause of these anomalies is over abstraction of pumping wells in farming field (central and southwest of map). The data utilized in this study were collected over 10 years (from April 1995 to March 2004) with a 1month time interval. Data collected consist of the following categories: (i) observed piezometry of 16 piezometers (Central (CP), 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16), (ii) rainfall in Tabriz airport station, (iii) mean temperature in Tabriz airport station, (iv) discharge of Ajichay river in Vaniyar station. Figure 2 shows position of piezometers in the study area.
RESULTS AND DISCUSSION
After the number of neurons in the first, hidden and last layers were fixed and in order to make the results statistically more plausible, three different partitions of the normalized database were used for the simulation. The database was divided into training, validation and test groups. For the ANN models described in this study, 60% of the available data were used for training, 20% were used for the validation and 20% were reserved for testing. There are two important steps in ANN modeling. The first is to ensure the network extracts the necessary features from the data. This is based on processing training data of the ANN, which must be available for learning purposes. The training data base must be sufficiently representative to provide adequate knowledge retrieval, as needed in future reasoning activities of the Neural Network (NN). The percentage of data needed for training, validation and testing are problem specific. However, in this study, several models were tried with percentages of training data varying from 45 to 60%. The second important step in ANN modeling is to find the optimal number of neurons in the hidden layer which discussed before.
According to recently researches (Coulibaly et al., 2001a; Lallahem et al., 2005) effective factors in fluctuation of groundwater are temperature, rainfall and mean discharge of basin. Because of typical hydrology and hydrogeology of every basin, effective delay times of each factor for special basins are different. In this study for finding the best structure of ANN model for the study area, a piezometer that can be had overall characteristics of the study sector was used. This piezometer called Central Piezometer (CP) (Fig. 2). Therefore to reach the best data set and delay time (t_{0} is present time and 1 month lag toward forecasting time), the following data set as input neurons are examined: (i) Temperature, rainfall and discharge of Ajichay river (with t_{0} and t_{0}1delay), water level in CP piezometer (with t_{0}, t_{0} 1 and t_{0}2delay) and water level in two nearest lateral piezometer (with t_{0} and t_{0}1delay), (ii) Temperature, rainfall and discharge of Ajichay river (with t_{0} and t_{0}1delay), water level in CP piezometer (with t_{0}, t_{0} 1 and t_{0}2 delay), (iii) Temperature, rainfall and discharge of Ajichay river (with t_{0} and t_{0}1delay), water level in CP piezometer (with t_{0} and t_{0} 1delay), (iv) Temperature, rainfall and discharge of Ajichay river (with t_{0} delay), water level in CP piezometer (with t_{0} delay). Each of data set is used to train by (13, 7, 1) (i.e. 13 nodes in input, 7 nodes in hidden, 1 node in output layers), (9, 6, 1), (8, 5, 1), (4, 2, 1) structures, respectively. Table 1 shows the results of considering structures with different input data sets. The target vector of structures for whole data set was water level forecasting during 24 months (20032004).
The resultants that are presented in Table 1 revealed the best data set is 4. Therefore groundwater fluctuation effective parameters in study sector for ANNs models are just Temperature, rainfall and discharge of Ajichay river (with t_{0} delay), water level in CP piezometer (with t_{0} delay).Fluctuations of water level in the nearest lateral piezometers have intangible effects on central piezometer water level. Its reason can be complexity of Tabriz aquifer. To reach high efficiency structure for study area, the fourth data set is used.
For every input variables (fourth data set), the time series was divided into
3 different subsets: One subset for training the neural network (19952000),
one for model validation (20012002) and one for model testing (20032004).
Since our goal is to predict future groundwater depths, any information concerning
aquifer fluctuation was considered as it would inhibit the efficiency of our
data driven model. By means of trial and error, an optimum network and parameter
configuration for all two networks (FNN, RNN) were derived. The input layer
in all networks consisted of 4 input nodes for precipitation, temperature and
stream flow and groundwater level. The output of the network is a prediction
of the well level at time step t_{0}+1. The number of hidden neurons
for both the RNN and FNN was determined to 2 through trial and error. This number
of neurons seems to be both time efficient and adequate to handle the rather
small amount of data of our problem. Other parameters that were adjusted in
order to achieve more accurate results were the goal value of the error function
of the network during calibration, calculated by the Root Mean Square Error
(RMSE) and determination coefficient (R^{2}), the learning rate of the
training algorithms, the number of epochs or feeds of each network. The need
of adjustment of these parameters lies in the danger of overtraining a network,
an effect that is analogous to overfitting a polynomial function. According
the presented results in Table 2, the best overall performance
for the given problem was achieved by the feedforward network trained with the
LevenbergMarquardt algorithm and the second best by the recurrent neural network
trained with the same algorithm.
Table 1: 
RMSE% of 6 networks  algorithms with different input data 

Table 2: 
R^{2} and RMSE for training step 

Table 3: 
R^{2} and RMSE for testing step 

As we can see from Table 3, even though the feedforward network
trained with the GDX algorithm seems to explain the groundwater level change
(R^{2} = 0.785), its results are shifted rendering the method unsuitable
for the problem.
The most unsuitable network was the recurrent neural network trained by the Bayesian Regularization algorithm. This may indicate that RNN requires more complex training algorithms (Coulibaly et al., 2001ac). The rest of the networks performed relatively well but tended to overestimate the observed dataset. Also all the networks performed well for 1 month ahead predictions. The most promising techniques seem to be those using the feedforward neural network trained with the LevenbergMarquardt algorithm that underestimates part of the groundwater level during 24 months (20032004). The physical meaning of this result is that the structure of this model allows its weights to adjust to values that depict the trends of the natural system we are simulating. For validating purposes, an 24monthsahead prediction is made and compared with observed values. Figure 3A and B show how 6 combinations predicted the groundwater levels for this 24 months period (20032004). After achieving the best structure and data set for forecasting study area groundwater level, in order to present spatiotemporal pattern, in addition of central piezometer model, modeling of other piezometers is necessary. Because presentation of spatiotemporal model by ANNs for groundwater level is an innovation that has not been carried out, it had very difficulty because of limitation of ANNs applications for spatial forecasting. The first step is preparing models for selected piezometers (2, 3, 6, 10, 16, 8 and 13) among study area piezometers (2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 and 16) and rest of piezometers are hold to test the model. In order to prepare each model of selected piezometers, all steps that are carried out for presentation of central piezometer model previously are again performed. Because the weights and biases of each selected piezometers model are required, the inputs and output neurons are fixed, the most important point in modeling considered piezometers is fixing hidden layer neurons. Table 4 shows summarized results of the training, validation and testing steps for selected piezometers networks with (4, 2, 1) structure.
According to Table 4 and considering the assumed limitation in fixing nodes in input layer (4), output layer (1) and hidden layer (2), the obtained results of the considered piezometers models are acceptable.
The algorithm of the method for spatiotemporal forecasting, explains by following steps:
• 
Extraction of the weights and biases from the ANN which were
used for each piezometer modeling with fixed nodes in the layers 
• 
Classification of node weights with same position for whole
models 
• 
Considering correlation between position of piezometers and
every class of weights and biases 
• 
Framing spatiotemporal ANN model (4, 2, 1) with obtained weights
and biases by regression relations and 
• 
Testing spatiotemporal model for different locations of the
study area 

Fig. 3: 
Comparison of testing results via observed values. (A) FNN
networks (B) RNN networks 
Table 4: 
Results of selected piezometers in the training, validation
and testing steps 

Table 5 presents obtained weights and biases of considered
piezometers models. In the Table 59 w and
b are presentives of abbreviation of weights and biases respectively, in this
manner subscript numbers show the number of node and its order respectively.
After classification of the same order and the number of weights and biases,
vectorial distance (R) of each piezometer from a fixed assumed point as the
area origin (Left corner of study area map, Fig. 2, X = 590000
m, Y = 4200000 m) is measured and then correlation between similar vectorial
distance of each piezometer and their classified weights and biases of the ANN
models are computed by Table curve software. The most accurate fitted functions
with determination coefficients accompanying relevant coefficients are shown
by Table 68.
The next step is cross validation of the presented spatiotemporal model that
is carried out by the remaining piezometers. Weights and biases of remaining
piezometers are computed by the obtained regression functions (Table
68) and are shown in Table 9.
These parameters are used for forecasting water level in the left piezometers and then the computed results are compared with observed water levels. Results of considered models summarized in Table 10. By this way, this model qualified for forecasting water level in whole of study sector even without any piezometers. According to the results, efficiency of the model has inverse relation with piezometer distance of central piezometer. Therefore in additional incredible advantages of the presented model, it has regarding limitation. As can be seen in table 10, the best result is related to piezometer 9 that is the nearest piezometer to central piezometer. Other piezometers results are also acceptable but it clears the aquifer has high complexity. It causes to the efficiency of the same structures have been different values (0.650.82). So the most important factor in the model result can be complexity of the aquifer and the position of every piezometer.
Table 5: 
(A) The hidden layer weights (B) The output layer weights
and biases of the model 

Table 6: 
The best fitted functions (y is the weight and x = R) for
the hidden layer weights 

Table 7: 
The best fitted functions for the output layer weights 

Table 8: 
The best fitted functions for biases 

Table 9: 
(A) Weights of the first layer nodes, (B) Weights and biases
of the second layer nodes and the first layer biases 

Table 10: 
Results of left piezometers forecasting 

CONCLUSION
Neural networks have proven to be an extremely useful method of empirical forecasting of hydrological variables. In this study, an attempt was made to identify the most stable and efficient neural network configuration for predicting groundwater level in the Tabriz aquifer and to present spatiotemporal model as an innovation. Tabriz aquifer is complex and multilayer so has some difficulties for modeling and groundwater level forecasting. Total of six different ANN configurations were tested in terms of optimum results for a prediction horizon of 24 months. The most suitable configuration for this task was proved to be a 421 feedforward network trained with the Levenberg Marquardt method as it showed the most accurate predictions of the decreasing groundwater levels. Then optimal structure is applying for forecasting of selected water level piezometers in Tabriz aquifer. From the results of the study, it can also be inferred that the LevenbergMarquardt algorithm is more appropriate for this problem since the RNN also performs well when trained with this method. This structure is also used for presenting spatiotemporal model. According to presented algorithms this model is obtained by computing correlation of vertical distance of assumed point with weights and biases of models. By this way all of the piezometer water levels in study area can be forecasting by using obtained weights and biases based on regression functions. In general, the results of the case study are satisfactory and demonstrate that neural networks can be a useful spatiotemporal prediction tool in the area of groundwater hydrology. Most importantly, this paper presents indications that neural networks can also be applied in cases where the aquifer is complex and we need in addition to temporal prediction, spatial forecasting.
In this research, it was tried to use vectorial distance as spatial factor (R) but in opinion of the authors, it will be better to use local coordinate of every considered piezometer (i.e., X, Y), in stead of vertical distance (R). This suggestion can be study as another research.