
Research Article


A Comparison of Parametric and Nonparametric Density Functions for
Estimating Annual Precipitation in Iran 

P. Haghighat Jou, A.M. AkhoondAli, A. Behnia and R. Chinipardaz



ABSTRACT

The main purpose of this study is to compare parametric
density functions with the nonparametric Fourier series method for estimating
annual precipitation at five long-established rain gauge stations (Bushehr,
Isfahan, Meshed, Tehran and Jask) in Iran. The parametric density functions
are the normal, two- and three-parameter lognormal, two-parameter gamma,
Pearson type 3, log-Pearson type 3 and Gumbel extreme value type 1
distributions. The nonparametric approach is the Fourier series method.
Annual precipitation data from these stations were fitted with all of the
density functions and with the Fourier series. The results showed that the
Fourier series method fits annual precipitation much better than the
parametric methods. Thus, the Fourier series can be used as an alternative
approach for precipitation frequency analysis.







INTRODUCTION
Designing the structure size and reservoir capacity of a single- or
multiple-dam system for flood control, water storage or release and power
production requires the annual flood volume. Since rivers in many areas
are not gauged to monitor annual runoff, this volume is usually calculated
from the annual precipitation. Estimation of annual precipitation is also
essential in many other fields, such as watershed management, water
resources management, water requirements for large-scale irrigation
design, flood and drought studies and monitoring of climate change. All
of these depend fully or partly on annual precipitation.
Most of the annual precipitation in Iran occurs during the six months
between November and April, with an annual mean of 250 mm. The climate
of Iran is categorized as arid and semiarid. Precipitation frequency
analysis is generally carried out using parametric methods, in which a
statistical distribution such as the normal, two- and three-parameter
lognormal, two-parameter gamma, Pearson type 3, log-Pearson type 3 or
Gumbel extreme value type 1 is fitted to an annual series of data. These
methods have been applied successfully in many cases, but they have some
disadvantages: they may not fit the observed data very well, or they may
diverge from the data in the extreme tails. Other problems with these
methods are the difficulty of choosing the best distribution function and
of estimating its parameters (particularly for skewed data).
In contrast to parametric approaches, several nonparametric methods, such as
the variable kernel method (Lall et al., 1993), have been introduced
in recent years to estimate the probability density function and distribution
function of hydrologic events. Guo (1991) proposed a nonparametric
variable kernel estimation model that provides an alternative for flood
quantile estimation when historical flood data are available. The
nonparametric kernel estimator was shown to fit the real data points more
closely than its parametric counterparts. Gingras and Adamowski (1992)
applied both L-moment and nonparametric frequency analysis to annual maximum
floods. By coupling nonparametric frequency analysis with L-moment analysis,
it is possible to confirm the L-moment selection of a unimodal distribution, or
to determine that the sample is actually from a mixed distribution. Thus, the
nonparametric method helps to identify the underlying probability distribution,
particularly when samples arise from a mixed distribution. Moon
et al. (1993) compared selected techniques for estimating exceedance
frequencies of annual maximum flood events at a gauged site. They applied four
tail probability estimators and a variable kernel distribution function
estimator and concluded that the variable kernel estimator appears useful
because it automatically gives stable and accurate flood frequency estimates
without requiring a distributional assumption. Adamowski (1996) developed a
nonparametric method for low-flow frequency analysis and compared it with two
commonly used parametric methods, namely the log-Pearson type 3 and Weibull
distributions. The numerical analysis indicated that the nonparametric method
fits the data better and gives more accurate results than the parametric
methods in current use. Adamowski (2000) applied a Gaussian (normal) kernel
function for regional analysis of Annual Maximum (AM) and Partial Duration (PD)
flood data by nonparametric and L-moment methods. The results pointed out
deficiencies in the parametric approaches currently used for both AM and PD
series: traditional regional flood frequency analysis procedures assume that
all floods within a homogeneous region are generated by the same, often
unimodal, distribution, whereas this is not always true and the data series may
be multimodal. Faucher et al. (2002) compared the performance of parametric and
nonparametric methods in the estimation of flood quantiles. The log-Pearson
type 3, two-parameter lognormal and generalized extreme value distributions
were fitted to the simulated samples. It was found that nonparametric methods
perform quite similarly to the parametric methods. They compared six different
kernel functions: the biweight, normal, Epanechnikov, extreme value type 1,
rectangular and Cauchy kernels. They found no major differences between the
first four of these kernels.
Ghayour and Asakereh (2005) used the Fourier series method
to estimate the monthly temperature of Meshed. The results showed that this
method can properly fit both continuous and discrete time series and can
reveal their trend. Behnia and Haghighat Jou (2007)
applied Fourier series to estimate the annual flood probability of the Great
Karoun river, which flows through southwestern Iran. The results of this method
were then compared with those of seven parametric methods: the normal, two- and
three-parameter lognormal, two-parameter gamma, Pearson type 3, log-Pearson
type 3 and Gumbel extreme value type 1 distributions. This comparison
showed that the Fourier series method performed better. Karmakar
and Simonovic (2008) used a 70-year data set of flood peak flow, volume and
duration and applied nonparametric methods based on kernel density estimation
and orthonormal series to determine the nonparametric distribution functions
for the Red River in the US. They selected the subset of the Fourier series
consisting of cosine functions as the orthonormal series. They found that the
nonparametric method based on orthonormal series is more appropriate than
kernel estimation for determining marginal distributions of flood
characteristics, as it can estimate the probability distribution function over
the whole range of possible values.
The Fourier series method is also used in this study, but for annual
precipitation frequency analysis. The results of this nonparametric method
are compared with those of the parametric methods mentioned above, using
the same annual precipitation data sets from five rain gauge stations
in Iran. The more accurate method is then identified.
MATERIALS AND METHODS
This study began in October 2007 at the Department of Hydrology
and Water Resources, Faculty of Water Sciences Engineering, Shahid Chamran
University, Ahwaz, Iran, and is ongoing.
Climate of Iran
Iran is located in the Middle East, between the Caspian Sea to the
north and the Persian Gulf and Oman Sea to the south. It lies in the
northern hemisphere, between longitudes 44 and 63° E and latitudes 25
and 40° N. Iran has a variable climate. In the northwest, winters are
cold, with heavy snowfall and subfreezing temperatures from November to
February. Spring and fall are relatively mild, while summers are dry and
hot. In the south, winters are mild and summers are very hot. Temperatures
range widely, from 36°C in the north to 54°C in the south. On the
Khuzestan plain in the southwest, heat is accompanied by high humidity.
In general, the climate of Iran is categorized as arid or semiarid, with
most precipitation falling from November through April. In most of the
country, yearly precipitation averages 240 mm or less. The major
exceptions are the higher mountain valleys of the Zagros and the Caspian
coastal plain, where precipitation averages at least 500 mm annually. In
the western part of the Caspian Sea region, rainfall exceeds 1000 mm
annually and is distributed relatively evenly throughout the year. This
contrasts with some basins of the Central Plateau that receive 100 mm or
less of precipitation annually. Precipitation varies widely, from less
than 100 mm in the southeast to about 2000 mm in the Caspian region. The
northern and western parts of Iran have four distinct seasons. Toward the
south and east, spring and autumn become increasingly short and
ultimately merge in an area of mild winters and hot summers.
Sources of Precipitation
There are five distinct sources of precipitation over Iran. Three of
them are westerly winds blowing from the Mediterranean Sea, southwesterly
winds flowing from the Horn of Africa and northern winds flowing from
Siberia. These winds produce rainfall in the northwestern, western and
southwestern parts of the country in winter. The other two are
southeasterly monsoon winds blowing from the Indian Ocean, which produce
scanty and scattered rainfall in the southeast in summer, and northerly
winds blowing from the Caspian Sea, which produce relatively heavy
rainfall only in the littoral provinces, i.e., Gilan, Mazandaran and
Golestan, throughout the year.
Rain Gauge Stations Selection
Annual precipitation data from five long-established rain gauge stations
in Iran were selected for analysis: Bushehr, Isfahan, Meshed, Tehran and
Jask. Figure 1 shows the geographical location of the stations on the map
of Iran. There are many synoptic and meteorological stations in Iran, but
these stations were selected because they have long records, ranging from
84 to 113 years. The data were collected from two sources: World Weather
Records and the Meteorological Yearbooks of Iran, which are published by
the Iranian Meteorological Organization. Data up to 1960 were taken from
the first source and the remainder, up to 2004, from the second. The
sample size and date of establishment of each station are given in
Table 1 and the geographical characteristics of the stations are
presented in Table 2. The statistical characteristics of the annual
precipitation data for the five stations are listed in Table 3 for use
in the proposed methods.

Fig. 1: Geographical location of rain gauge stations on the map of Iran

Table 1: The sample sizes and date of establishment of stations

Table 2: Geographical characteristics of the stations

Table 3: Basic statistics of annual precipitation for the old stations

Fourier Series Method
The parametric distributions and their parameter estimation methods are
not described in this study because they are available in other publications,
such as Kite (1988), Haan (1977), Rao and Hamed (1999), Yue et al.
(1999), Yue (2000) and El Adlouni et al. (2008). Therefore, only the
Fourier series method is described here.
Kronmal and Tarter (1968) proposed the Fourier series
method as a feasible nonparametric approach to estimating probability density
and distribution functions. To design density estimators, orthogonal functions
were developed by Devroye and Gyorfi (1985). For an orthogonal
system, the probability density function, f(x), and the cumulative
distribution, F(x), can be estimated as follows (Kronmal and Tarter, 1968):
$$\hat{f}(x) = \sum_{k} \hat{a}_{k}\,\psi_{k}(x) \qquad (1)$$
and
$$\hat{F}(x) = \sum_{k} \hat{A}_{k}\,\psi_{k}(x) \qquad (2)$$
where â_{k} and Â_{k} are computed from a random sample
of size n and have the following property: E(â_{k}) = a_{k}
and E(Â_{k}) = A_{k}. Specifically, for Fourier estimators of the density
and distribution functions, a_{k} and A_{k} are the kth Fourier
coefficients with respect to the orthogonal functions ψ_{k}(x). Kronmal
and Tarter (1968) expressed the coefficients a_{k} and A_{k}
in terms of the trigonometric moments, for all k ≥ 0:
[Eq. 3 and 4]
where the defining expression holds on the interval of interest and is zero
elsewhere. Then, the estimators are:
[Eq. 5 and 6]
where x̄ is the mean of the n samples and [a, b] is the interval of interest. To
determine the optimal number of Fourier terms, Kronmal and
Tarter (1968) suggested that the mth term should be included if:
[Eq. 7]
In this study, because incorporating higher-order Fourier terms
improves the goodness of fit to the observed data, all of the data were used
in the numerical analysis. To compare the parametric and nonparametric
methods, the Mean Relative Deviation (MRD) and the Mean Square Relative
Deviation (MSRD) were used to measure the goodness of fit of the methods
mentioned above. These statistics are defined as follows:
[Eq. 8 and 9]
where x and x̂ are the observed and estimated annual precipitation,
respectively, and n is the sample size. In addition, the observed and
estimated data are compared graphically.
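As a concrete illustration of a Fourier series estimator, the following minimal Python sketch builds density and distribution estimates from the cosine subset of the Fourier series on an interval [a, b]. The cosine form is an assumption made for illustration (it is the subset mentioned in the introduction in connection with Karmakar and Simonovic, 2008) and is not necessarily the exact form of the estimators used in this study. The function names, the number of terms m and the sample values are all hypothetical.

```python
import math

def cosine_coefficients(sample, a, b, m):
    """Coefficients c_1..c_m of a cosine-series density estimate on [a, b]."""
    n = len(sample)
    width = b - a
    coeffs = []
    for k in range(1, m + 1):
        # Sample mean of cos(k*pi*(x - a)/(b - a)), scaled by 2/(b - a).
        total = sum(math.cos(k * math.pi * (x - a) / width) for x in sample)
        coeffs.append(2.0 * total / (n * width))
    return coeffs

def density(x, coeffs, a, b):
    """Estimated density: uniform term plus m cosine terms; zero outside [a, b]."""
    if x < a or x > b:
        return 0.0
    width = b - a
    u = math.pi * (x - a) / width
    return 1.0 / width + sum(c * math.cos(k * u)
                             for k, c in enumerate(coeffs, start=1))

def cdf(x, coeffs, a, b):
    """Estimated distribution function: term-by-term integral of the density."""
    if x <= a:
        return 0.0
    if x >= b:
        return 1.0
    width = b - a
    u = math.pi * (x - a) / width
    return (x - a) / width + sum(c * width / (k * math.pi) * math.sin(k * u)
                                 for k, c in enumerate(coeffs, start=1))

# Hypothetical annual precipitation totals (mm), not data from the study.
sample = [120.0, 180.0, 200.0, 220.0, 280.0]
a, b = 100.0, 300.0  # assumed interval of interest
coeffs = cosine_coefficients(sample, a, b, m=4)
print(density(200.0, coeffs, a, b))
print(cdf(200.0, coeffs, a, b))
```

Because the series is truncated, the density estimate can dip below zero where data are sparse, which is one reason a rule for choosing the number of terms, such as the one suggested by Kronmal and Tarter (1968), matters in practice.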
Data Analysis
Annual precipitation data from the five rain gauge stations in Iran
were fitted with seven parametric functions and with the Fourier series as
a nonparametric approach, in order to compare their goodness of fit.
For the parametric methods, the parameters were estimated by both the
method of moments and the maximum likelihood procedure and the MRD and
MSRD values were then calculated using Eq. 8 and 9. Table 4 and 5 list
these values, which range from 2.233 to 65.968 and from 6.998 to
122493.250, respectively. For the nonparametric method, the values of x̂
were calculated using Eq. 6. The values of MRD and MSRD were then
calculated using these estimates in Eq. 8 and 9. These values range from
0.350 to 2.630 and from 0.280 to 84.004, respectively and are given in
the last columns of Table 4 and 5 under the abbreviation FS (Fourier
series). The quantiles at selected return periods were estimated for each
data set using the best-fitting parametric distribution and the Fourier
series. These values were obtained for return periods from 1.0101 to
100 years and are presented in Table 6.
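The two goodness-of-fit statistics can be sketched in a few lines of Python. The percent-based forms below are an assumption: they follow definitions that are common in the hydrology literature, and the exact forms of Eq. 8 and 9 may differ. The function names and the numbers are hypothetical.

```python
def mean_relative_deviation(observed, estimated):
    """MRD (assumed form): mean of |x - x_hat| / x, expressed in percent."""
    n = len(observed)
    return sum(abs(x - xh) / x * 100.0
               for x, xh in zip(observed, estimated)) / n

def mean_square_relative_deviation(observed, estimated):
    """MSRD (assumed form): mean of the squared percent relative deviations."""
    n = len(observed)
    return sum(((x - xh) / x * 100.0) ** 2
               for x, xh in zip(observed, estimated)) / n

# Hypothetical observed and estimated annual precipitation (mm).
# Observed values must be nonzero because they appear in the denominator.
observed = [100.0, 200.0, 400.0]
estimated = [110.0, 190.0, 400.0]

mrd = mean_relative_deviation(observed, estimated)
msrd = mean_square_relative_deviation(observed, estimated)
print(mrd, msrd)
```

In this toy case the relative deviations are 10, 5 and 0 percent, so the MRD is 5 and the MSRD is about 41.7. Squaring penalizes the single large deviation more heavily, which is why the MSRD separates methods more sharply than the MRD.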
Table 4: Values of MRD using parametric methods and Fourier series

^{1}Parameters were estimated by the method of moments. ^{2}Parameters
were estimated by the maximum likelihood procedure. N = Normal,
P3 = Pearson type 3, LN2 = two-parameter lognormal, LP3 = log-Pearson
type 3, LN3 = three-parameter lognormal, G = Gumbel extreme value type 1,
G2 = two-parameter gamma, FS = Fourier series method
Table 5: Values of MSRD using parametric methods and Fourier series

Table 6: Annual precipitation quantiles estimated by parametric distributions and the Fourier series method
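Quantiles at given return periods are obtained by inverting a fitted distribution function. The sketch below assumes the usual convention that a return period T corresponds to a non-exceedance probability P = 1 - 1/T (the text does not state the convention explicitly), so that T = 1.0101 years gives P of about 0.01 and T = 100 years gives P = 0.99. The uniform distribution and its bounds are hypothetical stand-ins for a fitted distribution function, and the function names are illustrative.

```python
def nonexceedance_probability(return_period):
    """Assumed convention: P(X <= x_T) = 1 - 1/T for return period T (years)."""
    return 1.0 - 1.0 / return_period

def quantile_from_cdf(cdf, p, lo, hi, tol=1e-9):
    """Invert a nondecreasing CDF on [lo, hi] by bisection."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cdf(mid) < p:
            lo = mid   # quantile lies above mid
        else:
            hi = mid   # quantile lies at or below mid
    return 0.5 * (lo + hi)

def uniform_cdf(x):
    """Stand-in for a fitted CDF: uniform on a hypothetical range of 100-300 mm."""
    return min(1.0, max(0.0, (x - 100.0) / 200.0))

for T in (1.0101, 2.0, 10.0, 100.0):
    p = nonexceedance_probability(T)
    print(T, p, quantile_from_cdf(uniform_cdf, p, 100.0, 300.0))
```

Bisection only requires that the distribution function be nondecreasing on the search interval, so the same routine would apply to a parametric fit or to a series estimate whose distribution function is monotone on that interval.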

RESULTS AND DISCUSSION
According to Table 4 and 5, comparison of the MRD and MSRD values shows
that no single parametric distribution function fits the data sets of all
stations best. For example, Pearson type 3 fits the Bushehr data best (due
to their high skewness), log-Pearson type 3 fits the Isfahan and Jask data
best (due to their lower skewness) and the three-parameter lognormal fits
the Meshed and Tehran data best (due to their low skewness). However,
considering the MRD and MSRD values for the eight methods, it is apparent
that the Fourier series method fits all of the data sets much better than
the parametric distributions; its very low MRD and MSRD values make it an
easy choice among the methods. Hence, the Fourier series method is easy to
apply and can be used as a robust approach both for fitting data and for
estimating quantiles. Figure 2 and 3 show the fits to annual precipitation
at Jask station; the relative goodness of fit is similar for the other
stations.
Unlike parametric distributions, which assume that the data follow certain
models, the Fourier series has no such restriction. Therefore, the Fourier
series has two advantages over parametric approaches, namely a better fit and
no need to select a distribution for each data set, and the aims of the current
study are achieved. The results of this study are also similar to those of
Behnia and Haghighat Jou (2007), who concluded that the Fourier series fits the
data better than the parametric methods, as mentioned in the introduction.
However, they used the Fourier series for annual flood probability analysis of
the Great Karoun river in Iran. Furthermore, Karmakar and Simonovic (2008) used
a 70-year data set of flood peak flow, volume and duration for the Red River in
the US and selected the subset of the Fourier series consisting of cosine
functions as the orthonormal series. They found that the method based on
orthonormal series is more appropriate than kernel estimation for determining
marginal distributions of flood characteristics, as it can estimate the
probability distribution function over the whole range of possible values. In
the present study, the Fourier series was applied as a nonparametric approach
to annual precipitation probability analysis in Iran. Although our results
agree with those of Behnia and Haghighat Jou (2007) and Karmakar and Simonovic
(2008), our findings concern annual precipitation over Iran rather than floods.
From this point of view, they are significant and new findings for the country.

Fig. 2: Observed and fitted Fourier series for Jask station

Fig. 3: Observed and fitted log-Pearson type 3 for Jask station
CONCLUSION
Annual precipitation data sets from five long-established rain gauge
stations (Bushehr, Isfahan, Meshed, Tehran and Jask) in Iran were fitted
with seven parametric density functions (normal, two- and three-parameter
lognormal, two-parameter gamma, Pearson type 3, log-Pearson type 3 and
Gumbel extreme value type 1) and with the Fourier series. The aim was to
compare parametric density functions with the nonparametric Fourier series
for estimating annual precipitation at these stations. To do this, the MRD
and MSRD values for both approaches were calculated and compared. Since
the minimum values belonged to the Fourier series approach, it is concluded
that this method estimates the annual precipitation for the stations much
better than the other methods. In addition, users need not choose among
several distributions, because a single approach is applicable to all data
sets. These are the two advantages of the Fourier series over parametric
approaches for estimating annual precipitation in Iran.

REFERENCES 
Adamowski, K., 1996. Nonparametric estimation of low-flow frequencies. J. Hydraulic Eng., 122: 46-50.
Adamowski, K., 2000. Regional analysis of annual maximum and partial duration flood data by nonparametric and L-moment methods. J. Hydrol., 229: 219-231.
Behnia, A.K. and P.H. Jou, 2007. Estimating Karoun annual flood probabilities using Fourier series method. Proceedings of the 7th International River Engineering Conference, February 13-15, 2007, Shahid Chamran University, Ahwaz, Iran, pp: 17.
Devroye, L. and L. Gyorfi, 1985. Nonparametric Density Estimation: The L1 View. 1st Edn., John Wiley and Sons, New York, ISBN: 0471816469.
El Adlouni, S., B. Bobée and T.B.M.J. Ouarda, 2008. On the tails of extreme event distributions in hydrology. J. Hydrol., 355: 16-33.
Faucher, D., P.F. Rasmussen and B. Bobée, 2002. Nonparametric estimation of quantiles by the kernel method (In French). J. Water Sci., 15: 515-541.
Ghayour, H.A. and H. Asakereh, 2005. Application of the Fourier models in estimating monthly temperature and its predictability. Case study: Meshed temperature. Geographical Research Quarterly, No. 77, pp: 83-98 (In Persian).
Gingras, D. and K. Adamowski, 1992. Coupling of nonparametric frequency and L-moment analyses for mixed distribution identification. J. Am. Water Resourc. Associat., 28: 263-272.
Guo, S.L., 1991. Nonparametric variable kernel estimation with historical floods and paleoflood information. Water Resourc. Res., 27: 91-98.
Haan, C.T., 1977. Statistical Methods in Hydrology. 1st Edn., Iowa State University Press, Ames, Iowa, ISBN: 081381510X, pp: 378.
Karmakar, S. and S.P. Simonovic, 2008. Bivariate flood frequency analysis using copula with parametric and nonparametric marginals. Proceedings of the 4th International Symposium on Flood Defence: Managing Flood Risk, Reliability and Vulnerability, May 6-8, 2008, Toronto, Ontario, Canada, pp: 150.
Kim, K.D. and J.H. Heo, 2002. Comparative study of flood quantiles estimation by nonparametric models. J. Hydrol., 260: 176-193.
Kite, G.W., 1988. Frequency and Risk Analysis in Hydrology. 4th Edn., Water Resources Publications, Littleton, Colorado 80161-2841, USA, ISBN: 0918334640, pp: 257.
Kronmal, R. and M. Tarter, 1968. The estimation of probability densities and cumulatives by Fourier series methods. J. Am. Statist. Assoc., 63: 925-952.
Lall, U., Y.I. Moon and K. Bosworth, 1993. Kernel flood frequency estimators: Bandwidth selection and kernel choice. Water Resourc. Res., 29: 1003-1016.
Moon, Y.I., U. Lall and K. Bosworth, 1993. A comparison of tail probability estimators for flood frequency analysis. J. Hydrol., 151: 343-363.
Park, B.U. and J.S. Marron, 1990. Comparison of data driven bandwidth selectors. J. Am. Statist. Assoc., 85: 66-72.
Rao, A.R. and K.H. Hamed, 1999. Flood Frequency Analysis. 1st Edn., CRC Press, Boca Raton, FL, ISBN-10: 0849300835, Pages: 376.
Yue, S., T.B.M.J. Ouarda, B. Bobée, P. Legendre and P. Bruneau, 1999. The Gumbel mixed model for flood frequency analysis. J. Hydrol., 226: 88-100.
Yue, S., 2000. The Gumbel logistic model for representing a multivariate storm event. Adv. Water Resourc., 24: 179-185.



