Research Article
Normal Cloud Heavy-Tailed Model Research Based on the Semi-Invariantion
School of Electrical and Information Engineering, Hunan International Economics University, Changsha, China
Any study of uncertainty must deal with randomness and fuzziness, especially when uncertain knowledge is to be formalized. The normal cloud model was proposed by Li et al. (2011) and Li and Liu (2004) as a model for the transformation between qualitative concepts and quantitative values. As research on uncertainty has deepened, it has been argued that uncertainty is part of the charm of the world and that only uncertainty itself is certain. Among all kinds of uncertainty, randomness and fuzziness are the most essential. Considering that probability theory and fuzzy mathematics are each deficient in dealing with uncertainty, Professor Li Deyi put forward the normal cloud concept on the basis of both and studied the correlation between fuzziness and randomness. In the short span of about ten years since Li Deyi proposed the cloud model, it has been successfully applied to data mining, decision analysis, intelligent control and many other fields (Xu and Wang, 2014; Wang and Yang, 2013; Mi and Li, 2013; Wang and Xu, 2013).
Since the 1960s, a large body of international research literature on heavy-tailed distributions has appeared (Embrechts et al., 1997; Mandjes and Borst, 2000; Baltrunas, 2001). What exactly is a heavy-tailed distribution? So far there is no single precise definition, and even the name is not completely unified: a heavy-tailed distribution is sometimes also called a fat-tailed, thick-tailed or long-tailed distribution. Where no confusion arises, all of these are referred to here as heavy-tailed distributions.
Embrechts introduced the following definition of heavy tails (Embrechts et al., 1997), which is also the most widely used one.
According to definition 1, a random variable X has a heavy-tailed distribution if its exponential moments do not exist, that is, for any λ>0:
$$E\left(e^{\lambda X}\right) = \int_{-\infty}^{\infty} e^{\lambda x}\, dF(x) = \infty$$
where, F (x) is the distribution function of X.
According to definition 2, a random variable X is said to be heavy-tailed if its kurtosis is greater than 3:
$$\frac{E(X-\mu)^4}{\sigma^4} > 3$$
where, μ and σ are the expectation and standard deviation of X (Werner and Upper, 2004).
The kurtosis of the normal distribution is 3, so under this definition a heavy-tailed distribution is one whose kurtosis is greater than that of the normal distribution. However, this definition applies only when the fourth moment exists. A third definition is more intuitive.
According to definition 3, if the density function decays to 0 as a power function, the distribution is heavy-tailed; if the density function decays to 0 as an exponential function, the distribution is referred to as light-tailed (Chen and Liu, 2009).
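Definition 2 can be illustrated numerically. The following is a minimal sketch added for illustration, not part of the original study: it estimates the sample kurtosis of normal and Laplace samples. The helper name `sample_kurtosis`, the sample size and the seed are assumptions. The theoretical kurtosis is 3 for the normal distribution and 6 for the Laplace distribution, so the sample values should scatter around these numbers.

```python
import numpy as np

def sample_kurtosis(x):
    """Estimate E[(X - mu)^4] / sigma^4 from a sample (definition 2)."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    variance = np.mean(centered ** 2)
    return np.mean(centered ** 4) / variance ** 2

rng = np.random.default_rng(0)   # fixed seed (assumption) for repeatability
n = 200_000                      # illustrative sample size

k_normal = sample_kurtosis(rng.normal(0.0, 1.0, n))
k_laplace = sample_kurtosis(rng.laplace(0.0, 1.0, n))

print(f"normal  sample kurtosis: {k_normal:.3f}")
print(f"laplace sample kurtosis: {k_laplace:.3f}")
```

Because the estimate depends on fourth powers, it converges slowly for heavy-tailed data, which is why a fairly large sample is used here.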
A heavy-tailed distribution is one whose density function has a long "tail" (Li et al., 2013; Rojo, 2013). In applications, its physical meaning is that the probability of an extreme event is not negligible. In the insurance industry, for example, one of N claims may be so large that the other N-1 claims are trivial in comparison, and such large-claim problems have to be handled with heavy-tailed distributions. Problems of this kind involve so-called extreme events, such as earthquakes, floods and stock market crashes. Studying them has great practical value: in particular, a large number of insurance companies went bankrupt after the 9/11 incident, and extreme events have become a new research focus. The study of heavy-tailed distributions has therefore received more and more attention from scholars.
In this study, the semi-invariants of the characteristic function (Wang, 2007) are used to prove that the normal cloud model has a heavy-tailed distribution. A fast numerical method for calculating the probability density of the normal cloud distribution is studied, and both theoretical analysis and experimental simulation are carried out. These results can provide a new theoretical foundation for speech enhancement, speech coding and speech recognition.
METHODOLOGY
Normal cloud model and its heavy tail deduction
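Normal cloud model: Let the random variable X denote the position of a cloud droplet in the domain space, so that X consists of all cloud droplets (Li et al., 2011; Li and Liu, 2004). For a given value y of the random entropy En', the conditional probability density of X is given by Eq. 1:
$$f_X(x \mid En' = y) = \frac{1}{\sqrt{2\pi}\,|y|}\exp\left[-\frac{(x-Ex)^2}{2y^2}\right] \qquad (1)$$
The probability density of En' is expressed by Eq. 2:
$$f_{En'}(y) = \frac{1}{\sqrt{2\pi}\,He}\exp\left[-\frac{(y-En)^2}{2He^2}\right] \qquad (2)$$
Then, the probability density of X can be expressed by Eq. 3:
$$f_X(x) = \int_{-\infty}^{\infty} \frac{1}{2\pi\,He\,|y|}\exp\left[-\frac{(x-Ex)^2}{2y^2}-\frac{(y-En)^2}{2He^2}\right] dy \qquad (3)$$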
The expectation of the cloud droplets X is E(X) = Ex and the variance is D(X) = $En^2 + He^2$. A cloud droplet generated by the normal cloud algorithm is thus a random variable with expectation Ex and variance $En^2 + He^2$, and it follows a pan-normal distribution. The parameter He measures the degree of deviation from the normal distribution and can be used to reflect the non-uniformity and non-independence of the factors that affect the concept in the domain space. If He = 0, the cloud droplets follow the normal distribution; if En = 0 and He = 0, the cloud droplets always appear at the expected point Ex and there is no uncertainty.
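The stated expectation and variance can be checked by simulation. The sketch below assumes the usual two-stage construction of the normal cloud generator (first draw En' from N(En, He^2), then draw X from N(Ex, En'^2)); the function name `normal_cloud_samples`, the parameter values and the seed are illustrative assumptions, not taken from the original study.

```python
import numpy as np

def normal_cloud_samples(Ex, En, He, n, rng):
    """Draw n cloud droplets: En' ~ N(En, He^2), then X ~ N(Ex, En'^2)."""
    En_prime = rng.normal(En, He, n)
    # The standard deviation must be non-negative, so |En'| is used.
    return rng.normal(Ex, np.abs(En_prime))

rng = np.random.default_rng(1)              # fixed seed (assumption)
Ex, En, He = 2.0, 1.0, 0.3                  # illustrative parameters
x = normal_cloud_samples(Ex, En, He, 200_000, rng)

print("sample mean    :", x.mean(), " stated value Ex =", Ex)
print("sample variance:", x.var(), " stated value En^2 + He^2 =", En**2 + He**2)
```

Setting He = 0 in this sketch reduces the generator to an ordinary normal sampler, in line with the special case mentioned above.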
For a qualitative concept on which there is consensus, the certainty degrees of all cloud droplets form a random variable Y whose probability density has a fixed form that does not depend on the three numerical characteristics of the cloud model. Under the normal cloud generator algorithm, each certainty degree can be seen as a sample generated by the random variable Y:
$$y_i = \exp\left[-\frac{(x_i-Ex)^2}{2\,En_i'^2}\right]$$
The probability density function of the certainty degree Y of the cloud droplets is expressed in Eq. 4:
$$f_Y(y) = \frac{1}{\sqrt{-\pi \ln y}}, \quad 0 < y < 1 \qquad (4)$$
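The claim that the distribution of the certainty degree does not depend on (Ex, En, He) can be checked with a short simulation. This is a sketch under stated assumptions: the certainty degree is taken as exp(-(x - Ex)^2 / (2 En'^2)) with the two-stage generator, and the density 1/sqrt(-π ln y) on (0, 1) is assumed, whose integral from 0 to y is erfc(sqrt(-ln y)). The function names, parameter sets and seed are hypothetical.

```python
import math
import numpy as np

def certainty_degrees(Ex, En, He, n, rng):
    """Generate n certainty degrees with the two-stage normal cloud generator."""
    En_prime = rng.normal(En, He, n)
    x = rng.normal(Ex, np.abs(En_prime))
    return np.exp(-(x - Ex) ** 2 / (2.0 * En_prime ** 2))

def certainty_cdf(y):
    """P(Y <= y) for 0 < y < 1, assuming the density 1/sqrt(-pi*ln(y))."""
    return math.erfc(math.sqrt(-math.log(y)))

rng = np.random.default_rng(2)               # fixed seed (assumption)
n = 100_000
frac_a = np.mean(certainty_degrees(0.0, 1.0, 0.1, n, rng) <= 0.5)
frac_b = np.mean(certainty_degrees(25.0, 3.0, 0.5, n, rng) <= 0.5)

print("theoretical P(Y <= 0.5):", certainty_cdf(0.5))
print("empirical, parameter set A:", frac_a)
print("empirical, parameter set B:", frac_b)
```

Two very different parameter sets are used on purpose: if the density is indeed parameter-free, both empirical fractions should agree with the same theoretical value up to sampling noise.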
The normal cloud model reveals a cognitive mechanism. A qualitative concept is converted into a large number of quantitative cloud droplets, and these droplets have different degrees of certainty. For a consensus to form, however, the cognitive rule reflected in the certainty degrees of the cloud droplets must be the same for different people, and this common cognitive mechanism holds across the different linguistic values of different qualitative concepts (Barabasi and Albert, 1999).
Normal cloud distribution characteristic function and semi-invariants
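Definition 4: If the distribution function of the random variable X is F (x), its characteristic function φ(t) is defined in Eq. 5:
$$\varphi(t) = E\left(e^{itX}\right) = \int_{-\infty}^{\infty} e^{itx}\, dF(x) \qquad (5)$$
The characteristic function of a normal random variable X~N (α, σ2) is (Wang, 2007):
$$\varphi(t) = \exp\left(i\alpha t - \frac{\sigma^2 t^2}{2}\right)$$
Definition 5: If the characteristic function of the random variable X is φ(t), then its logarithm is expanded as in Eq. 6:
$$\ln \varphi(t) = \sum_{k=1}^{\infty} \frac{\chi_k}{k!}(it)^k \qquad (6)$$
where, χk is called the k-th semi-invariant of the random variable X (Wang, 2007).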
If the k-th order moment $m_k = E(X^k)$ of the random variable X exists, the following relations hold (Eq. 7-8):
(7) |
(8) |
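For reference, the standard relations between the first four semi-invariants (cumulants) and the raw moments $m_k = E(X^k)$ are written out here. They are general textbook identities and may be arranged differently from Eq. 7-8 of the original:

```latex
\begin{aligned}
\chi_1 &= m_1, \\
\chi_2 &= m_2 - m_1^2, \\
\chi_3 &= m_3 - 3 m_1 m_2 + 2 m_1^3, \\
\chi_4 &= m_4 - 4 m_1 m_3 - 3 m_2^2 + 12 m_1^2 m_2 - 6 m_1^4 .
\end{aligned}
```

In particular, $\chi_2$ is the variance, and for a centered variable ($m_1 = 0$) the fourth semi-invariant reduces to $\chi_4 = m_4 - 3 m_2^2$, which is the quantity whose sign decides whether the kurtosis exceeds 3.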
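The kurtosis of a random variable X is expressed in terms of its semi-invariants in Eq. 9:
$$K(X) = \frac{E[X - E(X)]^4}{[D(X)]^2} = \frac{\chi_4}{\chi_2^2} + 3 \qquad (9)$$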
Proof of the heavy-tailed characteristic of the normal cloud model: By definition 2 and Eq. 9, it suffices to prove that the fourth semi-invariant of the normal cloud distribution is positive.
Let the random variable X follow the normal cloud distribution NC (Ex, En, He); its characteristic function φ(t) is expressed in Eq. 10:
(10) |
Applying the characteristic function of the normal distribution to Eq. 10 gives Eq. 11:
(11) |
Expanding in a Taylor series about the point t = 0 gives:
(12) |
(13) |
(14) |
(15) |
(16) |
The variance of the normal cloud distribution is calculated by:
$$D(X) = \chi_2 = En^2 + He^2$$
From Eq. 16:
(17) |
According to Eq. 17, Eq. 9 and definition 2, the normal cloud model is heavy-tailed when He>0. Finally, the following Eq. 18 can be obtained:
(18) |
Thus 3<K (X)<9.
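The bound on the kurtosis can be checked numerically. The sketch below relies on a closed form derived here for illustration rather than quoted from Eq. 17-18, whose exact arrangement may differ: conditioning on En' gives E[(X - Ex)^4] = 3E[En'^4] = 3(En^4 + 6En^2He^2 + 3He^4), and dividing by (En^2 + He^2)^2 yields K(X) = 9 - 6En^4/(En^2 + He^2)^2. The function names, parameters and seed are assumptions.

```python
import numpy as np

def theoretical_kurtosis(En, He):
    """K(X) = 9 - 6*En^4 / (En^2 + He^2)^2 (closed form derived in the lead-in)."""
    s2 = En ** 2 + He ** 2
    return 9.0 - 6.0 * En ** 4 / s2 ** 2

def simulated_kurtosis(Ex, En, He, n, rng):
    """Sample kurtosis of droplets from the two-stage normal cloud generator."""
    En_prime = rng.normal(En, He, n)
    x = rng.normal(Ex, np.abs(En_prime))
    c = x - x.mean()
    return np.mean(c ** 4) / np.mean(c ** 2) ** 2

rng = np.random.default_rng(3)               # fixed seed (assumption)
k_theory = theoretical_kurtosis(1.0, 0.3)
k_sim = simulated_kurtosis(0.0, 1.0, 0.3, 200_000, rng)

print("closed-form kurtosis:", k_theory)
print("simulated kurtosis  :", k_sim)
```

In this closed form the kurtosis equals 3 when He = 0 and approaches 9 as En/He tends to 0, which agrees with the stated range.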
Normal cloud model parameter estimation and its probability density numerical calculation
Normal cloud model parameter estimation: Let the random variable X follow the normal cloud distribution NC (Ex, En, He), where Ex is the expectation, En is the entropy and He is the hyper entropy. Then:
(19) |
(20) |
Input: The quantitative values of N cloud droplets and the certainty degree with which each droplet represents the concept, (xi, yi).
Output: The expectation Ex, entropy En and hyper entropy He of the qualitative concept expressed by the N cloud droplets.
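One way to obtain the three parameters from the droplets is a method-of-moments estimate. The following sketch is an assumption and not necessarily the estimator of Eq. 19-20: it uses only the values xi (the certainty degrees yi are ignored) together with the relations E(X) = Ex, D(X) = En^2 + He^2 and K(X) = 9 - 6En^4/(En^2 + He^2)^2, the last of which follows from E[(X - Ex)^4] = 3E[En'^4]. The sample kurtosis is clipped into the interval (3, 9) before inversion. The function name and the tolerance `eps` are hypothetical.

```python
import math
import numpy as np

def estimate_cloud_parameters(x, eps=1e-6):
    """Method-of-moments estimate of (Ex, En, He) from droplet values x."""
    x = np.asarray(x, dtype=float)
    Ex = x.mean()
    c = x - Ex
    s2 = np.mean(c ** 2)                      # estimates En^2 + He^2
    k = np.mean(c ** 4) / s2 ** 2             # sample kurtosis
    k = min(max(k, 3.0 + eps), 9.0 - eps)     # keep the kurtosis inside (3, 9)
    En = (s2 ** 2 * (9.0 - k) / 6.0) ** 0.25  # from K = 9 - 6 En^4 / s2^2
    He = math.sqrt(max(s2 - En ** 2, 0.0))
    return Ex, En, He

# Usage on synthetic droplets with known parameters (illustrative values).
rng = np.random.default_rng(4)
En_prime = rng.normal(1.0, 0.3, 200_000)
droplets = rng.normal(2.0, np.abs(En_prime))
Ex_hat, En_hat, He_hat = estimate_cloud_parameters(droplets)
print("estimated (Ex, En, He):", Ex_hat, En_hat, He_hat)
```

Because the estimate of He depends on the fourth moment, it is noticeably noisier than the estimates of Ex and En for small samples.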
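Normal cloud distribution probability density analysis: For the normal cloud distribution NC (Ex, En, He), the probability density (Eq. 3) has no explicit expression and is defined by an integral. Its integrand is:
$$g(y, x) = \frac{1}{2\pi\,He\,|y|}\exp\left[-\frac{(x-Ex)^2}{2y^2}-\frac{(y-En)^2}{2He^2}\right] \qquad (21)$$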
Normal cloud distribution probability density numerical integration design: Let the random variable X follow the normal cloud distribution NC (Ex, En, He), where Ex is the expectation, En is the entropy and He is the hyper entropy.
Step 1: Sampling point design: Divide the interval (-16, 16) into K equal parts, then:
Sampling points for y<En:
Sampling points for y>En:
Step 2: Design of the integration sub-intervals: for y>En:
Step 3: For a fixed value x = x0, the probability density is obtained by numerical integration of the integrand g(y, x):
(22) |
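The idea of evaluating the density by numerical integration can be sketched as follows. This is not the non-uniform sampling scheme of Steps 1-3: it is a plain trapezoidal rule on a uniform grid over En ± 8He, and it assumes the integrand g(y, x) = exp[-(x - Ex)^2/(2y^2) - (y - En)^2/(2He^2)] / (2π He |y|) implied by the two-stage construction of the model. The function names, grid size and integration width are assumptions.

```python
import numpy as np

def integrand(y, x, Ex, En, He):
    """Assumed integrand g(y, x) of the normal cloud density."""
    expo = -(x - Ex) ** 2 / (2.0 * y ** 2) - (y - En) ** 2 / (2.0 * He ** 2)
    return np.exp(expo) / (2.0 * np.pi * He * np.abs(y))

def cloud_pdf(x0, Ex=0.0, En=1.0, He=0.1, n=2001, width=8.0):
    """Trapezoidal approximation of the density at x0 over En +/- width*He."""
    lo, hi = En - width * He, En + width * He
    if lo <= 0.0:
        # y = 0 would fall inside the grid; this simple sketch does not handle it.
        raise ValueError("sketch assumes En > width * He")
    y = np.linspace(lo, hi, n)
    g = integrand(y, x0, Ex, En, He)
    h = (hi - lo) / (n - 1)
    return h * (g.sum() - 0.5 * (g[0] + g[-1]))

f1 = cloud_pdf(1.0)                      # NC(0, 1, 0.1) evaluated at x0 = 1
print("density at x0 = 1:", f1)
```

The result for x0 = 1 can be compared with the value f = 0.23955 reported in the instance validation of this study; small differences are to be expected because the integration scheme is different. Since the integrand is concentrated near y = En, truncating the range to a few multiples of He around En loses very little mass.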
Simulation and performance analysis
Detection performance analysis: Taking (Ex, En, He) = (0, 1, 0.1) in Eq. 21, the curves of g(y, 0.5), g(y, 1) and g(y, 2) are drawn in Fig. 1.
For the normal cloud distribution NC (Ex, En, He), the information in the integrand g(y, x) of the probability density is concentrated in the vicinity of the entropy En. The numerical integration therefore uses a higher sampling frequency near En and a lower sampling frequency far from En.
Normal cloud distribution instance validation with comparative analysis: For the normal cloud distribution NC (0, 1, 0.1), the probability density (Eq. 3) is calculated by symbolic integration with the MATLAB program in Eq. 23:
(23) |
Taking x0 = 1 gives f = 0.23955, which is regarded as the true value; the computation takes 8.6 sec.
Fig. 1: | Curves of the integrand g(y, x) of the normal cloud distribution density function |
Fig. 2: | Probability density curve of the normal cloud distribution NC (0, 1, 0.1) obtained by numerical integration |
The numerical integration (Eq. 22) gives:
On the same computer, the numerical integration takes 0.0301 sec,
which is only about 0.3% of the time required by the symbolic integration. The probability density of NC (0, 1, 0.1) obtained by numerical integration is shown in Fig. 2.
Normal cloud speech model: The pure speech "The birch canoe slid on the smooth planks" comes from the file sp01.wav of the Aurora speech database (http://www.utdallas.edu/~loizou/speech/noizeus/, last accessed 2014-08-29). The background noise is normality noise selected from the Noisex-92 database (http://spib.rice.edu/spib/select_noise.html, last accessed 2014-08-29). When the signal kurtosis satisfies K(X)≤3 or K(X)≥9, a value near the corresponding border of the interval is taken as K(X). The normal cloud model is tested on pure speech and on noisy speech with babble noise at signal-to-noise ratios SNR = 5 and 0 dB; the results of fitting the speech with the normal cloud distribution are shown in Fig. 3. RMS in Fig. 3 is the mean square fitting error, and the histogram uses 200 equal intervals.
Normal distribution models were also built for the three signals above; the results are shown in Fig. 4.
The mean square fitting errors (RMS) of the two models are compared in Table 1.
Strategy performance analysis: Pure speech is a heavy-tailed system, so the normal cloud model fits it better. The noisier the speech, the closer it is to the normal model.
Qualitative control experience expressed in natural language is converted into linguistic control rules by the normal cloud model (Gao et al., 2005). Both the quantitative-to-qualitative and the qualitative-to-quantitative control mappings can be realized well. In a cloud model controller designed with this method, the control strategy is clear and intuitive and the reasoning is simple; different control mappings can be achieved by slightly modifying the numerical characteristic parameters and the control rules.
Fig. 3(a-f): | Normal cloud model (a-c) Speech signal and the noisy speech signal and (d-f) Distribution density |
Fig. 4(a-c): | Distribution density with normal model |
Table 1: | Comparison results of mean square fitting errors (RMS) |
Simulation results show that the design of the controller is successful and its robustness is strong. The normal cloud model has promising application prospects.
In this study, the heavy-tailed property of the normal cloud model is proved by means of the semi-invariants of the characteristic function, and the kurtosis of the normal cloud model is derived theoretically and shown to range from 3 to 9. These results provide a theoretical basis for in-depth application of the normal cloud model. A fast numerical method for calculating the probability density of the normal cloud distribution is studied, and both theoretical analysis and experimental simulation are carried out. Because the density function of the normal cloud distribution has no explicit expression, numerical integration is required, and a numerical integration scheme is designed to improve computational efficiency. The scheme is applied to speech modeling with the normal cloud distribution, and these results can provide a new theoretical foundation for speech enhancement, speech coding, speech recognition and so on. The study also provides a method for calculating the normal cloud distribution in artificial intelligence applications.
The concept of heavy-tailed or long-tailed densities (or distributions) has attracted much well-deserved attention in the literature. A quick Google search for the keyword "long-tailed" returns almost 12 million items. The concept has become a pillar of the theory of extremes and, through its connection with outlier-prone distributions, long-tailed distributions also play a central role in the theory of robustness. The concept of tail heaviness is by now ubiquitous and appears in a diverse set of disciplines, such as economics, communications, atmospheric sciences, climate modeling, social sciences, physics and the modeling of complex systems. Nevertheless, the precise meaning of long tails or heavy tails remains somewhat elusive. In a substantial portion of the early literature, long-tailedness meant that the underlying distribution was capable of producing anomalous observations, in the sense that they were too far from the main body of observations.
A phenomenon that is completely dominated by a large number of random factors tends to follow a normal distribution. Such phenomena are more common in nature, so the normal distribution is well suited to natural phenomena. In human society, and wherever life exists, there are usually different levels of competition and other factors: the fittest survive and preferential attachment applies in many cases, so heavy-tailed distributions such as the power-law distribution have also been studied by researchers in different fields. However, phenomena and behavior in human society are not completely dominated by random factors, because humans think rationally and choose the actions that serve themselves best. Nor do the rich simply keep getting richer or the winner take all regardless of size, because mechanisms such as government intervention work against this. Real social phenomena therefore contain both extreme events and a large number of intermediate cases, and they lie in an intermediate state between the normal distribution (egalitarian) and heavy-tailed distributions such as the power law (unbalanced); this is the heavy-tailed distribution that is to be expected. The cloud model lies between the normal distribution and the heavy-tailed distribution.