Generalized normal distribution and Skew normal distribution

Density Function

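The Generalized Gaussian density has the following form:

$$p(x;\rho) = \frac{1}{2\,\Gamma(1+1/\rho)}\,\exp\left(-|x|^{\rho}\right)$$

where $\rho$ (rho) is the "shape parameter". The density is plotted in the following figure:

[Figure: plot of the Generalized Gaussian density]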

Matlab code used to generate this figure is available here: ggplot.m.

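Adding an arbitrary location parameter, $\mu$, and inverse scale parameter, $\beta$, the density has the form,

$$p(x;\mu,\beta,\rho) = \frac{\beta^{1/2}}{2\,\Gamma(1+1/\rho)}\,\exp\left(-\beta^{\rho/2}\,|x-\mu|^{\rho}\right)$$

[Figure: plot of the Generalized Gaussian density with location and inverse scale parameters]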

Matlab code used to generate this figure is available here: ggplot2.m.

Generating Random Samples

Samples from the Generalized Gaussian can be generated by a transformation of Gamma random samples, using the fact that if $Y$ is a $\mathrm{Gamma}(1/\rho, 1)$ distributed random variable, and $S$ is an independent random variable taking the value -1 or +1 with equal probability, then,

$$X = S \cdot Y^{1/\rho}$$

is distributed as a Generalized Gaussian with $\mu = 0$, $\beta = 1$ and shape $\rho$. That is,

$$p_Y(y) = \frac{1}{\rho\,\Gamma(1+1/\rho)}\, y^{1/\rho - 1} e^{-y}, \quad y > 0,$$

where the density of $Y$ is written in a non-standard but suggestive form.
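The transformation described above can be sketched in Python. This is a minimal sketch, not the contents of gg6.m: it assumes the zero-location, unit-scale density proportional to exp(-|x|^rho), for which the Gamma variable has shape 1/rho and unit scale. The function name `sample_generalized_gaussian` is hypothetical.

```python
import random

def sample_generalized_gaussian(rho, n, rng=None):
    """Draw n samples with density proportional to exp(-|x|**rho).

    Assumption: zero location and unit scale; rho > 0 is the shape parameter.
    """
    rng = rng or random.Random()
    samples = []
    for _ in range(n):
        y = rng.gammavariate(1.0 / rho, 1.0)       # Gamma(shape=1/rho, scale=1)
        s = 1.0 if rng.random() < 0.5 else -1.0    # random sign, +1 or -1
        samples.append(s * y ** (1.0 / rho))
    return samples

rng = random.Random(0)
xs = sample_generalized_gaussian(2.0, 20000, rng)
mean = sum(xs) / len(xs)
var = sum((x - mean) ** 2 for x in xs) / len(xs)
print(mean, var)  # for rho = 2 the target density is normal with variance 1/2
```

For rho = 2 the density is proportional to exp(-x^2), a normal density with variance 1/2, which gives a quick sanity check on the sample variance.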

Matlab Code

Matlab code to generate random variates from the Generalized Gaussian density with the parameters described above is available here:

gg6.m

As an example, we generate random samples from the example Generalized Gaussian densities shown above.

[Figure: random samples generated from the example Generalized Gaussian densities]

Matlab code used to generate this figure is available here: ggplot3.m.

Mixture Densities

A more general family of densities can be constructed from mixtures of Generalized Gaussians. A mixture density, $p(x)$, is made up of $m$ constituent densities $p_j(x)$ together with probabilities $\alpha_j$ associated with each constituent density.

$$p(x) = \sum_{j=1}^{m} \alpha_j\, p_j(x)$$

The densities $p_j(x)$ have different forms, or parameter values. A random variable with a mixture density can be thought of as being generated by a two-part process: first a decision is made as to which constituent density to draw from, where the $j$-th density is chosen with probability $\alpha_j$, then the value of the random variable is drawn from the chosen density. Independent repetitions of this process result in a sample having the mixture density $p(x)$.

As an example consider the density,

[equation image not available]

[Two figures illustrating the example mixture density]

Matlab code used to generate these figures is available here: ggplot4.m.
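The two-part sampling process for a mixture can be sketched in Python as follows. This is an illustrative sketch, not the contents of ggplot4.m: the component parameters are made up for the example, the helper names `gg_sampler` and `sample_mixture` are hypothetical, and the components are parametrized by an ordinary scale (density proportional to exp(-(|x - mu| / scale)^rho)), which is an assumption of this sketch.

```python
import random

def gg_sampler(mu, scale, rho):
    """Return a function drawing one generalized Gaussian sample.

    Assumed density: proportional to exp(-(|x - mu| / scale) ** rho).
    """
    def draw(rng):
        y = rng.gammavariate(1.0 / rho, 1.0)
        sign = 1.0 if rng.random() < 0.5 else -1.0
        return mu + scale * sign * y ** (1.0 / rho)
    return draw

def sample_mixture(weights, samplers, n, rng):
    """Step 1: choose a component by its probability. Step 2: draw from it."""
    out = []
    for _ in range(n):
        j = rng.choices(range(len(samplers)), weights=weights, k=1)[0]
        out.append(samplers[j](rng))
    return out

rng = random.Random(1)
weights = [0.3, 0.7]                      # made-up mixing probabilities
samplers = [gg_sampler(-2.0, 1.0, 2.0),   # Gaussian-shaped component
            gg_sampler(3.0, 1.0, 1.0)]    # Laplace-shaped component
xs = sample_mixture(weights, samplers, 20000, rng)
sample_mean = sum(xs) / len(xs)
print(sample_mean)  # the mixture mean is 0.3 * (-2) + 0.7 * 3 = 1.5
```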


The generalized normal distribution or generalized Gaussian distribution (GGD) is either of two families of parametric continuous probability distributions on the real line. Both families add a shape parameter to the normal distribution. To distinguish the two families, they are referred to below as "version 1" and "version 2". However, this is not standard nomenclature.

Version 1

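Generalized Normal (version 1)
Probability density function

[Figure: probability density function plots]

Cumulative distribution function

[Figure: cumulative distribution function plots]

Parameters $\mu$ location (real)
$\alpha$ scale (positive, real)
$\beta$ shape (positive, real)
Support $x \in (-\infty, +\infty)$
PDF $\frac{\beta}{2\alpha\,\Gamma(1/\beta)}\, e^{-(|x-\mu|/\alpha)^{\beta}}$

$\Gamma$ denotes the gamma function
CDF

$\frac{1}{2} + \operatorname{sign}(x-\mu)\,\frac{\gamma\left(1/\beta,\ (|x-\mu|/\alpha)^{\beta}\right)}{2\,\Gamma(1/\beta)}$

$\gamma$ denotes the lower incomplete gamma function
Mean $\mu$
Median $\mu$
Mode $\mu$
Variance $\frac{\alpha^{2}\,\Gamma(3/\beta)}{\Gamma(1/\beta)}$
Skewness 0
Ex. kurtosis $\frac{\Gamma(5/\beta)\,\Gamma(1/\beta)}{\Gamma(3/\beta)^{2}} - 3$
Entropy $\frac{1}{\beta} - \log\left[\frac{\beta}{2\alpha\,\Gamma(1/\beta)}\right]$[1]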

Known also as the exponential power distribution, or the generalized error distribution, this is a parametric family of symmetric distributions. It includes all normal and Laplace distributions, and as limiting cases it includes all continuous uniform distributions on bounded intervals of the real line.

This family includes the normal distribution when $\beta = 2$ (with mean $\mu$ and variance $\alpha^{2}/2$) and it includes the Laplace distribution when $\beta = 1$. As $\beta \to \infty$, the density converges pointwise to a uniform density on $(\mu - \alpha, \mu + \alpha)$.

This family allows for tails that are either heavier than normal (when $\beta < 2$) or lighter than normal (when $\beta > 2$). It is a useful way to parametrize a continuum of symmetric, platykurtic densities spanning from the normal ($\beta = 2$) to the uniform density ($\beta = \infty$), and a continuum of symmetric, leptokurtic densities spanning from the Laplace ($\beta = 1$) to the normal density ($\beta = 2$).
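The special cases above can be checked numerically. The following Python sketch assumes the usual version 1 density, beta / (2 alpha Gamma(1/beta)) * exp(-(|x - mu| / alpha)^beta); the function names are hypothetical and chosen for this example only.

```python
import math

def gennorm_pdf(x, mu=0.0, alpha=1.0, beta=2.0):
    """Version 1 generalized normal density (assumed standard form)."""
    coeff = beta / (2.0 * alpha * math.gamma(1.0 / beta))
    return coeff * math.exp(-((abs(x - mu) / alpha) ** beta))

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def laplace_pdf(x, mu, b):
    return math.exp(-abs(x - mu) / b) / (2.0 * b)

alpha = 1.5
for x in (-2.0, 0.0, 0.7, 3.0):
    # beta = 2: normal with variance alpha**2 / 2, so sigma = alpha / sqrt(2)
    print(gennorm_pdf(x, 0.0, alpha, 2.0), normal_pdf(x, 0.0, alpha / math.sqrt(2.0)))
    # beta = 1: Laplace with scale alpha
    print(gennorm_pdf(x, 0.0, alpha, 1.0), laplace_pdf(x, 0.0, alpha))

# Large beta: close to the uniform density 1 / (2 * alpha) inside (mu - alpha, mu + alpha)
print(gennorm_pdf(0.0, 0.0, 1.0, 50.0), gennorm_pdf(1.5, 0.0, 1.0, 50.0))
```

Each printed pair should agree to floating-point precision, and for large beta the density inside the interval approaches 1 / (2 alpha) while vanishing outside it.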

Parameter estimation

Parameter estimation via maximum likelihood and the method of moments has been studied.[2] The estimates do not have a closed form and must be obtained numerically. Estimators that do not require numerical calculation have also been proposed.[3]

The generalized normal log-likelihood function has infinitely many continuous derivatives (i.e. it belongs to the class $C^{\infty}$ of smooth functions) only if $\beta$ is a positive, even integer. Otherwise, the function has $\lfloor \beta \rfloor$ continuous derivatives. As a result, the standard results for consistency and asymptotic normality of maximum likelihood estimates of $\beta$ only apply when $\beta \ge 2$.

Maximum likelihood estimator

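It is possible to fit the generalized normal distribution adopting an approximate maximum likelihood method.[4][5] With $\mu$ initially set to the sample first moment $m_1$, $\beta$ is estimated by using a Newton–Raphson iterative procedure, starting from an initial guess of $\beta = \beta_0$,

$$\beta_0 = \frac{m_1}{\sqrt{m_2}},$$

where

$$m_1 = \frac{1}{N}\sum_{i=1}^{N} |x_i|$$

is the first statistical moment of the absolute values and $m_2$ is the second statistical moment. The iteration is

$$\beta_{i+1} = \beta_i - \frac{g(\beta_i)}{g'(\beta_i)},$$

where

$$g(\beta) = 1 + \frac{\psi(1/\beta)}{\beta} - \frac{\sum_{i=1}^{N} |x_i-\mu|^{\beta} \log|x_i-\mu|}{\sum_{i=1}^{N} |x_i-\mu|^{\beta}} + \frac{\log\left(\frac{\beta}{N}\sum_{i=1}^{N} |x_i-\mu|^{\beta}\right)}{\beta},$$

and

$$g'(\beta) = -\frac{\psi(1/\beta)}{\beta^{2}} - \frac{\psi'(1/\beta)}{\beta^{3}} + \frac{1}{\beta^{2}} - \frac{\sum_{i=1}^{N} |x_i-\mu|^{\beta} (\log|x_i-\mu|)^{2}}{\sum_{i=1}^{N} |x_i-\mu|^{\beta}} + \frac{\left(\sum_{i=1}^{N} |x_i-\mu|^{\beta} \log|x_i-\mu|\right)^{2}}{\left(\sum_{i=1}^{N} |x_i-\mu|^{\beta}\right)^{2}} + \frac{\sum_{i=1}^{N} |x_i-\mu|^{\beta} \log|x_i-\mu|}{\beta \sum_{i=1}^{N} |x_i-\mu|^{\beta}} - \frac{\log\left(\frac{\beta}{N}\sum_{i=1}^{N} |x_i-\mu|^{\beta}\right)}{\beta^{2}},$$

and where $\psi$ and $\psi'$ are the digamma function and trigamma function.

Given a value for $\beta$, it is possible to estimate $\mu$ by finding the minimum of:

$$\sum_{i=1}^{N} |x_i - \mu|^{\beta}$$

Finally $\alpha$ is evaluated as

$$\alpha = \left(\frac{\beta}{N}\sum_{i=1}^{N} |x_i-\mu|^{\beta}\right)^{1/\beta}$$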
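The last two steps (estimating the location for a given shape, then evaluating the scale in closed form) can be sketched in Python. This sketch deliberately leaves out the Newton–Raphson update of the shape parameter and treats `beta` as given. It assumes beta >= 1, so that the sum of |x_i - mu|^beta is convex in mu and a ternary search is valid, and it assumes the scale formula alpha = (beta / N * sum |x_i - mu|^beta)^(1/beta). The function names are hypothetical.

```python
def estimate_mu(xs, beta, iters=200):
    """Minimise sum(|x - mu| ** beta) over mu by ternary search (needs beta >= 1)."""
    def cost(m):
        return sum(abs(x - m) ** beta for x in xs)

    lo, hi = min(xs), max(xs)
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if cost(m1) < cost(m2):
            hi = m2          # the minimiser lies to the left of m2
        else:
            lo = m1          # the minimiser lies to the right of m1
    return 0.5 * (lo + hi)

def estimate_alpha(xs, mu, beta):
    """Closed-form scale estimate for known mu and beta."""
    n = len(xs)
    return (beta / n * sum(abs(x - mu) ** beta for x in xs)) ** (1.0 / beta)

data = [1.0, 2.0, 3.0, 6.0]
mu_hat = estimate_mu(data, beta=2.0)               # for beta = 2 this is the sample mean
alpha_hat = estimate_alpha(data, mu_hat, beta=2.0)
print(mu_hat, alpha_hat)
```

For beta = 2 the location estimate reduces to the sample mean and alpha squared equals twice the sample variance, consistent with the normal special case having variance alpha^2 / 2.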

Applications

This version of the generalized normal distribution has been used in modeling when the concentration of values around the mean and the tail behavior are of particular interest.[6][7] Other families of distributions can be used if the focus is on other deviations from normality. If the symmetry of the distribution is the main interest, the skew normal family or version 2 of the generalized normal family discussed below can be used. If the tail behavior is the main interest, the Student's t family can be used, which approximates the normal distribution as the degrees of freedom grow to infinity. The t distribution, unlike this generalized normal distribution, obtains heavier than normal tails without acquiring a cusp at the origin.

Properties

The multivariate generalized normal distribution, i.e. the product of $n$ exponential power distributions with the same $\beta$ and $\alpha$ parameters, is the only probability density that can be written in the form $p(\mathbf{x}) = g(\|\mathbf{x}\|_{\beta})$ and has independent marginals.[8] The result for the special case of the multivariate normal distribution is originally attributed to Maxwell.[9]

Version 2

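Generalized Normal (version 2)
Probability density function

[Figure: probability density function plots]

Cumulative distribution function

[Figure: cumulative distribution function plots]

Parameters $\xi$ location (real)
$\alpha$ scale (positive, real)
$\kappa$ shape (real)
Support $x \in (-\infty, \xi + \alpha/\kappa)$ if $\kappa > 0$
$x \in (-\infty, \infty)$ if $\kappa = 0$
$x \in (\xi + \alpha/\kappa, +\infty)$ if $\kappa < 0$
PDF $\frac{\phi(y)}{\alpha - \kappa(x - \xi)}$, where
$y = -\frac{1}{\kappa}\log\left[1 - \frac{\kappa(x-\xi)}{\alpha}\right]$ if $\kappa \neq 0$, and $y = \frac{x-\xi}{\alpha}$ if $\kappa = 0$
$\phi$ is the standard normal pdf
CDF $\Phi(y)$, where
$y = -\frac{1}{\kappa}\log\left[1 - \frac{\kappa(x-\xi)}{\alpha}\right]$ if $\kappa \neq 0$, and $y = \frac{x-\xi}{\alpha}$ if $\kappa = 0$
$\Phi$ is the standard normal CDF
Mean $\xi - \frac{\alpha}{\kappa}\left(e^{\kappa^{2}/2} - 1\right)$
Median $\xi$
Variance $\frac{\alpha^{2}}{\kappa^{2}}\, e^{\kappa^{2}}\left(e^{\kappa^{2}} - 1\right)$
Skewness $\frac{3e^{\kappa^{2}} - e^{3\kappa^{2}} - 2}{\left(e^{\kappa^{2}} - 1\right)^{3/2}}\,\operatorname{sign}(\kappa)$
Ex. kurtosis $e^{4\kappa^{2}} + 2e^{3\kappa^{2}} + 3e^{2\kappa^{2}} - 6$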

This is a family of continuous probability distributions in which the shape parameter can be used to introduce skew.[10][11] When the shape parameter is zero, the normal distribution results. Positive values of the shape parameter yield left-skewed distributions bounded to the right, and negative values of the shape parameter yield right-skewed distributions bounded to the left. Only when the shape parameter is zero is the density function for this distribution positive over the whole real line: in this case the distribution is a normal distribution; otherwise the distributions are shifted and possibly reversed log-normal distributions.

Parameter estimation

Parameters can be estimated via maximum likelihood estimation or the method of moments. The parameter estimates do not have a closed form, so numerical calculations must be used to compute the estimates. Since the sample space (the set of real numbers where the density is non-zero) depends on the true value of the parameter, some standard results about the performance of parameter estimates will not automatically apply when working with this family.

Applications

This family of distributions can be used to model values that may be normally distributed, or that may be either right-skewed or left-skewed relative to the normal distribution. The skew normal distribution is another distribution that is useful for modeling deviations from normality due to skew. Other distributions used to model skewed data include the gamma, lognormal, and Weibull distributions, but these do not include the normal distributions as special cases.

The two generalized normal families described here, like the skew normal family, are parametric families that extend the normal distribution by adding a shape parameter. Due to the central role of the normal distribution in probability and statistics, many distributions can be characterized in terms of their relationship to the normal distribution. For example, the lognormal, folded normal, and inverse normal distributions are defined as transformations of a normally-distributed value, but unlike the generalized normal and skew-normal families, these do not include the normal distributions as special cases.
In fact, all distributions with finite variance are, in the limit, closely related to the normal distribution. The Student-t distribution, the Irwin–Hall distribution and the Bates distribution also extend the normal distribution and include it as a limiting case. So there is no strong reason to prefer the "generalized" normal distribution of type 1 over, for example, a combination of Student-t and a normalized extended Irwin–Hall distribution; the latter would also include the triangular distribution (which cannot be modeled by the generalized Gaussian type 1).
A symmetric distribution that can model both tail behavior (long and short) and center behavior (flat, triangular or Gaussian) completely independently could be derived, for example, by using X = IH/chi.


Skew normal distribution

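Skew Normal
Probability density function

[Figure: probability density function plots]

Cumulative distribution function

[Figure: cumulative distribution function plots]

Parameters $\xi$ location (real)
$\omega$ scale (positive, real)
$\alpha$ shape (real)
Support $x \in (-\infty, +\infty)$
PDF $\frac{2}{\omega\sqrt{2\pi}}\, e^{-\frac{(x-\xi)^{2}}{2\omega^{2}}} \int_{-\infty}^{\alpha\left(\frac{x-\xi}{\omega}\right)} \frac{1}{\sqrt{2\pi}}\, e^{-\frac{t^{2}}{2}}\, dt$
CDF $\Phi\left(\frac{x-\xi}{\omega}\right) - 2T\left(\frac{x-\xi}{\omega}, \alpha\right)$
$T(h, a)$ is Owen's T function
Mean $\xi + \omega\delta\sqrt{\frac{2}{\pi}}$ where $\delta = \frac{\alpha}{\sqrt{1+\alpha^{2}}}$
Variance $\omega^{2}\left(1 - \frac{2\delta^{2}}{\pi}\right)$
Skewness $\gamma_1 = \frac{4-\pi}{2}\, \frac{\left(\delta\sqrt{2/\pi}\right)^{3}}{\left(1 - 2\delta^{2}/\pi\right)^{3/2}}$
Ex. kurtosis $2(\pi - 3)\, \frac{\left(\delta\sqrt{2/\pi}\right)^{4}}{\left(1 - 2\delta^{2}/\pi\right)^{2}}$
MGF $M_X(t) = 2\exp\left(\xi t + \frac{\omega^{2}t^{2}}{2}\right)\Phi(\omega\delta t)$
CF $e^{it\xi - t^{2}\omega^{2}/2}\left(1 + i\,\operatorname{Erfi}\left(\frac{\delta\omega t}{\sqrt{2}}\right)\right)$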

In probability theory and statistics, the skew normal distribution is a continuous probability distribution that generalises the normal distribution to allow for non-zero skewness.

Definition

Let $\phi(x)$ denote the standard normal probability density function

$$\phi(x) = \frac{1}{\sqrt{2\pi}}\, e^{-\frac{x^{2}}{2}}$$

with the cumulative distribution function given by

$$\Phi(x) = \int_{-\infty}^{x} \phi(t)\, dt = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{x}{\sqrt{2}}\right)\right],$$

where erf is the error function. Then the probability density function (pdf) of the skew-normal distribution with parameter $\alpha$ is given by

$$f(x) = 2\phi(x)\Phi(\alpha x)$$

This distribution was first introduced by O'Hagan and Leonard (1976). A popular alternative parameterization is due to Mudholkar and Hutson (2000), which has a form of the c.d.f. that is easily inverted such that there is a closed form solution to the quantile function.

A stochastic process that underpins the distribution was described by Andel, Netuka and Zvara (1984).[1] Both the distribution and its stochastic process underpinnings were consequences of the symmetry argument developed in Chan and Tong (1986), which applies to multivariate cases beyond normality, e.g. skew multivariate t distribution and others. The distribution is a particular case of a general class of distributions with probability density functions of the form f(x)=2 φ(x) Φ(x) where φ() is any PDF symmetric about zero and Φ() is any CDF whose PDF is symmetric about zero.[2]

To add location and scale parameters to this, one makes the usual transform $x \to \frac{x-\xi}{\omega}$. One can verify that the normal distribution is recovered when $\alpha = 0$, and that the absolute value of the skewness increases as the absolute value of $\alpha$ increases. The distribution is right skewed if $\alpha > 0$ and is left skewed if $\alpha < 0$. The probability density function with location $\xi$, scale $\omega$, and parameter $\alpha$ becomes

$$f(x) = \frac{2}{\omega}\, \phi\left(\frac{x-\xi}{\omega}\right) \Phi\left(\alpha\, \frac{x-\xi}{\omega}\right)$$

Note, however, that the skewness of the distribution is limited to the interval $(-1, 1)$.
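As a small illustration, the location-scale density can be evaluated with the Python standard library alone. The sketch assumes the form f(x) = (2/omega) phi((x - xi)/omega) Phi(alpha (x - xi)/omega); the function name `skewnorm_pdf` is hypothetical.

```python
import math

def phi(z):
    """Standard normal pdf."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def Phi(z):
    """Standard normal cdf, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def skewnorm_pdf(x, xi=0.0, omega=1.0, alpha=0.0):
    z = (x - xi) / omega
    return 2.0 / omega * phi(z) * Phi(alpha * z)

# alpha = 0 recovers the normal density with mean xi and standard deviation omega
print(skewnorm_pdf(1.2, xi=1.0, omega=2.0, alpha=0.0), phi(0.1) / 2.0)

# A crude Riemann sum confirms that the density integrates to about 1 for alpha = 3
step = 0.01
total = sum(skewnorm_pdf(-10.0 + step * k, 0.0, 1.0, 3.0) * step for k in range(2001))
print(total)
```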

Estimation

Maximum likelihood estimates for $\xi$, $\omega$, and $\alpha$ can be computed numerically, but no closed-form expression for the estimates is available unless $\alpha = 0$. If a closed-form expression is needed, the method of moments can be applied to estimate $\alpha$ from the sample skew, by inverting the skewness equation. This yields the estimate

$$|\delta| = \sqrt{\frac{\pi}{2}\, \frac{|\hat{\gamma}_3|^{2/3}}{|\hat{\gamma}_3|^{2/3} + \left((4-\pi)/2\right)^{2/3}}}$$

where $\delta = \frac{\alpha}{\sqrt{1+\alpha^{2}}}$, and $\hat{\gamma}_3$ is the sample skew. The sign of $\delta$ is the same as the sign of $\hat{\gamma}_3$. Consequently, $\hat{\alpha} = \delta/\sqrt{1-\delta^{2}}$.

The maximum (theoretical) skewness is obtained by setting $\delta = 1$ in the skewness equation, giving $\gamma_1 \approx 0.9952717$. However it is possible that the sample skewness is larger, and then $\alpha$ cannot be determined from these equations. When using the method of moments in an automatic fashion, for example to give starting values for maximum likelihood iteration, one should therefore let (for example) $|\hat{\gamma}_3| = \min\left(0.99, \left|\frac{1}{n}\sum_{i}\left(\frac{x_i-\mu}{\sigma}\right)^{3}\right|\right)$.
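The method-of-moments inversion, including the cap on the sample skewness, can be sketched in Python. The sketch assumes the skewness formula gamma = ((4 - pi)/2) (delta sqrt(2/pi))^3 / (1 - 2 delta^2/pi)^(3/2) with delta = alpha / sqrt(1 + alpha^2); the function names and the default cap of 0.99 are choices made for this example.

```python
import math

def skewness_from_delta(delta):
    """Theoretical skewness of the skew normal for a given delta."""
    c = delta * math.sqrt(2.0 / math.pi)
    return (4.0 - math.pi) / 2.0 * c ** 3 / (1.0 - 2.0 * delta ** 2 / math.pi) ** 1.5

def alpha_from_sample_skew(sample_skew, cap=0.99):
    """Invert the skewness equation; |skew| is capped so that |delta| < 1."""
    g = min(cap, abs(sample_skew))
    g23 = g ** (2.0 / 3.0)
    k23 = ((4.0 - math.pi) / 2.0) ** (2.0 / 3.0)
    delta_abs = math.sqrt(math.pi / 2.0 * g23 / (g23 + k23))
    delta = math.copysign(delta_abs, sample_skew)   # same sign as the sample skew
    return delta / math.sqrt(1.0 - delta ** 2)

# Round trip: alpha = 2 -> delta -> skewness -> estimated alpha
delta_true = 2.0 / math.sqrt(1.0 + 2.0 ** 2)
gamma = skewness_from_delta(delta_true)
print(gamma, alpha_from_sample_skew(gamma))

# A sample skewness above the theoretical maximum is capped instead of failing
print(alpha_from_sample_skew(5.0))
```

Without the cap, a sample skewness above roughly 0.995 would give |delta| >= 1 and the final square root would fail, which is why the cap is applied before inverting.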

Concern has been expressed about the impact of skew normal methods on the reliability of inferences based upon them.[3]

Differential equation

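The differential equation leading to the pdf of the skew normal distribution is

$$\omega^{4} f''(x) + \left(\alpha^{2} + 2\right)\omega^{2}(x - \xi)\, f'(x) + f(x)\left(\left(\alpha^{2} + 1\right)(x - \xi)^{2} + \omega^{2}\right) = 0,$$

with initial conditions

$$f(0) = \frac{\exp\left(-\frac{\xi^{2}}{2\omega^{2}}\right)\operatorname{erfc}\left(\frac{\alpha\xi}{\sqrt{2}\,\omega}\right)}{\sqrt{2\pi}\,\omega}, \qquad f'(0) = \frac{\exp\left(-\frac{(\alpha^{2}+1)\xi^{2}}{2\omega^{2}}\right)\left(2\alpha\omega + \sqrt{2\pi}\,\xi \exp\left(\frac{\alpha^{2}\xi^{2}}{2\omega^{2}}\right)\operatorname{erfc}\left(\frac{\alpha\xi}{\sqrt{2}\,\omega}\right)\right)}{2\pi\omega^{3}}$$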

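Generalized Gaussian distribution: sub-Gaussian, Gaussian and super-Gaussian signals

The Gaussianity of a signal is defined through its kurtosis. For a signal x with zero mean, the kurtosis is defined as

kurt(x) = E{x^4} - 3[E{x^2}]^2

kurt(x) < 0: sub-Gaussian signal
kurt(x) = 0: Gaussian signal
kurt(x) > 0: super-Gaussian signal

Given a sample of an arbitrary signal x, its kurtosis can be computed as follows and used to judge its Gaussianity. Assume x is a 1*N row vector:

x=x-mean(x)*ones(1,N); % remove the mean
KurtX=mean(x.^4)-3*(mean(x.^2))^2; % compute the kurtosis

A uniformly distributed signal is sub-Gaussian, a Laplace-distributed signal is super-Gaussian, and speech signals are super-Gaussian. By the central limit theorem, the superposition of N independent signals with different distributions tends toward a Gaussian distribution, so the non-Gaussianity of a signal is a good optimization criterion for blind signal separation.

Relative to a Gaussian signal, a sub-Gaussian signal is flatter and may be multimodal, while a super-Gaussian signal is more sharply peaked and has longer tails.

For a Gaussian signal, second-order statistics are enough to describe its properties. Typical signals in communication systems, however, are usually sub-Gaussian, so second-order statistics are not enough and higher-order statistics must be used.

Non-stationary signal: can be understood simply as a signal whose distribution parameters or distribution law change over time.
Gaussian signal: a stationary signal whose distribution follows the normal distribution.
Non-stationary Gaussian signal: a signal whose distribution law does not change over time and is always Gaussian, but whose distribution parameters (mean and variance) do change over time.
Non-stationary signals are usually handled with time-frequency analysis and wavelet analysis.

Addendum: a Gaussian signal is a signal whose amplitude values occur with probabilities that follow a Gaussian distribution.
From the point of view of ICA, the trouble with Gaussian signals is that a Gaussian signal looks like a pile of corn (incidentally, its probability density curve really does look like a heap of corn). Pour one pile of corn onto another and the result is still a pile of corn, with nothing to show that it was mixed from two original piles, so in theory Gaussian signals cannot be separated.

A super-Gaussian distribution is more concentrated than the Gaussian distribution, and a sub-Gaussian distribution is flatter.
Super-Gaussian: fourth-order cumulant greater than 0.
Sub-Gaussian: fourth-order cumulant less than 0.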
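The kurtosis test described above can also be sketched in Python using only the standard library. The definition used is kurt(x) = E{x^4} - 3 [E{x^2}]^2 after removing the mean; the function names and the tolerance `tol` used to decide when the kurtosis counts as zero are choices made for this example.

```python
def kurtosis(xs):
    """Sample version of E{x^4} - 3 * (E{x^2})^2 after removing the mean."""
    n = len(xs)
    mean = sum(xs) / n
    centered = [x - mean for x in xs]
    m2 = sum(c ** 2 for c in centered) / n
    m4 = sum(c ** 4 for c in centered) / n
    return m4 - 3.0 * m2 ** 2

def classify(xs, tol=1e-12):
    """Label a sample by the sign of its kurtosis."""
    k = kurtosis(xs)
    if k < -tol:
        return "sub-Gaussian"
    if k > tol:
        return "super-Gaussian"
    return "Gaussian"

flat = [-1.0, 1.0, -1.0, 1.0]          # two-point signal: as flat as it gets
peaky = [0.0] * 8 + [-2.0, 2.0]        # mostly zero with rare large values
print(kurtosis(flat), classify(flat))
print(kurtosis(peaky), classify(peaky))
```

With a finite sample the kurtosis of truly Gaussian data is only approximately zero, so in practice the tolerance would need to reflect the sample size.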
