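A common pitfall when using scipy.stats.lognorm to model the log-normal distribution
阿新 • Published: 2018-11-27
The parameters of lognorm are easy to get confused by. Take s, loc and scale in lognorm.rvs(s, loc=0, scale=1, size=1). Keep in mind that loc and scale are not the mean mu and standard deviation sigma of the log-transformed data, which is how we would usually read them. The documentation says: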
The probability density function for lognorm is:
lognorm.pdf(x, s) = 1 / (s*x*sqrt(2*pi)) * exp(-1/2*(log(x)/s)**2) for x > 0, s > 0.
lognorm takes s as a shape parameter.
The probability density above is defined in the “standardized” form. To shift and/or scale the distribution use the loc and scale parameters. Specifically, lognorm.pdf(x, s, loc, scale) is identically equivalent to lognorm.pdf(y, s) / scale with y = (x - loc) / scale.
A common parametrization for a lognormal random variable Y is in terms of the mean, mu, and standard deviation, sigma, of the unique normally distributed random variable X such that exp(X) = Y. This parametrization corresponds to setting s = sigma and scale = exp(mu).
(Quoted from reference [2])
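The loc/scale identity in the quoted passage can be checked numerically. The snippet below is only a minimal sketch: the values of s, loc and scale are arbitrary, and standard_pdf is a hypothetical helper that writes out the quoted formula by hand, not something SciPy provides.

```python
import numpy as np
from scipy.stats import lognorm

# Arbitrary illustrative values (assumptions, not from the docs)
s, loc, scale = 0.5, 1.0, 2.0
x = np.linspace(1.5, 10.0, 50)  # evaluation points, all greater than loc


def standard_pdf(y, s):
    """Hypothetical helper: the standardized lognorm density, written out by hand."""
    return 1.0 / (s * y * np.sqrt(2 * np.pi)) * np.exp(-0.5 * (np.log(y) / s) ** 2)


# Shift and rescale as described: y = (x - loc) / scale, then divide the density by scale
y = (x - loc) / scale
by_hand = standard_pdf(y, s) / scale
from_scipy = lognorm.pdf(x, s, loc=loc, scale=scale)

pdf_matches = np.allclose(by_hand, from_scipy)
print(pdf_matches)
```

If the two arrays agree, loc and scale really are a plain shift and rescale of x, and neither one is mu or sigma.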
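So, to get a random variable X that is log-normally distributed in the usual sense (that is, log X follows N(mu, sigma^2)), set the lognorm parameters to s = sigma, loc = 0 and scale = exp(mu). (See references [3] and [4] for a detailed discussion.)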
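Here is a small sketch of that parametrization in practice. The values of mu and sigma, the sample size and the seed are all arbitrary assumptions chosen for illustration. It also includes the mistaken call (passing mu and sigma as loc and scale) for contrast.

```python
import numpy as np
from scipy.stats import lognorm

mu, sigma = 1.0, 0.5  # assumed mean and std of log(X), for illustration only

# Correct: s = sigma, loc = 0, scale = exp(mu)
dist = lognorm(s=sigma, loc=0, scale=np.exp(mu))
samples = dist.rvs(size=200_000, random_state=0)

# The log of the samples should look like N(mu, sigma^2)
log_mean = np.log(samples).mean()
log_std = np.log(samples).std()
print(log_mean, log_std)

# Closed-form checks: median = exp(mu), mean = exp(mu + sigma^2 / 2)
median_ok = np.isclose(dist.median(), np.exp(mu))
mean_ok = np.isclose(dist.mean(), np.exp(mu + sigma**2 / 2))

# The pitfall: treating loc and scale as mu and sigma
wrong = lognorm.rvs(sigma, loc=mu, scale=sigma, size=200_000, random_state=0)
wrong_log_mean = np.log(wrong).mean()
print(wrong_log_mean)  # not close to mu
```

With the correct call, the sample mean and standard deviation of log(X) land near mu and sigma. With the mistaken call they do not, because loc shifts X itself rather than log(X).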
References:
[1] How to implement these five powerful probability distributions in Python
http://python.jobbole.com/81321/
[2] scipy.stats documentation
https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.lognorm.html#scipy.stats.lognorm
[3] How do I get a lognormal distribution in Python with Mu and Sigma?
[4] Fitting log-normal distribution in R vs. SciPy
http://stats.stackexchange.com/questions/33036/fitting-log-normal-distribution-in-r-vs-scipy