regression | p-value | Simple (bivariate) linear model | linear regression | FDR | BH | R code
P122
Basics: scatter plot, linear fit, the fitted slope parameter
Advanced: statistical testing, multiple-testing correction (FDR)
Basic R code
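# Example data: height (cm) and body mass (kg) for 10 people
height <- c(176, 154, 138, 196, 132, 176, 181, 169, 150, 175)
bodymass <- c(82, 49, 53, 112, 47, 69, 77, 71, 62, 78)

# Default scatter plot
plot(bodymass, height)

# Same scatter plot with filled points, a title, and axis labels
plot(bodymass, height, pch = 16, cex = 1.3, col = "blue",
     main = "HEIGHT PLOTTED AGAINST BODY MASS",
     xlab = "BODY MASS (kg)", ylab = "HEIGHT (cm)")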
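The outline above lists a linear fit and the fitted slope, but the basic code stops at the scatter plot. A minimal sketch of the missing step, reusing the same data, could look like this. The names `fit`, `intercept`, and `slope` are my own choices for illustration, not part of the original notes.

```r
# Same example data as above, repeated so the block runs on its own
height <- c(176, 154, 138, 196, 132, 176, 181, 169, 150, 175)
bodymass <- c(82, 49, 53, 112, 47, 69, 77, 71, 62, 78)

# Fit height as a linear function of body mass
fit <- lm(height ~ bodymass)

# Fitted parameters: intercept and slope (cm of height per kg of body mass)
intercept <- unname(coef(fit)["(Intercept)"])
slope <- unname(coef(fit)["bodymass"])

# Scatter plot with the fitted regression line drawn on top
plot(bodymass, height, pch = 16, cex = 1.3, col = "blue",
     xlab = "BODY MASS (kg)", ylab = "HEIGHT (cm)")
abline(fit, col = "red", lwd = 2)
```

For a simple linear model the slope equals `cov(bodymass, height) / var(bodymass)`, which is a quick way to sanity-check the value returned by `lm`.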
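Advanced
# Fit eruption duration against waiting time using the built-in faithful data set
eruption.lm = lm(eruptions ~ waiting, data=faithful)
# Print coefficients, standard errors, t values, and p-values
summary(eruption.lm)
# Documentation for the summary output
help(summary.lm)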
Call:
lm(formula = eruptions ~ waiting, data = faithful)

Residuals:
    Min      1Q  Median      3Q     Max
-1.2992 -0.3769  0.0351  0.3491  1.1933

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.87402    0.16014   -11.7   <2e-16 ***
waiting      0.07563    0.00222    34.1   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.497 on 270 degrees of freedom
Multiple R-squared:  0.811, Adjusted R-squared:  0.811
F-statistic: 1.16e+03 on 1 and 270 DF,  p-value: <2e-16
Decide whether there is a significant relationship between the variables in the linear regression model of the data set faithful at .05 significance level.
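Null hypothesis: there is no relationship between x and y, so the slope is zero.
Assuming the errors are normally distributed, under the null hypothesis we test whether the following statistic is significant.
Test statistic: $t = (b - B) / s_b$ follows a Student's t distribution with $n - 2$ degrees of freedom, where $b$ is the estimated slope, $B$ is the slope under the null hypothesis (here $B = 0$), and $s_b = s / \sqrt{\sum (X - \bar{X})^2}$ is the standard error of $b$, with $s$ the residual standard error.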
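To make the formula concrete, here is a rough sketch that computes the statistic by hand for the faithful data and compares it with what `summary()` reports. The variable names (`s`, `sb`, `t_stat`, `p_value`) are mine, chosen to mirror the symbols in the formula; the null slope B is taken to be 0.

```r
# faithful ships with base R (datasets package)
x <- faithful$waiting
y <- faithful$eruptions
n <- length(x)

fit <- lm(y ~ x)
b <- unname(coef(fit)[2])            # estimated slope

# Residual standard error s, with n - 2 degrees of freedom
s <- sqrt(sum(residuals(fit)^2) / (n - 2))

# Standard error of the slope: s / sqrt(sum((X - mean(X))^2))
sb <- s / sqrt(sum((x - mean(x))^2))

# t statistic under the null hypothesis B = 0
B <- 0
t_stat <- (b - B) / sb

# Two-sided p-value from the Student's t distribution with n - 2 df
p_value <- 2 * pt(abs(t_stat), df = n - 2, lower.tail = FALSE)

# The same numbers as reported by summary()
coef(summary(fit))[2, c("Std. Error", "t value", "Pr(>|t|)")]
```

The hand-computed `sb` and `t_stat` should agree with the `Std. Error` and `t value` columns of the coefficient table up to floating-point error.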
Medium article
This one is well worth reading: it explains what the coefficients and the p-values in a regression each mean.
How to Interpret Regression Analysis Results: P-values and Coefficients
Null hypothesis: the coefficient is 0. If the p-value is less than 0.05, we can reject the null hypothesis.
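As a small illustration of this rule, the coefficient and its p-value can be read directly from the summary table of the faithful model fitted earlier. This is only a sketch; `coef_table`, `alpha`, and `reject_null` are names I made up for the example.

```r
# Refit the model so this block runs on its own
eruption.lm <- lm(eruptions ~ waiting, data = faithful)

# Coefficient table: Estimate, Std. Error, t value, Pr(>|t|)
coef_table <- coef(summary(eruption.lm))

slope_est <- coef_table["waiting", "Estimate"]   # change in eruptions per unit of waiting
slope_p   <- coef_table["waiting", "Pr(>|t|)"]   # p-value for H0: slope = 0

# Decision at the 0.05 significance level
alpha <- 0.05
reject_null <- slope_p < alpha
reject_null
```

Here the slope estimate is about 0.0756, matching the summary output above, and its p-value is far below 0.05, so the null hypothesis of a zero slope is rejected.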
multiple testing
Benjamini and Hochberg's method
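A minimal sketch of the BH adjustment in R, assuming we already have a vector of p-values from many tests. The p-values below are made up purely for illustration. The manual version is only there to show the step-up logic; in practice `p.adjust(p, method = "BH")` does the same thing.

```r
# Hypothetical p-values from 10 tests (illustrative numbers, not real data)
p <- c(0.001, 0.008, 0.039, 0.041, 0.042, 0.060, 0.074, 0.205, 0.212, 0.216)
m <- length(p)

# Built-in BH adjustment
p_bh <- p.adjust(p, method = "BH")

# Manual version:
# 1. sort the p-values in ascending order
# 2. scale the i-th smallest by m / i
# 3. enforce monotonicity from the largest rank downwards, cap at 1
o <- order(p)
scaled <- p[o] * m / seq_len(m)
adj_sorted <- pmin(1, rev(cummin(rev(scaled))))

# Put the adjusted values back in the original order
p_bh_manual <- numeric(m)
p_bh_manual[o] <- adj_sorted

# Tests declared significant while controlling the FDR at 5%
significant <- p_bh < 0.05
```

Comparing `p_bh_manual` with `p_bh` via `all.equal()` is a quick check that the manual steps reproduce the built-in result.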
aggregated FDR
FDR with group info
Hu, James X., Hongyu Zhao, and Harrison H. Zhou. "False discovery rate control with groups." Journal of the American Statistical Association 105.491 (2010): 1215-1227.
To be continued~