[Translation] README of the RAINBOWR GitHub Repo

阿新 • Published: 2021-07-02

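Preface

Original link: https://github.com/KosukeHamazaki/RAINBOWR/blob/master/README.md
I have been studying how to use this package recently and translated its README along the way. I will remove this post if it infringes any rights.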

Main text

Reliable Association INference By Optimizing Weights with R (RAINBOWR)

Author: Kosuke Hamazaki

Date: 2019/03/25 (Last update: 2020/10/26)

NOTE!!

The paper for `RAINBOWR` has been published in PLOS Computational Biology (https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1007663). If you use `RAINBOWR` in your paper, please cite it as follows:

Hamazaki, K. and Iwata, H. (2020) RAINBOW: Haplotype-based genome-wide association study using a novel SNP-set method. PLOS Computational Biology, 16(2): e1007663.

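The stable version of the `RAINBOWR` package is now available on CRAN (Comprehensive R Archive Network).

The older version of `RAINBOWR` is named `RAINBOW` and lives at https://github.com/KosukeHamazaki/RAINBOW .

We changed the package name from RAINBOW to RAINBOWR because the original name RAINBOW conflicted with the rainbow package ( https://cran.r-project.org/package=rainbow ) when we submitted our package to CRAN.

This repo holds the code for the R package RAINBOWR. Below, we describe how to install and how to use RAINBOWR.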

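What is `RAINBOWR`?

RAINBOWR (Reliable Association INference By Optimizing Weights with R) is a package for performing several types of GWAS, as follows.

The RGWAS.normal function performs single-SNP GWAS;
The RGWAS.multisnp function performs SNP-set (or gene-set) GWAS, testing multiple SNPs at the same time;
The RGWAS.epistasis function tests for epistatic (SNP-set x SNP-set interaction) effects (very slow and less reliable).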

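RAINBOWR also offers some functions to solve linear mixed-effects models.

The EMM.cpp function solves a single-kernel linear mixed-effects model;
The EM3.cpp function solves a multi-kernel linear mixed-effects model (for general kernels; not so fast);
The EM3.linker.cpp function solves a multi-kernel linear mixed-effects model (for linear kernels; faster).

By using these functions, you can estimate genomic heritability and perform genomic prediction (GP).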
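
To make "genomic heritability" concrete, here is a minimal sketch of the usual ratio of genetic variance to total variance. The two variance components are made-up numbers standing in for what a mixed-model solver such as EMM.cpp would estimate; the names Vu and Ve are placeholders chosen for this illustration, not confirmed output names of the package.

```r
# Hypothetical variance components (made-up values for illustration only).
Vu <- 12.5   # genetic variance explained by the relationship matrix
Ve <- 37.5   # residual variance

# Genomic heritability: share of the total variance that is genetic.
h2 <- Vu / (Vu + Ve)
print(h2)
```

With real data, the interpretation of this ratio also depends on how the relationship matrix is scaled.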

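Finally, RAINBOWR also offers other useful functions.

The qq and manhattan functions draw Q-Q plots and Manhattan plots;
The modify.data function matches phenotype data with marker genotype data;
The CalcThreshold function calculates thresholds for GWAS results;
The See function shows a brief view of data (like the head function, but more useful);
The genetrait function generates pseudo phenotypic values from marker genotypes;
The SS_GWAS function summarizes GWAS results (only for simulation studies);
The estPhylo and estNetwork functions estimate phylogenetic trees or haplotype networks, together with haplotype effects, using non-linear kernels for a haplotype block of interest.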

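Installation

The stable version of RAINBOWR is now available on CRAN (Comprehensive R Archive Network). The latest version of RAINBOWR is also available from the KosukeHamazaki/RAINBOWR repository on GitHub. To install either one, run the corresponding code below in the R console.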

#### Stable version of RAINBOWR ####
install.packages("RAINBOWR")  


#### Latest version of RAINBOWR ####
### If you have not installed yet, ...
install.packages("devtools")  

### Install RAINBOWR from GitHub
devtools::install_github("KosukeHamazaki/RAINBOWR")

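If you get errors during installation, please check that the following packages are correctly installed. (We removed the dependency on the rgl package!)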

Rcpp,      # install `Rtools` for Windows user
plotly,
Matrix,
cluster,
MASS,
pbmcapply,
optimx,
methods,
ape,
stringr,
pegas,
ggplot2,
ggtree,      # install from Bioconductor with `BiocManager::install("ggtree")`
scatterpie,
phylobase,
haplotypes,
rrBLUP,
expm,
here,
htmlwidgets,
Rfast
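
One way to see which of the packages listed above are still missing is to compare the list against the installed packages. This is a small base R sketch and not part of the original README; the variable names are chosen for illustration.

```r
# Packages named in the dependency list above.
required_pkgs <- c("Rcpp", "plotly", "Matrix", "cluster", "MASS", "pbmcapply",
                   "optimx", "methods", "ape", "stringr", "pegas", "ggplot2",
                   "ggtree", "scatterpie", "phylobase", "haplotypes", "rrBLUP",
                   "expm", "here", "htmlwidgets", "Rfast")

# Names of all packages visible in the current library paths.
installed <- rownames(installed.packages())

# Packages from the list that are not installed yet.
missing_pkgs <- setdiff(required_pkgs, installed)
print(missing_pkgs)
```

Anything reported here can then be installed with install.packages(), except ggtree, which comes from Bioconductor as noted in the list.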

Because part of the code in RAINBOWR is written in Rcpp (C++ in R), please check that you can use C++ from R. Windows users should install Rtools.
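
A quick way to test whether C++ works from R is to compile a trivial expression with Rcpp. The sketch below is a suggestion rather than part of the original README; it returns FALSE instead of stopping when Rcpp or a compiler is missing.

```r
# TRUE only if Rcpp is installed and a tiny C++ expression compiles and runs.
cpp_ok <- requireNamespace("Rcpp", quietly = TRUE) &&
  tryCatch(
    isTRUE(Rcpp::evalCpp("1 + 1") == 2),
    error = function(e) FALSE
  )
print(cpp_ok)
```

If this prints FALSE, the C++ toolchain (Rtools on Windows) is the first thing to check.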

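If you have any questions about installation, please contact us by e-mail ([email protected]).

Usage

First, import the RAINBOWR package and load the example datasets. These example datasets consist of marker genotypes (scored as {-1, 0, 1}; a 1,536-SNP chip (Zhao et al., 2010; PLoS One 5(5): e10780)), a map with physical positions, and phenotypic data (Zhao et al., 2011; Nature Communications 2:467). Both datasets can be downloaded from the Rice Diversity homepage (http://www.ricediversity.org/data/).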

### Import RAINBOWR
require(RAINBOWR)

### Load example datasets
data("Rice_Zhao_etal")
Rice_geno_score <- Rice_Zhao_etal$genoScore
Rice_geno_map <- Rice_Zhao_etal$genoMap
Rice_pheno <- Rice_Zhao_etal$pheno

### View each dataset
See(Rice_geno_score)
See(Rice_geno_map)
See(Rice_pheno)

You can check the format of the raw data with the See function. Next, select one trait (here, Flowering.time.at.Arkansas) as an example.

### Select one trait for example
trait.name <- "Flowering.time.at.Arkansas"
y <- Rice_pheno[, trait.name, drop = FALSE]
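
The drop = FALSE in the line above keeps the selected trait as a one-column table instead of collapsing it to a bare vector. The toy example below illustrates the difference, assuming the phenotype table is a data frame; the values and names are made up.

```r
# Toy phenotype table standing in for Rice_pheno (values are made up).
pheno <- data.frame(Flowering.time = c(75, 89, 94),
                    Plant.height   = c(110, 121, 98),
                    row.names      = c("line1", "line2", "line3"))

y_vec <- pheno[, "Flowering.time"]                # collapses to a numeric vector
y_df  <- pheno[, "Flowering.time", drop = FALSE]  # stays a one-column data frame

print(is.null(names(y_vec)))  # the line names are gone
print(rownames(y_df))         # the line names are kept
```

Keeping the line names matters because later steps match phenotypes to genotypes by name.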

For GWAS, first remove SNPs whose MAF <= 0.05 with the MAF.cut function. (Translator's note: this step is part of QC.)

### Remove SNPs whose MAF <= 0.05
x.0 <- t(Rice_geno_score)
MAF.cut.res <- MAF.cut(x.0 = x.0, map.0 = Rice_geno_map)
x <- MAF.cut.res$x
map <- MAF.cut.res$map
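
To show what this filter does, here is a base R sketch that computes the minor allele frequency (MAF) for genotypes scored as {-1, 0, 1} and drops markers with MAF <= 0.05. It illustrates the idea only and is not the implementation of MAF.cut; the toy matrix and variable names are made up.

```r
# Toy genotype matrix: 20 lines (rows) x 4 markers (columns), scored -1/0/1.
x0 <- cbind(
  m1 = c(rep(-1, 10), rep(1, 10)),  # balanced alleles
  m2 = c(rep(-1, 19), 0),           # one heterozygote only: rare allele
  m3 = c(rep(-1, 15), rep(1, 5)),
  m4 = rep(1, 20)                   # monomorphic marker
)
rownames(x0) <- paste0("line", 1:20)

# Frequency of the allele coded as 1: a score s carries (s + 1) copies out of 2.
freq <- (colMeans(x0) + 1) / 2

# The minor allele is whichever of the two alleles is rarer.
maf <- pmin(freq, 1 - freq)

# Keep only markers with MAF > 0.05.
x_kept <- x0[, maf > 0.05, drop = FALSE]
print(maf)
print(colnames(x_kept))
```

Here m2 (MAF 0.025) and m4 (MAF 0) are removed, while m1 and m3 are kept.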

Next, we estimate the additive genomic relationship matrix (additive GRM) with the calcGRM function.

### Estimate genomic relationship matrix (GRM) 
K.A <- calcGRM(genoMat = x)
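
For intuition about what an additive GRM contains, the sketch below builds one by hand using a common centered cross-product definition. This is an assumption for illustration; calcGRM may use a different method or scaling by default, so its output need not match these numbers.

```r
# Toy genotype matrix: 4 lines (rows) x 3 markers (columns), scored -1/0/1.
x_toy <- cbind(m1 = c(-1, 0, 1, 1),
               m2 = c(1, 1, -1, 0),
               m3 = c(0, -1, -1, 1))
rownames(x_toy) <- paste0("line", 1:4)

# Frequency of the allele coded as 1 at each marker.
p <- (colMeans(x_toy) + 1) / 2

# Center each marker column on its mean.
W <- sweep(x_toy, 2, colMeans(x_toy))

# Cross-product of centered genotypes, scaled by 2 * sum(p * (1 - p)).
K_toy <- tcrossprod(W) / (2 * sum(p * (1 - p)))
print(round(K_toy, 3))
```

The result is a symmetric lines x lines matrix; larger off-diagonal values mean two lines share more marker alleles than average.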

Then, we convert these data into the GWAS format for RAINBOWR with the modify.data function.

### Modify data
modify.data.res <- modify.data(pheno.mat = y, geno.mat = x, map = map,
                               return.ZETA = TRUE, return.GWAS.format = TRUE)
pheno.GWAS <- modify.data.res$pheno.GWAS
geno.GWAS <- modify.data.res$geno.GWAS
ZETA <- modify.data.res$ZETA

### View each data for RAINBOWR
See(pheno.GWAS)
See(geno.GWAS)
str(ZETA)

ZETA is a list that holds a genomic relationship matrix (GRM) and its design matrix.
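
The sketch below builds such a list by hand for a toy case with five phenotype records on four lines. The element names Z and K and the outer name A reflect my reading of the package documentation and are assumptions here; compare them with the str(ZETA) output above before relying on them.

```r
line_names <- paste0("line", 1:4)

# Placeholder relationship matrix (identity) for four lines.
K.A <- diag(4)
dimnames(K.A) <- list(line_names, line_names)

# Five phenotype records; line2 was measured twice.
obs_line <- c("line1", "line2", "line2", "line3", "line4")

# Design matrix: one row per record, with a 1 in the column of its line.
Z.A <- 1 * outer(obs_line, line_names, "==")
dimnames(Z.A) <- list(paste0("obs", 1:5), line_names)

# One random-effect term, named "A" for additive.
ZETA_toy <- list(A = list(Z = Z.A, K = K.A))
str(ZETA_toy)
```

The design matrix maps each phenotype record to its line, so its column count must equal the dimension of the relationship matrix.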

Finally, we can perform GWAS using these data.

First, we perform single-SNP GWAS with the RGWAS.normal function, as follows:

### Perform single-SNP GWAS
normal.res <- RGWAS.normal(pheno = pheno.GWAS, geno = geno.GWAS,
                           ZETA = ZETA, n.PC = 4, P3D = TRUE)
See(normal.res$D)  ### Column 4 contains -log10(p) values for markers
### Automatically draw Q-Q plot and Manhattan by default.
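
Once the -log10(p) values are in column 4, significant markers can be picked out with a threshold. The sketch below uses a plain Bonferroni threshold on a made-up result table; the Bonferroni choice and the column names are mine for illustration and may differ from what CalcThreshold computes.

```r
# Made-up result table in the layout described above: -log10(p) in column 4.
res <- data.frame(marker      = paste0("id", 1:5),
                  chr         = c(1, 1, 1, 2, 2),
                  pos         = c(100, 250, 900, 40, 310),
                  neg_log10_p = c(0.3, 1.2, 4.8, 0.7, 2.1))

# Bonferroni threshold on the -log10(p) scale for a 5% family-wise error rate.
alpha <- 0.05
bonf <- -log10(alpha / nrow(res))

# Markers whose -log10(p) reaches the threshold.
hits <- res[res[, 4] >= bonf, ]
print(bonf)
print(hits)
```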

Next, we perform SNP-set GWAS with the RGWAS.multisnp function.

### Perform SNP-set GWAS (by regarding 11 SNPs as one SNP-set)
SNP_set.res <- RGWAS.multisnp(pheno = pheno.GWAS, 
                              geno = geno.GWAS, 
                              ZETA = ZETA, 
                              n.PC = 4, 
                              test.method = "LR", 
                              kernel.method = "linear",
                              gene.set = NULL,
                              test.effect = "additive", 
                              window.size.half = 5, 
                              window.slide = 11)
See(SNP_set.res$D)  ### Column 4 contains -log10(p) values for markers
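
To see how window.size.half and window.slide interact, the sketch below lists the marker indices that fall into each SNP-set. It assumes each set is a center marker plus window.size.half markers on either side and that the first center sits at window.size.half + 1; the helper snp_windows is hypothetical and only mimics the idea, not the package's internal code.

```r
# Hypothetical helper: marker indices of each SNP-set window.
snp_windows <- function(n_markers, window.size.half, window.slide) {
  centers <- seq(from = window.size.half + 1,
                 to   = n_markers - window.size.half,
                 by   = window.slide)
  lapply(centers, function(ctr) (ctr - window.size.half):(ctr + window.size.half))
}

# Settings from the call above: 11-SNP sets that do not overlap.
tiled <- snp_windows(n_markers = 33, window.size.half = 5, window.slide = 11)

# Moving the window one marker at a time instead.
sliding <- snp_windows(n_markers = 33, window.size.half = 5, window.slide = 1)

print(length(tiled))    # number of non-overlapping windows
print(length(sliding))  # number of overlapping windows
```

With 33 markers, a slide of 11 tiles the markers into 3 sets, while a slide of 1 tests 23 overlapping sets at a much higher computational cost.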

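You can perform sliding-window SNP-set GWAS by setting window.slide = 1. You can also perform gene-set (or haplotype-based) GWAS by passing a dataset like the following to the gene.set argument.

The input data look like this: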

gene (or haplotype block)	marker
gene_1	id1000556
gene_1	id1000673
gene_2	id1000830
gene_2	id1000955
gene_2	id1001516
...	...
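
A table like the one above can be built in R as a two-column data frame, with the gene (or haplotype block) name first and the marker name second. The column names gene and marker are taken from the table header as an assumption; check ?RGWAS.multisnp for the exact format the function expects.

```r
# Gene-set table matching the rows shown above.
gene.set <- data.frame(
  gene   = c("gene_1", "gene_1", "gene_2", "gene_2", "gene_2"),
  marker = c("id1000556", "id1000673", "id1000830", "id1000955", "id1001516"),
  stringsAsFactors = FALSE
)

# Group marker names by gene to check the table before running the GWAS.
markers_by_gene <- split(gene.set$marker, gene.set$gene)
print(markers_by_gene)
```

This data frame would then replace gene.set = NULL in the RGWAS.multisnp call shown earlier.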

Help

If you need help before performing GWAS with RAINBOWR, see the help page of each function with ?{function_name}.

You can also check how to set each argument with:

RGWAS.menu()

The RGWAS.menu function asks a few questions; based on your answers, it tells you which function to use and how to set its arguments.

References

Kennedy, B.W., Quinton, M. and van Arendonk, J.A. (1992) Estimation of effects of single genes on quantitative traits. J Anim Sci. 70(7): 2000-2012.

Storey, J.D. and Tibshirani, R. (2003) Statistical significance for genomewide studies. Proc Natl Acad Sci. 100(16): 9440-9445.

Yu, J. et al. (2006) A unified mixed-model method for association mapping that accounts for multiple levels of relatedness. Nat Genet. 38(2): 203-208.

Kang, H.M. et al. (2008) Efficient Control of Population Structure in Model Organism Association Mapping. Genetics. 178(3): 1709-1723.

Kang, H.M. et al. (2010) Variance component model to account for sample structure in genome-wide association studies. Nat Genet. 42(4): 348-354.

Zhang, Z. et al. (2010) Mixed linear model approach adapted for genome-wide association studies. Nat Genet. 42(4): 355-360.

Endelman, J.B. (2011) Ridge Regression and Other Kernels for Genomic Selection with R Package rrBLUP. Plant Genome J. 4(3): 250.

Endelman, J.B. and Jannink, J.L. (2012) Shrinkage Estimation of the Realized Relationship Matrix. G3 Genes, Genomes, Genet. 2(11): 1405-1413.

Su, G. et al. (2012) Estimating Additive and Non-Additive Genetic Variances and Predicting Genetic Merits Using Genome-Wide Dense Single Nucleotide Polymorphism Markers. PLoS One. 7(9): 1-7.

Zhou, X. and Stephens, M. (2012) Genome-wide efficient mixed-model analysis for association studies. Nat Genet. 44(7): 821-824.

Listgarten, J. et al. (2013) A powerful and efficient set test for genetic markers that handles confounders. Bioinformatics. 29(12): 1526-1533.

Lippert, C. et al. (2014) Greater power and computational efficiency for kernel-based association testing of sets of genetic variants. Bioinformatics. 30(22): 3206-3214.

Jiang, Y. and Reif, J.C. (2015) Modeling epistasis in genomic selection. Genetics. 201(2): 759-768.

Hamazaki, K. and Iwata, H. (2020) RAINBOW: Haplotype-based genome-wide association study using a novel SNP-set method. PLOS Computational Biology, 16(2): e1007663.
