QIIME 2: A reproducible, interactive, and extensible pipeline for microbiome data analysis
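Published in Nature Methods in 2010, QIIME (pronounced "chime") is the most widely used amplicon analysis pipeline in the microbiome field; as of December 16, 2018, Google Scholar counted 13,126 citations. With sequencing throughput rising in recent years and large-scale studies under way, its software architecture can no longer meet current requirements for reproducible microbiome analysis.
To meet today's demands for big data and reproducible analysis, QIIME 2 was developed from scratch starting in 2016 under the lead of Gregory Caporaso, first author of QIIME, and fully took over from QIIME in 2018. The paper is currently under submission; the full preprint was released on PeerJ Preprints on October 24, and a second version was posted on the 3rd of this month. Here is the latest on QIIME 2.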
QIIME 2: Reproducible, interactive, scalable, and extensible microbiome data science
PeerJ Preprints
DOI: https://doi.org/10.7287/peerj.preprints.27295v2
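First authors: Evan Bolyen1, Jai Ram Rideout1, Matthew R Dillon1, Nicholas A Bokulich1
Corresponding author: J Gregory Caporaso1,21
Main affiliation: Pathogen and Microbiome Institute, Northern Arizona University, USA
The paper has 112 authors from 74 institutions. Apart from the first authors and the corresponding author, authors are listed alphabetically by surname. The full author and affiliation lists are given at the end of this article.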
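Abstract
We present QIIME 2, an open-source microbiome data science platform for scientists and engineers working in microbial ecology, as well as for clinicians and policy makers. The new features of QIIME 2 are intended to move microbiome research into its next phase. They include tools for temporal and spatial analysis and visualization, support for metabolomics and metagenomics data analysis, and automated data provenance tracking that keeps analyses reproducible and microbiome data science transparent.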
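Main text
Figure 1. Interactive visualization tools
QIIME 2 provides many interactive visualization tools. This figure shows four examples; interactive versions of the screenshots are available at https://github.com/qiime2/paper1 (well worth a visit). The code and detailed descriptions used to draw these figures are in the Online Methods section.
A. Unweighted UniFrac PCoA plot of 37,680 samples, showing that QIIME 2 can handle large sample sizes (scalable). Points are colored by Earth Microbiome Project ontology category.
B. Volatility plot showing how Bifidobacterium abundance changes over time in breast-fed and formula-fed infants. This visualization supports interactive exploration of features that vary across time and space.
C. Interactive bar plot of taxonomic composition across temperature gradients in Yellowstone hot springs. The many interactive controls greatly reduce the analysis workload.
D. Molecular map of the human skin surface. Colored points represent the abundance of the small-molecule cosmetic ingredient sodium laureth sulfate on human skin. Sample data can be visualized interactively on a 3D model, which supports the discovery of spatial patterns.
Figure 2. Iteratively recorded data provenance ensures reproducible analyses
A simplified diagram of the provenance graph for the analysis that produced the bar plot in Figure 1c. Provenance for the other panels is shown in Supplementary Figure 1.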
Code availability
QIIME 2 is available to all users, including for commercial use. Source code is at https://github.com/qiime2
Online Methods
We create a qiime2 directory to take a first look at the workflow.
wd=~/test/qiime2
mkdir -p "$wd"
cd "$wd"
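Extracting the contents of a QIIME 2 archive
Many new QIIME 2 users are confused because results come in a special format that cannot be opened directly, which makes them inconvenient to use.
In fact, the qza and qzv formats are just zip archives and can be unpacked directly with unzip.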
# Download the representative sequences (OTUs)
wget https://docs.qiime2.org/2018.8/data/tutorials/moving-pictures/rep-seqs.qza
# Unpack the archive
unzip rep-seqs.qza
# Show the first 4 lines of the sequence file
head -4 8dc793b8-7284-462a-8578-6370ffccebdc/data/dna-sequences.fasta
>f352c1f1efecf483511c2270aabd0ae6
TACGTAGGGTGCGAGCGTTAATCGGAATTACTGGGCGTAAAGCGTGCGCAGGCGGTTTTGTAAGACAGAGGTGAAATCCCCGGGCTCAACCTGGGAACTGCCTTTGTGACTGCAAGGCTG
>82e72255267397b777a1afd44ea22755
TACGGAGGATCCAAGCGTTATCCGGAATCATTGGGTTTAAAGGGTCCGTAGGCGGTTTAGTAAGTCAGTGGTGAAAGCCCATCGCTCAACGGTGGAACGGCCATTGATACTGCTAGACTT
Doesn't the QIIME 2 output look familiar now? Let's begin a new era of reproducible computing!
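The UUID directory hard-coded in the head command changes from one archive to the next. As a minimal sketch, assuming Info-ZIP unzip is installed and the archive keeps its payload under <UUID>/data/ as in the example, a member can be streamed by wildcard instead. The function name qza_cat is a hypothetical helper for illustration, not a QIIME 2 command.

```shell
#!/usr/bin/env bash
# qza_cat ARCHIVE MEMBER: print <UUID>/data/MEMBER from a .qza/.qzv archive
# to stdout without unpacking it to disk.
# Hypothetical helper for illustration; not part of QIIME 2.
qza_cat() {
    local archive="$1"
    local member="$2"
    # unzip -p streams matching members to stdout; the leading '*'
    # stands in for the archive-specific UUID directory.
    unzip -p "$archive" "*/data/$member"
}

# Run the demo only when the tutorial archive and unzip are both present.
if [ -f rep-seqs.qza ] && command -v unzip >/dev/null 2>&1; then
    qza_cat rep-seqs.qza dna-sequences.fasta | head -4
fi
```

Streaming with -p also avoids leaving an unpacked UUID directory behind in the working directory.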
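Supplementary Figures
Supplementary Figure 1. QIIME 2 schematic
Supplementary Figure 2. QIIME 2 interfaces
QIIME 2 offers several interfaces to suit users with different levels of computational experience.
a. QIIME 2 View is a web tool for viewing data and results that requires no software installation; it lets team leaders, clinicians, and policy makers explore interactive visualizations produced by others;
b. Users who prefer a graphical interface can use the native GUI, QIIME 2 Studio, which requires no command-line or programming skills;
c. For users familiar with the Linux command line and compute clusters, the command-line interface, q2cli, is recommended;
d. Data scientists who use Jupyter Notebooks or are interested in automated workflows can use the Artifact API, a Python 3 interface.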
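As a small illustration of the command-line interface, the sketch below wraps the q2cli command qiime tools peek, which reports an archive's UUID, type, and data format. It assumes a QIIME 2 environment is active and that rep-seqs.qza is in the working directory; the wrapper name peek_artifact is hypothetical and only there to fail gracefully when q2cli is missing.

```shell
#!/usr/bin/env bash
# peek_artifact FILE: summarize a QIIME 2 archive with q2cli when it is
# installed; otherwise report what is missing and return 127.
# Hypothetical wrapper for illustration; only `qiime tools peek` is q2cli.
peek_artifact() {
    if ! command -v qiime >/dev/null 2>&1; then
        echo "qiime not found on PATH; activate a QIIME 2 environment first" >&2
        return 127
    fi
    qiime tools peek "$1"
}

# Run the demo only when the tutorial archive is present.
if [ -f rep-seqs.qza ]; then
    peek_artifact rep-seqs.qza || true
fi
```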
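Supplementary Figure 3. QIIME 2 file types and structure
QIIME 2 stores data in a directory structure called an archive. Archives are compressed, which makes the data easy to move. The directory structure has a single root directory, which is identified by a UUID.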
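The single-root rule can be checked from a member listing. The sketch below reads paths of the kind that unzip -Z1 prints (one member per line) and confirms that they share one root directory that looks like a UUID. The helper name archive_root is hypothetical, and the sample listing is illustrative: its UUID is the one from the rep-seqs.qza example, while the metadata.yaml entry is an assumption about typical archive contents.

```shell
#!/usr/bin/env bash
# archive_root: read member paths on stdin (one per line) and print the
# single root directory if it looks like a UUID; return non-zero otherwise.
# Hypothetical helper for illustration; not part of QIIME 2.
archive_root() {
    local roots
    # Keep the first path component of every member and collapse duplicates.
    roots=$(cut -d/ -f1 | sort -u)
    # More than one distinct root shows up as more than one line.
    [ "$(printf '%s\n' "$roots" | grep -c '')" -eq 1 ] || return 1
    # A UUID is written as 8-4-4-4-12 hexadecimal digits.
    printf '%s\n' "$roots" |
        grep -E '^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$'
}

# Illustrative listing; for a real archive use:
#   unzip -Z1 rep-seqs.qza | archive_root
listing='8dc793b8-7284-462a-8578-6370ffccebdc/metadata.yaml
8dc793b8-7284-462a-8578-6370ffccebdc/data/dna-sequences.fasta'
root=$(printf '%s\n' "$listing" | archive_root)
echo "$root"
```

Reading the listing from stdin keeps the check independent of any particular unzip implementation.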
Author list
Evan Bolyen1, Jai Ram Rideout1, Matthew R Dillon1, Nicholas A Bokulich1, Christian Abnet2, Gabriel A Al-Ghalith3, Harriet Alexander4,5, Eric J Alm6,7, Manimozhiyan Arumugam8, Francesco Asnicar9, Yang Bai10,11,12, Jordan E Bisanz13, Kyle Bittinger14,15, Asker Brejnrod16, Colin J Brislawn17, C Titus Brown5, Benjamin J Callahan18,19, Andrés Mauricio Caraballo-Rodríguez20, John Chase1, Emily Cope1,21, Ricardo Da Silva20, Pieter C Dorrestein20, Gavin M Douglas22, Daniel M Durall23, Claire Duvallet6, Christian F Edwardson24, Madeleine Ernst20, Mehrbod Estaki25, Jennifer Fouquier26,27, Julia M Gauglitz20, Deanna L Gibson28,29, Antonio Gonzalez30, Kestrel Gorlick1, Jiarong Guo31, Benjamin Hillmann32, Susan Holmes33, Hannes Holste30,34, Curtis Huttenhower35,36, Gavin Huttley37, Stefan Janssen38, Alan K Jarmusch20, Lingjing Jiang39, Benjamin Kaehler37, Kyo Bin Kang20,40, Christopher R Keefe1, Paul Keim1, Scott T Kelley41, Dan Knights32,42, Irina Koester20,43, Tomasz Kosciolek30, Jorden Kreps1, Morgan GI Langille44, Joslynn Lee45, Ruth Ley46,47, Yong-Xin Liu10,11, Erikka Loftfield2, Catherine Lozupone48, Massoud Maher49, Clarisse Marotz30, Bryan D Martin50, Daniel McDonald30, Lauren J McIver35,36, Alexey V Melnik20, Jessica L Metcalf51, Sydney C Morgan52, Jamie Morton30,49, Ahmad Turan Naimey1, Jose A Navas-Molina30,49,53, Louis Felix Nothias20, Stephanie B Orchanian54, Talima Pearson1, Samuel L Peoples55,56, Daniel Petras20, Mary Lai Preuss57, Elmar Pruesse48, Lasse Buur Rasmussen16, Adam Rivers58, Michael S Robeson, II59, Patrick Rosenthal57, Nicola Segata9, Michael Shaffer48,60, Arron Shiffer1, Rashmi Sinha2, Se Jin Song30, John R Spear61, Austin D Swafford54, Luke R Thompson62,63, Pedro J Torres64, Pauline Trinh65, Anupriya Tripathi20,30,66, Peter J Turnbaugh67, Sabah Ul-Hasan68, Justin JJ van der Hooft69, Fernando Vargas66, Yoshiki Vázquez-Baeza30, Emily Vogtmann2, Max von Hippel70, William Walters46, Yunhu Wan2, Mingxun Wang20, Jonathan Warren71, Kyle C Weber58,72, 
Chase HD Williamson1, Amy D Willis73, Zhenjiang Zech Xu30, Jesse R Zaneveld74, Yilong Zhang75, Qiyun Zhu30, Rob Knight30,54,76, J Gregory Caporaso1,21
Affiliations
- Pathogen and Microbiome Institute, Northern Arizona University, Flagstaff, AZ, USA
- Metabolic Epidemiology Branch, National Cancer Institute, Rockville, MD, USA
- Department of Computer Science and Engineering, University of Minnesota, Minneapolis, Minnesota, USA
- Biology Department, Woods Hole Oceanographic Institution, Woods Hole, MA, USA
- Department of Population Health and Reproduction, University of California, Davis, CA, USA
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA
- Center for Microbiome Informatics and Therapeutics, Massachusetts Institute of Technology, Cambridge, MA, USA
- University of Copenhagen, Faculty of Health and Medical Sciences, Novo Nordisk Foundation Center for Basic Metabolic Research, Copenhagen, Denmark
- Centre for Integrative Biology, University of Trento, Trento, Italy
- State Key Laboratory of Plant Genomics, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, China
- Centre of Excellence for Plant and Microbial Sciences (CEPAMS), Institute of Genetics and Developmental Biology, Chinese Academy of Sciences & John Innes Centre, Beijing, China
- University of Chinese Academy of Sciences, Beijing, China
- Department of Microbiology and Immunology, University of California, San Francisco, CA, USA
- Division of Gastroenterology and Nutrition, Children’s Hospital of Philadelphia, Philadelphia, PA, USA
- Hepatology, Children’s Hospital of Philadelphia, Philadelphia, PA, USA
- Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, Denmark
- Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, WA, USA
- Department of Population Health & Pathobiology, North Carolina State University, Raleigh, NC, USA
- Bioinformatics Research Center, North Carolina State University, Raleigh, NC, USA
- Collaborative mass spectrometry innovation center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego, CA, USA
- Department of Biological Sciences, Northern Arizona University, Flagstaff, AZ, USA
- Department of Microbiology and Immunology, Dalhousie University, Halifax, Nova Scotia, Canada
- Irving K. Barber School of Arts and Sciences, University of British Columbia, Kelowna, British Columbia, Canada
- A. Watson Armour III Center for Animal Health and Welfare, Aquarium Microbiome Project, John G. Shedd Aquarium, Chicago, IL, USA
- Department of Biology, University of British Columbia Okanagan, Okanagan, BC, Canada
- Computational Bioscience Graduate Program, University of Colorado Denver Anschutz Medical Campus, Aurora, Colorado, USA
- Department of Medicine, Division of Biomedical Informatics and Personalized Medicine, University of Colorado Denver Anschutz Medical Campus, Aurora, Colorado, USA
- Irving K. Barber School of Arts and Sciences, Department of Biology, The University of British Columbia, Kelowna, BC, Canada
- Department of Medicine, The University of British Columbia, Kelowna, BC, Canada
- Department of Pediatrics, University of California San Diego, La Jolla, CA, USA
- Center for Microbial Ecology, Michigan State University, East Lansing, MI, USA
- Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN, USA
- Stanford University, Statistics Department, Palo Alto, CA, USA
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, CA, USA
- Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
- Broad Institute of MIT and Harvard, Cambridge, MA, USA
- Research School of Biology, The Australian National University, Canberra, ACT, Australia
- Department of Pediatric Oncology, Hematology and Clinical Immunology, Heinrich-Heine University Dusseldorf, Dusseldorf, Germany
- Department of Family Medicine and Public Health, University of California San Diego, La Jolla, CA, USA
- College of Pharmacy, Sookmyung Women’s University, Seoul, Republic of Korea
- San Diego State University, Department of Biology, San Diego, CA, USA
- Biotechnology Institute, University of Minnesota, Saint Paul, MN, USA
- Scripps Institution of Oceanography, University of California San Diego, La Jolla, CA, USA
- Department of Pharmacology, Dalhousie University, Halifax, Nova Scotia, Canada
- Science Education, Howard Hughes Medical Institute, Ashburn, VA, USA
- Department of Microbiome Science, Max Planck Institute for Developmental Biology, Tübingen, Germany
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
- Department of Medicine, Division of Biomedical Informatics and Personalized Medicine, University of Colorado Denver Anschutz Medical Campus, Aurora, CO, USA
- Department of Computer Science & Engineering, University of California San Diego, La Jolla, CA, USA
- Department of Statistics, University of Washington, Seattle, WA, USA
- Department of Animal Science, Colorado State University, Fort Collins, CO, USA
- Irving K. Barber School of Arts and Sciences, Unit 2 (Biology), University of British Columbia, Kelowna, BC, Canada
- Mountain View, Google LLC, Mountain View, CA, USA
- Center for Microbiome Innovation, University of California San Diego, La Jolla, CA, USA
- School of Information Studies, Syracuse University, Syracuse, NY, USA
- School of STEM, University of Washington Bothell, Bothell, WA, USA
- Department of Biological Sciences, Webster University, St Louis, MO, USA
- Agricultural Research Service, Genomics and Bioinformatics Research Unit, United States Department of Agriculture, Gainesville, FL, USA
- College of Medicine, Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, AR, USA
- Computational Bioscience Program, University of Colorado Denver Anschutz Medical Campus, Aurora, CO, USA
- Department of Civil and Environmental Engineering, Colorado School of Mines, Golden, CO, USA
- Department of Biological Sciences and Northern Gulf Institute, University of Southern Mississippi, Hattiesburg, Mississippi, USA
- Ocean Chemistry and Ecosystems Division, Atlantic Oceanographic and Meteorological Laboratory, National Oceanic and Atmospheric Administration, La Jolla, CA, USA
- Department of Biology, San Diego State University, San Diego, CA, USA
- Department of Environmental and Occupational Health Sciences, University of Washington, Seattle, WA, USA
- Division of Biological Sciences, University of California San Diego, San Diego, CA, USA
- Department of Microbiology and Immunology, University of California San Francisco, San Francisco, CA, USA
- Quantitative and Systems Biology Graduate Program, University of California Merced, Merced, CA, USA
- Bioinformatics Group, Wageningen University, Wageningen, The Netherlands
- Department of Mathematics, University of Arizona, Tucson, AZ, USA
- National Laboratory Service, Environment Agency, Starcross, UK
- College of Agriculture and Life Sciences, University of Florida, Gainesville, FL, USA
- Department of Biostatistics, University of Washington, Seattle, WA, USA
- University of Washington Bothell, School of STEM, Division of Biological Sciences, Bothell, WA, USA
- Merck & Co. Inc., Kenilworth, NJ, USA
- Department of Computer Science and Engineering, University of California San Diego, La Jolla, California, USA
Reference
https://peerj.com/preprints/27295/
Bolyen E, Rideout JR, Dillon MR, Bokulich NA, Abnet C, Al-Ghalith GA, Alexander H, Alm EJ, Arumugam M, Asnicar F, Bai Y, Bisanz JE, Bittinger K, Brejnrod A, Brislawn CJ, Brown CT, Callahan BJ, Caraballo-Rodríguez AM, Chase J, Cope E, Da Silva R, Dorrestein PC, Douglas GM, Durall DM, Duvallet C, Edwardson CF, Ernst M, Estaki M, Fouquier J, Gauglitz JM, Gibson DL, Gonzalez A, Gorlick K, Guo J, Hillmann B, Holmes S, Holste H, Huttenhower C, Huttley G, Janssen S, Jarmusch AK, Jiang L, Kaehler B, Kang KB, Keefe CR, Keim P, Kelley ST, Knights D, Koester I, Kosciolek T, Kreps J, Langille MG, Lee J, Ley R, Liu Y, Loftfield E, Lozupone C, Maher M, Marotz C, Martin BD, McDonald D, McIver LJ, Melnik AV, Metcalf JL, Morgan SC, Morton J, Naimey AT, Navas-Molina JA, Nothias LF, Orchanian SB, Pearson T, Peoples SL, Petras D, Preuss ML, Pruesse E, Rasmussen LB, Rivers A, Robeson, II MS, Rosenthal P, Segata N, Shaffer M, Shiffer A, Sinha R, Song SJ, Spear JR, Swafford AD, Thompson LR, Torres PJ, Trinh P, Tripathi A, Turnbaugh PJ, Ul-Hasan S, van der Hooft JJ, Vargas F, Vázquez-Baeza Y, Vogtmann E, von Hippel M, Walters W, Wan Y, Wang M, Warren J, Weber KC, Williamson CH, Willis AD, Xu ZZ, Zaneveld JR, Zhang Y, Zhu Q, Knight R, Caporaso JG. 2018. QIIME 2: Reproducible, interactive, scalable, and extensible microbiome data science. PeerJ Preprints 6:e27295v2 https://doi.org/10.7287/peerj.preprints.27295v2