[Computer Science] [2016.09] Uncertainty in Deep Learning





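In this work, we develop tools to obtain practical uncertainty estimates in deep learning, casting recent deep learning tools as Bayesian models without changing either the models or the optimisation. In the first part of the thesis, we develop the theory for these tools and provide applications and illustrative examples. We tie approximate inference in Bayesian models to stochastic regularisation techniques such as dropout, and assess these approximations empirically. Building on this connection between modern deep learning and Bayesian modelling, we give example applications such as active learning of image data and data-efficient deep reinforcement learning. We further demonstrate the practicality of the proposed tools through a survey of recent applications that use the suggested techniques in language applications, medical diagnostics, bioinformatics, image processing, and autonomous driving. The second part of the thesis explores the link between Bayesian modelling and deep learning and its theoretical implications. We discuss what determines a model's uncertainty properties, analyse approximate inference in the linear case, and theoretically examine various priors such as spike and slab priors.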

Deep learning has attracted tremendous attention from researchers in various fields of information engineering such as AI, computer vision, and language processing [Kalchbrenner and Blunsom, 2013; Krizhevsky et al., 2012; Mnih et al., 2013], but also from more traditional sciences such as physics, biology, and manufacturing [Anjos et al., 2015; Baldi et al., 2014; Bergmann et al., 2014]. Neural networks, image processing tools such as convolutional neural networks, sequence processing models such as recurrent neural networks, and regularisation tools such as dropout, are used extensively. However, fields such as physics, biology, and manufacturing are ones in which representing model uncertainty is of crucial importance [Ghahramani, 2015; Krzywinski and Altman, 2013]. With the recent shift in many of these fields towards the use of Bayesian uncertainty [Herzog and Ostwald, 2013; Nuzzo, 2014; Trafimow and Marks, 2015], new needs arise from deep learning. In this work we develop tools to obtain practical uncertainty estimates in deep learning, casting recent deep learning tools as Bayesian models without changing either the models or the optimisation. In the first part of this thesis we develop the theory for such tools, providing applications and illustrative examples. We tie approximate inference in Bayesian models to dropout and other stochastic regularisation techniques, and assess the approximations empirically. We give example applications arising from this connection between modern deep learning and Bayesian modelling such as active learning of image data and data-efficient deep reinforcement learning. We further demonstrate the tools' practicality through a survey of recent applications making use of the suggested techniques in language applications, medical diagnostics, bioinformatics, image processing, and autonomous driving. In the second part of the thesis we explore the insights stemming from the link between Bayesian modelling and deep learning, and its theoretical implications. We discuss what determines model uncertainty properties, analyse the approximate inference analytically in the linear case, and theoretically examine various priors such as spike and slab priors.
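The idea of reading dropout as approximate Bayesian inference can be illustrated with a small sketch: keep dropout switched on at prediction time, run several stochastic forward passes, and treat the spread of the outputs as a rough indication of model uncertainty. The code below is only an illustration under stated assumptions, not the thesis's implementation. The toy network uses random, untrained weights, the function names (`make_toy_network`, `stochastic_forward`, `mc_dropout_predict`) are hypothetical, and the sample variance shown here leaves out any observation-noise term.

```python
import random
import statistics


def make_toy_network(n_hidden=50, seed=0):
    """Random, untrained weights for a 1-input, 1-output net (illustrative only)."""
    rng = random.Random(seed)
    w1 = [rng.gauss(0.0, 1.0) for _ in range(n_hidden)]
    b1 = [rng.gauss(0.0, 1.0) for _ in range(n_hidden)]
    w2 = [rng.gauss(0.0, 1.0 / n_hidden ** 0.5) for _ in range(n_hidden)]
    return w1, b1, w2


def stochastic_forward(x, params, p_drop, rng):
    """One forward pass with a fresh Bernoulli dropout mask on the hidden units."""
    w1, b1, w2 = params
    keep = 1.0 - p_drop
    out = 0.0
    for w_in, b, w_out in zip(w1, b1, w2):
        h = max(0.0, w_in * x + b)       # ReLU hidden unit
        if rng.random() < keep:          # unit survives with probability `keep`
            out += w_out * h / keep      # inverted-dropout scaling
    return out


def mc_dropout_predict(x, params, p_drop=0.5, n_samples=200, seed=1):
    """Mean and sample variance over `n_samples` stochastic forward passes."""
    rng = random.Random(seed)
    samples = [stochastic_forward(x, params, p_drop, rng) for _ in range(n_samples)]
    return statistics.mean(samples), statistics.variance(samples)


params = make_toy_network()
mean, var = mc_dropout_predict(0.3, params)
print("predictive mean:", mean)
print("sample variance:", var)
```

With `p_drop=0.0` every pass is identical, so the variance collapses to zero; the non-zero spread comes entirely from the dropout masks.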

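1 Introduction: The Importance of Knowing What We Don't Know
2 The Language of Uncertainty
3 Bayesian Deep Learning
4 Uncertainty Quality
5 Applications
6 Deep Insights
7 Future Research
Appendix A KL condition
Appendix B Figures
Appendix C Spike and slab prior KL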


