A Primer on Deep Learning

Deep learning has been all over the news lately. In a presentation I gave at Boston Data Festival 2013 and at a recent PyData Boston meetup I provided some history of the method and a sense of what it is being used for presently. This post aims to cover the first half of that presentation, focusing on the question of why we have been hearing so much about deep learning lately. The content is aimed at data scientists who might have heard a little about deep learning and are interested in a bit more context. Regardless of your background, hopefully you will see how deep learning might be relevant for you. At the very least, you should be able to separate the signal from the noise as the media hype around deep learning increases.

What is deep learning?

I like to use the following three-part definition as a baseline. Deep learning is:

  1. a collection of statistical machine learning techniques
  2. used to learn feature hierarchies
  3. often based on artificial neural networks

That's it. Not so scary after all. For something that sounds so innocuous, there's a lot of rumble in the news about what might be done with DL in the future. Let's start with an example of what has already been done, to motivate why it is proving interesting to so many.

Save the whales!

What does it do that couldn't be done before?

We'll first talk a bit about deep learning in the context of the 2013 Kaggle-hosted quest to save the whales. The competition asks its entrants the following question: given a set of 2-second sound clips from buoys in the ocean, can you classify each clip as containing a call from a North Atlantic right whale or not? The practical application is that if we can detect where the whales are migrating by picking up their calls, we can route shipping traffic to avoid them, a positive both for effective shipping and for whale preservation.

In a post-competition interview, the competition's winners noted the value of focusing on feature generation, also called feature engineering. Data scientists spend a significant portion of their time, effort, and creativity engineering good features; in contrast, they spend relatively little time running machine learning algorithms. A simple example of an engineered feature would involve subtracting two columns and including this new number as an additional descriptor of your data. In the case of the whales, the winning team represented each sound clip as a spectrogram and built features based on how well the spectrogram matched a set of example templates. They then iterated, building new features that helped them correctly classify the examples their previous feature set got wrong. A sketch of such a template-matching feature follows below.
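To make the idea concrete, here is a minimal sketch of what a template-matching feature might look like in Python. This is my illustration under stated assumptions, not the winning team's actual code; the 2 kHz sample rate, the clip array, and the template spectrogram are all hypothetical.

    # A hedged sketch of a template-matching feature, not the winners' code.
    # Assumes SciPy/NumPy; the sample rate and template are hypothetical.
    import numpy as np
    from scipy import signal

    def template_match_feature(clip, template_spec, fs=2000):
        """Score how strongly a clip's spectrogram matches a template spectrogram."""
        _, _, clip_spec = signal.spectrogram(clip, fs=fs)
        # Normalize both so the correlation measures shape, not loudness
        clip_spec = (clip_spec - clip_spec.mean()) / (clip_spec.std() + 1e-9)
        template_spec = (template_spec - template_spec.mean()) / (template_spec.std() + 1e-9)
        # Slide the template across the spectrogram and keep the best match;
        # assumes the template is smaller than the clip's spectrogram.
        return signal.correlate2d(clip_spec, template_spec, mode='valid').max()

    # An engineered feature can be as simple as the difference of two columns:
    # df['feature_diff'] = df['template_a_score'] - df['template_b_score']

Each template then contributes one column of scores, and derived columns like the difference above become additional descriptors of the data.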

The final results

This is a look at the final standings for the competition. The results among the top contenders were pretty tight, and the winning team's focus on feature engineering paid off. But how is it that several deep learning approaches could be so competitive while using as few as one-fourth as many submissions? One answer arises from the unsupervised feature learning that deep learning can do. Rather than relying on data science experience, intuition, and trial and error, unsupervised feature learning techniques spend computational time automatically developing new ways of representing the data. The end goal is the same, but the experience along the way can be drastically different.

Not the same

This is not to say that 'deep learning' and 'unsupervised learning' are the same concept. There are unsupervised learning techniques that have nothing to do with neural networks at all, and you can certainly use neural networks for supervised learning tasks. The takeaway is that deep learning excels in tasks where the basic unit, a single pixel, a single frequency, or a single word, has very little meaning in and of itself, but a combination of such units has a useful meaning. It can learn these useful combinations of values without any human intervention. The canonical example used when discussing deep learning's ability to learn from data is the MNIST dataset of handwritten digits. When presented with 60,000 digits, a neural network can learn that it is useful to look for loops and lines when trying to classify which digit it is looking at.

Learning accomplished. On the left, the raw input digits; on the right, graphical representations of the learned features. In essence, the network learns to "see" lines and loops.
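As a rough illustration, here is a minimal sketch of unsupervised feature learning on small handwritten digits, using scikit-learn's BernoulliRBM as a stand-in for a full deep network. The hyperparameters are illustrative, not tuned, and this uses scikit-learn's 8x8 digits rather than full MNIST.

    # A minimal sketch of unsupervised feature learning on digit images.
    # No labels are used; the RBM discovers feature detectors on its own.
    from sklearn.datasets import load_digits
    from sklearn.neural_network import BernoulliRBM

    X = load_digits().data
    X = (X - X.min()) / (X.max() - X.min())  # scale pixels to [0, 1]

    rbm = BernoulliRBM(n_components=64, learning_rate=0.05, n_iter=20, random_state=0)
    rbm.fit(X)

    # Each row of components_ is a learned feature detector; reshaped to 8x8,
    # these tend to look like the strokes, lines, and loops discussed above.
    features = rbm.components_.reshape(-1, 8, 8)
    print(features.shape)  # (64, 8, 8)

Plotting those 64 reshaped components produces pictures much like the one above: the network has invented its own vocabulary of strokes without ever being told what a digit is.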

Why the new-found love for Neural Networks?

Is this old wine in new wineskins? Is this not just the humble neural network returning to the foreground?

Neural networks soared in popularity in the 1980s, peaked in the early 1990s, and slowly declined after that. There was quite a bit of hype and some high expectations, but in the end the models were just not proving as capable as had been hoped. So, what was the problem? The answer to this question helps us understand why this is called "deep learning" in the first place.

What do you mean, 'deep'?

Neural networks get their representational power from using layers of learning. Primate brains do something similar in the visual cortex, so the hope was that using more layers in a neural network would allow it to learn better models. Researchers found that they couldn't get this to work, though. They could build successful models with a shallow network, one with only a single layer of data representation; learning in a deep neural network, one with more than one layer of data representation, just wasn't working out. In reality, deep learning has been around for as long as neural networks have; we just weren't any good at using it.

Shallow Neural Network

Deep Neural Network. Deep neural networks have more than one hidden layer. It really is that simple.
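For concreteness, here is a minimal sketch of that distinction using scikit-learn's MLPClassifier; the layer sizes are arbitrary choices of mine, not from the original post. The only difference between the two models is the number of hidden layers.

    # A minimal sketch of the shallow-vs-deep distinction.
    from sklearn.neural_network import MLPClassifier

    # Shallow network: a single hidden layer of data representation
    shallow = MLPClassifier(hidden_layer_sizes=(128,))

    # Deep network: more than one hidden layer -- that is the whole definition
    deep = MLPClassifier(hidden_layer_sizes=(128, 128, 128))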

So, what changed?

The Trinity of Deep Learning: the fathers of deep learning.

Finally, in 2006, three separate groups developed ways of overcoming the difficulties that many in the machine learning world had encountered while trying to train deep neural networks. The leaders of these three groups are the fathers of the age of deep learning. This is not at all hyperbole; these figures ushered in a new epoch. Their work breathed new life into neural networks when many had given up on their utility. A few years down the line, Geoff Hinton has been snatched up by Google; Yann LeCun is Director of AI Research at Facebook; and Yoshua Bengio holds a position as research chair for Artificial Intelligence at the University of Montreal, funded in part by the video game company Ubisoft. Their trajectories show that their work is serious business.

What was it that they did to their deep neural networks to make them work? The topic of how their work enables this would merit its own lengthy discussion, so for now please accept this heavily abbreviated version. Before their work, the earliest layers in a deep network simply weren't learning useful representations of the data. In many cases they weren't learning anything at all; they stayed close to their random initialization because of the nature of the training algorithm for neural networks. Using different techniques, each of these three groups was able to get these early layers to learn useful representations, which led to much more powerful neural networks.
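To give a flavor of one such technique, greedy layer-wise unsupervised pretraining, here is a minimal sketch using scikit-learn. Each RBM learns a representation of the layer below it before any supervised learning happens. This illustrates the general idea only; it is not a reproduction of any of the three groups' methods, and the hyperparameters are illustrative.

    # A hedged sketch of greedy layer-wise pretraining: each RBM is trained
    # unsupervised on the output of the layer below, then a supervised
    # classifier sits on top.
    from sklearn.datasets import load_digits
    from sklearn.linear_model import LogisticRegression
    from sklearn.neural_network import BernoulliRBM
    from sklearn.pipeline import Pipeline

    X, y = load_digits(return_X_y=True)
    X = X / 16.0  # scale pixel values to [0, 1] for the RBMs

    stack = Pipeline([
        ('rbm1', BernoulliRBM(n_components=256, n_iter=15, random_state=0)),
        ('rbm2', BernoulliRBM(n_components=64, n_iter=15, random_state=0)),
        ('clf', LogisticRegression(max_iter=1000)),
    ])
    stack.fit(X, y)  # the RBMs train greedily, one layer at a time

In the original papers, the pretrained layers are then fine-tuned end to end with backpropagation; the pipeline above skips that step for brevity.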

This is what the hype is about. Each successive layer in a neural network uses features from the previous layer to learn more complex features.

Now that this problem has been fixed, we can ask: what is it that these neural networks learn? This paper illustrates what a deep neural network is capable of learning, and I've included the above picture to make things clearer. At the lowest level, the network learns to treat patterns of local contrast as important. The following layer then uses those patterns of local contrast to detect things that resemble eyes, noses, and mouths. Finally, the top layer is able to combine those facial features into face templates. A deep neural network composes more and more complex features in each of its successive layers.

This automated learning of data representations and features is what the hype is all about. Deep neural networks have successfully learned useful representations of imagery, audio, written language, and even molecular activity. These have previously been hard problems in machine learning, which is why they get so much attention. Don't be surprised if deep learning turns out to be the secret ingredient in even more projects in the future.

The above touches on most of the points I made in the first half of the presentation, which I hope makes for a useful primer on deep learning. The key takeaway is that the breakthroughs in 2006 have enabled deep neural networks that are able to automatically learn rich representations of data. This unsupervised feature learning is proving extremely helpful in domains where individual data points are not very useful on their own but many points taken together convey quite a bit of information. This accomplishment has proven particularly useful in areas like computer vision, speech recognition, and natural language processing.

The second half of the talk was a whirlwind tour through the topics that fall under the umbrella term of 'deep learning'. Feel free to contact me by email or leave a comment below if there are any questions you have or if you'd like pointers on where to find additional material.
