Deep Learning Papers
1. Deep Learning Fundamentals and History
1.0 Book
[0] The deep learning bible ★★★★★
Bengio, Yoshua, Ian J. Goodfellow, and Aaron Courville. "Deep learning." An MIT Press book. (2015).
https://github.com/HFTrader/DeepLearningBook/raw/master/DeepLearningBook.pdf
1.1 Survey
[1] Survey by the three giants ★★★★★
LeCun, Yann, Yoshua Bengio, and Geoffrey Hinton. "Deep learning." Nature 521.7553 (2015): 436-444.
http://www.cs.toronto.edu/%7Ehinton/absps/NatureDeepReview.pdf
1.2 Deep Belief Networks (DBN)
[2] A milestone on the eve of deep learning ★★★
Hinton, Geoffrey E., Simon Osindero, and Yee-Whye Teh. "A fast learning algorithm for deep belief nets." Neural computation 18.7 (2006): 1527-1554.
http://www.cs.toronto.edu/%7Ehinton/absps/ncfast.pdf
[3] A milestone showing the promise of deep learning ★★★
Hinton, Geoffrey E., and Ruslan R. Salakhutdinov. "Reducing the dimensionality of data with neural networks." Science 313.5786 (2006): 504-507.
http://www.cs.toronto.edu/%7Ehinton/science.pdf
1.3 The ImageNet Revolution (the Deep Learning Big Bang)
[4] AlexNet, the deep learning breakthrough ★★★
Krizhevsky, Alex, Ilya Sutskever, and Geoffrey E. Hinton. "Imagenet classification with deep convolutional neural networks." Advances in neural information processing systems. 2012.
http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf
[5] VGGNet, the arrival of very deep neural networks ★★★
Simonyan, Karen, and Andrew Zisserman. "Very deep convolutional networks for large-scale image recognition." arXiv preprint arXiv:1409.1556 (2014).
https://arxiv.org/pdf/1409.1556.pdf
[6] GoogLeNet ★★★
Szegedy, Christian, et al. "Going deeper with convolutions." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015.
http://www.cv-foundation.org/openaccess/content_cvpr_2015/papers/Szegedy_Going_Deeper_With_2015_CVPR_paper.pdf
[7] ResNet, extremely deep neural networks, CVPR best paper ★★★★★
He, Kaiming, et al. "Deep residual learning for image recognition." arXiv preprint arXiv:1512.03385 (2015).
https://arxiv.org/pdf/1512.03385.pdf
1.4 The Speech Recognition Revolution
[8] Speech recognition breakthrough ★★★★
Hinton, Geoffrey, et al. "Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups." IEEE Signal Processing Magazine 29.6 (2012): 82-97.
http://cs224d.stanford.edu/papers/maas_paper.pdf
[9] The RNN paper ★★★
Graves, Alex, Abdel-rahman Mohamed, and Geoffrey Hinton. "Speech recognition with deep recurrent neural networks." 2013 IEEE international conference on acoustics, speech and signal processing. IEEE, 2013.
http://arxiv.org/pdf/1303.5778.pdf
[10] End-to-end RNN speech recognition ★★★
Graves, Alex, and Navdeep Jaitly. "Towards End-To-End Speech Recognition with Recurrent Neural Networks." ICML. Vol. 14. 2014.
http://www.jmlr.org/proceedings/papers/v32/graves14.pdf
[11] Google's speech recognition system paper ★★★
Sak, Haşim, et al. "Fast and accurate recurrent neural network acoustic models for speech recognition." arXiv preprint arXiv:1507.06947 (2015).
http://arxiv.org/pdf/1507.06947
[12] Baidu's speech recognition system paper ★★★★
Amodei, Dario, et al. "Deep speech 2: End-to-end speech recognition in English and Mandarin." arXiv preprint arXiv:1512.02595 (2015).
https://arxiv.org/pdf/1512.02595.pdf
[13] State-of-the-art speech recognition paper from Microsoft ★★★★
W. Xiong, J. Droppo, X. Huang, F. Seide, M. Seltzer, A. Stolcke, D. Yu, G. Zweig. "Achieving Human Parity in Conversational Speech Recognition." arXiv preprint arXiv:1610.05256 (2016).
https://arxiv.org/pdf/1610.05256v1
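After reading the papers above, you will have a basic understanding of the history of deep learning and of the basic architecture of deep learning models (CNN, RNN, LSTM, and so on), and you will understand how deep learning solves image and speech recognition problems. The papers that follow take you deeper into deep learning methods and into the applications of deep learning in various frontier areas. Choose what to read according to your own interests and research direction:
2. Deep Learning Methods
2.1 Models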
[14] Dropout ★★★
Hinton, Geoffrey E., et al. "Improving neural networks by preventing co-adaptation of feature detectors." arXiv preprint arXiv:1207.0580 (2012).
https://arxiv.org/pdf/1207.0580.pdf
[15] Overfitting ★★★
Srivastava, Nitish, et al. "Dropout: a simple way to prevent neural networks from overfitting." Journal of Machine Learning Research 15.1 (2014): 1929-1958.
http://www.jmlr.org/papers/volume15/srivastava14a.old/source/srivastava14a.pdf
[16] Batch normalization, an outstanding result of 2015 ★★★★
Ioffe, Sergey, and Christian Szegedy. "Batch normalization: Accelerating deep network training by reducing internal covariate shift." arXiv preprint arXiv:1502.03167 (2015).
http://arxiv.org/pdf/1502.03167
[17] An upgrade to batch normalization ★★★★
Ba, Jimmy Lei, Jamie Ryan Kiros, and Geoffrey E. Hinton. "Layer normalization." arXiv preprint arXiv:1607.06450 (2016).
https://arxiv.org/pdf/1607.06450.pdf
[18] A new model that trains fast ★★★
Courbariaux, Matthieu, et al. "Binarized Neural Networks: Training Neural Networks with Weights and Activations Constrained to +1 or −1."
https://pdfs.semanticscholar.org/f832/b16cb367802609d91d400085eb87d630212a.pdf
[19] An innovation in training methods ★★★★★
Jaderberg, Max, et al. "Decoupled neural interfaces using synthetic gradients." arXiv preprint arXiv:1608.05343 (2016).
https://arxiv.org/pdf/1608.05343
[20] Modifying a previously trained network to reduce training time ★★★
Chen, Tianqi, Ian Goodfellow, and Jonathon Shlens. "Net2net: Accelerating learning via knowledge transfer." arXiv preprint arXiv:1511.05641 (2015).
https://arxiv.org/abs/1511.05641
[21] Modifying a previously trained network to reduce training time ★★★
Wei, Tao, et al. "Network Morphism." arXiv preprint arXiv:1603.01670 (2016).
https://arxiv.org/abs/1603.01670
2.2 Optimization
[22] Momentum optimizer ★★
Sutskever, Ilya, et al. "On the importance of initialization and momentum in deep learning." ICML (3) 28 (2013): 1139-1147.
http://www.jmlr.org/proceedings/papers/v28/sutskever13.pdf
[23] Probably the most widely used stochastic optimizer today ★★★
Kingma, Diederik, and Jimmy Ba. "Adam: A method for stochastic optimization." arXiv preprint arXiv:1412.6980 (2014).
http://arxiv.org/pdf/1412.6980
[24] Neural optimizer ★★★★★
Andrychowicz, Marcin, et al. "Learning to learn by gradient descent by gradient descent." arXiv preprint arXiv:1606.04474 (2016).
https://arxiv.org/pdf/1606.04474
[25] ICLR best paper, a new direction for making neural networks run faster ★★★★★
Han, Song, Huizi Mao, and William J. Dally. "Deep compression: Compressing deep neural network with pruning, trained quantization and huffman coding." CoRR, abs/1510.00149 2 (2015).
https://pdfs.semanticscholar.org/5b6c/9dda1d88095fa4aac1507348e498a1f2e863.pdf
[26] Another new direction for optimizing neural networks ★★★★
Iandola, Forrest N., et al. "SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <1MB model size." arXiv preprint arXiv:1602.07360 (2016).
http://arxiv.org/pdf/1602.07360
2.3 Unsupervised Learning / Deep Generative Models
[27] Milestone paper from the Google Brain cat-finding project, Andrew Ng ★★★★
Le, Quoc V. "Building high-level features using large scale unsupervised learning." 2013 IEEE international conference on acoustics, speech and signal processing. IEEE, 2013.
http://arxiv.org/pdf/1112.6209.pdf
[28] Variational autoencoder (VAE) ★★★★
Kingma, Diederik P., and Max Welling. "Auto-encoding variational bayes." arXiv preprint arXiv:1312.6114 (2013).
http://arxiv.org/pdf/1312.6114
[29] Generative adversarial networks (GAN) ★★★★★
Goodfellow, Ian, et al. "Generative adversarial nets." Advances in Neural Information Processing Systems. 2014.
http://papers.nips.cc/paper/5423-generative-adversarial-nets.pdf
[30] Deep convolutional generative adversarial networks (DCGAN) ★★★★
Radford, Alec, Luke Metz, and Soumith Chintala. "Unsupervised representation learning with deep convolutional generative adversarial networks." arXiv preprint arXiv:1511.06434 (2015).
http://arxiv.org/pdf/1511.06434
[31] Variational autoencoder with attention ★★★★★
Gregor, Karol, et al. "DRAW: A recurrent neural network for image generation." arXiv preprint arXiv:1502.04623 (2015).
http://jmlr.org/proceedings/papers/v37/gregor15.pdf
[32] PixelRNN ★★★★
Oord, Aaron van den, Nal Kalchbrenner, and Koray Kavukcuoglu. "Pixel recurrent neural networks." arXiv preprint arXiv:1601.06759 (2016).
http://arxiv.org/pdf/1601.06759
[33] PixelCNN ★★★★
Oord, Aaron van den, et al. "Conditional image generation with PixelCNN decoders." arXiv preprint arXiv:1606.05328 (2016).
https://arxiv.org/pdf/1606.05328
2.4 RNN / Sequence-to-Sequence Models
[34] Generating sequences with RNNs, LSTM ★★★★
Graves, Alex. "Generating sequences with recurrent neural networks." arXiv preprint arXiv:1308.0850 (2013).
http://arxiv.org/pdf/1308.0850
[35] The first sequence-to-sequence paper ★★★★
Cho, Kyunghyun, et al. "Learning phrase representations using RNN encoder-decoder for statistical machine translation." arXiv preprint arXiv:1406.1078 (2014).
http://arxiv.org/pdf/1406.1078
[36] Sequence-to-sequence learning with neural networks ★★★★★
Sutskever, Ilya, Oriol Vinyals, and Quoc V. Le. "Sequence to sequence learning with neural networks." Advances in neural information processing systems. 2014.
http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf
[37] Neural machine translation ★★★★
Bahdanau, Dzmitry, KyungHyun Cho, and Yoshua Bengio. "Neural Machine Translation by Jointly Learning to Align and Translate." arXiv preprint arXiv:1409.0473 (2014).
https://arxiv.org/pdf/1409.0473v7.pdf
[38] Sequence-to-sequence chatbot ★★★
Vinyals, Oriol, and Quoc Le. "A neural conversational model." arXiv preprint arXiv:1506.05869 (2015).
http://arxiv.org/pdf/1506.05869.pdf
2.5 Neural Turing Machines
[39] Basic prototype of the computers of the future ★★★★★
Graves, Alex, Greg Wayne, and Ivo Danihelka. "Neural turing machines." arXiv preprint arXiv:1410.5401 (2014).
http://arxiv.org/pdf/1410.5401.pdf
[40] Reinforcement learning neural Turing machines ★★★
Zaremba, Wojciech, and Ilya Sutskever. "Reinforcement learning neural Turing machines." arXiv preprint arXiv:1505.00521 362 (2015).
https://pdfs.semanticscholar.org/f10e/071292d593fef939e6ef4a59baf0bb3a6c2b.pdf
[41] Memory networks ★★★
Weston, Jason, Sumit Chopra, and Antoine Bordes. "Memory networks." arXiv preprint arXiv:1410.3916 (2014).
http://arxiv.org/pdf/1410.3916
[42] End-to-end memory networks ★★★★
Sukhbaatar, Sainbayar, Jason Weston, and Rob Fergus. "End-to-end memory networks." Advances in neural information processing systems. 2015.
http://papers.nips.cc/paper/5846-end-to-end-memory-networks.pdf
[43] Pointer networks ★★★★
Vinyals, Oriol, Meire Fortunato, and Navdeep Jaitly. "Pointer networks." Advances in Neural Information Processing Systems. 2015.
http://papers.nips.cc/paper/5866-pointer-networks.pdf
[44] Milestone paper integrating the ideas of neural Turing machines ★★★★★
Graves, Alex, et al. "Hybrid computing using a neural network with dynamic external memory." Nature (2016).
https://www.dropbox.com/s/0a40xi702grx3dq/2016-graves.pdf
2.6 Deep Reinforcement Learning
[45] The first paper named after deep reinforcement learning ★★★★
Mnih, Volodymyr, et al. "Playing atari with deep reinforcement learning." arXiv preprint arXiv:1312.5602 (2013).
http://arxiv.org/pdf/1312.5602.pdf
[46] Milestone ★★★★★
Mnih, Volodymyr, et al. "Human-level control through deep reinforcement learning." Nature 518.7540 (2015): 529-533.
https://storage.googleapis.com/deepmind-data/assets/papers/DeepMindNature14236Paper.pdf
[47] ICLR best paper ★★★★
Wang, Ziyu, Nando de Freitas, and Marc Lanctot. "Dueling network architectures for deep reinforcement learning." arXiv preprint arXiv:1511.06581 (2015).
http://arxiv.org/pdf/1511.06581
[48] State-of-the-art deep reinforcement learning method ★★★★★
Mnih, Volodymyr, et al. "Asynchronous methods for deep reinforcement learning." arXiv preprint arXiv:1602.01783 (2016).
http://arxiv.org/pdf/1602.01783
[49] DDPG ★★★★
Lillicrap, Timothy P., et al. "Continuous control with deep reinforcement learning." arXiv preprint arXiv:1509.02971 (2015).
http://arxiv.org/pdf/1509.02971
[50] NAF ★★★★
Gu, Shixiang, et al. "Continuous Deep Q-Learning with Model-based Acceleration." arXiv preprint arXiv:1603.00748 (2016).
http://arxiv.org/pdf/1603.00748
[51] TRPO ★★★★
Schulman, John, et al. "Trust region policy optimization." CoRR, abs/1502.05477 (2015).
http://www.jmlr.org/proceedings/papers/v37/schulman15.pdf
[52] AlphaGo ★★★★★
Silver, David, et al. "Mastering the game of Go with deep neural networks and tree search." Nature 529.7587 (2016): 484-489.
http://willamette.edu/%7Elevenick/cs448/goNature.pdf
2.7 Deep Transfer Learning / Lifelong Learning / Reinforcement Learning
[53] Bengio's tutorial ★★★
Bengio, Yoshua. "Deep Learning of Representations for Unsupervised and Transfer Learning." ICML Unsupervised and Transfer Learning 27 (2012): 17-36.
http://www.jmlr.org/proceedings/papers/v27/bengio12a/bengio12a.pdf
[54] A brief discussion of lifelong learning ★★★
Silver, Daniel L., Qiang Yang, and Lianghao Li. "Lifelong Machine Learning Systems: Beyond Learning Algorithms." AAAI Spring Symposium: Lifelong Machine Learning. 2013.
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.696.7800&rep=rep1&type=pdf
[55] Work by the masters Hinton and Jeff Dean ★★★★
Hinton, Geoffrey, Oriol Vinyals, and Jeff Dean. "Distilling the knowledge in a neural network." arXiv preprint arXiv:1503.02531 (2015).
http://arxiv.org/pdf/1503.02531
[56] Reinforcement learning policy ★★★
Rusu, Andrei A., et al. "Policy distillation." arXiv preprint arXiv:1511.06295 (2015).
http://arxiv.org/pdf/1511.06295
[57] Deep multitask and transfer reinforcement learning ★★★
Parisotto, Emilio, Jimmy Lei Ba, and Ruslan Salakhutdinov. "Actor-mimic: Deep multitask and transfer reinforcement learning." arXiv preprint arXiv:1511.06342 (2015).
http://arxiv.org/pdf/1511.06342
[58] Progressive neural networks ★★★★★
Rusu, Andrei A., et al. "Progressive neural networks." arXiv preprint arXiv:1606.04671 (2016).
https://arxiv.org/pdf/1606.04671
2.8 One-Shot Deep Learning
[59] Does not involve deep learning, but worth reading ★★★★★
Lake, Brenden M., Ruslan Salakhutdinov, and Joshua B. Tenenbaum. "Human-level concept learning through probabilistic program induction." Science 350.6266 (2015): 1332-1338.
http://clm.utexas.edu/compjclub/wp-content/uploads/2016/02/lake2015.pdf
[60] One-shot image recognition ★★★
Koch, Gregory, Richard Zemel, and Ruslan Salakhutdinov. "Siamese Neural Networks for One-shot Image Recognition." (2015)
http://www.cs.utoronto.ca/%7Egkoch/files/msc-thesis.pdf
[61] Foundations of one-shot learning ★★★★
Santoro, Adam, et al. "One-shot Learning with Memory-Augmented Neural Networks." arXiv preprint arXiv:1605.06065 (2016).
http://arxiv.org/pdf/1605.06065
[62] Networks for one-shot learning ★★★
Vinyals, Oriol, et al. "Matching Networks for One Shot Learning." arXiv preprint arXiv:1606.04080 (2016).
https://arxiv.org/pdf/1606.04080
[63] Large-scale data ★★★★
Hariharan, Bharath, and Ross Girshick. "Low-shot visual object recognition." arXiv preprint arXiv:1606.02819 (2016).
http://arxiv.org/pdf/1606.02819
3. Applications
3.1 Natural Language Processing (NLP)
[1] ★★★★
Antoine Bordes, et al. "Joint Learning of Words and Meaning Representations for Open-Text Semantic Parsing." AISTATS (2012)
https://www.hds.utc.fr/%7Ebordesan/dokuwiki/lib/exe/fetch.php?id=en%3Apubli&cache=cache&media=en:bordes12aistats.pdf
[2] word2vec ★★★
Mikolov, et al. "Distributed representations of words and phrases and their compositionality." ANIPS (2013): 3111-3119
http://papers.nips.cc/paper/5021-distributed-representations-of-words-and-phrases-and-their-compositionality.pdf
[3] ★★★
Sutskever, et al. "Sequence to sequence learning with neural networks." ANIPS (2014)
http://papers.nips.cc/paper/5346-sequence-to-sequence-learning-with-neural-networks.pdf
[4] ★★★★
Ankit Kumar, et al. "Ask Me Anything: Dynamic Memory Networks for Natural Language Processing." arXiv preprint arXiv:1506.07285 (2015)
https://arxiv.org/abs/1506.07285
[5] ★★★★
Yoon Kim, et al. "Character-Aware Neural Language Models." NIPS (2015) arXiv preprint arXiv:1508.06615 (2015)
https://arxiv.org/abs/1508.06615
[6] bAbI tasks ★★★
Jason Weston, et al. "Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks." arXiv preprint arXiv:1502.05698 (2015)
https://arxiv.org/abs/1502.05698
[7] CNN / DailyMail cloze-style questions ★★
Karl Moritz Hermann, et al. "Teaching Machines to Read and Comprehend." arXiv preprint arXiv:1506.03340 (2015)
https://arxiv.org/abs/1506.03340
[8] State-of-the-art text classification ★★★
Alexis Conneau, et al. "Very Deep Convolutional Networks for Natural Language Processing." arXiv preprint arXiv:1606.01781 (2016)
https://arxiv.org/abs/1606.01781
[9] Slightly behind the state of the art, but much faster ★★★
Armand Joulin, et al. "Bag of Tricks for Efficient Text Classification." arXiv preprint arXiv:1607.01759 (2016)
https://arxiv.org/abs/1607.01759
3.2 Object Detection
[1] ★★★
Szegedy, Christian, Alexander Toshev, and Dumitru Erhan. "Deep neural networks for object detection." Advances in Neural Information Processing Systems. 2013.
http://papers.nips.cc/paper/5207-deep-neural-networks-for-object-detection.pdf
[2] RCNN ★★★★★
Girshick, Ross, et al. "Rich feature hierarchies for accurate object detection and semantic segmentation." Proceedings of the IEEE conference on computer vision and pattern recognition. 2014.
http://www.cv-foundation.org/openaccess/content_cvpr_2014/papers/Girshick_Rich_Feature_Hierarchies_2014_CVPR_paper.pdf
[3] SPPNet ★★★★
He, Kaiming, et al. "Spatial pyramid pooling in deep convolutional networks for visual recognition." European Conference on Computer Vision. Springer International Publishing, 2014.
http://arxiv.org/pdf/1406.4729
[4] ★★★★
Girshick, Ross. "Fast r-cnn." Proceedings of the IEEE International Conference on Computer Vision. 2015.
https://pdfs.semanticscholar.org/8f67/64a59f0d17081f2a2a9d06f4ed1cdea1a0ad.pdf
[5] ★★★★
Ren, Shaoqing, et al. "Faster R-CNN: Towards real-time object detection with region proposal networks." Advances in neural information processing systems. 2015.
http://papers.nips.cc/paper/5638-faster-r-cnn-towards-real-time-object-detection-with-region-proposal-networks.pdf
[6] YOLO, a highly practical project ★★★★★
Redmon, Joseph, et al. "You only look once: Unified, real-time object detection." arXiv preprint arXiv:1506.02640 (2015).
http://homes.cs.washington.edu/%7Eali/papers/YOLO.pdf
[7] ★★★
Liu, Wei, et al. "SSD: Single Shot MultiBox Detector." arXiv preprint arXiv:1512.02325 (2015).
http://arxiv.org/pdf/1512.02325
[8] ★★★★
Dai, Jifeng, et al. "R-FCN: Object Detection via Region-based Fully Convolutional Networks." arXiv preprint arXiv:1605.06409 (2016).
https://arxiv.org/abs/1605.06409
[9] ★★★★
He, Gkioxari, et al. "Mask R-CNN." arXiv preprint arXiv:1703.06870 (2017).
https://arxiv.org/abs/1703.06870
3.3 Visual Tracking
[1] The first paper to use deep learning for visual tracking, the DLT tracker ★★★
Wang, Naiyan, and Dit-Yan Yeung. "Learning a deep compact image representation for visual tracking." Advances in neural information processing systems. 2013.
http://papers.nips.cc/paper/5192-learning-a-deep-compact-image-representation-for-visual-tracking.pdf
[2] SO-DLT ★★★★
Wang, Naiyan, et al. "Transferring rich feature hierarchies for robust visual tracking." arXiv preprint arXiv:1501.04587 (2015).
http://arxiv.org/pdf/1501.04587
[3] FCNT ★★★★
Wang, Lijun, et al. "Visual tracking with fully convolutional networks." Proceedings of the IEEE International Conference on Computer Vision. 2015.
http://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Wang_Visual_Tracking_With_ICCV_2015_paper.pdf
[4] GOTURN, a deep learning method that runs as fast as non-deep-learning methods ★★★★
Held, David, Sebastian Thrun, and Silvio Savarese. "Learning to Track at 100 FPS with Deep Regression Networks." arXiv preprint arXiv:1604.01802 (2016).
http://arxiv.org/pdf/1604.01802
[5] SiameseFC, a new state-of-the-art real-time object tracker ★★★★
Bertinetto, Luca, et al. "Fully-Convolutional Siamese Networks for Object Tracking." arXiv preprint arXiv:1606.09549 (2016).
https://arxiv.org/pdf/1606.09549
[6] C-COT ★★★★
Martin Danelljan, Andreas Robinson, Fahad Khan, Michael Felsberg. "Beyond Correlation Filters: Learning Continuous Convolution Operators for Visual Tracking." ECCV (2016)
http://www.cvl.isy.liu.se/research/objrec/visualtracking/conttrack/C-COT_ECCV16.pdf
[7] TCNN, winner of the VOT2016 challenge ★★★★
Nam, Hyeonseob, Mooyeol Baek, and Bohyung Han. "Modeling and Propagating CNNs in a Tree Structure for Visual Tracking." arXiv preprint arXiv:1608.07242 (2016).
https://arxiv.org/pdf/1608.07242
3.4 Image Captioning
[1] ★★★
Farhadi, Ali, et al. "Every picture tells a story: Generating sentences from images". In Computer Vision - ECCV 2010. Springer Berlin Heidelberg: 15-29, 2010.
https://www.cs.cmu.edu/%7Eafarhadi/papers/sentence.pdf
[2] ★★★★
Kulkarni, Girish, et al. "Baby talk: Understanding and generating image descriptions". In Proceedings of the 24th CVPR, 2011.
http://tamaraberg.com/papers/generation_cvpr11.pdf
[3] ★★★
Vinyals, Oriol, et al. "Show and tell: A neural image caption generator". In arXiv preprint arXiv:1411.4555, 2014.
https://arxiv.org/pdf/1411.4555.pdf
[4] RNNs for visual recognition and captioning
Donahue, Jeff, et al. "Long-term recurrent convolutional networks for visual recognition and description". In arXiv preprint arXiv:1411.4389, 2014.
https://arxiv.org/pdf/1411.4389.pdf
[5] Li Fei-Fei and her star student Andrej Karpathy ★★★★★
Karpathy, Andrej, and Li Fei-Fei. "Deep visual-semantic alignments for generating image descriptions". In arXiv preprint arXiv:1412.2306, 2014.
https://cs.stanford.edu/people/karpathy/cvpr2015.pdf
[6] Li Fei-Fei and her star student Andrej Karpathy ★★★★
Karpathy, Andrej, Armand Joulin, and Fei Fei F. Li. "Deep fragment embeddings for bidirectional image sentence mapping". In Advances in neural information processing systems, 2014.
https://arxiv.org/pdf/1406.5679v1.pdf
[7] ★★★★
Fang, Hao, et al. "From captions to visual concepts and back". In arXiv preprint arXiv:1411.4952, 2014.
https://arxiv.org/pdf/1411.4952v3.pdf
[8] ★★★★
Chen, Xinlei, and C. Lawrence Zitnick. "Learning a recurrent visual representation for image caption generation". In arXiv preprint arXiv:1411.5654, 2014.
https://arxiv.org/pdf/1411.5654v1.pdf
[9] ★★★
Mao, Junhua, et al. "Deep captioning with multimodal recurrent neural networks (m-rnn)". In arXiv preprint arXiv:1412.6632, 2014.
https://arxiv.org/pdf/1412.6632v5.pdf
[10] ★★★★★
Xu, Kelvin, et al. "Show, attend and tell: Neural image caption generation with visual attention". In arXiv preprint arXiv:1502.03044, 2015.
https://arxiv.org/pdf/1502.03044v3.pdf
3.5 Machine Translation
Some milestone papers on this topic are listed under topic 2.4, "RNN / Sequence-to-Sequence Models".
[1] ★★★★
Luong, Minh-Thang, et al. "Addressing the rare word problem in neural machine translation." arXiv preprint arXiv:1410.8206 (2014).
http://arxiv.org/pdf/1410.8206
[2] ★★★
Sennrich, et al. "Neural Machine Translation of Rare Words with Subword Units". In arXiv preprint arXiv:1508.07909, 2015.
https://arxiv.org/pdf/1508.07909.pdf
[3] ★★★★
Luong, Minh-Thang, Hieu Pham, and Christopher D. Manning. "Effective approaches to attention-based neural machine translation." arXiv preprint arXiv:1508.04025 (2015).
http://arxiv.org/pdf/1508.04025
[4] ★★
Chung, et al. "A Character-Level Decoder without Explicit Segmentation for Neural Machine Translation". In arXiv preprint arXiv:1603.06147, 2016.
https://arxiv.org/pdf/1603.06147.pdf
[5] ★★★★★
Lee, et al. "Fully Character-Level Neural Machine Translation without Explicit Segmentation". In arXiv preprint arXiv:1610.03017, 2016.
https://arxiv.org/pdf/1610.03017.pdf
[6] Milestone ★★★★
Wu, Schuster, Chen, Le, et al. "Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation". In arXiv preprint arXiv:1609.08144v2, 2016.
https://arxiv.org/pdf/1609.08144v2.pdf
3.6 Robotics
[1] ★★★
Koutník, Jan, et al. "Evolving large-scale neural networks for vision-based reinforcement learning." Proceedings of the 15th annual conference on Genetic and evolutionary computation. ACM, 2013.
http://repository.supsi.ch/4550/1/koutnik2013gecco.pdf
[2] ★★★★★
Levine, Sergey, et al. "End-to-end training of deep visuomotor policies." Journal of Machine Learning Research 17.39 (2016): 1-40.
http://www.jmlr.org/papers/volume17/15-522/15-522.pdf
[3] ★★★
Pinto, Lerrel, and Abhinav Gupta. "Supersizing self-supervision: Learning to grasp from 50k tries and 700 robot hours." arXiv preprint arXiv:1509.06825 (2015).
http://arxiv.org/pdf/1509.06825
[4] ★★★★
Levine, Sergey, et al. "Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection." arXiv preprint arXiv:1603.02199 (2016).
http://arxiv.org/pdf/1603.02199
[5] ★★★★
Zhu, Yuke, et al. "Target-driven Visual Navigation in Indoor Scenes using Deep Reinforcement Learning." arXiv preprint arXiv:1609.05143 (2016).
https://arxiv.org/pdf/1609.05143
[6] ★★★★
Yahya, Ali, et al. "Collective Robot Reinforcement Learning with Distributed Asynchronous Guided Policy Search." arXiv preprint arXiv:1610.00673 (2016).
https://arxiv.org/pdf/1610.00673
[7] ★★★★
Gu, Shixiang, et al. "Deep Reinforcement Learning for Robotic Manipulation." arXiv preprint arXiv:1610.00633 (2016).
https://arxiv.org/pdf/1610.00633
[8] ★★★★
A Rusu, M Vecerik, Thomas Rothörl, N Heess, R Pascanu, R Hadsell. "Sim-to-Real Robot Learning from Pixels with Progressive Nets." arXiv preprint arXiv:1610.04286 (2016).
https://arxiv.org/pdf/1610.04286.pdf
[9] ★★★★
Mirowski, Piotr, et al. "Learning to navigate in complex environments." arXiv preprint arXiv:1611.03673 (2016).
https://arxiv.org/pdf/1611.03673
3.7 Art
[1] Google Deep Dream ★★★★
Mordvintsev, Alexander; Olah, Christopher; Tyka, Mike (2015). "Inceptionism: Going Deeper into Neural Networks". Google Research.
https://research.googleblog.com/2015/06/inceptionism-going-deeper-into-neural.html
[2] The most successful artistic style transfer method to date, Prisma ★★★★★
Gatys, Leon A., Alexander S. Ecker, and Matthias Bethge. "A neural algorithm of artistic style." arXiv preprint arXiv:1508.06576 (2015).
http://arxiv.org/pdf/1508.06576
[3] iGAN ★★★★
Zhu, Jun-Yan, et al. "Generative Visual Manipulation on the Natural Image Manifold." European Conference on Computer Vision. Springer International Publishing, 2016.
https://arxiv.org/pdf/1609.03552
[4] Neural Doodle ★★★★
Champandard, Alex J. "Semantic Style Transfer and Turning Two-Bit Doodles into Fine Artworks." arXiv preprint arXiv:1603.01768 (2016).
http://arxiv.org/pdf/1603.01768
[5] ★★★★
Zhang, Richard, Phillip Isola, and Alexei A. Efros. "Colorful Image Colorization." arXiv preprint arXiv:1603.08511 (2016).
http://arxiv.org/pdf/1603.08511
[6] Super-resolution, Li Fei-Fei ★★★★
Johnson, Justin, Alexandre Alahi, and Li Fei-Fei. "Perceptual losses for real-time style transfer and super-resolution." arXiv preprint arXiv:1603.08155 (2016).
https://arxiv.org/pdf/1603.08155.pdf
[7] ★★★★
Vincent Dumoulin, Jonathon Shlens and Manjunath Kudlur. "A learned representation for artistic style." arXiv preprint arXiv:1610.07629 (2016).
https://arxiv.org/pdf/1610.07629v1.pdf
[8] Style transfer controlled by spatial location, color information, and spatial scale ★★★★
Gatys, Leon and Ecker, et al. "Controlling Perceptual Factors in Neural Style Transfer." arXiv preprint arXiv:1611.07865 (2016).
https://arxiv.org/pdf/1611.07865.pdf
[9] Texture generation and style transfer ★★★★
Ulyanov, Dmitry and Lebedev, Vadim, et al. "Texture Networks: Feed-forward Synthesis of Textures and Stylized Images." arXiv preprint arXiv:1603.03417 (2016).
http://arxiv.org/abs/1603.03417
3.8 Object Segmentation
[1] ★★★★★
J. Long, E. Shelhamer, and T. Darrell, "Fully convolutional networks for semantic segmentation." in CVPR, 2015.
https://arxiv.org/pdf/1411.4038v2.pdf
[2] ★★★★★
L.-C. Chen, G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille. "Semantic image segmentation with deep convolutional nets and fully connected crfs." In ICLR, 2015.
https://arxiv.org/pdf/1606.00915v1.pdf
[3] ★★★★
Pinheiro, P.O., Collobert, R., Dollar, P. "Learning to segment object candidates." In: NIPS. 2015.
https://arxiv.org/pdf/1506.06204v2.pdf
[4] ★★★
Dai, J., He, K., Sun, J. "Instance-aware semantic segmentation via multi-task network cascades." in CVPR. 2016
https://arxiv.org/pdf/1512.04412v1.pdf
[5] ★★★
Dai, J., He, K., Sun, J. "Instance-sensitive Fully Convolutional Networks." arXiv preprint arXiv:1603.08678 (2016).
https://arxiv.org/pdf/1603.08678v1.pdf