Some Essential Reading Materials for Deep Learning
阿新 • Published: 2019-02-10
List of reading lists and survey papers:
-
Books
- Deep Learning, Yoshua Bengio, Ian Goodfellow, Aaron Courville, MIT Press, In preparation.
-
Review Papers
- The monograph or review paper Learning Deep Architectures for AI (Foundations & Trends in Machine Learning, 2009).
- Deep Machine Learning – A New Frontier in Artificial Intelligence Research – a
- Graves, A. (2012). Supervised sequence labelling with recurrent neural networks (Vol. 385). Springer.
- Schmidhuber, J. (2014). Deep Learning in Neural Networks: An Overview. 75 pages, 850+ references, http://arxiv.org/abs/1404.7828,
PDF & LATEX source & complete public BIBTEX file under
-
Reinforcement Learning
- Mnih, Volodymyr, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and Martin Riedmiller. “Playing Atari with deep reinforcement learning.” arXiv preprint arXiv:1312.5602 (2013).
- Volodymyr Mnih, Nicolas Heess, Alex Graves, Koray Kavukcuoglu. “
-
Computer Vision
- Going Deeper with Convolutions, Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich, 19-Sept-2014.
- Learning Hierarchical Features for Scene Labeling, Clement Farabet, Camille Couprie, Laurent Najman and Yann LeCun, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2013.
- Learning Convolutional Feature Hierarchies for Visual Recognition, Koray Kavukcuoglu, Pierre Sermanet, Y-Lan Boureau, Karol Gregor, Michaël Mathieu and Yann LeCun, Advances in Neural Information Processing Systems (NIPS 2010), 23, 2010.
- Cireşan, D. C., Meier, U., Gambardella, L. M., & Schmidhuber, J. (2010). Deep, big, simple neural nets for handwritten digit recognition. Neural computation, 22(12), 3207-3220.
- Ciresan, Dan, Ueli Meier, and Jürgen Schmidhuber. “Multi-column deep neural networks for image classification.” Computer Vision and Pattern Recognition (CVPR), 2012 IEEE Conference on. IEEE, 2012.
- Ciresan, D., Meier, U., Masci, J., & Schmidhuber, J. (2011, July). A committee of neural networks for traffic sign classification. In Neural Networks (IJCNN), The 2011 International Joint Conference on (pp. 1918-1921). IEEE.
-
Disentangling Factors and Variations with Depth
- Goodfellow, Ian, et al. “Measuring invariances in deep networks.” Advances in neural information processing systems 22 (2009): 646-654.
- Bengio, Yoshua, et al. “Better Mixing via Deep Representations.” arXiv preprint arXiv:1207.4404 (2012).
-
Transfer Learning and Domain Adaptation
- Raina, Rajat, et al. “Self-taught learning: transfer learning from unlabeled data.” Proceedings of the 24th international conference on Machine learning. ACM, 2007.
- R. Collobert, J. Weston, L. Bottou, M. Karlen, K. Kavukcuoglu and P. Kuksa. Natural Language Processing (Almost) from Scratch. Journal of Machine Learning Research, 12:2493-2537, 2011.
- Mesnil, Grégoire, et al. “Unsupervised and transfer learning challenge: a deep learning approach.” Unsupervised and Transfer Learning Workshop, in conjunction with ICML. 2011.
- Ciresan, D. C., Meier, U., & Schmidhuber, J. (2012, June). Transfer learning for Latin and Chinese characters with deep neural networks. In Neural Networks (IJCNN), The 2012 International Joint Conference on (pp. 1-6). IEEE.
-
Practical Tricks and Guides
- Practical recommendations for gradient-based training of deep architectures, Yoshua Bengio, U. Montreal, arXiv report:1206.5533, Lecture Notes in Computer Science Volume 7700, Neural Networks: Tricks of the Trade Second Edition, Editors: Grégoire Montavon, Geneviève B. Orr, Klaus-Robert Müller, 2012.
- A practical guide to training Restricted Boltzmann Machines, by Geoffrey Hinton.
-
Foundation Theory and Motivation
- Hinton, Geoffrey E. “Deterministic Boltzmann learning performs steepest descent in weight-space.” Neural computation 1.1 (1989): 143-150.
- Bengio, Yoshua, and Samy Bengio. “Modeling high-dimensional discrete data with multi-layer neural networks.” Advances in Neural Information Processing Systems 12 (2000): 400-406.
- Bengio, Yoshua, et al. “Greedy layer-wise training of deep networks.” Advances in neural information processing systems 19 (2007): 153.
- Bengio, Yoshua, Martin Monperrus, and Hugo Larochelle. “Nonlocal estimation of manifold structure.” Neural Computation 18.10 (2006): 2509-2528.
- Hinton, Geoffrey E., and Ruslan R. Salakhutdinov. “Reducing the dimensionality of data with neural networks.” Science 313.5786 (2006): 504-507.
- Ranzato, Marc’Aurelio, Y-Lan Boureau, and Yann LeCun. “Sparse feature learning for deep belief networks.” Advances in neural information processing systems 20 (2007): 1185-1192.
- Bengio, Yoshua, and Yann LeCun. “Scaling learning algorithms towards AI.” Large-Scale Kernel Machines 34 (2007).
- Le Roux, Nicolas, and Yoshua Bengio. “Representational power of restricted boltzmann machines and deep belief networks.” Neural Computation 20.6 (2008): 1631-1649.
- Sutskever, Ilya, and Geoffrey Hinton. “Temporal-Kernel Recurrent Neural Networks.” Neural Networks 23.2 (2010): 239-243.
- Le Roux, Nicolas, and Yoshua Bengio. “Deep belief networks are compact universal approximators.” Neural computation 22.8 (2010): 2192-2207.
- Bengio, Yoshua, and Olivier Delalleau. “On the expressive power of deep architectures.” Algorithmic Learning Theory. Springer Berlin/Heidelberg, 2011.
- Montufar, Guido F., and Jason Morton. “When Does a Mixture of Products Contain a Product of Mixtures?” arXiv preprint arXiv:1206.0387 (2012).
- Montúfar, Guido, Razvan Pascanu, Kyunghyun Cho, and Yoshua Bengio. “On the Number of Linear Regions of Deep Neural Networks.” arXiv preprint arXiv:1402.1869 (2014).
-
Supervised Feedforward Neural Networks
- The Manifold Tangent Classifier, Salah Rifai, Yann Dauphin, Pascal Vincent, Yoshua Bengio and Xavier Muller, in: NIPS’2011.
- Goodfellow, I., Warde-Farley, D., Mirza, M., Courville, A., and Bengio, Y. (2013). Maxout networks. Technical Report, Université de Montréal.
- Wang, Sida, and Christopher Manning. “Fast dropout training.” In Proceedings of the 30th International Conference on Machine Learning (ICML-13), pp. 118-126. 2013.
- Glorot, Xavier, Antoine Bordes, and Yoshua Bengio. “Deep sparse rectifier networks.” In Proceedings of the 14th International Conference on Artificial Intelligence and Statistics. JMLR W&CP Volume, vol. 15, pp. 315-323. 2011.
-
Large Scale Deep Learning
- Building High-level Features Using Large Scale Unsupervised Learning Quoc V. Le, Marc’Aurelio Ranzato, Rajat Monga, Matthieu Devin, Kai Chen, Greg S. Corrado, Jeffrey Dean, and Andrew Y. Ng, ICML 2012.
- Bengio, Yoshua, et al. “Neural probabilistic language models.” Innovations in Machine Learning (2006): 137-186. Section 3 of this paper specifically discusses asynchronous SGD.
- Dean, Jeffrey, et al. “Large scale distributed deep networks.” Advances in Neural Information Processing Systems. 2012.
-
Hyper Parameters
-
Optimization
- Schaul, Tom, Sixin Zhang, and Yann LeCun. “No More Pesky Learning Rates.” arXiv preprint arXiv:1206.1106 (2012).
- Le Roux, Nicolas, Pierre-Antoine Manzagol, and Yoshua Bengio. “Topmoumoute online natural gradient algorithm.” Neural Information Processing Systems (NIPS). 2007.
- Bordes, Antoine, Léon Bottou, and Patrick Gallinari. “SGD-QN: Careful quasi-Newton stochastic gradient descent.” The Journal of Machine Learning Research 10 (2009): 1737-1754.
- Glorot, Xavier, and Yoshua Bengio. “Understanding the difficulty of training deep feedforward neural networks.” Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS’10). Society for Artificial Intelligence and Statistics. 2010.
- Glorot, Xavier, Antoine Bordes, and Yoshua Bengio. “Deep Sparse Rectifier Networks.” Proceedings of the 14th International Conference on Artificial Intelligence and Statistics. JMLR W&CP Volume. Vol. 15. 2011.
- Martens, James. “Deep learning via Hessian-free optimization.” Proceedings of the 27th International Conference on Machine Learning (ICML). Vol. 951. 2010.
- Hochreiter, Sepp, and Jürgen Schmidhuber. “Flat minima.” Neural Computation, 9.1 (1997): 1-42.
- Pascanu, Razvan, and Yoshua Bengio. “Revisiting natural gradient for deep networks.” arXiv preprint arXiv:1301.3584 (2013).
- Advances in Neural Information Processing Systems, pp. 2933-2941. 2014.
-
Unsupervised Feature Learning
- Salakhutdinov, Ruslan, and Geoffrey E. Hinton. “Deep boltzmann machines.” Proceedings of the international conference on artificial intelligence and statistics. Vol. 5. No. 2. Cambridge, MA: MIT Press, 2009.
-
Deep Boltzmann Machines
- An Efficient Learning Procedure for Deep Boltzmann Machines, Ruslan Salakhutdinov and Geoffrey Hinton, Neural Computation August 2012, Vol. 24, No. 8: 1967-2006.
- Montavon, Grégoire, and Klaus-Robert Müller. “Deep Boltzmann Machines and the Centering Trick.” Neural Networks: Tricks of the Trade (2012): 621-637.
- Salakhutdinov, Ruslan, and Hugo Larochelle. “Efficient learning of deep boltzmann machines.” International Conference on Artificial Intelligence and Statistics. 2010.
- Salakhutdinov, Ruslan. . Diss. University of Toronto, 2009.
- Goodfellow, Ian, et al. “Multi-prediction deep Boltzmann machines.” Advances in Neural Information Processing Systems. 2013.