Author: Pedro Lopez
Translator: Li Haiming
Proofreader: Liang Fuqi
This article is approximately 2,832 words and takes about 8 minutes to read.
This article discusses how deep learning is currently in a phase of rapid evolution, with new technologies, tools, and applications profoundly reshaping machine learning and continuously yielding significant results.
Keywords: algorithms, deep learning, machine learning, neural networks, TensorFlow, text analysis, development trends
As a subfield of machine learning and statistical learning, deep learning has made significant progress in recent years. Cloud computing, powerful open-source tools, and vast amounts of available data have been the levers behind these key breakthroughs. This article selects 20 top deep learning papers based on citation counts from Microsoft Academic (academic.microsoft.com). Note that because citation counts are continuously updated, any ranking by citations must take publication date into account.
More than 75% of these 20 papers focus on deep learning and neural networks, particularly convolutional neural networks. Nearly 50% address pattern-recognition applications in computer vision. I believe that tools such as TensorFlow and Theano, together with the growing use of GPUs, have paved the way for data scientists and machine learning engineers to expand their research.
1.Deep Learning, by Yann L., Yoshua B. & Geoffrey H. (2015) (Cited: 5,716)
Link: https://www.cs.toronto.edu/~hinton/absps/NatureDeepReview.pdf
Deep learning allows computational models composed of multiple processing layers to learn representations of data at multiple levels of abstraction. These methods have dramatically improved the state of the art in speech recognition, visual object recognition, and object detection, as well as in other domains such as drug discovery and genomics.
2.TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems, by Martín A., Ashish A. B., Eugene B. C., et al. (2015) (Cited: 2,423)
Link: http://download.tensorflow.org/paper/whitepaper2015.pdf
The system is highly flexible and can express a wide variety of algorithms, including training and inference for deep neural network models. It has been used to research and to deploy machine learning systems into production across dozens of areas of computer science and other fields, including speech recognition, computer vision, robotics, information retrieval, natural language processing, geographic information extraction, and computational drug discovery.
3.TensorFlow: a system for large-scale machine learning, by Martín A., Paul B., Jianmin C., Zhifeng C., Andy D. et al. (2016) (Cited: 2,227)
Link: https://www.usenix.org/system/files/conference/osdi16/osdi16-abadi.pdf
TensorFlow supports a wide range of applications, focusing on the training and inference of deep neural networks. Some Google services use TensorFlow in production, and we have released it as an open-source project, which has been widely used in machine learning research.
4.Deep learning in neural networks, by Juergen Schmidhuber (2015) (Cited: 2,196)
Link: https://arxiv.org/pdf/1404.7828.pdf
This historical survey compactly summarizes relevant work, much of it from the previous millennium. Shallow and deep learners are distinguished by the depth of their credit assignment paths, which are chains of possibly learnable, causal links between actions and effects. I review deep supervised learning (including a brief history of backpropagation), unsupervised learning, reinforcement learning and evolutionary computation, and indirect search for short programs encoding deep and large networks.
5.Human-level control through deep reinforcement learning, by Volodymyr M., Koray K., David S., Andrei A. R., Joel V et al (2015) (Cited: 2,086)
Link: https://web.stanford.edu/class/psych209/Readings/MnihEtAlHassibis15NatureControlDeepRL.pdf
Here we use recent advances in training deep neural networks to develop a new artificial agent, termed a deep Q-network, that can learn successful policies directly from high-dimensional sensory inputs using end-to-end reinforcement learning. We tested this agent on the classic Atari 2600 games.
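The deep Q-network replaces the lookup table of classical Q-learning with a deep neural network trained on raw pixels. As a sketch of the underlying update rule only, here is minimal tabular Q-learning; the toy chain environment and all hyperparameters below are invented for this illustration and are not from the paper:

```python
import random

# Toy 5-state chain: the agent moves left or right and is rewarded only on
# reaching the rightmost state. Everything here is invented for illustration;
# the paper's agent learns from high-dimensional sensory input instead.
N_STATES, ACTIONS = 5, [0, 1]          # action 0 = left, 1 = right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1  # learning rate, discount, exploration

Q = [[0.0, 0.0] for _ in range(N_STATES)]  # tabular Q[state][action]

def greedy(s):
    """Pick a highest-value action, breaking ties randomly."""
    best = max(Q[s])
    return random.choice([a for a in ACTIONS if Q[s][a] == best])

def step(state, action):
    """Deterministic chain dynamics; the episode ends at the last state."""
    nxt = max(0, state - 1) if action == 0 else min(N_STATES - 1, state + 1)
    return nxt, (1.0 if nxt == N_STATES - 1 else 0.0), nxt == N_STATES - 1

random.seed(0)
for _ in range(500):                   # training episodes
    s = 0
    for _ in range(100):               # cap episode length
        a = random.choice(ACTIONS) if random.random() < EPSILON else greedy(s)
        s2, r, done = step(s, a)
        # Q-learning update: move Q[s][a] toward r + gamma * max_a' Q[s2][a'].
        Q[s][a] += ALPHA * (r + GAMMA * max(Q[s2]) - Q[s][a])
        s = s2
        if done:
            break

# The learned greedy policy should move right in every non-terminal state.
policy = [max(ACTIONS, key=lambda a: Q[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

The deep Q-network performs essentially this update, but estimates Q with a network and stabilizes training with experience replay and a separate target network.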
6.Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, by Shaoqing R., Kaiming H., Ross B. G. & Jian S. (2015) (Cited: 1,421)
Link: https://arxiv.org/pdf/1506.01497.pdf
In this work, we introduce a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, enabling nearly cost-free region proposals. An RPN is a fully convolutional network that simultaneously predicts object bounds and objectness scores at each position.
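Proposal and detection methods in this family are trained and evaluated by measuring box overlap with intersection-over-union (IoU). The helper below is a generic sketch of that measure; the function and box format are illustrative, not taken from the paper's code:

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    x1 = max(box_a[0], box_b[0])
    y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2])
    y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)  # zero if boxes are disjoint
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # → 0.14285714285714285
```

In an RPN, overlap thresholds like this decide which anchor boxes count as positive or negative examples during training.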
7.Long-term recurrent convolutional networks for visual recognition and description, by Jeff D., Lisa Anne H., Sergio G., Marcus R., Subhashini V. et al. (2015) (Cited: 1,285)
Link: https://arxiv.org/pdf/1411.4389.pdf
Current models assume a fixed spatiotemporal receptive field or simple temporal averaging for sequential processing. Recurrent convolutional models are “doubly deep” in that they can be compositional across “layers” in both space and time.
8.MatConvNet: Convolutional Neural Networks for MATLAB, by Andrea Vedaldi & Karel Lenc (2015) (Cited: 1,148)
Link: https://arxiv.org/pdf/1412.4564.pdf
It implements CNN building blocks as easy-to-use MATLAB functions, providing routines for computing convolutions with filter banks, feature pooling, and more. The documentation gives an overview of CNNs and how they are implemented in MatConvNet, along with the technical details of each computational block in the toolbox.
9.Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, by Alec R., Luke M. & Soumith C. (2015) (Cited: 1,054)
Link: https://arxiv.org/pdf/1511.06434.pdf
This paper aims to bridge the gap between unsupervised learning and the success of supervised learning with CNNs. We introduce a class of CNNs called deep convolutional generative adversarial networks (DCGANs), which impose certain architectural constraints, and demonstrate that they are strong candidates for unsupervised learning.
10.U-Net: Convolutional Networks for Biomedical Image Segmentation, by Olaf R., Philipp F. & Thomas B. (2015) (Cited: 975)
Link: https://arxiv.org/pdf/1505.04597.pdf
It is widely agreed that successful training of deep networks requires many thousands of annotated training samples. This paper presents a network and training strategy that relies on data augmentation to use the available annotated samples more efficiently.
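The paper’s augmentations include elastic deformations of the training images. As a much simpler sketch of the same idea, the function below (hypothetical, not from the paper) generates flipped and rotated copies of an image while applying the identical transform to its label mask, since segmentation labels must move with the pixels:

```python
def augment(image, mask):
    """Yield transformed copies of an (image, mask) pair.

    A minimal stand-in for the heavier augmentation (e.g. elastic
    deformations) used in biomedical segmentation; the key point is that
    the same transform is applied to the image and its label mask."""
    def hflip(grid):
        return [row[::-1] for row in grid]

    def rot90(grid):
        # Rotate a 2D grid 90 degrees clockwise.
        return [list(row) for row in zip(*grid[::-1])]

    yield image, mask
    yield hflip(image), hflip(mask)
    yield rot90(image), rot90(mask)

img = [[1, 2], [3, 4]]
msk = [[0, 1], [1, 0]]
pairs = list(augment(img, msk))
print(len(pairs))  # → 3
```

One annotated sample thus yields several training pairs, which is exactly how augmentation stretches a small biomedical dataset.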
11.Conditional Random Fields as Recurrent Neural Networks, by Shuai Z., Sadeep J., Bernardino R., Vibhav V. et al (2015) (Cited: 760)
Link: http://www.robots.ox.ac.uk/~szheng/papers/CRFasRNN.pdf
This paper introduces a new form of convolutional neural network that combines the strengths of CNNs and probabilistic graphical models based on conditional random fields. To this end, we formulate mean-field approximate inference for conditional random fields with Gaussian pairwise potentials as a recurrent neural network.
12.Image Super-Resolution Using Deep Convolutional Networks, by Chao D., Chen C., Kaiming H. & Xiaoou T. (2014) (Cited: 591)
Link: https://arxiv.org/pdf/1501.00092.pdf
This paper directly learns an end-to-end mapping between low- and high-resolution images. The mapping is represented as a deep convolutional neural network that takes the low-resolution image as input and outputs the high-resolution image.
13.Beyond short snippets: Deep networks for video classification, by Joe Y. Ng, Matthew J. H., Sudheendra V., Oriol V., Rajat M. & George T. (2015) (Cited: 533)
Link: https://arxiv.org/pdf/1503.08909.pdf
This paper proposes and evaluates several deep neural network architectures that integrate temporal information over longer video spans than previously attempted.
14.Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning, by Christian S., Sergey I., Vincent V. & Alexander A A. (2017) (Cited: 520)
Link: https://arxiv.org/pdf/1602.07261.pdf
Deep convolutional neural networks have played a central role in improving image recognition performance in recent years. With an ensemble of three residual networks and one Inception-v4, we achieved a top-5 error rate of 3.08% on the ImageNet classification challenge.
15.Salient Object Detection: A Discriminative Regional Feature Integration Approach, by Huaizu J., Jingdong W., Zejian Y., Yang W., Nanning Z. & Shipeng Li. (2013) (Cited: 518)
Link: https://arxiv.org/pdf/1410.5926.pdf
This paper treats salient object detection as a regression problem. Our approach is based on multi-level image segmentation and uses supervised learning to map regional feature vectors to saliency scores.
16.Visual Madlibs: Fill in the Blank Description Generation and Question Answering, by Licheng Y., Eunbyung P., Alexander C. B. & Tamara L. B. (2015) (Cited: 510)
Link: https://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Yu_Visual_Madlibs_Fill_ICCV_2015_paper.pdf
This paper introduces a new dataset, called the Visual Madlibs dataset, containing 360,001 natural language descriptions of 10,738 images. The dataset uses automatically generated fill-in-the-blank templates to collect targeted descriptions of people and objects, appearances, activities and interactions, and entire scenes or broader context.
17.Asynchronous methods for deep reinforcement learning, by Volodymyr M., Adrià P. B., Mehdi M., Alex G., Tim H. et al. (2016) (Cited: 472)
Link: http://proceedings.mlr.press/v48/mniha16.pdf
The best-performing method, an asynchronous variant of actor-critic, surpasses the current state of the art in the Atari domain while training for half the time on a single multicore CPU instead of a GPU. Furthermore, we show that asynchronous actor-critic succeeds on a wide variety of continuous motor control problems, as well as on a new task of navigating random 3D mazes using visual input.
18.Theano: A Python framework for fast computation of mathematical expressions., by Rami A., Guillaume A., Amjad A., Christof A. et al (2016) (Cited: 451)
Link: https://arxiv.org/pdf/1605.02688.pdf
Theano is a Python library that allows mathematical expressions involving multi-dimensional arrays to be defined, optimized, and evaluated efficiently. Since its introduction, it has been one of the most widely used CPU and GPU mathematical compilers in the machine learning community, and its performance has steadily improved.
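Theano’s workflow is define-then-run: build a symbolic expression graph first, compile it into a callable, and only then evaluate it on concrete values. The toy pure-Python sketch below illustrates that style only; it is a hypothetical analogue, not Theano’s actual API:

```python
# A toy define-then-compile-then-run pipeline in the spirit of Theano's
# workflow. Pure-Python analogue for illustration only, not Theano's API.

class Var:
    """A symbolic input variable; carries no value until evaluation."""
    def __init__(self, name):
        self.name = name
    def __mul__(self, other):
        return Node('*', self, other)
    def __add__(self, other):
        return Node('+', self, other)
    def eval(self, env):
        return env[self.name]

class Node:
    """An operation node in the expression graph."""
    def __init__(self, op, left, right):
        self.op, self.left, self.right = op, left, right
    __mul__ = Var.__mul__
    __add__ = Var.__add__
    def eval(self, env):
        a, b = self.left.eval(env), self.right.eval(env)
        return a * b if self.op == '*' else a + b

def function(inputs, output):
    """'Compile' a symbolic graph into a plain callable."""
    return lambda *args: output.eval(
        dict(zip([v.name for v in inputs], args)))

x, y = Var('x'), Var('y')
expr = x * x + y          # symbolic expression: nothing computed yet
f = function([x, y], expr)
print(f(3, 4))            # → 13
```

In real Theano the compilation step is where graph optimizations and CPU/GPU code generation happen, which is what the abstract means by a mathematical compiler.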
19.Deep Learning Face Attributes in the Wild, by Ziwei L., Ping L., Xiaogang W. & Xiaoou T. (2015) (Cited: 401)
Link: https://www.cv-foundation.org/openaccess/content_iccv_2015/papers/Liu_Deep_Learning_Face_ICCV_2015_paper.pdf
This framework not only performs significantly better than existing techniques but also reveals valuable facts about learning facial representations:
- It shows how different pre-training strategies can improve performance in face localization (LNet) and attribute prediction (ANet);
- Although LNet’s filters are fine-tuned with only image-level attribute labels, their response maps over the entire image strongly indicate face locations.
20.Character-level convolutional networks for text classification, by Xiang Z., Junbo Jake Z. & Yann L. (2015) (Cited: 401)
Link: http://papers.nips.cc/paper/5782-character-level-convolutional-networks-for-text-classification.pdf
This paper provides an empirical exploration of using character-level convolutional networks (ConvNets) for text classification. We constructed several large-scale datasets to show that character-level convolutional networks can achieve state-of-the-art or competitive results.
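Character-level ConvNets first quantize raw text into a matrix of one-hot character vectors before any convolution is applied. The sketch below is a simplified version of that encoding; the paper uses a 70-character alphabet and a fixed length of 1014, shrunk here for illustration:

```python
def quantize(text, alphabet="abcdefghijklmnopqrstuvwxyz", length=8):
    """One-hot encode a string character by character.

    Simplified version of the input encoding for character-level ConvNets:
    unknown characters map to all-zero columns, and the text is truncated
    or zero-padded to a fixed length so every input has the same shape."""
    index = {ch: i for i, ch in enumerate(alphabet)}
    matrix = []
    for ch in text[:length].lower():
        column = [0] * len(alphabet)
        if ch in index:
            column[index[ch]] = 1
        matrix.append(column)
    while len(matrix) < length:          # zero-pad short texts
        matrix.append([0] * len(alphabet))
    return matrix

m = quantize("Hi!")
print(len(m), len(m[0]))  # → 8 26
```

The resulting fixed-size matrix is what the network convolves over, treating text like a one-dimensional signal with no word-level tokenization at all.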
Translator’s Bio
Li Haiming is a graduate student at the University of Chinese Academy of Sciences, an amateur triathlete, and a lover of music, art, and life. He enjoys making friends with all kinds of people and hopes to travel and explore different paths together.
Reprint Notice
If you need to reprint, please indicate the author and source prominently at the beginning of the article (originally from: Data Team THU ID: DatapiTHU), and place a prominent QR code of the data team at the end of the article. For articles with original identification, please send [article name – waiting for authorized public account name and ID] to the contact email to apply for whitelist authorization and edit according to requirements.
After publishing, please feedback the link to the contact email (see below). Unauthorized reprints and adaptations will be pursued legally.