The Ongoing Battle Between TF 2.0 and PyTorch: Current Situation

Author | Jeff Hale

Translator | Jackey

Editor | Jane

Produced by | AI Technology Camp (id: rgznai100)

【Introduction】 After the release of TensorFlow 2.0 and PyTorch 1.0, the debate over which of the two is superior has continued without reaching a definitive conclusion. In this article, the author analyzes data from multiple sources on TensorFlow 2.0 and PyTorch 1.0, explores which framework is currently more favored, offers some learning suggestions, and discusses directions worth watching in the future of deep learning frameworks.


In September 2018, I wrote an article comparing the mainstream deep learning frameworks by demand, usage, and popularity. At that time, TensorFlow was the undisputed heavyweight champion of deep learning frameworks, while PyTorch was the promising up-and-comer. So what has changed in the deep learning framework landscape over the past six months?

To find out, I checked job postings on Indeed, Monster, LinkedIn, and SimplyHired, and also looked at Google search trends, GitHub activity, the number of new articles on Medium and new papers on arXiv, and the number of followers of each framework's topic on Quora. Together, these sources sketch the overall picture of demand, usage, and growth in popularity for each deep learning framework.

Integration and Updates

Recently, both TensorFlow and PyTorch have made significant progress. PyTorch 1.0 was pre-released in October 2018, and fastai v1.0 was also released, marking the maturation of these frameworks. TensorFlow 2.0 Alpha followed on March 4, 2019, adding new features, improving the user experience, and integrating Keras more tightly as its high-level API.

Comparative Analysis Method


This time, I also included Keras and fastai in the analysis. Because they are so closely tied to TensorFlow and PyTorch, their growth and popularity serve as an additional signal for assessing the two main frameworks.

However, I will not discuss other deep learning frameworks such as Caffe, Theano, MXNet, CNTK, Deeplearning4j, and Chainer. These frameworks have their own strengths, but their development paths differ from those of TensorFlow and PyTorch, and they are not closely tied to either ecosystem.

The data for this analysis was collected on March 20 and 21, 2019, and is visualized in the charts below. The raw data and interactive versions of the charts are available at the following two links:

https://docs.google.com/spreadsheets/d/1Q9rQkfi8ubKM8aX33In0Ki6ldUCfJhGqiH9ir6boexw/edit?usp=sharing

https://www.kaggle.com/discdiver/2019-deep-learning-framework-growth-scores

Next, let’s take a look at the more specific comparative analysis results.

I. Job Websites: Changes in Job Demand

To understand which deep learning frameworks are valued in current job demands, I searched for related positions on Indeed, LinkedIn, Monster, and SimplyHired.

I searched using the keyword “machine learning” plus the framework name. For example, to find positions requiring TensorFlow, I searched for “machine learning TensorFlow”. This query format is simply a habit of mine; searching without the “machine learning” keyword did not yield significantly different results (all searches were restricted to the United States).

Limiting the data to the past six months (the searches were run in March 2019), I found the following:

[Chart: number of job listings mentioning each framework, on each job site]

TensorFlow appears in slightly more listings than PyTorch, Keras appears in roughly half as many listings as TensorFlow, and fastai shows up in essentially no listings. Notably, LinkedIn was the only one of the four sites where TensorFlow listings outnumbered PyTorch listings.

II. Google Trends: Analyzing Google Search Data

As the largest search engine, Google offers another way to gauge framework popularity through its web search data. I looked at Google Trends for the past year, worldwide, within the machine learning and artificial intelligence category. Google does not provide absolute search counts, only relative figures.

Comparing each framework's average search score over the most recent six months with its average over the previous six months shows that search interest in TensorFlow has declined while interest in PyTorch has grown. The chart below shows relative search interest for each framework over the past year.


[Google Trends chart of relative search interest over the past year. Blue: TensorFlow; Yellow: Keras; Red: PyTorch; Green: fastai]
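To make the six-month comparison concrete, below is a minimal pandas sketch. It assumes the default CSV export from Google Trends (multiTimeline.csv, weekly rows with one relative-interest column per framework); the file name and parsing details are assumptions for illustration, not part of the original analysis.

```python
import pandas as pd

# Assumes the default Google Trends export ("multiTimeline.csv"): two header
# lines, then weekly rows with one relative-interest column per framework.
trends = pd.read_csv("multiTimeline.csv", skiprows=2)
trends = trends.rename(columns={trends.columns[0]: "week"})
trends["week"] = pd.to_datetime(trends["week"])
# Values like "<1" in the export are coerced to NaN so the averages stay numeric.
trends = trends.set_index("week").apply(pd.to_numeric, errors="coerce")

# Split the one-year window into the older and the more recent six months.
midpoint = trends.index.min() + (trends.index.max() - trends.index.min()) / 2
older = trends[trends.index < midpoint]
recent = trends[trends.index >= midpoint]

# Relative change in average search interest, per framework.
change = (recent.mean() - older.mean()) / older.mean()
print(change.sort_values(ascending=False))
```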

III. Blog Sites: New Article Publication Trends

Medium is an important platform for data science articles and tutorials, and many practitioners share their work there.

Over the past six months, the numbers of new Medium articles about TensorFlow and about Keras have been very close, while PyTorch trails both by a wide margin. As high-level APIs, Keras and fastai have gained popularity among deep learning developers, which shows up in the many tutorial-style articles about them on Medium.

[Chart: new Medium articles mentioning each framework in the past six months]

IV. arXiv: Usage in New Papers

Most people are familiar with arXiv, where many researchers upload their papers. Searching arXiv for papers from the past six months that mention each framework shows that TensorFlow still appears in significantly more papers than PyTorch.

[Chart: arXiv papers mentioning each framework in the past six months]
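As a rough way to reproduce this kind of count, the sketch below queries the public arXiv API. Note that the API matches paper metadata (titles, abstracts, comments) rather than full text, and the six-month date restriction is omitted here, so the numbers are only an approximation of the search described above.

```python
import urllib.request
import xml.etree.ElementTree as ET

def arxiv_hits(term: str) -> int:
    """Return the number of arXiv records whose metadata matches `term`."""
    url = ("http://export.arxiv.org/api/query?"
           f"search_query=all:{term}&start=0&max_results=1")
    with urllib.request.urlopen(url) as resp:
        feed = ET.fromstring(resp.read())
    # The Atom feed reports the total match count in <opensearch:totalResults>.
    ns = {"opensearch": "http://a9.com/-/spec/opensearch/1.1/"}
    return int(feed.find("opensearch:totalResults", ns).text)

for framework in ["tensorflow", "pytorch", "keras", "fastai"]:
    print(framework, arxiv_hits(framework))
```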

V. GitHub: Four Data Points

GitHub activity is another indicator of framework popularity. The four charts below show Stars, Forks, Watchers, and Contributors for each framework on GitHub. TensorFlow ranks first in every category, but PyTorch is not far behind in Watchers and Contributors. Meanwhile, fastai's Contributor count is noteworthy: it is much higher than Keras's and close to PyTorch's. Keras, like TensorFlow, is developed largely with Google's backing, and some Keras contributors also work on TensorFlow, which affects Keras's numbers as well.

[Charts: GitHub Stars, Forks, Watchers, and Contributors for each framework]
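Stars, Forks, and Watchers are easy to pull programmatically; the minimal sketch below uses the public GitHub REST API. Unauthenticated requests are rate-limited, and contributor counts, which require paginating a separate endpoint, are left out for brevity.

```python
import json
import urllib.request

def repo_stats(repo: str) -> dict:
    """Fetch stars, forks, and watchers for a repository via the GitHub REST API."""
    url = f"https://api.github.com/repos/{repo}"
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)
    return {
        "stars": data["stargazers_count"],
        "forks": data["forks_count"],
        # "subscribers_count" is what the GitHub UI displays as "Watchers".
        "watchers": data["subscribers_count"],
    }

for repo in ["tensorflow/tensorflow", "pytorch/pytorch",
             "keras-team/keras", "fastai/fastai"]:
    print(repo, repo_stats(repo))
```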

VI. Quora: Number of Followers

This comparison adds a new reference metric that I had not used before: the number of followers of each framework's topic on Quora. Over the past six months, TensorFlow has had by far the most followers, with PyTorch and Keras trailing well behind.

[Chart: Quora topic followers for each framework]

After collecting this data, I combined it all into a single metric; the calculation of this growth score is described below.

  • Growth Score Calculation

The following are the steps to calculate the growth score:

1. Convert all information into values ranging from 0 to 1.

2. Aggregate the various components of online job information and GitHub activity.

3. Perform weighted calculations for each category based on the chart below.

4. Convert the weighted scores to a percentage score.

5. Sum all the scores to obtain the growth score for each framework.

[Chart: category weights: job listings 35%, each of the other five categories 13%]

Of the six categories, online job listings account for 35% of the total score, and the other five categories are weighted equally at 13% each (money talks, after all). This split seems to balance the categories reasonably well. Unlike the 2018 weighted-score analysis, I did not include the KDnuggets usage survey (no new data) or new book publications (few were released in the past six months).
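To make the scoring procedure concrete, here is a minimal pandas sketch of the five steps above, using the 35%/13% weights just described. The numbers in the table are invented placeholders rather than the article's data, and the sketch assumes the job-listing and GitHub sub-components have already been combined into a single figure per category.

```python
import pandas as pd

# Hypothetical growth figures per framework for each category (not the
# article's actual data); rows are frameworks, columns are categories.
raw = pd.DataFrame(
    {"job_listings": [1500, 1300, 800, 10],
     "google_trends": [-8.0, 25.0, 2.0, 10.0],
     "medium_articles": [300, 150, 280, 90],
     "arxiv_papers": [900, 500, 700, 50],
     "github_activity": [60, 45, 20, 15],
     "quora_followers": [12000, 3000, 5000, 400]},
    index=["TensorFlow", "PyTorch", "Keras", "fastai"],
)

# Step 1: min-max scale every category to the 0-1 range.
scaled = (raw - raw.min()) / (raw.max() - raw.min())

# Steps 2-3: weight the categories (job listings 35%, the rest 13% each).
weights = pd.Series(
    {"job_listings": 0.35, "google_trends": 0.13, "medium_articles": 0.13,
     "arxiv_papers": 0.13, "github_activity": 0.13, "quora_followers": 0.13})

# Steps 4-5: convert to percentages and sum into one growth score per framework.
growth_score = (scaled * weights * 100).sum(axis=1)
print(growth_score.sort_values(ascending=False))
```

Because every metric is min-max scaled before weighting, a framework that leads every single category would end up with a score of 100, and one that trails every category would score 0.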

  • Comparison Results

The table below shows the scores for each framework’s subcategories:

[Table: subcategory scores for each framework]

The underlying Google Sheet is available at the following URL:

https://docs.google.com/spreadsheets/d/1Q9rQkfi8ubKM8aX33In0Ki6ldUCfJhGqiH9ir6boexw/edit?usp=sharing

Below are the categories and final scores:

[Table: category scores and final growth score for each framework]

The following chart shows the growth scores for each framework:

[Chart: growth score for each framework]

TensorFlow remains both the most in-demand and the fastest-growing deep learning framework, and its leading position is unlikely to change in the near term. PyTorch is also growing rapidly, and the sharp rise in job listings that mention it corroborates its increasing adoption. Keras has likewise improved over the past six months. Finally, fastai is growing from a low base, which is unsurprising given that it is the newest of these frameworks.

Both TensorFlow and PyTorch are excellent frameworks worth learning.

Learning Suggestions

If you want to learn TensorFlow, I recommend starting with Keras. I highly recommend Chollet’s Deep Learning with Python and Dan Becker’s DataCamp course on Keras. The TensorFlow 2.0 version integrates tf.keras, so you can now call Keras within the TensorFlow framework. If you want to learn PyTorch, I suggest starting with fastai. You can learn from the MOOC course Practical Deep Learning for Coders, v3, which covers the fundamentals of deep learning and integrates fastai with PyTorch.
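As a quick taste of the tf.keras workflow mentioned above, here is a minimal sketch, assuming TensorFlow 2.0 is installed; the MNIST model is purely illustrative and is not taken from any of the recommended courses.

```python
import tensorflow as tf

# Keras ships inside TensorFlow 2.0 as tf.keras, so no separate install is needed.
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0  # scale pixels to [0, 1]

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=2, validation_data=(x_test, y_test))
```

The same Sequential/compile/fit pattern carries over from standalone Keras, which is why starting with Keras transfers directly to TensorFlow 2.0.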

What else is worth knowing about TensorFlow and PyTorch?

Future Directions

Developers often say they prefer PyTorch to TensorFlow because PyTorch feels more Pythonic and its API is more stable. It also supports native ONNX model export, which can be used to speed up inference, and many of its operations mirror NumPy's, which flattens the learning curve.
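For example, PyTorch's built-in exporter can write a model to ONNX in a few lines so that an optimized runtime can serve it; the toy model and file name below are illustrative only.

```python
import torch
import torch.nn as nn

# A toy model; any trained nn.Module could be exported the same way.
model = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
model.eval()

# torch.onnx.export traces the model with a dummy input and writes an ONNX file
# that other runtimes (e.g. ONNX Runtime) can load for inference.
dummy_input = torch.randn(1, 10)
torch.onnx.export(model, dummy_input, "toy_model.onnx",
                  input_names=["features"], output_names=["logits"])
```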

However, Cassie Kozyrkov, Google's Chief Decision Scientist, believes TensorFlow 2.0 is focused on improving the user experience. A cleaner API, tighter Keras integration, and eager execution are changes that, together with TensorFlow's large community, should help it remain popular.

TensorFlow also recently announced an exciting plan: Swift for TensorFlow. Swift is a programming language originally developed at Apple that runs much faster than Python. The fast.ai MOOC will include some Swift for TensorFlow material as well, although the language bindings still need time to mature. Nonetheless, it is a significant step for deep learning frameworks, and more integration and borrowing between languages and frameworks is sure to follow.

Another factor affecting deep learning frameworks is quantum computing. It will take several years before quantum computers can be effectively utilized, but Google, IBM, Microsoft, and other companies are already considering how to integrate quantum computing with deep learning. At that time, these frameworks will face corresponding adjustments to accommodate new technologies.

Both TensorFlow and PyTorch continue to advance, and each now offers a high-level API (tf.keras and fastai, respectively) that lowers the barrier to entry for deep learning.

Original link:

https://towardsdatascience.com/which-deep-learning-framework-is-growing-fastest-3f77f14aa318


