Optimized TensorFlow 2.4 for Mac: Accelerated CPU and GPU Training


Written by Pankaj Kanwar and Fred Alcober

With TensorFlow 2, developers, engineers, and researchers get best-in-class training performance across a variety of platforms, devices, and hardware, so they can work on the platform they prefer. TensorFlow users can now get accelerated training on Macs with Apple’s new M1 chip or with Intel chips by using the Mac-optimized TensorFlow 2.4 and the new ML Compute framework. Together with the ability of Apple developers to run TensorFlow on iOS through TensorFlow Lite, these improvements showcase the breadth and depth of TensorFlow’s support for high-performance ML execution on Apple hardware.


Performance on Mac with ML Compute

The Mac has long been a popular platform for developers, engineers, and researchers. Apple recently released a lineup of Macs featuring the new M1 chip, and the Mac-optimized TensorFlow 2.4 takes full advantage of this hardware to deliver a significant jump in performance.

ML Compute, Apple’s new framework that powers training for TensorFlow models on the Mac, provides accelerated CPU and GPU training on both M1- and Intel-powered Macs.

For example, the M1 chip contains a powerful new 8-core CPU and up to an 8-core GPU, both optimized for ML training tasks on the Mac. The charts below show how the Mac-optimized TensorFlow 2.4 delivers significant performance improvements with popular models on both M1- and Intel-powered Macs.


Training performance for common models using ML Compute on 13-inch MacBook Pro systems with M1 and Intel chips, shown in seconds per batch; lower numbers mean faster training (see note 1).


Training performance for common models using ML Compute on a 2019 Mac Pro with Intel chips, shown in seconds per batch; lower numbers mean faster training (see note 2).
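The charts report results in seconds per batch. As a rough illustration of what that metric means, here is a minimal sketch of how such a number could be measured for any training step. The helper `seconds_per_batch`, its `warmup` parameter, and the stand-in `fake_train_step` are hypothetical names chosen for this example; this is not the benchmarking procedure Apple used.

```python
import time


def seconds_per_batch(train_step, batches, warmup=2):
    """Return the mean wall-clock seconds per call to train_step.

    The first `warmup` batches are run but not timed, so one-time
    setup costs do not distort the average.
    """
    batches = list(batches)
    if len(batches) <= warmup:
        raise ValueError("need more batches than warmup steps")

    # Untimed warm-up steps.
    for batch in batches[:warmup]:
        train_step(batch)

    # Timed steps.
    timed = batches[warmup:]
    start = time.perf_counter()
    for batch in timed:
        train_step(batch)
    elapsed = time.perf_counter() - start

    return elapsed / len(timed)


def fake_train_step(batch):
    # Stand-in for a real training step; it only does some arithmetic.
    return sum(x * x for x in batch)


if __name__ == "__main__":
    data = [list(range(1000)) for _ in range(12)]
    print(f"{seconds_per_batch(fake_train_step, data):.6f} seconds per batch")
```

In a real measurement, `train_step` would wrap one optimizer update of the model being benchmarked, and the same batch size would be used on every machine being compared.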

Getting Started with TensorFlow Optimized for Mac

Users do not need to make any changes to their existing TensorFlow scripts to use ML Compute as a backend for TensorFlow and TensorFlow Addons.
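To make "no changes to existing scripts" concrete, the sketch below is an ordinary Keras training script of the kind that statement refers to. It uses only standard TensorFlow 2 and NumPy calls and contains nothing Mac-specific; the synthetic data, layer sizes, and variable names are assumptions made up for this illustration, and whether a given script is actually accelerated depends on the installed build.

```python
import numpy as np
import tensorflow as tf

# Synthetic binary-classification data (illustrative only).
rng = np.random.default_rng(0)
features = rng.normal(size=(256, 20)).astype("float32")
labels = (features.sum(axis=1) > 0).astype("float32")

# A small, standard Keras model; nothing here refers to ML Compute.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(20,)),
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# One short epoch, just to show the usual training call.
history = model.fit(features, labels, batch_size=32, epochs=1, verbose=0)
print(history.history["loss"])
```

The script is written purely against the public `tf.keras` API, so the backend is chosen by the installed TensorFlow build rather than by the script itself.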

To get started, visit Apple’s GitHub repository for instructions on downloading and installing the Mac-optimized TensorFlow 2.4.

In the near future, we will integrate this version into the TensorFlow master branch, making it easier for users to pick up updates like this and get these performance gains.

You can find details about the ML Compute framework on Apple’s machine learning website.

Notes:

  1. Testing conducted by Apple in October and November 2020 using a pre-production 13-inch MacBook Pro system with Apple M1 chip, 16GB of RAM, and 256GB SSD, and a production 1.7GHz quad-core Intel Core i7-based 13-inch MacBook Pro system with Intel Iris Plus Graphics 645, 16GB of RAM, and 2TB SSD. Tested with pre-release macOS Big Sur, TensorFlow 2.3, pre-release TensorFlow 2.4, ResNet50V2 with fine-tuning, CycleGAN, Style Transfer, MobileNetV3, and DenseNet121. Performance tests are conducted using specific computer systems and reflect the approximate performance of the MacBook Pro.

  2. Testing conducted by Apple in October and November 2020 using a production 3.2GHz 16-core Intel Xeon W-based Mac Pro system with 32GB of RAM, AMD Radeon Pro Vega II Duo graphics with 64GB of HBM2, and 256GB SSD. Tested with pre-release macOS Big Sur, TensorFlow 2.3, pre-release TensorFlow 2.4, ResNet50V2 with fine-tuning, CycleGAN, Style Transfer, MobileNetV3, and DenseNet121. Performance tests are conducted using specific computer systems and reflect the approximate performance of the Mac Pro.
