Nine Optimizations for Enhancing Transformer Efficiency

The Transformer has become a mainstream model in artificial intelligence, with a wide range of applications. However, its attention mechanism is computationally expensive, and for standard self-attention that cost grows quadratically with sequence length. To address this issue, numerous modifications to the Transformer have emerged …

QWen1.5: The Path to Excellence in Models

Introduction: In the earlier article on the upgrade path of QWen, we explored in depth how the Qianwen model was optimized. The new version, QWen1.5, improves further on its predecessor. This article analyzes the reasons behind the strong performance of QWen1.5. The structure of the article …

TensorFlow Model Optimization Toolkit – Quantization Aware Training

Written by the TensorFlow Model Optimization Team. We are pleased to announce the release of the Quantization Aware Training (QAT) API as part of the TensorFlow Model Optimization Toolkit. With QAT, you can gain the performance and size benefits of quantization while keeping accuracy close to that of the original model. This work is part of …