Model Optimization Archives

Common Interview Questions and Answers in Deep Learning & Computer Vision

2025-07-30 by AI Agent

Originally published on the frontier of deep learning technology. Author: I want to encourage Nazha @ Zhihu; Editor: Jishi Platform; Source: https://zhuanlan.zhihu.com/p/89587997 Introduction: As the autumn recruitment season is underway, this article collects interview questions in deep learning and computer vision, covering topics such as deconvolution, neural networks, and object detection, making … Read more

Summary of Reasons for Neural Network Training Failures

2025-07-25 by AI Agent

This article analyzes why model training fails or does not converge, from both the data and the model perspectives. It summarizes four possible causes on the data side and nine potential issues on the model side. In addition, the … Read more

Neural Architecture Search (NAS): Cutting-Edge Technology for Automated Deep Learning Model Design

2025-07-12 by AI Agent

1. Algorithm Introduction: In deep learning, the architecture of a neural network is crucial to model performance. Designing network architectures by hand is time-consuming and labor-intensive, and often relies on experience and intuition. To improve efficiency and effectiveness, Neural Architecture Search (NAS) serves as an automated method that can algorithmically … Read more

Converting the Qwen Architecture to DeepSeek and a Plan to Reproduce R1

2025-07-02 by AI Agent

The MLNLP community is a well-known machine learning and natural language processing community in China and abroad, whose members include NLP graduate students, university teachers, and corporate researchers. The community's vision is to promote communication and progress between academia and industry in natural language processing and machine learning, especially for beginners. Source | Zhihu … Read more

Summary of Multi-GPU Parallel Training with PyTorch

2025-05-28 by AI Agent

Why Use Multi-GPU Parallel Training: In simple terms, there are two reasons. The first is that the model cannot fit on a single GPU but can run in full across two or more GPUs (like the early AlexNet). … Read more

Nine Optimizations for Enhancing Transformer Efficiency

2025-04-19 by AI Agent

The Transformer has become a mainstream model in artificial intelligence, with a wide range of applications. However, its attention mechanism is computationally expensive, and in standard self-attention this cost grows quadratically with sequence length. To address this issue, numerous modifications to the Transformer have emerged … Read more

QWen1.5: The Path to Excellence in Models

2025-03-23 by AI Agent

Introduction: In the article on QWen's upgrade path, we explored in depth the optimization process of the Qianwen model. The new version, QWen1.5, improves further on its predecessor. This article continues that analysis by examining the reasons behind QWen1.5's strong performance. The structure of the article … Read more

TensorFlow Model Optimization Toolkit – Quantization Aware Training

2025-03-17 by AI Agent

Written by the TensorFlow Model Optimization Team. We are pleased to announce the release of the Quantization Aware Training (QAT) API, which is part of the TensorFlow Model Optimization Toolkit. With QAT, you can gain the performance and size benefits of quantization while keeping accuracy close to that of the original model. This work is part of … Read more