New Research: MoE + General Experts Solve Conflicts in Multimodal Models

Hong Kong University of Science and Technology, Southern University of Science and Technology & Huawei Noah's Ark Lab | WeChat Official Account QbitAI

Fine-tuning can make general-purpose large models more adaptable to specific industry applications. However, researchers have now found that performing "multi-task instruction fine-tuning" on multimodal large models may lead to "learning more … Read more

Understanding MoE: Deploying Mixture-of-Experts Architectures

Selected from the HuggingFace blog | Translated by Zhao Yang

This article introduces the building blocks of MoE, how it is trained, and the trade-offs to consider when using it for inference. Mixture of Experts (MoE) is a commonly used technique in LLMs aimed at improving efficiency and accuracy. The way this method works is by breaking … Read more
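Although the teaser is truncated here, the core idea it previews is top-k routing: a small gating network scores every token and dispatches it to only a few expert feed-forward networks, whose outputs are mixed by the gate weights. The sketch below illustrates that idea in PyTorch; the class and parameter names (SimpleMoE, n_experts, top_k) are illustrative assumptions, not code from the HuggingFace post.

```python
# Minimal sketch of a top-k gated Mixture-of-Experts layer (illustrative, not from the article).
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimpleMoE(nn.Module):
    """Route each token to its top-k experts and mix their outputs."""

    def __init__(self, d_model: int, d_hidden: int, n_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.gate = nn.Linear(d_model, n_experts)  # router producing one score per expert
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_hidden), nn.GELU(), nn.Linear(d_hidden, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, d_model) -> flatten tokens for routing
        tokens = x.reshape(-1, x.size(-1))
        logits = self.gate(tokens)                           # (n_tokens, n_experts)
        weights, indices = logits.topk(self.top_k, dim=-1)   # keep only the k best experts per token
        weights = F.softmax(weights, dim=-1)                 # renormalise over the chosen k

        out = torch.zeros_like(tokens)
        for e, expert in enumerate(self.experts):
            token_idx, slot = (indices == e).nonzero(as_tuple=True)  # tokens that chose expert e
            if token_idx.numel() == 0:
                continue
            out[token_idx] += weights[token_idx, slot].unsqueeze(-1) * expert(tokens[token_idx])
        return out.reshape_as(x)


if __name__ == "__main__":
    layer = SimpleMoE(d_model=64, d_hidden=256)
    y = layer(torch.randn(2, 10, 64))
    print(y.shape)  # torch.Size([2, 10, 64])
```

Because each token activates only top_k of the n_experts feed-forward blocks, capacity grows with the number of experts while per-token compute stays roughly constant, which is the efficiency/accuracy trade-off the article examines.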

Rethinking the Attention Mechanism in Deep Learning

Author: Cool Andy @ Zhihu | Source: https://zhuanlan.zhihu.com/p/125145283 | Editor: Jishi Platform

Jishi Guide: This article discusses the Attention mechanism in deep learning. It is not intended to review the various frameworks and applications of the Attention mechanism, but rather to introduce four representative and interesting works related to Attention and provide further … Read more
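For context on the baseline that the surveyed works revisit, the sketch below shows plain scaled dot-product attention: queries are compared against keys, the scores are normalised with softmax, and the resulting weights mix the values. The function name and tensor shapes are illustrative assumptions, not taken from the article.

```python
# Minimal sketch of scaled dot-product attention (illustrative baseline, not from the article).
import math
import torch


def scaled_dot_product_attention(q, k, v, mask=None):
    """q, k, v: (batch, heads, seq, d_k). Returns the attended values."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)      # pairwise query-key similarities
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    weights = torch.softmax(scores, dim=-1)                # attention distribution per query
    return weights @ v


if __name__ == "__main__":
    q = k = v = torch.randn(1, 4, 10, 32)
    out = scaled_dot_product_attention(q, k, v)
    print(out.shape)  # torch.Size([1, 4, 10, 32])
```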