Qwen Series Technical Interpretation 3 – Architecture

Shadows slant across the clear, shallow water; a subtle fragrance drifts beneath the moon at dusk. Hello everyone, I am the little girl selling hot dry noodles, and I am glad to share cutting-edge technology and thoughts on artificial intelligence with you. Following the previous posts in this series: Qwen Series … Read more

Qwen-Agent Framework: Exploring Open Source Qwen Model Capabilities

Qwen-Agent is a framework for exploring the tool-use, planning, and memory capabilities of the open-source Qwen model. Based on Qwen-Agent, we developed a Chrome browser extension called BrowserQwen, whose main features are: discussing the content of the current webpage or PDF document with Qwen; and, with your authorization, BrowserQwen will record … Read more
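
To make the framework side concrete, below is a minimal sketch of driving a Qwen model through Qwen-Agent's Assistant interface, loosely modeled on the project README; the configuration keys, model name, and tool name are assumptions, so check the repository for the exact options.

```python
# Minimal Qwen-Agent sketch (assumed interface, modeled on the project README).
# Requires `pip install qwen-agent` plus either a DashScope API key or a local
# OpenAI-compatible server hosting an open-source Qwen model.
from qwen_agent.agents import Assistant

# LLM backend configuration; the keys and values here are illustrative assumptions.
llm_cfg = {
    'model': 'qwen-max',          # or a locally served open-source Qwen checkpoint
    'model_server': 'dashscope',  # or the URL of an OpenAI-compatible endpoint
}

# An assistant with the built-in code interpreter tool enabled.
bot = Assistant(
    llm=llm_cfg,
    function_list=['code_interpreter'],
    system_message='You are a helpful assistant that can use tools.',
)

messages = [{'role': 'user', 'content': 'Plot y = x^2 for x from 0 to 10.'}]

# bot.run streams partial responses; the last yielded list is the full reply.
responses = []
for responses in bot.run(messages=messages):
    pass
print(responses[-1]['content'])
```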

In-Depth Analysis of RL Strategies in Mainstream Open-Source LLMs

The author, an internet practitioner at Meta, focuses on LLM4Code and LLM infra. The original post is from Zhihu: https://zhuanlan.zhihu.com/p/16270225772. This article is shared for academic and technical exchange only; if there is any infringement, please contact us for removal. RLHF is an important part of LLM training. With the development of open-source models, we observe that some … Read more

Building a Q&A Bot with Local Knowledge Base Using LlamaIndex and Qwen1.5

01 Introduction: What is RAG? LLMs can produce misleading "hallucinations", rely on information that may be outdated, handle domain-specific knowledge inefficiently, lack deep insight into specialized fields, and also show some deficiencies in reasoning. It is against this backdrop that Retrieval-Augmented Generation (RAG) has emerged, becoming a significant trend in … Read more
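
As a preview of the build the article walks through, here is a minimal RAG sketch that pairs LlamaIndex with a locally loaded Qwen1.5 chat model; it assumes llama-index 0.10+ with the HuggingFace LLM and embedding integrations installed, and the ./data directory and embedding model choice are placeholders.

```python
# Minimal local-knowledge-base RAG sketch (assumes llama-index >= 0.10 plus the
# llama-index-llms-huggingface and llama-index-embeddings-huggingface packages).
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.llms.huggingface import HuggingFaceLLM
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

# Generator: a Qwen1.5 chat model loaded through transformers.
Settings.llm = HuggingFaceLLM(
    model_name="Qwen/Qwen1.5-7B-Chat",
    tokenizer_name="Qwen/Qwen1.5-7B-Chat",
    context_window=8192,
    max_new_tokens=512,
)
# Retrieval side: a sentence-embedding model for building the vector index.
Settings.embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-zh-v1.5")

# "./data" is a placeholder for the local knowledge-base documents.
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)

# Retrieved chunks are injected into the Qwen1.5 prompt at query time.
query_engine = index.as_query_engine(similarity_top_k=3)
print(query_engine.query("What does the knowledge base say about Qwen1.5?"))
```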

Understanding Qwen1.5 MoE: Efficient Intelligence of Sparse Large Models

Introduction. Official documentation: Qwen1.5-MoE: Achieving the Performance of 7B Models with 1/3 Activation Parameters | Qwen. On March 28, Alibaba open-sourced its first MoE large model, Qwen1.5-MoE-A2.7B, which is built on the existing Qwen-1.8B model. Qwen1.5-MoE-A2.7B has 2.7 billion activated parameters, yet it can achieve the performance … Read more
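
For reference, the checkpoint can be loaded like any other Hugging Face causal LM; the sketch below assumes a transformers release that includes the Qwen2-MoE architecture (roughly 4.40+) and uses the Chat variant of the model. Note that all of the roughly 14B total parameters are loaded even though only about 2.7B are activated per token.

```python
# Minimal sketch: running Qwen1.5-MoE-A2.7B-Chat with Hugging Face transformers.
# Assumes a transformers version with Qwen2-MoE support (roughly 4.40+).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Qwen/Qwen1.5-MoE-A2.7B-Chat"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",  # keep the checkpoint's native precision
    device_map="auto",   # place the expert layers across available devices
)

messages = [{"role": "user", "content": "Briefly explain mixture-of-experts."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Only ~2.7B parameters are activated per token during generation.
output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```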

How Effective Is Tongyi Qwen-7B? Firefly Fine-Tuning Practice Shows Great Results

01 Introduction. On August 3, Alibaba Cloud released its first open-source large model, Tongyi Qwen-7B, which is open source and commercially usable. Although expectations have already been raised by various hundred-billion-parameter models, the fact that this one comes from Alibaba attracted widespread attention and discussion among peers, and it has performed excellently on various … Read more