Overview of 15 Classic RAG Frameworks (Part 2)

Source: Deep Learning and Large Models (LLM). This article is approximately 3,500 words long, about a 9-minute read. It traces the development of Retrieval-Augmented Generation (RAG), from basic concepts to the latest techniques. 4. Overview of Existing RAG Frameworks: Agent-Based RAG. A new agent-based Retrieval-Augmented Generation (RAG) framework adopts a … Read more

RAG vs Fine-Tuning: A Guide for Domain-Specific AI Models

Machine Heart Report. Editor: Rome. Retrieval-Augmented Generation (RAG) and fine-tuning are two common methods for improving the performance of large language models. So which method is better? Which is more efficient when building applications in specific domains? This paper from Microsoft can serve as a reference for your choice. When constructing large language model applications, there … Read more

Microsoft’s ‘Little Cannon’: Phi-4 – A Model for Complex Inference Driven by Synthetic Data

Recently, the LLM community has been absorbed in the shock of DeepSeek-V3. The model is not only open-source but also performs well. However, an LLM of that scale is beyond our reach (it will not fit in our GPU memory). If we can’t afford that, let’s take a look at Microsoft’s open-source … Read more

Understanding Kimi 1.5 Technical Report

Recently, it feels like the New Year has come early. Just last night, DeepSeek and Kimi both released their version 1.0, and Kimi was the first to publish its technical report, which is quite interesting… When it comes to Kimi, everyone has the impression that it has a technological first-mover advantage, being the first to … Read more

Kimi Releases Latest Model K1.5: Comprehensive Technical Report

Hello everyone, I am Liu Cong from NLP. Just tonight, Kimi released its latest model, K1.5. First, let’s take a look at the leaderboard results, which are simply explosive. In long reasoning, K1.5 far surpasses OpenAI’s o1 model in mathematical ability, in both pure-text and visual multimodal settings; it is on par with Codeforces, slightly lagging … Read more

Query Optimization Techniques in RAG

A Survey of Query Optimization in Large Language Models. Paper link: https://arxiv.org/pdf/2412.17558 (published by Tencent). Large Language Models (LLMs) are becoming increasingly popular, but they also face challenges such as “hallucination” when dealing with domain-specific tasks or those requiring specialized knowledge. Retrieval-Augmented Generation (RAG) technology has emerged as a key method for enhancing model performance, with … Read more

AI Developer Perspective: Evolution of Large Model Infrastructure and Middleware Toolchain in 2024

I originally planned to write an article titled “A Two-Year Review of ChatGPT” to echo last year’s summary article on the first anniversary of ChatGPT. However, I’ve been too busy lately, and now that it’s almost mid-January, this topic no longer seems appropriate. So, I’ve decided to change the subject and discuss the developments over … Read more

BCG’s Forecast: How AI Agents Create Business Value

Recently, the world-renowned management consulting firm Boston Consulting Group (BCG) released a highly insightful report predicting that AI Agents will spark a revolution across various industries, prompting profound reflections on future work models, business models, and even the shape of human society. As Yuval Noah Harari, author of “Sapiens: A Brief History of Humankind,” stated: … Read more

Review of Generative AI Developments (2024)

Since OpenAI officially released ChatGPT in November 2022, the AI technology ecosystem has advanced rapidly. The public has moved from confusion, to excitement, and now to unease as their own employers pursue cost-cutting and efficiency gains. The world is changing so quickly, … Read more

Detailed Explanation of Attention Mechanism (With Code)

The attention mechanism is a deep learning technique that is widely used in Natural Language Processing (NLP) and computer vision. Its core idea is to mimic human attention: people focus on the key parts of the information in front of them and ignore the less important parts. In machine learning models, this can help the model better capture … Read more