Principles Of Implementation For AutoGPT And HuggingGPT

Recently, AutoGPT and HuggingGPT have become extremely popular. They use the ChatGPT large model to make decisions automatically and call on other models, achieving a high degree of automated decision-making and expanding the application scope of large models. The most critical aspect, however, is understanding their specific implementation principles and how they operate internally, which is …

The Utility of Small Models: GPT-4 + AutoGPT for Online Decision Making

New Intelligence Report. Editor: LRS. [New Intelligence Guide] A new paradigm combining large language models and AutoGPT has arrived! This paper presents a comprehensive benchmark study of Auto-GPT agents in real-world decision-making tasks, exploring the application of large language models (LLMs) in decision-making. Paper link: https://arxiv.org/pdf/2306.02224.pdf The authors compared the performance of several popular LLMs (including …

Defeating GPT-3 with 1/10 Parameter Size: In-Depth Analysis of Meta’s LLaMA

Yann LeCun announced on February 25, 2023, Beijing time, that Meta AI has publicly released LLaMA (Large Language Model Meta AI), a large language model that includes four parameter sizes: 7 billion, 13 billion, 33 billion, and 65 billion. The aim is to promote research on the miniaturization and democratization of LLMs. Guillaume Lample claimed …

Google & Hugging Face: The Most Powerful Language Model Architecture for Zero-Shot Learning

Data Digest, authorized reprint from Xi Xiaoyao’s Cute Selling House. Author: iven. From GPT-3 to prompts, more and more people have discovered that large models perform very well in zero-shot learning settings. This has led to increasing expectations for the arrival of AGI. However, one thing is very puzzling: in 2019, T5 discovered through “hyperparameter …

Detailed Explanation of LlamaIndex Workflows: Key to Improving Data Processing Efficiency

LlamaIndex, as a powerful framework, provides a solid foundation for building data pipelines that connect with large language models (LLMs). It implements a modular approach to query execution through structured workflows, simplifying solutions to complex problems. Today, let’s discuss LlamaIndex workflows. 1. Basics of LlamaIndex Workflows …

Simplifying Complexity: Principles for Building Efficient and Reliable AI Agents

Definition of AI Agent: When it comes to agents, many people think they are a product of LLMs, but that is not the case. The modern definition of AI agents has gradually taken shape alongside the development of AI since the 1950s, and its roots can be traced back to earlier philosophical thought and scientific exploration. In …

Comparing Mistral AI and Meta: Top Open Source LLMs

Source: Deephub Imba. This article is about 5,000 words long; a 10-minute read is recommended. It compares Mistral 7B vs Llama 2 7B and Mixtral 8x7B vs Llama 2 70B. To improve performance, large language models (LLMs) typically increase model size. This article will …

Pinecone and LangChain: Powerful Tools for LLM Application Development

Large language models are machine learning models capable of generating natural language text based on context. In recent years, with the development of deep learning and big data, the performance and capabilities of language models have significantly improved, leading to the emergence of many applications …

Chronos: Slow Thinking RAG Technology for News Timeline Summarization

Paper: https://arxiv.org/abs/2501.00888 Github: https://github.com/Alibaba-NLP/CHRONOS Demo: https://modelscope.cn/studios/vickywu1022/CHRONOS In the digital age, the exponential growth of news information makes it crucial to extract and organize historical event timelines from massive texts. To address this challenge, Alibaba’s Tongyi Lab and researchers from Shanghai Jiao Tong University proposed CHRONOS, a new agent-based framework for news timeline summarization, named …

Goodbye Large Models: MiniRAG for Efficient Knowledge Retrieval

Today, I will share a retrieval-augmented generation method designed for resource-constrained scenarios: MiniRAG. Paper link: https://arxiv.org/pdf/2501.06713 Code link: https://github.com/HKUDS/MiniRAG Introduction: With the rapid development of retrieval-augmented generation (RAG) technology, the performance of language models in knowledge retrieval and generation tasks has significantly improved. However, existing methods heavily rely on large language models (LLMs), leading to …