Llama 3.2 Reasoning WebGPU: A Powerful In-Browser Model

Llama 3.2 Reasoning WebGPU is a compact yet capable reasoning language model that runs in the browser, akin to embedding an intelligent brain in a webpage so that it can quickly understand and reason about a variety of problems. References: [1] http://github.com/huggingface/transformers.js-examples/tree/main/llama-3.2-reasoning-webgpu Readers are welcome to support my knowledge platform (NLP Engineering): Dify source code analysis and Q&A, Dify dialogue system source code, … Read more

LongQLoRA: Efficiently Extending LLaMA2-13B Context Length

This article introduces LongQLoRA, our work on efficiently extending the context length of large models with limited resources. It draws on Position Interpolation and QLoRA, and we recommend reading it alongside our previous articles to help understand this work: Illustration of RoPE Rotational … Read more
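The excerpt names Position Interpolation without showing it. The idea can be sketched as follows: instead of feeding RoPE position indices it never saw during pretraining, the indices of the longer sequence are scaled down into the original range. This is a minimal illustration and not the LongQLoRA implementation; the function name `rope_angles`, the lengths 4096 and 8192, and the head dimension 128 are assumptions chosen for the example.

```python
def rope_angles(position, head_dim, base=10000.0, scale=1.0):
    """Return the RoPE rotation angles for a single token position.

    A scale below 1 applies Position Interpolation: the position index
    is compressed before the angles are computed.
    """
    pos = position * scale
    return [pos / (base ** (2 * i / head_dim)) for i in range(head_dim // 2)]


# Assumed lengths, for illustration only.
ORIGINAL_CTX = 4096
EXTENDED_CTX = 8192
PI_SCALE = ORIGINAL_CTX / EXTENDED_CTX  # 0.5

# The last position of the extended window, after interpolation.
last_extended = rope_angles(EXTENDED_CTX - 1, head_dim=128, scale=PI_SCALE)

# The first angle equals the (scaled) position itself, so this checks that
# the interpolated position stays inside the range seen during pretraining.
assert last_extended[0] < ORIGINAL_CTX
```

With the scale applied, position 8192 in the extended window produces exactly the same angles as position 4096 did in the original window, which is why only light fine-tuning is typically needed afterwards.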

Building Local Network Search Agents with Phidata and Ollama

Background: Attempting to build search Agents based on a local Agent framework.

Reference Website: https://docs.phidata.com/tools/website

Basic Environment: Command line tools (Linux/Mac), Python 3 (set up an independent conda environment).

Basic LLM: Download and install from the Ollama official website (if you have a ChatGPT membership, you can also use ChatGPT).

AI Agent Framework: This time we … Read more

Running Phi-4 Model with Ollama and Python Calls

Install Ollama: Select the installation method that matches your operating system. Taking Linux (CentOS) as an example, install with the command `curl -fsSL https://ollama.com/install.sh | sh` (requires sudo privileges). After installation, verify that it succeeded by running `ollama --version`. Start Ollama: After successful installation, start the Ollama service … Read more
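The title mentions calling the model from Python, which the excerpt does not reach. A minimal sketch of such a call, using only the standard library against Ollama's local REST endpoint, might look like the following. It assumes the Ollama service is running on its default port 11434 and that a model tagged `phi4` has already been pulled; the helper names `build_request` and `generate` are hypothetical.

```python
import json
import urllib.request

# Default local endpoint of the Ollama service (assumes default port 11434).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="phi4"):
    """Build a POST request for a single, non-streaming completion."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, model="phi4", timeout=120):
    """Send the prompt to the local Ollama service and return the reply text."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]


# Example call (requires a running Ollama service with the model available):
# print(generate("Explain what Ollama does in one sentence."))
```

Setting `"stream": False` makes the service return one JSON object instead of a stream of chunks, which keeps the parsing to a single `json.loads` call.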

Quickly Build an Agent with Llama-Index

Meow! In the previous article, we used Tongyi Qianwen and four system-level prompts to create an intelligent customer service agent with four major functions. This article builds an upgraded agent by calling Tongyi Qianwen and combining it with Llama-Index. First, let’s implement the simplest example using ReActAgent and FunctionTool to create a … Read more

Llama 3.3: Meta AI Releases New Text-Based Language Model

🚀 Quick Read

Model Parameters: Llama 3.3 has 70B parameters, with performance comparable to the 405B-parameter Llama 3.1.

Multilingual Support: Supports input and output in 8 languages, including English, German, and French.

Application Scenarios: Suitable for chatbots, customer service automation, language translation, and various other scenarios.

Main Content

What is Llama 3.3 … Read more

Tang Guoliang Llama Model Architecture: Theory to Practice

A course: Tang Guoliang Llama Model Architecture: From Theory to Practice. In today’s era of rapid advancement in artificial intelligence, … Read more

Transforming Text to SQL with LLaMA2: A Local LLM Guide

With the rapid development of large model technology, how to make full use of AI while protecting data privacy has become a hot topic. Open-source local large language models (LLMs) are gradually becoming an important tool for solving this problem. Today, we introduce a star open-source model, LLaMA2, and see how it seamlessly implements the “text to … Read more