Ollama 0.5.7 Deployment Guide: Easily Build Your AI Assistant!

Introduction

Do you want to run large language models locally and create your own AI assistant? The latest Ollama 0.5.7 version makes this easier than ever. By following the steps below, you will easily complete the deployment and embark on an intelligent journey! 💡

🎯 What is Ollama?

Ollama is a tool designed to help users run large language models in a local environment. Whether you are a developer or an AI enthusiast, you can easily deploy and use various models through it.

Main features:

  • Simple and Easy to Use: Provides an intuitive command-line interface that is easy to operate.

  • Multi-Platform Support: Compatible with macOS, Linux, and Windows systems.

  • Rich Model Selection: Supports various pre-trained models to meet different needs.

πŸ› οΈ Steps to Deploy Ollama

Step 1: Environment Preparation

Ensure your system meets the following requirements:

  • Operating System: macOS, Linux, or Windows.

  • Memory: At least 8GB of RAM to run 7B models, 16GB for 13B models, and 32GB for 33B models (a quick way to check this is shown below).

  • Network Connection: Needed to download necessary dependencies and model files.

Note: Currently, Windows systems only provide preview support.
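
If you are not sure how much memory your machine has, you can check it from the terminal before installing. A minimal sketch is shown below; the exact command depends on your operating system.

    # Linux: show total and available memory in human-readable units
    free -h

    # macOS: print total physical memory in bytes
    sysctl -n hw.memsize

    # Windows (PowerShell): print total physical memory in bytes
    (Get-CimInstance Win32_ComputerSystem).TotalPhysicalMemory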

Step 2: Install Ollama

Choose the appropriate installation method based on your operating system:

  • macOS and Windows:

    • Visit the Ollama official website to download the installation package for your system and follow the prompts to complete the installation.

  • Linux:

    • Open the terminal and run the following command:

    • curl -fsSL https://ollama.com/install.sh | sh

  • Tip: If manual installation is needed, please refer to the manual installation instructions.

  • Docker:

    • If you prefer to use Docker, you can run the following command to start the Ollama container:

    • docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

  • Note: The above command maps the Ollama service to port 11434 on localhost. Please ensure this port is not already in use (a quick way to verify the installation is shown below).
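
Whichever installation method you chose, it is worth confirming that the CLI is installed and the local service is reachable before moving on. A minimal check is sketched below; the GPU-enabled Docker variant assumes the NVIDIA Container Toolkit is installed on the host.

    # Confirm the CLI is installed and print its version (should report 0.5.7)
    ollama --version

    # Confirm the local service is responding on the default port
    curl http://localhost:11434

    # Optional: start the Docker container with NVIDIA GPU access instead
    docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama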

Step 3: Download and Run Models

After installation, you can use the pre-trained models provided by Ollama. Here are some example models and how to download and run them:

| Model | Parameters | Size | Command |
| --- | --- | --- | --- |
| Llama 3.3 | 70B | 43GB | ollama run llama3.3 |
| Llama 3.2 | 3B | 2.0GB | ollama run llama3.2 |
| Llama 3.2 | 1B | 1.3GB | ollama run llama3.2:1b |
| Llama 3.2 Vision | 11B | 7.9GB | ollama run llama3.2-vision |
| Llama 3.2 Vision | 90B | 55GB | ollama run llama3.2-vision:90b |
| Llama 3.1 | 8B | 4.7GB | ollama run llama3.1 |
| Llama 3.1 | 405B | 231GB | ollama run llama3.1:405b |
| Phi 4 | 14B | 9.1GB | ollama run phi4 |
| Phi 3 Mini | 3.8B | 2.3GB | ollama run phi3 |
| Gemma 2 | 2B | 1.6GB | ollama run gemma2:2b |
| Gemma 2 | 9B | 5.5GB | ollama run gemma2 |
| Gemma 2 | 27B | 16GB | ollama run gemma2:27b |
| Mistral | 7B | 4.1GB | ollama run mistral |
| Moondream 2 | 1.4B | 829MB | ollama run moondream |
| Neural Chat | 7B | 4.1GB | ollama run neural-chat |
| Starling | 7B | 4.1GB | ollama run starling-lm |
| Code Llama | 7B | 3.8GB | ollama run codellama |
| Llama 2 Uncensored | 7B | 3.8GB | ollama run llama2-uncensored |
| LLaVA | 7B | 4.5GB | ollama run llava |
| Solar | 10.7B | 6.1GB | ollama run solar |

Tip: For more available models and their details, please visit the Ollama Model Library.
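
Besides ollama run, a few everyday commands make it easier to manage the models stored on disk. The model names below are only examples:

    # Download a model without starting an interactive session
    ollama pull llama3.2

    # List the models already downloaded locally
    ollama list

    # Remove a model you no longer need to free disk space
    ollama rm llama3.2:1b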

Step 4: Test the Model

After running the above command, you can interact with the model in the terminal. For example:

<span class="language-plaintext">ollama run llama3. Enter your question, and the model will return the corresponding answer.</span>

🌟 Advanced Features

  1. Custom Models

You can import models from local files or customize existing models. For example:

  • Import from GGUF File:

    • Create a file named Modelfile with the following content:

    • FROM ./your-model-file.gguf

    • Then run:

    • ollama create your_model_name -f Modelfile

  • Custom Prompts:

    • Create a Modelfile with the following content:

    • FROM llama3.2

      PARAMETER temperature 1

      SYSTEM """
      You are an expert AI assistant and a senior software developer. You have extensive knowledge covering various programming languages, frameworks, and best practices. Your goal is to help users solve various technical problems and provide efficient, concise solutions.
      """

    • Then run:

    • ollama create custom_llama -f Modelfile

      ollama run custom_llama

  2. Use REST API Calls

Ollama provides a REST API for developers to integrate easily. After starting the Ollama service, you can use the following command to generate a response:

<span class="language-plaintext">curl http://localhost:11434/api/generate -d '{</span><span class="language-plaintext"> "model": "llama3.2",</span><span class="language-plaintext"> "prompt": "Why is the sky blue?"</span><span class="language-plaintext">}'</span>

Note: Ensure your Ollama service is running and listening on the correct port.
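
By default, /api/generate streams the answer back as a sequence of JSON lines. If you prefer a single JSON response, you can disable streaming, and there is also a chat-style endpoint that accepts a message history. Both calls are sketched below with example field values:

    # Single, non-streaming response from the generate endpoint
    curl http://localhost:11434/api/generate -d '{
      "model": "llama3.2",
      "prompt": "Why is the sky blue?",
      "stream": false
    }'

    # Chat endpoint that takes a list of messages
    curl http://localhost:11434/api/chat -d '{
      "model": "llama3.2",
      "messages": [
        { "role": "user", "content": "Why is the sky blue?" }
      ]
    }'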

πŸ“ Conclusion

By following the steps above, you can successfully deploy and run Ollama 0.5.7 locally. Whether for research or application development, this is a powerful tool. Start exploring and create your own AI assistant!

Project Address

https://github.com/ollama/ollama

Official Documentation

https://ollama.readthedocs.io/quickstart/
