Ollama: Running Large Language Models Locally

A comprehensive guide to installation, supported models, and use cases

Introduction to Ollama

Ollama is an open-source framework that allows you to run large language models (LLMs) and vision models locally on your own hardware. It provides a simple interface to download, manage, and interact with these models, offering a range of powerful AI capabilities without requiring cloud services or complex setups.

With Ollama, you can:

  - Download and run models such as Llama 3.2, Gemma 3, and Mistral with a single command
  - Customize model behavior using Modelfiles
  - Interact with models through the CLI or a REST API
  - Work with multimodal models that understand images
  - Keep your data entirely on your own hardware

Whether you're a developer looking to integrate AI into your applications, a researcher experimenting with different models, or just someone interested in exploring the capabilities of large language models without sending data to third-party services, Ollama provides an accessible solution.

Installation

Ollama is available for macOS, Windows, and Linux, with Docker support as well. Here are the installation instructions for each platform:

macOS

Ollama supports macOS 11 Big Sur or later.

  1. Visit ollama.com/download and click on the macOS download link
  2. Open the downloaded zip file
  3. Drag Ollama to your Applications folder
  4. Launch Ollama from your Applications folder

Alternatively, if you use Homebrew:

brew install ollama

Windows

Ollama provides a Windows installer that simplifies the setup process:

  1. Visit ollama.com/download and download the Windows installer
  2. Run the downloaded OllamaSetup.exe file
  3. Follow the installation wizard
  4. After installation, Ollama will be available from the Start menu

Linux

For Linux, you can use the provided installation script:

curl -fsSL https://ollama.com/install.sh | sh

This script will install Ollama on your Linux system. For manual installation options, refer to the Linux installation documentation.

Docker

The official Ollama Docker image is available on Docker Hub:

docker pull ollama/ollama

To run Ollama using Docker:

docker run -d -p 11434:11434 ollama/ollama

This will start the Ollama server on port 11434.

Note: After installing Ollama, the service will run in the background. You can interact with it using the Ollama CLI or the REST API.

Supported Models

Ollama provides access to a diverse library of models available at ollama.com/library. Here's a selection of key models you can download and run locally:

Model             Parameters  Size    Download Command
Gemma 3           1B          815MB   ollama run gemma3:1b
Gemma 3           4B          3.3GB   ollama run gemma3
Gemma 3           12B         8.1GB   ollama run gemma3:12b
Gemma 3           27B         17GB    ollama run gemma3:27b
Llama 3.2         3B          2.0GB   ollama run llama3.2
Llama 3.2         1B          1.3GB   ollama run llama3.2:1b
Llama 3.2 Vision  11B         7.9GB   ollama run llama3.2-vision
Llama 3.1         8B          4.7GB   ollama run llama3.1
Phi 4             14B         9.1GB   ollama run phi4
Mistral           7B          4.1GB   ollama run mistral
Moondream 2       1.4B        829MB   ollama run moondream
Neural Chat       7B          4.1GB   ollama run neural-chat
Code Llama        7B          3.8GB   ollama run codellama
LLaVA             7B          4.5GB   ollama run llava

Hardware Requirements: You should have at least 8 GB of RAM available to run 7B models, 16 GB for 13B models, and 32 GB for 33B models.
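The RAM guidance above can be encoded in a small helper. This is a sketch of the rule of thumb, not a hard limit; actual usage also depends on quantization and context length:

```python
def min_ram_gb(params_billions: float) -> int:
    """Approximate RAM needed for a model of the given parameter count,
    following the guideline: 8 GB for ~7B, 16 GB for ~13B, 32 GB for ~33B."""
    if params_billions <= 7:
        return 8
    if params_billions <= 13:
        return 16
    return 32

print(min_ram_gb(7))   # 8
print(min_ram_gb(33))  # 32
```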

Model Categories

Ollama supports several types of models, including:

  - General-purpose chat models such as Llama 3.2, Gemma 3, Mistral, and Phi 4
  - Code-specialized models such as Code Llama
  - Vision (multimodal) models such as LLaVA, Llama 3.2 Vision, and Moondream

Basic Usage

Running a Model

Once Ollama is installed, you can run models with a simple command:

ollama run llama3.2

This will download the model if you don't already have it, then start an interactive chat session.

Chat with a Model

After running a model, you'll see a prompt where you can enter text to converse with the model:

>>> Why is the sky blue?
The sky appears blue due to a phenomenon called Rayleigh scattering. When sunlight travels through the atmosphere, it interacts with air molecules and other tiny particles. These particles scatter the sunlight in all directions. Blue light has a shorter wavelength compared to other colors in the visible spectrum, which causes it to scatter more easily when it collides with air molecules. This scattered blue light then reaches our eyes from all directions in the sky, giving it the blue appearance we observe. During sunrise and sunset, the sky often appears red or orange because the blue light gets scattered away from our line of sight as sunlight has to travel through more of the atmosphere to reach us, allowing the longer wavelength red and orange light to dominate what we see.

Basic CLI Commands

Here are some fundamental commands for working with Ollama:

  - ollama run <model>: start an interactive session, downloading the model first if needed
  - ollama pull <model>: download or update a model without running it
  - ollama list: list the models installed locally
  - ollama ps: show which models are currently loaded
  - ollama stop <model>: unload a running model
  - ollama rm <model>: delete a local model
  - ollama create <name> -f Modelfile: build a custom model from a Modelfile

Advanced Features

Customizing Models with Modelfiles

You can create custom models using a Modelfile, which allows you to modify parameters, set system messages, and more.

Create a file named Modelfile with the following content:

FROM llama3.2

# Set the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1

# Set the system message
SYSTEM """
You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.
"""

Then create and run the model:

ollama create mario -f ./Modelfile
ollama run mario

Using the REST API

Ollama provides a REST API for programmatically interacting with models:

Generate a response:

curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Why is the sky blue?"
}'
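By default, /api/generate streams its reply as newline-delimited JSON objects, each carrying a fragment of the text in a response field, with "done": true on the final object. A minimal Python sketch of reassembling those fragments; the sample lines below are illustrative stand-ins, not real server output:

```python
import json

def collect_stream(ndjson_lines):
    """Join the 'response' fragments from a streamed /api/generate reply.
    Each line is a JSON object; the final one has "done": true."""
    text = []
    for line in ndjson_lines:
        chunk = json.loads(line)
        text.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(text)

# Abbreviated example of what a streamed reply looks like:
sample = [
    '{"model":"llama3.2","response":"The sky ","done":false}',
    '{"model":"llama3.2","response":"is blue.","done":true}',
]
print(collect_stream(sample))  # The sky is blue.
```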

Chat with a model:

curl http://localhost:11434/api/chat -d '{
  "model": "llama3.2",
  "messages": [
    { "role": "user", "content": "why is the sky blue?" }
  ]
}'
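The chat endpoint is stateless: the client preserves context by resending the full messages list on every turn. A minimal Python sketch of maintaining that history; the assistant reply here is a hard-coded placeholder, not a real model response:

```python
import json

def add_turn(messages, role, content):
    """Append one turn to the running conversation history."""
    messages.append({"role": role, "content": content})
    return messages

# Build the request body for a follow-up question that depends on context:
history = []
add_turn(history, "user", "Why is the sky blue?")
add_turn(history, "assistant", "Because of Rayleigh scattering.")  # placeholder reply
add_turn(history, "user", "Does the same effect explain sunsets?")

body = json.dumps({"model": "llama3.2", "messages": history})
```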

Structured Outputs

Ollama supports structured outputs that allow you to define the format of responses using JSON schema:

curl -X POST http://localhost:11434/api/chat -H "Content-Type: application/json" -d '{
  "model": "llama3.1",
  "messages": [{"role": "user", "content": "Tell me about Canada."}],
  "stream": false,
  "format": {
    "type": "object",
    "properties": {
      "name": { "type": "string" },
      "capital": { "type": "string" },
      "languages": {
        "type": "array",
        "items": { "type": "string" }
      }
    },
    "required": ["name", "capital", "languages"]
  }
}'

This produces a structured response like:

{
  "capital": "Ottawa",
  "languages": ["English", "French"],
  "name": "Canada"
}
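Because the format field constrains the reply to the schema, the client can parse it with an ordinary JSON parser instead of scraping free text. A small Python sketch using the response above:

```python
import json

raw = '{ "capital": "Ottawa", "languages": [ "English", "French" ], "name": "Canada" }'
data = json.loads(raw)

# The schema marked these fields as required, so they can be accessed directly:
assert set(data) >= {"name", "capital", "languages"}
print(f'{data["name"]}: capital {data["capital"]}, '
      f'languages {", ".join(data["languages"])}')
```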

Working with Multimodal Models

Vision models such as LLaVA and Llama 3.2 Vision can process images along with text:

ollama run llava "What's in this image? /path/to/image.png"
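Over the REST API, images are sent inline as base64-encoded strings in an images field. A Python sketch of building such a request; the byte string below stands in for a real image file, and no request is actually sent:

```python
import base64
import json

def image_prompt(model, prompt, image_bytes):
    """Build a /api/generate request body with an inline image.
    The API expects images as a list of base64-encoded strings."""
    return json.dumps({
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
    })

# Illustrative bytes standing in for the contents of a real PNG file:
body = image_prompt("llava", "What's in this image?", b"\x89PNG...")
```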

Use Cases

Ollama enables a wide range of local AI applications. Here are some popular use cases:

🤖 Chatbots & Assistants

Create personalized AI assistants for various tasks, from answering questions to providing recommendations, all while keeping your conversations private and running locally.

💻 Coding Assistant

Use Code Llama or other code-specialized models to help with programming tasks, debug code, explain algorithms, or generate code snippets without sending your proprietary code to external services.

📊 Data Analysis & Extraction

Extract structured information from documents, summarize reports, or analyze data trends with models equipped with structured output capabilities.

🔍 Local Search & RAG

Implement Retrieval-Augmented Generation (RAG) systems that can search through and reason about your local documents, providing accurate answers based on your private data.
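The retrieval step of a RAG system reduces to ranking document embeddings by similarity to a query embedding. A toy Python sketch with hand-made vectors; a real pipeline would obtain embeddings from an embedding model, for example via Ollama's embeddings API:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(query_vec, doc_vecs, docs, k=1):
    """Return the k documents whose embeddings are most similar to the query."""
    ranked = sorted(zip(doc_vecs, docs),
                    key=lambda p: cosine(query_vec, p[0]), reverse=True)
    return [doc for _, doc in ranked[:k]]

# Toy embeddings; in practice they would come from an embedding model.
docs = ["Invoices are due in 30 days.", "The office is closed on Fridays."]
vecs = [[1.0, 0.1], [0.1, 1.0]]
print(retrieve([0.9, 0.2], vecs, docs))  # ['Invoices are due in 30 days.']
```

The retrieved passages are then prepended to the prompt so the model answers from your documents rather than from memory alone.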

👁️ Image Analysis

Use vision models to analyze images, extract information, generate descriptions, or identify objects without sending potentially sensitive visual data to cloud services.

🎮 Gaming & Interactive Fiction

Create dynamic game characters or interactive storytelling experiences with customized models that can maintain context and generate creative responses.

🧠 Learning & Education

Develop personalized tutoring systems or educational tools that can explain concepts, answer questions, and adapt to individual learning styles.

📝 Content Creation

Generate blog posts, marketing copy, creative writing, or other content while maintaining full control over the generation process.

🔄 Workflow Automation

Automate repetitive tasks by integrating Ollama with scripts and tools to process and transform data, generate reports, or respond to events.

Real-world Examples

Here are some specific examples of how people are using Ollama in real-world scenarios:

  - Running a private, ChatGPT-style assistant through a local web UI such as Open WebUI
  - Powering in-editor code completion and chat with extensions like Continue for VS Code
  - Building RAG pipelines over internal documents with frameworks such as LangChain or LlamaIndex
  - Batch-processing text (summarization, tagging, extraction) in scripts via the REST API

Integration Ecosystem

Ollama has a rich ecosystem of integrations and libraries that extend its functionality:

Libraries for Developers

Official client libraries make it easy to call Ollama from code:

  - ollama-python: the official Python library
  - ollama-js: the official JavaScript/TypeScript library
  - Community-maintained libraries exist for many other languages, including Go, Java, and Rust

Framework Integrations

Ollama plugs into popular LLM application frameworks, including:

  - LangChain and LangChain.js
  - LlamaIndex
  - Haystack

User Interfaces

Many community-built UIs are available for Ollama, including:

  - Open WebUI: a full-featured, self-hosted chat interface
  - LibreChat: a multi-provider chat UI with Ollama support
  - Enchanted: a native macOS and iOS client

Tips and Best Practices

Performance Optimization

  - Choose a model size that fits your hardware: roughly 8 GB of RAM for 7B models, 16 GB for 13B models, and 32 GB for 33B models
  - Prefer quantized variants when memory is tight; they trade a small amount of quality for a much smaller footprint
  - Use a GPU where available; Ollama automatically offloads model layers to supported GPUs
  - Keep frequently used models loaded to avoid the latency of reloading them for each request

Prompt Engineering

  - Set a clear system message (via a Modelfile or the API) to fix the model's role and tone
  - Tune the temperature parameter: lower values give more deterministic output, higher values more creative output
  - Show the desired output format by example in the prompt, or use structured outputs when you need machine-readable responses

Security Considerations

  - By default, the Ollama server listens only on localhost (port 11434); do not expose it to untrusted networks without adding authentication or a reverse proxy
  - Anyone with access to the API can run models and read responses, so treat the endpoint like any other internal service
  - Local execution keeps your prompts and data on your own hardware, but download models and Modelfiles only from sources you trust

Conclusion

Ollama represents a significant step forward in making powerful AI models accessible to everyone. By enabling local execution of LLMs, it addresses privacy concerns, reduces dependency on cloud services, and opens up new possibilities for AI integration in various applications.

Whether you're a developer looking to build AI-powered applications, a researcher experimenting with language models, or just someone interested in exploring what these models can do, Ollama provides an easy-to-use platform that puts the power of state-of-the-art AI in your hands.

As the field of AI continues to evolve rapidly, tools like Ollama will play an increasingly important role in democratizing access to these technologies and enabling innovation across industries.

Resources

  - Official website and model library: ollama.com and ollama.com/library
  - Source code and documentation: github.com/ollama/ollama
  - REST API reference: github.com/ollama/ollama/blob/main/docs/api.md
