Ollama - Run LLMs Locally with Ease

Large Language Models (LLMs) are revolutionizing how we interact with machines, but running them often requires internet access or reliance on cloud services. Enter Ollama, an open-source solution for running LLMs locally, securely, and offline.

🔍 What is Ollama?

Ollama allows you to run LLMs like LLaMA, Mistral, and others right on your own machine, using Docker containers or the native CLI. Unlike cloud-based services, Ollama provides:

  • Better privacy: Data never leaves your machine.
  • Lower latency: Faster responses, no network delay.
  • Offline capability: Models can run without internet once downloaded.

🚀 Key Use Cases

  • ✅ Rapid AI prototyping without cloud dependencies
  • 🔒 Local and privacy-sensitive applications
  • 🔌 Full offline use after initial download
  • 🤖 Powering AI tools, automations, and workflows on your desktop

🐳 Running Ollama with Docker

Start by running the Ollama container with the following command:

sudo docker run -d \
  -v "$(pwd)/ollama:/root/.ollama" \
  -p 11434:11434 \
  --name ollama \
  ollama/ollama

This launches the container in detached mode, mounts an ollama directory from your current working directory for model storage ($(pwd) expands it to the absolute path Docker expects for bind mounts), and exposes the Ollama service on port 11434.
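
To confirm the service came up, you can check the container and hit the API root; the Ollama server answers with a plain "Ollama is running" status message. A quick sanity check, assuming the default port mapping from the command above:

# Verify the container is running
sudo docker ps --filter name=ollama

# The Ollama server listens on port 11434 and returns a simple status string
curl http://localhost:11434
# Expected output: Ollama is running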

▶️ Running a Model (e.g., LLaMA 3)

Once the container is running, execute:

sudo docker exec -it ollama ollama run llama3

  • The model will download and initialize (this may take a few minutes).
  • You'll get an interactive prompt to start chatting with the model.
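
If you prefer to fetch the model weights ahead of time, or to see what is already cached in the mounted volume, the same CLI inside the container also provides pull and list. A short sketch, reusing the llama3 tag from above:

# Download the model without opening a chat session
sudo docker exec -it ollama ollama pull llama3

# List the models currently stored locally
sudo docker exec -it ollama ollama list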

🧩 Accessing Ollama

After deployment, you can interact with Ollama in multiple ways:

  • Command Line Interface via docker exec
  • Local REST API: http://localhost:11434 (see the example request after this list)
  • UI Interface via Open WebUI (explained below)
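
As an example of the REST API, the request below calls the /api/generate endpoint for a single completion. A minimal sketch, assuming the llama3 model pulled earlier and the default port; setting "stream" to false returns one JSON response instead of a token stream:

curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?",
  "stream": false
}'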

🌐 Open WebUI - Graphical Interface for Ollama

To make interaction easier, use Open WebUI:

  1. Identify the Ollama container's IP, e.g., 172.17.0.2 (a lookup command is shown after this list)
  2. Run the UI container:
sudo docker run -d \
  -p 3000:8080 \
  -e OLLAMA_BASE_URL=http://172.17.0.2:11434 \
  -v "$(pwd)/open-webui:/app/backend/data" \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:main
  3. Open your browser and visit:
    http://localhost:3000
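
One way to look up the container IP referenced in step 1 is docker inspect; on Docker's default bridge network the address is typically in the 172.17.0.x range. A small helper, assuming the ollama container name used earlier:

sudo docker inspect -f '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' ollama

Alternatively, if you attach both containers to the same user-defined Docker network, you can set OLLAMA_BASE_URL to http://ollama:11434 and let Docker's DNS resolve the container name instead of hard-coding an IP.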

🕒 Note: The UI might take up to 5 minutes to fully initialize.
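
If the page is not reachable right away, following the container logs is an easy way to watch startup progress. A simple check, assuming the open-webui container name used above:

sudo docker logs -f open-webui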


💬 Example Prompts

Once your model is running, try asking questions or giving instructions like:

  • "Summarize this article..."
  • "Write a Python script for file renaming"
  • "Explain quantum computing like I’m five"

Everything runs locally, with no external API calls involved.
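
You can also pass a prompt directly on the command line for a one-shot answer instead of an interactive session. A small sketch, reusing the llama3 model from earlier:

sudo docker exec -it ollama ollama run llama3 "Explain quantum computing like I'm five"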


📚 References