Phase 2: Local LLM RAG Demo (Open Source)

SovAI Air-Gap AI Starter v0.2 with Ollama

Run private RAG with local documents, a local Ollama model, citations, and audit logs.


Expected outcomes

Run a private RAG workflow with a local Ollama model

Keep documents, prompts, answers, citations, and audit logs local

Compare retrieval-only fallback with local LLM generation

What it includes

Ollama llama3.2:1b setup guidance

Grounded RAG answer flow

Citation return pattern

Offline runtime preparation

Status and troubleshooting commands

Audit logging structure

What this kit proves

These are the practical claims the kit demonstrates when run locally.

A local LLM can generate answers without an external LLM API.

Private RAG can run over local documents.

Model output can be grounded in retrieved context.

Citations can be attached to the answer.

The app can capture model usage and citations in audit logs.

Ollama can act as a laptop-friendly local model runtime.

Local LLM RAG flow

Step 1: Pull llama3.2:1b using Ollama while internet is available.

Step 2: Start the Docker application runtime.

Step 3: Retrieve relevant local document chunks with the built-in retriever.

Step 4: Build a grounded prompt using only approved local evidence.

Step 5: Call Ollama locally through host.docker.internal:11434.

Step 6: Return a generated answer with citations and write an audit record.
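Steps 4 and 5 can be sketched in shell as follows. This is a minimal illustration under stated assumptions, not the kit's actual implementation: the helper names build_grounded_prompt and ask_ollama, the "source|text" chunk format, and the prompt wording are all hypothetical. The request goes to Ollama's /api/generate endpoint with streaming disabled, and curl and jq are assumed to be installed. The default URL below is for calls made from the host; inside the container the kit uses host.docker.internal:11434.

```shell
# Hypothetical helper: build a grounded prompt from retrieved chunks.
# Usage: build_grounded_prompt "question" "source|chunk text" ...
# The "source|text" argument format is an assumption made for this sketch.
build_grounded_prompt() {
  gp_question=$1
  shift
  printf 'Answer using only the numbered context below and cite sources as [n].\n'
  printf 'If the context does not contain the answer, say so.\n\n'
  gp_n=1
  for gp_chunk in "$@"; do
    # Text before the first "|" is the source label; the rest is the chunk text.
    printf '[%d] (%s) %s\n' "$gp_n" "${gp_chunk%%|*}" "${gp_chunk#*|}"
    gp_n=$((gp_n + 1))
  done
  printf '\nQuestion: %s\n' "$gp_question"
}

# Hypothetical helper: send a prompt (read from stdin) to a local Ollama
# server and print the generated answer. Requires curl and jq.
ask_ollama() {
  ao_url=${LLM_BASE_URL:-http://localhost:11434}
  ao_model=${OLLAMA_MODEL:-llama3.2:1b}
  # jq -Rs reads all of stdin as one string, so the prompt is JSON-encoded safely.
  jq -Rsc --arg model "$ao_model" '{model: $model, prompt: ., stream: false}' |
    curl -fsS "$ao_url/api/generate" -d @- |
    jq -r '.response'
}

# Example (needs a running Ollama server with the model pulled):
#   build_grounded_prompt "What is the retention period?" \
#     "policy.md|Records are kept for 90 days." | ask_ollama
```

Numbering the chunks in the prompt gives the model a stable label to cite, which the application can then map back to the source documents when it returns citations.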

Quick start

Copy these commands into a terminal and follow the connected/offline steps described in the repository README.

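# While connected: pull the model, fetch the kit, and prepare the offline runtime
ollama pull llama3.2:1b
git clone https://github.com/sovaihub-lab/sovai-airgap-ai-starter-v0.2-ollama
cd sovai-airgap-ai-starter-v0.2-ollama
chmod +x scripts/*.sh
./scripts/prepare-online.sh
# Disconnect from the internet and keep Ollama running
./scripts/bootstrap-offline.sh
# Open the UI (the open command is macOS-specific; on Linux use xdg-open or a browser)
open http://127.0.0.1:8080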

Configuration

Setting              Value
LLM_ENABLED          true
LLM_BASE_URL         http://host.docker.internal:11434
OLLAMA_MODEL         llama3.2:1b
SOVAI_OFFLINE_MODE   true
AUDIT_LOG_PATH       /app/data/audit/audit-log.jsonl
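The .jsonl extension on AUDIT_LOG_PATH suggests one JSON object per line. The sketch below shows how such a record might be appended from a shell script. It is illustrative only: the helper names, the field names (ts, model, question, citations), and the fallback path are assumptions, and the kit's real audit schema may differ.

```shell
# Escape backslashes and double quotes so a value can sit inside a JSON string.
# Simplification: control characters such as newlines are not handled.
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Hypothetical helper: append one audit record as a single JSON line.
# Usage: write_audit_record "model" "question" "comma-separated citations"
# Field names are illustrative; the kit's real schema may differ.
write_audit_record() {
  # Fall back to a relative path when AUDIT_LOG_PATH is not set (assumption).
  wa_path=${AUDIT_LOG_PATH:-./data/audit/audit-log.jsonl}
  mkdir -p "$(dirname "$wa_path")"
  printf '{"ts":"%s","model":"%s","question":"%s","citations":"%s"}\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
    "$(json_escape "$1")" "$(json_escape "$2")" "$(json_escape "$3")" \
    >> "$wa_path"
}
```

Because each record is a complete line, the most recent entry can be inspected with tail -n 1 on the log file, and the log can be appended to without rewriting earlier records.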

Ports and services

Service     Port    Purpose
SovAI app   8080    Local RAG UI and API
Ollama      11434   Local model API on the host machine

Troubleshooting notes

The model is not available

Run ollama list and confirm llama3.2:1b exists. If missing, run ollama pull llama3.2:1b while connected.
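The check above can be scripted. The helper below is a small sketch, and its name model_listed is hypothetical: it reads ollama list output on stdin and succeeds only if the requested model appears in the first (NAME) column, assuming the first row is a header.

```shell
# Hypothetical helper: succeed if the given model appears in `ollama list` output.
# Reads the listing from stdin and compares the first column (NAME),
# skipping the header row.
model_listed() {
  awk -v want="$1" 'NR > 1 && $1 == want { found = 1 } END { exit !found }'
}

# Example (run while connected, since the pull needs internet access):
#   ollama list | model_listed llama3.2:1b || ollama pull llama3.2:1b
```

Matching the whole first column avoids false positives from partial names, for example a different tag of the same model family.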

The Docker app cannot reach Ollama

The container reaches Ollama at http://host.docker.internal:11434. If host access fails on Windows, set OLLAMA_HOST to 0.0.0.0:11434 and restart Ollama.
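A quick way to confirm reachability is to request Ollama's model-list endpoint, /api/tags, from wherever the app runs. The helper below is a sketch with a hypothetical name, ollama_reachable, and it assumes curl is available in that environment. On Linux with Docker Engine, host.docker.internal is not defined by default and is commonly mapped with --add-host=host.docker.internal:host-gateway; whether the kit already configures this is not stated here.

```shell
# Hypothetical helper: succeed if an Ollama server answers at the given base URL.
# /api/tags is Ollama's endpoint for listing local models. Requires curl.
ollama_reachable() {
  curl -fsS --max-time 5 "${1:-http://host.docker.internal:11434}/api/tags" \
    > /dev/null 2>&1
}

# Example, from inside the container:
#   ollama_reachable && echo "Ollama reachable" || echo "Ollama unreachable"
# Example, from the host:
#   ollama_reachable http://localhost:11434
```

If the check succeeds from the host but fails from the container, the problem is likely the host binding or the host.docker.internal mapping rather than the model itself.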

The first version was only document retrieval

Phase 2 adds the LLM generation step: retrieve local context, build prompt, call Ollama, return answer with citations.

Limitations

This is a local LLM RAG reference implementation, not a hardened enterprise deployment.

The default model llama3.2:1b is intentionally small for laptop testing.

The demo does not yet include enterprise RBAC, model scanning, signed images, or centralized observability.

Recommended next steps

Add a stronger local model

Swap llama3.2:1b for a larger Ollama model when hardware allows.

Move to Phase 3

Add an internal artifact hub so teams can build AI apps from approved internal sources.