Model Registry

Not every model qualifies for sovereign AI.

Sovereignty is not a model feature — it is a deployment decision. The same model can be fully sovereign when self-hosted or conditionally sovereign when managed by a cloud provider. This registry maps the distinction clearly.

View Model Registry Discuss Your Model Stack

Sovereignty Tiers

Three tiers. One question: who controls the inference?

Sovereignty is determined by where inference happens, who owns the runtime, and whether your data crosses a boundary you do not control.

Fully Sovereign

Model runs entirely on infrastructure you control. No data leaves your network boundary. Open weights, self-hosted, no licensing dependency on a third-party API.

Open-weight model you can download and self-host
Runs on your own hardware, VMs, or private cloud
Zero external API calls during inference
You own the runtime, the weights, and the output

Conditionally Sovereign

Model is hosted by a cloud provider within a dedicated tenant. Data does not leave your cloud account, but the model weights and runtime are controlled by the provider.

Hosted within your Azure, AWS, or GCP tenant
Data stays in your cloud region and account
Provider controls model weights and updates
Dependency on provider availability and pricing

Not Sovereign

Inference happens on the provider's shared infrastructure. Your prompts, documents, and responses leave your boundary and are processed externally.

API calls route to provider's shared servers
No control over where data is processed
Subject to provider's data retention policies
Not suitable for regulated or sensitive workloads

Fully Sovereign Models

Open-weight models you can self-host and fully control

These models can be downloaded, self-hosted on your own infrastructure, and run with zero external API calls. Your data never leaves your network boundary.

Model	Provider	Deployment	Context	Enterprise fit	Use cases
Llama 3.1 8B / 70B / 405B Most mature open-weight family for enterprise private deployment. All sizes available for self-hosting.	Meta (open weights)	Ollama · vLLM · OpenShift AI	128K	High	RAGAgentsSummarization
Mistral 7B / Mixtral 8×7B Highly efficient. Mixtral MoE architecture gives near-70B quality at lower compute cost.	Mistral AI (open weights)	Ollama · vLLM · Docker	32K	High	RAGClassificationCode
Mistral Nemo 12B Best-in-class at 12B scale. Long context makes it ideal for document intelligence pipelines.	Mistral AI (open weights)	Ollama · vLLM	128K	High	DocumentsRAGLong context
Phi-3 / Phi-3.5 Mini Exceptional quality-to-size ratio. Suitable for edge, laptop, or resource-constrained private deployments.	Microsoft (open weights)	Ollama · Edge · Docker	128K	Medium	EdgeLightweightClassification
Gemma 2 9B / 27B Strong reasoning capability. 27B model approaches GPT-3.5 quality in private deployment benchmarks.	Google (open weights)	Ollama · vLLM	8K	Medium	RAGReasoningSummarization
Qwen 2.5 7B / 72B Outstanding multilingual and code capability. 72B is competitive with GPT-4o on many enterprise tasks.	Alibaba (open weights)	Ollama · vLLM	128K	High	MultilingualCodeRAG
DeepSeek R1 / V3 Exceptional reasoning model. R1 rivals o1-class performance when self-hosted. Requires GPU infrastructure.	DeepSeek (open weights)	vLLM · OpenShift AI	64K	High	ReasoningAnalysisCode
Code Llama / StarCoder2 Purpose-built for code generation, completion, and review in private developer tooling.	Meta / BigCode (open weights)	Ollama · vLLM	16K	Medium	CodeDeveloperCompletion

Conditionally Sovereign Models

Cloud-managed models within your tenant boundary

These models run inside your Azure, AWS, or GCP account. Your data stays within your cloud tenant, but model weights and runtime are controlled by the provider.

Conditional sovereignty depends on your cloud agreement, region selection, and data processing terms. Always verify provider data residency commitments before using these models with sensitive data.

Model	Provider / Platform	Deployment	Context	Enterprise fit	Use cases
GPT-4o / GPT-4 Turbo Data stays within your Azure tenant and region. No training on your data by default. HIPAA and SOC2 eligible.	Microsoft via Azure OpenAI	Azure OpenAI Service	128K	High	RAGAgentsEnterprise
GPT-3.5 Turbo Lower cost option for high-volume RAG pipelines within Azure boundary. Good for summarization at scale.	Microsoft via Azure OpenAI	Azure OpenAI Service	16K	High	RAGSummarizationHigh volume
Claude 3.x (Haiku / Sonnet / Opus) Longest context window available in a managed tier. Data stays in your AWS account. Strong for document Q&A.	Anthropic via AWS Bedrock	AWS Bedrock	200K	High	DocumentsLong contextRAG
Llama 3 (Managed) Open weights, managed runtime. A middle path when self-hosting GPU infrastructure is not yet possible.	Meta via AWS Bedrock / Azure	AWS Bedrock · Azure AI	128K	High	RAGManagedTransition
Mistral Large / Small Managed Mistral inside Azure boundary. Useful when GPU ops team is not available but data must stay in Azure.	Mistral via Azure AI	Azure AI Foundry	32K	Medium	RAGManagedAzure

Not Sovereign — Public APIs

Models that cross your data boundary

These models process your data on shared provider infrastructure. They are not suitable for sovereign AI workloads involving sensitive, regulated, or confidential data.

Public API models are listed here for comparison, not recommendation. For many use cases without sensitive data, public APIs are practical. For sovereign AI, they are excluded by definition.

Model	Provider	Deployment	Context	Sovereign fit	Sovereign alternative
GPT-4o / GPT-4 Prompts and documents are processed on OpenAI shared infrastructure. Not suitable for private or regulated data.	OpenAI (direct API)	OpenAI API	128K	Not Sovereign	Azure OpenAI Service
Claude 3.x Processed on Anthropic's infrastructure. Data leaves your boundary. Use AWS Bedrock for a conditional alternative.	Anthropic (direct API)	Anthropic API	200K	Not Sovereign	AWS Bedrock
Gemini 1.5 Pro / Flash Inference on Google's shared servers. Use Vertex AI within your GCP project for a conditionally sovereign path.	Google (direct API)	Google AI Studio / API	1M	Not Sovereign	Vertex AI (GCP)

Deployment Runtimes

Where and how sovereign models run

The runtime determines the sovereignty level, performance envelope, and operational complexity. Choose based on your infrastructure maturity and compliance requirements.

Fully Sovereign

Ollama

Local and single-server model runtime. Ideal for development, proof-of-concept, and low-volume private deployments.

Best for: Development · Laptops · Small teams
Models: Llama 3, Mistral, Phi-3, Gemma 2, Qwen 2.5
Note: One-command model pull and serve. No GPU required for smaller models. Not designed for production scale.

Fully Sovereign

vLLM

High-throughput inference server with continuous batching. Purpose-built for GPU-accelerated production workloads.

Best for: Production · GPU servers · High concurrency
Models: Llama 3, Mistral, Qwen 2.5, DeepSeek, Gemma 2
Note: OpenAI-compatible API surface. Supports tensor parallelism for large models. Preferred for production private RAG.

Fully Sovereign

OpenShift AI

Red Hat's ML platform for enterprise Kubernetes deployments. Integrates model serving with observability and governance.

Best for: Enterprise · Kubernetes · Air-gapped
Models: Llama 3, Mistral, DeepSeek R1
Note: Preferred for regulated industries, air-gapped environments, and enterprises already using OpenShift.

Conditionally Sovereign

Azure OpenAI Service

Microsoft-managed OpenAI models within your Azure subscription and region. HIPAA, SOC2, and EU data boundary eligible.

Best for: Azure-aligned orgs · Compliance · Fast start
Models: GPT-4o, GPT-3.5 Turbo
Note: No self-managed GPU infrastructure required. Data stays in your Azure tenant. Provider controls model updates.

Conditionally Sovereign

AWS Bedrock

Managed foundation model API within your AWS account. Supports Claude, Llama, Mistral, and others with VPC integration.

Best for: AWS-aligned orgs · Multi-model · Compliance
Models: Claude 3, Llama 3, Mistral Large
Note: Data stays in your AWS account and region. No cross-account data sharing. Supports PrivateLink for VPC isolation.

Decision Guide

Which model path fits your situation?

Sovereignty requirements, infrastructure maturity, and compliance obligations determine the right deployment path — not model quality rankings.

Situation

You handle regulated data (HIPAA, GDPR, financial)

Recommendation

Fully Sovereign or Conditionally Sovereign only

Self-host on vLLM / OpenShift AI, or use Azure OpenAI / AWS Bedrock within your tenant

Situation

You have a GPU-equipped private server or Kubernetes cluster

Recommendation

Fully Sovereign — self-hosted

Llama 3.1 70B or Mistral Nemo on vLLM. OpenShift AI if enterprise Kubernetes is already in use.

Situation

You want private AI but don't have GPU infrastructure yet

Recommendation

Conditionally Sovereign — managed cloud

Azure OpenAI (if Azure-aligned) or AWS Bedrock (if AWS-aligned). Plan migration to self-hosted as GPU capacity grows.

Situation

You need to run AI on laptops or edge devices

Recommendation

Fully Sovereign — local runtime

Phi-3 Mini or Llama 3.1 8B on Ollama. Works without internet, ideal for field teams or disconnected environments.

Situation

You need maximum model quality right now

Recommendation

Conditionally Sovereign

GPT-4o via Azure OpenAI or Claude Sonnet via AWS Bedrock. Not fully sovereign but data stays in your cloud tenant.

Situation

You are building a prototype or internal demo

Recommendation

Start with Ollama locally, plan for vLLM in production

Llama 3.1 8B on Ollama for speed. Design the application layer to swap the model endpoint without rewriting the app.

Registry Principles

How SovAIHub evaluates models

This registry does not rank models on benchmark performance. It evaluates them on sovereignty, deployment control, and enterprise operational fit.

Boundary control

Where does inference happen? Who owns the compute? Can you prevent data from leaving your network? These questions determine sovereignty, not model size or capability.

Deployment operability

A sovereign model you cannot realistically operate is not a useful recommendation. Models are rated on whether enterprise teams can deploy, monitor, and maintain them.

Weight availability

Fully sovereign models require open or licensed weights you can download. A model that requires an external API for every inference is not self-hosted, regardless of marketing language.

Enterprise context fit

Context window, throughput, and accuracy on enterprise tasks (document Q&A, summarization, classification) matter more than general benchmark scores for private RAG workloads.

Regulatory alignment

Models deployed in regulated industries must support data residency, audit logging, and access control. The registry notes which deployment methods support these requirements.

Runtime independence

SovAIHub favors models that can be served across multiple runtimes (Ollama, vLLM, OpenShift AI) without vendor lock-in. Application code should route to a model endpoint, not a provider.

Next Step

Need help selecting and deploying the right model stack?

Model selection depends on your data classification, infrastructure, compliance requirements, and team capability. SovAIHub can help you map the right path.

Sovereign model architecture

Select the right model. Deploy it on infrastructure you control.

Share your use case, data sensitivity, infrastructure environment, and compliance requirements. We will help you identify a practical sovereign model deployment path.

Discuss Your Model Stack