TRD Sovereign — Self-hosted AI infrastructure for regulated industries

01 — Who it's for

Built for organizations that can't route data through external APIs.

When the data is citizen records, account holders, or patient histories, "send it to OpenAI" isn't an option — it's a compliance violation. TRD Sovereign runs the entire agent stack where the data already lives. The engagement model: $5k–50k/month per deployment, scoped per environment.

GOVERNMENT

Public sector

Deploy in your national cloud. Citizen data stays inside the country. Air-gap-friendly — the pod runs with zero outbound dependencies beyond an optional license check. UAE government agencies are a core target.

FINANCIAL

Banks & insurers

Meet RBI, MAS, DFSA, and EU regulator requirements. Run AI inference on data you legally cannot expose to a third-party endpoint. Indian PSU banks are a primary deployment target.

HEALTHCARE

Health systems

HIPAA-compatible deployment. Patient records never leave your private network — inference happens inside your perimeter, not someone else's. EU healthcare systems are an active focus.

02 — How deployment works

From scoping call to running pod, in a structured engagement.

No surprise integration work, no open-ended consulting. The path from first conversation to a deployed Sovereign pod is three defined stages, and the model, isolation mode, and region are all your choices.

STAGE 01

Scope & audit

A 30-minute scoping call covers your environment, hardware, and compliance needs. If it's a fit, a paid pre-deployment infrastructure audit follows — and the fee converts to a deposit on a signed contract.

30-min call → paid audit

STAGE 02

Configure the pod

You fill in a single config file: your domain, your model choice, your tenant isolation mode, your region. The deployment package handles the rest: backend, inference, database, monitoring, runbooks.

edit config.yaml → helm install
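As a rough illustration of what that single config file might carry, the sketch below models the four choices named in this stage (domain, model, tenant isolation mode, region) as a validated Python structure. The field names, the isolation-mode values, and the region codes are hypothetical and invented for this example; the actual config.yaml schema is not described on this page.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical allowed values; the real schema may differ.
ISOLATION_MODES = {"shared", "namespace", "dedicated"}
REGIONS = {"uae", "eu", "india", "us"}


@dataclass(frozen=True)
class PodConfig:
    domain: str
    model: str
    isolation_mode: str
    region: str

    def validate(self) -> List[str]:
        """Return a list of human-readable problems (empty means valid)."""
        problems = []
        if "." not in self.domain or " " in self.domain:
            problems.append(f"domain looks malformed: {self.domain!r}")
        if not self.model:
            problems.append("model must not be empty")
        if self.isolation_mode not in ISOLATION_MODES:
            problems.append(f"unknown isolation_mode: {self.isolation_mode!r}")
        if self.region not in REGIONS:
            problems.append(f"unknown region: {self.region!r}")
        return problems


# Example values only; the domain is a placeholder.
config = PodConfig(
    domain="ai.example-agency.gov",
    model="Qwen3-32B-FP8",
    isolation_mode="dedicated",
    region="uae",
)
```

Validating before install, rather than during it, means a typo in the region or isolation mode would surface as a readable message instead of a half-deployed pod.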

STAGE 03

Run inside your perimeter

The full stack comes up in your Kubernetes cluster or via Docker Compose. The only outbound connection is an optional license callback — it carries a token, never data. You hold the keys.

trd-sovereign status → all green
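One way a security team might verify the "only outbound connection" claim is to compare observed egress against a one-entry allowlist. The sketch below assumes you already have a list of outbound URLs from your own network logs; the license hostname is a placeholder, not the real endpoint, and the function name is made up for this example.

```python
from urllib.parse import urlsplit

# Hypothetical license endpoint; in an air-gapped pod this set would be empty.
ALLOWED_EGRESS_HOSTS = {"license.example.invalid"}


def egress_violations(outbound_urls, allowed_hosts=ALLOWED_EGRESS_HOSTS):
    """Return every outbound URL whose host is not on the allowlist."""
    violations = []
    for url in outbound_urls:
        host = urlsplit(url).hostname
        if host not in allowed_hosts:
            violations.append(url)
    return violations


# Sample log entries, invented for illustration.
observed = [
    "https://license.example.invalid/verify",
    "https://api.openai.com/v1/chat/completions",
]
print(egress_violations(observed))
```

Passing an empty set as `allowed_hosts` models the fully air-gapped case, where any outbound URL at all counts as a violation.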

03 — What's shipped & in flight

A foundation that's real today, not a pitch deck.

The marketing surface, the lead pipeline, the architecture decisions, and the own-GPU inference layer are all built. Everything below is either live in production or actively in flight — tagged honestly.

sovereign.trdn.io marketing surface

LIVE

The public-facing surface explaining the self-hosted proposition to enterprise prospects — backend lead-capture API, frontend content, protected-zone enforcement. Shipped May 9.

Enterprise lead pipeline

LIVE

The sovereign_leads backend captures and qualifies inbound enterprise leads — validating company size, use case, and budget signals, with high-value leads routed for fast follow-up.
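To make the qualification step concrete, here is a minimal sketch of the kind of rule it describes: check company size, use case, and budget, then flag high-value leads for fast follow-up. The thresholds, field names, and labels are illustrative assumptions and are not taken from the sovereign_leads backend.

```python
from dataclasses import dataclass


@dataclass
class Lead:
    company_size: int          # number of employees
    use_case: str
    monthly_budget_usd: int


# Illustrative thresholds only; not taken from the sovereign_leads backend.
MIN_COMPANY_SIZE = 200
MIN_BUDGET_USD = 5_000
HIGH_VALUE_BUDGET_USD = 25_000


def qualify(lead: Lead) -> str:
    """Classify a lead as 'rejected', 'qualified', or 'high_value'."""
    if not lead.use_case.strip():
        return "rejected"
    if lead.company_size < MIN_COMPANY_SIZE:
        return "rejected"
    if lead.monthly_budget_usd < MIN_BUDGET_USD:
        return "rejected"
    if lead.monthly_budget_usd >= HIGH_VALUE_BUDGET_USD:
        return "high_value"
    return "qualified"
```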

6 architecture locks identified

LIVE

Six non-negotiable architecture decisions identified for any enterprise deployment — tenant isolation mode, encryption envelope at rest, append-only audit log, and the inference-router abstraction among them.
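An append-only audit log is often made tamper-evident by hash chaining, where each entry commits to the hash of the one before it. The page does not say which mechanism TRD Sovereign uses, so the sketch below is one common technique under that assumption, with an in-memory list standing in for real storage and hypothetical field names.

```python
import hashlib
import json


class AuditLog:
    """In-memory append-only log; each entry commits to the previous hash."""

    GENESIS = "0" * 64

    def __init__(self):
        self._entries = []

    @staticmethod
    def _digest(body: dict) -> str:
        # sort_keys gives a stable serialization so the hash is reproducible.
        payload = json.dumps(body, sort_keys=True).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()

    def append(self, actor: str, action: str) -> dict:
        prev_hash = self._entries[-1]["hash"] if self._entries else self.GENESIS
        body = {"actor": actor, "action": action, "prev_hash": prev_hash}
        entry = dict(body, hash=self._digest(body))
        self._entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute the chain; an edited or reordered entry breaks it."""
        prev_hash = self.GENESIS
        for entry in self._entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if entry["prev_hash"] != prev_hash:
                return False
            if entry["hash"] != self._digest(body):
                return False
            prev_hash = entry["hash"]
        return True
```

The chain does not stop someone with write access from rewriting every entry, so real deployments usually pair it with write-once storage or an externally anchored checkpoint.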

Own-GPU inference engine

LIVE

vLLM serving Qwen3-32B-FP8 — benchmarked at 1,605 tokens/second aggregate at concurrency 16. This is the layer that makes "sovereign" real: inference on hardware you control, not someone else's API.
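For capacity planning, the aggregate figure can be turned into a per-request estimate. The arithmetic below assumes throughput is shared evenly across the 16 concurrent streams and ignores prompt processing and time to first token, so treat the result as a rough average rather than a latency guarantee.

```python
AGGREGATE_TOKENS_PER_SECOND = 1605   # figure quoted above
CONCURRENCY = 16                     # figure quoted above


def per_stream_rate(aggregate_tps: float, concurrency: int) -> float:
    """Average tokens/second each concurrent request sees."""
    return aggregate_tps / concurrency


def seconds_for_response(tokens: int, aggregate_tps: float, concurrency: int) -> float:
    """Time to generate `tokens` for one request at full concurrency."""
    return tokens / per_stream_rate(aggregate_tps, concurrency)


rate = per_stream_rate(AGGREGATE_TOKENS_PER_SECOND, CONCURRENCY)
print(f"{rate:.1f} tokens/s per stream")   # 100.3 tokens/s per stream
print(f"{seconds_for_response(500, 1605, 16):.1f} s for 500 tokens")   # 5.0 s for 500 tokens
```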

Speculative decoding

LIVE

A Qwen3-0.6B draft model proposes tokens for the 32B model to verify — 66% acceptance rate, 25%+ throughput gain. Production-grade serving, not a research demo.
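The draft-then-verify loop can be sketched with toy models. This is a simplified greedy variant in which a drafted token is accepted only if it matches the large model's own choice; production servers typically use a probabilistic acceptance rule and verify all drafted positions in a single batched pass. The function names and the toy models are invented for illustration.

```python
from typing import Callable, List

Model = Callable[[List[int]], int]   # maps a token context to the next token


def speculative_step(target: Model, draft: Model, context: List[int], k: int) -> List[int]:
    """One draft-then-verify round with greedy acceptance.

    Returns the tokens appended this round: the accepted draft prefix plus
    one token chosen by the target model.
    """
    # 1. The small model drafts k tokens autoregressively.
    drafted, ctx = [], list(context)
    for _ in range(k):
        token = draft(ctx)
        drafted.append(token)
        ctx.append(token)

    # 2. The large model checks each drafted position in order.
    accepted, ctx = [], list(context)
    for token in drafted:
        expected = target(ctx)
        if token != expected:
            accepted.append(expected)   # replace the first mismatch and stop
            return accepted
        accepted.append(token)
        ctx.append(token)

    # 3. Every draft was accepted, so the target adds one bonus token.
    accepted.append(target(ctx))
    return accepted


# Toy models: the target counts upward mod 10; the draft errs after a 3.
def toy_target(ctx: List[int]) -> int:
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[int]) -> int:
    return 0 if ctx[-1] == 3 else (ctx[-1] + 1) % 10


print(speculative_step(toy_target, toy_draft, [0], k=4))   # [1, 2, 3, 4]
```

In this greedy variant the output is always identical to what the large model would have produced on its own, so the gain is purely in speed: each round yields between one and k + 1 tokens for one verification by the large model.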

TRD Inference API · Tier 0

IN FLIGHT

The own-GPU inference layer that will become the primary engine for Sovereign deployments. The code has shipped and is stabilizing on dedicated H200 capacity before it becomes the default path.

04 — What's coming

The roadmap to a repeatable enterprise deployment.

Sovereign is deliberately sequenced — the engineering for compliance is done up front, certifications are pursued only when a paying customer requires them, and multi-region automation comes once the deployment package is proven. Honest phasing, not vapor.

PHASE 2 · IN FLIGHT

Architecture lock — design week

A focused design week to lock the six architecture decisions before any enterprise deployment is offered: tenant isolation, the encryption envelope and per-customer key management, the append-only audit log, and the inference-router abstraction with backwards compatibility.

Effort — 5 days, non-negotiable
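To show what an inference-router abstraction could look like, the sketch below puts interchangeable backends behind one interface so that callers never depend on a specific engine. The class names, the backend labels, and the echo backend are all hypothetical; the actual design is what the design week is meant to lock.

```python
from typing import Dict, Optional, Protocol


class InferenceBackend(Protocol):
    def generate(self, prompt: str) -> str: ...


class EchoBackend:
    """Stand-in backend used only to make the sketch runnable."""

    def __init__(self, label: str):
        self.label = label

    def generate(self, prompt: str) -> str:
        return f"[{self.label}] {prompt}"


class InferenceRouter:
    """Routes requests to a named backend, falling back to a default."""

    def __init__(self, backends: Dict[str, InferenceBackend], default: str):
        if default not in backends:
            raise ValueError(f"default backend {default!r} is not registered")
        self._backends = backends
        self._default = default

    def generate(self, prompt: str, backend: Optional[str] = None) -> str:
        chosen = self._backends.get(backend or self._default)
        if chosen is None:
            raise KeyError(f"unknown backend: {backend!r}")
        return chosen.generate(prompt)


# Labels are placeholders for an own-GPU engine and an external API tier.
router = InferenceRouter(
    {"own_gpu": EchoBackend("own_gpu"), "external": EchoBackend("external")},
    default="own_gpu",
)
```

Keeping the `generate` signature stable is what makes backwards compatibility tractable: a new backend can be registered without touching any caller.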

PHASE 3 · UPCOMING

Deployment package — Docker / Kubernetes

The one-package install for enterprise customers. Docker Compose for single-server deployments, a Kubernetes Helm chart for larger ones — bundling the TRD backend, vLLM with the chosen model, self-hosted database, storage backup, monitoring stack, and ops runbooks.

Effort — ~1 month

PHASE 3 · UPCOMING

Anthropic ZDR option — Tier -1

For enterprises that want premium model quality but can't go fully on-prem: an Anthropic Zero Data Retention API tier, where request data is never stored by the provider. A middle path between full sovereignty and standard cloud APIs.

Effort — 1–2 weeks

PHASE 4 · UPCOMING

First paying enterprise deployment

The first commercial Sovereign customer — target: a UAE government agency, Indian PSU bank, or EU healthcare system. Engagement model: a 30-day proof of concept, then an annual contract with deployment, training, SLA support, and quarterly reviews.

Effort — ~1 month · founder-led

PHASE 5 · UPCOMING

SOC 2 · ISO 27001 · HIPAA certifications

Formal certifications, pursued only after paying customers explicitly require them. HIPAA has no official certification, so compliance there is demonstrated through an independent assessment. The underlying engineering (audit logs, access controls, encryption) is done up front in the Phase 2 locks, so certification becomes audit prep, not a rebuild.

Effort — 2–3 months, on customer demand

PHASE 6 · UPCOMING

Multi-region pod automation

Scale-out for serving multiple Sovereign customers across regions. A customer signs up, picks a region, and the system provisions a pod, configures their tenant, and returns connection details — UAE, EU, India, and US targeted.

Effort — 2–3 months
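The sign-up flow described for this phase has three steps: provision a pod, configure the tenant, return connection details. The sketch below walks them in order with placeholders where real cluster tooling would be called. The naming scheme, the hostname pattern, and the function itself are assumptions made for illustration.

```python
from dataclasses import dataclass
import uuid

SUPPORTED_REGIONS = ("uae", "eu", "india", "us")   # regions named above


@dataclass(frozen=True)
class ConnectionDetails:
    tenant_id: str
    region: str
    endpoint: str


def provision_pod(customer: str, region: str) -> ConnectionDetails:
    """Provision a pod, configure the tenant, and return connection details."""
    if region not in SUPPORTED_REGIONS:
        raise ValueError(f"unsupported region: {region!r}")
    # Step 1: provision a pod (placeholder; a real system would call its
    # cluster or cloud tooling here).
    pod_name = f"sovereign-{region}-{uuid.uuid4().hex[:8]}"
    # Step 2: configure the tenant inside that pod.
    tenant_id = f"{customer.lower().replace(' ', '-')}-{region}"
    # Step 3: hand back connection details (hostname pattern is made up).
    endpoint = f"https://{pod_name}.example.invalid"
    return ConnectionDetails(tenant_id, region, endpoint)
```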

05 — vs. cloud AI APIs

Why TRD Sovereign outlasts the API your data can't touch.

Cloud AI APIs are fast to start with. They also send your data to someone else's servers, in someone else's jurisdiction, under someone else's terms. For regulated organizations, that's not a latency tradeoff — it's a non-starter.

Property	TRD Sovereign	OpenAI / Anthropic API	Cloud-hosted LLM (Bedrock / Vertex)
Data stays inside your perimeter	✓	×	~
Runs in your region / national cloud	✓	×	~
Air-gap-capable deployment	✓	×	×
Zero external API calls at inference	✓	×	×
You hold the model weights & keys	✓	×	×
Full agent stack — 641 agents, not just an endpoint	✓	×	×
Observability exportable to your SIEM	✓	×	~
No usage data retained by a vendor	✓	~	~
Predictable cost — flat monthly, no per-token metering	✓	×	×
Standard tooling (Helm / Compose)	✓	×	✓

"~" denotes partial: some cloud-hosted LLM offerings keep data in-region but still run on vendor-controlled infrastructure under vendor terms. Sovereign removes the vendor from the data path entirely.

06 — Engagement model

Priced per deployment, scoped to your environment.

Sovereign isn't a per-seat SaaS — it's an infrastructure deployment. Every engagement starts with a paid audit, and the audit fee converts to a deposit on a signed contract. The figures below are the engagement structure; exact scope is finalized per environment.

STEP ONE

Infrastructure audit

Paid scoping

A pre-deployment audit of your cluster, hardware, network topology, and compliance requirements. Not a sunk cost — the fee converts to a deposit on a signed contract.

30-minute scoping call first
Full environment assessment
Jurisdiction gap analysis
Converts to contract deposit

PROOF OF CONCEPT

30-day deployment

$5k / flat

A full Sovereign pod deployed in your environment for a 30-day proof of concept — the real stack, your data, your perimeter, before any annual commitment.

Full pod deployed in your environment
Your model, region & isolation mode
Deployment + initial training
Direct founder-led engagement

ANNUAL CONTRACT

Sovereign deployment

$5k–50k / month

An annual contract per deployment: $60k–600k ARR (the $5k–50k monthly range annualized), depending on scale, model, and support tier. Includes the deployment, training, SLA support, and quarterly business reviews.

Annual contract, per environment
4-hour SLA support
Quarterly business reviews
Certification support on request

All figures reflect the engagement structure; exact scope and pricing are finalized in the infrastructure audit. The first deployments are founder-led.

07 — Compliance posture

Certifications pursued per deployment, not for show.

We pursue formal certification when a specific deployment requires it — the engineering work (audit logs, access controls, encryption) is done up front in the Phase 2 architecture locks, so certification is audit prep, not a rebuild. We're happy to share the gap analysis for your jurisdiction first.

SOC 2 Type 2

Pursued on customer demand

ISO 27001

Pursued on customer demand

HIPAA

Pursued on customer demand

GDPR alignment

Architecture aligned

UAE data residency

In-region deployment

India DPDP

In-region deployment

Need a framework not listed here? The pod's architecture — no egress, your keys, your infrastructure, your region — is designed to map cleanly onto most data-residency and sovereignty regimes. Ask us for the gap analysis for your jurisdiction.

08 — Common questions

The questions regulated buyers actually ask.

Does any data ever leave our environment?

No. Inference, orchestration, storage, and observability all run inside your perimeter. The only outbound connection is an optional license-server callback that verifies model weights; it transmits a license token and nothing else. For fully air-gapped deployments, even that can be handled offline.

What model runs in the pod, and can we change it?

The default is Qwen3-32B-FP8 served via vLLM, chosen for its quality-to-speed balance, FP8 quantization that fits comfortably on a single H200, strong coding and reasoning performance, and open weights under the permissive Apache 2.0 license. The model choice is part of your deployment config: it's your decision, set when the pod is configured.

What kind of throughput should we expect?

The own-GPU inference layer benchmarked at 1,605 tokens/second aggregate at concurrency 16 on real production workloads, with speculative decoding contributing a 25%+ throughput gain (66% draft-token acceptance). Exact numbers depend on your hardware tier and concurrency; the infrastructure audit scopes this for your environment.

How is this different from self-hosting an open-source LLM ourselves?

A raw open-source model is an endpoint. TRD Sovereign is the full agent stack (all 641 agents, the complete build pipeline, the memory layer, the orchestrator), packaged to deploy with Helm or Compose and run with zero external calls. You're not assembling infrastructure; you're deploying a finished, observable, supported system.

Can it run fully air-gapped?

Yes. The pod is designed with no outbound dependencies beyond the optional license callback, and that can be satisfied offline for air-gapped environments. Public-sector and defense deployments are a core use case, not an afterthought.

What does the engagement actually look like?

It starts with a 30-minute scoping call, then a paid pre-deployment infrastructure audit (the fee converts to a deposit on a signed contract). From there: a 30-day proof of concept at a flat $5k, then an annual contract at $5k–50k/month depending on scale, including deployment, training, 4-hour SLA support, and quarterly reviews. The first deployments are founder-led.

Who controls the model weights and encryption keys?

You do. The model weights are deployed into your infrastructure, and all keys are yours, managed per-tenant via the encryption envelope locked in the Phase 2 architecture. TRD does not hold a copy of your keys and has no path to your data, by design.

What if we want premium quality but can't go fully on-prem?

That's the upcoming Anthropic Zero Data Retention tier, a middle path where request data is never stored by the provider. It's premium-quality inference with stronger privacy than a standard API, for organizations whose constraints allow it. It's on the Phase 3 roadmap, not yet live.

What if our jurisdiction's framework isn't on your compliance list?

The pod's architecture (no egress, your keys, your infrastructure, your region) maps cleanly onto most data-residency and sovereignty regimes. We pursue formal certification when a specific deployment requires it, and we'll share a gap analysis for your jurisdiction before you commit to anything.

AI infrastructure that never leaves your perimeter.

Talk to us about your environment.