Production Infrastructure

Clinical AI. Built on Sovereign Infrastructure.

TrustCat delivers clinical AI inference using dedicated, air-gapped hardware — not shared cloud GPUs. Your data never leaves your control. Your results are deterministic and auditable.

Single-tenant
Air-gapped
Auditable

The Problem

Why Cloud AI Does Not Work for Clinical Settings

Healthcare operators are sold on the promise of cloud AI, but the architecture is fundamentally incompatible with the trust requirements of clinical medicine.

Cloud AI Fails Clinical Trust

Multi-tenant cloud infrastructure commingles patient data across organizations. Shared compute pools create compliance gaps that auditors flag and that patients never consented to.

Shared GPUs Are a Liability

When your inference runs on the same hardware as unknown workloads, you inherit their risk. Side-channel attacks, resource contention, and data residue are not theoretical concerns.

Auditability Is Not Optional

Clinical AI outputs affect patient care. Every inference must be traceable, reproducible, and verifiable. Black-box cloud APIs cannot provide the chain of custody that medicine requires.

Determinism Beats Scale

Elastic cloud scaling introduces variability. Model versions drift. Inference times fluctuate. In clinical settings, predictable, reproducible results matter more than handling 10x burst traffic.

What You Get

Infrastructure Built for Clinical Trust

No asterisks. No fine print. This is what TrustCat delivers.

Dedicated

Single-Tenant Inference

Your workloads run on hardware dedicated to your organization. No shared compute. No resource contention. No commingled data paths.

Isolated

Dedicated Hardware Per Client

Each client receives assigned GPU capacity that is never time-shared with other organizations. Your SLA is your hardware.

Reproducible

Deterministic Outputs

Same input, same model, same result. Every time. We lock model versions, control execution environment, and eliminate sources of inference variability.

Traceable

Immutable Audit Trails

Every job submission, execution event, and result is logged to a tamper-evident ledger. Full chain of custody from submission to delivery.

Sovereign

Client-Owned Identity Agents

Your on-premise agent holds your cryptographic identity. It signs every job submission. We verify. No trust delegation to third parties.

Core Architecture

Air-Gapped by Design

The TrustCat architecture eliminates inbound network access to inference infrastructure. Your client agent initiates all communication. Our GPU workers have no path back into your network.

Architecture diagram: Your Facility (Your Control) runs the Client Agent on on-premise hardware, holding your Ed25519 private key and your DICOM/image source data. All communication is outbound only, over HTTPS/TLS 1.3 with signed payloads and no inbound access. TrustCat Infrastructure (Dedicated Compute) runs the SwarmOS Gateway for job orchestration, your assigned GPU Bee Worker, and the immutable Audit Ledger.

Zero inbound network paths. Isolated per-client data paths. Signed on every job submission.

Client-Side Agent

A small, auditable agent runs on hardware you control. It holds your cryptographic identity, signs job submissions, and initiates all communication with TrustCat infrastructure.

No Shared Data Paths

Your jobs run on dedicated hardware assigned to your organization. No multi-tenant queues. No shared storage. No data commingling. Your inference is isolated by design, not policy.

End-to-End Flow

How TrustCat Works

From job submission to result delivery. This is the actual system architecture.

01

Client Agent Prepares Job

Your on-premise agent receives imaging data, creates a job manifest, and signs the payload with your Ed25519 private key.

Manifest includes: model target, payload hash, nonce, timestamp
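As a rough sketch of this step, the following Python assembles and signs such a manifest using the cryptography library's Ed25519 primitives. The field names, key file layout, and function name are illustrative assumptions, not TrustCat's actual agent code.

    import hashlib
    import json
    import os
    import time
    from cryptography.hazmat.primitives import serialization

    def build_signed_job(key_path: str, payload: bytes, model_target: str) -> dict:
        """Assemble a job manifest and sign it with the agent's Ed25519 key.
        Field names are illustrative; the production manifest format may differ."""
        # The private key never leaves hardware under the client's control.
        with open(key_path, "rb") as f:
            private_key = serialization.load_pem_private_key(f.read(), password=None)

        manifest = {
            "model_target": model_target,                         # e.g. a Med42-70B deployment
            "payload_hash": hashlib.sha256(payload).hexdigest(),  # integrity of the imaging data
            "nonce": os.urandom(16).hex(),                        # blocks replay of this submission
            "timestamp": int(time.time()),
        }
        canonical = json.dumps(manifest, sort_keys=True).encode()
        signature = private_key.sign(canonical)                   # Ed25519 detached signature
        return {"manifest": manifest, "signature": signature.hex()}

Signing a canonical (sorted-key) encoding keeps the signature stable no matter how the manifest is serialized downstream.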
02

Secure Submission

The signed job is transmitted over TLS to the SwarmOS API gateway. No inbound connections are ever established to your network.

Outbound HTTPS POST with signature verification
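Continuing the sketch above, the outbound submission could be a single HTTPS POST. The gateway URL and response shape below are placeholders, since the public API surface is not documented here.

    import requests

    def submit_job(signed_job: dict, gateway_url: str) -> dict:
        """POST a signed job to the SwarmOS gateway. The agent only dials out;
        it never listens for inbound connections. URL and response fields are
        placeholders."""
        resp = requests.post(
            gateway_url,          # e.g. "https://gateway.example/jobs" (hypothetical)
            json=signed_job,      # manifest plus hex-encoded Ed25519 signature
            timeout=30,           # fail fast; the agent retries transient errors
        )
        resp.raise_for_status()   # the gateway rejects tampered or unsigned payloads
        return resp.json()        # e.g. {"job_id": "...", "status": "queued"}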
03

SwarmOS Orchestration

The gateway verifies your signature, validates the payload, and routes the job to your assigned GPU worker. Job is logged to the audit ledger.

PostgreSQL ledger entry: job_id, timestamp, manifest_hash
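The ledger fields named above (job_id, timestamp, manifest_hash) suggest an append-only insert roughly like this psycopg2 sketch; the table and column names are assumptions, not TrustCat's schema.

    import psycopg2

    def record_submission(conn, job_id: str, manifest_hash: str) -> None:
        """Append a submission event to the audit ledger. The table and column
        names are illustrative; the ledger is insert-only, never updated."""
        with conn.cursor() as cur:
            cur.execute(
                "INSERT INTO audit_ledger (job_id, event, manifest_hash, created_at) "
                "VALUES (%s, %s, %s, NOW())",
                (job_id, "submitted", manifest_hash),
            )
        conn.commit()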
04

GPU Bee Inference

Your dedicated GPU worker loads the specified model and executes inference. Med42-70B runs on RTX 5090-class hardware with deterministic settings.

Model version locked, temperature fixed, retry logic on parse failure
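The worker code is not published here; one way to read "model version locked, temperature fixed, retry logic on parse failure" is the loop below. The generate callable and its parameters stand in for whatever inference client the GPU Bee actually uses.

    import json

    def run_deterministic_inference(generate, prompt: str, max_attempts: int = 3) -> dict:
        """Run inference with pinned, deterministic settings and retry when the
        output fails to parse as the expected JSON. `generate` is a placeholder
        for the worker's real inference client."""
        settings = {
            "model": "med42-70b",   # version pinned at deployment, never auto-upgraded
            "temperature": 0.0,     # fixed decoding temperature
            "seed": 42,             # fixed seed, where the serving stack supports one
        }
        last_error = None
        for _ in range(max_attempts):
            raw = generate(prompt, **settings)
            try:
                return json.loads(raw)            # structured clinical output expected
            except json.JSONDecodeError as err:
                last_error = err                  # parse failure: retry with the same settings
        raise RuntimeError(f"output failed to parse after {max_attempts} attempts") from last_error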
05

Result Generation

Structured clinical output is generated with confidence scores. Results are validated against schema before delivery.

JSON schema: findings[], impression, recommendations, confidence
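Validating that schema before delivery could look like this pydantic sketch. Only the field names come from the line above; the types and the 0-1 confidence range are assumptions.

    from typing import List
    from pydantic import BaseModel, Field

    class ClinicalResult(BaseModel):
        """Structured output shape named in step 05. Types are assumptions;
        the document only lists the field names."""
        findings: List[str]
        impression: str
        recommendations: List[str]
        confidence: float = Field(ge=0.0, le=1.0)   # reject out-of-range confidence scores

    # Raises a ValidationError if the model output does not match the schema:
    # result = ClinicalResult(**parsed_output)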
06

Audit & Delivery

Result is hashed, logged to the immutable ledger, and returned to your client agent. Full chain of custody is preserved.

Ledger entry: result_hash, proof_hash, execution_time_ms
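One concrete reading of "result is hashed": the worker digests the canonical result JSON with SHA-256 and records it in the ledger, and the client agent recomputes the same digest on delivery. The helper names below are illustrative.

    import hashlib
    import json

    def result_hash(result: dict) -> str:
        """SHA-256 over the canonical (sorted-key) JSON encoding of the result."""
        canonical = json.dumps(result, sort_keys=True).encode()
        return hashlib.sha256(canonical).hexdigest()

    def verify_delivery(result: dict, ledger_hash: str) -> bool:
        """Client-side check: the delivered result must match the hash the
        ledger recorded at execution time."""
        return result_hash(result) == ledger_hash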
Typical end-to-end latency: 15-30 seconds for 70B model inference

Scalability

Scale Without Cloud Tradeoffs

The assumption that you need multi-tenant cloud infrastructure to scale is false. Dedicated hardware scales linearly, predictably, and without the trust compromises of shared compute.

"Cloud elasticity optimizes for the provider's economics, not yours. Dedicated infrastructure optimizes for your outcomes."

Scale by Adding Hardware, Not Tenants

Need more capacity? We add dedicated GPUs to your allocation. Your performance never degrades because someone else is scaling up.

Independent Failure Domains

Each client deployment is isolated. A failure in one system cannot cascade to others. No shared queue means no queue poisoning.

No Noisy Neighbors

Cloud elasticity means fighting for resources with unknown workloads. Dedicated hardware means your inference time is your inference time.

Predictable Performance

Same hardware, same model, same conditions. Your P99 latency is stable because nothing about your environment changes without your knowledge.

Multi-Tenant Cloud
  • Shared GPU pools with resource contention
  • Variable latency based on demand
  • Cascading failures affect all tenants
  • Scale means more neighbors, not more capacity
TrustCat Dedicated
  • Assigned GPU capacity per client
  • Consistent, predictable latency
  • Isolated failure domains
  • Scale means more hardware for you

The Stack

Real Infrastructure. Running in Production.

These are the actual components that power TrustCat. No theoretical architectures. No "coming soon" features. This is what exists today.

Client

Client Agent Hardware

On-premise device running the TrustCat agent. Holds your cryptographic identity and initiates all job submissions.

Ed25519 keypair
Outbound-only networking
Local job signing
Orchestration

SwarmOS API Gateway

FastAPI-based gateway that receives signed job submissions, verifies client identity, and orchestrates job routing.

Signature verification
Job queuing (Redis)
Rate limiting
Compute

GPU Bee Workers

Dedicated inference nodes running on RTX 5090-class hardware. Each client is assigned specific GPU capacity.

RTX 5090 (32GB VRAM)
Med42-70B loaded
Deterministic inference
Storage

PostgreSQL Ledger

Immutable audit database recording every job submission, execution event, and result delivery with cryptographic hashes.

Job history
Audit trail
Result hashes
Reliability

Failover Daemon

Background service monitoring GPU worker health, detecting stuck jobs, and automatically requeuing failed work.

Heartbeat monitoring
Job timeout detection
3-retry logic
Operations

Matrix Ops Layer

Internal operations messaging for system alerts, deployment notifications, and infrastructure monitoring.

Real-time alerts
Deployment logs
Health dashboards

Built by operators who run infrastructure, not by marketers who describe it.

Security Posture

Security by Architecture, Not Policy

TrustCat security comes from how the system is built, not from promises in a compliance document. These are the technical measures that protect your data and ensure auditability.

Signed Job Submissions

Every job is signed with your Ed25519 private key before transmission. The gateway verifies the signature and rejects tampered or unsigned requests.

Implementation
  • Ed25519 asymmetric cryptography
  • Nonce prevents replay attacks
  • Signature verification on every request
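A gateway check consistent with those three points might look like the sketch below, reusing the canonical-manifest convention from the flow section. The in-memory nonce set is a stand-in for whatever persistent store the gateway actually uses.

    import json
    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PublicKey

    SEEN_NONCES: set = set()   # illustrative; a real service persists and expires these

    def verify_submission(public_key_bytes: bytes, manifest: dict, signature_hex: str) -> bool:
        """Verify the Ed25519 signature over the canonical manifest and reject
        replayed nonces. Returns False on any failure."""
        nonce = manifest.get("nonce")
        if not nonce or nonce in SEEN_NONCES:
            return False                               # missing or replayed nonce
        public_key = Ed25519PublicKey.from_public_bytes(public_key_bytes)
        canonical = json.dumps(manifest, sort_keys=True).encode()
        try:
            public_key.verify(bytes.fromhex(signature_hex), canonical)
        except InvalidSignature:
            return False                               # tampered or unsigned request
        SEEN_NONCES.add(nonce)
        return True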

Client Identity Separation

Each client organization has a distinct cryptographic identity. Jobs, results, and audit logs are separated by client root with no cross-tenant access paths.

Implementation
  • Unique client_root per organization
  • Agent IDs scoped to client
  • No shared queues or storage
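As a minimal illustration of that scoping (the naming scheme is an assumption), every queue and agent identifier can be derived from the client root, so a cross-tenant path simply has no name to resolve:

    def client_queue_key(client_root: str) -> str:
        """Per-client Redis queue name; illustrative of 'no shared queues'."""
        return f"jobs:{client_root}"

    def scoped_agent_id(client_root: str, agent_name: str) -> str:
        """Agent identifiers live under the owning client's root."""
        return f"{client_root}/{agent_name}"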

Immutable Audit Logs

Every event is logged to a PostgreSQL ledger with cryptographic hashes. Job submissions, execution starts, completions, and failures are all recorded with timestamps.

Implementation
  • Append-only event log
  • SHA-256 hashes for integrity
  • Full chain of custody
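"Append-only" plus "SHA-256 hashes for integrity" is commonly realized as a hash chain, where each entry's hash covers its predecessor's hash. The sketch below shows that pattern with an in-memory list standing in for PostgreSQL; it is an assumption about the mechanism, not TrustCat's actual schema.

    import hashlib
    import json
    import time

    def append_event(ledger: list, event: dict) -> dict:
        """Append an event whose hash covers the previous entry's hash, so any
        later edit breaks the chain."""
        prev_hash = ledger[-1]["entry_hash"] if ledger else "0" * 64
        body = {"ts": time.time(), "prev_hash": prev_hash, **event}
        entry_hash = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        entry = {**body, "entry_hash": entry_hash}
        ledger.append(entry)
        return entry

    def chain_is_intact(ledger: list) -> bool:
        """Recompute every hash and confirm each entry points at its predecessor."""
        prev = "0" * 64
        for entry in ledger:
            body = {k: v for k, v in entry.items() if k != "entry_hash"}
            expected = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
            if entry["prev_hash"] != prev or entry["entry_hash"] != expected:
                return False
            prev = entry["entry_hash"]
        return True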

Deterministic Execution

Inference runs with locked model versions, fixed random seeds, and controlled execution environments. The same input produces the same output.

Implementation
  • Model version pinning
  • Fixed temperature settings
  • Reproducible results

Failure Handling

Failed jobs are automatically retried with exponential backoff. After three failures, jobs are marked failed with detailed error logs for investigation.

Implementation
  • Automatic retry (max 3)
  • Timeout detection (300s)
  • Error logging to ledger
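Read literally, those three points translate into a small retry wrapper like the one below; execute_job is a placeholder for the worker call and is assumed to raise on failure and to enforce the timeout itself.

    import time

    MAX_RETRIES = 3
    JOB_TIMEOUT_S = 300

    def run_with_retries(execute_job, job) -> dict:
        """Retry a failed job with exponential backoff; after MAX_RETRIES failures
        the error is surfaced so it can be written to the ledger."""
        for attempt in range(1, MAX_RETRIES + 1):
            try:
                return execute_job(job, timeout=JOB_TIMEOUT_S)
            except Exception as err:
                if attempt == MAX_RETRIES:
                    raise RuntimeError(f"job failed after {MAX_RETRIES} attempts") from err
                time.sleep(2 ** attempt)   # exponential backoff: 2 s, then 4 s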

A Note on Compliance

TrustCat architecture is designed to support HIPAA-compliant workflows, but compliance is a shared responsibility between your organization and ours. We provide the technical controls; you implement the administrative safeguards. We do not make compliance claims we cannot substantiate.