BlackRoad AI
Sovereign inference on owned hardware
52
TOPS
2x Hailo-8 accelerators. Cecilia (26 TOPS) + Octavia (26 TOPS). Edge AI inference for object detection, classification, and model acceleration.
4
Inference Nodes
Cecilia (primary, 16 models), Lucidia, Gematria (6 models), Octavia. All running Ollama. Local-first, no external API calls.
16+
Models
llama3, mistral, codellama, phi, gemma, nomic-embed-text, deepseek-coder, neural-chat, starling, dolphin, and more.
0
API Calls Out
Zero external API calls. All inference runs on owned hardware. No OpenAI, no Anthropic API, no cloud providers. Fully sovereign.
Cecilia · 26 TOPS
Hailo-8 M.2 module. /dev/hailo0 verified. Used for real-time object detection, YOLOv5/v8 inference, and model acceleration. Combined with Ollama's 16 models for a full AI pipeline.
Octavia · 26 TOPS
Hailo-8 M.2 module. Secondary accelerator for parallel inference workloads. Co-located with Gitea, Docker, and NATS for CI/CD-integrated AI pipelines.
Qdrant + nomic-embed-text
Vector store on Alice · Embedding model on Cecilia
Qdrant vector database on Alice (:6333). Documents are embedded using nomic-embed-text on Cecilia, stored as vectors, and retrieved for context-augmented generation.

Sources indexed: codex solutions (693), TIL broadcasts, journal entries, wiki pages, corporate docs, academic papers (Greenbaum, Schleif, Reddi).
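The indexing side of this pipeline can be sketched as two HTTP calls: embed a document via Ollama's /api/embeddings, then upsert the vector into Qdrant. A minimal sketch, assuming node hostnames `cecilia` and `alice` (per the node descriptions above, with Ollama on its default port 11434) and a hypothetical collection name `blackroad_docs`:

```python
import json
import urllib.request

# Hostnames and collection name are assumptions for illustration;
# substitute your own node addresses.
OLLAMA_URL = "http://cecilia:11434"   # Ollama serving nomic-embed-text
QDRANT_URL = "http://alice:6333"      # Qdrant vector store on Alice
COLLECTION = "blackroad_docs"         # hypothetical collection name

def embed_request(text: str) -> dict:
    """Body for Ollama's POST /api/embeddings endpoint."""
    return {"model": "nomic-embed-text", "prompt": text}

def upsert_request(doc_id: int, vector: list, payload: dict) -> dict:
    """Body for Qdrant's PUT /collections/{name}/points upsert."""
    return {"points": [{"id": doc_id, "vector": vector, "payload": payload}]}

def request(url: str, body: dict, method: str = "POST") -> dict:
    """Send a JSON request and decode the JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
        method=method,
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

def index_document(doc_id: int, text: str, source: str) -> dict:
    """Embed one document on Cecilia, then store the vector on Alice."""
    vec = request(f"{OLLAMA_URL}/api/embeddings", embed_request(text))["embedding"]
    return request(
        f"{QDRANT_URL}/collections/{COLLECTION}/points",
        upsert_request(doc_id, vec, {"text": text, "source": source}),
        method="PUT",
    )
```

Each source category above (codex solutions, TIL broadcasts, journal entries, and so on) would be fed through `index_document` with its `source` tag carried in the Qdrant payload so retrieval can filter or attribute by origin.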

Query flow: user question → embed with nomic → nearest neighbor search in Qdrant → top-k context → feed to llama3/mistral for generation.
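The query flow above maps to three HTTP calls. A minimal sketch, assuming the same illustrative hostnames (`cecilia` for Ollama, `alice` for Qdrant) and a hypothetical collection name `blackroad_docs`:

```python
import json
import urllib.request

OLLAMA_URL = "http://cecilia:11434"   # assumption: Ollama inference node
QDRANT_URL = "http://alice:6333"      # assumption: Qdrant on Alice
COLLECTION = "blackroad_docs"         # hypothetical collection name

def build_prompt(contexts: list, question: str) -> str:
    """Assemble the top-k retrieved passages into a generation prompt."""
    ctx = "\n---\n".join(contexts)
    return f"Answer using the context below.\n\nContext:\n{ctx}\n\nQuestion: {question}"

def post(url: str, body: dict) -> dict:
    """POST a JSON body and decode the JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

def answer(question: str, top_k: int = 4) -> str:
    # 1. Embed the user question with nomic-embed-text.
    vec = post(f"{OLLAMA_URL}/api/embeddings",
               {"model": "nomic-embed-text", "prompt": question})["embedding"]
    # 2. Nearest-neighbor search in Qdrant for the top-k matches.
    hits = post(f"{QDRANT_URL}/collections/{COLLECTION}/points/search",
                {"vector": vec, "limit": top_k, "with_payload": True})["result"]
    contexts = [h["payload"]["text"] for h in hits]
    # 3. Feed the retrieved context to llama3 for generation.
    gen = post(f"{OLLAMA_URL}/api/generate",
               {"model": "llama3",
                "prompt": build_prompt(contexts, question),
                "stream": False})
    return gen["response"]
```

Swapping `llama3` for `mistral` in step 3 changes only the `model` field; retrieval is identical for either generator.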
Every model runs on hardware we own or rent with full root access. No tokens leave the network. No usage data sent to third parties. Inference logs stay on-device. The only external dependency is downloading model weights from Ollama's registry — after that, everything is local.