Architecture Overview

Audience: engineers contributing to or integrating with the platform. Read first: glossary for project terminology.

One paragraph

Users author Skills (system prompt + model + tool whitelist + risk caps) by conversing with a single product-wide Chat Agent that writes to a draft (the agent replaces the legacy 7-tab form; see ADR-0019). The same Skill artifact is consumed by the Simulator (which replays historical bars + news against a paper broker) and the Live Runner (long-running per-Skill processes on Fly that tick on a schedule against Hyperliquid). The Chat Agent also coaches the user on positions and the market, and prepares write actions (deploy / stop / start backtest) as inline confirm-cards that the user clicks to commit. Every action the trading agent proposes is routed through the Execution Engine, which validates it against the Skill's risk caps before any order is sent to a broker adapter. All components persist their state to Supabase.

System diagram

                          ┌──────────────────────────────────────────────────┐
                          │ Web App  (Next.js 16 App Router · Vercel)        │
                          │   • Skill authoring (chat + draft preview)       │
                          │   • Sim runs list + reports                      │
                          │   • Deployments list + detail (no live dash)     │
                          │   • Chat slide-over (authoring / coach / ops)    │
                          │   • Commands (stop/pause/flatten/kill — buttons) │
                          └────────────┬──────────────────────┬──────────────┘
                                       │                      │
                          REST + Server Actions          SSE (streamText)
                                       │                      │
                                       ▼                      ▼
                          ┌─────────────────────┐  ┌────────────────────────┐
                          │ Control API         │  │ Chat Agent             │
                          │ (Next.js route      │  │ (Vercel Function)      │
                          │  handlers)          │  │   • user-scoped reads  │
                          │   • write commands  │  │   • market/news reads  │
                          │   • read state      │  │   • set_* draft writes │
                          │   • trigger sims    │  │   • prepare_action     │
                          │   • provision Fly   │  │     (confirm-card UI)  │
                          │   • save skill draft│  └────────────┬───────────┘
                          └───────┬─────────────┘               │
                                  │                             │
                                  ▼                             ▼
                          ┌──────────────────────────────────────────────────┐
                          │ Supabase (Postgres + pgvector + Auth + Storage)  │
                          │                                                  │
                          │  skills · skill_versions · skill_drafts          │
                          │  deployments · agent_commands · agent_state      │
                          │  agent_logs · agent_conversations                │
                          │  agent_messages · decision_snapshots             │
                          │  sim_runs · sim_results                          │
                          │  bars · news (+ embeddings) · funding_rates      │
                          │  exchange_credentials (encrypted)                │
                          └──────────────┬──────────────┬────────────────────┘
                                         ▲              ▲
                                   LISTEN/NOTIFY      writes
                                         │              │
                  ┌──────────────────────┴──┐        ┌──┴──────────────────┐
                  │ Live Runner             │        │ Simulator           │
                  │ (Fly Machine per Skill) │        │ (Vercel Workflow /  │
                  │   • control loop        │        │  Fly worker)        │
                  │   • tick loop           │        │   • replay bars     │
                  │   • runSkill()          │        │   • inject news     │
                  │   • snapshot writes     │        │   • runSkill()      │
                  │   • Hyperliquid WS      │        │   • paper broker    │
                  └──────────┬──────────────┘        └──────────┬──────────┘
                             │                                  │
                             └────────────┬─────────────────────┘
                                          │
                                          ▼
                          ┌──────────────────────────────────────────┐
                          │ Execution Engine                         │
                          │   validate → risk-check → record → exec  │
                          └──────────┬───────────────────────────────┘
                                     │
                  ┌──────────────────┼────────────────────────────┐
                  ▼                  ▼                            ▼
          ┌──────────────┐  ┌──────────────────┐         ┌────────────────────┐
          │ Paper Broker │  │ Hyperliquid      │         │ Hyperliquid        │
          │ (sim + dev)  │  │ Testnet adapter  │         │ Mainnet adapter    │
          │              │  │ (live default)   │         │ (gated, post-MVP)  │
          └──────────────┘  └──────────────────┘         └────────────────────┘

                          ┌────────────────────────────────────────────┐
                          │ Data Ingest (scheduled jobs)               │
                          │   • bars (Hyperliquid info API)            │
                          │   • news (CryptoPanic / Tiingo)            │
                          │   • funding rates                          │
                          │   • embed news → pgvector                  │
                          └────────────────────────────────────────────┘

Components

Web app (`apps/web`)

Next.js 16 App Router on Vercel. Hosts:

Authenticated UI for Skill authoring, sim review, deployment management, and chat
Server Actions / route handlers that act as the Control API
The Chat Agent route handler (/api/chat) using streamText

Live Runner (`apps/live-runner`)

Long-running Node service deployed as one Fly Machine per active Deployment. Subscribes to Hyperliquid websockets, ticks on the Skill's schedule, calls runSkill(), persists state and snapshots. Listens for commands via Postgres LISTEN/NOTIFY. See live-runtime.md.

Simulator (`packages/simulator` + worker)

Replays a date range of bars + news + funding, calling runSkill() once per tick against a paper broker. Produces a sim_runs record with full decision snapshots. Runs either as a Vercel Workflow (short sims) or as a one-off Fly worker (long sims). See simulation.md.

Chat Agent (`apps/web/app/api/chat`)

Per-request Vercel Function that serves a single, branded, user-scoped chat across the whole product. One agent, three opening modes (authoring / coach / ops) set by the page that mounts the slide-over panel. Tools split into authoring writes (set_* on a draft), user-data reads, market/news reads, and prepare_action (renders a confirm-card the user clicks to commit). Hard cap of 14 tools. Pinned to Claude Sonnet 4.6 by default; OpenRouter auto is intentionally not used here so persona, tool-call quality, and prompt caching stay consistent. See chat-agent.md and ADR-0019.

Execution Engine (`packages/execution-engine`)

The trust boundary. Receives proposed actions from any agent caller, validates them, applies risk caps, records the result in decision_snapshots, and either rejects or hands off to a broker adapter. Same module in sim and live. See execution-engine.md.
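
The validate → risk-check → record → exec pipeline can be sketched as below. This is a minimal illustration, not the code in packages/execution-engine: the `ProposedOrder`, `RiskCaps`, and `BrokerAdapter` shapes, the two example caps, and the in-memory `decisionSnapshots` array (standing in for the decision_snapshots table) are all assumptions.

```typescript
// Hypothetical shapes for illustration; the real types live in the repo.
type ProposedOrder = { symbol: string; side: "buy" | "sell"; notionalUsd: number };
type RiskCaps = { maxNotionalUsd: number; allowedSymbols: string[] };
type Verdict = { accepted: true } | { accepted: false; reason: string };

interface BrokerAdapter {
  placeOrder(order: ProposedOrder): string; // returns a broker order id
}

// In-memory stand-in for the decision_snapshots table.
const decisionSnapshots: Array<{ order: ProposedOrder; verdict: Verdict }> = [];

// validate + risk-check: a pure function with no side effects.
function checkOrder(order: ProposedOrder, caps: RiskCaps): Verdict {
  if (!(order.notionalUsd > 0)) return { accepted: false, reason: "invalid notional" };
  if (caps.allowedSymbols.indexOf(order.symbol) === -1) {
    return { accepted: false, reason: "symbol not allowed" };
  }
  if (order.notionalUsd > caps.maxNotionalUsd) {
    return { accepted: false, reason: "notional exceeds cap" };
  }
  return { accepted: true };
}

function processAction(
  order: ProposedOrder,
  caps: RiskCaps,
  broker: BrokerAdapter,
): { verdict: Verdict; orderId?: string } {
  const verdict = checkOrder(order, caps);
  decisionSnapshots.push({ order, verdict }); // record every decision, accepted or not
  if (!verdict.accepted) return { verdict }; // rejected: the broker is never called
  return { verdict, orderId: broker.placeOrder(order) }; // exec
}

// Usage: the same call works with a paper broker in sim and a real adapter in live.
const paperBroker: BrokerAdapter = { placeOrder: (o) => `paper-${o.symbol}-1` };
const exampleCaps: RiskCaps = { maxNotionalUsd: 1000, allowedSymbols: ["BTC"] };
const smallOrder = processAction({ symbol: "BTC", side: "buy", notionalUsd: 500 }, exampleCaps, paperBroker);
const oversizedOrder = processAction({ symbol: "BTC", side: "buy", notionalUsd: 5000 }, exampleCaps, paperBroker);
```

Because the broker is passed in as an interface, the caller (simulator or live runner) decides which adapter is used while the checks stay identical.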

Tool Registry (`packages/tools`)

Built-in tool definitions using AI SDK's tool() primitive, plus the loader for MCP-server tools. Each tool is a factory (ctx) => Tool so it gets per-request context (deployment_id, broker handle, etc.). See tools-and-mcp.md.
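
The factory pattern described above might look like the following sketch. To stay dependency-free it uses a local `Tool` type as a stand-in; in the real registry the returned object would come from the AI SDK's tool() primitive. The `ToolContext` fields and the `loadPositions` helper are hypothetical.

```typescript
// Minimal local stand-in for a tool object; it is NOT the AI SDK's real Tool type.
type Tool<I, O> = { description: string; execute: (input: I) => O };

// Hypothetical per-request context.
type ToolContext = {
  deploymentId: string;
  loadPositions: (deploymentId: string) => string[];
};

type ToolFactory<I, O> = (ctx: ToolContext) => Tool<I, O>;
type Portfolio = { deploymentId: string; positions: string[] };

const getPortfolio: ToolFactory<Record<string, never>, Portfolio> = (ctx) => ({
  description: "Return open positions for the current deployment",
  // The closure captures ctx, so the model never supplies the deployment id itself.
  execute: () => ({
    deploymentId: ctx.deploymentId,
    positions: ctx.loadPositions(ctx.deploymentId),
  }),
});

// Usage: build the tool once per request with that request's context.
const portfolioTool = getPortfolio({
  deploymentId: "dep_123",
  loadPositions: () => ["BTC-PERP"],
});
const portfolio = portfolioTool.execute({});
```

Binding context through the factory keeps identifiers such as the deployment id out of the model-visible input schema.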

Agent Runtime (`packages/agent-runtime`)

Thin wrapper around AI SDK that:

Hydrates the Skill's whitelisted tools (built-in + MCP)
Assembles context (bars + news + portfolio + funding) per Skill config
Calls generateText with the configured model via AI Gateway
Returns the typed result for the caller (live runner or simulator)

See agent-runtime.md.
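
The first step, hydrating only whitelisted tools, can be sketched as below. The registry contents, the snake_case tool names, and the `makeTool` helper are assumptions for illustration; MCP loading and the generateText call are left out.

```typescript
type ToolContext = { deploymentId: string };
type Tool = { name: string; run: () => string };
type ToolFactory = (ctx: ToolContext) => Tool;

// Hypothetical helper that builds a trivial tool for the example.
const makeTool = (name: string): ToolFactory => (ctx) => ({
  name,
  run: () => `${name} for ${ctx.deploymentId}`,
});

// Hypothetical registry keyed by tool name.
const registry: Record<string, ToolFactory> = {
  fetch_recent_bars: makeTool("fetch_recent_bars"),
  get_portfolio: makeTool("get_portfolio"),
  propose_order: makeTool("propose_order"),
};

// Instantiates only the tools named in the whitelist; anything else never exists
// for this request, so the model cannot call it.
function hydrateTools(whitelist: string[], ctx: ToolContext): Record<string, Tool> {
  const tools: Record<string, Tool> = {};
  for (const name of whitelist) {
    const known = Object.prototype.hasOwnProperty.call(registry, name);
    const factory = known ? registry[name] : undefined;
    if (factory === undefined) throw new Error(`Unknown tool in whitelist: ${name}`);
    tools[name] = factory(ctx);
  }
  return tools;
}

// Usage: a read-only whitelist leaves propose_order out entirely.
const readOnlyTools = hydrateTools(["fetch_recent_bars", "get_portfolio"], { deploymentId: "dep_123" });
```

Failing loudly on an unknown name, rather than skipping it, surfaces a stale whitelist at hydration time instead of mid-run.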

Authoring a Skill

User opens /skills/new. Server creates a skill_drafts row seeded with defaults. Slide-over chat mounts in authoring mode with the draft id.
Chat agent asks tier-appropriate questions; each tool call (set_basics, set_strategy, …) patches skill_drafts.payload and re-runs zod against the partial.
Draft preview pane re-reads on every patch and shows validation chips.
User clicks Save. Server validates the full SkillPayload, inserts into skills (if new) and skill_versions (always), marks the draft saved.
UI shows the new version, ready to backtest.
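
The patch-then-revalidate step in the flow above can be sketched as follows. The document says validation uses zod; this dependency-free sketch hand-rolls the checks instead (with zod, something like a `.partial()` schema and `safeParse` would likely play this role). The `SkillPayload` fields and the validation rules are assumptions.

```typescript
// Hypothetical, heavily reduced payload shape.
type SkillPayload = { name: string; model: string; systemPrompt: string; tools: string[] };
type Draft = { id: string; payload: Partial<SkillPayload> };

const REQUIRED_FIELDS: Array<keyof SkillPayload> = ["name", "model", "systemPrompt", "tools"];

// Validates only the fields present so far; absent fields are not errors yet.
function validatePartial(payload: Partial<SkillPayload>): string[] {
  const issues: string[] = [];
  if (payload.name !== undefined && payload.name.trim() === "") issues.push("name must not be empty");
  if (payload.tools !== undefined && payload.tools.length === 0) issues.push("tools must not be empty");
  return issues;
}

// Applies one set_* style patch and re-validates; the input draft is not mutated.
function patchDraft(draft: Draft, patch: Partial<SkillPayload>) {
  const next: Draft = { id: draft.id, payload: { ...draft.payload, ...patch } };
  const issues = validatePartial(next.payload);
  const missing = REQUIRED_FIELDS.filter((key) => next.payload[key] === undefined);
  return { draft: next, issues, missing, canSave: issues.length === 0 && missing.length === 0 };
}

// Usage: two successive patches, as two tool calls might produce.
const emptyDraft: Draft = { id: "draft_1", payload: {} };
const afterBasics = patchDraft(emptyDraft, { name: "Momentum", tools: [] });
const afterStrategy = patchDraft(afterBasics.draft, {
  tools: ["fetch_recent_bars"],
  model: "example-model",
  systemPrompt: "Trade momentum breakouts.",
});
```

The `issues` and `missing` arrays map naturally onto the validation chips in the preview pane, and `canSave` onto whether Save passes full validation.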

Running a simulation

User clicks Backtest, picks date range
Server creates sim_runs row (status queued), enqueues a worker job
Worker loads Skill, opens an iterator over bars in range
For each tick: assemble context → runSkill → engine processes proposed action against paper broker → write decision_snapshots row
On completion, compute summary metrics, update sim_runs (status complete)
UI polls or subscribes; renders report
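
The per-tick loop above can be sketched as below. Several simplifications are assumed: `runSkill` is a synchronous stub rather than a model call, the Execution Engine step is reduced to a direct paper-broker fill at the bar close with no fees, and snapshots go to an array instead of decision_snapshots.

```typescript
type Bar = { ts: number; close: number };
type Action = { kind: "hold" } | { kind: "buy" | "sell"; qty: number };
type Snapshot = { ts: number; action: Action; equity: number };
type SkillFn = (ctx: { bar: Bar; position: number }) => Action;

class PaperBroker {
  cash: number;
  position = 0;
  constructor(startingCash: number) {
    this.cash = startingCash;
  }
  // Fills at the given price with no fees or slippage (a simplification).
  fill(action: Action, price: number): void {
    if (action.kind === "hold") return;
    const signedQty = action.kind === "buy" ? action.qty : -action.qty;
    this.position += signedQty;
    this.cash -= signedQty * price;
  }
  equity(price: number): number {
    return this.cash + this.position * price;
  }
}

function runSimulation(bars: Bar[], runSkill: SkillFn, startingCash: number) {
  const broker = new PaperBroker(startingCash);
  const snapshots: Snapshot[] = [];
  for (const bar of bars) {
    const action = runSkill({ bar, position: broker.position }); // assemble context, run the Skill
    broker.fill(action, bar.close); // engine step, reduced to a direct fill here
    snapshots.push({ ts: bar.ts, action, equity: broker.equity(bar.close) }); // one row per tick
  }
  const last = snapshots[snapshots.length - 1];
  return { snapshots, finalEquity: last ? last.equity : startingCash };
}

// Usage: buy one unit on the first tick, then hold.
const sampleBars: Bar[] = [{ ts: 1, close: 100 }, { ts: 2, close: 110 }, { ts: 3, close: 120 }];
const buyOnce: SkillFn = ({ position }) => (position === 0 ? { kind: "buy", qty: 1 } : { kind: "hold" });
const simReport = runSimulation(sampleBars, buyOnce, 1000);
// simReport.finalEquity is 1020: 900 cash plus one unit marked at 120.
```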

Deploying live

User clicks Deploy Live and confirms the target (paper or mainnet; testnet was removed per ADR-0015)
Server inserts deployments row (status provisioning)
Server calls Fly Machines API: create machine with env DEPLOYMENT_ID=<id> and image tag matching the runner build
Machine starts, runner reads DEPLOYMENT_ID, loads Skill + version from DB
Runner starts control loop (LISTEN) + tick loop (schedule)
Runner updates deployments.status = running and starts writing agent_state snapshots
UI shows deployment as live
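
A sketch of two pieces of this flow: the deployment status transitions and the body of the Fly Machines create call. It builds the request without sending it. The app name, image tag, and the rule that provisioning may go straight to stopped are assumptions; check the request shape against the Fly Machines API reference before relying on it.

```typescript
type DeploymentStatus = "provisioning" | "running" | "stopped";

// Allowed transitions; letting provisioning go straight to stopped is an assumption.
const NEXT_STATUS: Record<DeploymentStatus, DeploymentStatus[]> = {
  provisioning: ["running", "stopped"],
  running: ["stopped"],
  stopped: [],
};

function transition(from: DeploymentStatus, to: DeploymentStatus): DeploymentStatus {
  if (NEXT_STATUS[from].indexOf(to) === -1) {
    throw new Error(`Illegal status change: ${from} -> ${to}`);
  }
  return to;
}

type CreateMachineRequest = {
  method: "POST";
  url: string;
  body: { config: { image: string; env: Record<string, string> } };
};

// Builds (but does not send) a "create machine" request for the Fly Machines API.
function buildCreateMachineRequest(
  appName: string,
  image: string,
  deploymentId: string,
): CreateMachineRequest {
  return {
    method: "POST",
    url: `https://api.machines.dev/v1/apps/${encodeURIComponent(appName)}/machines`,
    // The runner reads DEPLOYMENT_ID on boot to find its Skill and version.
    body: { config: { image, env: { DEPLOYMENT_ID: deploymentId } } },
  };
}

// Usage with hypothetical names.
const createRequest = buildCreateMachineRequest(
  "example-runner-app",
  "registry.fly.io/example-runner-app:build-1",
  "dep_123",
);
const statusAfterBoot = transition("provisioning", "running");
```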

Sending a command

User clicks Stop on the deployment detail page
Server INSERTs a row into agent_commands and issues NOTIFY on channel agent_commands_<deployment_id>
Runner's control loop wakes, handles the command (drain → close WS → exit)
Runner updates deployments.status = stopped, exits process
Fly destroys the machine
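
The INSERT + NOTIFY pair and the runner's matching LISTEN could be built as in this sketch. The column names and the use of the command as the NOTIFY payload are assumptions. It uses Postgres's pg_notify() function on the server side because that takes the channel as an ordinary text argument; a bare NOTIFY or LISTEN takes an identifier, which has to be quoted if the deployment id contains hyphens.

```typescript
// Command names taken from the Commands line of the system diagram.
type CommandType = "stop" | "pause" | "flatten" | "kill";
type Statement = { text: string; values: string[] };

function channelFor(deploymentId: string): string {
  return `agent_commands_${deploymentId}`;
}

// Server side: both statements are parameterized.
function buildCommandStatements(
  deploymentId: string,
  command: CommandType,
): { insert: Statement; notify: Statement } {
  return {
    // Column names are hypothetical.
    insert: {
      text: "INSERT INTO agent_commands (deployment_id, command) VALUES ($1, $2)",
      values: [deploymentId, command],
    },
    notify: { text: "SELECT pg_notify($1, $2)", values: [channelFor(deploymentId), command] },
  };
}

// Runner side: LISTEN cannot take bind parameters, so the channel is emitted as a
// quoted identifier with any embedded double quotes doubled.
function listenStatement(deploymentId: string): string {
  const quoted = channelFor(deploymentId).replace(/"/g, '""');
  return `LISTEN "${quoted}"`;
}

// Usage with a hypothetical id.
const stopCommand = buildCommandStatements("dep-123", "stop");
const listenSql = listenStatement("dep-123");
```

Writing the row before notifying means a runner that misses the notification (for example during a reconnect) can still find the pending command by reading agent_commands.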

Chatting with the agent

User opens the slide-over chat from any authed page. The page provides an opening context (mode: authoring | coach | ops + a focus id where applicable).
POST to /api/chat with the message history and (on first turn only) the opening context.
Server composes the system prompt (product persona + tier addendum + mode addendum + focus context + trust rules), hydrates the chat tool registry, calls streamText with the configured chat model (default Claude Sonnet 4.6).
AI SDK streams tool calls and text back as SSE. Confirm-cards for write actions render inline; the user clicks Confirm to commit (the click hits the same server actions that buttons elsewhere use).
On finish, server persists user + assistant messages to agent_messages.
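
The prompt composition in the third step can be sketched as a pure function. The section order follows the text above (persona, tier addendum, mode addendum, focus context, trust rules); the placeholder strings, the `OpeningContext` shape, and the blank-line separator are assumptions.

```typescript
type Mode = "authoring" | "coach" | "ops";
type OpeningContext = { mode: Mode; focusId?: string };

// Placeholder texts; the real addenda are not shown in this document.
const MODE_ADDENDA: Record<Mode, string> = {
  authoring: "MODE: help the user build a Skill draft.",
  coach: "MODE: discuss positions and the market.",
  ops: "MODE: help operate deployments.",
};

function composeSystemPrompt(input: {
  persona: string;
  tierAddendum: string;
  opening: OpeningContext;
  trustRules: string;
}): string {
  const sections: string[] = [
    input.persona,
    input.tierAddendum,
    MODE_ADDENDA[input.opening.mode],
  ];
  // Focus context is optional: only pages with a focus id supply one.
  if (input.opening.focusId !== undefined) sections.push(`FOCUS: ${input.opening.focusId}`);
  sections.push(input.trustRules); // trust rules go last
  return sections.join("\n\n");
}

// Usage: same persona and rules, different opening modes.
const promptBase = { persona: "PERSONA", tierAddendum: "TIER", trustRules: "TRUST" };
const coachPrompt = composeSystemPrompt({ ...promptBase, opening: { mode: "coach" } });
const opsPrompt = composeSystemPrompt({ ...promptBase, opening: { mode: "ops", focusId: "dep_123" } });
```

Keeping the stable sections (persona, tier) first and the per-conversation sections later is one way to keep a long shared prefix, which tends to help prompt caching.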

Trust and safety boundaries

Boundary	Enforced by
User can only see their own Skills	Supabase RLS
Agent can only call whitelisted tools	Tool hydration in agent runtime
Chat agent cannot place orders	Tool whitelist excludes `propose_order`
Chat agent cannot read other users' data	RLS on every chat read tool
Chat agent cannot speak as the trading agent	System-prompt trust rules + separate persona
Proposed action passes risk caps	Execution Engine
Daily loss halt triggers auto-flatten	Execution Engine (engine-side, not agent-side)
Exchange API keys never reach the agent	Engine holds keys; agent gets a broker handle
Imperative commands cannot come from chat	`prepare_action` only renders confirm-card; user click commits

Full detail: security/trust-boundaries.md.
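
The last boundary in the table, that prepare_action only stages a confirm-card and the user's click commits it, can be sketched as below. The in-memory stores, the card shape, and both function names are assumptions for illustration; in the real system the commit would call the same server actions the buttons use.

```typescript
type ActionKind = "deploy" | "stop" | "start_backtest";
type ConfirmCard = { id: string; userId: string; kind: ActionKind; targetId: string };

// In-memory stand-ins for pending cards and for side effects actually performed.
const pendingCards: Record<string, ConfirmCard> = {};
const committedActions: ConfirmCard[] = [];
let cardCounter = 0;

// Chat-tool side: stages a card and returns it for rendering. Nothing is executed.
function prepareAction(userId: string, kind: ActionKind, targetId: string): ConfirmCard {
  cardCounter += 1;
  const card: ConfirmCard = { id: `card_${cardCounter}`, userId, kind, targetId };
  pendingCards[card.id] = card;
  return card;
}

// UI side: runs only when the user clicks Confirm.
function confirmAction(cardId: string, userId: string): ConfirmCard {
  const card = pendingCards[cardId];
  if (card === undefined || card.userId !== userId) {
    throw new Error("No pending action with that id for this user");
  }
  delete pendingCards[cardId]; // a card can be committed at most once
  committedActions.push(card); // stand-in for calling the real server action
  return card;
}

// Usage: the agent prepares a stop; nothing happens until the click.
const stopCard = prepareAction("user_1", "stop", "dep_123");
const committedBeforeClick = committedActions.length;
confirmAction(stopCard.id, "user_1");
```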

Tech stack

Layer	Choice	ADR
Web framework	Next.js 16 App Router
Hosting (web + sim)	Vercel (Fluid Compute)
Hosting (live runner)	Fly.io Machines (one per deployment)	0004
Database	Supabase (Postgres + pgvector + Auth + Storage + Realtime)	0003
Agent runtime	Vercel AI SDK v6	0002
Model routing	Vercel AI Gateway (BYOK, 0% markup)	0002
Tool extensibility	Local registry + MCP
Exchange (MVP)	Hyperliquid via `@nktkas/hyperliquid`	0001
Schema validation	zod
Monorepo	pnpm workspaces + Turborepo
CI	GitHub Actions
Error monitoring	Sentry (Phase 3)

Repository layout

agentic-trading/
├── apps/
│   ├── web/                    # Next.js — UI + control API + chat agent route
│   └── live-runner/            # Node service — deployed to Fly per Skill
├── packages/
│   ├── skill-schema/           # zod types, version helpers, prompt assembly
│   ├── agent-runtime/          # runSkill(): tool hydration + generateText
│   ├── tools/                  # built-in tool registry + MCP loader
│   │   └── src/tools/
│   │       ├── fetch-recent-bars.ts
│   │       ├── fetch-news-sentiment.ts
│   │       ├── get-portfolio.ts
│   │       ├── propose-order.ts
│   │       └── introspection/        # read-only tools for chat agent
│   ├── execution-engine/       # validation, risk checks, broker abstraction
│   ├── brokers/
│   │   ├── paper/              # deterministic sim broker (cross-margin, ADR-0014)
│   │   └── hyperliquid-mainnet/   # (Phase 2/3; testnet removed — ADR-0015)
│   ├── simulator/              # replay loop, sim run management
│   ├── data-ingest/            # scheduled price/news/funding ingestion
│   ├── db/                     # Supabase client, RLS policies, migrations
│   └── shared/                 # types used across packages
├── docs/                       # you are here
└── infra/                      # Fly app config, supabase config, vercel.ts

Out-of-the-box defaults

All packages: TypeScript strict mode
All shared types come from packages/shared or packages/skill-schema
All DB access goes through packages/db — no raw Supabase clients in apps
All AI model calls go through AI Gateway — no direct provider SDKs
All exchange calls go through the Execution Engine; agents never call a broker adapter or hold exchange keys directly