Roadmap

Status: Draft v0.1 · Last updated 2026-05-27

Phases are sized in weeks of single-engineer effort, not calendar time. They are sequential — each phase assumes the previous one shipped.

Phase 0 — Foundation (week 1)

Goal: the repo is live, a hello-world app deploys, and all infra accounts are provisioned.

Monorepo scaffolding (pnpm + Turborepo)
Next.js app deployed to Vercel (preview + prod)
Supabase project provisioned via Vercel Marketplace
Fly.io app + machine template created (no real logic yet)
AI Gateway configured, BYOK key set
Hyperliquid mainnet read-only access verified (read-only endpoints don't require credentials; testnet deprecated in ADR-0015)
CI: lint + typecheck on PR
ADRs 0001–0008 ratified

Exit criteria: pnpm dev runs the web app locally with Supabase Auth working end-to-end.

Phase 1 — Skill authoring + Simulation (weeks 2–4)

Goal: an author can describe a strategy and backtest it against historical data.

packages/skill-schema — zod types, version-on-save
Skill editor form in web app
packages/tools registry with first 4 tools (bars, news, portfolio, propose_order)
packages/execution-engine — validation + risk checks + paper broker
packages/agent-runtime — runSkill using AI SDK + Gateway
Historical data ingestion script (Hyperliquid majors, 12 months)
News ingestion + pgvector embeddings (CryptoPanic free tier to start)
packages/simulator — replay loop, context assembly, decision snapshots
CLI: pnpm sim --skill <id> --from <date> --to <date>
Web UI: trigger sim, view equity curve + trade log + decision snapshots

Exit criteria: an author can create a Skill in the UI, click Backtest, and see a complete report within 2 minutes for 30 days of 5-minute bars.
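
The simulator's replay loop, context assembly, and decision snapshots could look roughly like the sketch below. It is an illustration under stated assumptions, not the actual packages/simulator API: the Bar, Decision, and Snapshot types, the replay function, and the lookback parameter are all hypothetical, and a toy pure function stands in for the agent call so the run is deterministic.

```typescript
// Hypothetical types; the real packages/simulator shapes may differ.
type Bar = { ts: number; close: number };
type Decision = { action: "buy" | "sell" | "hold"; size: number };
type Snapshot = { ts: number; context: Bar[]; decision: Decision; equity: number };

// Replays bars in order. At each step the decider sees only bars up to and
// including the current one (no look-ahead), and the decision is recorded
// together with the context that produced it.
function replay(
  bars: Bar[],
  decide: (context: Bar[]) => Decision,
  startingCash: number,
  lookback = 3,
): Snapshot[] {
  let cash = startingCash;
  let position = 0; // units held
  const snapshots: Snapshot[] = [];

  for (let i = 0; i < bars.length; i++) {
    const bar = bars[i];
    const context = bars.slice(Math.max(0, i - lookback + 1), i + 1);
    const decision = decide(context);

    // Paper fill at the bar close; fees and slippage are ignored in this sketch.
    if (decision.action === "buy") {
      cash -= decision.size * bar.close;
      position += decision.size;
    } else if (decision.action === "sell") {
      cash += decision.size * bar.close;
      position -= decision.size;
    }

    snapshots.push({ ts: bar.ts, context, decision, equity: cash + position * bar.close });
  }
  return snapshots;
}

// Toy decider standing in for the agent: buy one unit on the first bar, then hold.
const sampleBars: Bar[] = [
  { ts: 1, close: 100 },
  { ts: 2, close: 110 },
  { ts: 3, close: 105 },
];
const simSnapshots = replay(
  sampleBars,
  (ctx) => (ctx.length === 1 ? { action: "buy", size: 1 } : { action: "hold", size: 0 }),
  1000,
);
const equityCurve = simSnapshots.map((s) => s.equity);
```

Storing the context next to each decision is what would let the trade log and the later chat route point back at exactly what the agent saw at that tick.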

Phase 2 — Live deployment + Chat (weeks 5–7)

Goal: an author can deploy a tested Skill live (mainnet, conservative caps) and talk to it.

Hyperliquid mainnet broker adapter (gated by explicit per-deployment confirmation; testnet deprecated — ADR-0015)
apps/live-runner — Fly app: control loop + tick loop, command listener via Postgres LISTEN/NOTIFY
Deploy trigger: web → Fly Machines API
Agent state writes (agent_state, agent_logs)
Deployment detail page: state snapshot, decision log, command panel (stop/pause/flatten/kill)
Chat agent route: streamText with introspection tools
Conversation persistence (agent_conversations, agent_messages)
Suggested-prompts row in chat UI

Exit criteria: an author can deploy a Skill on mainnet with conservative caps, watch it tick, ask it "why did you open that position?" and get a grounded answer citing the actual decision snapshot.

Phase 3 — Hardening (weeks 8–10)

Goal: the platform is trustworthy enough to run real money at meaningful size.

Encrypted API key vault for user exchange credentials
Daily loss halt + auto-flatten enforcement (engine-side, not agent-side)
Sentry integration for agent errors
Cost dashboard per Skill (tokens, $ per decision, $ per day)
Rate limiting + retry policy on exchange calls
Sim vs live drift report (compare paper fills to actual fills for the same skill)
Auth hardening: 2FA, session management, audit log

Exit criteria: a Skill runs on mainnet for 7 days without engineer intervention and leaves a complete audit trail.
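
As an illustration of what "engine-side, not agent-side" enforcement of the daily loss halt could mean, here is a minimal sketch. The RiskConfig and EngineAction types, the checkDailyLoss function, and the maxDailyLossPct field are hypothetical names, not the actual packages/execution-engine API, and the 2% cap is an example value only.

```typescript
// Hypothetical shapes; not the actual packages/execution-engine API.
type RiskConfig = { maxDailyLossPct: number }; // e.g. 0.02 = halt after a 2% daily loss
type EngineAction =
  | { kind: "allow" }
  | { kind: "halt_and_flatten"; reason: string };

// Intended to run inside the engine before any agent proposal is considered,
// so the check does not depend on the agent's own judgment or output.
function checkDailyLoss(
  dayStartEquity: number,
  currentEquity: number,
  config: RiskConfig,
): EngineAction {
  const lossPct = (dayStartEquity - currentEquity) / dayStartEquity;
  if (lossPct >= config.maxDailyLossPct) {
    return {
      kind: "halt_and_flatten",
      reason: `daily loss ${(lossPct * 100).toFixed(2)}% reached cap ${(config.maxDailyLossPct * 100).toFixed(2)}%`,
    };
  }
  return { kind: "allow" };
}

const riskConfig: RiskConfig = { maxDailyLossPct: 0.02 };
const stillAllowed = checkDailyLoss(10_000, 9_900, riskConfig); // 1% down
const haltTriggered = checkDailyLoss(10_000, 9_750, riskConfig); // 2.5% down
```

Because the function takes only equity figures and configuration, nothing the agent proposes can change its result.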

Phase 4 — Iteration acceleration (weeks 11–14)

Goal: make authoring strategies dramatically faster.

Skill forking (clone + version)
Multi-skill comparison view (same date range, side-by-side metrics)
MCP server attachment (pre-approved community tools)
Skill diff view (between versions)
Sim queue with priority (so users running 10 sims don't block each other)
Optional deterministic helpers (entry.ts, exit.ts) compiled and called from agent loop

Exit criteria: an author can fork an existing Skill, change the model and one risk cap, and have a comparative backtest in under 5 minutes.
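
One way the sim queue could keep a user with many queued sims from blocking everyone else is to interleave jobs per user within each priority level. The sketch below is an assumption about how that might work, not a committed design; SimJob and fairOrder are hypothetical names.

```typescript
// Hypothetical job shape; higher priority runs first.
type SimJob = { id: string; userId: string; priority: number };

// Orders jobs so that, within a priority level, users take turns: each user's
// k-th job at that priority is scheduled before anyone's (k+1)-th job.
// Ties fall back to submission order.
function fairOrder(jobs: SimJob[]): SimJob[] {
  const counts = new Map<string, number>(); // jobs seen so far per (priority, user)
  const ranked = jobs.map((job, index) => {
    const key = `${job.priority}:${job.userId}`;
    const turn = counts.get(key) ?? 0;
    counts.set(key, turn + 1);
    return { job, turn, index };
  });
  ranked.sort(
    (a, b) =>
      b.job.priority - a.job.priority || a.turn - b.turn || a.index - b.index,
  );
  return ranked.map((r) => r.job);
}

// alice queues three sims before bob queues one; bob's sim still runs second.
const queueOrder = fairOrder([
  { id: "a1", userId: "alice", priority: 0 },
  { id: "a2", userId: "alice", priority: 0 },
  { id: "a3", userId: "alice", priority: 0 },
  { id: "b1", userId: "bob", priority: 0 },
]).map((j) => j.id);
```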

Phase 5 — Marketplace foundations (weeks 15+)

Goal: Skills become a shareable, valuable artifact — the platform moat.

Public Skill profiles (opt-in)
Standardized performance metrics (Sharpe, max drawdown, profit factor)
Skill provenance + audit (immutable version log)
Skill subscription model (consumer pays author per deployment)
Forking with attribution
Skill discovery / search
Reviews and disputes process

Out of scope: trustless on-chain Skill ownership, copy-trading other people's live positions, social features.
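
The three standardized metrics named above have common textbook definitions, sketched here for concreteness. The roadmap does not fix the conventions, so the choices below are assumptions: a zero risk-free rate, sample standard deviation, simple per-period returns, and drawdown measured as a fraction of the running peak. All function names are hypothetical.

```typescript
// Per-period simple returns from an equity curve.
function returnsFrom(equity: number[]): number[] {
  const out: number[] = [];
  for (let i = 1; i < equity.length; i++) out.push(equity[i] / equity[i - 1] - 1);
  return out;
}

// Annualized Sharpe ratio, assuming a zero risk-free rate and sample standard deviation.
function sharpe(returns: number[], periodsPerYear: number): number {
  const n = returns.length;
  if (n < 2) return NaN;
  const mean = returns.reduce((a, b) => a + b, 0) / n;
  const variance = returns.reduce((a, r) => a + (r - mean) ** 2, 0) / (n - 1);
  if (variance === 0) return NaN;
  return (mean / Math.sqrt(variance)) * Math.sqrt(periodsPerYear);
}

// Largest peak-to-trough decline, as a fraction of the peak (0.25 = 25%).
function maxDrawdown(equity: number[]): number {
  let peak = -Infinity;
  let worst = 0;
  for (const e of equity) {
    peak = Math.max(peak, e);
    worst = Math.max(worst, (peak - e) / peak);
  }
  return worst;
}

// Gross profit divided by gross loss across closed trades.
function profitFactor(tradePnls: number[]): number {
  const profit = tradePnls.filter((p) => p > 0).reduce((a, b) => a + b, 0);
  const loss = tradePnls.filter((p) => p < 0).reduce((a, b) => a - b, 0);
  return loss === 0 ? Infinity : profit / loss;
}

const sampleEquity = [100, 120, 90, 110];
const sampleDrawdown = maxDrawdown(sampleEquity); // peak 120 to trough 90
const sampleProfitFactor = profitFactor([50, -20, 30, -20]); // 80 gross profit over 40 gross loss
const sampleSharpe = sharpe(returnsFrom(sampleEquity), 365);
```

Whatever conventions are eventually chosen, publishing them alongside the numbers would keep Skill profiles comparable with one another.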

Backlog (no phase assigned)

Multi-exchange support (Binance, Bybit, OKX) — abstract broker interface accommodates this from day one, but adapters are real work
Multi-agent decomposition (analyst → strategist → risk officer → executor handoffs)
Team workspaces with RBAC
Mobile app (probably never; PWA at most)
Skill plagiarism / copy detection in marketplace
Realized-PnL tax reports

What we will deliberately not do

Custody user funds. Always user-keyed exchange access.
Give financial advice. Skills are user configurations; we execute, we don't recommend.
Auto-execute on agent-suggested risk overrides. The engine's risk caps are sovereign.
Promise SLAs we can't back. Until we have ops maturity, no uptime guarantees.
