Roadmap
Status: Draft v0.1 · Last updated 2026-05-27
Phases are sized in weeks of single-engineer effort, not calendar time. They are sequential — each phase assumes the previous one shipped.
Phase 0 — Foundation (week 1)
Goal: repo is alive, can deploy a hello-world, infra accounts are provisioned.
- Monorepo scaffolding (pnpm + Turborepo)
- Next.js app deployed to Vercel (preview + prod)
- Supabase project provisioned via Vercel Marketplace
- Fly.io app + machine template created (no real logic yet)
- AI Gateway configured, BYOK key set
- Hyperliquid mainnet read-only access verified (read-only endpoints don't require credentials; testnet deprecated in ADR-0015)
- CI: lint + typecheck on PR
- ADRs 0001–0008 ratified
Exit criteria: pnpm dev runs the web app locally with Supabase Auth working end-to-end.
Phase 1 — Skill authoring + Simulation (weeks 2–4)
Goal: an author can describe a strategy and backtest it against historical data.
- packages/skill-schema: zod types, version-on-save
- Skill editor form in web app
- packages/tools registry with first 4 tools (bars, news, portfolio, propose_order)
- packages/execution-engine: validation + risk checks + paper broker
- packages/agent-runtime: runSkill using AI SDK + Gateway
- Historical data ingestion script (Hyperliquid majors, 12 months)
- News ingestion + pgvector embeddings (CryptoPanic free tier to start)
- packages/simulator: replay loop, context assembly, decision snapshots
- CLI: pnpm sim --skill <id> --from <date> --to <date>
- Web UI: trigger sim, view equity curve + trade log + decision snapshots
Exit criteria: an author can create a Skill in the UI, click Backtest, and see a complete report within 2 minutes for 30 days of 5m bars.
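To make the Phase 1 exit criterion concrete, the arithmetic behind "2 minutes for 30 days of 5m bars" can be sketched as below. This is a back-of-envelope estimate, not a measured figure: it assumes a single symbol and one decision per bar, neither of which the roadmap states, and the function names are illustrative.

```typescript
// Back-of-envelope budget for the Phase 1 exit criterion.
// Assumption (not stated in the roadmap): one symbol, one decision per bar.
const MINUTES_PER_DAY = 24 * 60;

// Number of bars of a given width in a window of N days.
function barsInWindow(days: number, barMinutes: number): number {
  return (days * MINUTES_PER_DAY) / barMinutes;
}

// Wall-clock milliseconds available per bar if the whole run must finish
// within budgetSeconds.
function msBudgetPerBar(days: number, barMinutes: number, budgetSeconds: number): number {
  return (budgetSeconds * 1000) / barsInWindow(days, barMinutes);
}

const bars = barsInWindow(30, 5);            // 8640 bars
const perBarMs = msBudgetPerBar(30, 5, 120); // about 13.9 ms per bar
console.log(`${bars} bars, ${perBarMs.toFixed(1)} ms per bar`);
```

Under these assumptions the budget is roughly 14 ms per bar, which is likely too tight for a model call on every bar. The simulator would presumably need to decide less often than every bar, or run decisions concurrently; the roadmap does not say which.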
Phase 2 — Live deployment + Chat (weeks 5–7)
Goal: an author can deploy a tested Skill live (mainnet, conservative caps) and talk to it.
- Hyperliquid mainnet broker adapter (gated by explicit per-deployment confirmation; testnet deprecated — ADR-0015)
- apps/live-runner: Fly app with control loop + tick loop, command listener via Postgres LISTEN/NOTIFY
- Deploy trigger: web → Fly Machines API
- Agent state writes (agent_state, agent_logs)
- Deployment detail page: state snapshot, decision log, command panel (stop/pause/flatten/kill)
- Chat agent route: streamText with introspection tools
- Conversation persistence (agent_conversations, agent_messages)
- Suggested-prompts row in chat UI
Exit criteria: an author can deploy a Skill on mainnet with conservative caps, watch it tick, ask it "why did you open that position?" and get a grounded answer citing the actual decision snapshot.
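As an illustration of the command listener in Phase 2, here is a minimal sketch of validating a NOTIFY payload before the live runner acts on it. Postgres delivers NOTIFY payloads as plain strings, so some parsing step is needed. The JSON shape, the deploymentId field, and the function names are hypothetical; only the four command names come from the roadmap.

```typescript
// The four commands named in the deployment detail page's command panel.
const COMMANDS = ["stop", "pause", "flatten", "kill"] as const;
type CommandName = (typeof COMMANDS)[number];

// Hypothetical payload shape; the roadmap does not define one.
interface AgentCommand {
  deploymentId: string;
  command: CommandName;
}

function isCommandName(value: unknown): value is CommandName {
  return typeof value === "string" && (COMMANDS as readonly string[]).indexOf(value) !== -1;
}

// Parse a NOTIFY payload (always a string in Postgres) into a typed command.
// Returns null for anything malformed so the listener can log and ignore it.
function parseCommand(payload: string): AgentCommand | null {
  let raw: unknown;
  try {
    raw = JSON.parse(payload);
  } catch {
    return null;
  }
  if (typeof raw !== "object" || raw === null) return null;
  const { deploymentId, command } = raw as Record<string, unknown>;
  if (typeof deploymentId !== "string" || deploymentId.length === 0) return null;
  if (!isCommandName(command)) return null;
  return { deploymentId, command };
}

// A well-formed payload yields a typed command; an unknown command yields null.
console.log(parseCommand('{"deploymentId":"dep_1","command":"pause"}'));
console.log(parseCommand('{"deploymentId":"dep_1","command":"explode"}'));
```

Rejecting unknown commands at the boundary keeps the control loop's handling exhaustive over the four known cases.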
Phase 3 — Hardening (weeks 8–10)
Goal: trustworthy enough to point at real money at meaningful size.
- Encrypted API key vault for user exchange credentials
- Daily loss halt + auto-flatten enforcement (engine-side, not agent-side)
- Sentry integration for agent errors
- Cost dashboard per Skill (tokens, $ per decision, $ per day)
- Rate limiting + retry policy on exchange calls
- Sim vs live drift report (compare paper fills to actual fills for the same skill)
- Auth hardening: 2FA, session management, audit log
Exit criteria: a Skill runs on mainnet for 7 days without engineer intervention, with a complete audit trail.
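The "engine-side, not agent-side" daily loss halt from the Phase 3 list could look roughly like the sketch below. All type and field names are illustrative, and counting unrealized PnL toward the cap is an assumption; the roadmap does not specify how daily loss is measured.

```typescript
// Hypothetical shapes; field names are illustrative, not from the codebase.
interface RiskCaps {
  maxDailyLossUsd: number; // positive number, e.g. 200 means halt at -200
}

interface DayPnl {
  realizedUsd: number;
  unrealizedUsd: number; // assumption: open-position losses count toward the cap
}

type EngineAction =
  | { kind: "continue" }
  | { kind: "halt_and_flatten"; reason: string };

// Intended to run inside the execution engine before any agent proposal is
// considered, so the check does not depend on the agent's cooperation.
function checkDailyLoss(pnl: DayPnl, caps: RiskCaps): EngineAction {
  const total = pnl.realizedUsd + pnl.unrealizedUsd;
  if (total <= -caps.maxDailyLossUsd) {
    return {
      kind: "halt_and_flatten",
      reason: `daily PnL ${total.toFixed(2)} breached cap -${caps.maxDailyLossUsd.toFixed(2)}`,
    };
  }
  return { kind: "continue" };
}

const caps: RiskCaps = { maxDailyLossUsd: 200 };
console.log(checkDailyLoss({ realizedUsd: -100, unrealizedUsd: -50 }, caps).kind); // "continue"
console.log(checkDailyLoss({ realizedUsd: -150, unrealizedUsd: -60 }, caps).kind); // "halt_and_flatten"
```

Because the function takes only PnL and caps as input, nothing the agent proposes can influence the outcome, which is the point of enforcing it in the engine.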
Phase 4 — Iteration acceleration (weeks 11–14)
Goal: make authoring strategies dramatically faster.
- Skill forking (clone + version)
- Multi-skill comparison view (same date range, side-by-side metrics)
- MCP server attachment (pre-approved community tools)
- Skill diff view (between versions)
- Sim queue with priority (so users running 10 sims don't block each other)
- Optional deterministic helpers (entry.ts, exit.ts) compiled and called from agent loop
Exit criteria: an author can fork an existing Skill, change the model and one risk cap, and have a comparative backtest in under 5 minutes.
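One way the Phase 4 sim queue could keep a user with 10 queued sims from blocking everyone else is per-user round-robin. The sketch below shows that policy only as an illustration; the roadmap says "priority" without naming a scheme, and the SimJob shape and class name are hypothetical.

```typescript
// Hypothetical job shape for illustration.
interface SimJob {
  id: string;
  userId: string;
}

// Per-user FIFO queues served round-robin, so one user's backlog
// delays only that user's own jobs.
class FairSimQueue {
  private queues: { [userId: string]: SimJob[] } = {};
  private order: string[] = []; // users with pending jobs, in service order

  enqueue(job: SimJob): void {
    if (!this.queues[job.userId]) {
      this.queues[job.userId] = [];
      this.order.push(job.userId);
    }
    this.queues[job.userId].push(job);
  }

  dequeue(): SimJob | undefined {
    const userId = this.order.shift();
    if (userId === undefined) return undefined;
    const queue = this.queues[userId];
    const job = queue.shift();
    if (queue.length > 0) {
      this.order.push(userId); // back of the line for this user's next job
    } else {
      delete this.queues[userId];
    }
    return job;
  }
}

const simQueue = new FairSimQueue();
simQueue.enqueue({ id: "a1", userId: "alice" });
simQueue.enqueue({ id: "a2", userId: "alice" });
simQueue.enqueue({ id: "a3", userId: "alice" });
simQueue.enqueue({ id: "b1", userId: "bob" });

const served: string[] = [];
let next = simQueue.dequeue();
while (next !== undefined) {
  served.push(next.id);
  next = simQueue.dequeue();
}
console.log(served.join(", ")); // a1, b1, a2, a3
```

Bob's single job runs second even though Alice queued three jobs first. A production queue would likely live in Postgres rather than in memory; this only shows the ordering policy.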
Phase 5 — Marketplace foundations (weeks 15+)
Goal: Skills become a shareable, valuable artifact — the platform moat.
- Public Skill profiles (opt-in)
- Standardized performance metrics (Sharpe, max drawdown, profit factor)
- Skill provenance + audit (immutable version log)
- Skill subscription model (consumer pays author per deployment)
- Forking with attribution
- Skill discovery / search
- Reviews and disputes process
Out of scope: trustless on-chain Skill ownership, copy-trading other people's live positions, social features.
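For reference, the three standardized metrics named in Phase 5 can be computed as in the sketch below. These are common textbook definitions, not necessarily the ones the platform will adopt: the Sharpe ratio here is per-period with a zero risk-free rate and no annualization, and the edge-case return values are choices made for this example.

```typescript
// Largest peak-to-trough decline of an equity curve, as a fraction of the peak.
function maxDrawdown(equity: number[]): number {
  let peak = -Infinity;
  let worst = 0;
  for (const value of equity) {
    if (value > peak) peak = value;
    if (peak > 0) {
      const dd = (peak - value) / peak;
      if (dd > worst) worst = dd;
    }
  }
  return worst;
}

// Gross profit divided by gross loss over closed trades.
// With no losing trades: Infinity if there was any profit, otherwise 0.
function profitFactor(tradePnls: number[]): number {
  let grossProfit = 0;
  let grossLoss = 0;
  for (const pnl of tradePnls) {
    if (pnl > 0) grossProfit += pnl;
    else grossLoss += -pnl;
  }
  if (grossLoss === 0) return grossProfit > 0 ? Infinity : 0;
  return grossProfit / grossLoss;
}

// Per-period Sharpe ratio: mean return over sample standard deviation.
// Assumes a zero risk-free rate and does not annualize.
function sharpe(returns: number[]): number {
  const n = returns.length;
  if (n < 2) return 0;
  let sum = 0;
  for (const r of returns) sum += r;
  const mean = sum / n;
  let sq = 0;
  for (const r of returns) sq += (r - mean) * (r - mean);
  const std = Math.sqrt(sq / (n - 1));
  return std === 0 ? 0 : mean / std;
}

console.log(maxDrawdown([100, 120, 90, 130, 117])); // 0.25
console.log(profitFactor([50, -20, 30, -20]));      // 2
```

For public profiles, whatever conventions are chosen (annualization factor, risk-free rate, treatment of open positions) would need to be the same for every Skill, or the numbers are not comparable.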
Backlog (no phase assigned)
- Multi-exchange support (Binance, Bybit, OKX) — abstract broker interface accommodates this from day one, but adapters are real work
- Multi-agent decomposition (analyst → strategist → risk officer → executor handoffs)
- Team workspaces with RBAC
- Mobile app (probably never; PWA at most)
- Skill plagiarism / copy detection in marketplace
- Realized-PnL tax reports
What we will deliberately not do
- Custody user funds. Always user-keyed exchange access.
- Give financial advice. Skills are user configurations; we execute, we don't recommend.
- Auto-execute on agent-suggested risk overrides. The engine's risk caps are sovereign.
- Promise SLAs we can't back. Until we have ops maturity, no uptime guarantees.