Product Requirements
Status: Draft v0.1 · Last updated 2026-05-27
Vision
Move trading from human-in-the-loop to agent-in-the-loop. The user describes their edge in plain language; an AI agent makes the discretionary judgment every bar — read the regime, stand aside when it should, time the entry — and executes it automatically, inside risk caps the platform enforces independently. The user proves it works against historical data and deploys it live — all without writing production code.
The distinction matters: most "automated" trading still leaves the hard calls to the human and automates only the mechanical execution. Here the human contributes the edge once; the agent contributes the judgment and the execution. The platform becomes the operating system for autonomous trading strategies: a place where Skills are authored, simulated, deployed, monitored, and (eventually) shared.
Target users
Primary: Skill Author (the quant/trader)
- Has trading intuition and strategy ideas
- Comfortable with markets, less so with infrastructure
- Wants fast iteration on strategy logic, not on glue code
- Cares about: backtest fidelity, risk control, model quality, cost per decision
Secondary: Deployer (often the same person)
- Operates one or more live Skills against their own funds
- Wants visibility into why the agent did something, not just what
- Needs trustworthy kill-switches and risk overrides
- Cares about: uptime, latency to action, explainability, audit trail
Later: Skill Consumer (marketplace)
- Wants to deploy someone else's proven Skill
- Cares about: trust, performance history, transparency, fee model
Problem
Building an AI trading agent today requires:
- Plumbing: exchange APIs, data pipelines, scheduling, observability, deployment
- Safety: risk controls, kill-switches, sandboxing
- AI runtime: model routing, tool calling, context management
- Backtesting: historical replay with realistic fills, news, funding
A quant with a great strategy idea spends 90% of their time on the four areas above and 10% on the actual strategy. Using an LLM on its own keeps getting cheaper, but stitching it into a live trading agent is still bespoke work for every team.
Solution
A platform with four core surfaces:
- Chat Agent — one branded, product-wide assistant the user talks to about everything (see ADR-0019). It authors Skills by conversation (tailored to the user's trading experience tier), coaches the user on positions and market structure, and prepares ops actions (deploy / stop / start backtest) as inline confirm-cards. It replaces the legacy 7-tab Skill editor.
- Skill Draft + Versions — every chat-driven authoring session writes to a skill_drafts row, validated against zod on each tool call. The Save button promotes a draft to a new immutable skill_versions row that the rest of the system consumes.
- Simulation Engine — replay historical price + news + funding context tick by tick. The same agent code that will run live runs in sim, against a paper broker. Output: trade log, equity curve, risk metrics, cost.
- Live Runtime — one process per deployed Skill on Fly.io. Subscribes to Hyperliquid, ticks the agent on schedule, routes proposed actions through the Execution Engine. The Deployer talks to the same Chat Agent for context and commands.
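The draft-to-version flow described above can be sketched roughly as follows. This is a minimal illustration, not the platform's schema: the `SkillPayload` fields other than the strategy mode, the `missingFields` and `promote` helpers, and the integer version counter are all assumptions, and plain type checks stand in for the zod validation the document mentions.

```typescript
// Hypothetical shapes: the real skill_drafts / skill_versions columns
// are not specified in this document.
type StrategyMode = "thesis" | "rules" | "hybrid";

interface SkillPayload {
  name: string;
  mode: StrategyMode;
  maxPositionUsd: number;
}

interface SkillDraft {
  skillId: string;
  payload: Partial<SkillPayload>; // drafts may be incomplete while the chat is in progress
}

interface SkillVersion {
  readonly skillId: string;
  readonly version: number;
  readonly payload: Readonly<SkillPayload>;
}

// Lists the fields still missing or invalid; an empty array means the draft can be promoted.
function missingFields(p: Partial<SkillPayload>): string[] {
  const missing: string[] = [];
  if (!p.name) missing.push("name");
  if (!p.mode) missing.push("mode");
  if (typeof p.maxPositionUsd !== "number" || p.maxPositionUsd <= 0) {
    missing.push("maxPositionUsd");
  }
  return missing;
}

// Promotes a complete draft to a new frozen version; existing versions are never modified.
function promote(draft: SkillDraft, existing: readonly SkillVersion[]): SkillVersion {
  const missing = missingFields(draft.payload);
  if (missing.length > 0) {
    throw new Error(`draft incomplete: ${missing.join(", ")}`);
  }
  const latest = existing
    .filter(v => v.skillId === draft.skillId)
    .reduce((max, v) => Math.max(max, v.version), 0);
  return Object.freeze({
    skillId: draft.skillId,
    version: latest + 1,
    payload: Object.freeze({ ...(draft.payload as SkillPayload) }),
  });
}
```

The point of the split is that partial payloads are tolerated only on the draft side; anything downstream (simulation, live runtime) would read frozen versions only.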
The Execution Engine is the platform's safety promise: AI trading agents never call exchanges directly. They emit JSON proposed actions; the engine validates against Skill-defined risk caps and rejects or executes. The Chat Agent never reaches the engine — its only write path is prepare_action, which renders a confirm-card the user must click.
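As a rough sketch of that contract, the engine can be modeled as a pure function from a raw JSON string plus risk caps to an accept/reject result. Everything named here (`ProposedOrder`, `RiskCaps`, `evaluateAction`, and the specific caps) is a hypothetical stand-in; the document does not define the action schema or which caps a Skill may set.

```typescript
// Sketch only: these shapes are assumptions, not the platform's real schema.
interface ProposedOrder {
  kind: "order";
  symbol: string;
  side: "buy" | "sell";
  sizeUsd: number;
}

interface RiskCaps {
  allowedSymbols: string[];
  maxOrderUsd: number;
  maxPositionUsd: number; // cap on absolute net exposure
}

type EngineResult =
  | { status: "accepted"; order: ProposedOrder }
  | { status: "rejected"; reason: string };

function reject(reason: string): EngineResult {
  return { status: "rejected", reason };
}

// Parses untrusted agent output; returns null for anything malformed.
function parseProposedOrder(raw: string): ProposedOrder | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch (_err) {
    return null;
  }
  if (typeof data !== "object" || data === null) return null;
  const { kind, symbol, side, sizeUsd } = data as Record<string, unknown>;
  if (kind !== "order" || typeof symbol !== "string") return null;
  if (side !== "buy" && side !== "sell") return null;
  if (typeof sizeUsd !== "number" || !isFinite(sizeUsd) || sizeUsd <= 0) return null;
  return { kind: "order", symbol, side, sizeUsd };
}

// Checks a proposed action against the caps; nothing is sent to a broker here.
function evaluateAction(raw: string, caps: RiskCaps, currentPositionUsd: number): EngineResult {
  const order = parseProposedOrder(raw);
  if (order === null) return reject("malformed action");
  if (caps.allowedSymbols.indexOf(order.symbol) === -1) return reject("symbol not allowed");
  if (order.sizeUsd > caps.maxOrderUsd) return reject("order exceeds per-order cap");
  const signed = order.side === "buy" ? order.sizeUsd : -order.sizeUsd;
  if (Math.abs(currentPositionUsd + signed) > caps.maxPositionUsd) {
    return reject("position cap exceeded");
  }
  return { status: "accepted", order };
}
```

Keeping the check as a pure function means the same code path could serve both the paper broker in simulation and the live adapter, with the broker call happening only after an `accepted` result.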
MVP feature set
Must have
- Auth (Supabase): sign up, sign in
- Trader experience tier collected at first authoring session (novice/intermediate/expert), persisted on the user profile, editable anytime.
- Skill schema (zod) with versioning
- skill_drafts table with partial-payload validation, auto-save, and resume.
- Chat-driven Skill authoring (replaces the legacy 7-tab form). Tier-aware questions, live draft preview pane, Save promotes draft to a new skill_versions row. Strategy mode toggle (thesis / rules / hybrid), leash dial, style picker, and template application are now agent capabilities, not separate UI surfaces. See ADR-0019 and ADR-0012.
- Tool registry with the first four tools:
  - fetch_recent_bars
  - fetch_news_sentiment
  - get_portfolio
  - propose_order
- Execution Engine: validation pipeline + risk checks + paper-broker adapter
- Historical data ingestion for Hyperliquid majors (BTC, ETH, SOL) — last 12 months at 1m + 5m bars
- News ingestion (provider TBD) with embeddings in pgvector
- Simulation runner (CLI + UI trigger): pick Skill + date range → JSON report
- Sim result viewer: equity curve, trade log, decision snapshots
- Hyperliquid mainnet broker adapter (testnet deprecated — see ADR-0015)
- Live deployment trigger (web → Fly Machine provisioning)
- Live agent control loop with commands: stop, pause, resume, flatten, kill
- Agent state writes: position, equity, last action, last reasoning
- Single chat agent for the whole product: streaming, user-scoped, three opening modes (authoring/coach/ops), ≤14 tools. Pinned to Claude Sonnet 4.6 by default; configurable. See ADR-0019.
- prepare_action confirm-cards in chat for deploy / stop / restart / start_backtest — chat prepares, user clicks. ADR-0007 preserved.
- Decision snapshot capture (context, steps, reasoning, action, engine result)
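One way to picture the decision snapshot requirement is a record with one slot per listed part, plus a check that every slot was filled. The field names below mirror the list in the requirement; the field types, the `isComplete` and `completenessPct` helpers, and the choice to treat a null action as "the agent stood aside" are assumptions made for illustration.

```typescript
// Field names follow the requirement; the types are hypothetical.
interface DecisionSnapshot {
  context?: Record<string, unknown>; // inputs the agent saw on this tick
  steps?: string[];                  // tool calls or intermediate steps
  reasoning?: string;
  action?: { kind: string } | null;  // assumption: null records a deliberate no-op
  engineResult?: { status: "accepted" | "rejected"; reason?: string };
}

const REQUIRED_PARTS: (keyof DecisionSnapshot)[] = [
  "context",
  "steps",
  "reasoning",
  "action",
  "engineResult",
];

// A snapshot is complete when every required part was recorded, even if empty or null.
function isComplete(s: DecisionSnapshot): boolean {
  return REQUIRED_PARTS.every(k => s[k] !== undefined);
}

// Percentage of snapshots that are complete.
// Assumption: an empty batch counts as 100, since nothing was missed.
function completenessPct(snaps: DecisionSnapshot[]): number {
  if (snaps.length === 0) return 100;
  return (snaps.filter(isComplete).length / snaps.length) * 100;
}
```

A check of this kind is what a "100% of decisions with complete snapshot" target would be measured against.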
Nice to have (post-MVP, still early)
- Real-money Hyperliquid mainnet (gated behind explicit confirmation)
- Multi-skill comparison view
- Skill forking
- MCP server attachment (community tools)
- Cost dashboard per skill
- Sentry integration for agent errors
Non-goals (for MVP)
- Public marketplace — Skills are private per user.
- Live P&L dashboard — read-only deployment detail page with on-demand state; no streaming UI.
- Multi-exchange — Hyperliquid only.
- Cross-skill portfolio optimization — each Skill manages its own position independently.
- Auto-tuning / parameter sweeps — Authors iterate manually.
- Team workspaces, RBAC — Single user per workspace.
- Mobile app — Web only.
Success metrics (early)
| Metric | Target by MVP end |
|---|---|
| Time from sign-up to first sim run | < 10 min |
| Sim of 30 days at 5m bars completes in | < 2 min |
| Time from Deploy Live click to agent tick | < 60 sec |
| Chat agent first token latency | < 2 sec |
| % decisions with complete snapshot captured | 100% |
| Engine rejection rate of unsafe actions | tracked (no target — should be > 0 to prove engine works) |
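To make the simulation target concrete, the arithmetic below works out how many bars a 30-day run at 5m covers and how much wall-clock time each bar gets under the 2-minute budget. It assumes the market trades around the clock and that the simulator processes every bar; the helper names are illustrative.

```typescript
const MINUTES_PER_DAY = 24 * 60;

// Number of bars in a window, assuming a market that trades 24/7.
function barsInWindow(days: number, barMinutes: number): number {
  return (days * MINUTES_PER_DAY) / barMinutes;
}

// Average wall-clock milliseconds available per bar for a given total budget.
function perBarBudgetMs(days: number, barMinutes: number, wallClockSec: number): number {
  return (wallClockSec * 1000) / barsInWindow(days, barMinutes);
}

const bars = barsInWindow(30, 5);            // 8640 bars
const budgetMs = perBarBudgetMs(30, 5, 120); // 120000 / 8640, about 13.9 ms per bar
```

Roughly 14 ms per bar is much less than a typical model round trip, so if the agent were consulted on every bar, the runner would presumably need concurrency, caching, or a coarser decision schedule to hit the target. The document does not say which approach is intended.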
Risks and open questions
| Risk | Mitigation |
|---|---|
| Hyperliquid mainnet downtime blocks live trading | Cross-margin paper broker (ADR-0014) is the primary dev surface; mainnet for real money only |
| News data provider cost / quality | Spike with CryptoPanic free tier; budget for Tiingo or similar |
| Sim fidelity (slippage, partial fills) doesn't match live | Document paper broker assumptions; validate top sims with small-size mainnet runs |
| AI cost per decision becomes prohibitive at high tick rates | Per-skill model swap (use Haiku for high-freq); enforce caps |
| Quants want to write Python helpers, not just prompts | Phase 2: allow optional entry.ts / exit.ts deterministic hooks |
| Regulatory exposure of platform operating live trading for users | Document clearly: users trade their own funds via their own keys |
Out of scope (and why)
- We do not custody user funds. Users connect their own exchange API keys. The platform is software, not a broker.
- We do not give financial advice. Skills are user-authored configurations; the platform executes them.
- We do not guarantee fills or uptime SLAs in MVP. Add SLAs only when we have ops maturity to back them.