Product Requirements

Status: Draft v0.1 · Last updated 2026-05-27

Vision

Move trading from human-in-the-loop to agent-in-the-loop. The user describes their edge in plain language; an AI agent makes the discretionary judgment on every bar (reading the regime, standing aside when it should, timing the entry) and executes it automatically, inside risk caps the platform enforces independently. The user proves it works against historical data and deploys it live, all without writing production code.

The distinction matters: most "automated" trading still leaves the hard calls to the human and automates only the mechanical execution. Here the human contributes the edge once; the agent contributes the judgment and the execution. The platform becomes the operating system for autonomous trading strategies: a place where Skills are authored, simulated, deployed, monitored, and (eventually) shared.

Target users

Primary: Skill Author (the quant/trader)

  • Has trading intuition and strategy ideas
  • Comfortable with markets, less so with infrastructure
  • Wants fast iteration on strategy logic, not on glue code
  • Cares about: backtest fidelity, risk control, model quality, cost per decision

Secondary: Deployer (often the same person)

  • Operates one or more live Skills against their own funds
  • Wants visibility into why the agent did something, not just what
  • Needs trustworthy kill-switches and risk overrides
  • Cares about: uptime, latency to action, explainability, audit trail

Later: Skill Consumer (marketplace)

  • Wants to deploy someone else's proven Skill
  • Cares about: trust, performance history, transparency, fee model

Problem

Building an AI trading agent today requires:

  1. Plumbing: exchange APIs, data pipelines, scheduling, observability, deployment
  2. Safety: risk controls, kill-switches, sandboxing
  3. AI runtime: model routing, tool calling, context management
  4. Backtesting: historical replay with realistic fills, news, funding

A quant with a great strategy idea spends 90% of their time on (1)-(4) and 10% on the actual strategy. The cost of using an LLM on its own is dropping, but stitching one into a live trading agent is still bespoke work for every team.

Solution

A platform with four core surfaces:

  1. Chat Agent — one branded, product-wide assistant the user talks to about everything (see ADR-0019). It authors Skills by conversation (tailored to the user's trading experience tier), coaches the user on positions and market structure, and prepares ops actions (deploy / stop / start backtest) as inline confirm-cards. It replaces the legacy 7-tab Skill editor.
  2. Skill Draft + Versions — every chat-driven authoring session writes to a skill_drafts row, validated against zod on each tool call. The Save button promotes a draft to a new immutable skill_versions row that the rest of the system consumes.
  3. Simulation Engine — replay historical price + news + funding context tick by tick. The same agent code that will run live runs in sim, against a paper broker. Output: trade log, equity curve, risk metrics, cost.
  4. Live Runtime — one process per deployed Skill on Fly.io. Subscribes to Hyperliquid, ticks the agent on schedule, routes proposed actions through the Execution Engine. The Deployer talks to the same Chat Agent for context and commands.
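
The Simulation Engine surface above can be illustrated with a minimal replay loop. This is only a sketch under stated assumptions: the names `Bar`, `Order`, `Agent`, `PaperBroker`, and `runSim` are hypothetical, the paper broker fills every order fully at the bar close with no fees or slippage, and news and funding context are left out. The point it shows is that the agent is an ordinary function, so the same function could be driven by a historical replay or by a live feed.

```typescript
// Hypothetical types for illustration; none of these names come from the platform.
type Bar = { time: number; close: number };
type Order = { side: "buy" | "sell"; size: number };
type Trade = Order & { time: number; price: number };

// The agent sees the current bar and position and returns an order,
// or null to stand aside. A sim or a live loop can call it the same way.
type Agent = (bar: Bar, position: number) => Order | null;

// Assumption: every order fills fully at the bar close, with no fees or slippage.
class PaperBroker {
  position = 0;
  trades: Trade[] = [];
  private cash: number;

  constructor(startingCash: number) {
    this.cash = startingCash;
  }

  execute(order: Order, bar: Bar): void {
    const signedSize = order.side === "buy" ? order.size : -order.size;
    this.cash -= signedSize * bar.close;
    this.position += signedSize;
    this.trades.push({ ...order, time: bar.time, price: bar.close });
  }

  // Mark-to-market equity at the given bar.
  equity(bar: Bar): number {
    return this.cash + this.position * bar.close;
  }
}

// Largest peak-to-trough decline of the equity curve, as a fraction of the peak.
function maxDrawdown(curve: number[]): number {
  let peak = -Infinity;
  let worst = 0;
  for (const equity of curve) {
    peak = Math.max(peak, equity);
    if (peak > 0) worst = Math.max(worst, (peak - equity) / peak);
  }
  return worst;
}

// Replays bars in order and collects the outputs a sim report would need.
function runSim(agent: Agent, bars: Bar[], broker: PaperBroker) {
  const equityCurve: number[] = [];
  for (const bar of bars) {
    const order = agent(bar, broker.position);
    if (order) broker.execute(order, bar);
    equityCurve.push(broker.equity(bar));
  }
  return { equityCurve, trades: broker.trades, maxDrawdown: maxDrawdown(equityCurve) };
}
```

Calling `runSim` with an agent that buys once and then stands aside yields a one-entry trade log and an equity curve that tracks the close price from the entry onward.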

The Execution Engine is the platform's safety promise: AI trading agents never call exchanges directly. They emit JSON proposed actions; the engine validates against Skill-defined risk caps and rejects or executes. The Chat Agent never reaches the engine — its only write path is prepare_action, which renders a confirm-card the user must click.
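
The propose-then-validate flow described above might look roughly like the following. It is a sketch, not the engine's actual pipeline: the field names in `ProposedAction` and `RiskCaps`, the specific caps, and the function names are all assumptions made for illustration, and plain type guards stand in for whatever schema validation the real engine uses.

```typescript
// Hypothetical shapes: field names are assumptions, not the platform's schema.
type ProposedAction = {
  type: "order";
  symbol: string;
  side: "buy" | "sell";
  size: number;  // base-asset units
  price: number; // reference price used for notional checks
};

type RiskCaps = {
  allowedSymbols: string[];
  maxOrderNotional: number;    // per-order cap, in quote currency
  maxPositionNotional: number; // cap on the resulting absolute position
};

type EngineResult =
  | { status: "accepted"; action: ProposedAction }
  | { status: "rejected"; reasons: string[] };

// Parses the agent's JSON and checks its shape; returns null if malformed.
function parseAction(raw: string): ProposedAction | null {
  let value: unknown;
  try {
    value = JSON.parse(raw);
  } catch {
    return null;
  }
  if (typeof value !== "object" || value === null) return null;
  const a = value as Record<string, unknown>;
  const ok =
    a.type === "order" &&
    typeof a.symbol === "string" &&
    (a.side === "buy" || a.side === "sell") &&
    typeof a.size === "number" && Number.isFinite(a.size) && a.size > 0 &&
    typeof a.price === "number" && Number.isFinite(a.price) && a.price > 0;
  return ok ? (value as ProposedAction) : null;
}

// Validates a proposed action against the Skill's caps.
// The agent never touches the exchange; only an accepted result would be executed.
function validateAction(raw: string, caps: RiskCaps, currentPosition: number): EngineResult {
  const action = parseAction(raw);
  if (!action) return { status: "rejected", reasons: ["malformed action"] };

  const reasons: string[] = [];
  if (!caps.allowedSymbols.includes(action.symbol)) reasons.push("symbol not allowed");

  const orderNotional = action.size * action.price;
  if (orderNotional > caps.maxOrderNotional) reasons.push("order notional exceeds cap");

  const signedSize = action.side === "buy" ? action.size : -action.size;
  const resultingNotional = Math.abs(currentPosition + signedSize) * action.price;
  if (resultingNotional > caps.maxPositionNotional) reasons.push("position notional exceeds cap");

  return reasons.length > 0 ? { status: "rejected", reasons } : { status: "accepted", action };
}
```

Collecting every failed check, rather than stopping at the first, is a design choice in this sketch: it gives the decision snapshot a complete record of why an action was rejected.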

MVP feature set

Must have

  • Auth (Supabase): sign up, sign in
  • Trader experience tier collected at first authoring session (novice / intermediate / expert), persisted on the user profile, editable anytime.
  • Skill schema (zod) with versioning
  • skill_drafts table with partial-payload validation, auto-save, and resume.
  • Chat-driven Skill authoring (replaces the legacy 7-tab form). Tier-aware questions, live draft preview pane, Save promotes draft to a new skill_versions row. Strategy mode toggle (thesis / rules / hybrid), leash dial, style picker, and template application are now agent capabilities, not separate UI surfaces. See ADR-0019 and ADR-0012.
  • Tool registry with the first four tools:
    • fetch_recent_bars
    • fetch_news_sentiment
    • get_portfolio
    • propose_order
  • Execution Engine: validation pipeline + risk checks + paper-broker adapter
  • Historical data ingestion for Hyperliquid majors (BTC, ETH, SOL) — last 12 months at 1m + 5m bars
  • News ingestion (provider TBD) with embeddings in pgvector
  • Simulation runner (CLI + UI trigger): pick Skill + date range → JSON report
  • Sim result viewer: equity curve, trade log, decision snapshots
  • Hyperliquid mainnet broker adapter (testnet deprecated — see ADR-0015)
  • Live deployment trigger (web → Fly Machine provisioning)
  • Live agent control loop with commands: stop, pause, resume, flatten, kill
  • Agent state writes: position, equity, last action, last reasoning
  • Single chat agent for the whole product: streaming, user-scoped, three opening modes (authoring / coach / ops), ≤14 tools. Pinned to Claude Sonnet 4.6 by default; configurable. See ADR-0019.
  • prepare_action confirm-cards in chat for deploy / stop / restart / start_backtest — chat prepares, user clicks. ADR-0007 preserved.
  • Decision snapshot capture (context, steps, reasoning, action, engine result)
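
The draft-to-version flow in the list above (partial-payload validation while authoring, then Save promoting a draft to an immutable version) can be sketched as follows. Everything here is an assumption for illustration: the three-field `Skill` shape, the leverage bounds, and the function names are hypothetical, and hand-written checks stand in for the zod schema so the example has no dependencies.

```typescript
// Hypothetical, heavily reduced Skill shape; the real schema is defined with zod.
type Skill = { name: string; symbol: string; maxLeverage: number };
type SkillDraft = Partial<Skill>;
type SkillVersion = { skillId: string; version: number; payload: Readonly<Skill> };

// Assumption: only the majors named for historical data ingestion are supported.
const SUPPORTED_SYMBOLS: string[] = ["BTC", "ETH", "SOL"];

// Partial validation: check only the fields present, so an
// in-progress draft can be validated after every tool call.
function validatePartial(draft: SkillDraft): string[] {
  const errors: string[] = [];
  if (draft.name !== undefined && draft.name.trim() === "") {
    errors.push("name must not be empty");
  }
  if (draft.symbol !== undefined && !SUPPORTED_SYMBOLS.includes(draft.symbol)) {
    errors.push("unsupported symbol");
  }
  // The 1..10 leverage bound is an invented example limit.
  if (draft.maxLeverage !== undefined && !(draft.maxLeverage >= 1 && draft.maxLeverage <= 10)) {
    errors.push("maxLeverage must be between 1 and 10");
  }
  return errors;
}

// Fields that must be filled in before the draft can be saved.
function missingFields(draft: SkillDraft): string[] {
  return (["name", "symbol", "maxLeverage"] as const).filter((key) => draft[key] === undefined);
}

// Save: a draft is promoted only if it is complete and valid.
// The result is frozen and gets the next version number for that Skill.
function promote(skillId: string, draft: SkillDraft, existing: SkillVersion[]): SkillVersion {
  const problems = [...validatePartial(draft), ...missingFields(draft).map((f) => `missing ${f}`)];
  if (problems.length > 0) {
    throw new Error(`cannot save draft: ${problems.join("; ")}`);
  }
  const latest = existing.reduce(
    (max, v) => (v.skillId === skillId ? Math.max(max, v.version) : max),
    0,
  );
  const payload = Object.freeze({ ...(draft as Skill) });
  return Object.freeze({ skillId, version: latest + 1, payload });
}
```

Because each saved version is frozen and numbered, later edits to the draft cannot change what a running simulation or deployment already consumed.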

Nice to have (post-MVP, still early)

  • Real-money Hyperliquid mainnet (gated behind explicit confirmation)
  • Multi-skill comparison view
  • Skill forking
  • MCP server attachment (community tools)
  • Cost dashboard per skill
  • Sentry integration for agent errors

Non-goals (for MVP)

  • Public marketplace — Skills are private per user.
  • Live P&L dashboard — read-only deployment detail page with on-demand state; no streaming UI.
  • Multi-exchange — Hyperliquid only.
  • Cross-skill portfolio optimization — each Skill manages its own position independently.
  • Auto-tuning / parameter sweeps — Authors iterate manually.
  • Team workspaces, RBAC — Single user per workspace.
  • Mobile app — Web only.

Success metrics (early)

Metric | Target by MVP end
Time from sign-up to first sim run | < 10 min
Sim of 30 days at 5m bars completes in | < 2 min
Time from Deploy Live click to agent tick | < 60 sec
Chat agent first token latency | < 2 sec
% decisions with complete snapshot captured | 100%
Engine rejection rate of unsafe actions | tracked (no target; should be > 0 to prove the engine works)

Risks and open questions

Risk | Mitigation
Hyperliquid mainnet downtime blocks live trading | Cross-margin paper broker (ADR-0014) is the primary dev surface; mainnet for real money only
News data provider cost / quality | Spike with CryptoPanic free tier; budget for Tiingo or similar
Sim fidelity (slippage, partial fills) doesn't match live | Document paper broker assumptions; validate top sims with small-size mainnet runs
AI cost per decision becomes prohibitive at high tick rates | Per-skill model swap (use Haiku for high-freq); enforce caps
Quants want to write Python helpers, not just prompts | Phase 2: allow optional entry.ts / exit.ts deterministic hooks
Regulatory exposure of platform operating live trading for users | Document clearly: users trade their own funds via their own keys

Out of scope (and why)

  • We do not custody user funds. Users connect their own exchange API keys. The platform is software, not a broker.
  • We do not give financial advice. Skills are user-authored configurations; the platform executes them.
  • We do not guarantee fills or uptime SLAs in MVP. Add SLAs only when we have ops maturity to back them.
