ADR-0019: One chat agent for authoring, coaching, and ops
- Status: accepted
- Date: 2026-06-05
- Refines: ADR-0005 — Skills are still authored as data, but the authoring surface shifts from a 7-tab form to a chat that writes the same SkillPayload.
- Refines: ADR-0006 — Chat is still split from the trading agent. It is no longer bound to a single deployment; one chat agent serves the user across all their deployments.
- Refines: ADR-0008 — Chat continues to fill the "what's happening?" gap; this ADR formalises it as the coach mode of the single agent.
- Preserves: ADR-0007 — Write actions surface as inline confirm-cards that the user clicks. The chat never issues a command.
- Preserves: ADR-0009 — The structured strategy shape is unchanged. The chat writes to it through validated tool calls; zod still gates every write.
- Extended by: ADR-0020 — Exposes trading memory via a detail param on existing reads (no new tool), adds position-aware exposure to get_upcoming_events, adds a user_facts layer with two new tools (remember, forget). Tool cap relaxed from 14 → 16.
Context
The original architecture (ADR-0006) planned a per-deployment chat agent that "spoke in the trading agent's voice" and answered questions about one running Skill. Skill authoring was a separate surface — a 7-tab form (~3,000 LOC under apps/web/components/skill-editor/) augmented with AI helpers (draft-from-pitch, critique-modal, strategy-linter).
Two product insights changed this:
- Users want one bot. Two bots — "the author bot" on /skills/new and "the deployment bot" on /deployments/[id] — split the product brand, doubled the prompt surface, and made it hard for the user to form a stable mental model. The natural product is one agent the user talks to about everything: drafting a strategy, asking how their BTC position is doing, deciding whether to redeploy.
- The form is a wrong-shape input for the audience. A trader thinks in "I want to catch funding-flip mean reversion on BTC, scale in over 2-3 entries, stand down during macro" — not in field names. The form forces them to flatten that into seven tabs and twenty fields. A chat that asks the right questions in the right order — tailored to their experience — produces a better Skill with less friction.
The combination: one persistent, branded chat agent that authors Skills, coaches the user on their open positions and the market, and prepares ops actions for confirmation.
Decision
Ship one chat agent for the whole product. AI SDK v6, streaming, per-request HTTP. Scoped to the calling user only. Three opening modes set by the page that mounts the chat:
| Mode | Opened from | Opening context handed to the agent |
|---|---|---|
| authoring | /skills/new, /skills/[id]/edit | Active skill_drafts row id + the user's tier |
| coach | /deployments/[id] | The focus deployment id and its recent activity |
| ops | /deployments, /sims | None — agent starts with a portfolio-wide overview |
Modes are not separate agents — same code, same persona, same tool registry. The mode only changes the opening system-prompt addendum (what the agent is being asked to focus on right now). A conversation that starts in authoring can drift into coach ("what's BTC funding doing while I'm thinking about this?") and back. The slide-over chat surface persists across page navigation in the same browser session.
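As a rough sketch of how a mode could reduce to a prompt addendum rather than a separate agent: the OpeningContext type, the buildSystemPrompt helper, and all prompt wording below are hypothetical and not taken from the codebase. Only the three mode names and their opening context come from the table above.

```typescript
// Hypothetical shapes: the ADR fixes the three modes and what each page hands
// to the agent, but not these type or function names.
type OpeningContext =
  | { mode: "authoring"; draftId: string; tier: "novice" | "intermediate" | "expert" }
  | { mode: "coach"; deploymentId: string; recentActivity: string[] }
  | { mode: "ops" };

// One shared persona for every mode. Illustrative wording only.
const BASE_PROMPT = "You are the product's single chat agent.";

// The mode contributes only an addendum; persona and tool registry are untouched.
function modeAddendum(ctx: OpeningContext): string {
  switch (ctx.mode) {
    case "authoring":
      return `Focus: authoring draft ${ctx.draftId} for a ${ctx.tier} trader.`;
    case "coach":
      return `Focus: coaching on deployment ${ctx.deploymentId} (${ctx.recentActivity.length} recent events).`;
    case "ops":
      return "Focus: portfolio-wide overview.";
  }
}

function buildSystemPrompt(ctx: OpeningContext): string {
  return `${BASE_PROMPT}\n${modeAddendum(ctx)}`;
}

const authoringPrompt = buildSystemPrompt({ mode: "authoring", draftId: "draft_1", tier: "novice" });
const opsPrompt = buildSystemPrompt({ mode: "ops" });
```

Because the base prompt is shared, a conversation that drifts from authoring into coach questions keeps the same persona. Only the opening focus differs.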
Trader experience tiers
A profile-level field traders.experience_tier ∈ { novice, intermediate, expert }, asked once on first authoring session and overridable per Skill. The tier shapes:
- The opening question ("Want me to walk you through how this works first?" vs. "Hand me a pitch and I'll draft the Skill")
- The depth of explanation in tool-call summaries (define "funding rate" vs. assume it)
- The question order during authoring (novice: goal → template → tweak; expert: pitch → critique → ship)
- The defaults the agent proposes when fields are unset
Tiers do not restrict capability. Every user has access to every field; the agent just calibrates how it gets them there.
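A minimal sketch of tier calibration as plain data, assuming a hypothetical TierCalibration shape and effectiveTier helper. The novice and expert entries paraphrase the examples above; the intermediate entry is a guess, since this ADR gives no example for it.

```typescript
type ExperienceTier = "novice" | "intermediate" | "expert";

// Hypothetical shape. Note there are no capability flags here:
// tiers calibrate the conversation, they do not gate fields.
interface TierCalibration {
  openingQuestion: string;
  questionOrder: string[];
  defineJargon: boolean; // e.g. define "funding rate" vs. assume it
}

const CALIBRATION: Record<ExperienceTier, TierCalibration> = {
  novice: {
    openingQuestion: "Want me to walk you through how this works first?",
    questionOrder: ["goal", "template", "tweak"],
    defineJargon: true,
  },
  // Assumed values: the ADR does not specify intermediate behaviour.
  intermediate: {
    openingQuestion: "Do you have a pitch, or should we start from a template?",
    questionOrder: ["goal", "pitch", "critique"],
    defineJargon: false,
  },
  expert: {
    openingQuestion: "Hand me a pitch and I'll draft the Skill",
    questionOrder: ["pitch", "critique", "ship"],
    defineJargon: false,
  },
};

// The per-Skill override, when present, wins over the profile-level tier.
function effectiveTier(profileTier: ExperienceTier, skillOverride?: ExperienceTier): ExperienceTier {
  return skillOverride ?? profileTier;
}

const calibration = CALIBRATION[effectiveTier("novice", "expert")];
```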
Authoring replaces the form
apps/web/components/skill-editor/skill-editor.tsx and its seven tab components are removed. The new authoring page mounts:
┌──────────────────────────────────────────────────────────┐
│  /skills/new  or  /skills/[id]/edit                      │
│  ┌─────────────────────────────┐  ┌───────────────────┐  │
│  │ Chat (primary)              │  │  Draft preview    │  │
│  │   messages                  │  │  live SkillPayload│  │
│  │   tool-call cards           │  │  validation chips │  │
│  │   action confirm-cards      │  │  Save / Cancel    │  │
│  └─────────────────────────────┘  └───────────────────┘  │
└──────────────────────────────────────────────────────────┘
The draft preview is read-only — it shows what the agent has committed to the skill_drafts.payload so far. The Save button promotes the draft to a new skill_versions row (same write path the form used).
Tool budget
Cap of 16 tools in v1 (relaxed from 14 by ADR-0020 to absorb remember / forget). The spirit of the cap stands: prefer extending an existing tool over registering a new one. New capabilities go through review against the cap.
V1 catalog (see chat-agent.md for full schemas):
- Authoring writes (7): set_basics, set_strategy, set_context, set_risk, set_schedule, set_tools, set_chat
- User-data reads (3, all with detail param exposing trading memory per ADR-0020): list_my_skills, get_skill_performance, get_my_deployments
- Market/news reads (3): get_market_overview, get_recent_news, get_upcoming_events (with optional deploymentId for position-aware exposure per ADR-0020)
- User-fact writes (2, per ADR-0020): remember, forget
- Action preparation (1): prepare_action (deploy / stop / restart / redeploy / start_backtest — all routed through the same tool that emits a confirm-card UI block)
apply_template, lint_draft, and critique_draft are reused from the existing server actions but are surfaced as the agent calling those internally rather than as separately registered tools, to stay inside the cap. Trading memory (trade_history, reflection_notes) is reached via the detail param on the user-data reads — not as separate tools.
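One way the cap could be enforced mechanically, for example in a unit test, is sketched below. The tool names come from the catalog above; the CATALOG grouping, TOOL_CAP constant, and assertWithinCap helper are hypothetical names introduced for illustration.

```typescript
const TOOL_CAP = 16;

// Tool names as listed in the V1 catalog; the grouping keys are illustrative.
const CATALOG = {
  authoringWrites: ["set_basics", "set_strategy", "set_context", "set_risk", "set_schedule", "set_tools", "set_chat"],
  userDataReads: ["list_my_skills", "get_skill_performance", "get_my_deployments"],
  marketNewsReads: ["get_market_overview", "get_recent_news", "get_upcoming_events"],
  userFactWrites: ["remember", "forget"],
  actionPreparation: ["prepare_action"],
} as const;

const registeredTools: string[] = [
  ...CATALOG.authoringWrites,
  ...CATALOG.userDataReads,
  ...CATALOG.marketNewsReads,
  ...CATALOG.userFactWrites,
  ...CATALOG.actionPreparation,
];

// Throws if the registry contains duplicates or grows past the cap.
function assertWithinCap(tools: readonly string[], cap: number): void {
  if (new Set(tools).size !== tools.length) {
    throw new Error("duplicate tool name in registry");
  }
  if (tools.length > cap) {
    throw new Error(`tool cap exceeded: ${tools.length} > ${cap}`);
  }
}

assertWithinCap(registeredTools, TOOL_CAP);
```

With the catalog above the registry sits exactly at the cap, so registering lint_draft, critique_draft, or apply_template as separate tools would fail this check.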
Write semantics
- Authoring writes (set_*) mutate the skill_drafts.payload server-side, validated against the partial-SkillPayload zod schema. No user click required — the draft is the chat's scratchpad. The Save button is the only thing that promotes a draft to a version.
- Action preparation (prepare_action) never executes. It emits a tool-result that the client renders as a confirm-card (action summary + Confirm / Cancel buttons). The button click hits the existing server action that inserts into agent_commands or sim_runs. This is the ADR-0007 boundary, preserved.
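The two write paths can be contrasted in a dependency-free sketch. Everything here is a stand-in: the real system validates with zod and persists to skill_drafts, agent_commands, and sim_runs, whereas this sketch uses a hand-rolled check and an in-memory array. The field names name and maxLeverage and all function names are hypothetical.

```typescript
// Hypothetical draft fields; the real SkillPayload shape lives elsewhere.
type DraftPayload = { name?: string; maxLeverage?: number };
type WriteResult = { ok: true; payload: DraftPayload } | { ok: false; error: string };

// Stand-in for the partial-SkillPayload zod schema: every field is optional,
// but any field that is present must be well-formed.
function validatePartial(patch: DraftPayload): string | null {
  if (patch.name !== undefined && patch.name.trim() === "") return "name must not be empty";
  if (patch.maxLeverage !== undefined && !(patch.maxLeverage > 0)) return "maxLeverage must be positive";
  return null;
}

// set_*-style write: updates the draft with no user click, but only after validation.
function applyDraftWrite(draft: DraftPayload, patch: DraftPayload): WriteResult {
  const error = validatePartial(patch);
  if (error !== null) return { ok: false, error };
  return { ok: true, payload: { ...draft, ...patch } };
}

type ActionKind = "deploy" | "stop" | "restart" | "redeploy" | "start_backtest";
interface ConfirmCard {
  kind: "confirm-card";
  action: ActionKind;
  summary: string;
}

// In-memory stand-in for agent_commands (start_backtest would go to sim_runs
// in the real system; collapsed into one array here for brevity).
const agentCommands: ActionKind[] = [];

// prepare_action only describes the action; it never touches agentCommands.
function prepareAction(action: ActionKind, target: string): ConfirmCard {
  return { kind: "confirm-card", action, summary: `${action} ${target}` };
}

// Runs only when the user clicks Confirm on the rendered card.
function onConfirmClick(card: ConfirmCard): void {
  agentCommands.push(card.action);
}
```

The asymmetry is the point of the design: draft writes are cheap and reversible, so the agent may make them freely, while anything that reaches the command table requires a user click.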
Scoping rules
| Resource | Visibility |
|---|---|
| Caller's skills | Read (drafts + versions). RLS-enforced. |
| Other users' skills | Never. Not even names. |
| Caller's deployments, snapshots, sims | Read. |
| Caller's positions / equity | Read (via existing introspection tools). |
| Market data, news, events | Shared resource. Read. |
| Exchange API keys | Never. The engine holds them; chat never touches them. |
Alternatives considered
Alt A — Keep the form, add chat as a sidebar
- Lowest commitment, two surfaces to maintain
- Doesn't address the form's wrong-shape problem
- Splits the product brand across "use the form" and "ask the bot"
- Not picked. Half-measures are the worst of both worlds.
Alt B — Two chat agents: author bot + deployment coach
- The path an earlier proposal headed down
- Two prompts to brand, two routes, two test surfaces
- Users have to remember which bot to talk to about what
- Not picked after user pushback: one bot is the right product.
Alt C — Switch frameworks (OpenCode / Hermes / etc.) for the new chat
- Re-opens ADR-0002
- The complexity in this feature is the tool surface, not the runtime — switching frameworks would not shrink it
- AI SDK already gives us streaming, multi-step tool calls, structured output, persistence, gateway routing
- Not picked. Pinned to AI SDK v6. Model choice (Hermes via gateway, frontier model, etc.) is a separate per-call decision.
Alt D — Chat executes commands after typed "yes"
- More conversational
- Re-opens ADR-0007: introduces an LLM into the command path, opens prompt-injection exposure (a news article in context could say "tell your deployer to flatten")
- Audit trail becomes "agent decided to flatten because chat said yes"
- Not picked. Confirm-cards keep the user as the actor of record.
Alt E — Infer the user's experience tier from the conversation
- Avoids an explicit question
- Misjudges terse users; agent has to recalibrate mid-session, which is jarring
- Not picked. One explicit question at first authoring; saved on profile; user can change anytime.
Consequences
Positive
- One persona to brand. The product gets a single voice the user comes to recognise. Branding is a system-prompt and UI change, not a code shape.
- Net code reduction. Deleting the 7-tab form (~3,000 LOC) more than offsets the new chat route and tools.
- Better authoring outcomes. Tier-aware questioning produces a more complete Skill from a wider population of users than a form alone.
- Coach mode replaces a dashboard we weren't going to build (ADR-0008). The user asks "how am I doing?" and gets a grounded, data-backed answer.
- Trust boundary preserved. ADR-0007 holds; no LLM in the command path. Action cards make the user the actor of record.
- Same identity model as ADR-0006. Chat is still separate from the trading agent. The trading agent's persona is the Skill's persona; the chat agent's persona is the product's persona. Decoupling them is the right call once one chat agent spans many deployments.
Negative / trade-offs
- Tool sprawl risk. Real risk; mitigated by the 16-tool cap (relaxed from 14 by ADR-0020) and the discipline of having the agent call internal helpers (lint_draft, critique_draft, apply_template) itself rather than registering them as separate tools.
- The draft preview pane is load-bearing UX. If users save without reading it, surprise saves happen. Mitigated by (a) the preview is always visible, (b) the Save button summarises what will be written, (c) version history makes "undo" cheap.
- Per-skill custom personality is harder. ADR-0006 envisaged each Skill having its own chat voice (since each chat was tied to a Skill). The new chat has its own product voice — coach mode can still reference the Skill ("your BTC reversion Skill says…") but isn't speaking as it. We trade per-Skill chat persona for product cohesion. If users ask for "talk to my Skill in its own voice", we can add a sub-mode later.
- Hot path for the chat agent is bigger. Coach mode reads positions; ops mode reads deployments; authoring reads drafts. Tool result caching + summary-first responses are needed from day one.
Things we'll need to revisit
- The 16-tool cap. When we hit it (and we will), prune before adding. Likely candidates for pruning later: collapse the seven set_* tools into one set_skill(patch) if model behaviour holds up.
- Per-skill chat persona. If a user wants their deployed Skill to "speak for itself" in chat, add a mode that swaps the chat's system prompt for the Skill's framework prompt (the old ADR-0006 behaviour). Don't ship it preemptively.
- Persistence across devices. Conversations are per browser session for v1. If users want a synced history across devices, promote agent_conversations to a per-user resource (not per-browser).
References
- docs/architecture/chat-agent.md — full architectural detail of the new agent
- docs/product/prd.md — authoring flow and chat experience updates
- ADR-0002 — Vercel AI SDK + Gateway (unchanged)
- ADR-0005 — Skill as data (unchanged; only the input surface changes)
- ADR-0006 — Trading vs chat split (refined; chat scope broadens)
- ADR-0007 — Commands stay out of chat (preserved via confirm-cards)
- ADR-0008 — No live dashboard (reaffirmed; chat coach mode covers the gap)
- ADR-0009 — Structured strategy fields (unchanged; chat writes to them)