Launch checklist
This doc is operational, not aspirational. Every step has a specific command, SQL query, or UI gesture. If you find yourself improvising, slow down — the platform is designed so the slow path is the safe path.
This checklist covers the path from "code is in the repo" to "first real-money tick on Hyperliquid mainnet" with a tiny test position. It has five phases (0 through 4), each with verification steps that catch failures before they cost money.
Cross-refs:
- architecture/live-runtime.md: runtime topology this checklist exercises
- security/risk-controls.md: L1–L7 safety stack
- ops/environments.md: env var inventory
- decisions/0016-hyperliquid-agent-wallet-model.md: wallet model + revocation semantics
Phase 0 — Platform prep (one-time, before any user)
Platform-side prerequisites. None of them are obvious from the code; all of them will silently bite if you skip.
0.1 — Apply all migrations to prod Supabase
# From the repo root, with supabase CLI linked to the prod project:
supabase link --project-ref <prod-ref>
supabase db push
Verify the wallet/audit tables landed:
select tablename from pg_tables
where tablename in (
'hyperliquid_master_wallets',
'hyperliquid_agents',
'pending_hyperliquid_agents',
'mainnet_orders'
)
order by tablename;
-- Expect 4 rows.
-- Confirm Vault is actually enabled:
select extname, extversion from pg_extension
where extname in ('pgsodium', 'supabase_vault');
-- Expect 2 rows.
If supabase_vault is missing, stop and contact Supabase support to enable it before going further; the agent-approval flow can't work without it. Vault availability is project-tier-dependent.
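If you run this check repeatedly, the "expect 4 rows" comparison can be scripted. A minimal TypeScript sketch, assuming you have already read the tablename column into an array; the helper name missingTables is hypothetical and not part of the repo:

```typescript
// Tables the 0.1 verification query expects to find.
const EXPECTED_TABLES = [
  "hyperliquid_master_wallets",
  "hyperliquid_agents",
  "pending_hyperliquid_agents",
  "mainnet_orders",
] as const;

// Hypothetical helper: returns the expected tables absent from the query result.
function missingTables(found: readonly string[]): string[] {
  const present = new Set<string>(found);
  return EXPECTED_TABLES.filter((name) => !present.has(name));
}
```

An empty result means all four tables landed; anything else names the migration output to look for.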
0.2 — Smoke-test the Vault RPCs end-to-end
The single most failure-prone piece. In SQL editor:
-- Should return a UUID:
select public.create_hyperliquid_agent_secret(
'test_secret_value',
'smoke_test_' || gen_random_uuid()::text
);
-- Read it back via the agent path. Returns NULL because no hyperliquid_agents
-- row references this secret — that's expected and proves the join works:
select public.get_hyperliquid_agent_secret(gen_random_uuid());
-- Clean up the orphan vault entry:
delete from vault.secrets where name like 'smoke_test_%';
If create_hyperliquid_agent_secret errors with permission denied, the SECURITY DEFINER didn't take. Check that the function owner is postgres and that the vault.create_secret overload signature matches what migration 0006 declares.
0.3 — Create the Fly app
fly apps create agentic-live-runner --org <your-org>
# App-wide secrets, injected into every machine:
fly secrets set \
SUPABASE_URL="https://<prod>.supabase.co" \
SUPABASE_SERVICE_ROLE_KEY="<service-role-key>" \
SUPABASE_DB_PASSWORD="<db-password>" \
OPENROUTER_API_KEY="<key>" \
HYPERLIQUID_LIVE_ENABLED="true" \
-a agentic-live-runner
# Build + push the image; note the SHA in the output:
fly deploy \
--config apps/live-runner/fly.toml \
--dockerfile apps/live-runner/Dockerfile \
--remote-only --build-only --push \
-a agentic-live-runner
Note on SUPABASE_DB_PASSWORD: distinct from the service-role JWT. The runner uses raw pg for the LISTEN/NOTIFY command channel, which needs the database password from Supabase dashboard → Project Settings → Database. The runner also accepts POSTGRES_PASSWORD (the name the Supabase Vercel Marketplace integration uses) as a fallback.
Build context: the Dockerfile COPYs from the repo root, so fly deploy must run from there with --config + --dockerfile pointing at apps/live-runner/. Running from inside the app dir fails with "no such file".
Verify:
fly secrets list -a agentic-live-runner # 5 secrets, all "Deployed"
fly machines list -a agentic-live-runner # empty: machines are on-demand
0.4 — Set Vercel env vars
In Vercel dashboard → project → Settings → Environment Variables, production environment:
| Variable | Value |
|---|---|
| FLY_API_TOKEN | fly tokens create deploy -a agentic-live-runner |
| FLY_LIVE_RUNNER_APP | agentic-live-runner |
| FLY_LIVE_RUNNER_IMAGE | SHA-pinned image string from 0.3 (registry.fly.io/agentic-live-runner@sha256:…) |
| FLY_LIVE_RUNNER_REGION | nrt |
| HYPERLIQUID_LIVE_ENABLED | true |
| DEPLOYMENTS_DISABLED | false |
| CRON_SECRET | openssl rand -hex 32 output |
| NEXT_PUBLIC_THIRDWEB_CLIENT_ID | From thirdweb dashboard → Project → Settings → API Keys |
| SUPABASE_URL / SUPABASE_SERVICE_ROLE_KEY / NEXT_PUBLIC_SUPABASE_* | Auto-populated if Supabase Marketplace integration is wired |
| SUPABASE_DB_PASSWORD (and/or POSTGRES_PASSWORD) | Forwarded into Fly machines via web's createDeployment; needed for the live runner's LISTEN/NOTIFY connection. Marketplace integrations populate POSTGRES_PASSWORD; both names work. |
| OPENROUTER_API_KEY | Same as Fly app secret |
After deploy, verify the crons appear under Vercel → Crons. Both /api/jobs/heartbeat-sweep and /api/jobs/pending-agents-gc should be listed with a next-run time in the future.
Caveat: Vercel Hobby plan cron limit. Hobby plans cap cron frequency at daily. The shipped apps/web/vercel.json runs heartbeat-sweep daily at 03:00 UTC as a workaround. This means a dead Fly machine can sit unnoticed for up to 24 hours: acceptable for the first paper deployment, not acceptable before the first mainnet deployment. Before going live with real money, do one of:
- Upgrade Vercel to Pro and change the schedule back to */5 * * * *, or
- Move both jobs into a tight loop inside a dedicated always-on Fly machine. (Note: apps/sim-worker is no longer a viable host for this. It scales to zero between backtests per ADR-0023, so its machine isn't running most of the time.)
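To see why the daily schedule matters, it helps to put numbers on it. The sketch below estimates worst-case detection time as the sweep interval plus the staleness threshold that step 3.2 gives, max(3 * tick_interval, 15 min). Adding the two is an assumption about how the sweeper behaves (a machine can die just after a sweep and must also age past the threshold), and the function names are hypothetical:

```typescript
// Staleness threshold as stated in step 3.2: max(3 * tick_interval, 15 min).
function stalenessThresholdMin(tickIntervalMin: number): number {
  return Math.max(3 * tickIntervalMin, 15);
}

// Assumption: a machine that dies right after a sweep is only flagged by the first
// sweep that runs after it has been stale for the full threshold.
function worstCaseDetectionMin(sweepIntervalMin: number, tickIntervalMin: number): number {
  return sweepIntervalMin + stalenessThresholdMin(tickIntervalMin);
}

const daily = worstCaseDetectionMin(24 * 60, 5); // Hobby plan, daily sweep
const fiveMin = worstCaseDetectionMin(5, 5); // Pro plan, */5 * * * *
console.log({ daily, fiveMin });
```

With the default 5-minute tick, that works out to a little over 24 hours on the daily schedule versus about 20 minutes on the five-minute schedule.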
0.5 — Hit the cron endpoints manually once
curl -H "Authorization: Bearer $CRON_SECRET" \
https://<your-domain>/api/jobs/heartbeat-sweep
# Expect: {"checked":0,"marked":0}
curl -H "Authorization: Bearer $CRON_SECRET" \
https://<your-domain>/api/jobs/pending-agents-gc
# Expect: {"deleted":0}
401 → secret doesn't match what Vercel cron will send. 500 → check Vercel function logs.
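The 401 case is easier to debug with a mental model of the check. A minimal sketch of how a route could validate the header, assuming a plain Bearer comparison against CRON_SECRET; the helper name is hypothetical and the real handlers in the repo may differ:

```typescript
import { timingSafeEqual } from "node:crypto";

// Hypothetical helper: true when the Authorization header carries the expected secret.
function isAuthorizedCron(authHeader: string | null, cronSecret: string): boolean {
  // An unset secret must never authorize anything.
  if (!authHeader || cronSecret.length === 0) return false;
  const expected = Buffer.from(`Bearer ${cronSecret}`);
  const received = Buffer.from(authHeader);
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}
```

Under this model a 401 from the curl above means the shell's $CRON_SECRET differs from the value stored in Vercel, including stray whitespace or a stale value from before a rotation.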
Phase 1 — First paper deployment (smoke-test the platform itself)
Don't go straight to mainnet. The paper path exercises ~90% of the same code without exchange risk.
1.1 — Sign up and author a skill
- Sign in via magic link.
- Skills → New → fill in the strategy thesis fields.
- Save. Confirm skills + skill_versions rows exist:
select s.name, s.slug, sv.version, sv.payload->'model' as model
from public.skills s
join public.skill_versions sv on sv.skill_id = s.id
order by s.created_at desc limit 5;
1.2 — Backtest first
Run a backtest from the skill page (also smoke-tests apps/sim-worker on Fly if you've deployed it). Confirm the run completes and the metrics make sense. If the backtest is degenerate (e.g., agent makes zero proposals across 24h), fix the skill before going further — it'll be degenerate in live too.
1.3 — Deploy paper
- Skill page → Deploy → pick Paper → check the confirmation box → click Deploy.
- The page redirects to /deployments/<id>. Status should be provisioning → running within ~30 seconds.
While provisioning, watch the Fly side:
fly machines list -a agentic-live-runner
# Note the new machine's ID.
fly logs -a agentic-live-runner -i <machine-id>
# Look for, in order:
# "live-runner: booting deployment <uuid>"
# "live-runner: loaded skill ... (broker=paper)"
# "live-runner: status=running; entering tick loop"
If the machine fails to boot, the deployment row flips to status='error' with error_text. Common causes:
- DEPLOYMENT_ID env not injected: check fly machines exec <id> -- env
- Supabase service role key wrong: auth fails in loadDeploymentAndSkill
- Migration not applied: hyperliquid_agent_id column missing → check constraint blows up the insert
1.4 — Watch the first three ticks
Refresh the deployment detail page after the first interval (5 min by default). Expect:
- agent_state.equity_usd = $10,000 (paper starting balance)
- Decisions timeline has rows with verdict badges
- Logs view has runner_started + reconciled + tick entries
- Token cost appearing in the decision rows' tok column
-- Verify per-tick cost is being computed:
select tick_at, prompt_tokens, completion_tokens, cost_usd
from public.decision_snapshots
where deployment_id = '<uuid>'
order by tick_at desc limit 5;
If cost_usd is null or 0 despite non-zero token counts, the model isn't in MODEL_RATES. Check that apps/web/components/skill-editor/model-catalog.ts matches packages/shared/src/model-rates.ts.
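The failure mode is easier to recognize once you see how a rate lookup can produce a null cost. A minimal sketch, assuming rates are stored as USD per million tokens; the type, the rate table, and the numbers are placeholders for illustration and do not reflect the real contents of MODEL_RATES:

```typescript
// USD per one million tokens. Shape and values are hypothetical placeholders.
type ModelRate = { promptPerMTok: number; completionPerMTok: number };

const EXAMPLE_RATES: Record<string, ModelRate> = {
  "example/model-a": { promptPerMTok: 1, completionPerMTok: 5 },
};

// Returns null when the model has no rate entry, which would surface as a null cost_usd.
function tickCostUsd(
  model: string,
  promptTokens: number,
  completionTokens: number,
  rates: Record<string, ModelRate> = EXAMPLE_RATES,
): number | null {
  const rate = rates[model];
  if (!rate) return null;
  const micros = promptTokens * rate.promptPerMTok + completionTokens * rate.completionPerMTok;
  return micros / 1_000_000;
}
```

In this sketch a model ID that is selectable in the editor but absent from the rate table silently yields null, which is the mismatch the two files above need to avoid.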
1.5 — Exercise every command
From the deployment detail page, click in order: Pause → Resume → Snapshot → Flatten → Stop. After each, verify:
- New row in agent_commands with acked_at populated within ~1 second
- Status pill updates after page revalidates
- Fly machine actually exits after Stop (fly machines list no longer shows it after ~30s)
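The ack check can be automated once the agent_commands rows are exported. A minimal sketch, assuming each row carries an enqueue timestamp; the column name created_at is a guess, and the helper name is hypothetical:

```typescript
// Row shape is an assumption: created_at is a guessed name for the enqueue-time column.
type CommandRow = { kind: string; created_at: string; acked_at: string | null };

// Returns commands that are still unacked or took longer than maxAckMs to be acked.
function slowOrUnacked(rows: CommandRow[], maxAckMs = 1000): CommandRow[] {
  return rows.filter((row) => {
    if (row.acked_at === null) return true;
    return Date.parse(row.acked_at) - Date.parse(row.created_at) > maxAckMs;
  });
}
```

An empty result after the five clicks means every command was picked up within the budget.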
1.6 — Common boot failures + fixes
Captured live during the first Phase-1 smoke test. Each row is a real failure mode that surfaced; the Symptom cell is the diagnostic signature you'll see in Vercel logs (server) or agent_logs (runner).
| Symptom | Where it shows | Root cause | Fix |
|---|---|---|---|
| Error: Attempted to call getDefaultConfig() from the server but getDefaultConfig is on the client | Vercel λ POST /skills/[id]/deploy 500 | RainbowKit's getDefaultConfig pulled into a server bundle because the file holding wagmiConfig had no 'use client' directive | Split: keep wagmiConfig in a 'use client' module (lib/web3/config.ts); move universal constants to a separate file (lib/hyperliquid/constants.ts) that server actions can import safely |
| Error: A "use server" file can only export async functions, found object | Vercel server-component render 500 | Const array exported from a 'use server' module (e.g. COMMAND_KINDS) | Extract the const + its derived types to a non-server module (lib/deployment-commands.ts); re-import in both the server action and any client component |
| live-runner: set SUPABASE_DB_URL, or SUPABASE_URL + (SUPABASE_DB_PASSWORD \| POSTGRES_PASSWORD)… | agent_logs.event=runner_fatal, deployment status=error | Fly secret missing: runner has no way to open the pg LISTEN connection | fly secrets set SUPABASE_DB_PASSWORD=… -a agentic-live-runner then fly machines restart <id> |
| password authentication failed for user "postgres" | agent_logs.event=runner_fatal | Wrong project's password set on Fly. Usually because the local .env.local still references a deleted Supabase project from an earlier Marketplace integration | Reset the DB password in Supabase dashboard, re-set everywhere (Fly, Vercel, local), rewrite .env.local to point at the live project ref |
| paper broker: no mark price set for <SYMBOL>; call setMarkPrice first | agent_logs.event=tick_failed | Symbol-discovery skill picked a symbol that wasn't in skill.context.symbols, so refreshMarkPrices never seeded a mark for it | Already fixed in code (Phase-1 commits 79b391a + c5af3d7): the runner wraps the bars client to push marks on fetch AND fetches Hyperliquid allMids at tick start. If this comes back, check that the Fly machine is on an image including those commits. |
| Hyperliquid allMids <status> <statusText> | agent_logs.event=all_mids_fetch_failed | Transient HL API blip or network egress issue from Fly machine | Runner auto-falls back to bar-close marks for the skill.context.symbols list. Persistent: check HL status page; check Fly region's outbound connectivity. |
1.7 — Secret rotation playbook
When you reset the Supabase database password (or rotate any Fly secret), three places need synchronized updates (Fly, Vercel, and your local .env.local) plus a deliberate machine restart. Skipping any step puts the runner in an "auth failed" loop the moment a tick fires.
# 1. Stage the new value on Fly (does NOT touch running machines):
fly secrets set SUPABASE_DB_PASSWORD="<new-value>" -a agentic-live-runner --stage
# 2. Mirror to Vercel (the web app forwards into createDeployment env):
cd apps/web
echo "<new-value>" | vercel env add SUPABASE_DB_PASSWORD production --force
# 3. Restart running Fly machines so they pick up the staged value.
# `fly machines update --image` does NOT reliably re-pull staged
# secrets — use `fly machines restart` (or destroy + redeploy):
for m in $(fly machines list -a agentic-live-runner --json | jq -r '.[].id'); do
fly machines restart "$m" -a agentic-live-runner
done
# 4. Update local .env.local so `pnpm dev` works against the same DB:
# POSTGRES_PASSWORD + the three POSTGRES_URL variants all need the
# new value embedded.
Important: fly secrets deploy -a agentic-live-runner does not work here; that command targets apps deployed via fly deploy. Our app uses on-demand machine provisioning via the Machines API, so the staged-secret-to-machine handoff happens at machine create / restart time only.
Phase 2 — First mainnet (the tiny-balance smoke test)
Only proceed if Phase 1 was clean.
2.1 — Pre-flight: HL account preparation
In a fresh wallet (NOT your main wallet for the first test):
- Fund it with $100 of USDC on Arbitrum.
- Bridge to Hyperliquid via their UI.
- Confirm $100 lands in your HL perp account.
2.2 — Pair the master wallet
- Wallet → Connect → pick the test wallet → sign the pairing message.
- Confirm:
select address, verified_at, pairing_sig_hash is not null as has_audit_sig
from public.hyperliquid_master_wallets
order by created_at desc limit 1;
2.3 — Create a "tiny test" skill
Author a new skill with deliberately conservative caps:
maxPositionPct: 2 (notional; first position is ~$2 on a $100 wallet at 1x)
maxTotalExposurePct: 2 (single tiny position only)
maxOrdersPerDay: 4 (runaway loop can only burn a few orders of fees)
maxLeverage: 1
minOrderUsd: 10 (HL's venue minimum)
allowedSymbols: ['BTC']
maxMonthlyCostUsd: 10
dailyLossHaltPct: 5
Run a paper backtest of this skill (10-min window is fine) to confirm it actually proposes something.
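Before deploying, it is worth checking that the caps are mutually satisfiable. The sketch below assumes maxPositionPct is a percentage of account equity and that orders below minOrderUsd are rejected; both are assumptions about how the platform interprets these fields, and the helper names are hypothetical:

```typescript
// Field names mirror the caps listed above; the check itself is an illustration.
type TinyCaps = { maxPositionPct: number; minOrderUsd: number };

// Largest single position the caps allow, in USD, at the given account equity.
function maxPositionUsd(equityUsd: number, caps: TinyCaps): number {
  return (equityUsd * caps.maxPositionPct) / 100;
}

// True when at least one order of minOrderUsd fits under the position cap.
function capsAllowAnOrder(equityUsd: number, caps: TinyCaps): boolean {
  return maxPositionUsd(equityUsd, caps) >= caps.minOrderUsd;
}

console.log(maxPositionUsd(100, { maxPositionPct: 2, minOrderUsd: 10 })); // 2
console.log(capsAllowAnOrder(100, { maxPositionPct: 2, minOrderUsd: 10 })); // false
console.log(capsAllowAnOrder(100, { maxPositionPct: 10, minOrderUsd: 10 })); // true
```

Under those assumptions, a 2% cap on a $100 wallet permits only $2 of notional, which is below the $10 minimum, so no order could be placed. If the backtest shows zero proposals or only rejected orders, check this interaction first: either fund the test wallet more or raise maxPositionPct until the cap clears the minimum.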
2.4 — Deploy mainnet
- Skill page → Deploy → Hyperliquid mainnet.
- Verify the cross-margin warning is NOT showing (this is the first mainnet deployment on this wallet).
- Click Authorize agent wallet. Your wallet pops up with a typed-data signature request. Read what you're signing: it should be HyperliquidTransaction:ApproveAgent with the platform-generated agent address visible.
- Sign.
If submission succeeds:
select agent_address, approved_at, revoked_at, approval_tx_hash
from public.hyperliquid_agents
order by created_at desc limit 1;
-- Verify Vault has the encrypted secret:
select count(*) from public.hyperliquid_agents a
join vault.decrypted_secrets ds on ds.id = a.vault_key_id;
- Check the confirmation box → Deploy live. Status → provisioning → running in ~30s.
2.5 — Watch the first mainnet tick like a hawk
fly logs -a agentic-live-runner -i <machine-id> \
| grep -E "broker_started|reconciled|order_placed|order_filled|tick_failed"
Expect in this order:
- broker_started
- reconciled, with the actual HL account state (positions=0, equity≈$100)
- First tick proceeds, likely noop or executed with a tiny BTC position
If you see tick_failed with NotYetImplemented, then the Slice-B build never made it to Fly — the broker is still the stub. Re-push the image.
After ~3 minutes (give WS marks time to populate), check Hyperliquid's UI — your account should now show the position the agent opened, OR still be flat if the agent chose noop. Either is fine — you've proved the path works.
2.6 — Verify the audit trail
-- One row per order the agent placed:
select cloid, symbol, side, status, filled_size_base, avg_fill_price,
fee_usd, placed_at, settled_at
from public.mainnet_orders
where deployment_id = '<uuid>'
order by placed_at desc;
-- Cross-reference with agent_logs:
select ts, level, event, message
from public.agent_logs
where deployment_id = '<uuid>'
and event in ('order_placed', 'order_filled', 'order_cancelled')
order by ts desc limit 20;
The cloid in mainnet_orders.cloid must match the cloid in the matching order_placed log's data field. If they don't, the order-event routing in apps/live-runner/src/broker.ts is broken.
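The cross-reference can be done mechanically rather than by eye. A minimal sketch, assuming the log query is extended to also select the data column and that the cloid sits at data.cloid; the row shapes and helper name are hypothetical:

```typescript
// Row shapes are assumptions based on the two queries above.
type OrderRow = { cloid: string };
type LogRow = { event: string; data: { cloid?: string } | null };

// Returns order cloids that have no matching order_placed log entry.
function ordersWithoutPlacedLog(orders: OrderRow[], logs: LogRow[]): string[] {
  const logged = new Set<string>();
  for (const log of logs) {
    if (log.event === "order_placed" && log.data?.cloid) logged.add(log.data.cloid);
  }
  return orders.map((order) => order.cloid).filter((cloid) => !logged.has(cloid));
}
```

Any cloid this returns is an order that reached mainnet_orders without a corresponding order_placed log, which is the routing break described above.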
Phase 3 — Operational dry-runs (prove the safety nets fire)
Each of these is a deliberate failure injection. Do them on the tiny-balance deployment from Phase 2, in a single sitting, before that deployment is upgraded to real size.
3.1 — Cost-guard trip
Easiest test: temporarily lower the skill's effective cap by editing the latest skill_versions.payload directly (throwaway — revert after):
-- Read current MTD:
select sum(cost_usd) as mtd_usd
from public.decision_snapshots
where deployment_id = '<uuid>'
and tick_at >= date_trunc('month', now() at time zone 'utc');
-- Lower the cap below MTD (revert immediately after the test):
update public.skill_versions
set payload = jsonb_set(payload, '{risk,maxMonthlyCostUsd}', '0.01'::jsonb)
where skill_id = '<skill-uuid>' and version = <ver>;
Within one tick of the next decision, expect:
- agent_commands row with kind='pause', payload->>'reason' starting with cost_guard:
- agent_logs row with event='cost_guard_tripped'
- Deployment status flips to paused
Revert the cap before continuing.
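For reference, the decision this test exercises can be sketched in a few lines. This is an illustration, not the runner's code: it assumes the guard trips when month-to-date spend reaches the cap, and everything after the cost_guard: prefix in the reason string is made up:

```typescript
type CostGuardResult = { trip: false } | { trip: true; reason: string };

// Sums month-to-date cost (null rows count as 0) and compares it with the cap.
// Assumption: the guard trips at mtd >= cap; the real comparison may be strict.
function checkCostGuard(costsUsd: Array<number | null>, maxMonthlyCostUsd: number): CostGuardResult {
  const mtdUsd = costsUsd.reduce<number>((sum, cost) => sum + (cost ?? 0), 0);
  if (mtdUsd < maxMonthlyCostUsd) return { trip: false };
  return {
    trip: true,
    reason: `cost_guard: mtd $${mtdUsd.toFixed(2)} >= cap $${maxMonthlyCostUsd.toFixed(2)}`,
  };
}
```

With the cap lowered to 0.01, any deployment that has already spent a cent this month trips on its next tick, which is why the test works without waiting for real spend to accumulate.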
3.2 — Heartbeat sweeper
Cleanest test: stop the Fly machine and wait out the threshold.
fly machines stop <machine-id> -a agentic-live-runner
Wait at least max(3 * tick_interval, 15 min) past the last agent_state.updated_at, then:
curl -H "Authorization: Bearer $CRON_SECRET" \
https://<your-domain>/api/jobs/heartbeat-sweep
# Expect: {"checked":N, "marked":1}
Verify:
select status, error_text from public.deployments where id = '<uuid>';
-- Expect: status='error', error_text='heartbeat lost: ...'
3.3 — Revoke flow
Create a separate small mainnet deployment for this test (don't revoke the one you're still verifying). Then:
- Click Revoke on its detail page.
- Confirm hyperliquid_agents.revoked_at is set.
- Confirm a stop command was queued for the active deployment.
- Deployment status flips to stopped within ~1 min.
- Now actually go to Hyperliquid's UI and remove the agent from your account. This is the step the platform can't do (see ADR-0016).
- Verify HL no longer shows the agent under approved agents.
Phase 4 — First 24 hours of real operation
After Phase 3, upgrade the test deployment to a real position size (raise maxPositionPct on a new skill version → redeploy). During the first 24h, watch these specific things.
Things that should happen normally
- New decision_snapshots row every tick interval (5 min by default)
- New agent_logs rows with event='reconciled' every 5 min (broker drift guard)
- cost_usd per row stays consistent: ~$0.01–0.10 for Haiku, ~$0.05–0.30 for Sonnet
- Cron job runs visible in Vercel dashboard, no failures
Things that should NOT happen (alarm bells)
- tick_failed events more than ~1/hour. Occasional failures from WS reconnects and transient HL API issues are fine; a higher rate points to a real problem.
- Any state_drift warning (REST reconcile disagreeing with WS): investigate root cause
- liquidation event: sweeper should have caught the drawdown approach, but liquidations are still possible in a flash crash
- Cost-guard tripping unexpectedly: model is making longer calls than expected
- Heartbeat sweeper marking any deployment dead
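The "~1/hour" threshold for tick_failed is easy to check from exported log timestamps. A minimal sketch, assuming you pass in the ts values of agent_logs rows filtered to event = 'tick_failed'; it buckets by UTC calendar hour rather than a sliding window, and the helper name is hypothetical:

```typescript
// Counts failures per UTC hour and returns the hours that exceed maxPerHour.
function noisyHours(failureTimestamps: string[], maxPerHour = 1): string[] {
  const counts = new Map<string, number>();
  for (const ts of failureTimestamps) {
    const hour = new Date(ts).toISOString().slice(0, 13); // "YYYY-MM-DDTHH"
    counts.set(hour, (counts.get(hour) ?? 0) + 1);
  }
  return Array.from(counts)
    .filter(([, count]) => count > maxPerHour)
    .map(([hour]) => hour)
    .sort();
}
```

Calendar-hour buckets can miss a burst that straddles the top of the hour, so treat an empty result as reassuring rather than conclusive.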
Queries to bookmark
-- Active deployments + freshness:
select d.id, d.broker_kind, d.status,
s.updated_at as last_heartbeat,
extract(epoch from (now() - s.updated_at))::int as seconds_stale
from public.deployments d
left join public.agent_state s on s.deployment_id = d.id
where d.status in ('running', 'paused', 'halted')
order by seconds_stale desc nulls first;
-- Today's spend by deployment:
select d.id, sum(ds.cost_usd) as today_usd, count(*) as ticks
from public.deployments d
join public.decision_snapshots ds on ds.deployment_id = d.id
where ds.tick_at >= date_trunc('day', now() at time zone 'utc')
group by d.id
order by today_usd desc;
-- Recent mainnet orders + status:
select cloid, symbol, side, status, size_base, filled_size_base,
avg_fill_price, fee_usd, placed_at, settled_at
from public.mainnet_orders
where placed_at > now() - interval '24 hours'
order by placed_at desc limit 50;
Incident response cheat sheet
| Symptom | First action | Root cause path |
|---|---|---|
| Skill is firing rapid orders | Click Pause on deployment page | Check rate cap; if engine should have caught it, it's a bug |
| Position is wrong size vs intended | Click Pause, do NOT flatten yet | Compare decision_snapshots.proposed_action.sizeUsd vs broker's actual order in mainnet_orders.size_base |
| Big loss showing on HL | Click Flatten | Then investigate why daily-loss-halt didn't fire — likely dailyLossHaltPct was set high |
| WS keeps disconnecting | Watch fly logs for webdata2_apply_failed | HL API issue or network — usually self-recovers; flip HYPERLIQUID_LIVE_ENABLED=false on Fly app if it persists |
| Cron secret leaked | Rotate CRON_SECRET in Vercel | Old jobs will start failing within minutes |
| Suspect agent key compromise | Hit Revoke immediately, then go to HL UI | Both steps required for full revocation (ADR-0016) |
| Need to stop everything platform-wide | fly secrets set HYPERLIQUID_LIVE_ENABLED=false -a agentic-live-runner AND vercel env add DEPLOYMENTS_DISABLED production --force (enter true when prompted), then redeploy the web app so the new value takes effect | Mainnet broker construction refuses; new deployments refused |
What you can't test without paying
Three things you only learn from real load:
- Hyperliquid order fill latency under stress. The 2-second userFills wait might be tight on slow markets. Plan to revise after a week of real data.
- WS reconnect behavior across long-running sessions. The @nktkas/hyperliquid transport handles backoff but extended outages aren't well-documented; observe before relying on it past a few hours of disconnect.
- Cost-guard query performance as decision_snapshots grows. Currently the runner scans all month-to-date rows for the deployment on every tick. Fine at small scale; at 8k ticks/month/deployment with 10+ active deployments, replace with a materialized view or per-month rollup table.
None of these block the first real trade. All of them belong on a Phase 5 follow-up list once volume justifies the work.
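As a starting point for the cost-guard item, the rollup idea can be sketched in memory. A 5-minute tick produces 12 × 24 × 30 = 8,640 rows per deployment in a 30-day month, so keeping a running total per deployment and month avoids rescanning them on every tick. The class below is a hypothetical stand-in for a per-month rollup table, not a design the repo has adopted:

```typescript
// In-memory stand-in for a rollup keyed by deployment and UTC month.
class MonthlyCostRollup {
  private totals = new Map<string, number>();

  private key(deploymentId: string, at: Date): string {
    return `${deploymentId}:${at.toISOString().slice(0, 7)}`; // "id:YYYY-MM"
  }

  // Called once per tick: constant work instead of a month-to-date scan.
  add(deploymentId: string, tickAt: Date, costUsd: number): void {
    const k = this.key(deploymentId, tickAt);
    this.totals.set(k, (this.totals.get(k) ?? 0) + costUsd);
  }

  // Month-to-date spend for the month that contains `now`.
  monthToDate(deploymentId: string, now: Date): number {
    return this.totals.get(this.key(deploymentId, now)) ?? 0;
  }
}
```

A database version would do the same increment in an upsert alongside each decision_snapshots insert, so the guard reads one row per tick.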