ADR-0011: News integration via CryptoPanic + tool-based delivery
- Status: accepted
- Date: 2026-06-01
- Related: ADR-0006 (agent boundaries), ADR-0009 (strategy fields), ADR-0010 (model routing)
Context
The skill schema already declares a NewsClient interface and a fetch_news_sentiment tool, so the wiring is in place. However, the only client implementation today is fixtureNewsClient ({ search: async () => [] }), which means the sim runs without news. We want skills to be able to reason about real headlines so traders can write rules like:
"Before opening a long position, check news in the last hour — if any headline is bearish, skip the entry."
Two product-shaping choices needed an explicit decision:
- Where does news come from? Backtests replay past bars, so the source must be queryable by timestamp range and must not leak future articles into past ticks.
- How does the model consume it? A tool the model can call on demand, or context that's always prepended to the prompt?
Decision
News source: CryptoPanic
We use CryptoPanic's developer API as the source of truth for both historical (within the tier's window) and live news.
What we get from any paid tier:
- Per-currency filter: ?currencies=BTC,ETH matches our internal symbol convention
- Source-side sentiment signals: votes (counts) and panic_score (importance); higher tiers add a sentiment object with normalized fields. We map all of these onto a single sentiment ∈ [-1, +1] for the NewsItem shape
- Stable, idempotent ids for dedup on (source, external_id)
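The mapping onto a single sentiment value could look roughly like the sketch below. The vote field names (positive, negative), the function name, and the smoothing constant are assumptions for illustration; the actual mapping in the client may differ.

```typescript
// Hypothetical subset of the per-article vote counts.
interface Votes {
  positive: number;
  negative: number;
}

// Map vote counts onto a sentiment in [-1, +1].
// `smoothing` is an assumed constant that pulls thinly-voted items toward 0.
function votesToSentiment(votes: Votes, smoothing = 2): number {
  const pos = Math.max(0, votes.positive);
  const neg = Math.max(0, votes.negative);
  const denom = pos + neg + smoothing;
  if (denom === 0) return 0; // no votes and no smoothing: treat as neutral
  const raw = (pos - neg) / denom;
  // Clamp defensively so the [-1, +1] contract holds for any input.
  return Math.max(-1, Math.min(1, raw));
}
```

With the default smoothing, an article with no votes scores 0, and an article with eight positive votes and no negative votes scores 0.8 rather than a full +1.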
Tier-dependent behavior:
- growth_weekly tier (current): accepts but silently ignores published_at_from / published_at_to. News coverage is the rolling "current" window only (~last few days). Backtests over older bars run with empty news.
- developer / Pro tiers: full historical date-range queries. The same client code switches over by changing CRYPTOPANIC_BASE_URL; no migration needed.
The base URL is configurable via CRYPTOPANIC_BASE_URL env, defaulting to the growth-weekly endpoint. Upgrade is a single env change.
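A minimal sketch of how a client might resolve the base URL and build a request follows. The default URL is a placeholder (the real growth-weekly endpoint is not reproduced here), and the function name, the query shape, and the ISO 8601 date encoding are assumptions.

```typescript
// Placeholder only; stands in for the real growth-weekly endpoint.
const DEFAULT_BASE_URL = "https://example.invalid/growth-weekly/posts/";

// Hypothetical query shape for illustration.
interface NewsQuery {
  currencies: string[]; // e.g. ["BTC", "ETH"]
  from?: Date;          // sent as published_at_from
  to?: Date;            // sent as published_at_to
}

function buildNewsUrl(
  env: Record<string, string | undefined>,
  query: NewsQuery,
): string {
  // Fall back to the default when the variable is unset or empty.
  const base = env.CRYPTOPANIC_BASE_URL || DEFAULT_BASE_URL;
  const params: string[] = [
    "currencies=" + encodeURIComponent(query.currencies.join(",")),
  ];
  // Date format is an assumption; lower tiers ignore these parameters anyway.
  if (query.from) {
    params.push("published_at_from=" + encodeURIComponent(query.from.toISOString()));
  }
  if (query.to) {
    params.push("published_at_to=" + encodeURIComponent(query.to.toISOString()));
  }
  const separator = base.indexOf("?") >= 0 ? "&" : "?";
  return base + separator + params.join("&");
}
```

Because the tier difference lives entirely in the base URL, the same call works on either tier; only the range of returned items changes.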
Delivery: tool the model calls (fetch_news_sentiment)
Already defined in packages/tools/src/tools/fetch-news-sentiment.ts. The model decides per tick whether to spend a tool call. The skill author writes rules in plain English in the entry / exit / riskManagement fields ("call fetch_news_sentiment with windowHours=1 before opening a long…").
Backtest leakage guard
NewsClient.search() already accepts asOf: Date and the runtime threads tick.tickAt into it per tick. The new DB-backed client enforces this with a hard SQL constraint:
WHERE ts <= $asOf AND ts > $asOf - $windowHours
There is no code path in the tool that can resolve to now() when asOf is provided. Live mode passes asOf = new Date() at construction time.
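The same window predicate can be expressed in memory, which is handy for unit-testing the invariant without a database. This is a sketch only; NewsRow and visibleAt are hypothetical names, not part of the existing packages.

```typescript
// Hypothetical minimal row shape for illustration.
interface NewsRow {
  ts: Date;
  title: string;
}

// In-memory equivalent of: ts <= asOf AND ts > asOf - windowHours.
function visibleAt(rows: NewsRow[], asOf: Date, windowHours: number): NewsRow[] {
  const upper = asOf.getTime();
  const lower = upper - windowHours * 60 * 60 * 1000;
  return rows.filter((row) => {
    const t = row.ts.getTime();
    // Inclusive upper bound, exclusive lower bound, matching the SQL.
    return t <= upper && t > lower;
  });
}
```

An item stamped one minute after asOf is never returned, no matter how wide the window is, which is the property the backtest relies on.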
Storage
A new news_items table:
create table news_items (
  id uuid primary key,
  ts timestamptz not null,
  source text not null,
  external_id text not null,
  symbols text[] not null,
  title text not null,
  url text,
  sentiment numeric,
  metadata jsonb,
  ingested_at timestamptz default now(),
  unique (source, external_id)
);
Indexes: (ts desc) for time-window queries; GIN on symbols for symbols && ARRAY['BTC'] lookups.
RLS mirrors bars: service-role writes only; authenticated reads (news is market data, no PII).
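The unique (source, external_id) constraint means re-ingesting an overlapping range cannot create duplicates. A sketch of the same rule applied in memory, before a batch is written, is shown below. The IncomingItem shape and the first-seen-wins policy are assumptions; the ADR does not specify how conflicts are resolved.

```typescript
// Hypothetical shape of an item about to be written.
interface IncomingItem {
  source: string;
  externalId: string;
  title: string;
}

// Keep the first item seen for each (source, externalId) pair.
function dedupeItems(items: IncomingItem[]): IncomingItem[] {
  const seen = new Map<string, IncomingItem>();
  for (const item of items) {
    // JSON-encode the pair so "a" + "bc" and "ab" + "c" cannot collide.
    const key = JSON.stringify([item.source, item.externalId]);
    if (!seen.has(key)) {
      seen.set(key, item);
    }
  }
  return Array.from(seen.values());
}
```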
Auto-ingest
packages/simulator/src/persist.ts already auto-ingests bars when a queued sim's range is short on data. We add a best-effort maybeIngestNews step that pulls a 72h pre-roll plus the full sim range from CryptoPanic before the backtest starts. If the env key is unset or the API fails, the sim continues — the tool just returns []. Same status-pumping pattern as bars (computing_metrics → running) so the UI shows progress.
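The best-effort behavior can be sketched as follows. The signature, the injected ingest callback, and the numeric return value are assumptions made to keep the example self-contained; only the 72h pre-roll and the "never fail the sim" rule come from the text above.

```typescript
const PRE_ROLL_HOURS = 72;

// Range to pull: 72h before the sim start through the sim end.
function newsIngestRange(simStart: Date, simEnd: Date): { from: Date; to: Date } {
  const preRollMs = PRE_ROLL_HOURS * 60 * 60 * 1000;
  return { from: new Date(simStart.getTime() - preRollMs), to: simEnd };
}

// Returns the number of items ingested, or 0 if ingest was skipped or failed.
async function maybeIngestNews(
  simStart: Date,
  simEnd: Date,
  apiKey: string | undefined,
  ingest: (from: Date, to: Date) => Promise<number>, // hypothetical ingester
): Promise<number> {
  if (!apiKey) return 0; // key unset: the tool will simply return []
  const { from, to } = newsIngestRange(simStart, simEnd);
  try {
    return await ingest(from, to);
  } catch (err) {
    // Best-effort: an API failure must not abort the backtest.
    return 0;
  }
}
```

The pre-roll ensures the first ticks of the sim can already see headlines published shortly before the range begins.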
Alternatives considered
Alt A — GDELT 2.0
- Free, 15-year history, every article worldwide every 15 min
- Not picked: not crypto-specific; "BTC" → ticker-to-keyword mapping is noisy ("Bitcoin" matches every article that mentions it offhand). Sentiment scoring is news-domain general (V2GCAM tone), not crypto-domain. Filtering noise would dominate the integration cost.
- Worth revisiting if we want macro / non-crypto news layered alongside (rate decisions, regulatory moves the crypto press hasn't picked up yet).
Alt B — Build our archive forward from now via free tier
- $0/mo and the API still works for live ingestion
- Not picked: free tier has no historical access. Either we cron-ingest forward for 30-60 days before backtests are meaningful, or we accept that early users can't backtest news-aware strategies. Both kill the product use case at launch.
- The right move long-term: cron incremental ingest in addition to on-demand Pro pulls, so popular ranges are pre-warmed and we eventually own our archive.
Alt C — Always-prepend last N headlines to the prompt
- Deterministic; no per-tick decision logic in the skill
- Not picked: burns 500–2000 prompt tokens per tick whether news is relevant or not. At ~$1/M input for Haiku that's a 4–15× cost multiplier for sims that are otherwise sparse. Tool-calling lets the model spend that budget only on the ticks where the strategy actually wants news.
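The multiplier depends on how large the prompt is without news, which this ADR does not state. As a rough check, the sketch below assumes a baseline of 150 prompt tokens per tick (an assumption chosen for illustration), and uses the $1/M price and the 500–2000 token range from the text.

```typescript
// From the text: roughly $1 per million input tokens.
const USD_PER_INPUT_TOKEN = 1 / 1e6;

// Assumption for illustration only; the ADR does not give a baseline.
const ASSUMED_BASELINE_TOKENS = 150;

// Extra dollars spent per tick on prepended headlines.
function addedCostPerTick(newsTokens: number): number {
  return newsTokens * USD_PER_INPUT_TOKEN;
}

// How many times larger the prompt becomes once headlines are prepended.
function promptMultiplier(baselineTokens: number, newsTokens: number): number {
  return (baselineTokens + newsTokens) / baselineTokens;
}

const lowMultiplier = promptMultiplier(ASSUMED_BASELINE_TOKENS, 500);   // about 4.3
const highMultiplier = promptMultiplier(ASSUMED_BASELINE_TOKENS, 2000); // about 14.3
```

Under that assumed baseline the prompt grows by roughly 4.3× to 14.3×, in line with the 4–15× range quoted above; a larger baseline prompt would shrink the multiplier accordingly.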
Alt D — Both (tool + compact summary always-on)
- One-line summary in prompt ("BTC: 3 headlines last 1h, sentiment +0.2") + full tool for drill-down
- Not picked yet: more code paths to maintain, and the "compact summary" abstraction is doing the model's job for it. If we observe in production that skills mostly want a quick gut check, revisit.
Consequences
Positive
- Skills become news-aware with zero new tool code: the existing fetch_news_sentiment lights up.
- Backtests stay leakage-safe by construction; the asOf invariant lives in SQL, not in tool internals where a refactor could weaken it.
- Same data path serves sim and live. The DB-backed NewsClient doesn't care; live runners pass asOf = new Date(), sims pass the tick timestamp.
- Tier-portable: lower tier today (rolling-window news), upgrade is a single env-var change to unlock historical.
- CryptoPanic's vote tags mean we don't have to run our own sentiment model in v1.
Negative / trade-offs
- Tier-limited history: on growth_weekly, sims older than the tier's rolling window run with empty news. Sim cost report should call this out: "0 news items in range; sentiment-aware rules will no-op".
- Vendor lock-in to CryptoPanic. If their API changes or pricing drifts up, we migrate. The NewsClient interface gives us the seam; swapping the implementation is a single-file change.
- Sentiment scoring is derived from votes, not from article content. Mitigation: surface raw votes + panic_score in metadata so skills can override the numeric sentiment with their own logic.
- No body text in the standard plan: we get title, URL, source domain, tags. Skills that want to read the article have to follow url themselves (we don't add web-fetch to the tool set in v1; out of scope).
Things we'll need to revisit
- Tier upgrade path: once we have steady users wanting backtest history, upgrade from growth_weekly to a tier with historical date filtering. The codebase already supports it via CRYPTOPANIC_BASE_URL.
- Cron-driven incremental ingest once we have steady traffic, so popular date ranges hit the DB cache and CryptoPanic's API isn't in the hot path of every sim.
- Sentiment recalibration: optionally re-score articles with our own model and store a sentiment_internal column.
- X / Twitter feed. Crypto news travels through social faster than CryptoPanic indexes it. A separate ingester (likely via a paid X API) is a future ADR.
- Macro feeds (FOMC, SEC filings) via a non-crypto-specific source like GDELT, layered into the same news_items table with a different source value.
References
- packages/tools/src/tools/fetch-news-sentiment.ts: the tool definition (no changes; the client behind it is what changed)
- packages/tools/src/types.ts: NewsClient and NewsItem interfaces
- packages/data-ingest/src/cryptopanic.ts: new HTTP client
- packages/db/supabase/migrations/: news_items migration
- CryptoPanic API docs: https://cryptopanic.com/developers/api/