ADR-0011: News integration via CryptoPanic + tool-based delivery

Status: accepted
Date: 2026-06-01
Related: ADR-0006 (agent boundaries), ADR-0009 (strategy fields), ADR-0010 (model routing)

Context

The skill schema already declares a NewsClient interface and a fetch_news_sentiment tool, so the wiring is in place. However, the only client implementation today is fixtureNewsClient ({ search: async () => [] }), which means the sim runs without news. We want skills to reason over real headlines so traders can write rules like:

"Before opening a long position, check news in the last hour — if any headline is bearish, skip the entry."

Two product-shaping choices needed an explicit decision:

Where does news come from? Backtests replay past bars, so the source must be queryable by timestamp range and must not leak future articles into past ticks.
How does the model consume it? A tool the model can call on demand, or context that's always prepended to the prompt?

Decision

News source: CryptoPanic

We use CryptoPanic's developer API as the source of truth for both historical (within the tier's window) and live news.

What we get from any paid tier:

Per-currency filter — ?currencies=BTC,ETH matches our internal symbol convention
Source-side sentiment signals — votes (counts) and panic_score (importance); higher tiers add a sentiment object with normalized fields. We map all of these onto a single sentiment ∈ [-1, +1] for the NewsItem shape
Stable, idempotent ids for dedup on (source, external_id)
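
The ADR does not spell out how vote counts become a single number, so the following is only one possible mapping, sketched under stated assumptions: the Votes shape and the sentimentFromVotes name are hypothetical, and the real mapping may also weigh panic_score or the higher-tier sentiment object.

```typescript
// Hypothetical vote shape. Field names are assumptions for illustration,
// not confirmed against the CryptoPanic response schema.
interface Votes {
  positive: number;
  negative: number;
}

// Maps directional vote counts onto [-1, +1].
// Returns null when there are no directional votes, so "no signal"
// stays distinguishable from a genuinely neutral score of 0.
function sentimentFromVotes(votes: Votes): number | null {
  const total = votes.positive + votes.negative;
  if (total <= 0) return null;
  return (votes.positive - votes.negative) / total;
}
```

Returning null rather than 0 for unvoted items fits a nullable sentiment column and lets a skill tell "nobody voted" apart from "votes were evenly split".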

Tier-dependent behavior:

growth_weekly tier (current): accepts but silently ignores published_at_from / published_at_to. News coverage is the rolling "current" window only (~last few days). Backtests over older bars run with empty news.
developer / Pro tiers: full historical date-range queries. The same client code switches over by changing CRYPTOPANIC_BASE_URL — no migration needed.

CRYPTOPANIC_BASE_URL defaults to the growth_weekly endpoint, so upgrading tiers is a single environment-variable change.

Delivery: tool the model calls (`fetch_news_sentiment`)

Already defined in packages/tools/src/tools/fetch-news-sentiment.ts. The model decides per tick whether to spend a tool call. The skill author writes rules in plain English in the entry / exit / riskManagement fields ("call fetch_news_sentiment with windowHours=1 before opening a long…").

Backtest leakage guard

NewsClient.search() already accepts asOf: Date and the runtime threads tick.tickAt into it per tick. The new DB-backed client enforces this with a hard SQL constraint:

WHERE ts <= $asOf AND ts > $asOf - ($windowHours * interval '1 hour')

There is no code path in the tool that can resolve to now() when asOf is provided. Live mode passes asOf = new Date() at construction time.
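
The window semantics of that constraint can be mirrored in memory, which is handy for unit-testing the invariant without a database. This is a minimal sketch, not the DB-backed client: NewsItemSketch and selectWindow are hypothetical names, and the real NewsItem interface lives in packages/tools/src/types.ts.

```typescript
// Hypothetical, trimmed-down item shape used only for this sketch.
interface NewsItemSketch {
  ts: Date;
  title: string;
}

// Keeps items with asOf - windowHours < ts <= asOf.
// The upper bound is inclusive and the lower bound exclusive,
// matching the SQL constraint. Nothing here reads the wall clock.
function selectWindow(
  items: NewsItemSketch[],
  asOf: Date,
  windowHours: number,
): NewsItemSketch[] {
  const upper = asOf.getTime();
  const lower = upper - windowHours * 60 * 60 * 1000;
  return items.filter((item) => {
    const t = item.ts.getTime();
    return t <= upper && t > lower;
  });
}
```

An item published one second after asOf is dropped, which is exactly the property a backtest needs.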

Storage

A new news_items table:

news_items (
  id            uuid primary key,
  ts            timestamptz not null,
  source        text not null,
  external_id   text not null,
  symbols       text[] not null,
  title         text not null,
  url           text,
  sentiment     numeric,
  metadata      jsonb,
  ingested_at   timestamptz default now(),
  unique (source, external_id)
)

Indexes: (ts desc) for time-window queries; GIN on symbols for symbols && ARRAY['BTC'] lookups.

RLS mirrors bars: service-role writes only; authenticated reads (news is market data, no PII).
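
The unique (source, external_id) constraint handles dedup at the database level. Overlapping pulls can also return the same article twice within one batch, so an ingester might collapse duplicates before writing. A minimal sketch of that step, assuming a hypothetical IncomingItem shape and dedupeBatch helper that are not part of the codebase:

```typescript
// Hypothetical shape of a fetched item before it is written to news_items.
interface IncomingItem {
  source: string;
  externalId: string;
  title: string;
}

// Keeps the first occurrence of each (source, externalId) pair,
// preserving the original order of the batch.
function dedupeBatch(items: IncomingItem[]): IncomingItem[] {
  const seen: Record<string, true> = {};
  const out: IncomingItem[] = [];
  for (const item of items) {
    // JSON-encode the pair so separators inside either field cannot collide.
    const key = JSON.stringify([item.source, item.externalId]);
    if (seen[key]) continue;
    seen[key] = true;
    out.push(item);
  }
  return out;
}
```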

Auto-ingest

packages/simulator/src/persist.ts already auto-ingests bars when a queued sim's range is short on data. We add a best-effort maybeIngestNews step that pulls a 72h pre-roll plus the full sim range from CryptoPanic before the backtest starts. If the env key is unset or the API fails, the sim continues — the tool just returns []. Same status-pumping pattern as bars (computing_metrics → running) so the UI shows progress.
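
The best-effort behavior described above can be sketched as follows. This is an illustration under assumptions, not the code in persist.ts: newsIngestRange, maybeIngestNewsSketch, and the injected fetchNews callback are hypothetical, and status updates are omitted.

```typescript
const PRE_ROLL_HOURS = 72;

interface IngestRange {
  from: Date;
  to: Date;
}

// Extends the sim range backwards by the pre-roll so the first ticks
// already have headlines inside their lookback window.
function newsIngestRange(simStart: Date, simEnd: Date): IngestRange {
  const preRollMs = PRE_ROLL_HOURS * 60 * 60 * 1000;
  return { from: new Date(simStart.getTime() - preRollMs), to: simEnd };
}

// Hypothetical fetcher: pulls and stores news for a range,
// resolving to the number of items ingested.
type FetchNews = (range: IngestRange) => Promise<number>;

// Best-effort ingest: a missing key or a failing API never aborts the sim.
// Both cases resolve to 0, and the tool later returns [] for those ticks.
async function maybeIngestNewsSketch(
  apiKey: string | undefined,
  simStart: Date,
  simEnd: Date,
  fetchNews: FetchNews,
): Promise<number> {
  if (!apiKey) return 0;
  try {
    return await fetchNews(newsIngestRange(simStart, simEnd));
  } catch {
    return 0;
  }
}
```

Injecting the fetcher keeps the failure handling testable without network access.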

Alternatives considered

Alt A — GDELT 2.0

Free, 15-year history, every article worldwide every 15 min
Not picked: not crypto-specific. Mapping a ticker such as "BTC" to keywords is noisy ("Bitcoin" matches every article that mentions it in passing). Sentiment scoring is general news tone (V2GCAM), not crypto-domain. Filtering noise would dominate the integration cost.
Worth revisiting if we want macro / non-crypto news layered alongside (rate decisions, regulatory moves the crypto press hasn't picked up yet).

Alt B — Build our archive forward from now via free tier

$0/mo and the API still works for live ingestion
Not picked: free tier has no historical access. Either we cron-ingest forward for 30-60 days before backtests are meaningful, or we accept that early users can't backtest news-aware strategies. Both kill the product use case at launch.
The right move long-term: cron incremental ingest in addition to on-demand Pro pulls, so popular ranges are pre-warmed and we eventually own our archive.

Alt C — Always-prepend last N headlines to the prompt

Deterministic; no per-tick decision logic in the skill
Not picked: burns 500–2000 prompt tokens per tick whether news is relevant or not. At ~$1/M input for Haiku that's a 4–15× cost multiplier for sims that are otherwise sparse. Tool-calling lets the model spend that budget only on the ticks where the strategy actually wants news.

Alt D — Both (tool + compact summary always-on)

One-line summary in prompt ("BTC: 3 headlines last 1h, sentiment +0.2") + full tool for drill-down
Not picked yet: more code paths to maintain, and the "compact summary" abstraction is doing the model's job for it. If we observe in production that skills mostly want a quick gut check, revisit.

Consequences

Positive

Skills become news-aware with zero new tool code — the existing fetch_news_sentiment lights up.
Backtests stay leakage-safe by construction; the asOf invariant lives in SQL, not in tool internals where a refactor could weaken it.
Same data path serves sim and live. The DB-backed NewsClient doesn't care; live runners pass asOf = new Date(), sims pass the tick timestamp.
Tier-portable: we run on the lower tier today (rolling-window news); upgrading is a single env-var change that unlocks historical queries.
CryptoPanic's vote tags mean we don't have to run our own sentiment model in v1.

Negative / trade-offs

Tier-limited history: on growth_weekly, sims older than the tier's rolling window run with empty news. The sim cost report should call this out: "0 news items in range; sentiment-aware rules will no-op".
Vendor lock-in to CryptoPanic. If their API changes or pricing drifts up, we migrate. The NewsClient interface gives us the seam — swapping the implementation is a single-file change.
Sentiment scoring is derived from votes, not from article content. Mitigation: surface raw votes + panic_score in metadata so skills can override the numeric sentiment with their own logic.
No body text in the standard plan — we get title, URL, source domain, tags. Skills that want to read the article have to follow url themselves (we don't add web-fetch to the tool set in v1; out of scope).

Things we'll need to revisit

Tier upgrade path: once we have steady users wanting backtest history, upgrade from growth_weekly to a tier with historical date filtering. The codebase already supports it via CRYPTOPANIC_BASE_URL.
Cron-driven incremental ingest once we have steady traffic, so popular date ranges hit the DB cache and CryptoPanic's API isn't in the hot path of every sim.
Sentiment recalibration: optionally re-score articles with our own model and store a sentiment_internal column.
X / Twitter feed. Crypto news travels through social faster than CryptoPanic indexes it. A separate ingester (likely via a paid X API) is a future ADR.
Macro feeds (FOMC, SEC filings) via a non-crypto-specific source like GDELT, layered into the same news_items table with a different source column.

References

packages/tools/src/tools/fetch-news-sentiment.ts — the tool definition (no changes; the client behind it is what changed)
packages/tools/src/types.ts — NewsClient and NewsItem interfaces
packages/data-ingest/src/cryptopanic.ts — new HTTP client
packages/db/supabase/migrations/ — news_items migration
CryptoPanic API docs: https://cryptopanic.com/developers/api/
