# FATE — the model, in full

Every number on the page comes from one place: a seeded Monte-Carlo simulation of the entire
World Cup, defined in [`engine.mjs`](group-k-war-room/model/engine.mjs) (the single source of
truth, locked by `conformance.test.mjs`, 14/14 checks passing). This document explains every step
precisely.

## 0. In one sentence
FATE plays the whole tournament nine thousand times (n=9,000) — from each team's EV-implied
strength, honoring real results as they're locked into history — and reports **how often** each
outcome happens. Probabilities are frequencies, not opinions.

## 1. Inputs
- **Team EV** per team (`war_room_data.json → teams[].ev`) — the *only* strength input.
- **72 group fixtures** (+ dates) and the **owners** map (group → fantasy owner).
- **Locked results `R`** — real final scores, fed live from ESPN (`/api/results`).
- **Markets** (Polymarket, Kalshi) and **PELE** (Nate Silver) — shown for comparison and used
  only in the optional champion *ensemble*; they do **not** drive the simulation.

## 2. Strength — one scalar per team
`s = ln(ev / 38) · 0.42`  (`strengthOf`). EV 38 ≈ field average → s ≈ 0. The **log** makes a
*ratio* of EV a fixed strength step (doubling EV always adds `0.42 · ln 2 ≈ 0.29`); the **0.42**
sets how much a strength gap widens the expected goal margin.
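
As a worked illustration of the formula above, here is a minimal sketch. The constant names `EV_REF` and `SCALE` are illustrative and may not match what `engine.mjs` uses internally.

```javascript
// Sketch of the strength mapping s = ln(ev / 38) * 0.42.
// EV_REF and SCALE are illustrative names, not necessarily those in engine.mjs.
const EV_REF = 38;   // EV treated as roughly field average
const SCALE = 0.42;  // converts a log-EV ratio into a strength step

function strengthOf(ev) {
  return Math.log(ev / EV_REF) * SCALE;
}

// Doubling EV adds the same step wherever you start: 0.42 * ln 2.
const step = strengthOf(76) - strengthOf(38);
console.log(strengthOf(38), step.toFixed(3)); // 0 0.291
```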

## 3. Group-match goals — opponent-aware Poisson
For a fixture, the expected goals are
`λ_home = 1.32 · exp(±ha + (s_home − s_away))`,
`λ_away = 1.32 · exp(∓ha + (s_away − s_home))`
(`MU=1.32` base scoring rate). The sign of `ha` favors whichever side is the host playing at home.
The bigger the strength gap, the more lopsided the λ's.
**Home advantage is venue-honest:** a World Cup is a *neutral-venue* tournament, so `ha = 0` for
every match **except** a 2026 host (USA / Mexico / Canada) playing at home, where `ha = HOME_ADV =
0.20`. We deliberately do **not** apply a home edge to the nominal "home" team of a fixture (that
field is only listing order): doing so would inject a phantom ~1.5× tilt in the λ ratio
(`exp(2 · 0.20) ≈ 1.49`) that compounds through the group tables, seeding and bracket. **Live
in-play** (`lambdas` live branch): λ scales by remaining minutes, blends the match's xG, and halves
on a red card, so an in-progress match tapers correctly.
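
The pre-match rule can be sketched as follows. This covers only the pre-match branch and leaves out the live in-play branch. The `hostAtHome` argument is a hypothetical way of passing the venue flag; the real `lambdas` signature may differ.

```javascript
// Sketch of the pre-match expected-goals rule (live in-play branch omitted).
// hostAtHome is a hypothetical flag: "home" or "away" marks which listed side is a
// 2026 host playing at home; null means a neutral venue.
const MU = 1.32;        // base scoring rate
const HOME_ADV = 0.20;  // applied only to a host nation playing at home

function lambdas(sHome, sAway, hostAtHome = null) {
  const ha = hostAtHome === "home" ? HOME_ADV
           : hostAtHome === "away" ? -HOME_ADV
           : 0;
  return {
    home: MU * Math.exp(ha + (sHome - sAway)),
    away: MU * Math.exp(-ha + (sAway - sHome)),
  };
}

console.log(lambdas(0.3, -0.1));      // neutral venue, stronger listed-home side
console.log(lambdas(0, 0, "home"));   // equal teams, host at home
```

With equal strengths and a host at home, the λ ratio is `exp(0.4)`, which is the ~1.5× tilt the text warns against applying to every nominal home team.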

## 4. Scoreline — Dixon-Coles
`sampleScore` builds the joint distribution over scorelines 0-0…6-6 from each side's Poisson,
then applies the **Dixon-Coles τ correction** (`DC_RHO=−0.11`): it boosts 0-0 and 1-1 and damps
1-0/0-1, because real football has more low-score draws than independent Poisson predicts. A
scoreline is then sampled from that corrected joint distribution. **Locked results bypass this**
— a played match uses its real score, every simulation.
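
A minimal sketch of the corrected sampling step, assuming the standard Dixon-Coles τ factors and a renormalization over the truncated 0..6 grid. The helper names `poissonPmf`, `tau` and `scoreGrid` are illustrative and not taken from `engine.mjs`.

```javascript
// Sketch of Dixon-Coles-corrected scoreline sampling on a 0..6 grid.
// rng is any function returning a uniform number in [0, 1).
const DC_RHO = -0.11;
const MAX_GOALS = 6;

function poissonPmf(k, lambda) {
  let p = Math.exp(-lambda);
  for (let i = 1; i <= k; i++) p *= lambda / i;
  return p;
}

// Dixon-Coles tau: only the four low-score cells are adjusted.
function tau(h, a, lh, la, rho) {
  if (h === 0 && a === 0) return 1 - lh * la * rho;
  if (h === 0 && a === 1) return 1 + lh * rho;
  if (h === 1 && a === 0) return 1 + la * rho;
  if (h === 1 && a === 1) return 1 - rho;
  return 1;
}

function scoreGrid(lh, la, rho = DC_RHO) {
  const cells = [];
  let total = 0;
  for (let h = 0; h <= MAX_GOALS; h++) {
    for (let a = 0; a <= MAX_GOALS; a++) {
      const p = poissonPmf(h, lh) * poissonPmf(a, la) * tau(h, a, lh, la, rho);
      cells.push({ h, a, p });
      total += p;
    }
  }
  for (const c of cells) c.p /= total; // renormalize after truncation and correction
  return cells;
}

function sampleScore(lh, la, rng = Math.random) {
  const cells = scoreGrid(lh, la);
  let u = rng();
  for (const c of cells) {
    u -= c.p;
    if (u < 0) return [c.h, c.a];
  }
  const last = cells[cells.length - 1]; // guard against floating-point round-off
  return [last.h, last.a];
}
```

With a negative ρ, the τ factors for 0-0 and 1-1 exceed 1 and those for 1-0 and 0-1 fall below 1, which is the draw boost described above.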

## 5. Group tables — FIFA 2026 Art. 13 tiebreak
Points are 3/1/0. Ties are broken by **head-to-head first** (H2H points → H2H GD → H2H GF among
the tied teams), and only **then** by overall GD, then overall GF, then a coin flip (`rankGroup`).
This H2H-before-GD order is the actual 2026 regulation and changes who advances in close groups.
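
The ordering can be sketched as below. This is a simplified single-pass version: it builds one head-to-head mini-table per set of teams level on points and does not re-apply head-to-head to a subset that is still tied. The match shape `{ home, away, hg, ag }` and the helper `table` are assumptions of the sketch.

```javascript
// Sketch of a head-to-head-first group ranking.
// matches: [{ home, away, hg, ag }]; returns team names, best first.
function table(teams, matches) {
  const t = Object.fromEntries(teams.map(n => [n, { pts: 0, gd: 0, gf: 0 }]));
  for (const { home, away, hg, ag } of matches) {
    if (!(home in t) || !(away in t)) continue; // ignore matches outside this set
    t[home].gf += hg; t[home].gd += hg - ag;
    t[away].gf += ag; t[away].gd += ag - hg;
    if (hg > ag) t[home].pts += 3;
    else if (hg < ag) t[away].pts += 3;
    else { t[home].pts += 1; t[away].pts += 1; }
  }
  return t;
}

function rankGroup(teams, matches, rng = Math.random) {
  const all = table(teams, matches);
  const coin = Object.fromEntries(teams.map(n => [n, rng()])); // final fallback
  // Head-to-head record of each team against the others level on points with it.
  const h2h = {};
  for (const n of teams) {
    const tied = teams.filter(m => all[m].pts === all[n].pts);
    h2h[n] = tied.length > 1 ? table(tied, matches)[n] : { pts: 0, gd: 0, gf: 0 };
  }
  return [...teams].sort((x, y) =>
    all[y].pts - all[x].pts ||
    h2h[y].pts - h2h[x].pts || h2h[y].gd - h2h[x].gd || h2h[y].gf - h2h[x].gf ||
    all[y].gd - all[x].gd || all[y].gf - all[x].gf ||
    coin[y] - coin[x]);
}
```

For example, if A and B both finish on 6 points and B has the better overall goal difference, A still ranks above B when A won their direct meeting.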

## 6. Third place — real best-8-of-12
All twelve third-placed teams are ranked by (points, GD, GF); the **top 8 advance** to fill the
32-team knockout (`simAllGroups` → `best8`). It's the true cross-group selection, not a fixed map.
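
A minimal sketch of the cross-group selection. The input shape `{ team, pts, gd, gf }` and the random tiebreak for teams level on all three keys are assumptions of this sketch, not confirmed behavior of `best8`.

```javascript
// Sketch: pick the best 8 of the 12 third-placed teams by (points, GD, GF).
// thirds: [{ team, pts, gd, gf }]. The coin for a full tie is an assumption.
function best8(thirds, rng = Math.random) {
  return thirds
    .map(t => ({ ...t, coin: rng() }))
    .sort((x, y) => y.pts - x.pts || y.gd - x.gd || y.gf - x.gf || y.coin - x.coin)
    .slice(0, 8)
    .map(t => t.team);
}
```

Because the ranking runs across all twelve groups at once, which groups supply a third-placed qualifier varies from one simulation to the next.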

## 7. Knockout — strength-seeded single elimination
The 32 qualifiers are seeded by strength into a standard bracket (`seedOrder`). Each tie:
`sampleScore` for regulation; if level → **extra time** (λ × 0.33); if still level → **penalties**
(a logistic coin on the strength difference). Winner advances; depth reached is recorded.
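
One knockout tie can be sketched as below. For brevity, goals here come from plain independent Poisson draws rather than the `sampleScore` step described above, and `PEN_SCALE` is a hypothetical logistic slope. Neither the helper names nor that constant are taken from `engine.mjs`.

```javascript
// Sketch of one knockout tie: regulation, then extra time, then penalties.
// Goals use plain independent Poisson draws (a simplification of sampleScore).
const ET_FACTOR = 0.33;  // extra time is about a third of a match
const PEN_SCALE = 1.0;   // assumption: hypothetical slope, not from engine.mjs

function poisson(lambda, rng) {
  // Knuth's multiplication method; fine for small lambda.
  const limit = Math.exp(-lambda);
  let k = 0, p = rng();
  while (p > limit) { k++; p *= rng(); }
  return k;
}

function playTie(a, b, rng = Math.random) {
  // a, b: { name, s, lambda }, with lambda the side's regulation expected goals
  let ga = poisson(a.lambda, rng), gb = poisson(b.lambda, rng);
  if (ga === gb) {
    ga += poisson(a.lambda * ET_FACTOR, rng);
    gb += poisson(b.lambda * ET_FACTOR, rng);
  }
  if (ga !== gb) return ga > gb ? a.name : b.name;
  const pA = 1 / (1 + Math.exp(-PEN_SCALE * (a.s - b.s))); // logistic coin
  return rng() < pA ? a.name : b.name;
}
```

The logistic coin gives equal teams a 50/50 shootout and tilts it smoothly toward the stronger side as the strength gap grows.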

## 8. Monte-Carlo aggregation
The entire tournament above runs **n times** (n = 9,000 for the live river; seeded, so replayable).
- **`survivalRiver`** → each team's P(reach R32 / R16 / QF / SF / Final / Champion) = the fraction
  of simulations reaching that depth. Column totals conserve probability mass: **[48, 32, 16, 8,
  4, 2, 1]**, i.e. 48 teams in the group stage, 32 in the R32, then **halving each round**. That
  is the river's taper, and a built-in correctness check.
- **`simulateLeague`** → each owner's full score distribution (mean, p10–p90, P(win the pool),
  podium, expected rank) and each team's championship probability.
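
The river aggregation and its column-total check can be sketched as follows. The depth encoding (0 = group stage only, up to 6 = Champion) and the input shape are assumptions of this sketch; the real `survivalRiver` may record depth differently.

```javascript
// Sketch: turn per-simulation depth records into survival probabilities.
// depths[sim][team] = deepest stage index reached: 0 = group stage only, 1 = R32,
// 2 = R16, 3 = QF, 4 = SF, 5 = Final, 6 = Champion (this encoding is an assumption).
const EXPECTED_TOTALS = [48, 32, 16, 8, 4, 2, 1];

function survivalRiver(depths) {
  const n = depths.length;
  const river = {};
  for (const sim of depths) {
    for (const [team, depth] of Object.entries(sim)) {
      if (!river[team]) river[team] = new Array(EXPECTED_TOTALS.length).fill(0);
      // Reaching depth d implies having reached every earlier stage too.
      for (let d = 0; d <= depth; d++) river[team][d] += 1 / n;
    }
  }
  return river;
}

function columnTotals(river) {
  return EXPECTED_TOTALS.map((_, d) =>
    Object.values(river).reduce((sum, row) => sum + row[d], 0));
}
```

Comparing `columnTotals(river)` with `EXPECTED_TOTALS` is a cheap check that no simulation lost or duplicated a team along the bracket.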

## 9. Champion ensemble — model + market
`championEnsemble`: `P = w·model + (1−w)·market-consensus`, renormalized to sum to 1. This is how
the model and the betting markets can be blended. On the page we keep them **separate and labeled**
(FATE model vs Polymarket vs Kalshi vs PELE) so you always see who says what.
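
A minimal sketch of the blend formula. The default `w = 0.5` and the choice to treat a team missing from one source as 0 in that source are assumptions of this sketch, not confirmed behavior of `championEnsemble`.

```javascript
// Sketch of the model/market blend P = w*model + (1-w)*market, renormalized.
// model, market: { teamName: probability }. Missing teams count as 0 (assumption).
function championEnsemble(model, market, w = 0.5) {
  const teams = new Set([...Object.keys(model), ...Object.keys(market)]);
  const blended = {};
  let total = 0;
  for (const t of teams) {
    blended[t] = w * (model[t] ?? 0) + (1 - w) * (market[t] ?? 0);
    total += blended[t];
  }
  for (const t of teams) blended[t] /= total; // renormalize so the result sums to 1
  return blended;
}

console.log(championEnsemble({ A: 0.5, B: 0.3, C: 0.2 }, { A: 0.4, B: 0.4 }));
```

The renormalization matters when a source is partial or includes a bookmaker margin, since the raw weighted sum then no longer adds up to 1.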

## 9b. Independence, overlap & double-counting (read this)
The projection — every river ribbon, champion %, squad equity and "% to win pool" — is driven by
**exactly one strength input: team EV**, plus **locked real results**. Markets and PELE never enter
the simulation, so the headline numbers are **not** double-counted with them. Honesty still requires
naming the overlaps that *do* exist:

- **EV is one prior, not three.** Strength = EV only, so any bias in EV (or the old phantom home
  edge, now fixed) compounds through the whole bracket. EV is a single point of failure — a
  transparent prior, not ground truth.
- **The "sources" are correlated, not independent.** FATE-model, the betting market and PELE all
  estimate the *same* latent "who is good." Their **agreement does not multiply confidence**
  (correlated errors don't cancel); only their **disagreement** (the Divergence view) is genuinely
  informative.
- **Polymarket ≈ Kalshi.** Two exchanges on the same event are arbitrage-linked — effectively **one**
  market signal shown as two quotes (a range), and it enters any blend **once**, not twice.
- **PELE is one source, two faces.** Nate Silver's *champion %* (paywalled snapshot) and his live
  *strength rating* (`current_pele`, scraped daily) are the **output and input of the same model** —
  one opinion shown two ways.
- **The model–market blend can be circular.** If EV was itself read off betting odds or a power
  ranking, `w·model + (1−w)·market` re-weights the market twice (once inside EV, once explicitly). So
  the blend is **optional and labeled**, sources are shown **separate by default**, and "consensus"
  is a weighted *opinion*, never an independence-based confidence boost.
- **No calibrating to what we compare against.** `MU=1.32` is the historical base scoring rate; the
  `0.42` scale sets how an EV gap widens the goal margin. Neither is fit to the markets/PELE we then
  display — otherwise "the model agrees with the market" would be tautological.

The simulation is clean (EV + real results); the *comparison* layer is honest about correlation.

## 10. Scoring (the sweepstake pool)
`SC` weights: win 3 · draw 1 · group-winner 12 · runner-up 7 · third 4 · reach R32 10 · R16 25 ·
QF 45 · SF 70 · Final 100 · **Champion 150**. These drive each owner's projected score.
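
To make the weights concrete, here is a small sketch that totals one team's contribution. The key names, the reading of win/draw as per-match points, and the convention that the caller lists each bonus that applies are all assumptions; whether stage bonuses stack is decided by `engine.mjs`, not by this sketch.

```javascript
// Sketch: apply the SC weights to one team's record.
// Key names are illustrative; the caller decides which bonuses apply.
const SC = {
  win: 3, draw: 1,
  groupWinner: 12, runnerUp: 7, third: 4,
  R32: 10, R16: 25, QF: 45, SF: 70, Final: 100, Champion: 150,
};

function teamScore({ wins = 0, draws = 0, bonuses = [] }) {
  const bonusTotal = bonuses.reduce((sum, key) => {
    if (!(key in SC)) throw new Error(`unknown bonus: ${key}`);
    return sum + SC[key];
  }, 0);
  return wins * SC.win + draws * SC.draw + bonusTotal;
}

// Two wins, one draw, group winner, credited with reaching the R32.
console.log(teamScore({ wins: 2, draws: 1, bonuses: ["groupWinner", "R32"] })); // 29
```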

## 11. Determinism & verification
A `mulberry32` seeded RNG means a given (results, seed) renders **identically** every time — that
enables replay, the live "breathing" (same data, new seed), and cross-implementation locking.
`conformance.test.mjs` asserts 14 invariants (determinism, Art.13 H2H, draw-gravity, river totals
[48→1], champion Σ=1, ensemble renormalization, …) — all green.
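
For reference, this is the commonly published form of the `mulberry32` generator, shown as a sketch of why a fixed seed replays identically. The copy inside `engine.mjs` may differ in detail.

```javascript
// The widely circulated mulberry32 generator: 32-bit state, uniform output in [0, 1).
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6D2B79F5) | 0;
    let t = Math.imul(a ^ (a >>> 15), a | 1);
    t = (t + Math.imul(t ^ (t >>> 7), t | 61)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Two generators built from the same seed produce the same stream.
const r1 = mulberry32(2026), r2 = mulberry32(2026);
console.log(r1() === r2(), r1() === r2()); // true true
```

Passing one such generator through every sampling call is what makes a whole tournament run a pure function of (results, seed).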

## 12. Live data flow
- **Results / goals:** ESPN keyless scoreboard → `/api/results` → merged into `R` → re-simulate.
- **Odds:** Polymarket Gamma + Kalshi → `/api/markets` → shown per source (never silently blended).
- **PELE:** Nate Silver's model, a dated snapshot (paywalled; partial) — comparison only.

## 13. What FATE does **not** model (the honest limits)
No lineups, injuries, suspensions, or rotation. No player-level or xG *history* (only the live
match's xG, if present). No fatigue, travel, weather, referee, crowd, in-game momentum beyond the
live λ taper, transfers, manager changes, or motivation/dead-rubber effects. FATE is **disciplined
uncertainty** from team strength + real results — explicitly *not* prophecy. Two more honest limits:
(1) **strength is static** — a team's EV does not update from in-tournament *form*; a played match is
locked as ground truth, but it does not nudge that team's strength for *future* matches (PELE, by
contrast, carries a `performance_adj`). (2) **EV provenance is assumed independent** — we treat the
team EV as an independent power prior; if it was in fact eyeballed from odds/rankings, the
model-vs-market comparison is partly circular (see §9b).

## 14. Sources of truth
`engine.mjs` (the model) · `war_room_data.json` (fixtures, EV, locked results) ·
`conformance.test.mjs` (the 14-check lock) · `fate_serve.py` (the live `/api/markets` + `/api/results` endpoints).
