Open Methodology · Version tfi-elo-v1 · Deployed 2026-06-01

How We Predict

Every Pilgrim.26 forecast combines two signals: a team's Elo rating and the Travel Fatigue Index (TFI), a score we compute for each team before every match based on how brutal the trip to that stadium was.

Live Accuracy

Graded

3

Outcome accuracy

100%

Brier (lower=better)

0.307

Log loss

0.599

Updated live from full-time results. The Brier score and log loss are standard probabilistic scoring rules; a perfect forecaster scores 0 on both.

⚠ Small sample (N=3). Accuracy, Brier, and log loss are noisy below ~20 graded matches, so treat them as directional only.

Date	Match	Predicted (H/D/A)	Final	Result
JUN 11	MEX vs RSA	58/18/24	2–0	✓ HIT
JUN 12	KOR vs CZE	49/22/29	2–1	✓ HIT
JUN 13	USA vs PAR	58/18/24	4–1	✓ HIT
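
The scores above can be reproduced from the table. The sketch below is illustrative Python. It assumes the multi-class form of the Brier score (squared error summed over home/draw/away) and natural-log loss, and the function names are hypothetical, chosen only for this example. Run on the displayed percentages it gives roughly 0.308 and 0.601, close to the published 0.307 and 0.599; the small gap is presumably because the table shows rounded probabilities.

```python
import math

# Graded matches from the table: (home, draw, away) probabilities and the
# observed outcome index (0 = home win, 1 = draw, 2 = away win).
graded = [
    ((0.58, 0.18, 0.24), 0),  # MEX vs RSA, 2-0
    ((0.49, 0.22, 0.29), 0),  # KOR vs CZE, 2-1
    ((0.58, 0.18, 0.24), 0),  # USA vs PAR, 4-1
]

def brier(probs, outcome):
    """Multi-class Brier score (assumed form): squared error over all three outcomes."""
    return sum((p - (1.0 if i == outcome else 0.0)) ** 2 for i, p in enumerate(probs))

def log_loss(probs, outcome):
    """Negative natural log of the probability assigned to what actually happened."""
    return -math.log(probs[outcome])

def is_hit(probs, outcome):
    """A forecast is a hit when its most likely outcome is the one that occurred."""
    return max(range(3), key=lambda i: probs[i]) == outcome

n = len(graded)
mean_brier = sum(brier(p, o) for p, o in graded) / n
mean_log_loss = sum(log_loss(p, o) for p, o in graded) / n
accuracy = sum(is_hit(p, o) for p, o in graded) / n

print(f"accuracy={accuracy:.0%} brier={mean_brier:.3f} log_loss={mean_log_loss:.3f}")
```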

1 · Elo Rating

The baseline rating is 1500. Before the tournament we seed the 32 teams from their FIFA / world-football Elo history; new entrants with no history stay at the baseline. Ratings update after each full-time result using the classic Elo formula.
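
As a rough illustration of the classic update, here is a minimal Python sketch. The K-factor of 20 is a hypothetical value (this page does not state the one we use), and the function names are for illustration only.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Classic Elo expectation: logistic curve, base 10, scale 400."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400.0))

def update_elo(home: float, away: float, home_score: float, k: float = 20.0):
    """Return the new (home, away) ratings after one match.

    home_score is 1.0 for a home win, 0.5 for a draw, 0.0 for a home loss.
    k is a hypothetical K-factor; the real value is not given in this document.
    """
    delta = k * (home_score - expected_score(home, away))
    # Elo is zero-sum: whatever one side gains, the other loses.
    return home + delta, away - delta

# Two baseline teams, home side wins.
new_home, new_away = update_elo(1500.0, 1500.0, 1.0)
print(new_home, new_away)
```

With equal ratings the expected score is 0.5, so under the assumed K of 20 the winner gains 10 points and the loser drops 10.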

2 · Travel Fatigue Index (TFI)

The TFI is a 0–100 score for one team arriving at one match. It penalises four things:

Flight distance — 8 points per 1,000 km, capped at 50.
Time-zone shift — 5 points per hour, capped at 25.
Altitude jump — 2 points per 100 m gained, capped at 15.
Short rest — 15 points for every day below 4 days between matches, capped at 30.
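
The four penalties above can be sketched in Python as follows. The individual rates and caps come from the list. Two details are assumptions: the caps add up to 120, so the sketch clamps the total to 100 to stay on the 0–100 scale, and only altitude gained (not lost) is penalised. The function name and parameters are hypothetical.

```python
def tfi(distance_km: float, tz_shift_hours: float,
        altitude_gain_m: float, rest_days: float) -> float:
    """Travel Fatigue Index sketch for one team arriving at one match."""
    flight = min(8.0 * distance_km / 1000.0, 50.0)              # 8 pts per 1,000 km, cap 50
    tz = min(5.0 * abs(tz_shift_hours), 25.0)                   # 5 pts per hour, cap 25
    altitude = min(2.0 * max(altitude_gain_m, 0.0) / 100.0, 15.0)  # 2 pts per 100 m gained, cap 15
    rest = min(15.0 * max(4.0 - rest_days, 0.0), 30.0)          # 15 pts per day under 4, cap 30
    # Assumption: the component caps sum to 120, so clamp to the stated 0-100 range.
    return min(flight + tz + altitude + rest, 100.0)

# 2,500 km flight, 2-hour shift, 500 m climb, 3 days of rest.
print(tfi(2500, 2, 500, 3))  # 20 + 10 + 10 + 15 = 55.0
```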

For each team we compute the haversine (great-circle) distance between consecutive venues, the time-zone delta from the venues' IANA zones, the altitude delta, and the rest period between fixtures. A team's first match of the tournament uses a baseline TFI of 25 (one long-haul flight in).
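
For reference, the haversine distance between two venues can be computed as in this Python sketch. The mean Earth radius of 6,371 km is a common choice and an assumption here, as is the function name; coordinates are in decimal degrees.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed constant)

def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# A quarter of the equator: about 10,008 km.
print(round(haversine_km(0.0, 0.0, 0.0, 90.0)))
```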

3 · From Elo + TFI to probabilities

We subtract a TFI penalty of 1.2 Elo points per TFI point from each side's rating, then convert the resulting rating gap to a no-draw win probability using the standard logistic (base 10, scale 400). Draw probability is bounded between 18% and 30% based on the absolute rating gap.
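
The step from ratings to a home/draw/away split can be sketched in Python as below. The 1.2 penalty, the logistic, and the 18% to 30% draw bounds come from the text. Everything else is an assumption made for illustration: the draw probability falls linearly from 30% at a zero gap to 18% at a gap of 150 or more (`DRAW_TAPER_GAP` is hypothetical), and the remaining probability is split between the two sides in proportion to the no-draw win probability.

```python
TFI_ELO_PENALTY = 1.2            # Elo points per TFI point (from the text)
DRAW_MIN, DRAW_MAX = 0.18, 0.30  # draw bounds (from the text)
DRAW_TAPER_GAP = 150.0           # hypothetical: gap at which the draw floor is reached

def forecast(home_elo: float, away_elo: float, home_tfi: float, away_tfi: float):
    """Return (p_home, p_draw, p_away) under the assumptions stated above."""
    # Fatigue lowers each side's effective rating.
    home_eff = home_elo - TFI_ELO_PENALTY * home_tfi
    away_eff = away_elo - TFI_ELO_PENALTY * away_tfi
    gap = home_eff - away_eff

    # Standard Elo logistic: base 10, scale 400.
    p_home_no_draw = 1.0 / (1.0 + 10 ** (-gap / 400.0))

    # Assumed linear taper: closer matches are more likely to be drawn.
    taper = min(abs(gap) / DRAW_TAPER_GAP, 1.0)
    p_draw = DRAW_MAX - (DRAW_MAX - DRAW_MIN) * taper

    # Assumed split of the non-draw probability between the two sides.
    p_home = p_home_no_draw * (1.0 - p_draw)
    p_away = (1.0 - p_home_no_draw) * (1.0 - p_draw)
    return p_home, p_draw, p_away

print(forecast(1500, 1500, 25, 25))
```

Under these assumptions two evenly matched, equally rested teams come out at about 35/30/35, and a more fatigued side loses probability to its opponent even when the raw Elo ratings are equal.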

Expected goals (xG) are anchored at 1.35 per side, then nudged by up to ±0.4 according to the rating gap and fatigue. xG is informational; the table above scores match outcome only.

4 · What we don't model (yet)

· Squad availability, injuries, suspensions.
· Weather and pitch conditions.
· In-tournament form / momentum beyond Elo updates.
· Referee or VAR effects.

These are deliberate omissions — they add complexity without clear signal at this scale. We publish misses in the table above so you can see exactly when the model is wrong.