World Cup 2026 Predictions: Build an ML Model from Historical Data

Build a machine-learning model to predict the 2026 FIFA World Cup. Python + scikit-learn + historical World Cup data via API.

Every World Cup brings out the predictors. Banks, betting markets, FiveThirtyEight-style stats sites, individual data scientists on Twitter — everyone publishes their bracket and inevitably gets ~half of it wrong. The math is hard, the sample sizes are small, and there's a lot of noise.

This tutorial walks through building a credible (not perfect) ML-based predictor for the 2026 FIFA World Cup using Python, scikit-learn, and historical World Cup data via REST API.

What we're predicting

Three things at increasing difficulty:

  1. Per-match result (home win / draw / away win) — a multi-class classification problem
  2. Per-match goals (home goals, away goals) — a Poisson regression problem
  3. Tournament winner — Monte Carlo simulation built on top of the per-match model

We'll focus on (1) and use the same model to bootstrap (2) and (3).

Step 1: Pull historical World Cup data

import requests
import pandas as pd

API = 'https://api.thestatsapi.com/api/football/matches'
HEADERS = {'Authorization': 'Bearer YOUR_API_KEY'}
COMPETITION_ID = 'YOUR_WORLD_CUP_COMPETITION_ID'  # placeholder: look up the World Cup's ID in the API

# Every World Cup match from 1990 onwards (when data is most consistent)
all_matches = []
for season in range(1990, 2023, 4):  # 1990, 1994, ..., 2022
    r = requests.get(
        API,
        params={'competition_id': COMPETITION_ID, 'season': season, 'per_page': 100},
        headers=HEADERS,
        timeout=30,
    )
    r.raise_for_status()  # fail loudly on auth or rate-limit errors
    all_matches.extend(r.json()['data'])

df = pd.DataFrame([{
    'season': m['season'],
    'home_team': m['home']['name'],
    'away_team': m['away']['name'],
    'home_score': m['home_score'],
    'away_score': m['away_score'],
    'stage': m['stage'],
} for m in all_matches])

print(f"Loaded {len(df)} historical matches")
# ~550 matches across 9 World Cups (52 each in 1990 and 1994, 64 each from 1998 to 2022)
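
Before building features, it's worth checking that every tournament loaded completely. The sketch below counts matches per season and compares them with the tournament formats of the period (52 matches with 24 teams in 1990 and 1994, 64 matches with 32 teams from 1998 to 2022). The helper name find_incomplete_seasons is made up for this example, and it assumes the API's season field can be converted to an integer year.

```python
from collections import Counter

# Matches per tournament under the formats used from 1990 to 2022:
# 24 teams (52 matches) in 1990 and 1994, 32 teams (64 matches) afterwards.
EXPECTED_MATCHES = {1990: 52, 1994: 52, **{year: 64 for year in range(1998, 2023, 4)}}


def find_incomplete_seasons(seasons):
    """Return {season: (loaded, expected)} for every season whose count is off."""
    loaded = Counter(int(s) for s in seasons)
    return {
        season: (loaded.get(season, 0), expected)
        for season, expected in EXPECTED_MATCHES.items()
        if loaded.get(season, 0) != expected
    }


# Toy input: 2022 is complete, 2018 is one match short, other seasons are absent.
sample = [2022] * 64 + [2018] * 63
problems = find_incomplete_seasons(sample)
for season, (loaded, expected) in sorted(problems.items()):
    print(f"{season}: loaded {loaded}, expected {expected}")
```

With the real data you would call find_incomplete_seasons(df['season']) and expect an empty dict back. A short season usually means a paginated response was cut off.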

Step 2: Build features

Football match prediction features fall into a few buckets:

  • Team strength — FIFA rank, Elo rating, recent form
  • Squad quality — average market value of the starting XI
  • Stage — group-stage games have different dynamics from knockouts
  • Rest — days since last match (fatigue matters in compressed tournaments)
  • Geography — distance travelled, time zone differences from home
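
Of the team-strength options above, Elo is the one you can compute yourself from nothing but past results. Here is a minimal sketch of the standard Elo update. The function names and the K-factor of 20 are illustrative choices, not tuned values, and football-specific variants (goal-difference multipliers, match-importance weights) are left out.

```python
def elo_expected(rating_a, rating_b):
    """Expected score (between 0 and 1) for team A against team B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a, rating_b, score_a, k=20.0):
    """Return both ratings after one match.

    score_a is 1.0 for an A win, 0.5 for a draw, 0.0 for a B win.
    k controls how fast ratings move; 20 is an illustrative value.
    """
    delta = k * (score_a - elo_expected(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Two equally rated teams: the winner gains exactly what the loser drops.
new_a, new_b = elo_update(1500.0, 1500.0, 1.0)
print(new_a, new_b)  # 1510.0 1490.0
```

Run this over every international match in date order and the rating difference before kick-off becomes a feature you can use alongside, or instead of, the FIFA rank difference.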

For a first model, FIFA rank alone is enough (recent form is added in Step 4):

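# Add FIFA rank at the time of each match.
# The CSV is expected to have the columns year, team, rank (one row per team per World Cup year).
# The FIFA ranking was only introduced in the early 1990s, after the 1990 World Cup,
# so 1990 matches get no rank here and are dropped by the dropna() in Step 3.
fifa_ranks = pd.read_csv('historical_fifa_ranks.csv')  # source from API or Wikipedia
df = df.merge(fifa_ranks, left_on=['season', 'home_team'], right_on=['year', 'team'], how='left')
df = df.rename(columns={'rank': 'home_rank'}).drop(columns=['year', 'team'])
df = df.merge(fifa_ranks, left_on=['season', 'away_team'], right_on=['year', 'team'], how='left')
df = df.rename(columns={'rank': 'away_rank'}).drop(columns=['year', 'team'])

# Negative rank_diff means the home side is the higher-ranked team (rank 1 is best)
df['rank_diff'] = df['home_rank'] - df['away_rank']

# Outcome label: H (home win), D (draw), A (away win)
df['result'] = df.apply(lambda r: 'H' if r['home_score'] > r['away_score'] else 'A' if r['home_score'] < r['away_score'] else 'D', axis=1)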

Step 3: Train a classifier

from sklearn.model_selection import train_test_split
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import classification_report, log_loss

features = ['rank_diff', 'home_rank', 'away_rank']
X = df[features].dropna()  # drops matches where either rank is missing
y = df.loc[X.index, 'result']

# Quick random split for a first look (the FAQ covers a fairer temporal split).
# stratify=y keeps the H/D/A mix similar in the train and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

model = GradientBoostingClassifier(n_estimators=200, max_depth=3, random_state=42)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
y_proba = model.predict_proba(X_test)  # columns follow the order of model.classes_
print(classification_report(y_test, y_pred))
print(f"Log loss: {log_loss(y_test, y_proba, labels=model.classes_):.3f}")

A simple model like this typically gets to ~50% accuracy (vs ~33% for random guessing among three outcomes) and log loss around 1.0, slightly better than the ln 3 ≈ 1.10 you get by assigning one third to every outcome. That's enough to outperform random guessing, but it won't beat the closing line at Pinnacle.
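
To judge whether a log loss near 1.0 is any good, it helps to compute the two trivial baselines yourself. The sketch below uses only the standard library. The function names and the toy labels are invented for illustration and are not real World Cup frequencies.

```python
import math
from collections import Counter


def uniform_log_loss(n_classes=3):
    """Log loss of assigning equal probability to every class."""
    return math.log(n_classes)


def prior_log_loss(train_labels, test_labels):
    """Log loss of always predicting the training-set class frequencies."""
    counts = Counter(train_labels)
    total = sum(counts.values())
    priors = {label: n / total for label, n in counts.items()}
    eps = 1e-15  # guards against log(0) for labels unseen in training
    losses = [-math.log(max(priors.get(label, 0.0), eps)) for label in test_labels]
    return sum(losses) / len(losses)


# Toy labels only
train = ['H'] * 5 + ['D'] * 2 + ['A'] * 3
test = ['H', 'D', 'A', 'H']
print(f"uniform baseline: {uniform_log_loss():.3f}")
print(f"prior baseline:   {prior_log_loss(train, test):.3f}")
```

A model only earns its keep if its log loss is below both numbers on held-out matches. With your own data, pass y_train and y_test as the two label lists.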

Step 4: Add xG-based features

The single biggest improvement is to include each team's pre-tournament xG form from recent friendlies and qualifiers:

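# Pull recent xG form for each team
def recent_xg_form(team_slug, before_date, n_matches=10):
    r = requests.get(
        f'https://api.thestatsapi.com/api/football/teams/{team_slug}/recent',
        params={'before': before_date, 'limit': n_matches},
        headers=HEADERS,
        timeout=30,
    )
    r.raise_for_status()
    matches = r.json()['data']
    if not matches:  # no xG history: return NaN instead of dividing by zero
        return float('nan'), float('nan')
    avg_xg = sum(m['xg_for'] for m in matches) / len(matches)
    avg_xg_against = sum(m['xg_against'] for m in matches) / len(matches)
    return avg_xg, avg_xg_against

def to_slug(name):
    # Naive slug: assumes team slugs are the lowercase name with hyphens for spaces
    return name.lower().replace(' ', '-')

# Add as features for both sides.
# This needs a 'date' column on df, which Step 1 does not create:
# add the match date from the API response there first.
for side in ('home', 'away'):
    df[f'{side}_recent_xg'], df[f'{side}_recent_xga'] = zip(*df.apply(
        lambda r: recent_xg_form(to_slug(r[f'{side}_team']), r['date']), axis=1
    ))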

Now add the new columns to the features list and retrain. Expect a 5-10% relative accuracy boost, i.e. a few percentage points on top of ~50%.

Step 5: Predict the 2026 tournament

Pull the 2026 fixtures (with placeholders for knockout matches):

fixtures_2026 = requests.get(
    API,
    params={'competition_id': COMPETITION_ID, 'season': 2026, 'per_page': 104},
    headers=HEADERS,
).json()['data']

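# For each group-stage match, predict home win / draw / away win probabilities.
# `model` must have been trained on exactly the columns built below; if you retrained
# with xG features, build those columns for each fixture as well.
group_matches = [f for f in fixtures_2026 if f['stage'] == 'group-stage']

# current_fifa_rank() is not defined in this tutorial: supply your own lookup
# that maps a team slug to its latest FIFA rank.
classes = list(model.classes_)  # column order of predict_proba

predictions = []
for f in group_matches:
    home = f['home']['slug']
    away = f['away']['slug']
    home_rank = current_fifa_rank(home)
    away_rank = current_fifa_rank(away)

    # Named `row` so it does not overwrite the `features` list from Step 3
    row = pd.DataFrame([{
        'rank_diff': home_rank - away_rank,
        'home_rank': home_rank,
        'away_rank': away_rank,
    }])
    proba = model.predict_proba(row)[0]
    predictions.append({
        'match': f"{f['home']['name']} vs {f['away']['name']}",
        'p_home': proba[classes.index('H')],
        'p_draw': proba[classes.index('D')],
        'p_away': proba[classes.index('A')],
    })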

print(pd.DataFrame(predictions).head(10))
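
A quick way to sanity-check group-stage probabilities before running any simulation is to turn them into expected points (3 for a win, 1 for a draw). The sketch below assumes each prediction dict carries separate 'home' and 'away' team keys, which the loop above does not store (it only keeps a combined 'match' string), so you would add them. The probabilities are made up.

```python
from collections import defaultdict


def expected_points(predictions):
    """Sum expected points per team: 3 for a win, 1 for a draw."""
    points = defaultdict(float)
    for p in predictions:
        points[p['home']] += 3 * p['p_home'] + p['p_draw']
        points[p['away']] += 3 * p['p_away'] + p['p_draw']
    return dict(points)


# Made-up probabilities for a three-team toy group
toy_predictions = [
    {'home': 'A', 'away': 'B', 'p_home': 0.5, 'p_draw': 0.3, 'p_away': 0.2},
    {'home': 'A', 'away': 'C', 'p_home': 0.6, 'p_draw': 0.2, 'p_away': 0.2},
    {'home': 'B', 'away': 'C', 'p_home': 0.4, 'p_draw': 0.3, 'p_away': 0.3},
]
table = expected_points(toy_predictions)
for team, pts in sorted(table.items(), key=lambda kv: -kv[1]):
    print(f"{team}: {pts:.2f} expected points")
```

Expected points ignore tie-breakers and the spread of outcomes, so treat them as a smoke test. If a clear favourite is not at the top of its group here, the features or the class ordering deserve a second look.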

Step 6: Monte Carlo the tournament

Once you can predict per-match win probabilities, simulate the entire tournament 10,000 times:

import random  # needed by your simulate_one_run() to sample match outcomes

def simulate_tournament(model, fixtures, n_simulations=10000):
    champion_counts = {}
    for _ in range(n_simulations):
        # simulate_one_run() is a placeholder you need to write. It should play out
        # group stage → standings → R32 → ... → Final once, sampling each match
        # from the model's probabilities, and return the champion's name.
        winner = simulate_one_run(model, fixtures)
        champion_counts[winner] = champion_counts.get(winner, 0) + 1

    # Share of simulations each team won
    return {team: count / n_simulations for team, count in champion_counts.items()}

odds = simulate_tournament(model, fixtures_2026)
for team, prob in sorted(odds.items(), key=lambda x: -x[1])[:10]:
    print(f"{team}: {prob:.1%}")
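
The knockout half of simulate_one_run is the easier part to write, so here is one possible sketch of it. Every name in it is hypothetical. It assumes you can supply a win_prob(a, b) function giving the chance that a advances. One crude way to get that from the three-way model is p_home / (p_home + p_away), which splits the draw probability proportionally to stand in for extra time and penalties. The toy strengths are invented.

```python
import random


def play_knockout_match(team_a, team_b, win_prob, rng):
    """Sample which team advances; knockout ties always produce a winner."""
    return team_a if rng.random() < win_prob(team_a, team_b) else team_b


def simulate_knockout(bracket, win_prob, rng):
    """Play a single-elimination bracket; len(bracket) must be a power of two.

    Adjacent entries meet in the first round, and winners keep their order.
    """
    teams = list(bracket)
    while len(teams) > 1:
        teams = [
            play_knockout_match(teams[i], teams[i + 1], win_prob, rng)
            for i in range(0, len(teams), 2)
        ]
    return teams[0]


def champion_odds(bracket, win_prob, n_simulations=10000, seed=42):
    """Share of simulated brackets won by each team."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    counts = {}
    for _ in range(n_simulations):
        winner = simulate_knockout(bracket, win_prob, rng)
        counts[winner] = counts.get(winner, 0) + 1
    return {team: n / n_simulations for team, n in counts.items()}


# Invented strengths: a team's win chance is its share of the combined strength.
STRENGTH = {'A': 4.0, 'B': 2.0, 'C': 1.0, 'D': 1.0}


def toy_win_prob(a, b):
    return STRENGTH[a] / (STRENGTH[a] + STRENGTH[b])


toy_odds = champion_odds(['A', 'B', 'C', 'D'], toy_win_prob)
for team, prob in sorted(toy_odds.items(), key=lambda kv: -kv[1]):
    print(f"{team}: {prob:.1%}")
```

In this toy bracket the exact answer for team A is 2/3 × 4/5 ≈ 53.3%, so the simulated figure should land close to that. For the real tournament, compute the pairwise probabilities once up front and look them up inside win_prob. Calling the model inside the loop would mean millions of predict_proba calls.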

Step 7: Validate against bookmaker odds

The acid test for any prediction model is whether it beats the closing line at Pinnacle:

def compare_to_pinnacle(predictions, fixture_id):
    # `predictions` is one match's dict from Step 5 (keys p_home, p_draw, p_away)
    r = requests.get(
        f'https://api.thestatsapi.com/api/football/matches/{fixture_id}/odds',
        headers=HEADERS,
        timeout=30,
    )
    r.raise_for_status()
    pinnacle = next((b for b in r.json()['data']['bookmakers'] if b['name'] == 'pinnacle'), None)
    if pinnacle is None:
        return None

    # Assumes the first market is the 1X2 market, with outcomes ordered home, draw, away
    outcomes = pinnacle['markets'][0]['outcomes']
    p_market = {
        'H': 1 / outcomes[0]['price'],
        'D': 1 / outcomes[1]['price'],
        'A': 1 / outcomes[2]['price'],
    }
    # Remove vig: raw implied probabilities sum to more than 1, so rescale them to sum to 1
    overround = sum(p_market.values())
    p_market = {k: v / overround for k, v in p_market.items()}

    return {
        'model': predictions,
        'market': p_market,
        'edge_home': predictions['p_home'] - p_market['H'],
    }

If your model consistently disagrees with Pinnacle by more than a few percentage points and is right more often than the market, you've built something genuinely sharp. Most models are not, and that's fine for a fun side project.
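
To make the vig-removal step concrete, here is the same arithmetic on a set of made-up decimal odds, with no API involved. The function name is invented for this example, and proportional rescaling is only one of several ways to strip the margin.

```python
def remove_vig(decimal_odds):
    """Convert decimal odds to probabilities that sum to 1 (proportional rescaling).

    Returns the fair probabilities and the overround (sum of raw implied probabilities).
    """
    implied = {outcome: 1.0 / price for outcome, price in decimal_odds.items()}
    overround = sum(implied.values())
    fair = {outcome: p / overround for outcome, p in implied.items()}
    return fair, overround


# Made-up prices for home win, draw, away win
fair, overround = remove_vig({'H': 2.10, 'D': 3.40, 'A': 3.80})
print(f"overround: {overround:.4f}")
for outcome, p in fair.items():
    print(f"{outcome}: {p:.3f}")

# If the model says 50% for the home side, the edge is the gap to the fair market price
model_p_home = 0.50
print(f"edge_home: {model_p_home - fair['H']:+.3f}")
```

Here the raw implied probabilities add up to about 1.033, so the bookmaker's margin is roughly 3.3%. After rescaling, the home side sits at about 46%, which puts the hypothetical model roughly four points above the market.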

What our free predictor tool does

Our free Poisson Score Predictor tool implements a much simpler version of this: input two teams' expected goals and get a score distribution out. It runs purely client-side, takes 10 seconds to use, and is great for one-off questions.
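
If you'd rather see the idea in code, the sketch below turns two expected-goals values into a score distribution by treating each side's goals as an independent Poisson variable. That is the textbook simplification and not necessarily how the tool itself is implemented. The function names and the input values are chosen for illustration.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k goals when the expected number is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def score_distribution(home_xg, away_xg, max_goals=10):
    """P(home scores i, away scores j), assuming independent Poisson goal counts.

    Scores above max_goals are cut off; at typical xG values that tail is negligible.
    """
    return {
        (i, j): poisson_pmf(i, home_xg) * poisson_pmf(j, away_xg)
        for i in range(max_goals + 1)
        for j in range(max_goals + 1)
    }


def outcome_probabilities(dist):
    """Collapse a score distribution into (home win, draw, away win)."""
    p_home = sum(p for (i, j), p in dist.items() if i > j)
    p_draw = sum(p for (i, j), p in dist.items() if i == j)
    p_away = sum(p for (i, j), p in dist.items() if i < j)
    return p_home, p_draw, p_away


dist = score_distribution(1.6, 1.1)
most_likely = max(dist, key=dist.get)
p_home, p_draw, p_away = outcome_probabilities(dist)
print(f"most likely score: {most_likely[0]}-{most_likely[1]}")
print(f"home {p_home:.1%}, draw {p_draw:.1%}, away {p_away:.1%}")
```

The same function covers the per-match goals target from the start of the tutorial: fit a model for each side's expected goals and the three-way probabilities fall out of the score grid.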

Honest expectations

  • Random baseline: ~33% accuracy
  • "Always pick the higher-ranked team": ~45% accuracy
  • Decent ML model with FIFA rank + xG features: ~50-55% accuracy
  • Pinnacle closing line implied: ~55-58% accuracy
  • Beating Pinnacle: very hard

Tournament predictions are noisier than per-match predictions. Even a good model will get the winner wrong 60-70% of the time — the underdog factor is real.

Frequently Asked Questions

What's the minimum dataset I need?

Roughly 200 or more historical matches are enough to fit a simple gradient-boosted classifier without overfitting. The World Cup has ~550 matches since 1990, plus thousands of qualifying and friendly matches that can be used to learn team-strength priors.

Should I use neural networks?

For football prediction with a dataset this small, gradient-boosted trees (XGBoost, LightGBM) typically beat neural networks. Football outcomes are noisy, and small datasets favour simpler models with strong regularisation.

How do I handle teams that haven't played each other?

Use FIFA rank, Elo rating, or xG-based team strength as bridging features. The model doesn't need a direct head-to-head history — relative strength is enough.

Can I get betting markets back-tested?

Yes. Historical Pinnacle closing lines are available via the API for matches going back several years. Compare your model's predictions against the closing line on each match and see whether you've achieved closing line value (CLV), meaning the prices your model would have taken were better than the prices at close.
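
One common convention for measuring this is the ratio of the decimal odds you took to the closing odds, minus one. The sketch below uses that convention with made-up prices. The function name is invented, and a stricter version would strip the vig from the closing price first.

```python
def closing_line_value(taken_odds, closing_odds):
    """CLV as a fraction: positive means the price taken beat the closing price."""
    return taken_odds / closing_odds - 1.0


# Made-up (taken, closing) decimal odds for three hypothetical bets
bets = [(2.20, 2.05), (3.50, 3.60), (1.90, 1.80)]
clvs = [closing_line_value(taken, closing) for taken, closing in bets]
average_clv = sum(clvs) / len(clvs)

for (taken, closing), clv in zip(bets, clvs):
    print(f"taken {taken:.2f}, closed {closing:.2f}: CLV {clv:+.1%}")
print(f"average CLV: {average_clv:+.1%}")
```

A single bet's CLV says little. What matters is whether the average stays positive over a few hundred matches.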

Is this a get-rich-quick scheme?

No. Even good models that occasionally beat the line lose money to the vig over time. Treat this as a fun analytics project, not a guaranteed income.

What features matter most?

In our testing: relative team strength (Elo or FIFA rank), recent xG form (last 10 matches), squad market value, days of rest, and travel distance. Stage of tournament also matters — knockout games have different draw probabilities than group games.

How do I evaluate my model fairly?

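Use temporal cross-validation: train on World Cups 1990-2014, validate on 2018, test on 2022. Don't rely on random splits like the quick one in Step 3 for your final numbers: they put matches from the same tournament in both the training and test sets, which leaks information across years.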
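
A minimal sketch of that split, written against plain dicts so it runs without pandas. The function name and the toy rows are invented for illustration.

```python
def temporal_split(rows, train_until=2014, validate_on=2018, test_on=2022):
    """Split match rows by tournament year so no season appears in two sets."""
    train = [r for r in rows if r['season'] <= train_until]
    validate = [r for r in rows if r['season'] == validate_on]
    test = [r for r in rows if r['season'] == test_on]
    return train, validate, test


# Toy data: three placeholder matches for each World Cup from 1990 to 2022
rows = [
    {'season': year, 'match': n}
    for year in range(1990, 2023, 4)
    for n in range(3)
]
train, validate, test = temporal_split(rows)
print(len(train), len(validate), len(test))
print(sorted({r['season'] for r in train}))
```

With the DataFrame from Step 1 the same idea is three boolean masks, for example df[df['season'] <= 2014] for the training set. Tune hyperparameters against 2018 only, and look at 2022 once at the very end.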
