Download Historical Football Data with Node.js and Export CSV
Use Node.js fetch to download historical football matches, paginate seasons, add match stats, and export clean CSV files from TheStatsAPI.
Node.js is a good fit for historical football data backfills: it has native fetch, easy scheduling, and simple CSV writing, and the same code can later run in a worker or serverless job.
This guide downloads Premier League historical match data from TheStatsAPI and writes two CSV files:
matches.csv for fixtures, teams, scores, status, and coverage flags.
match_stats.csv for shots, possession, xG, non-penalty xG, and corners.
Create a tiny API client
Use Node.js 18+ so fetch is built in.
export THESTATSAPI_KEY="your_api_key"
Create client.mjs:
// client.mjs: minimal GET helper for TheStatsAPI.
const BASE_URL = "https://api.thestatsapi.com/api";
const API_KEY = process.env.THESTATSAPI_KEY;

// Fail early instead of sending "Bearer undefined".
if (!API_KEY) {
  throw new Error("Set the THESTATSAPI_KEY environment variable first.");
}

export async function get(endpoint, params = {}) {
  const url = new URL(`${BASE_URL}${endpoint}`);
  // Skip empty values so optional filters do not end up in the query string.
  for (const [key, value] of Object.entries(params)) {
    if (value !== undefined && value !== null && value !== "") {
      url.searchParams.set(key, value);
    }
  }
  const response = await fetch(url, {
    headers: {
      Authorization: `Bearer ${API_KEY}`,
      Accept: "application/json",
    },
  });
  if (response.status === 429) {
    throw new Error("Rate limited. Retry with a delay or queue.");
  }
  if (!response.ok) {
    throw new Error(`HTTP ${response.status}: ${await response.text()}`);
  }
  return response.json();
}
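The client throws on HTTP 429 and leaves the retry to the caller. One way to handle that is a small wrapper with exponential backoff. This is a minimal sketch, not part of TheStatsAPI: withRetry, retries, and baseDelayMs are hypothetical names, and the wrapper recognizes the rate-limit case by the "Rate limited" message that get() throws.

```javascript
// Hypothetical helper: retry an async function when it fails with the
// "Rate limited" error thrown by get(). Defaults are illustrative only.
async function withRetry(fn, { retries = 3, baseDelayMs = 2000 } = {}) {
  for (let attempt = 0; ; attempt += 1) {
    try {
      return await fn();
    } catch (error) {
      // Retry only rate-limit errors; rethrow everything else immediately.
      const rateLimited = String(error?.message).startsWith("Rate limited");
      if (!rateLimited || attempt >= retries) throw error;
      // Exponential backoff: baseDelayMs, then 2x, then 4x, and so on.
      const delayMs = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
    }
  }
}
```

A call would then look like withRetry(() => get("/football/matches", { page: 1 })). Matching on the error message is fragile; a custom error class with a status property would be a sturdier design if you control the client.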
Resolve competition and season IDs
Do this before downloading matches. The API identifies seasons by season_id, not by plain year, so look the IDs up first.
import { get } from "./client.mjs";

const competitions = await get("/football/competitions", {
  search: "Premier League",
  per_page: 10,
});

// Several countries have a "Premier League", so match on country as well.
const premierLeague = competitions.data.find((competition) => (
  competition.name === "Premier League" && competition.country === "England"
));
if (!premierLeague) {
  throw new Error("Premier League (England) not found in search results.");
}
console.log(premierLeague.id); // comp_3039

const seasons = await get(`/football/competitions/${premierLeague.id}/seasons`, {
  per_page: 20,
});
console.table(seasons.data.map((season) => ({
  id: season.id,
  name: season.name,
  year: season.year,
})));
Use the IDs returned by the API. For example, Premier League 2024-25 is currently sn_3057848.
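To avoid hardcoding IDs, the lookup can be wrapped in a small function that fails loudly when a season is missing. The sketch below is illustrative: findSeasonId is a hypothetical helper, and the "2024-25" name format in the sample data is an assumption, so check the real name values printed by console.table before relying on it.

```javascript
// Hypothetical helper: look up a season_id by the season's display name.
function findSeasonId(seasons, name) {
  const season = seasons.find((candidate) => candidate.name === name);
  if (!season) {
    const known = seasons.map((candidate) => candidate.name).join(", ");
    throw new Error(`No season named "${name}". Available: ${known}`);
  }
  return season.id;
}

// Sample data for illustration only; the name format is an assumption.
const sampleSeasons = [{ id: "sn_3057848", name: "2024-25" }];
console.log(findSeasonId(sampleSeasons, "2024-25")); // sn_3057848
```

In the real script you would pass seasons.data from the API response instead of sampleSeasons.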
Paginate matches
import { get } from "./client.mjs";

// Fetch every page of a list endpoint and return the combined rows.
async function getAll(endpoint, params = {}) {
  const rows = [];
  let page = 1;
  while (true) {
    const payload = await get(endpoint, { ...params, page, per_page: 100 });
    rows.push(...payload.data);
    // Stop on the last page; if meta is missing, stop rather than loop forever.
    const totalPages = payload.meta?.total_pages ?? page;
    if (page >= totalPages) break;
    page += 1;
    // Pause between pages to stay inside the rate limit.
    await new Promise((resolve) => setTimeout(resolve, 2000));
  }
  return rows;
}

const matches = await getAll("/football/matches", {
  competition_id: "comp_3039",
  season_id: "sn_3057848",
  status: "finished",
});
console.log(`Fetched ${matches.length} matches`);
Write a CSV helper
import { writeFile } from "node:fs/promises";

// Convert an array of flat objects to CSV text. Columns come from the first row.
function toCsv(rows) {
  if (rows.length === 0) return "";
  const headers = Object.keys(rows[0]);
  const escape = (value) => {
    if (value === null || value === undefined) return "";
    const text = String(value);
    // Quote fields containing quotes, commas, or line breaks; double any quotes.
    return /[",\r\n]/.test(text) ? `"${text.replaceAll('"', '""')}"` : text;
  };
  return [
    headers.join(","),
    ...rows.map((row) => headers.map((header) => escape(row[header])).join(",")),
  ].join("\n");
}
Export match results
// One flat row per match; optional chaining tolerates missing team or score objects.
const matchRows = matches.map((match) => ({
  match_id: match.id,
  utc_date: match.utc_date,
  competition_id: match.competition_id,
  season_id: match.season_id,
  home_team_id: match.home_team?.id,
  home_team: match.home_team?.name,
  away_team_id: match.away_team?.id,
  away_team: match.away_team?.name,
  home_score: match.score?.home,
  away_score: match.score?.away,
  status: match.status,
  xg_available: match.xg_available,
  odds_available: match.odds_available,
}));

await writeFile("matches.csv", `${toCsv(matchRows)}\n`);
Add match stats
The match stats endpoint returns grouped objects. The most common match-model fields are under data.overview.
async function getMatchStatsRow(match) {
  const { data } = await get(`/football/matches/${match.id}/stats`);
  // Fall back to an empty object so a missing overview yields blank cells, not a crash.
  const overview = data?.overview ?? {};
  return {
    match_id: match.id,
    home_team: match.home_team?.name,
    away_team: match.away_team?.name,
    home_xg: overview.expected_goals?.all?.home,
    away_xg: overview.expected_goals?.all?.away,
    home_np_xg: data?.np_expected_goals?.all?.home,
    away_np_xg: data?.np_expected_goals?.all?.away,
    home_shots: overview.total_shots?.all?.home,
    away_shots: overview.total_shots?.all?.away,
    home_shots_on_target: overview.shots_on_target?.all?.home,
    away_shots_on_target: overview.shots_on_target?.all?.away,
    home_possession: overview.ball_possession?.all?.home,
    away_possession: overview.ball_possession?.all?.away,
    home_corners: overview.corner_kicks?.all?.home,
    away_corners: overview.corner_kicks?.all?.away,
  };
}

const statRows = [];
for (const match of matches) {
  // Skip matches without xG coverage to avoid spending requests on empty stats.
  if (!match.xg_available) continue;
  statRows.push(await getMatchStatsRow(match));
  await new Promise((resolve) => setTimeout(resolve, 2000));
}

await writeFile("match_stats.csv", `${toCsv(statRows)}\n`);
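The stats loop above is sequential, which suits low rate limits. On a plan with more headroom, a bounded worker pool is one option for running several requests at once. The sketch below is a generic pattern, not a TheStatsAPI feature: mapWithConcurrency and its limit parameter are hypothetical names, and you still need to pick a limit that respects your plan's quota.

```javascript
// Hypothetical helper: run `worker` over `items` with at most `limit`
// calls in flight, returning results in the original order.
async function mapWithConcurrency(items, limit, worker) {
  if (!Number.isInteger(limit) || limit < 1) {
    throw new RangeError("limit must be a positive integer");
  }
  const results = new Array(items.length);
  let next = 0;

  // Each runner claims the next unprocessed index until none remain.
  async function run() {
    while (next < items.length) {
      const index = next;
      next += 1;
      results[index] = await worker(items[index], index);
    }
  }

  const runnerCount = Math.min(limit, items.length);
  await Promise.all(Array.from({ length: runnerCount }, () => run()));
  return results;
}
```

With this in place, the loop could become mapWithConcurrency(eligibleMatches, 3, getMatchStatsRow), where eligibleMatches is the list filtered by xg_available. If any worker call rejects, the whole call rejects, so combine it with retry logic for long jobs.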
Add shot-level xG later
If you need shot maps, call /football/matches/{match_id}/shotmap. That response returns one row per shot with player_id, player_name, team_id, team_name, coordinates, minute, result, body part, situation, and expected_goals.
Keep shot data in a third file:
shots.csv
match_id,shot_id,player_id,team_id,minute,result,expected_goals,x,y
Do not flatten shots into match rows. One match has many shots, so a separate CSV or database table is cleaner.
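A mapping from shotmap entries to shots.csv rows could look like the sketch below. Treat the field names as assumptions: the list above confirms player_id, team_id, minute, result, and expected_goals, but id, x, and y are guesses at how the shot identifier and coordinates are named, so inspect a real response before using this. toShotRows is a hypothetical helper.

```javascript
// Hypothetical helper: turn one match's shot list into flat CSV rows.
// Assumed (unverified) field names: shot.id, shot.x, shot.y.
function toShotRows(matchId, shots) {
  return shots.map((shot) => ({
    match_id: matchId,
    shot_id: shot.id,
    player_id: shot.player_id,
    team_id: shot.team_id,
    minute: shot.minute,
    result: shot.result,
    expected_goals: shot.expected_goals,
    x: shot.x,
    y: shot.y,
  }));
}
```

Because every row carries match_id, the file joins back to matches.csv without duplicating any match-level columns.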
FAQ
Why not use season=2024?
The API uses season_id, such as sn_3057848. Resolve it through /football/competitions/{competition_id}/seasons.
How long should I wait between requests?
On the Starter plan, a two-second delay between requests keeps bulk jobs within 30 requests per minute. For larger plans, use a queue with configurable concurrency.
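The two-second figure is just 60 seconds divided by 30 requests. A minimal sketch of that arithmetic, with hypothetical helper names, also gives a rough lower bound on how long a backfill takes:

```javascript
// Hypothetical helper: minimum spacing between requests for a per-minute limit.
function delayMsForLimit(requestsPerMinute) {
  if (!(requestsPerMinute > 0)) {
    throw new RangeError("requestsPerMinute must be positive");
  }
  return Math.ceil(60_000 / requestsPerMinute);
}

// Rough lower bound on job duration; ignores network latency and retries.
function estimateMinutes(requestCount, requestsPerMinute) {
  return (requestCount * delayMsForLimit(requestsPerMinute)) / 60_000;
}

console.log(delayMsForLimit(30)); // 2000
```

For example, fetching stats for a full 380-match Premier League season at 30 requests per minute takes at least about 12.7 minutes of waiting, before request time is counted.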
Can this run as a cron job?
Yes. Store completed season/page state, run the job nightly, and only request status=finished rows for seasons you are backfilling.
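The season/page state can be a small JSON object that the job loads at start and saves after each page. The sketch below covers only the bookkeeping, under assumed names: nextPage, recordPage, and the state shape are hypothetical. Persisting it is a matter of JSON.stringify and JSON.parse with writeFile and readFile from node:fs/promises, using whatever file name you choose.

```javascript
// Assumed state shape: { [seasonId]: { lastPage: number, done: boolean } }

// Which page to request next for a season, or null if the season is complete.
function nextPage(state, seasonId) {
  const entry = state[seasonId];
  if (!entry) return 1;
  return entry.done ? null : entry.lastPage + 1;
}

// Return a new state with the given page recorded as fetched.
function recordPage(state, seasonId, page, totalPages) {
  return {
    ...state,
    [seasonId]: { lastPage: page, done: page >= totalPages },
  };
}
```

A nightly run would call nextPage for each season, skip the ones that return null, and call recordPage after every successful page so an interrupted job resumes where it stopped.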