NBA Player Risk Analytics · Updated with 2025–26 Data

Every game has a cost.

InjuryIndex applies advanced statistical modelling to NBA player workload, travel, and fatigue data — quantifying injury risk before each game.

What we measure

Built on real statistical logic

The same probabilistic techniques used in statistical risk modelling — applied to the NBA. We don't guess. We calculate.

Workload Analysis

Rolling minutes load over the last 3, 7, and 15 games. Cumulative stress builds — our model captures it.

Fatigue & Rest

Back-to-backs, days of rest, games in last 7 days. Fatigue is invisible to the eye but visible in the data.

Travel Load

Distance travelled between games using the Haversine formula. Transcontinental schedules carry measurable risk.

Player Profile

Age, position, and injury history. Risk compounds differently across player archetypes.

Logistic Regression

A probabilistic model trained on thousands of historical player-game observations. The output is a probability between 0 and 1, not an arbitrary risk score.

Survival Analysis

Hazard rate modelling estimates the time until an injury occurs.

Methodology

How the model works.

A transparent explanation of the statistical techniques, data sources, and modelling decisions behind InjuryIndex.

01 — The Problem

What are we predicting?

InjuryIndex predicts the probability that a given player will suffer an injury that causes them to miss playing time in their next game. This is a binary outcome modelled as a Bernoulli random variable — either an injury occurs (1) or it doesn't (0).

Formally, for player i in game g, we estimate:

P(Yᵢᵍ = 1 | X₁, X₂, ... Xₙ)

Where Yᵢᵍ = 1 if player i is injured before game g+1, and X₁...Xₙ are our risk features.

We understand the erratic and often random nature of injuries in sports. Our model draws upon past data and analytics to provide educated estimates — not certainties. Injuries remain partially unpredictable by nature, and these outputs should be interpreted as statistical tendencies, not medical predictions.

02 — Features

What factors drive risk?

Our model is built on four categories of risk features (workload, rest, travel, and player profile), computed from historical game log and schedule data.

Minutes load (L3, L7, L15) — Rolling average of minutes played over the last 3, 7, and 15 games. High workload is the strongest single predictor.
Back-to-back indicator — Binary flag (0/1) for whether the previous game was played the night before. Increases risk significantly.
Days of rest — Days elapsed since last game. More rest generally reduces risk, with diminishing returns.
Games in last 7 days — Captures schedule density independent of individual game length.
Travel distance — Kilometres travelled between cities using the Haversine formula on GPS coordinates. Transcontinental travel adds measurable fatigue.
Cumulative travel (L3 games) — Rolling travel load over last 3 games to capture sustained road trip fatigue.
Player age — Older players show non-linear increases in injury probability, particularly for soft tissue injuries.
Injury history — Prior injury flag significantly increases future risk — one of the strongest predictors in the model.
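
The workload and rest features above can be sketched in a few lines. This is a minimal illustration, not the InjuryIndex pipeline: the function name, the argument layout, and the choice to average over however many games are available early in a season are all assumptions.

```python
from datetime import date

def schedule_features(prior_dates, prior_minutes, game_date, windows=(3, 7, 15)):
    """Pre-game workload and rest features for one player (hypothetical helper).

    prior_dates / prior_minutes: the player's earlier games, oldest first.
    game_date: date of the upcoming game being scored.
    """
    features = {}
    for w in windows:
        recent = prior_minutes[-w:]  # may hold fewer than w games early in a season
        features[f"min_l{w}"] = sum(recent) / len(recent) if recent else 0.0

    # "Days of rest" follows the definition in the text: days elapsed since the last game.
    days_since_last = (game_date - prior_dates[-1]).days if prior_dates else None
    features["days_rest"] = days_since_last
    features["back_to_back"] = int(days_since_last == 1)
    # Schedule density: prior games played within the 7 days before this one.
    features["games_last_7"] = sum(
        1 for d in prior_dates if 0 < (game_date - d).days <= 7
    )
    return features

example = schedule_features(
    prior_dates=[date(2025, 1, 1), date(2025, 1, 3), date(2025, 1, 4), date(2025, 1, 8)],
    prior_minutes=[30, 36, 34, 38],
    game_date=date(2025, 1, 9),
)
print(example)
```

Only games played before the scored game feed the features, so nothing from the game itself leaks into its own risk estimate.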

03 — The Model

Logistic regression

The core model is a logistic regression — a probabilistic classifier that maps our feature vector to a probability between 0 and 1. Logistic regression is preferred for its interpretability — transparency matters when communicating risk estimates.

P(injury) = 1 / (1 + e^−(β₀ + β₁x₁ + β₂x₂ + ... + βₙxₙ))

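Each β coefficient is the change in the log-odds of injury per one-unit increase in that feature, holding the others fixed. A positive β increases risk; a negative β decreases it. Equivalently, e^β is the odds ratio for that feature.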
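
To make the formula concrete, here is a small sketch that evaluates it directly. The coefficient values and the intercept are invented for illustration and are not the fitted InjuryIndex parameters; the feature names are likewise assumptions.

```python
import math

def injury_probability(features, coefficients, intercept):
    """Logistic model: P = 1 / (1 + exp(-(b0 + sum(b_k * x_k))))."""
    z = intercept + sum(coefficients[name] * features[name] for name in coefficients)
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients, chosen only to show the mechanics.
coefficients = {"min_l7": 0.03, "back_to_back": 0.40, "days_rest": -0.10}
intercept = -4.5

rested = {"min_l7": 34.0, "back_to_back": 0, "days_rest": 2}
# Same inputs with only the back-to-back flag switched on.
flagged = {**rested, "back_to_back": 1}

p_rested = injury_probability(rested, coefficients, intercept)
p_flagged = injury_probability(flagged, coefficients, intercept)

# The odds ratio between the two cases equals exp(beta) for the changed feature.
odds_ratio = (p_flagged / (1 - p_flagged)) / (p_rested / (1 - p_rested))
print(p_rested, p_flagged, odds_ratio)
```

The last line shows why the coefficients are easy to read: switching one feature by one unit multiplies the odds by e^β regardless of the other inputs.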

The model is trained on historical NBA game logs and injury records, with the dataset constructed at the player-game level — one row per player per game, with a binary injury outcome for the next game.
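
As a rough sketch of that player-game layout, the snippet below attaches a next-game injury label to each row. The row format, the per-player game counter, and the `injured_games` set are hypothetical stand-ins for whatever join the real game logs and injury records require.

```python
def label_next_game(rows, injured_games):
    """Add a binary 'injured_next' label to each player-game row.

    rows: dicts with 'player' and 'game' (the player's own game counter).
    injured_games: set of (player, game) pairs the player missed through injury.
    """
    labelled = []
    for row in rows:
        next_game = (row["player"], row["game"] + 1)
        labelled.append({**row, "injured_next": int(next_game in injured_games)})
    return labelled

rows = [
    {"player": "A", "game": 1, "minutes": 35},
    {"player": "A", "game": 2, "minutes": 38},
    {"player": "B", "game": 1, "minutes": 22},
]
injured_games = {("A", 3)}  # player A missed their third game

labelled = label_next_game(rows, injured_games)
print([r["injured_next"] for r in labelled])
```

Each row's features describe game g while its label describes game g+1, which matches the definition of Y given earlier.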

04 — Survival Analysis

Hazard rate modelling

Beyond per-game probability, we also model the time-to-injury using survival analysis — the same statistical framework used in life insurance to model time-to-death or time-to-claim.

The hazard function h(t) represents the instantaneous rate of injury at game t, given survival (no injury) up to that point:

h(t) = h₀(t) · e^(β₁x₁ + β₂x₂ + ... + βₙxₙ)

This is the Cox proportional hazards model: h₀(t) is the baseline hazard, and the covariates scale it multiplicatively.

This allows us to estimate multi-game risk projections: the probability a player suffers at least one injury over the next N games.
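
The following sketch shows how a proportional hazards model turns into an N-game projection. It assumes a constant baseline hazard per game, which is a simplification introduced here for illustration; the coefficient values are also invented and do not come from the fitted model.

```python
import math

def relative_hazard(x, beta):
    """exp(beta . x): the factor by which covariates scale the baseline hazard."""
    return math.exp(sum(beta[k] * x[k] for k in beta))

def prob_injury_within(n_games, baseline_hazard, x, beta):
    """P(at least one injury in n_games) = 1 - S(n), with S(n) = exp(-H(n)).

    Assumes a constant baseline hazard per game, so H(n) = h0 * n * exp(beta . x).
    """
    cumulative_hazard = baseline_hazard * n_games * relative_hazard(x, beta)
    return 1.0 - math.exp(-cumulative_hazard)

# Hypothetical coefficients and baseline, for illustration only.
beta = {"back_to_back": 0.40, "age_over_30": 0.25}
baseline_hazard = 0.02

rested = {"back_to_back": 0, "age_over_30": 1}
tired = {"back_to_back": 1, "age_over_30": 1}

# The baseline cancels in the ratio, leaving exp(beta) for the changed covariate.
hazard_ratio = relative_hazard(tired, beta) / relative_hazard(rested, beta)
p5 = prob_injury_within(5, baseline_hazard, tired, beta)
p10 = prob_injury_within(10, baseline_hazard, tired, beta)
print(hazard_ratio, p5, p10)
```

Because the baseline hazard cancels when two players are compared, the model can rank relative risk even when the absolute baseline is uncertain.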

05 — Multi-game projection

Cumulative risk

Per-game probabilities are compounded to produce short-term risk windows:

P(at least 1 injury in N games) = 1 − (1 − p)ᴺ

Where p is the per-game risk probability, assumed constant and independent across the N games. For example, a 3% per-game risk over 10 games gives 1 − 0.97¹⁰ ≈ 26% cumulative risk.
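
The compounding step is short enough to check by hand. The sketch below reproduces the 3% over 10 games example and also accepts a different probability for each game, which is a small generalisation added here and assumes the games are independent.

```python
def cumulative_risk(per_game_probs):
    """P(at least one injury) = 1 - product of (1 - p_k) over the window."""
    no_injury = 1.0
    for p in per_game_probs:
        no_injury *= 1.0 - p
    return 1.0 - no_injury

constant = cumulative_risk([0.03] * 10)
print(round(constant, 4))  # 0.2626

# Varying nightly risk, e.g. a back-to-back in the middle of the window.
varying = cumulative_risk([0.02, 0.05, 0.02])
print(varying)
```

With a constant p the function reduces to the closed form 1 − (1 − p)ᴺ shown above.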

Interactive Risk Calculator

Minutes load (L7 avg): 34 min
Days of rest: 2 days
Travel distance: 800 km
Player age: 26
Back-to-back: No

Estimated injury probability for next game: 2.1%
Low risk based on current inputs.

06 — Data Sources

Where the data comes from

All data is sourced from publicly available historical records:

Basketball Reference — Player game logs, minutes, schedule, team locations going back to the 1980s.
Kaggle NBA Injury Dataset — Compiled injury records (1951–2023) with player, date, team, and injury type.
GPS city coordinates — Used to compute Haversine travel distances between arenas.
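
For reference, a Haversine distance can be computed as in the sketch below. The mean Earth radius of 6,371 km and the approximate city coordinates are assumptions made for the example and are not taken from the InjuryIndex data.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed value)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Approximate city-centre coordinates, for illustration only.
boston = (42.36, -71.06)
los_angeles = (34.05, -118.24)

trip_km = haversine_km(*boston, *los_angeles)
print(round(trip_km))
```

The result is a straight great-circle distance, so it understates the actual flight path slightly but is consistent from game to game.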

The training dataset contains more than 4,200 player-game observations with binary injury outcomes, spanning multiple seasons.

07 — Limitations

What the model can't know

Statistical models are inherently limited. InjuryIndex does not account for:

Pre-existing undisclosed medical conditions
Contact injuries from collisions — fundamentally random events
Player-specific biomechanical risk factors
In-game decisions to limit minutes or sit players
Practice and training load between games

InjuryIndex provides statistical risk estimates for analytical and research purposes only. These outputs are not medical advice and should not be used to make medical decisions. Always consult qualified medical professionals for health-related decisions.
