InjuryIndex · NBA Player Risk Analytics
NBA Player Risk Analytics · Updated with 2025–26 Data

Every game has a cost.

InjuryIndex applies advanced statistical modelling to NBA player workload, travel, and fatigue data — quantifying injury risk before each game.


Built on real statistical logic

The same probabilistic techniques used in statistical risk modelling — applied to the NBA. We don't guess. We calculate.

01
Workload Analysis
Rolling minutes load over the last 3, 7, and 15 games. Cumulative stress builds — our model captures it.
02
Fatigue & Rest
Back-to-backs, days of rest, games in last 7 days. Fatigue is invisible to the eye but visible in the data.
03
Travel Load
Distance travelled between games using the Haversine formula. Transcontinental schedules carry measurable risk.
04
Player Profile
Age, position, and injury history. Risk compounds differently across player archetypes.
05
Logistic Regression
A probabilistic model trained on thousands of historical player-game observations. Output is a probability between 0 and 1, not a score on an arbitrary scale.
06
Survival Analysis
Hazard rate modelling estimates how long until injury occurs.
InjuryIndex · Statistical risk modelling for the NBA · Not medical advice · For analytical purposes only
2025–26 NBA Season
Risk Dashboard
Model trained on 4,200+ game observations
Risk scores based on data as of April 2026 · re-run model to update
Model trained on 18,000+ injury labels · 258,735 player-games
We understand the erratic and often random nature of injuries in sports, yet our model draws upon past data and analytics to provide educated estimates. These probabilities are statistical outputs — not medical predictions — and should be interpreted accordingly.
High Risk
Elevated Risk
Low Risk
Players Tracked
Methodology

How the model works.

A transparent explanation of the statistical techniques, data sources, and modelling decisions behind InjuryIndex.

01 — The Problem
What are we predicting?

InjuryIndex predicts the probability that a given player will suffer an injury that causes them to miss playing time in their next game. This is a binary outcome modelled as a Bernoulli random variable — either an injury occurs (1) or it doesn't (0).

Formally, for player i in game g, we estimate:

P(Yᵢᵍ = 1 | X₁, X₂, ... Xₙ)
Where Yᵢᵍ = 1 if player i is injured before game g+1, and X₁...Xₙ are our risk features.

We understand the erratic and often random nature of injuries in sports. Our model draws upon past data and analytics to provide educated estimates — not certainties. Injuries remain partially unpredictable by nature, and these outputs should be interpreted as statistical tendencies, not medical predictions.

02 — Features
What factors drive risk?

Our model is built on four categories of risk features (workload, fatigue and rest, travel, and player profile), computed from historical game log and schedule data.

  • Minutes load (L3, L7, L15) — Rolling average of minutes played over the last 3, 7, and 15 games. High workload is the strongest single predictor.
  • Back-to-back indicator — Binary flag (0/1) for whether the previous game was played the night before. Increases risk significantly.
  • Days of rest — Days elapsed since last game. More rest generally reduces risk, with diminishing returns.
  • Games in last 7 days — Captures schedule density independent of individual game length.
  • Travel distance — Kilometres travelled between cities using the Haversine formula on GPS coordinates. Transcontinental travel adds measurable fatigue.
  • Cumulative travel (L3 games) — Rolling travel load over last 3 games to capture sustained road trip fatigue.
  • Player age — Older players show non-linear increases in injury probability, particularly for soft tissue injuries.
  • Injury history — Prior injury flag significantly increases future risk — one of the strongest predictors in the model.
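
The workload and rest features above can be sketched in a few lines of Python. This is a minimal illustration, not the production pipeline: the function names, the feature keys (`min_l3`, `games_last_7d`, and so on), the 7-day window that includes the game day itself, and the sample game log are all assumptions made for the example.

```python
from datetime import date, timedelta

def rolling_mean(values, window):
    """Mean of the last `window` values (fewer if the history is shorter)."""
    recent = values[-window:]
    return sum(recent) / len(recent)

def schedule_features(games):
    """games: list of (date, minutes) for one player, sorted by date.

    Returns one feature dict per game, built only from that game and
    earlier ones, so no information from the future leaks in.
    """
    rows = []
    for i, (day, _) in enumerate(games):
        minutes_so_far = [m for _, m in games[: i + 1]]
        # Days elapsed since the previous game; unknown for the first game.
        days_rest = (day - games[i - 1][0]).days if i > 0 else None
        # Assumption: the 7-day window ends on, and includes, the game day.
        week_start = day - timedelta(days=6)
        rows.append({
            "date": day,
            "min_l3": rolling_mean(minutes_so_far, 3),
            "min_l7": rolling_mean(minutes_so_far, 7),
            "min_l15": rolling_mean(minutes_so_far, 15),
            "days_rest": days_rest,
            "back_to_back": int(days_rest == 1),
            "games_last_7d": sum(1 for d, _ in games[: i + 1] if d >= week_start),
        })
    return rows

# Made-up game log for illustration.
games = [
    (date(2025, 11, 1), 36),
    (date(2025, 11, 2), 30),
    (date(2025, 11, 5), 33),
    (date(2025, 11, 9), 27),
]
rows = schedule_features(games)
```

In this made-up log the second game is flagged as a back-to-back because it falls one day after the first, and the rolling averages simply use whatever history is available until a full window exists.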
03 — The Model
Logistic regression

The core model is a logistic regression — a probabilistic classifier that maps our feature vector to a probability between 0 and 1. Logistic regression is preferred for its interpretability — transparency matters when communicating risk estimates.

P(injury) = 1 / (1 + e^−(β₀ + β₁x₁ + β₂x₂ + ... + βₙxₙ))
Each coefficient βⱼ is the change in the log-odds of injury per one-unit increase in feature xⱼ, holding the other features fixed. A positive β raises the predicted risk; a negative β lowers it.
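
To make the formula concrete, here is a small Python sketch that evaluates it for two hypothetical players. The coefficient values and the intercept are invented for illustration and are not the fitted values of the InjuryIndex model; the feature names are likewise assumptions.

```python
import math

def injury_probability(features, coefficients, intercept):
    """Logistic regression: sigmoid of the linear predictor b0 + sum(b_j * x_j)."""
    z = intercept + sum(coefficients[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients, for illustration only (not fitted values).
coefficients = {"min_l7": 0.03, "days_rest": -0.10, "back_to_back": 0.40}
intercept = -4.5

rested = {"min_l7": 34, "days_rest": 2, "back_to_back": 0}
tired = {"min_l7": 34, "days_rest": 1, "back_to_back": 1}

p_rested = injury_probability(rested, coefficients, intercept)
p_tired = injury_probability(tired, coefficients, intercept)

# exp(beta) is the odds ratio for a one-unit increase in that feature,
# holding the others fixed.
odds_ratio_b2b = math.exp(coefficients["back_to_back"])
```

Because each β acts on the log-odds, exponentiating a coefficient gives the multiplicative change in the odds of injury, which is what makes the model easy to read feature by feature.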

The model is trained on historical NBA game logs and injury records, with the dataset constructed at the player-game level — one row per player per game, with a binary injury outcome for the next game.
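
One way to build that player-game table is sketched below in Python. The labelling rule is an assumption made for this example: a game gets outcome 1 if an injury is recorded on or after its date and before the player's next appearance. The function name and the sample dates are hypothetical.

```python
from datetime import date

def label_player_games(game_dates, injury_dates):
    """game_dates: sorted dates of games one player appeared in.
    injury_dates: dates on which an injury was recorded for that player.

    Returns [(game_date, y)], where y = 1 if an injury is recorded on or
    after this game and before the player's next appearance (assumed rule).
    """
    rows = []
    for i, current in enumerate(game_dates):
        # For the last game in the log the window is left open-ended.
        following = game_dates[i + 1] if i + 1 < len(game_dates) else date.max
        injured = any(current <= d < following for d in injury_dates)
        rows.append((current, int(injured)))
    return rows

# Made-up example: an injury recorded the day after the second game.
appearances = [date(2025, 11, 1), date(2025, 11, 3), date(2025, 11, 10)]
injuries = [date(2025, 11, 4)]
labelled = label_player_games(appearances, injuries)
```

Here only the second game receives a positive label, since the recorded injury falls between that game and the player's next appearance.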

04 — Survival Analysis
Hazard rate modelling

Beyond per-game probability, we also model the time-to-injury using survival analysis — the same statistical framework used in life insurance to model time-to-death or time-to-claim.

The hazard function h(t) represents the instantaneous rate of injury at game t, given survival (no injury) up to that point:

h(t) = h₀(t) · e^(β₁x₁ + β₂x₂ + ... + βₙxₙ)
Cox proportional hazards model. h₀(t) is the baseline hazard; covariates scale it multiplicatively.

This allows us to estimate multi-game risk projections: the probability a player suffers at least one injury over the next N games.
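
As a rough sketch of how such a projection follows from the Cox model: under proportional hazards the survival function is S(N) = exp(−H₀(N) · e^(βx)), where H₀(N) is the cumulative baseline hazard up to game N, so the probability of at least one injury by game N is 1 − S(N). The Python below uses invented coefficients and an invented H₀ value purely for illustration.

```python
import math

def relative_hazard(features, coefficients):
    """exp(beta . x): the factor by which covariates scale the baseline hazard."""
    return math.exp(sum(coefficients[name] * value for name, value in features.items()))

def injury_probability_within(cumulative_baseline_hazard, features, coefficients):
    """P(at least one injury by game N) = 1 - exp(-H0(N) * exp(beta . x))."""
    scaled = cumulative_baseline_hazard * relative_hazard(features, coefficients)
    return 1.0 - math.exp(-scaled)

# Hypothetical values, for illustration only (not fitted).
coefficients = {"back_to_back": 0.4, "prior_injury": 0.6}
h0_10_games = 0.2  # assumed cumulative baseline hazard over 10 games

baseline_player = {"back_to_back": 0, "prior_injury": 0}
higher_risk_player = {"back_to_back": 1, "prior_injury": 1}

p_baseline = injury_probability_within(h0_10_games, baseline_player, coefficients)
p_higher = injury_probability_within(h0_10_games, higher_risk_player, coefficients)
hazard_ratio = relative_hazard(higher_risk_player, coefficients)
```

The hazard ratio compares two players at every point in time, while the cumulative baseline hazard sets the overall level of risk across the projection window.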

05 — Multi-game projection
Cumulative risk

Per-game probabilities are compounded to produce short-term risk windows:

P(at least 1 injury in N games) = 1 − (1 − p)ᴺ
Where p is the per-game risk probability, assumed constant and independent across games. For example, a 3% per-game risk over 10 games gives 1 − 0.97¹⁰ ≈ 26% cumulative risk.
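
The compounding formula is easy to check in Python. The second function is an assumption added for illustration: it extends the same idea to per-game probabilities that differ from game to game, still treating games as independent.

```python
import math

def cumulative_risk(p, n_games):
    """P(at least one injury in n_games) with a constant per-game risk p."""
    return 1.0 - (1.0 - p) ** n_games

def cumulative_risk_varying(per_game_probs):
    """Same idea when each game has its own risk (independence assumed)."""
    return 1.0 - math.prod(1.0 - p for p in per_game_probs)

ten_game_risk = cumulative_risk(0.03, 10)  # about 0.26, matching the example above
```

With identical per-game risks the two functions agree; the varying form is useful when a back-to-back or a long road trip raises the risk for only some games in the window.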
Interactive Risk Calculator
  • Minutes load (L7 avg): 34 min
  • Days of rest: 2 days
  • Travel distance: 800 km
  • Player age: 26
  • Back-to-back: No
Estimated injury probability for next game: 2.1% (low risk based on current inputs).
06 — Data Sources
Where the data comes from

All data is sourced from publicly available historical records:

  • Basketball Reference — Player game logs, minutes, schedule, team locations going back to the 1980s.
  • Kaggle NBA Injury Dataset — Compiled injury records (1951–2023) with player, date, team, and injury type.
  • GPS city coordinates — Used to compute Haversine travel distances between arenas.
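
For reference, a minimal Python sketch of the Haversine distance is shown below. The mean Earth radius of 6,371 km is the usual convention, and the Boston and Los Angeles coordinates are approximate city-centre values chosen for illustration rather than the arena coordinates used in the dataset.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (conventional value)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Approximate city-centre coordinates, for illustration only.
boston = (42.36, -71.06)
los_angeles = (34.05, -118.24)
coast_to_coast_km = haversine_km(*boston, *los_angeles)
```

A rolling travel load over the last three games can then be taken as the sum of the three most recent leg distances.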

The training dataset contains more than 4,200 player-game observations with binary injury outcomes, spanning multiple seasons.

07 — Limitations
What the model can't know

Statistical models are inherently limited. InjuryIndex does not account for:

  • Pre-existing undisclosed medical conditions
  • Contact injuries from collisions — fundamentally random events
  • Player-specific biomechanical risk factors
  • In-game decisions to limit minutes or sit players
  • Practice and training load between games

InjuryIndex provides statistical risk estimates for analytical and research purposes only. These outputs are not medical advice and should not be used to make medical decisions. Always consult qualified medical professionals for health-related decisions.
