How Gridiron Edge Works

Defense-Adjusted EPA, walk-forward validation, and Kalshi edge detection — the full technical methodology behind every number on this site.

66.2%	Win accuracy
0.616	Log loss
6,218	Games tested
2003–2025	Seasons
+0.60pp	Avg CLV

BETA

TL;DR

Gridiron Edge predicts NFL game outcomes using Defense-Adjusted EPA (DAEPA) — an iterative opponent-weighted metric conceptually similar to PageRank — combined with the opening spread as a prior.
A strict walk-forward backtest across 6,218 NFL games from 2003–2025 produced 66.2% accuracy, 0.616 log loss, and +0.60pp average CLV vs. closing lines — in line with the leading public opponent-adjusted efficiency benchmarks.
An edge exists when the DAEPA-implied win probability diverges from both the spread and the Kalshi contract price. Edges are tiered STRONG (≥12pp), HIGH (≥8pp), MODERATE (≥5pp), LOW (<5pp); only directionally-agreeing DAEPA + spread signals qualify for HIGH or STRONG.

Definitions

Term	Plain-English Definition
DAEPA	Defense-Adjusted Expected Points Added. EPA scaled by opponent quality so dominant production against a top defense outranks identical production against a bottom-tier defense.
PWR	Points-per-game power rating. Expected margin vs. an average team on a neutral field. Decomposes into Off PR + Def PR + ST PR on the same scale.
Tier	Plain-English bucket from PWR cutoffs: Elite ≥+7, Contender ≥+4, Playoff ≥+1, Average ≥−2, Below Avg ≥−5, Rebuild <−5.
PIS	Player Importance Score. How much a specific player shifts a team’s expected DAEPA when active vs. absent. Used for injury adjustments and off-season roster turnover.
Game Context Weighting	Per-play weight applied before aggregating DAEPA. Garbage-time plays, clock-kill runs, and prevent-defense passes are discounted so ratings reflect real competitive intent.
Per-Component Bayesian Shrinkage	Each season’s Week 1 prior shrinks the prior-season Off PR, Def PR, and ST PR toward zero by empirically-measured alpha coefficients (0.47 / 0.31 / 0.53 from 829 team-pairs, 1999-2025), then layers coaching and roster overrides on the points scale.
Walk-Forward Backtest	Train on all data through season N, predict season N+1, advance one year, repeat. No future data ever informs past predictions. No hindsight, no look-ahead bias.
CLV	Closing Line Value. Average edge vs. the closing market line — the most rigorous out-of-sample measure of predictive skill in sports modeling.

DAEPA vs. EPA

DAEPA is not a replacement for raw EPA — it's an opponent-adjusted evolution of it. Where EPA grades every play against a league-average baseline, DAEPA scales that value by the quality of the opponent it was earned against.

Metric	What It Measures	Opponent-Adjusted?	Originator
EPA	Expected Points Added per play. Raw play-level efficiency.	No	Carter & Machol (1971); extended by Romer (2006) and Burke; popularized via nflscrapR (Carnegie Mellon) and nflfastR.
DAEPA	EPA adjusted via iterative opponent weighting (PageRank-style) with per-play context weighting and Bayesian season blending.	Yes (iterative)	Gridiron Edge / The 7 Oracles, 2026.

The Problem With Standard NFL Analytics

Most publicly available NFL analytics stop at EPA — Expected Points Added. EPA is a useful play-level metric, but it has a structural flaw: it treats a 10-yard completion against the 2024 San Francisco 49ers defense the same as a 10-yard completion against the 2017 Cleveland Browns. The opponent doesn't exist in the math.

That's fine for descriptive purposes. It fails for prediction.

A team that ran up gaudy EPA numbers against a weak schedule looks identical to a team that ground out the same numbers against a murderers' row. When those two teams meet, standard EPA-based models give you the wrong answer more often than they should. Gridiron Edge was built specifically to close that gap.

Defense-Adjusted EPA (DAEPA)

The core metric is DAEPA — Defense-Adjusted Expected Points Added.

The adjustment uses an iterative opponent-weighting process conceptually similar to PageRank: a play's value is scaled by the quality of the opponent it was executed against, and opponent quality is itself a function of the quality of their opponents. The system runs until convergence (typically around a dozen iterations), producing ratings where each team's offensive and defensive efficiency is expressed in a common unit that accounts for the full strength-of-schedule context.
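The fixed-point idea can be illustrated with a minimal sketch. This is not the Gridiron Edge solver: it swaps in a simple additive alternating-averages scheme, and the function name `opponent_adjust`, the input layout, and the toy teams are all assumptions made for illustration.

```python
from collections import defaultdict


def opponent_adjust(games, tol=1e-9, max_iter=100):
    """Alternating opponent adjustment over (offense, defense, epa_per_play) rows.

    Returns (off, dfn, iterations). Higher off = better offense; higher dfn =
    better defense (holds opponents further below their adjusted rating).
    Hypothetical additive scheme, not the production solver.
    """
    teams = sorted({team for o, d, _ in games for team in (o, d)})
    off = {t: 0.0 for t in teams}
    dfn = {t: 0.0 for t in teams}
    iterations = 0
    for iterations in range(1, max_iter + 1):
        # Offense: raw EPA plus a credit for the quality of the defense faced.
        total, count = defaultdict(float), defaultdict(int)
        for o, d, epa in games:
            total[o] += epa + dfn[d]
            count[o] += 1
        new_off = {t: total[t] / count[t] if count[t] else 0.0 for t in teams}

        # Defense: how far each offense was held below its adjusted rating.
        total, count = defaultdict(float), defaultdict(int)
        for o, d, epa in games:
            total[d] += new_off[o] - epa
            count[d] += 1
        new_dfn = {t: total[t] / count[t] if count[t] else 0.0 for t in teams}

        # Centre defense at zero so the two scales cannot drift together.
        mean_dfn = sum(new_dfn.values()) / len(teams)
        new_dfn = {t: v - mean_dfn for t, v in new_dfn.items()}

        change = max(
            max(abs(new_off[t] - off[t]), abs(new_dfn[t] - dfn[t])) for t in teams
        )
        off, dfn = new_off, new_dfn
        if change < tol:
            break
    return off, dfn, iterations


# Toy round-robin where EPA is exactly true_off[o] - true_def[d].
true_off = {"A": 0.15, "B": 0.10, "C": -0.05}
true_def = {"A": 0.02, "B": 0.03, "C": -0.05}
games = [
    (o, d, true_off[o] - true_def[d]) for o in true_off for d in true_def if o != d
]
off, dfn, n_iter = opponent_adjust(games)
print(n_iter, off, dfn)
```

On this toy schedule the loop recovers the ratings used to generate the data, because each pass shrinks the remaining error by a constant factor.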

The result is three per-team DAEPA components:

Offensive DAEPA — efficiency generating EPA, adjusted for defensive quality faced
Defensive DAEPA — efficiency suppressing EPA, adjusted for offensive quality faced
Special Teams DAEPA — adjusted field position and scoring value from the kicking game

These three components are converted to the points-per-game scale and summed into PWR: each team's expected margin in points vs. an average team on a neutral field. A team at PWR +3.5 projects to win by about 3.5 points on a neutral field against a PWR 0 opponent. PWR translates into a plain-English tier (Elite, Contender, Playoff, Average, Below Avg, Rebuild) so the number reads quickly without losing the underlying precision.
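As a small illustration, the sum and the tier lookup can be written as below. The cutoffs are the ones listed in the Definitions table; the function names and the sample component values are hypothetical.

```python
# Tier cutoffs from the Definitions table, highest first.
TIER_CUTOFFS = [
    (7.0, "Elite"),
    (4.0, "Contender"),
    (1.0, "Playoff"),
    (-2.0, "Average"),
    (-5.0, "Below Avg"),
]


def pwr_total(off_pr, def_pr, st_pr):
    """PWR is the plain sum of the three points-scale components."""
    return off_pr + def_pr + st_pr


def pwr_tier(pwr):
    """Return the first tier whose cutoff the rating meets; else Rebuild."""
    for cutoff, label in TIER_CUTOFFS:
        if pwr >= cutoff:
            return label
    return "Rebuild"


pwr = pwr_total(2.0, 1.0, 0.5)  # hypothetical component ratings
print(pwr, pwr_tier(pwr))
```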

Game Context Weighting

Raw EPA includes plays that should carry little analytical weight — a prevent defense giving up a meaningless 15-yard gain with 90 seconds left in a 28-point blowout tells you almost nothing about either team's real capability.

Gridiron Edge applies a per-play context weight before aggregating DAEPA. Three conditions are handled:

Garbage time discounting: Plays in late-game blowouts receive reduced weight using a smooth function driven by win probability and score differential — not a binary cutoff. A 3-point game in the fourth quarter is barely discounted; a 28-point game with two minutes left is heavily discounted.

Clock-kill flagging: Run plays by a leading team in the fourth quarter designed to kill time rather than gain yards are identified and excluded from offensive efficiency calculations.

Prevent defense flagging: Pass plays against a defense in prevent coverage (protecting a multi-score lead late in the game) are flagged and discounted. Prevent defense statistics are notoriously misleading as indicators of pass defense quality.
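A minimal sketch of how the three conditions might combine into one per-play weight. The actual weighting function is not published, so the curve shape, the `floor` of 0.1, and the `prevent_discount` of 0.5 are assumptions; the sketch also uses win probability alone as the driver, on the assumption that it already reflects score differential and clock.

```python
def garbage_time_weight(win_prob, floor=0.1):
    """Smooth discount: 1.0 at a 50/50 game, approaching `floor` at certainty.

    `floor` and the quadratic shape are illustrative assumptions.
    """
    competitiveness = 4.0 * win_prob * (1.0 - win_prob)
    return floor + (1.0 - floor) * competitiveness


def play_weight(win_prob, is_clock_kill=False, is_prevent=False,
                prevent_discount=0.5):
    """Combine the three context rules into a single weight in [0, 1]."""
    if is_clock_kill:
        return 0.0  # excluded from offensive efficiency
    weight = garbage_time_weight(win_prob)
    if is_prevent:
        weight *= prevent_discount  # hypothetical discount factor
    return weight


print(play_weight(0.55))                 # close game, nearly full weight
print(play_weight(0.99))                 # late blowout, heavily discounted
print(play_weight(0.5, is_prevent=True))
```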

Recency Weighting and Season Blending

Gridiron Edge applies exponential recency weighting to all play-level data, with a half-life tuned to NFL roster stability patterns. Recent games carry more weight; earlier games decay.
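For illustration, exponential decay with a half-life can be sketched as follows. The tuned half-life is not disclosed; the value of 8 games and the function names are placeholders.

```python
def recency_weight(games_ago, half_life=8.0):
    """Weight halves every `half_life` games (placeholder value)."""
    return 0.5 ** (games_ago / half_life)


def recency_weighted_mean(values, games_ago, half_life=8.0):
    """Weighted average in which older observations count for less."""
    weights = [recency_weight(g, half_life) for g in games_ago]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


# Two games: 0.20 EPA/play last week, 0.00 EPA/play eight games ago.
print(recency_weighted_mean([0.20, 0.00], [0, 8]))
```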

Cross-season blending uses a per-component Bayesian shrinkage framework on the points-per-game scale. Heading into Week 1, a team's prior shrinks the prior-season Off PR, Def PR, and ST PR toward zero by empirically-measured α coefficients (see the next section). Coaching changes and roster overrides are layered in as additive points-scale deltas — not as multiplicative scale factors, which is how earlier versions of this model accidentally produced no-op roster adjustments. As the new season's data accumulates, the posterior shifts toward current performance; the prior is effectively washed out by midseason.

Empirically-Calibrated Per-Component Shrinkage

The α (year-over-year shrinkage) and β (component win-weight) coefficients quoted above are measured, not assumed. The calibration ran in May 2026 over the full 1999–2025 play-by-play dataset.

Year-over-year stability (α coefficients). Measured as Pearson R between each team's prior-season component rating and the same franchise's next-season rating, across 829 team-pair observations from 1999 through 2025.

Component	α (Pearson R)	95% CI	Plain English
Off PR	0.47	0.41 – 0.52	Offense regresses ~53% toward the league mean each year.
Def PR	0.31	0.25 – 0.37	Defense is the LEAST sticky — pass-rush health, turnover variance, scheme turnover.
ST PR	0.53	0.48 – 0.57	Special teams is the MOST sticky — kickers and punters stay with teams for years.

The ST finding is counter-intuitive: blocked kicks and return touchdowns feel like luck. They are, on the play level. But over a 17-game season, ST is dominated by kicker FG accuracy and punter net yardage — both highly stable skills tied to specific players. Defense, by contrast, depends on edge-rusher health, turnover luck, scheme chemistry, and opponent quality — all of which churn year-to-year.
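A minimal sketch of the Week 1 prior using the α values in the table, with overrides added as points-scale deltas. The function name, the sample ratings, and the +0.5 offensive override are hypothetical.

```python
# Alpha coefficients from the table above.
ALPHA = {"off": 0.47, "def": 0.31, "st": 0.53}


def week1_prior(prior_season, overrides=None):
    """Shrink each prior-season component toward zero, then add point deltas."""
    overrides = overrides or {}
    return {
        comp: ALPHA[comp] * prior_season[comp] + overrides.get(comp, 0.0)
        for comp in ALPHA
    }


# Hypothetical team: +4.0 Off PR, +2.0 Def PR, +0.5 ST PR last season,
# plus an assumed +0.5 offensive override for a roster upgrade.
prior = week1_prior({"off": 4.0, "def": 2.0, "st": 0.5}, overrides={"off": 0.5})
print(prior)
```

Because the override is added rather than multiplied, it still moves the rating when the shrunk component is close to zero.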

Component win-weights (β coefficients). Fit by OLS regression on 745 team-seasons from 2000–2024: season_wins ~ β₀ + β_off · off_pr + β_def · def_pr + β_st · st_pr. R² = 0.78 — the three components explain 78% of season-win variance; the remaining 22% is schedule strength, injury luck, turnover variance, and tactical matchups.

Term	β (wins per point)	SE	95% CI
intercept	8.35	0.06	8.22 – 8.46
Off PR	0.757	0.018	0.72 – 0.79
Def PR	0.613	0.022	0.57 – 0.66
ST PR	0.628	0.105	0.42 – 0.83

The β values for Off / Def / ST sit within ~25% of each other. The Off and Def intervals do not overlap, so that gap is statistically real, but it is small in practical terms, and the wide ST interval (0.42 – 0.83) spans both. This is why PWR sums the three components directly with equal weights rather than applying a β-weighted formula; the ranking changes from β-weighting would be smaller than the noise in the α shrinkage.
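To make the regression concrete, the fitted equation can be evaluated directly. The coefficients are the point estimates from the table; the sample component ratings are hypothetical, and the intercept comes from a 2000–2024 fit that mixes 16-game and 17-game seasons.

```python
# Point estimates from the coefficient table.
BETA = {"intercept": 8.35, "off": 0.757, "def": 0.613, "st": 0.628}


def projected_wins(off_pr, def_pr, st_pr):
    """season_wins = b0 + b_off * off_pr + b_def * def_pr + b_st * st_pr."""
    return (
        BETA["intercept"]
        + BETA["off"] * off_pr
        + BETA["def"] * def_pr
        + BETA["st"] * st_pr
    )


print(projected_wins(0.0, 0.0, 0.0))  # league-average team
print(projected_wins(1.0, 0.5, 0.0))  # hypothetical slightly above-average team
```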

Why this matters for the rankings. The previous version of these rankings summed components on a z-score×100 scale, which made special teams artificially equal in weight to offense or defense. On the native points-per-game scale, offense and defense range about ±10 points league-wide; special teams ranges only about ±1.4 points. Summing on the points scale produces a defensible total where the JAX-style "ST-strong team accidentally ranks top-5" failure mode goes away structurally.
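The failure mode is easy to reproduce with two made-up teams. The league standard deviations below are assumptions chosen to match the rough ±10 and ±1.4 point ranges, and `zsum_rating` is only a guess at the shape of the legacy formula.

```python
# Assumed league standard deviations per component (points per game).
LEAGUE_SD = {"off": 5.0, "def": 5.0, "st": 0.7}


def zsum_rating(team):
    """Legacy-style composite: sum of z-scores x 100 (league mean taken as 0)."""
    return 100.0 * sum(team[c] / LEAGUE_SD[c] for c in LEAGUE_SD)


def points_rating(team):
    """PWR-style composite: plain sum on the points scale."""
    return sum(team[c] for c in LEAGUE_SD)


st_heavy = {"off": 0.0, "def": 0.0, "st": 1.4}   # elite special teams only
balanced = {"off": 5.0, "def": 2.5, "st": 0.0}   # good offense and defense

print(zsum_rating(st_heavy), zsum_rating(balanced))
print(points_rating(st_heavy), points_rating(balanced))
```

Under the z-score sum the special-teams-only team ranks higher; on the points scale it trails by about six points per game.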

Roster Adjustment and Injury Impact

The roster adjustment layer quantifies player impact via Player Importance Scores (PIS), a per-player measure of how much that player's presence or absence shifts a team's expected DAEPA. PIS is derived from the player's historical DAEPA contribution in similar roles, weighted by snap share and positional leverage.

This is particularly important during the off-season: free agent departures, draft picks, and trade additions are all reflected in the team's projected ratings before the first preseason snap. Active injury adjustments at game time use the same PIS framework — when a player is ruled out, their contribution is removed and redistributed to their replacement using historical replacement-level baselines by position.
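A simplified sketch of the remove-and-replace step. The replacement baselines, the function name, and the treatment of PIS as a points-scale quantity are all assumptions for illustration; the real PIS values and baselines are not published.

```python
# Hypothetical replacement-level baselines by position (points per game).
REPLACEMENT_BASELINE = {"QB": 0.5, "WR": 0.2, "EDGE": 0.2}


def injury_adjusted_rating(team_rating, player_pis, position):
    """Remove the absent player's contribution, add back a replacement's."""
    return team_rating - player_pis + REPLACEMENT_BASELINE[position]


# A +3.5 team loses a quarterback with an assumed PIS of 2.5.
print(injury_adjusted_rating(3.5, 2.5, "QB"))
```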

The Win Model

Win probability prediction uses a hybrid approach based on data availability:

Spread-available games (~96%): The opening line is the single most information-rich predictor of NFL game outcomes. For these games, Gridiron Edge converts the opening spread directly to a win probability using a normal distribution calibrated to NFL margin-of-victory distributions. DAEPA serves as a cross-check and edge-identification tool rather than the primary predictor.

No-spread games (~4%): A logistic regression model trained on DAEPA features generates the win probability for games without an opening line.

The value of DAEPA in the spread regime isn't in replacing the line — it's in identifying where the model disagrees with the market and why. A significant divergence between DAEPA-implied probability and spread-implied probability is the edge signal.
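The spread conversion can be sketched with the standard library, using the σ = 13.45 quoted in the FAQ. The sign convention (negative spread means favored), the function names, and the sample DAEPA probability are assumptions.

```python
from statistics import NormalDist

MARGIN_SIGMA = 13.45  # sigma quoted in the FAQ


def spread_to_win_prob(spread, sigma=MARGIN_SIGMA):
    """Win probability for a team given its spread (negative = favored).

    Models the final margin as Normal(-spread, sigma) and returns P(margin > 0).
    """
    return NormalDist().cdf(-spread / sigma)


def divergence_pp(daepa_prob, spread_prob):
    """Gap between the two signals, in percentage points."""
    return (daepa_prob - spread_prob) * 100.0


p_spread = spread_to_win_prob(-3.0)   # 3-point favorite, roughly 0.59
print(round(p_spread, 3))
print(round(divergence_pp(0.66, p_spread), 1))  # hypothetical DAEPA estimate
```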

Walk-Forward Validation

Every performance metric published by Gridiron Edge comes from a strict walk-forward backtest: train on all data through season N, predict season N+1, advance one year, repeat. No future data ever influences a past prediction. No hindsight.

The backtest covers 6,218 games from 2003 through 2025 — the full era of modern NFL data with consistent play-by-play tracking.
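The split schedule can be sketched as a generator. The test range 2003–2025 is from the text; starting the training window at 1999 is an inference from the play-by-play coverage cited in the references, not a stated parameter.

```python
def walk_forward_splits(first_season, first_test, last_test):
    """Yield (training_seasons, test_season), expanding one year at a time."""
    for test_season in range(first_test, last_test + 1):
        yield list(range(first_season, test_season)), test_season


splits = list(walk_forward_splits(1999, 2003, 2025))
print(len(splits))   # number of test seasons
print(splits[0])     # first fold
print(splits[-1][1]) # last test season
```

Every training window ends strictly before its test season, which is what rules out look-ahead.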

Metric	Value	Note
Win prediction accuracy	66.2%	In line with public efficiency benchmarks (~66%)
Log loss	0.616	Penalizes confident wrong predictions
Average CLV	+0.60pp	vs. closing line, 6,017 matched games
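For reference, the two headline metrics are computed as below. This is the standard definition of accuracy and binary log loss, shown on made-up predictions; the function names are illustrative.

```python
import math


def accuracy(probs, outcomes):
    """Share of games where the side given more than 50% actually won."""
    hits = sum((p > 0.5) == bool(y) for p, y in zip(probs, outcomes))
    return hits / len(probs)


def log_loss(probs, outcomes, eps=1e-15):
    """Mean negative log-likelihood; confident misses are penalized heavily."""
    total = 0.0
    for p, y in zip(probs, outcomes):
        p = min(max(p, eps), 1.0 - eps)  # guard against log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(probs)


probs = [0.7, 0.6, 0.4]   # made-up home win probabilities
outcomes = [1, 0, 0]      # 1 = home team won
print(accuracy(probs, outcomes), log_loss(probs, outcomes))
```

A forecaster that always says 50% scores ln 2 ≈ 0.693 on log loss, which is the natural baseline for reading the 0.616 figure.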

What This Means for Prediction Markets

Kalshi, Polymarket, and DraftKings all price NFL outcomes as independent markets. Sometimes they disagree — and those disagreements are signal. Gridiron Edge generates an independent probability estimate from DAEPA and the opening spread, compares it to each market's implied probability, and surfaces the largest gaps.
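Putting the three venues on one scale means converting each quote to a probability. The conversions below are the standard ones; the function names are illustrative, and whether Gridiron Edge strips the sportsbook margin before comparing is not stated, so `remove_vig` is shown only as an optional step.

```python
def kalshi_price_to_prob(price_cents):
    """A contract paying $1 and priced at N cents implies N% probability."""
    return price_cents / 100.0


def american_odds_to_prob(odds):
    """Implied probability from American odds (margin still included)."""
    if odds < 0:
        return -odds / (-odds + 100.0)
    return 100.0 / (odds + 100.0)


def remove_vig(prob_a, prob_b):
    """Normalize a two-way market so the two sides sum to 1."""
    total = prob_a + prob_b
    return prob_a / total, prob_b / total


print(kalshi_price_to_prob(62))
print(american_odds_to_prob(-150), american_odds_to_prob(130))
print(remove_vig(american_odds_to_prob(-150), american_odds_to_prob(130)))
```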

Edges are categorized as STRONG (≥12pp), HIGH (≥8pp), MODERATE (≥5pp), or LOW (<5pp) based on the magnitude of the model-market divergence. Only edges with directional agreement between the DAEPA signal and the spread signal are elevated to HIGH or STRONG. Cross-platform confirmation (two or more markets agreeing on the direction of mispricing) strengthens conviction further.
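The tiering rule can be sketched as a small function. The thresholds are the published ones; capping a non-agreeing edge at MODERATE is an assumption, since the text only says such edges are not elevated to HIGH or STRONG.

```python
def edge_tier(model_prob, market_prob, signals_agree):
    """Return (edge in percentage points, tier label)."""
    edge_pp = round(abs(model_prob - market_prob) * 100.0, 2)
    if edge_pp >= 12:
        tier = "STRONG"
    elif edge_pp >= 8:
        tier = "HIGH"
    elif edge_pp >= 5:
        tier = "MODERATE"
    else:
        tier = "LOW"
    # Assumption: without DAEPA/spread agreement, the edge is capped at MODERATE.
    if tier in ("STRONG", "HIGH") and not signals_agree:
        tier = "MODERATE"
    return edge_pp, tier


print(edge_tier(0.65, 0.50, signals_agree=True))
print(edge_tier(0.65, 0.50, signals_agree=False))
print(edge_tier(0.53, 0.50, signals_agree=True))
```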

Market Coverage

The DAEPA engine feeds edge detection across every NFL market type we can find pricing for:

Moneyline — head-to-head winner, compared across Kalshi, Polymarket, and DraftKings
Spread — point-differential prediction vs. the opening and closing lines
Total — combined scoring over/under
Player props — passing, rushing, and receiving stat lines (Kalshi + DraftKings)
Win total futures — seasonal O/U per team, run via Monte Carlo over full schedules
Division, conference, Super Bowl — playoff bracket probabilities from a season-plus-postseason simulation
Season awards — MVP, ROY, OPOY, DPOY when Kalshi or Polymarket posts them
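As an illustration of the win-total simulation, the sketch below draws each game independently from a fixed win probability. That independence, the flat 55% schedule, and the function names are simplifying assumptions; the production simulation is not described at this level of detail.

```python
import random


def simulate_win_totals(game_win_probs, n_sims=20000, seed=0):
    """Simulate season win totals from per-game win probabilities."""
    rng = random.Random(seed)
    return [
        sum(rng.random() < p for p in game_win_probs) for _ in range(n_sims)
    ]


def prob_over(totals, line):
    """Share of simulated seasons that finish above a win-total line."""
    return sum(t > line for t in totals) / len(totals)


schedule = [0.55] * 17  # hypothetical team favored at 55% in every game
totals = simulate_win_totals(schedule)
print(sum(totals) / len(totals))  # close to 17 * 0.55 = 9.35
print(prob_over(totals, 9.5))     # model probability of the over
```

The simulated over probability is what gets compared against the market price on the same line.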

Not every market is live year-round. Preseason coverage emphasizes futures and win totals; in-season shifts to weekly games and player props. The backtest number above (66.2%) is the moneyline figure — spread, total, and prop performance is tracked separately as those product lines mature.

Frequently Asked Questions

What is DAEPA?

DAEPA stands for Defense-Adjusted Expected Points Added. It takes the standard EPA metric and applies an iterative opponent-weighting process — similar conceptually to PageRank — so that a play's value is scaled by the quality of the defense it was executed against. This makes DAEPA a better predictive tool than raw EPA, which treats all opponents equally.

How accurate is Gridiron Edge at predicting NFL games?

Gridiron Edge achieved 66.2% win prediction accuracy and a 0.616 log loss across a strict walk-forward backtest of 6,218 NFL games from 2003 through 2025. No future data was used to train past predictions — the model was trained on all data through season N and tested on season N+1, advancing one year at a time. That is in line with the leading public opponent-adjusted efficiency benchmarks, which land around 66%.

What is PWR (the power rating)?

PWR is each team's rating in points per game above an average team on a neutral field. A team at PWR +3.5 is expected to win by about 3.5 points on a neutral field against a PWR 0 opponent; on a 17-game schedule that scales to roughly 9–10 wins. PWR decomposes into Off PR + Def PR + ST PR — the same three buckets, all measured on the same points scale so they can be added directly. Tiers (Elite ≥+7, Contender ≥+4, Playoff ≥+1, Average ≥−2, Below Avg ≥−5, Rebuild <−5) translate PWR into a plain-English bucket.

Why did GER get retired in favor of PWR?

GER (the legacy composite Gridiron Edge Rating) summed three DAEPA components on a z-score×100 scale, which normalized each component to the same variance. That meant special-teams was artificially weighted equal to offense or defense in the sum, producing rankings where ST-strong teams ranked unexpectedly high. PWR fixes this structurally by working on the native points-per-game scale, where league-wide range is roughly ±10 points for offense and defense vs. ±1.4 points for special teams — so summing produces a defensible total. GER may still surface in older articles; PWR is the current canonical team rating.

How does the Gridiron Edge win model use the opening spread?

For games with an opening spread (~96% of cases), Gridiron Edge converts the opening line directly to a win probability using a normal distribution calibrated to NFL margin-of-victory distributions (σ = 13.45 points). DAEPA serves as a cross-check — when the DAEPA-implied probability diverges significantly from the spread-implied probability, that gap is the edge signal. For games without a spread (~4%), a logistic regression model trained on DAEPA features generates the win probability.

What is a Player Importance Score (PIS)?

PIS quantifies how much an individual player's presence or absence shifts a team's expected DAEPA. It is derived from the player's historical DAEPA contribution in similar roles, weighted by snap share and positional leverage. PIS is used for two purposes: pre-game injury adjustments (removing a player's contribution when they are ruled out) and off-season roster projections (quantifying the impact of free agent departures and additions).

How are prediction market edges calculated?

Gridiron Edge generates an independent win probability estimate using DAEPA and the opening spread. That probability is compared to the contract price on Kalshi, Polymarket, and the DraftKings line — each a market-implied probability. The gap is the edge, expressed in percentage points. Edges are categorized as STRONG (≥12pp), HIGH (≥8pp), MODERATE (≥5pp), or LOW (<5pp). Only edges where the DAEPA signal and spread signal agree are elevated to HIGH or STRONG. Cross-platform confirmation — two or more markets agreeing on the direction of mispricing — strengthens conviction.

Which NFL markets does Gridiron Edge cover?

Game markets (moneyline, spread, total), player props (passing, rushing, receiving stat lines), win total futures, division and conference winners, Super Bowl champion, and season awards like MVP, ROY, OPOY, and DPOY. Coverage tracks what Kalshi, Polymarket, and DraftKings post — some markets are only live seasonally.

References & Data Sources

Gridiron Edge is built entirely on public play-by-play, official injury reports, and market prices. Every published figure is reproducible from those inputs.

EPA framework — Romer, D. (2006). Do Firms Maximize? Evidence from Professional Football. Journal of Political Economy, 114(2). Foundational expected-points work adapted widely in NFL analytics.
Opponent-adjusted efficiency benchmarks — established public opponent-adjusted NFL efficiency ratings are the closest external benchmark for a model of this class, and land around 66% game-winner accuracy. Cited as the "~66% accuracy" reference point.
Play-by-play data — nflfastR (Carl, Baldwin) for canonical play-level data from 1999 onward.
Opponent weighting (PageRank analogy) — Page, Brin, Motwani & Winograd (1998). The PageRank Citation Ranking: Bringing Order to the Web. Stanford Digital Library Technologies. DAEPA's iterative rating solver borrows the fixed-point structure, not the link graph.
Walk-forward backtest standards — López de Prado, M. (2018). Advances in Financial Machine Learning. Wiley. Same leakage-prevention framework applied to sports modeling here.
Injury data — NFL public injury report feed, cross-referenced against public position-snap baselines for PIS replacement-level estimation.
Market pricing — Kalshi contract prices (event-level tickers, api.elections.kalshi.com), Polymarket via the Gamma API (gamma-api.polymarket.com), and DraftKings lines via The Odds API. Opening and closing lines from consensus sportsbook feeds are retained for CLV computation.

Further reading: We Backtested 20 Years of Madden Ratings Against Real NFL Results — the same walk-forward discipline applied to Madden ratings (4,237 games, 2010–2025): 61% straight-up, where the signal actually lives, and how it feeds our preseason priors.

See the Model in Action

NFL Predictions (every game vs the market): free model win probability vs Kalshi on all 18 weeks
2026 NFL Win Totals (model vs Kalshi): all 32 teams, projected wins vs the KXNFLWINS ladder
NFL Game Edges (where the model disagrees): moneyline, spread & total edges vs Kalshi prices
NFL MVP Edges: live MVP market mispricings vs Kalshi KXNFLMVP
Win Total Futures: Monte Carlo season wins vs the Kalshi line
Championship Edges: playoff / conference / Super Bowl probabilities
Kalshi NFL Markets: how Kalshi prices every NFL market, plus live edges
Power Rankings (the model methodology anchor): all 32 teams on the points-per-game scale

Methodology current as of the 2025 NFL season, last reviewed June 2026. The engine covers team ratings, game win probabilities, player props, and seasonal futures; playoff-bracket and cross-platform prop coverage expand through the 2026 season. Model parameters are subject to revision if underlying data sources or NFL structural dynamics change materially. Proprietary parameters are not disclosed.