
How Gridiron Edge Works

Defense-Adjusted EPA, walk-forward validation, and Kalshi edge detection — the full technical methodology behind every number on this site.

  • 66.5% win accuracy
  • 0.616 log loss
  • 5,934 games tested
  • 2003–2024 seasons
  • +0.61pp avg CLV

BETA

TL;DR

Definitions

Term | Plain-English Definition
DAEPA | Defense-Adjusted Expected Points Added. EPA scaled by opponent quality so dominant production against a top defense outranks identical production against a bottom-tier defense.
PWR | Points-per-game power rating. Expected margin vs. an average team on a neutral field. Decomposes into Off PR + Def PR + ST PR on the same scale.
Tier | Plain-English bucket from PWR cutoffs: Elite ≥+7, Contender ≥+4, Playoff ≥+1, Average ≥−2, Below Avg ≥−5, Rebuild <−5.
PIS | Player Importance Score. How much a specific player shifts a team’s expected DAEPA when active vs. absent. Used for injury adjustments and off-season roster turnover.
Game Context Weighting | Per-play weight applied before aggregating DAEPA. Garbage-time plays, clock-kill runs, and prevent-defense passes are discounted so ratings reflect real competitive intent.
Per-Component Bayesian Shrinkage | Each season’s Week 1 prior shrinks the prior-season Off PR, Def PR, and ST PR toward zero by empirically measured alpha coefficients (0.47 / 0.31 / 0.53 from 829 team-pairs, 1999–2025), then layers coaching and roster overrides on the points scale.
Walk-Forward Backtest | Train on all data through season N, predict season N+1, advance one year, repeat. No future data ever informs past predictions. No hindsight, no look-ahead bias.
CLV | Closing Line Value. Average edge vs. the closing market line, the most rigorous out-of-sample measure of predictive skill in sports modeling.

DAEPA vs. DVOA vs. EPA

DAEPA is not a replacement for EPA or DVOA — it's a synthesis. Each metric answers a different question.

Metric | What It Measures | Opponent-Adjusted? | Originator
EPA | Expected Points Added per play. Raw play-level efficiency. | No | Brian Burke, David Romer, and earlier expected-points work; open-source models from Carnegie Mellon (nflscrapR), popularized via nflfastR.
DVOA | Defense-adjusted Value Over Average. Play value vs. league baseline, scaled by opponent strength. | Yes | Aaron Schatz, Football Outsiders / FTN.
DAEPA | EPA adjusted via iterative opponent weighting (PageRank-style) with per-play context weighting and Bayesian season blending. | Yes (iterative) | Gridiron Edge / The 7 Oracles, 2026.

The Problem With Standard NFL Analytics

Most publicly available NFL analytics stop at EPA — Expected Points Added. EPA is a useful play-level metric, but it has a structural flaw: it treats a 10-yard completion against the 2024 San Francisco 49ers defense the same as a 10-yard completion against the 2017 Cleveland Browns. The opponent doesn't exist in the math.

That's fine for descriptive purposes. It fails for prediction.

A team that ran up gaudy EPA numbers against a weak schedule looks identical to a team that ground out the same numbers against a murderers' row. When those two teams meet, standard EPA-based models give you the wrong answer more often than they should. Gridiron Edge was built specifically to close that gap.

Defense-Adjusted EPA (DAEPA)

The core metric is DAEPA — Defense-Adjusted Expected Points Added.

The adjustment uses an iterative opponent-weighting process conceptually similar to PageRank: a play's value is scaled by the quality of the opponent it was executed against, and opponent quality is itself a function of the quality of their opponents. The system runs until convergence (typically around a dozen iterations), producing ratings where each team's offensive and defensive efficiency is expressed in a common unit that accounts for the full strength-of-schedule context.
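As a rough illustration of this kind of iteration, here is a minimal sketch. It is not the production algorithm, whose weighting details are not published. It assumes an additive model (observed EPA ≈ league average + offense rating − defense rating) and alternates between updating offense and defense ratings until they stop changing. The function names and the four-team toy data are hypothetical.

```python
from collections import defaultdict


def _center(ratings):
    """Shift ratings so the league average is zero."""
    mean = sum(ratings.values()) / len(ratings)
    return {team: value - mean for team, value in ratings.items()}


def _team_means(samples, teams):
    """Average each team's samples; teams with no samples stay at 0.0."""
    return {t: sum(samples[t]) / len(samples[t]) if samples[t] else 0.0 for t in teams}


def iterate_ratings(obs, max_iter=100, tol=1e-10):
    """obs: (offense_team, defense_team, epa_per_play) tuples, one per game side."""
    teams = sorted({t for off, dfn, _ in obs for t in (off, dfn)})
    league_avg = sum(epa for _, _, epa in obs) / len(obs)
    off_rating = dict.fromkeys(teams, 0.0)
    def_rating = dict.fromkeys(teams, 0.0)

    for n_iter in range(1, max_iter + 1):
        # Offense: production above average, plus credit for the defense faced.
        samples = defaultdict(list)
        for off, dfn, epa in obs:
            samples[off].append(epa - league_avg + def_rating[dfn])
        new_off = _center(_team_means(samples, teams))

        # Defense: how far below the offense's rating it held each opponent.
        samples = defaultdict(list)
        for off, dfn, epa in obs:
            samples[dfn].append(new_off[off] - (epa - league_avg))
        new_def = _center(_team_means(samples, teams))

        change = max(
            max(abs(new_off[t] - off_rating[t]), abs(new_def[t] - def_rating[t]))
            for t in teams
        )
        off_rating, def_rating = new_off, new_def
        if change < tol:
            break
    return off_rating, def_rating, n_iter


# Toy round-robin with known "true" ratings (hypothetical numbers).
TRUE_OFF = {"A": 0.10, "B": 0.02, "C": -0.04, "D": -0.08}
TRUE_DEF = {"A": 0.05, "B": -0.05, "C": 0.03, "D": -0.03}
obs = [
    (o, d, 0.02 + TRUE_OFF[o] - TRUE_DEF[d])
    for o in TRUE_OFF for d in TRUE_DEF if o != d
]
off_rating, def_rating, n_iter = iterate_ratings(obs)
```

On this noise-free toy schedule the loop recovers the true ratings. Real play-by-play data would add per-play weights and noise, so convergence behavior there may differ.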

The result is three per-team DAEPA components:

  • Offensive DAEPA — efficiency generating EPA, adjusted for defensive quality faced
  • Defensive DAEPA — efficiency suppressing EPA, adjusted for offensive quality faced
  • Special Teams DAEPA — adjusted field position and scoring value from the kicking game

These three components are converted to the points-per-game scale and summed into PWR: each team's expected margin in points vs. an average team on a neutral field. A team at PWR +3.5 projects to win by about 3.5 points on a neutral field against a PWR 0 opponent. PWR translates into a plain-English tier (Elite, Contender, Playoff, Average, Below Avg, Rebuild) so the number reads quickly without losing the underlying precision.
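The sum and the tier lookup can be sketched in a few lines. The cutoffs are the ones listed in the Definitions table; the function names and the example component values are illustrative only.

```python
# Tier cutoffs from the Definitions table, checked from highest to lowest.
TIER_CUTOFFS = [
    (7.0, "Elite"),
    (4.0, "Contender"),
    (1.0, "Playoff"),
    (-2.0, "Average"),
    (-5.0, "Below Avg"),
]


def pwr(off_pr, def_pr, st_pr):
    """Sum the three components, all on the points-per-game scale."""
    return off_pr + def_pr + st_pr


def tier(pwr_value):
    """Return the first tier whose cutoff the rating meets, else Rebuild."""
    for cutoff, label in TIER_CUTOFFS:
        if pwr_value >= cutoff:
            return label
    return "Rebuild"


example = pwr(off_pr=2.0, def_pr=1.0, st_pr=0.5)  # hypothetical team
example_tier = tier(example)
```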

Game Context Weighting

Raw EPA includes plays that should carry little analytical weight — a prevent defense giving up a meaningless 15-yard gain with 90 seconds left in a 28-point blowout tells you almost nothing about either team's real capability.

Gridiron Edge applies a per-play context weight before aggregating DAEPA. Three conditions are handled:

Garbage time discounting: Plays in late-game blowouts receive reduced weight using a smooth function driven by win probability and score differential — not a binary cutoff. A 3-point game in the fourth quarter is barely discounted; a 28-point game with two minutes left is heavily discounted.

Clock-kill flagging: Run plays by a leading team in the fourth quarter designed to kill time rather than gain yards are identified and excluded from offensive efficiency calculations.

Prevent defense flagging: Pass plays by an offense trailing by multiple scores late, thrown against a defense sitting in prevent coverage, are flagged and discounted. Prevent defense statistics are notoriously misleading as indicators of pass defense quality.
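The exact weighting function is not published, so the following is only a sketch of what a smooth (non-binary) context weight could look like. Every constant here (the 0.1 floor, the 16-point midpoint, the 4-point slope, the 0.5 prevent factor) is a hypothetical placeholder, and `context_weight` is an invented name.

```python
import math


def context_weight(win_prob, score_diff, clock_kill=False, prevent=False,
                   floor=0.1, prevent_factor=0.5):
    """Per-play weight in [0, 1]; all constants are illustrative assumptions."""
    if clock_kill:
        # Clock-kill runs are excluded from offensive efficiency entirely.
        return 0.0
    # Leverage is 1.0 for a toss-up game and falls toward 0 as the game is decided.
    leverage = 4.0 * win_prob * (1.0 - win_prob)
    # Logistic in score margin: near 1 for one-score games, near 0 for blowouts.
    closeness = 1.0 / (1.0 + math.exp((abs(score_diff) - 16.0) / 4.0))
    weight = floor + (1.0 - floor) * leverage * closeness
    return weight * prevent_factor if prevent else weight


close_game = context_weight(win_prob=0.65, score_diff=3)    # lightly discounted
blowout = context_weight(win_prob=0.999, score_diff=28)     # heavily discounted
```

Because both factors are continuous, the weight decays gradually as a game gets out of hand instead of jumping at a fixed threshold.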

Recency Weighting and Season Blending

Gridiron Edge applies exponential recency weighting to all play-level data, with a half-life tuned to NFL roster stability patterns. Recent games carry more weight; earlier games decay.
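A half-life weighting can be written as follows. The tuned half-life is not disclosed; the 8-week value below is a placeholder, and the function names are illustrative.

```python
def recency_weight(weeks_ago, half_life_weeks=8.0):
    """Weight halves every `half_life_weeks`; 8.0 is a hypothetical value."""
    return 0.5 ** (weeks_ago / half_life_weeks)


def recency_weighted_mean(values, weeks_ago, half_life_weeks=8.0):
    """Average of values where older observations count for less."""
    weights = [recency_weight(w, half_life_weeks) for w in weeks_ago]
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# A play from this week counts twice as much as one from eight weeks ago.
blended = recency_weighted_mean([1.0, 0.0], weeks_ago=[0, 8])
```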

Cross-season blending uses a per-component Bayesian shrinkage framework on the points-per-game scale. Heading into Week 1, a team's prior shrinks the prior-season Off PR, Def PR, and ST PR toward zero by empirically-measured α coefficients (see the next section). Coaching changes and roster overrides are layered in as additive points-scale deltas — not as multiplicative scale factors, which is how earlier versions of this model accidentally produced no-op roster adjustments. As the new season's data accumulates, the posterior shifts toward current performance; the prior is effectively washed out by midseason.
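A minimal sketch of the Week 1 prior and the in-season blend, assuming the α values quoted in the Definitions table. The additive `deltas` argument mirrors the points-scale overrides described above. The `prior_games` constant, which controls how quickly the prior washes out, is a hypothetical stand-in for whatever the production model uses.

```python
# Year-over-year shrinkage coefficients quoted in the Definitions table.
ALPHA = {"off": 0.47, "def": 0.31, "st": 0.53}


def week1_prior(last_season, deltas=None):
    """Shrink each component toward zero, then add points-scale overrides."""
    deltas = deltas or {}
    return {c: ALPHA[c] * last_season[c] + deltas.get(c, 0.0) for c in ALPHA}


def blend(prior, current, games_played, prior_games=6.0):
    """Weight on current-season data grows with games played.

    prior_games is an assumed constant, not a published parameter.
    """
    w = games_played / (games_played + prior_games)
    return {c: (1.0 - w) * prior[c] + w * current[c] for c in prior}


last_season = {"off": 4.0, "def": 2.0, "st": 1.0}       # hypothetical team
prior = week1_prior(last_season, deltas={"off": 1.5})   # e.g. a QB upgrade
midseason = blend(prior, {"off": 1.0, "def": 1.0, "st": 0.0}, games_played=6)
```

Note that the override is added after shrinkage. Multiplying a rating of zero by a scale factor would leave it at zero, which is the no-op failure the additive form avoids.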

Empirically-Calibrated Per-Component Shrinkage

The α (year-over-year shrinkage) and β (component win-weight) coefficients are measured, not assumed. The calibration ran in May 2026 over the full 1999–2025 play-by-play dataset.

Year-over-year stability (α coefficients). Measured as Pearson R between each team's prior-season component rating and the same franchise's next-season rating, across 829 team-pair observations from 1999 through 2025.

Component | α (Pearson R) | 95% CI | Plain English
Off PR | 0.47 | 0.41 – 0.52 | Offense regresses ~53% toward the league mean each year.
Def PR | 0.31 | 0.25 – 0.37 | Defense is the LEAST sticky: pass-rush health, turnover variance, scheme turnover.
ST PR | 0.53 | 0.48 – 0.57 | Special teams is the MOST sticky: kickers and punters stay with teams for years.

The ST finding is counter-intuitive: blocked kicks and return touchdowns feel like luck. They are, on the play level. But over a 17-game season, ST is dominated by kicker FG accuracy and punter net yardage — both highly stable skills tied to specific players. Defense, by contrast, depends on edge-rusher health, turnover luck, scheme chemistry, and opponent quality — all of which churn year-to-year.
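The source does not say how the 95% intervals were computed. One standard choice is the Fisher z-transform, sketched below, which lands within about 0.01 of the published bounds when fed the rounded R values and n = 829. Treat it as a plausibility check, not as the method actually used.

```python
import math


def pearson_ci(r, n, z_crit=1.96):
    """Approximate 95% CI for a Pearson correlation via the Fisher z-transform."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


intervals = {
    "off": pearson_ci(0.47, 829),
    "def": pearson_ci(0.31, 829),
    "st": pearson_ci(0.53, 829),
}
```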

Component win-weights (β coefficients). Fit by OLS regression on 745 team-seasons from 2000–2024: season_wins ~ β₀ + β_off · off_pr + β_def · def_pr + β_st · st_pr. R² = 0.78 — the three components explain 78% of season-win variance; the remaining 22% is schedule strength, injury luck, turnover variance, and tactical matchups.

Term | β (wins per point) | SE | 95% CI
intercept | 8.35 | 0.06 | 8.22 – 8.46
Off PR | 0.757 | 0.018 | 0.72 – 0.79
Def PR | 0.613 | 0.022 | 0.57 – 0.66
ST PR | 0.628 | 0.105 | 0.42 – 0.83
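Plugging the fitted coefficients into the regression equation gives an expected season win total from the three components. The function name and the example ratings are illustrative.

```python
# Coefficients from the table above (wins per point of rating).
BETA = {"intercept": 8.35, "off": 0.757, "def": 0.613, "st": 0.628}


def predicted_wins(off_pr, def_pr, st_pr):
    """Expected season wins from the OLS fit: b0 + b_off*off + b_def*def + b_st*st."""
    return (BETA["intercept"] + BETA["off"] * off_pr
            + BETA["def"] * def_pr + BETA["st"] * st_pr)


average_team = predicted_wins(0.0, 0.0, 0.0)
good_team = predicted_wins(off_pr=2.0, def_pr=1.0, st_pr=0.5)  # hypothetical team
```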

The β values for Off / Def / ST sit within ~25% of each other. The ST estimate is too imprecise (95% CI 0.42 – 0.83) to separate from either of the others; the Off and Def intervals do not overlap, but the gap between them is modest. This is why PWR sums the three components directly with equal weights rather than applying a β-weighted formula; the ranking changes from β-weighting would be smaller than the noise in the α shrinkage.

Why this matters for the rankings. The previous version of these rankings summed components on a z-score×100 scale, which made special teams artificially equal in weight to offense or defense. On the native points-per-game scale, offense and defense range about ±10 points league-wide; special teams ranges only about ±1.4 points. Summing on the points scale produces a defensible total where the JAX-style "ST-strong team accidentally ranks top-5" failure mode goes away structurally.

Roster Adjustment and Injury Impact

The roster adjustment layer quantifies player impact via Player Importance Scores (PIS), a per-player measure of how much that player's presence or absence shifts a team's expected DAEPA. PIS is derived from the player's historical DAEPA contribution in similar roles, weighted by snap share and positional leverage.

This is particularly important during the off-season: free agent departures, draft picks, and trade additions are all reflected in the team's projected ratings before the first preseason snap. Active injury adjustments at game time use the same PIS framework — when a player is ruled out, their contribution is removed and redistributed to their replacement using historical replacement-level baselines by position.
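In its simplest form, the injury adjustment swaps the inactive player's PIS for a replacement-level baseline at the same position. The sketch below assumes PIS is expressed in the same units as the team rating. The baseline values and the function name are made up for illustration.

```python
# Hypothetical replacement-level PIS by position (not published values).
REPLACEMENT_PIS = {"QB": 0.2, "WR": 0.1, "EDGE": 0.1}


def adjusted_rating(team_rating, inactive):
    """inactive: list of (position, player_pis) for players ruled out."""
    adjusted = team_rating
    for position, player_pis in inactive:
        # Remove the starter's contribution, add back a replacement-level one.
        adjusted += REPLACEMENT_PIS[position] - player_pis
    return adjusted


without_qb = adjusted_rating(3.0, [("QB", 2.0)])
```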

The Win Model

Win probability prediction uses a hybrid approach based on data availability:

Spread-available games (~96%): The opening line is the single most information-rich predictor of NFL game outcomes. For these games, Gridiron Edge converts the opening spread directly to a win probability using a normal distribution calibrated to NFL margin-of-victory distributions. DAEPA serves as a cross-check and edge-identification tool rather than the primary predictor.

No-spread games (~4%): A logistic regression model trained on DAEPA features generates the win probability for games without an opening line.

The value of DAEPA in the spread regime isn't in replacing the line — it's in identifying where the model disagrees with the market and why. A significant divergence between DAEPA-implied probability and spread-implied probability is the edge signal.
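The spread conversion and the divergence signal can be sketched with the standard library. The σ = 13.45 figure is the one quoted in the FAQ. The sign convention (negative spread means favored) and the function names are assumptions made for this example.

```python
from statistics import NormalDist

SIGMA = 13.45  # margin-of-victory standard deviation quoted in the FAQ


def spread_to_win_prob(spread):
    """spread: points for the team of interest, negative when favored (e.g. -3.5)."""
    return NormalDist(mu=0.0, sigma=SIGMA).cdf(-spread)


def divergence_pp(model_prob, spread):
    """Gap between a model probability and the spread-implied one, in points."""
    return 100.0 * (model_prob - spread_to_win_prob(spread))


favorite = spread_to_win_prob(-3.5)   # about a 60% win probability
gap = divergence_pp(0.68, -3.5)       # hypothetical DAEPA-implied 68%
```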

Walk-Forward Validation

Every performance metric published by Gridiron Edge comes from a strict walk-forward backtest: train on all data through season N, predict season N+1, advance one year, repeat. No future data ever influences a past prediction. No hindsight.
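The loop structure is easy to get subtly wrong, so here is a minimal sketch. The `fit` and `predict` callables are placeholders for whatever model is being tested; the toy "model" below just predicts the historical home-win rate.

```python
def walk_forward(games_by_season, fit, predict, first_test_season):
    """Train on every season before the test season, predict it, then advance."""
    seasons = sorted(games_by_season)
    results = []
    for test_season in seasons:
        if test_season < first_test_season:
            continue
        # Only strictly earlier seasons are visible to the model.
        train = [g for s in seasons if s < test_season for g in games_by_season[s]]
        if not train:
            continue
        model = fit(train)
        results.extend((g, predict(model, g)) for g in games_by_season[test_season])
    return results


# Toy data: each game is (season, home_win). Values are made up.
games = {
    2001: [(2001, 1), (2001, 1), (2001, 0), (2001, 1)],
    2002: [(2002, 1), (2002, 0)],
    2003: [(2003, 0), (2003, 0)],
}
fit = lambda train: sum(hw for _, hw in train) / len(train)
predict = lambda model, game: model
results = walk_forward(games, fit, predict, first_test_season=2002)
predictions = [p for _, p in results]
```

The 2002 predictions use only 2001 games, and the 2003 predictions use 2001 and 2002, so no test-season outcome ever feeds its own prediction.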

The backtest covers 5,934 games from 2003 through 2024 — the full era of modern NFL data with consistent play-by-play tracking.

  • 66.5% win prediction accuracy (vs. DVOA benchmark ~66%)
  • 0.616 log loss (penalizes confident wrong predictions)
  • +0.61pp average CLV (vs. closing line, 5,733 matched games)
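For reference, the first two metrics follow their standard definitions, sketched below. A model that always predicts 50% scores a log loss of ln 2 ≈ 0.693, which gives the 0.616 figure a baseline. The sample probabilities are made up.

```python
import math


def accuracy(probs, outcomes):
    """Share of games where the side given >= 50% actually won."""
    hits = sum((p >= 0.5) == bool(y) for p, y in zip(probs, outcomes))
    return hits / len(probs)


def log_loss(probs, outcomes, eps=1e-15):
    """Mean negative log-likelihood; confident misses are penalized heavily."""
    total = 0.0
    for p, y in zip(probs, outcomes):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total -= math.log(p) if y else math.log(1.0 - p)
    return total / len(probs)


coin_flip = log_loss([0.5, 0.5], [1, 0])
sample_accuracy = accuracy([0.7, 0.4, 0.6], [1, 0, 0])
```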

What This Means for Prediction Markets

Kalshi, Polymarket, and DraftKings all price NFL outcomes as independent markets. Sometimes they disagree — and those disagreements are signal. Gridiron Edge generates an independent probability estimate from DAEPA and the opening spread, compares it to each market's implied probability, and surfaces the largest gaps.

Edges are categorized as STRONG (≥12pp), HIGH (≥8pp), MODERATE (≥5pp), or LOW (<5pp) based on the magnitude of the model-market divergence. Only edges with directional agreement between the DAEPA signal and the spread signal are elevated to HIGH or STRONG. Cross-platform confirmation (two or more markets agreeing on the direction of mispricing) strengthens conviction further.
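A sketch of the tiering logic follows. Capping a non-agreeing edge at MODERATE is one reading of the agreement rule, not a confirmed implementation detail. The odds conversion is the usual American-odds formula and leaves the bookmaker's margin in place. Function names are illustrative.

```python
def american_to_prob(odds):
    """Implied probability from American odds; the vig is not removed."""
    if odds < 0:
        return -odds / (-odds + 100.0)
    return 100.0 / (odds + 100.0)


def edge_tier(model_prob, market_prob, signals_agree):
    """Classify the model-market gap; thresholds in percentage points."""
    edge_pp = round(abs(model_prob - market_prob) * 100.0, 6)
    if edge_pp >= 12:
        tier = "STRONG"
    elif edge_pp >= 8:
        tier = "HIGH"
    elif edge_pp >= 5:
        tier = "MODERATE"
    else:
        tier = "LOW"
    # Assumption: without DAEPA/spread agreement, cap the tier at MODERATE.
    if tier in ("STRONG", "HIGH") and not signals_agree:
        tier = "MODERATE"
    return tier


# A contract trading at 50 cents implies roughly a 50% probability.
example_tier = edge_tier(model_prob=0.65, market_prob=0.50, signals_agree=True)
```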

Market Coverage

The DAEPA engine feeds edge detection across every NFL market type we can find pricing for:

  • Moneyline — head-to-head winner, compared across Kalshi, Polymarket, and DraftKings
  • Spread — point-differential prediction vs. the opening and closing lines
  • Total — combined scoring over/under
  • Player props — passing, rushing, and receiving stat lines (Kalshi + DraftKings)
  • Win total futures — seasonal O/U per team, run via Monte Carlo over full schedules
  • Division, conference, Super Bowl — playoff bracket probabilities from a season-plus-postseason simulation
  • Season awards — MVP, ROY, OPOY, DPOY when Kalshi or Polymarket posts them

Not every market is live year-round. Preseason coverage emphasizes futures and win totals; in-season shifts to weekly games and player props. The backtest number above (66.5%) is the moneyline figure — spread, total, and prop performance is tracked separately as those product lines mature.

Frequently Asked Questions

What is DAEPA?

DAEPA stands for Defense-Adjusted Expected Points Added. It takes the standard EPA metric and applies an iterative opponent-weighting process — similar conceptually to PageRank — so that a play's value is scaled by the quality of the defense it was executed against. This makes DAEPA a better predictive tool than raw EPA, which treats all opponents equally.

How accurate is Gridiron Edge at predicting NFL games?

Gridiron Edge achieved 66.5% win prediction accuracy and a 0.616 log loss across a strict walk-forward backtest of 5,934 NFL games from 2003 through 2024. No future data was used to train past predictions — the model was trained on all data through season N and tested on season N+1, advancing one year at a time. This matches the Football Outsiders DVOA benchmark of approximately 66%.

What is PWR (the power rating)?

PWR is each team's rating in points per game above an average team on a neutral field. A team at PWR +3.5 is expected to win by about 3.5 points on a neutral field against a PWR 0 opponent; on a 17-game schedule that scales to roughly 10–11 wins. PWR decomposes into Off PR + Def PR + ST PR, the same three buckets, all measured on the same points scale so they can be added directly. Tiers (Elite ≥+7, Contender ≥+4, Playoff ≥+1, Average ≥−2, Below Avg ≥−5, Rebuild <−5) translate PWR into a plain-English bucket.

Why did GER get retired in favor of PWR?

GER (the legacy composite Gridiron Edge Rating) summed three DAEPA components on a z-score×100 scale, which normalized each component to the same variance. That meant special-teams was artificially weighted equal to offense or defense in the sum, producing rankings where ST-strong teams ranked unexpectedly high. PWR fixes this structurally by working on the native points-per-game scale, where league-wide range is roughly ±10 points for offense and defense vs. ±1.4 points for special teams — so summing produces a defensible total. GER may still surface in older articles; PWR is the current canonical team rating.

How does the Gridiron Edge win model use the opening spread?

For games with an opening spread (~96% of cases), Gridiron Edge converts the opening line directly to a win probability using a normal distribution calibrated to NFL margin-of-victory distributions (σ = 13.45 points). DAEPA serves as a cross-check — when the DAEPA-implied probability diverges significantly from the spread-implied probability, that gap is the edge signal. For games without a spread (~4%), a logistic regression model trained on DAEPA features generates the win probability.

What is a Player Importance Score (PIS)?

PIS quantifies how much an individual player's presence or absence shifts a team's expected DAEPA. It is derived from the player's historical DAEPA contribution in similar roles, weighted by snap share and positional leverage. PIS is used for two purposes: pre-game injury adjustments (removing a player's contribution when they are ruled out) and off-season roster projections (quantifying the impact of free agent departures and additions).

How are prediction market edges calculated?

Gridiron Edge generates an independent win probability estimate using DAEPA and the opening spread. That probability is compared to the contract price on Kalshi, Polymarket, and the DraftKings line — each a market-implied probability. The gap is the edge, expressed in percentage points. Edges are categorized as STRONG (≥12pp), HIGH (≥8pp), MODERATE (≥5pp), or LOW (<5pp). Only edges where the DAEPA signal and spread signal agree are elevated to HIGH or STRONG. Cross-platform confirmation — two or more markets agreeing on the direction of mispricing — strengthens conviction.

Which NFL markets does Gridiron Edge cover?

Game markets (moneyline, spread, total), player props (passing, rushing, receiving stat lines), win total futures, division and conference winners, Super Bowl champion, and season awards like MVP, ROY, OPOY, and DPOY. Coverage tracks what Kalshi, Polymarket, and DraftKings post — some markets are only live seasonally.

References & Data Sources

Methodology current as of the 2025 NFL season, last reviewed April 2026. The engine covers team ratings, game win probabilities, player props, and seasonal futures; playoff-bracket and cross-platform prop coverage expand through the 2026 season. Model parameters are subject to revision if underlying data sources or NFL structural dynamics change materially. Proprietary parameters are not disclosed.