analytic.football - Model Explainer

So why build a model anyway?

Trying to figure out who is good and who has been lucky is to my mind inherently interesting, if only to attempt to put the Leeds anxiety in my brain to rest. We all love table sorting! But that comes with lots of caveats, and by building a model you can account for many of the common ones before sorting your tables.

At its core, the model (https://analytic.football) makes predictions about a match using its ratings, and then uses the errors in those predictions to update the ratings for the teams involved. It then uses these new ratings the next time it makes a prediction, and so on.
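As a rough sketch of that loop in Python (the class, the prediction form and the learning rate here are illustrative stand-ins, not the model's actual internals):

LEARNING_RATE = 0.1  # illustrative step size, not the model's real gain

class Team:
    def __init__(self, attack=1.4, defence=1.4):
        self.attack = attack    # expected xG created per 90
        self.defence = defence  # expected xG conceded per 90

def predict_xg(attack, defence, league_avg=1.4):
    # One simple way to combine two ratings into an xG prediction.
    return attack * defence / league_avg

def process_match(home, away, home_xg, away_xg):
    pred_home = predict_xg(home.attack, away.defence)
    pred_away = predict_xg(away.attack, home.defence)
    # The errors in these predictions move the ratings, which are then
    # used the next time either team plays.
    home.attack += LEARNING_RATE * (home_xg - pred_home)
    away.defence += LEARNING_RATE * (home_xg - pred_home)
    away.attack += LEARNING_RATE * (away_xg - pred_away)
    home.defence += LEARNING_RATE * (away_xg - pred_away)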

The model uses relatively few inputs (non-penalty xG, penalties, goals, red cards) and is aimed at squeezing the most information possible out of these. Experiments with non-shot xG, regressing to bookie predictions between seasons, squad value or wage bill all yielded pretty poor results for me. As a bonus, this allows for continuous ratings from season to season (with only a small adjustment required for different strengths of new vs relegated teams).

Penalties & red cards: adjusting xG

When a match is played, we make a couple of adjustments before updating the ratings. Penalties get down-weighted to ~34% of their value (~0.27 xG, and ~0.34 or 0 goals) in the match totals. The idea is that there is some signal for future performance in teams winning penalties, but on a simple table sort you're also looking at roughly half an xG of noise per penalty.
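In code, the adjustment looks something like this (the ~0.79 xG value of a penalty kick is my assumption; only the ~34% weight comes from the model description):

PENALTY_WEIGHT = 0.34   # share of a penalty's value the model keeps
PENALTY_XG = 0.79       # assumed xG value of a penalty kick

def adjusted_totals(np_xg, goals, pens_taken, pens_scored):
    # Each penalty taken adds ~0.27 xG; each penalty scored counts as ~0.34 goals.
    adj_xg = np_xg + pens_taken * PENALTY_XG * PENALTY_WEIGHT
    adj_goals = goals - pens_scored + pens_scored * PENALTY_WEIGHT
    return adj_xg, adj_goals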

Teams also get credited or penalised for any red cards in the match. The model expects a team playing 11v10 to create an extra +0.7 xG/90 and to concede -0.3 xG/90; the totals for the match are adjusted to reflect the game as if it had been 11v11, without removing any data from the sample.
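A sketch of that adjustment, assuming the benefit is pro-rated by the minutes spent with the man advantage (the pro-rating is my assumption; the +0.7 and -0.3 per-90 figures are the model's):

XG_FOR_BONUS_PER_90 = 0.7       # extra xG created while 11v10
XG_AGAINST_BONUS_PER_90 = -0.3  # change in xG conceded while 11v10

def as_if_eleven_v_eleven(xg_for, xg_against, minutes_up_a_man):
    share = minutes_up_a_man / 90
    # Remove the expected benefit of the man advantage from the match totals.
    return (xg_for - XG_FOR_BONUS_PER_90 * share,
            xg_against - XG_AGAINST_BONUS_PER_90 * share)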

Building a rating: separating chance creation & finishing

While we see finishing behave noisily at a season-to-season level, we also have well-documented “superteam” effects, and there are plausible reasons to think some teams may be able to create chances - particularly from set pieces - whose quality is hard for xG models to reflect. Blending goals into a model is therefore commonplace, typically at a ~30% proportion.

I take a different approach, and directly model both a team’s ability to create chances and its finishing efficiency. This allows the model to move at different speeds when weighting the xG and goals outcomes, in a way that blending the input data wouldn’t.

Putting it together, a team’s rating is therefore:

Team.attack_rating = Team.attack_chance_creation_rating x Team.attack_finishing_efficiency + CONSTANT

Here, the constant term (approx 0.03) is a small value added to all teams’ attack & defence ratings, representing the part of the penalty value the model chooses not to include. As it isn’t affected by any matches, this is the last you’ll hear of it.
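For illustration (with made-up numbers): a team with a chance creation rating of 1.50 xG per 90 and a finishing efficiency of +5% would carry an attack rating of 1.50 x 1.05 + 0.03 ≈ 1.61.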

Updating the finishing rating

For each team the model keeps an exponential moving average of goals scored, xG created, goals conceded and xG conceded. After each match the model adjusts the team’s finishing efficiency ratings some proportion towards the observed G/xG scored and G/xG conceded.

The model wants to move faster on finishing in attack than in defence. It uses a smaller sample to update towards, with a half-life of approximately 11 games for the attacking moving average vs ~59 in defence.

The ratings themselves move relatively slowly though. Since they are tracking the ratio, you can think of them as a moving average of a moving average, with the attack having a half-life of 70 games and defence 75 games.
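A sketch of this two-level update, assuming a standard half-life-to-weight conversion (the exact update form is a guess; the half-lives are the ones quoted above, defaulting here to the attacking values):

from dataclasses import dataclass

@dataclass
class FinishingState:
    goals_ema: float = 1.4   # EMA of goals per match
    xg_ema: float = 1.4      # EMA of xG per match
    rating: float = 1.0      # finishing efficiency, 1.05 = +5%

def ema_weight(half_life):
    # Per-match weight so that a sample's influence halves after `half_life` games.
    return 1 - 0.5 ** (1 / half_life)

def update_finishing(state, match_goals, match_xg,
                     sample_half_life=11, rating_half_life=70):
    a = ema_weight(sample_half_life)
    state.goals_ema += a * (match_goals - state.goals_ema)
    state.xg_ema += a * (match_xg - state.xg_ema)
    # The rating then drifts towards the observed G/xG ratio at its own, slower pace.
    b = ema_weight(rating_half_life)
    state.rating += b * (state.goals_ema / state.xg_ema - state.rating)

The defensive version would use ~59 and 75 games for the two half-lives.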

The upshot is the model will give you a bunch of credit if you produce outlier finishing in attack for 2-3 years, or in defence for 4-5 years, and it will demand you keep showing it the finishing is real.

To give you an idea of the impact, values at the time of writing range from +15% (Arsenal, Man City) to -13% (Everton) in attacking finishing efficiency, and from +8% (Leeds - we will get to promoted teams later) or +6% (Forest, the worst of the non-promoted teams) down to -3% (Man City) in defensive finishing efficiency.

Updating the chance creation rating

Here we once again split the rating into two. A team’s chance creation rating is made by blending two pieces: a short-term rating, updated directly from the match errors, and a long-term rating, which updates each match towards the short-term rating. This allows the model to separate out how quickly it wants to respond to new information from how quickly it wants to throw away old information.

The long-term rating makes up approximately 25-30% of the total rating. This ballast, which moves slowly & keeps old information in the rating, can provide some of the potential benefits of regressing to wage bill - where historical performance has predictive value. It means the model is automatically more sceptical of a team’s performance the further it is from its historical norm (& will thus force it to perform at a higher level to maintain its rating). This inbuilt smoothing allows the model to remain responsive to short-term changes without spiking wildly.
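In rough code, with an illustrative split inside the quoted 25-30% range:

LONG_TERM_SHARE = 0.27  # illustrative; the text gives a 25-30% range

def chance_creation_rating(short_term, long_term):
    # The slow-moving long-term piece acts as ballast on the blended rating.
    return (1 - LONG_TERM_SHARE) * short_term + LONG_TERM_SHARE * long_term

def update_long_term(long_term, short_term, half_life):
    # Each match the long-term rating drifts towards the short-term one.
    alpha = 1 - 0.5 ** (1 / half_life)
    return long_term + alpha * (short_term - long_term)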

Updating the short-term rating

To update the short-term rating, the model makes a prediction for the xG created or conceded for each team, based on that team’s short-term chance creation rating against the opposition’s chance creation rating. The model then compares this to the xG outturn, and generates 4 errors (home xG for/against, away xG for/against).

This match error is added, on the defence side only, to 1.13x the exponential moving average of previous prediction errors, which gives the model “momentum”: it moves faster when it is consistently wrong, and is stabilised when a team throws in a one-off performance.

The model then updates the ratings based on this combined error, and hey presto, you have new short-term ratings. The half-lives for these are relatively quick - 12 games in attack and 15 in defence - stabilised by the long-term ballast moving at a snail’s pace: a 111-game half-life in attack and 42 games in defence.
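Pulling the short-term update together in sketch form (the 1.13 momentum factor and the half-lives are from the text; the exact gain and EMA bookkeeping are my assumptions):

def update_short_term(rating, error_ema, match_error,
                      half_life, momentum=1.13):
    # Combine this match's prediction error with the running error average
    # (the momentum term only applies on the defence side in the real model).
    combined = match_error + momentum * error_ema
    alpha = 1 - 0.5 ** (1 / half_life)
    new_rating = rating + alpha * combined
    # Fold the latest error into the EMA ready for the next match.
    new_error_ema = error_ema + alpha * (match_error - error_ema)
    return new_rating, new_error_ema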

Handling promoted teams

Obviously with promoted teams we have no prior-season PL data to construct a rating from. I use a regression based on Championship xG for/against and spread market predicted points totals to generate initial ratings for their chance creation. The model holds weaker priors on a promoted team’s defence as a result of the momentum in the short-term ratings: with fewer matches in the average, it will respond more quickly to large errors in the initial rating.

All promoted teams start with a finishing rating of -6.5% in attack and +8% in defence, based on the historical average for promoted teams. Again, the model holds a weakened prior here: the initial EMA values (G = xG) mean the ratings will start to update towards 1 in the absence of further input.
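A sketch of the initialisation (the regression itself is fitted elsewhere and the coefficient names are placeholders; the -6.5% / +8% finishing starts are the figures above):

def init_promoted_team(champ_xg_for_90, champ_xg_against_90, market_points,
                       attack_coefs, defence_coefs):
    # Chance creation priors come from a regression on Championship xG and
    # the spread market's predicted points total.
    a0, a1, a2 = attack_coefs
    d0, d1, d2 = defence_coefs
    return {
        "attack_creation": a0 + a1 * champ_xg_for_90 + a2 * market_points,
        "defence_creation": d0 + d1 * champ_xg_against_90 + d2 * market_points,
        # Finishing priors from the historical promoted-team averages.
        "attack_finishing": 1 - 0.065,
        "defence_finishing": 1 + 0.08,
    }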