Model Description: Expected Points

When you score, how much more likely are you to take points from the game you're playing? How much less likely are you to take points if you're scored on? What if you take a penalty? I made an expected standings points model to answer these questions. I don't love it enough to name it, unlike some of my other models. It is just "my expected [standings] points model".

Once you have such a model, you can use it to describe how a game was won or lost. Take, for instance, opening night of 2015-2016, when Montreal (road team, in green) played Toronto (home team, in blue).

The Leafs start out slightly higher, at around 1.15 points; they're at home and home teams win more. The Habs start out at around 1 expected point. The Canadiens score early and jump up to almost 1.5 points; a one-goal lead is good, but this early it's hardly enough to be very sure of points. They take a penalty soon after and drop some expected points, but they kill it off and go back to 1.5 points. Close to the 20 minute mark they take another penalty and this time the Leafs score, putting them back on top; the Leafs draw another penalty around 28 minutes but then take one shortly after. A long stretch of open play sees both lines rise slowly together: as the game wears on, overtime is more and more likely, where both teams will get points. Just over half-way through the third, however, the Habs score to take a 2-1 lead; this goal is much later in the game and so drops the Leafs' expected points much more than the 1-0 goal did. The Leafs pull their goalie, which gives them a slight extra chance, but the Habs hold on to win, and the final score is shown at the end of the graphs: two points for Montreal, zero points for Toronto. In fact the Canadiens score an empty net goal, but it happens so late and affects the outcome so faintly that it barely registers on the chart.

I like to be able to tell stories like that; so the question is: where do the curves in the above graph come from? This is what this article is for.

Model Inputs

I directly consider three kinds of inputs: the venue, the score difference, and the game time. The venue is either "home" or "away"; the score difference is tied, up or down one, up or down two, or up or down three or more; and the game time is the number of seconds since the game began.

For each regular season game since 2007-2008, I record, for the inputs above, the game time and the number of standings points the home team eventually obtained, with overtime and shootout results treated as having conferred 1.5 points to each team, since I consider such post-regulation results to be irrelevant to this model. Thus for each triple of venue, score difference, and game time $x$, I can compute the average number $y$ of points obtained.
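The averaging step can be sketched roughly as follows. This is only an illustration: the function names, the tuple layout, and the sample observations are made up for the example and are not the data or code behind the model.

```python
from collections import defaultdict


def clamp_score_diff(diff):
    """Collapse leads or deficits of three or more goals into +3 / -3."""
    return max(-3, min(3, diff))


def average_points(observations):
    """Average eventual standings points for each (venue, score_diff, seconds) triple.

    `observations` is an iterable of (venue, score_diff, seconds, points) tuples
    (a hypothetical layout), where `points` already uses the 1.5-point convention
    for games decided after regulation.
    """
    totals = defaultdict(lambda: [0.0, 0])  # key -> [sum of points, count]
    for venue, diff, seconds, points in observations:
        key = (venue, clamp_score_diff(diff), seconds)
        totals[key][0] += points
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


# Made-up observations, for illustration only (not real game data).
sample = [
    ("home", 1, 1200, 2.0),
    ("home", 1, 1200, 1.5),
    ("home", 4, 3000, 2.0),  # a four-goal lead is binned with "three or more"
    ("home", 3, 3000, 2.0),
]
averages = average_points(sample)
```

Each value in `averages` plays the role of $y$ for its triple; the game time in the key is $x$.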

From this data I fit a logistic function of the following form: $$ y \sim A\left(\frac{1}{1 + \exp(-k(x-x_0))}-\frac{1}{2}\right)+y_0 $$ The "midpoints" $(x_0,y_0)$ of these curves are constrained to be at $x_0 = 3600$ and $y_0$ according to the below:

When the home score is greater, the home curve is constrained to $y_0 = 2$ and the away curve is constrained to $y_0 = 0$.
When the home score is lesser, the home curve is constrained to $y_0 = 0$ and the away curve is constrained to $y_0 = 2$.
When the scores are tied, both curves are constrained to $y_0 = 1.5$.
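To make the shape of these curves concrete, here is a minimal sketch that evaluates the logistic form with the midpoint constraints above. It assumes the score difference is measured from the point of view of the team whose curve is being drawn, and the values of `A` and `k` passed in at the bottom are placeholders, not fitted values.

```python
import math

X0 = 3600  # constrained midpoint in seconds: the end of regulation


def midpoint_points(score_diff):
    """Constrained y0, with score_diff taken from the team's own perspective."""
    if score_diff > 0:
        return 2.0  # leading at the end of regulation
    if score_diff < 0:
        return 0.0  # trailing at the end of regulation
    return 1.5      # tied: post-regulation results count as 1.5 points each


def expected_points(seconds, score_diff, A, k):
    """Evaluate y = A * (1 / (1 + exp(-k (x - x0))) - 1/2) + y0."""
    logistic = 1.0 / (1.0 + math.exp(-k * (seconds - X0)))
    return A * (logistic - 0.5) + midpoint_points(score_diff)


# Placeholder parameters, chosen only to show the behaviour of the curve.
early = expected_points(600, 1, A=1.0, k=0.001)
at_horn = expected_points(3600, 1, A=1.0, k=0.001)
```

With positive `A` and `k`, the curve for a leading team climbs toward its constrained value of 2 as the clock runs down, which matches the intuition that a late lead is worth more than an early one.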

The fitting produces the $A$ and $k$ which describe the data at hand, namely:

Model Usage

For a given game we would like to use these curves to build the chart that I used as an example above, among other things. In particular, we would like to adjust these curves based on skater number difference, since we know that having more skaters than the other team makes it more likely that a given team will score and thus more likely that they will finish the game with more standings points. To estimate this, I compute the historical two-minute goal probability for teams up one skater, namely 21.4%, and the same for a two-skater advantage, 67.7%. Then to be up one skater in a given situation is to be 21.4% of the way to being in the same situation but with one extra goal; producing the deviations seen above. (A truly sophisticated approach would also include the number of seconds expected to remain in the skater number mismatch, but I have not done this at this time.)
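The interpolation described above can be written in a few lines. In this sketch the function and argument names are hypothetical, the input point values are invented, and the handling of a skater disadvantage (moving toward the situation with one goal against) is a symmetric assumption that the text does not spell out.

```python
# Historical two-minute goal probabilities quoted above, keyed by skater advantage.
GOAL_PROBABILITY = {1: 0.214, 2: 0.677}


def adjust_for_skaters(points_now, points_up_a_goal, points_down_a_goal, skater_diff):
    """Shift expected points part of the way toward the one-goal-different situation.

    points_now         -- expected points at even strength in the current situation
    points_up_a_goal   -- expected points in the same situation with one extra goal for
    points_down_a_goal -- same situation with one extra goal against (assumed symmetric)
    skater_diff        -- skaters for minus skaters against
    """
    if skater_diff == 0:
        return points_now
    weight = GOAL_PROBABILITY[min(abs(skater_diff), 2)]
    target = points_up_a_goal if skater_diff > 0 else points_down_a_goal
    return points_now + weight * (target - points_now)


# Invented numbers: 1.0 points now, 1.5 with an extra goal, 0.5 with one against.
on_power_play = adjust_for_skaters(1.0, 1.5, 0.5, skater_diff=1)
short_handed = adjust_for_skaters(1.0, 1.5, 0.5, skater_diff=-1)
```

Because the weight does not depend on how much time is left in the advantage, this is the simple version; a time-aware weight would replace the fixed lookup.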

I treat extra-skater substitutions (a pulled goalie) in the same way as skater number advantages produced by penalties.

Expected Standings Points

August 16, 2022, Micah Blake McCurdy, @IneffectiveMath
