The historically boring 2019 grand final, and the analytical press demonstrating that most grand finals haven’t been the most exciting games, led me to wonder whether games could be ranked by how exciting they were.
The last few minutes of a close game are one of the many appeals of Aussie rules. Among sports which use a clock to keep time, the back-and-forth nature of a close AFL game makes for incredibly exciting finishes. The “excitement” of a game, then, could be tracked using the game’s leverage.
I’m approaching the concept of leverage from the writings on kenpom.com, specifically Nic Reiner’s posts about tense NCAA tournament games and unbeaten teams, now from several years ago. In the footy context, leverage measures how much is at stake in a given minute, defined here as: how much would the win probability change if either team kicked a goal in that specific minute?
To measure this, I decided to write a simulation which would generate both a running winning percentage and a running leverage percentage.
A very similar calculation has already been done by Aflalytics in measuring the game’s tension. The findings from the simulator I wrote and the equation used over on that site end up matching quite closely.
I also want to highlight two things – I’m currently writing a footy simulator, Australian Football Coach – and in order to create more game data for grand finals, I have recently also captured the scoring plays from the 1990 and 1999 grand finals and sent them into AFL Tables. If you have raw video of old AFL games (before 2001), consider helping add to the database of games we have full scoring summaries for – I’ll write a further blog post about capturing this data.
Collecting the data
Before the simulator could be written, I needed game data. I ended up writing a Python script which doesn’t automatically scrape afltables.com. Instead, it takes the scoring table which AFL Tables generates at the end of each game, copied directly from the site and pasted into a text file, and turns that raw data into a format the simulator can use. It’s likely there is a proper open database I just haven’t stumbled upon, but the script still has its uses. I’ll host the script on git if it’s of interest to anyone – please get in touch on Twitter (@thejohnholden) or through the Contacts page and I’ll put in the work to make it reusable.
Creating the simulator
While I fully admit a simulation pales in elegance next to a proper equation, like the one used over at Aflalytics, I love writing simulators. My footy simulator is a collection of many different micro-simulations, and I’ve also done work writing quick-running AI simulations for some of the New Star games, especially the old and now hard to find New Star Tennis.
I needed the simulation’s inputs to be as simple as possible: a match file which tells the simulator when points were scored, how, and by which team, plus an input winning percentage. The match events were generated by the above script, and the winning percentage was taken from Squiggle AFL. I only ran 11 games to start – the 2017, 2018, and 2019 grand finals, and the remainder of the 2019 finals series – partially because the simulation’s a bit slow, and partially because I got bored manually collecting game data. There was a simulator to write!
The simulation works similarly to the simulation I wrote to check the winning percentage of the GWS-Collingwood game. For each minute of the game, the simulation checks whether any scoring shots were kicked in real life, adds them to the score, and then runs three different simulations 10,000 times each: one using the current score, one where the winning team kicks an additional goal in that minute, and one where the losing team kicks an additional goal in that minute. To calculate leverage, I take the absolute value of the difference between the winning percentages from the two simulations with an added goal.
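As a rough sketch of the leverage step only (this is not the simulator's actual code), the calculation can be written as a small function that takes any win-probability estimator. The names `leverage` and `toy_win_prob` are made up for illustration, and `toy_win_prob` is a hypothetical logistic placeholder standing in for the 10,000-run simulation.

```python
import math

GOAL = 6  # points for a goal in Aussie rules

def leverage(win_prob, margin, minutes_left):
    """Gap in win probability between 'leader kicks a goal' and 'trailer kicks a goal'."""
    wp_leader_goal = win_prob(margin + GOAL, minutes_left)
    wp_trailer_goal = win_prob(margin - GOAL, minutes_left)
    return abs(wp_leader_goal - wp_trailer_goal)

def toy_win_prob(margin, minutes_left):
    """Hypothetical placeholder, NOT the post's simulator: a logistic curve
    in the leader's margin that gets steeper as time runs out."""
    spread = 3.0 * math.sqrt(minutes_left + 1)
    return 1.0 / (1.0 + math.exp(-margin / spread))

# Compare a tight finish with a blowout, both with five minutes left.
close_late = leverage(toy_win_prob, margin=2, minutes_left=5)
blowout_late = leverage(toy_win_prob, margin=60, minutes_left=5)
print(f"close game: {close_late:.1%}, blowout: {blowout_late:.1%}")
```

Passing the estimator in as an argument keeps the leverage definition separate from however the winning percentage is produced, whether that is a simulation or an equation.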
To simulate the games, I set the chance a scoring shot would be kicked in a given minute at approximately 50%. I then determined whether the scoring shot was a goal or a behind using the ratio of goals to behinds from the 2019 season, which works out to a 23 in 40 chance of a goal. Finally, I used the initial winning percentage to decide whether the winning or losing team kicked the scoring shot.
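The minute-by-minute model can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the real simulator: the function names are invented, `shot_share` (team A's share of scoring shots) is passed in directly rather than derived from the winning percentage, and a drawn result is counted as half a win, which the post does not specify.

```python
import random

SHOT_CHANCE = 0.5       # chance of a scoring shot in any one minute (from the post)
GOAL_CHANCE = 23 / 40   # chance a scoring shot is a goal (from the post)
GOAL, BEHIND = 6, 1     # points for a goal and a behind

def simulate_rest(margin, minutes_left, shot_share, rng):
    """Play out the remaining minutes once and return team A's final margin."""
    for _ in range(minutes_left):
        if rng.random() >= SHOT_CHANCE:
            continue  # no scoring shot this minute
        points = GOAL if rng.random() < GOAL_CHANCE else BEHIND
        # shot_share is the chance the shot belongs to team A
        margin += points if rng.random() < shot_share else -points
    return margin

def win_probability(margin, minutes_left, shot_share, runs=10_000, seed=0):
    """Estimate team A's win chance by repeating the play-out many times."""
    rng = random.Random(seed)
    wins = 0.0
    for _ in range(runs):
        final = simulate_rest(margin, minutes_left, shot_share, rng)
        if final > 0:
            wins += 1.0
        elif final == 0:
            wins += 0.5  # assumption: a draw counts as half a win
    return wins / runs

# Team A leads by 3 points with 5 minutes left and an even share of shots.
print(win_probability(margin=3, minutes_left=5, shot_share=0.5))
```

Like the model described above, this allows at most one scoring shot per minute.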
That last step proved slightly difficult, as I needed to calculate what share of scoring shots a team with a given winning percentage would be expected to kick. My initial calculations were incorrect, and I ended up having to graph the expected winning percentage against the game-generated winning percentage in LibreOffice Calc. It turns out the y-intercept is 37.9%, meaning a team with a 1% initial winning percentage will still generate 38% of the scoring shots in the simulation. Fortunately, it now simulates out properly. Perhaps the biggest issue with the simulation is that it does not allow multiple scoring shots in a minute. Eyeballing the raw data, this happens about once a game, so I’m not too concerned with this assumption.
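To make the 37.9% figure concrete, here is a hedged sketch of what a straight-line mapping from winning percentage to scoring-shot share might look like. Only the intercept comes from the fit described above. The slope is my own assumption, chosen so that a 50% team gets exactly half the shots, and the function name `shot_share` is hypothetical.

```python
INTERCEPT = 0.379  # y-intercept reported in the post

# Assumption (not from the post): the line passes through (0.5, 0.5),
# so an even-money game splits the scoring shots evenly.
SLOPE = (0.5 - INTERCEPT) / 0.5

def shot_share(win_pct):
    """Estimated share of scoring shots for a team with a pre-game win chance in [0, 1]."""
    return INTERCEPT + SLOPE * win_pct

for p in (0.01, 0.40, 0.50, 0.60):
    print(f"win chance {p:.0%} -> shot share {shot_share(p):.1%}")
```

Under that assumption the two teams' shares always add up to 100%, and a 1% team lands at roughly 38% of the scoring shots, in line with the figure quoted above.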
The results
Of our 11-game set, the most exciting game was the 2019 semi-final between Brisbane and GWS. The Giants had only a 40% win chance and ended up winning by three points after kicking the winning goal with five minutes left in the match. The Giants jumped out to a 24-0 lead by kicking the first four goals, but wound up behind 32-26 at quarter time. This variation meant that the most interesting first quarter in the data set, based on average leverage, was instead the GWS-Bulldogs game in the first round of finals. Only five goals were kicked in that quarter, but with one exception the teams traded goals before GWS ended up breaking the game open in the third quarter. The second most exciting game was the 2018 grand final. It trails Brisbane-GWS in part because that game was much more back-and-forth: as Magpies fans are all too well aware, Collingwood jumped out to a healthy lead in the opening part of the 2018 decider, which had a negative impact on the leverage.
The least interesting game actually wasn’t the 2019 grand final, even though that game had by far the most boring second half of any game in the data set. Because GWS kept the first quarter of the grand final relatively close, the West Coast-Essendon blowout has a lower leverage score, mostly because the Eagles jumped on the Bombers faster than the Tigers pulled away from the Giants. Five of the 11 games had an average fourth quarter leverage of less than 1%: the 2017 and 2019 grand finals, the Richmond-Brisbane first round final, GWS-Bulldogs as mentioned above, and West Coast-Essendon.
Here are three examples of different types of games: the very close 2018 grand final, the historical blowout 2019 grand final, and the most average game out of the 11, the Collingwood-Geelong game where neither team scored for roughly 20 minutes. The red line is the leverage, the dotted line is the estimated winning percentage, and the x-axis is minutes since the start of the game.
Finally, here’s a table of all 11 games which have been simulated, ranked by average quarter leverage and showing average leverage within each quarter. Note the 0% Q4 leverage for the 2019 grand final. This means that, in the simulation, an extra goal to either team at any point in that quarter would not have moved the winning percentage at all:
Game | Average Qtr Leverage | Q1 | Q2 | Q3 | Q4 |
2019 SF BRI-GWS | 25.02% | 16.00% | 18.92% | 20.98% | 44.18% |
2018 GF | 22.76% | 12.35% | 14.06% | 22.79% | 41.84% |
2019 PF GWS-COL | 20.45% | 14.41% | 18.01% | 18.48% | 30.89% |
2019 PF RIC-GEE | 19.53% | 15.53% | 17.82% | 23.58% | 21.17% |
2019 SF GEE-WCE | 15.36% | 11.09% | 8.24% | 24.55% | 17.56% |
2019 QF COL-GEE | 12.93% | 15.99% | 12.21% | 14.53% | 9.00% |
2019 EF GWS-WB | 12.12% | 16.40% | 15.87% | 16.03% | 0.18% |
2017 GF | 11.68% | 15.91% | 18.31% | 11.76% | 0.75% |
2019 QF RIC-BRI | 11.13% | 16.39% | 19.17% | 8.40% | 0.55% |
2019 GF | 5.97% | 14.88% | 8.28% | 0.73% | 0.00% |
2019 EF WCE-ESS | 5.02% | 12.13% | 3.86% | 3.87% | 0.21% |
Next steps
I’d like to generate leverage outcomes for more games, but I’m looking for pre-game winning percentages over a longer period of time (Squiggle only goes back to 2017). I also need to generate a larger data set of games to run. Ultimately I want to use the simulator to determine which players kick goals at the most opportune times (though this data set may already exist somewhere) and write a piece to test my hypothesis that grand finals tend not to be the most exciting games. If you have any other ideas about how this leverage simulator can be used, please get in touch.