Double Fault: The multiple failures of the Djokovic visa story

Not finding the thing I’d like to read often compels me to write. In this case, finding decent discourse surrounding the Djokovic affair has been a struggle, thanks both to political spin and to the media’s need to simplify the story. The story itself? A complete train wreck. It often takes multiple actors to send something well-engineered off the rails. As a lawyer (not in Australia, though) and former multiple visa applicant who likes the odd disaster story, I felt compelled to write what I wanted to read. Please note that in this analysis I’m often reading between the lines. While I think it’s correct, since I’ve read most of the legal documentation and a lot of articles, there are a lot of unknowns, and I’ve tried to identify areas of uncertainty where I can.

I’ve also listed everything out at the bottom for simplicity’s sake.

The thing I’m most surprised by in the story is that Djokovic isn’t completely to blame here. Of course, had he been vaccinated, the story would never have occurred, and we’d all be focusing on a tennis tournament. He is not excused from blame. However, he also appears to be a victim in the story, as he bears the brunt of the failures, most notably the failure to obtain the proper documentation that would have let him into the country.

The visa and exemption
As of 15 December 2021, it is fairly easy to enter Australia. If you’re vaccinated, you need a valid visa, and possibly an individual exemption, though most visas now come with an automatic exemption for the vaccinated.

If you are unvaccinated, however, you must have both a visa and a federal individual exemption. Furthermore, you may be subject to quarantine, handled by the states, but you can apply for a quarantine exemption too. That’s two separate exemptions: you apply for the first on the federal immigration website and the second at coronavirus.vic.gov.au. This is an important detail – the “medical exemption” exists only on the quarantine side.

Tennis players enter Australia on a short-term work visa (a 408-class). It appears Tennis Australia co-ordinates the visas with the federal government and, due to the quarantine requirements, the state government. Having applied for a (non-Australian) visa myself during the age of COVID-19, I do not believe the visa application discussed any vaccination or health requirements. It’s very possible the application asked no questions about vaccination status, instead leaving all of that to the exemptions – I’d be surprised if the visa and exemption rules weren’t separated out. The answer to the “why was he issued a visa at all?” question asked by supporters and politicians alike is therefore simple: Djokovic was eligible for one. In times of COVID-19, though, a visa grant alone isn’t enough for someone to legally enter Australia.

In order to enter Australia as an unvaccinated person, you need an individual exemption. Based on Novak’s interview with the Border Force, it doesn’t appear anyone (Djokovic or Tennis Australia) applied for the federal exemption for Novak to enter the country. Several categories exist where you’re exempt without needing to apply for an individual exemption, but these are for citizens, New Zealanders/Singaporeans/Koreans/Japanese, airline crew, diplomats, 188-class visa holders(!), and fully vaccinated people with a valid visa. None of them apply to Djokovic, so he would need to apply. (The one possible exemption is for 408 visas in the “Post COVID-19 Economic Recovery Event” stream and supported by a task force, but while I can’t disprove it applying here, I don’t think that rule applies.)

The individual exemptions the unvaccinated can apply for are more limited in scope. The relevant one here appears to be “a foreign national whose entry into Australia would be in the national interest, supported by the Australian Government or a state or territory government authority.” Tennis players aren’t entering the country as students, as part of a COVID-19 response team, on a prospective marriage visa, or for compassionate reasons, so the remaining categories don’t really apply. Still, applying the relevant text, the Victorian government could have sponsored Djokovic for the federal individual exemption.

Instead, Victoria granted Djokovic a “medical exemption” from quarantine. To get it, Tennis Australia clearly co-ordinated with the state government, which held that catching COVID-19 in the last six months qualified (Djokovic’s lawyers produced letters to this effect saying quarantine wouldn’t be necessary once he got into the country). Those letters are potentially confusing, as the Victorian government used the phrase “quarantine-free entry,” which could imply not just that you don’t have to quarantine, but that you can also enter the country without issue. And since Victoria could have sponsored him for an individual exemption, it’s possible someone, whether it be Djokovic or Tennis Australia, thought this was enough. This may also be why Prime Minister Scott Morrison said it was up to Victoria, i.e. the federal government wouldn’t be the ones granting the exemption.

However, if you want the exemption, you clearly have to apply for it online, and it’s not clear (though it seems unlikely) that an exemption from quarantine would also trigger sponsorship of a federal individual exemption to enter the country.

It’s also clear from the Border Force interview that Djokovic had no idea he had the incorrect documentation to enter Australia, which I believe is more on Tennis Australia (or his agent) than on Djokovic. While this responsibility normally falls on the person trying to enter the country, it’s clear from the provided documentation that Tennis Australia was handling at least some of the visa issues on his behalf, as they likely do on an annual basis. Djokovic also confirmed an agent handled at least some of his immigration applications. For this reason I think he’s at least somewhat of a victim. Yes, he’s not as much a victim as he is the story’s agitator, but consider the alternative: had he or Tennis Australia applied for the federal individual exemption, the story would have been the government either granting or denying his application before his journey (and it likely would have been denied unless the Victorian government sponsored him, based on the available criteria). This would still have hit the news cycle, but not in the way it did.

It’s also possible Djokovic lied on the form, but I haven’t seen any official indication of this, and it doesn’t matter anyway.

The Border Force and Appeal
The first error was that Djokovic should not have been able to board the flight to Australia. I can’t speak to why he was able to do so, or why others with the exact same issue were also able to – the government website says you would be denied boarding. This may be a reflection of the burden placed on airlines. I believe the rule goes as follows: if I travel from country A to country B by aeroplane but can’t get into country B, the airline that transported me bears the burden of transporting me back to country A, though I can’t find the source of the rule. I don’t know if there’s a computerised system which didn’t work, or if the airline saw the Victorian “quarantine-free” document combined with a valid visa and let them board.

The Border Force made one mistake, which was the entire basis for the first legal hearing. The first Border Force officer knew Novak didn’t have the individual exemption and offered him time to sort it out (until 8.30am), as the interview took place in the middle of the night. Then there was a shift change, the new officers apparently didn’t get the memo about the 8.30am agreement, and they cancelled his visa. My biggest issue with this was the reporting, especially that Djokovic “won” and the government “lost.” The judge didn’t even make a ruling on the issue but rather accepted the agreement of the parties, in which the government reserved the right to deport him, making the first legal hearing one of the most mischaracterised parts of the entire saga.

Finally, it’s hard to see the timing of the visa cancellation as anything other than political. Cancelling on a Friday, with the first match due on Monday and several days after Djokovic arrived in Australia, was clearly designed both to win a media cycle and to “win” the outcome of the case by denying Djokovic the opportunity to participate in the event his visa was initially granted for. The reasons for cancelling hadn’t changed – Djokovic didn’t qualify for a health and safety exemption and could always have been removed – and I thought one of two things was possible: either the government had caved (which would have produced the exact same result as simply granting an individual exemption) or they were preparing a foolproof way to win the next case (which happened, though I expected them to combine the ministerial cancellation with some sort of additional proof, just to be on the safe side, rather than dragging everything out).

Additionally, I don’t think the government will exclude Djokovic for three years, as has been reported. I’m not sure about the special circumstances which go into that decision-making process, but there are exceptions to that rule, and a highly ranked tennis player would probably trigger them.

This affair also highlights how one immigration law has to cater to two completely separate groups of people. With refugees in Djokovic’s hotel, the populace has clear images of how different coming to Australia is for different groups. I also mentioned the 188 visa earlier, which is a business visa: if you are coming to start a business in Australia and have at least AUD$1.25 million, you don’t have to be vaccinated, nor do you have to seek an exemption. I think this is an even clearer, if less obvious, demonstration of one set of immigration rules for the wealthy and another for everyone else – clearer even than the Djokovic/refugee juxtaposition. With human rights complaints not just in Australia but also in the U.S. shaping potential immigration policy on the left, it’s important to design immigration policy which allows the country to protect its borders while simultaneously respecting applicants. I can personally vouch that visas are stressful. When non-tourist visas aren’t aimed at allowing business to continue, they generally reflect a desire for a better life elsewhere.

In Short: A Summary
1) Djokovic had multiple streams to enter Australia legally (including getting vaccinated);
2) Djokovic did not have the proper documentation/exemption for his chosen stream (no federal individual exemption, which is NOT the medical exemption he got from the Victorian state government);
3) Djokovic thought he had the proper documentation/exemption for his chosen stream (“but I’ve had COVID in the last six months!” only worked for avoiding state quarantine, and he probably didn’t lodge his application himself but rather had help from Tennis Australia);
4) Djokovic’s visa was likely issued without regard to his vaccination status (hence the “why was his visa even issued at all?”/“they issued him the visa!” political responses);
5) The COVID six-month rule and the fact a state medical board signed off are only relevant to quarantine; apart from the fact Tennis Australia and/or Djokovic may have thought it was the only exemption Djokovic needed, they had nothing to do with the individual exemption he required;
6) The government is not overturning any judicial decision: the judge had no decision to make beyond the original cancellation. The government admitted the visa was cancelled early, “settled” on getting Djokovic out of detention, and reserved the right to cancel the visa again. (“Djokovic WINS appeal!” would be better written as “Australian gov’t settles with Djokovic”; a lot of false news has circulated claiming the judge did more than he actually did);
7) Djokovic could be excluded from Australia for three years, but I’d assume an exception will apply.

Of course, I’m writing this before the story’s over, and Djokovic could still win the appeal and be allowed to play. I think that’s unlikely at this point. With the first appeal having fallen through on purely procedural issues, I don’t think the Australian government will make the same mistake twice, and that’s really Djokovic’s only hope. If he plays in the Australian Open I’d consider it a huge upset.

Even if you disagree with the health requirements, they existed and were in place when he attempted to come to Australia, and as such there are valid reasons for him to be deported: it’s because he’s unvaccinated, but it’s not just because he’s unvaccinated. He or his handlers simply didn’t do enough to get an exemption to enter Australia, and the exemption would have required a political decision regardless. The spotlight should also fall on Tennis Australia in this debacle, who clearly helped Djokovic (and others) with the documentation, and I hope it does in the days that follow. I’m also interested to know why he was allowed to board the flight at all, and how others with similar documentation were initially let into the country before having their visas cancelled.

There are no winners in this saga, but there are also a lot of misconceptions. At the end of the day, Djokovic did not qualify to come to Australia under the current immigration rules: no more, no less. I hope this has provided at least some clearer insight into what you’d need to do to come to Australia and where all the story’s failure points existed. Also consider the alternatives at each point of failure: Djokovic applies for the federal exemption and gets denied, Djokovic gets denied boarding, the Border Force gives him the agreed time to call his lawyers and Tennis Australia – none of these end with Djokovic gaining access to Australia. The only way to guarantee entry would have been to take a very simple health measure.

Improving European football through a simple incentive change

It’s been clear since even before the twelve breakaway clubs announced the European Super League that European football doesn’t quite work.

The biggest clubs on the continent are unhappy since they don’t play the other biggest clubs enough. Big clubs from small countries seem unhappy because the playing field isn’t level. Medium-sized clubs from big countries have to be happy with just qualifying. Unless you’re a small league which gets four entrants into an early qualifying round, the entire setup seems inefficient. The Super League breakaway, while correctly shelved, will hopefully have a positive impact: forcing UEFA to make their competitions better.

Even with the Champions League reform and the brand new UEFA Conference League, one simple policy change would increase the competitiveness of European competitions. Currently, England, Germany, Italy, and Spain place four teams in the group stage of the Champions League for being at the top of the UEFA co-efficient table, taking up half of the available group stage spots. Meanwhile, clubs in countries ranked 11th through 55th do not have a guarantee of a group stage spot.

This prioritises domestic competitions over clubs. It is not enough to play well in Europe; other clubs in your domestic league must also do well. This holds back clubs who consistently perform in their domestic leagues but whose league lacks depth.

Scotland’s decline may be the best illustration of this. With Rangers declaring bankruptcy in 2012, Scotland fell from 15th place, good for two Champions League spots, to 26th by 2018, ensuring champions Celtic would start in the first or second qualifying round from 2013 onwards. With one of the two Scottish teams currently capable of progressing in Europe working their way up the lower divisions, Celtic made the group stage of the Champions League four times and the group stage of the Europa League six times in their ten-year span of Scottish dominance. In 2008, Celtic qualified directly for the group stage since Scotland were in 10th place. With Rangers now doing consistently well – better than Celtic, even – in Europe, Scotland again sit in 10th place.

While losing to CFR Cluj, Ferencváros, Midtjylland, Malmö, AEK Athens, and Maribor in the preliminary rounds, as Celtic did, should arguably preclude you from the Champions League group stage, Celtic in particular have been held back by a system which promotes countries over clubs by starting them in the competition at far too early a stage.

Instead, UEFA should abolish the x teams per country rule and instead seed each round of the competition based on club co-efficient. The number of qualifying spots would remain the same. For the Champions League, this would look like:

– The winners of the Europa League and Champions League get slotted into the group stage automatically.

– The winners of the top 6 leagues also get slotted into the group stage automatically – if a league winner also won the Champions League or Europa League, this group expands to 7 or 8, respectively.

– All remaining league champions are sorted by their club co-efficient. The top five teams qualify automatically for the group stage, the next two qualify for the play-off round, the next two for the third qualifying round, the next four for the second qualifying round, and the remainder for the first qualifying round.

– Country co-efficients should still be used to determine the preliminary round, since none of these teams would be prejudiced by their country’s poor collective performance – one good season and they wouldn’t be in the bottom four.

– The remaining best placed teams would be sorted by co-efficient. The top 13 would make the group stage automatically, while the next five would make the third qualifying round with the remaining six in the second qualifying round.
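To make the allocation mechanical, the champions path above can be sketched in a few lines of Python. The bucket sizes come straight from the proposal; the club names and co-efficient values below are invented purely for illustration, not real UEFA data:

```python
# Sketch of the proposed champions-path seeding: sort the remaining
# league champions by club co-efficient, then fill each round in order.
def allocate_champions(champions):
    """champions: list of (club, coefficient) pairs, excluding the
    top-6 league winners and the two title holders."""
    ranked = sorted(champions, key=lambda c: c[1], reverse=True)
    buckets = [
        ("group stage", 5),
        ("play-off round", 2),
        ("third qualifying round", 2),
        ("second qualifying round", 4),
    ]
    allocation, i = {}, 0
    for stage, size in buckets:
        for club, _ in ranked[i:i + size]:
            allocation[club] = stage
        i += size
    for club, _ in ranked[i:]:  # everyone left starts at the bottom
        allocation[club] = "first qualifying round"
    return allocation

# Illustrative co-efficients only
example = [("Celtic", 34.0), ("Salzburg", 59.0), ("Brugge", 39.5),
           ("Zagreb", 33.5), ("Slavia", 28.5), ("Olympiakos", 43.0)]
print(allocate_champions(example))
```

The point of the sketch is that a club’s starting round depends only on its own co-efficient, never on its country’s.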

This seems like a major change, but in reality it would affect very little. Among league champions, Red Bull Salzburg would have qualified for the group stage in place of Club Brugge – Salzburg were in the third pot, finished second in their group, and advanced, while Brugge, in the fourth pot, finished 4th. Dinamo Zagreb would move up to the play-off round from the first qualifying round, switching spots straight-up with Brøndby; Dinamo advanced to the knockout stage in second place while Brøndby finished 4th in their Europa League group and were eliminated. No other champion would move more than one round away from where they started: Slavia Prague would also move up to the play-off round, Olympiakos would move from the second round to the third, Omonia would move from the second round to the first.

There would be only two changes from the best placed teams: Milan and Wolfsburg would slide from the group stage to the third and second qualifying round, respectively, with Shakhtar Donetsk and Benfica taking their place. Milan and Wolfsburg may feel like big names, but both finished fourth in their group. While Shakhtar also finished fourth, Benfica finished second – while the finishing places don’t necessarily mean anything, the teams which would benefit from this proposal all did generally well this year, while the teams which didn’t generally didn’t.

This would have had a minimal impact on the Europa League as well. The one issue is that the cut-off would have produced a tie between Real Sociedad and Real Betis, who had the same country co-efficient, with AZ Alkmaar qualifying for the group stage directly and Randers falling into the third qualifying round in place of Czechia’s Jablonec. AZ lost to Celtic in the play-off round but then handily won their Conference League group, and both Spanish clubs predictably advanced to the knockout round. (As an aside, the Europa League’s arbitrary cut-off at the 15th-best league in Europe probably needs to be expanded as well, as Luzern, PAOK, and Partizan Belgrade would have been the next three teams in line, and the latter two especially could have qualified through to the group stage. Luzern got thumped.)

The Conference League’s impact would be similarly minimal, though Rennes and Union Berlin would slip to the second qualifying round from the play-off stage, with Basel and Copenhagen filling their spots. Interestingly, all but two of the teams in the third qualifying stage would slip into the second stage, but apart from the two group-stage teams, no team would move more than one starting round ahead or behind.

If not much changes, how does this, in the words of the Bobs from Office Space, “fix the glitch?”

68 of the top 100 teams in the UEFA co-efficient rankings have played in Europe in each of the last five seasons. This change would effectively stack the deck in favour of clubs, regardless of where they play. This should allow for more certainty, especially for a club like Ajax, which has the chance to go far in the Champions League but doesn’t play in a league with the depth to consistently churn out a good co-efficient. It also doesn’t hurt the big clubs – the teams who consistently play in the Champions League – at all. It doesn’t necessarily help them, especially if they start failing to qualify for Europe consistently, but they face no detriment and relatively little risk. It also makes every European game more important – Tottenham and Roma may not want the Conference League’s qualification to the Europa League group stage, but this should better incentivise those clubs to take the European competition seriously.

This does stack the deck against teams who don’t play in Europe very often, but those teams are less dependent on revenue from Europe, and starting in a lower qualifying round is unlikely to prejudice a club that badly, while giving supporters access to more European games than they would otherwise have seen. The only wrinkle is that a team finishing second in its league may start in a qualifying round while a team finishing fourth elsewhere starts in the group stage, but I do not see this as a problem, since the current reliance on fixed domestic league places is exactly the problem this proposal tries to solve.

There’s a rhyme and a reason to European football. The Super League clubs made the point the same teams qualify for Europe year in, year out, but the system doesn’t currently support large markets in smaller countries. A simple and painless change would instantly improve European competition by making it more club-orientated.

Game Design: Developing a sports management game

I love sports management games.

I don’t remember the first one I played – perhaps it was Soccer Management Simulator on a shareware disk, perhaps Tony La Russa Baseball II if you remove the game’s arcade element. When I learned to program, that was what I wanted to make – something which simulated the outcome of sporting events. I’ve since gone on to program several, including Australian Football Coach 2020.
The landscape now is very different to what it was 25 years ago, though. It’s easier to make a game than ever before, but paradoxically harder to make a successful one. With increases in technology, users now demand a level of graphical polish that you didn’t need in the 90s or early 2000s. The tools to make those games are out there, and some amazing games are being made by people who had barely programmed, if at all, before making their game – The First Tree comes to mind.
For a lover of the sports sim genre, this has meant the field has consolidated. If you want to manage a football team, play Football Manager. If you want to manage a baseball team, you have Out of the Park Baseball. Draft Day Sports has sims for many of the US games. Hockey sims come and go, but Franchise Hockey Manager currently holds down the fort there.
Of course, there are others, but the field doesn’t seem as diverse as it once did, partly for two reasons: many developers choose the online multiplayer route, or develop specifically for mobile, and there’s nothing at all wrong with either. Still, as the market has consolidated, only a small number of companies are really making sports simulation games.
That being said, I’ve encountered a number of people who enjoy and want to write these sorts of games. I recently came across a question in a Discord thread and wanted to write a couple of posts based on my experience developing Australian Football Coach to help people get started – mostly because I want to play the game that’s kicking around your head!
The following isn’t about actual game development, but about considerations you should keep in mind before developing a sports management sim, though much of it also applies to independent game development generally.
Don’t do this for the money
Video games are an exceptionally crowded market at the moment. There are tons of quality games out there, many of which will never get played or make a cent.
I’ll say this over and over: if you’ve never written a game before, and especially if you’ve never programmed before, don’t expect to make any money. I don’t just mean the game may be unprofitable – I mean it may not earn a single quid.
Therefore, you need to do two things if you want to make a game: value the time you spend on it as an activity enjoyable unto itself, and only spend what you’re comfortable spending on making it. You don’t need great graphics to make a good game, and a lot of the time with sports management games good graphics can actually be a hindrance. The market leaders – I consider these Football Manager and Out of the Park Baseball – have sleek GUIs, facial graphics, and 3D match views, but the graphics in many ways are very simple and not entirely necessary. For instance, I loved FM2006, which had only a 2D engine. And I’m not sure you can call Front Office Football’s graphics sleek, and it’s had many iterations!
Develop where you’re comfortable
The Discord question revolved around whether a developer should program for Android or PC. There’s a correct answer to that in my mind. Unless you are specifically performing market analysis for a new game, program a game using something you’re comfortable using. There’s nothing wrong with even releasing a game that runs in the Python console – you probably won’t be able to sell it, and it may not be distributed very widely, but if you enjoy playing it, that’s what counts. One of my favourite things I’ve ever programmed – a soccer simulation with little to no user input that can simulate 1,000 seasons very quickly – runs in Python with no GUI.
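To show just how little a GUI-less console sim needs, here’s a hedged sketch in the same spirit as the soccer simulation mentioned above – a four-team round-robin league run for 1,000 seasons. The team names, ratings, and goal probabilities are all made up for illustration:

```python
import random

def play_match(a, b, ratings):
    """Return (goals_a, goals_b); per-minute scoring chance shifts with the rating gap."""
    edge = ratings[a] - ratings[b]
    goals_a = sum(random.random() < 0.03 + 0.002 * edge for _ in range(90))
    goals_b = sum(random.random() < 0.03 - 0.002 * edge for _ in range(90))
    return goals_a, goals_b

def simulate_season(ratings):
    """Single round-robin; 3 points for a win, 1 for a draw. Returns the champion."""
    table = {t: 0 for t in ratings}
    teams = list(ratings)
    for i, a in enumerate(teams):
        for b in teams[i + 1:]:
            ga, gb = play_match(a, b, ratings)
            if ga > gb:
                table[a] += 3
            elif gb > ga:
                table[b] += 3
            else:
                table[a] += 1
                table[b] += 1
    return max(table, key=table.get)

ratings = {"Reds": 12, "Blues": 10, "Greens": 8, "Whites": 6}
titles = {}
for _ in range(1000):  # 1,000 seasons run in well under a second
    champ = simulate_season(ratings)
    titles[champ] = titles.get(champ, 0) + 1
print(titles)
```

Nothing here would sell on Steam, but it’s a complete, playable-with loop you can iterate on – which is the point.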
If you’ve never programmed before, there are dozens of options. I like BlitzMax NG for PC, but the learning curve there’s probably higher than something like Unity or GameMaker, both of which can churn out good games very quickly.
Plan out your game
If you’re just learning to program, there’s nothing wrong with getting the basics down by trial and error. That being said, it certainly helps to plan out what you want to do in your game. Keep the features as simple as possible to start. You might have an intricate game designed in your mind, but focus on the core features! If you build incrementally you’ll have a game that can be released early that works, even if it’s not as fun as you’d like it to be. For instance, in Australian Football Coach, I really want to add league expansion/contraction, but it’s not important to getting the game out of early access, so I haven’t worked on it much.
I’ve found thinking in terms of screens helpful as well. Think of your favourite games – they all have different screens: a match screen, a tactics screen, a team information screen. If you map out what screens you think you’ll need at the start, you have a useful “to-do” list, and navigation will be easier for the user as well. AFC2020 uses one screen with a bunch of different panels that are made visible/invisible as needed.
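As a toy illustration of that one-screen/many-panels approach (the class and panel names are invented; a real engine would attach actual widgets to each panel):

```python
# Minimal sketch: a registry of named panels where showing one hides the rest.
class PanelManager:
    def __init__(self, names):
        self.visible = {name: False for name in names}

    def show(self, name):
        # Exactly one panel is visible at a time
        for n in self.visible:
            self.visible[n] = (n == name)

    def active(self):
        return [n for n, v in self.visible.items() if v]

ui = PanelManager(["match", "tactics", "team_info"])
ui.show("tactics")
print(ui.active())  # ['tactics']
```

The nice side effect is that the panel list doubles as that to-do list: every name in the registry is a screen you still have to build.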
It’s all about the GUI
I spend probably 2/3rds of my time building the user interface. It’s painful. Don’t underestimate the amount of time it takes to build out a good frontend, especially if you’re more interested in the backend. The original Australian Football Coach used linear regressions, calculated in-game, to determine player salary modeling. The user never knew it, since the GUI was clunky. If you’re programming for yourself and hate GUI programming, I highly recommend building out a console game in Python. They’re quick and easy to develop. Python’s also great for writing quick models for your game simulator.
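As an example of how little code a backend model like that needs, here’s a hypothetical one-variable least-squares salary fit in pure Python – the ratings and salary figures are fabricated sample data, not AFC’s actual model:

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return my - b * mx, b  # intercept, slope

# Player overall rating -> salary (in thousands), invented for illustration
ratings = [55, 62, 70, 78, 85, 92]
salaries = [90, 140, 210, 300, 410, 520]
a, b = fit_line(ratings, salaries)

def predict_salary(rating):
    return a + b * rating

print(round(predict_salary(75)))
```

The backend can be this terse; it’s the screens around it that eat the development time.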
Keep it simple
Keep in mind a game does not have to have the depth of Football Manager to be fun. There’s a great old soccer/football card game, I think from the early 2000s, where the simulation mechanic is flipping over different action cards. I’ve played that game for hours. Make sure you enjoy what you make and that you’re not trying to do something really complicated from the get-go.
A piece of advice I received a long time ago was to look to board games, especially sports board games. Board games have to distil a sport down to something very simple while still being fun. If you’re having difficulty figuring out the actual simulation, look to see how board games which simulate the sport you’re interested in play out. You don’t need to copy them precisely, but they should give you a good starting point.
 
If it’s your goal, be prepared to make money
Your first priority should be to make a good game, but if you find you’re creating something fun – maybe you’ve sent it to a few people and they like it – don’t be afraid to set up a company for your game. This is a good idea even if you’re a “hobbyist” game developer. In the US, you can set up an LLC very easily by filling out a few forms – it varies by state but costs around $100/year. You can then set up a bank account for the company. This is especially useful if you have multiple partners, owners, or investors. You’ll have to wrap your mind around sales taxes unless you sell directly through Steam. It also has the added benefit of giving you a layer of protection in the worst-case scenario that you get embroiled in a lawsuit, which should hopefully be rare.
Even if you don’t start a company, don’t be afraid to spend a few dollars on the game if you want to put it out there for a price. For PC games, $100 is the Steam fee, and you get that fee back once you make $1,000 (and you may not make $1,000). Steam does take a cut of the game’s revenue as well, but it makes a number of things much easier for you and is worth it. For Australian Football Coach, I paid for some graphic design, for someone to work on the game’s database, for an advertising spot, and for web hosting/a domain. All of these expenses added up to around US$1,000, and not all of them were strictly necessary, but all of them directly improved the game.
That being said, even with an extremely niche product – currently there are no other Aussie Rules games updated for the 2020 season, with one possible exception that’s really just a free update of a 20-year-old game – the game’s income hasn’t justified the labour cost. The biggest financial benefit to the game came as a happy accident. I released AFC2020 in March 2020. Because of COVID-19, I had just lost a contract I was about to start. Fortunately, the game sold well enough in that month to cover most of what I would have made on the contract.
However, it has not been a good investment in terms of time spent, if you consider money to be the only important factor. I’m glad I spent the time making AFC (over seven years in total now), I’m glad others enjoy the game, and I’m happy I can pay expenses and have a bit left over, but I am nowhere close to being able to do this full time. If you want to write a game like this, you might feel like you have to or need to – run with it, the process can be very satisfying.
Don’t forget product support. Every person you interact with has given you money for something you’ve made. Support can take up a significant amount of time, and you’ll run into the same problems over and over again. It still matters for a freeware game, but players are less likely to be upset about something free not working than about something they’ve paid for. Be prepared for it.
In terms of alternate funding sources, I’ve seen a couple of Kickstarters for sports management games. I’ve never used one myself, but I typically find Kickstarters to be most useful for games that have been mostly developed or prototyped, where the designer needs money up front to pay for a specific aspect of release, or for when experienced game designers gauge interest in a new game (Spiderweb Software comes to mind.) I will not support a Kickstarter for a game where not a single line of code has been written unless I absolutely trust the developer to produce the game, and I think others are wary of that as well.
Porting your game
Porting to different systems – such as Mac or Linux for PC games, or Android/Apple – can be tricky. If you’re starting out, I wouldn’t worry too much about this at all. If there’s a game in your head, take the path of least resistance to getting it out of your head and into code form. If it only works on Android or Windows or Mac, that’s fine, especially if you haven’t written a game before.
Many platforms now allow for easy cross-compiling, though. If you’re looking for good cross-platform coding engines, I can recommend both Cerberus X and BlitzMax NG, which are open-source, or Unity, which is closed-source (and which everyone seems to be using nowadays.) If I were to develop commercially for only one platform, I’d pick Windows, with iOS/Apple mobile a distant second and Android a close third behind Apple.
I’ll say it again: path of least resistance. Make the best game you can first.
Marketing matters
If you want people to play your game, you have to market. If you’re writing for PC, a good target is getting Steam to refund the $100 fee. At my price point, that meant just under 100 copies sold, a milestone which fortunately came rather quickly. You could also avoid Steam completely and self-publish, or publish on itch.io, which makes sense if you’re not looking at big sales platforms.
Most people will buy your game within a month of release, though there’s a definite tail to game sales. If you’re going the Steam route, you’ll want to announce your game in advance, since Steam players will wishlist games that aren’t out yet. Expect sales in your first week or so to be half of the number of people who have wishlisted your game. That rule is likely a maximum which reflects people who have wishlisted your game organically, and it should not be an incentive to get people to wishlist the game for the sake of wishlisting it, since those users won’t buy it when it comes out! You’ll need to produce marketing material that makes people want to wishlist the game, not try to pad your wishlist numbers – the wishlist should reflect the engagement your users have with the game. I consider Australian Football Coach’s wishlist similar to a marketing email list – maybe you’ll buy when the game comes out, maybe you’ll buy when it goes on sale, but the target is people who actually want to be on that list.
You’ll also want to promote your product through any free mechanism possible. Set up Facebook, Twitter, and Discord if you’re serious about making money. I think a Discord is fine if you’re looking to do this as a hobby. Chris at GMGames.org has been fantastic in terms of developing a list of available sports management games on his website. Write blog posts. Put up gameplay videos on YouTube.
The other rule of marketing I’ve found for low-budget games: the more you do, the more you engage people, and the more you sell. There was a month-long lull due to COVID because footy matches weren’t being played, and I lost some interest in the game (for a number of different reasons.) The game didn’t sell more than three copies on any single day in the entire month of May. When June arrived and sports picked back up, I did more marketing and released a Mac version; sales didn’t go crazy, but the renewed engagement with customers definitely lifted them.
Be prepared to sell zero copies of your game. With a sports title, you’re already releasing a product that’s niche enough, but I’ve found it’s better to prepare for disappointment and be happy than the other way around. Treat it like a hobby and go out and do your best – you’ll probably end up doing better than zero! If you make a good game, you’ve got a side hustle going. You can get a hit out of your first game, but in reality, it can take a long time.
Conclusion
If you’re considering writing a sports management game, there’s really been no better time to try. I look forward to seeing what you come up with! Make sure to check out Australian Football Coach 2020, and watch this space for a future post on the actual nuts and bolts of writing a simulation engine.

Maps: Automated Redistricting for QGIS

After mostly finishing up the QGIS redistricting tool, I realised that by reusing some of the code, I could come up with a QGIS plugin that could redistrict automatically.

This isn’t a novel concept: a quick search for automated redistricters brings up attempts to redistrict automatically using such mathematical devices as Markov chains, or even autoredistrict.org, free and open source software which does exactly this. So why try to solve the problem again using QGIS? Simply because it’s something I’ve wondered about for a while, and because I thought I could put it together quickly, which turned out to be the case.

If you want to jump in head first without reading further, you can download the plugin from https://github.com/johnpholden/qgisautoredistrict (and please throw a star on the repository if you find it useful.) If it doesn’t work (or if it does), please let me know – I may have missed uploading something. Now, into how this works!

This automated redistricter uses a flood-fill mechanism to assign districts. The workflow is fairly straightforward:
1. Pick an active district.
2. Pick a geography.
3. Has the district reached its target population? If not, assign the geography to the district.
4. Pick an adjacent geography and return to step #3 until the district has reached its target population.

In other words, the program picks a point and expands the district outwards until a target population gets reached.
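A minimal sketch of that loop in Python, with plain dicts standing in for QGIS layers (all names here are illustrative, not the plugin’s actual API):

```python
def flood_fill_districts(populations, neighbors, n_districts):
    """Assign geographies to districts by flood fill.

    populations: {geo_id: population}
    neighbors:   {geo_id: set of adjacent geo_ids}
    """
    target = sum(populations.values()) / n_districts
    assignment = {}
    unassigned = set(populations)
    district = 0
    while unassigned:
        district += 1
        # Steps 1-2: start a new district from any unassigned geography.
        frontier = [next(iter(unassigned))]
        pop = 0
        # Steps 3-4: keep absorbing adjacent geographies until the target
        # population is reached (the final geography may overshoot it).
        while frontier and pop < target:
            geo = frontier.pop()
            if geo not in unassigned:
                continue
            assignment[geo] = district
            unassigned.discard(geo)
            pop += populations[geo]
            frontier.extend(n for n in neighbors[geo] if n in unassigned)
    return assignment
```

Note that nothing stops this from producing far more districts than requested: once a seed is surrounded by already-assigned geographies, the inner loop runs out of frontier early and the leftover pocket becomes a tiny extra district.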

For testing, my goal was to redistrict Connecticut into 151 state legislature seats as equally as possible. Why Connecticut? It’s geographically small but densely populated – not too big, not too small – and there are about 67,000 polygons in the file.

The first attempt went fine, but I ran into several problems. The redistricter is very stupid, so by picking neighboring geographies at random and then stopping when the target population is met, it creates enclaves, or unassigned areas in between other districts. As a design rule, I ensured the program could go over or under the number of target districts, and I stopped the first run after over 600 districts had been created. Most of these districts contained a very small number of people, were completely inside a completed district, and typically were only one or two polygons in size.

So I set about creating an enclave checker. After another test, I realised I also needed to account for very small districts which nevertheless touched two or three different districts. So I added in some more logic: if a neighboring district has been assigned, remember that district. Then, if the district hasn’t reached a certain target population threshold, assign it to the neighboring district with the lowest population.

A simple bug caused this to fail initially: I forgot to tell the software not to add the current district to the candidate list, so the program never finished – it kept trying to recursively assign a district to itself, since it always had the lowest population. Once I sorted this, the software hit the number of target districts square on the nose. Some districts were far too small and some were far too big, but the overall result was relatively nice.
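The enclave fix might look something like this (a sketch with made-up names; the key detail is excluding the enclave’s own district from the candidate set):

```python
def absorb_small_districts(assignment, populations, neighbors, min_pop):
    """Merge under-populated districts into their least-populated neighbour."""
    # Total population per district.
    district_pop = {}
    for geo, dist in assignment.items():
        district_pop[dist] = district_pop.get(dist, 0) + populations[geo]

    for dist, pop in sorted(district_pop.items(), key=lambda kv: kv[1]):
        if pop >= min_pop:
            continue
        members = [g for g, d in assignment.items() if d == dist]
        # Collect neighbouring districts -- excluding this district itself
        # (forgetting that exclusion is what made the first version loop forever).
        candidates = {
            assignment[n]
            for g in members
            for n in neighbors[g]
            if n in assignment and assignment[n] != dist
        }
        if not candidates:
            continue  # a true island; leave it alone
        target = min(candidates, key=lambda d: district_pop[d])
        for g in members:
            assignment[g] = target
        district_pop[target] += pop
        district_pop[dist] = 0
    return assignment
```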

The initial result – geographic boundaries were ignored, so the districts look a bit splattered onto the map.

However, the software remained stupid in two different ways. First, the software treated as a neighbor any polygon which touched another polygon at only a point, like Arizona and Colorado or Utah and New Mexico. This led to some interesting – and topologically incorrect – districts.

Second, the software didn’t take geography into account at all. The first attempt at a five-district Connecticut meant the first district spread out happily along the entire coastline, ultimately looking like some sort of gerrymandered serpent. (Spoiler: The second attempt also had this problem, but that’s because I used tracts, not counties, as the geography. It didn’t know to stop at the county line and start back-filling, as you’ll see.)

I needed to fix both of these things. First, I changed the neighbor checker to only include polygons which shared an edge or (sigh) overlapped, which fortunately didn’t make anything slower. I also decided to add in a geography field. If the program finds a neighbor which does not share the active geography field, it assigns it to a list. Only once the program has exhausted all of the polygons in the active geography will it move on to the next one.
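A crude way to express that stricter neighbour test (the plugin itself uses QGIS geometry calls; this vertex-based sketch assumes neighbouring polygons share exact coordinates, which is usually true of census geography files):

```python
def shares_edge(poly_a, poly_b):
    """True if two polygons share an edge rather than a single corner.

    Polygons are lists of (x, y) vertex tuples. Two or more shared
    vertices implies a common edge; one shared vertex is a point touch
    (think Arizona and Colorado meeting at Four Corners). This is crude:
    two shared but non-adjacent vertices would be a false positive, so
    real code should test that the geometric intersection is a line,
    not a point.
    """
    return len(set(poly_a) & set(poly_b)) >= 2

# Four unit squares meeting at (1, 1), Four Corners style:
nw = [(0, 1), (1, 1), (1, 2), (0, 2)]
ne = [(1, 1), (2, 1), (2, 2), (1, 2)]
se = [(1, 0), (2, 0), (2, 1), (1, 1)]
# shares_edge(nw, ne) -> True  (shared edge along x = 1)
# shares_edge(nw, se) -> False (corner touch only)
```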

I also decided to make the program smarter by choosing a neighbor directionally. This led to a very interesting map, where districts were created longitudinally, as the program would always pick the westernmost available district instead of spreading out normally, so I decided to revert this addition. However, the program can cycle through the direction that it picks the next available geography for the next district – I’m not sure how useful this will be, but it does add a bit of variety to the software.

Always picking the next adjacent westernmost district led to a map that looks almost like it’s crying.

Unfortunately, the final product isn’t perfect. In the final run, the software generates 152 districts for Connecticut where 151 is the target, and only 93 of these districts are within 2% of being a “perfect” district of equal population. Many of these districts are far too large as a result of the enclave assigner, with 50 of the 59 districts out of the 2% range being above their target population.

The final result – these districts look properly compact, except for the one which greedily gobbled up ocean
The final result near Hartford with an overlay, showing a decent amount of respected geographic boundaries

Unfortunately, this isn’t the easiest fix. The software doesn’t yet know whether reassigning a mesh block that already has an assignment will split a district in two, and the initial attempt to fix this produced the most abstract-art map of all – running districts down zero-population freeway medians, which I intentionally kept in. One district, low on people and desperately needing to bulk up, found the median of Connecticut State Route Two and decided it should connect Glastonbury with Colchester while never being wider than about an eighth of a mile on either side of the highway. Other small districts did the same.

A discarded result from alpha testing – the bright yellow district connects disparate areas using a freeway median. North Carolina would be proud

The easiest thing to do would be to remove zero-population blocks completely from the map by merging them with neighbors, but I want to make sure this works out of the box.

In the congressional simulation, four districts were between 3,500 and 34,000 voters too large, and the remaining district was 74,130 voters too small. This isn’t good enough if you’re trying to generate perfect districts, but it’s a great starting point for users who want to create a congressional district map from something other than scratch.

The final congressional map. The red district is the one that is 10% short of its voters target, but all in all, not a terrible result despite the interestingly coastal district.

It’s also important to note at no point in time does this software use partisan indexes to create districts. It’s based purely on geography, as I believe districts represent both places and people, and that while we should strive to create competitive districts, we should also strive to create districts that people can easily geographically understand. It’s much easier to say “I represent people on this side of Princes Highway and to the west of Springvale Road” than it is to describe pretty much any district anywhere in the United States. But enough of the soapbox.

I’m probably going to continue working on this, but not immediately. However, I did want to open source the code on this in case others are interested in playing around with the formulas.

The next step will be to figure out how to reassign mesh blocks for districts which are over their population target (my first bid failed) and to add a second geography column to avoid situations such as the one seen in the Connecticut plan, where the first district expands along the shoreline unconstrained by a change in counties. If the geography column looked at both county and tract, the checker would have stopped at the Fairfield county/New Haven county border, there would have been one partial district for Fairfield county and one district for New Haven county, and we all would have been a lot happier.

Once again, the plugin can be found at https://github.com/johnpholden/qgisautoredistrict. May your results be interesting!

AFL: The complete winning percentage lookup table

5.6 billion Aussie rules games, simulated on an off-the-shelf laptop.

In the work on leverage I’ve done over the past couple of months, I figured there’s a fixed winning percentage given the minute, the margin, and a team’s initial (estimated) winning percentage. I also figured this could be easily achieved through simulation.

I simulated 10,000 games for each combination of minute, margin (within 12 goals), and initial winning percentage, in order to create a master table of winning percentages which can be used by anyone. While computers are amazing, lookup tables still definitely have their uses.

Methodology

The simulator made some basic assumptions: there was a 52.4% chance of a scoring shot in any given minute, and a further 6.8% chance of a second scoring shot, based on 2019 data scraped from afltables.com. I did not simulate a third scoring shot, which is possible but very rare – I only found two or three instances of it in the data set. I didn’t check whether a string of behinds generated that event chain, simply choosing to ignore it completely.
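Those per-minute probabilities drive a loop along these lines (a sketch only – the original is PHP, the goal/behind split here is an assumed figure rather than one from the post, and I’ve read the 6.8% as conditional on the first shot occurring):

```python
import random

P_SHOT = 0.524     # chance of a scoring shot in any given minute
P_SECOND = 0.068   # chance of a second shot in the same minute
GOAL_SHARE = 0.53  # share of shots that are goals -- an assumption, not 2019 data

def simulate_minute(rng):
    """Return points scored in one simulated minute (0 if no shot)."""
    shots = 0
    if rng.random() < P_SHOT:
        shots = 1
        # Second shot modelled as conditional on the first; no third shot,
        # which the post found too rare to bother with.
        if rng.random() < P_SECOND:
            shots = 2
    return sum(6 if rng.random() < GOAL_SHARE else 1 for _ in range(shots))
```

A full game simulation would run this for each minute and split the shots between the two teams according to their relative strength.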

The simulator also translated the winning percentage into expected goals using a best-fit line from spreads via Squiggle AFL. Unfortunately, this did not work well as the winning percentage increased, because I had less data for predicted blowouts, so I ended up using trial and error to figure out which values corresponded with the initial winning percentage (if the initial winning percentage is 98%, the favourite should win roughly 9,800 out of 10,000 games in all instances, just as a team with a 51% chance to win should win roughly 5,100 out of 10,000 games.)

I ended up using a fourth-order polynomial equation to generate expected goals for percentages greater than 82%, and it matched the initial win percentage very well. An 80% win percentage with a tied score at the start of the game averaged out to 79.9% over 10,000,000 simulations, with a standard deviation of only 41 wins and no simulation more than 1.5% away from 80%. Similar results existed at the 90% mark, though this was skewed slightly above 90% and had a slightly lower standard deviation. Because of the margin of error involved, I would not use this data set to bet on any game below a certain threshold.
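The trial-and-error step can itself be automated: pick a candidate spread, simulate, and bisect until the favourite’s simulated win rate hits the target percentage. A toy version, with a normal margin model standing in for the real game simulator:

```python
import random

def calibrate_spread(target_pct, margin_model, sims=20000, seed=1):
    """Bisect for the spread at which the favourite wins target_pct of games.

    margin_model(rng) returns a random swing (in points) around the spread;
    the favourite wins whenever spread + swing > 0.
    """
    rng = random.Random(seed)
    lo, hi = 0.0, 120.0
    for _ in range(25):
        spread = (lo + hi) / 2
        wins = sum(spread + margin_model(rng) > 0 for _ in range(sims))
        if wins / sims < target_pct:
            lo = spread  # favourite not winning often enough; widen the spread
        else:
            hi = spread
    return spread

# With final margins swinging roughly normally (sd ~36 points, an assumed
# figure), an 80% favourite comes out at a spread of about 30 points.
spread_80 = calibrate_spread(0.80, lambda rng: rng.gauss(0, 36))
```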

I also wrote the simulator in php.

In 2012, I was tasked with taking a large data set from a website, parsing it for updates, and then emailing a spreadsheet to my boss, who would present the spreadsheet at a daily 8:30am meeting. I wrote a php script which worked quickly and wonderfully.

We hired someone with Silicon Valley credentials who asked me what I was doing and asked why I wasn’t using Python to run the script. I told him it was because I didn’t know Python that well yet (true) and that php was an order of magnitude faster (also true.) He ended up taking the project off my hands and re-writing it in Python even though the rewrite conveyed no benefit whatsoever, apart, apparently, from the fact it was in Python.

I ended up re-writing the simulator in php. After letting it run overnight, it had almost finished. By my timer, Python would have taken several days. I then fed the finished data set into Python and graphed it.

Finally, I ended up simulating margins which could only be considered theoretical, such as being down by 12 goals in the first minute. I thought the information would be valuable in determining initial spreads given a win percentage: how many goals down does the favourite have to be to have a 50-50 chance of winning the match?

Findings

Because the data is four-dimensional, it’s obviously not the easiest to graph as an entire data set, but if you hold one or two of the values constant, there are wonderful graphs to be made.

For instance, holding the margin constant at zero shows how the longer the underdog can keep the game tied, the less likely the favourite is to win. While that’s an obvious conclusion, what’s interesting to me is just how even the game becomes if it’s still tied in the last ten minutes. Also note the noise.

I found the initial winning percentages interesting as well. The simulator predicts a team with an 85% initial winning percentage should win by five goals 50% of the time, as the simulator predicts the spread to be 30 points. At 90%, the spread should be six goals, increasing dramatically to 99%, which the simulator estimates would be a 12-goal initial spread. The 99% team would also still be favoured to win if they were down by five goals or less at halftime. Unfortunately, I don’t have betting data to look at to see if this holds true over time.

These are just a couple of the stories you can tell from the simulated data set – if you come up with any more on your own, I’d love to read about them.

Download

I’ve put both the PHP code and the final table up at https://github.com/johnpholden/afl-leverage-simulator. I just noticed the “home team” should be labeled “favourite” and the “away team” should be labeled “underdog” in the table.

If you end up using the table or code, or performing any sort of statistical analysis on this, please let me know.

Maps: QGIS Redistricting Plugin released

There’s more to blog about than just sports statistics, and that includes the release of the QGISv3 redistricting plugin I’ve written! The website https://www.qgisredistricting.com is now live and provides details on how to download and install the plugin from a zip along with operational instructions.

The plugin comes from a desire to be able to redistrict using desktop GIS software without having to spend any money. I used Azavea’s District Builder back in 2010 when I participated in a team redistricting competition. I also have colleagues outside the U.S. who have almost completely switched from MapInfo to QGIS, but the MapInfo redistricting tool was one of the areas where they didn’t switch over.

The plugin forces you to supply your own redistricting data where other software (particularly online software) provides it for you, but these files can generally be found pretty easily.

If redistricting and open source software interests you at all, head on over to the github and try it out. Do note it’s not entirely free of bugs, so treat it as sort of a final beta – for now, avoid making multiple plans on the same file (it tends to overwrite the other files) and please report any issues you have, but my sense is it’s ready to be used in production environments.

I’m also working on an automated redistricter QGIS plugin which automatically generates districts, so watch this space if that interests you. An early version of that code is also up on github.

AFL: When has a team definitively won the game?

Even though only my extreme partisanship can fully explain why, I greatly enjoyed the first live footy match I watched in 2019.

Most neutrals – if they were watching at all – had turned off the North Melbourne-Carlton game after watching the Kangas kick the opening seven goals to lead 10.6.66 to 1.7.13 at halftime. We stayed around the entire second half just because we found the game severely enjoyable. However, the game was long over, with only the first goal of the second half, scored by Jack Silvagni to get the Blues back within 9 goals, registering at all on the leverage calculation.

While a close AFL game is as exciting as any sport in the world, I’m still curious to know: when are AFL games won or lost?

Thinking specifically about “when are games won?” may conjure up the after-the-siren West Coast-Port Adelaide final from a couple of years ago, or the 2018 Grand Final, which was again won late by West Coast.

The North-Carlton game, however, was won at some point in the first quarter.

One of my favourite infographics from Matter of Stats, Types of Grand Finals, looks at grand final outcomes by quarter scores and demonstrates this isn’t an obvious question. Most grand finals end up with one team winning start to finish – even a game between two of the best teams on the biggest stage of the season doesn’t mean the game itself will be all that good, one of my working hypotheses throughout these blog posts.

However, I wanted to take this a step further with the concept of leverage. While there’s nothing wrong with looking at games based on their quarter time score, calculating leverage gives us 20 times the amount of data points, and includes initial winning percentage in the calculation.

Leverage, as I define it, is a percentage which reflects how much the game’s winning percentage would change if either team kicked a goal in that minute. The closer the game and the more evenly the teams are matched, the higher the leverage. For instance, in the 2018 grand final, the leverage with one minute to play rose close to 100%, whereas in the 2019 grand final the leverage was functionally 0% for the entire second half, as Richmond’s winning percentage was close to 100% and GWS never launched any sort of comeback.
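In code terms, against the lookup table from the simulator post, that definition amounts to the gap between the two goal outcomes (my own formulation – the post doesn’t spell out the formula, and the table layout here is hypothetical):

```python
import math

def leverage(table, minute, margin, initial_pct):
    """Leverage = swing in the favourite's winning percentage between
    'favourite kicks the next goal' and 'underdog kicks the next goal'.

    table maps (minute, margin, initial_pct) -> favourite's win probability.
    """
    fav_goal = table[(minute, margin + 6, initial_pct)]
    dog_goal = table[(minute, margin - 6, initial_pct)]
    return fav_goal - dog_goal

# Toy table: winning percentage as a logistic curve over the margin,
# standing in for the simulated master table.
table = {(60, m, 80): 1 / (1 + math.exp(-m / 10)) for m in range(-72, 73)}
```

With that toy table, a tied game in minute 60 yields a leverage of roughly 0.29 – a goal either way swings the winning percentage by almost 30 points.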

Therefore, leverage can be used to determine at which points AFL games get won or lost. The 2018 grand final was won in the final minutes, whereas the 2019 grand final was arguably won in the second quarter.

I’ve gone ahead and categorised the different types of games which the 2019 season gave us, by taking the average leverage for each quarter instead of looking at who’s winning at quarter time. The totals won’t add up to 207 – I noticed an error I made typing in winning percentages for three of the finals games (I switched home and away) and haven’t re-run them yet – but that won’t make a difference for this blog post.

For instance, the Carlton-North Melbourne game where North played a wonderful first quarter featured an average leverage in the first quarter of 10.8%, second quarter of 3.1%, and third quarter of 0.3%, and a fourth quarter where no score changed North’s 100% winning percentage. The most exciting game of the season by this metric, the Fremantle-Sydney game, featured a Q1 of 17.0%, Q2 of 19.0%, Q3 of 23.5%, and a fourth quarter average leverage of 44.8%, in part because the one-point game was so close the leverage line trended towards 100% by the end of the game.

For our purposes, in the first instance, North “won” the game in the first quarter, since that quarter had the highest average leverage. The Freo-Sydney game came down to the wire, making the fourth quarter the time when the game was “won”.

I’m specifically curious about this in part because of my above hypothesis – a grand final is more likely to be dull than not – and also for predictive analytics. Predicting large wins could have uses in both gambling and modeling. We typically associate winning percentage with average margin, to the point where the leverage simulator even uses this, based on Squiggle, to translate a starting winning percentage into how many goals each team will score. I kind of want to know if we can figure out the odds of a specific game being horrifically dull.

So, without further ado, the most likely leverage patterns in 2019 AFL games. The numbers represent quarters – the first number on the left is the quarter with the highest average leverage, and the last number on the right is the quarter with the lowest.
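Deriving the four-digit code from the quarter averages is a one-liner; the docstring uses the two games described above as examples:

```python
def leverage_pattern(quarter_leverages):
    """Rank quarters 1-4 by average leverage, highest first.

    North-Carlton: [10.8, 3.1, 0.3, 0.0]  -> '1234' (the early blowout)
    Freo-Sydney:   [17.0, 19.0, 23.5, 44.8] -> '4321' (down to the wire)
    """
    ranked = sorted(range(1, 5), key=lambda q: quarter_leverages[q - 1],
                    reverse=True)
    return ''.join(map(str, ranked))
```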

The 5 most common types of AFL games

1234 – The early blowout – 42 instances
You were more likely to see a 2019 AFL game end in a blowout where the first quarter was the most interesting than any other type of game. None of these games rate very highly on the intrigue scale, and the majority of them were straight blowouts – though this category does include games such as the GWS-Footscray final where the first quarter was close all the way through, the second and third quarters weren’t as close but the game wasn’t over, and the fourth quarter was functionally predetermined. None of the games in this category featured interesting final quarters.
Most interesting game: May 19, Hawthorn-Richmond (117th most interesting)
Least interesting game: May 19, Carlton-GWS (dullest game of the year)
Most games played in this category: 8, GWS and Gold Coast
Fewest games played in this category: 2, St. Kilda and Sydney

4321 – The closest of games – 36 instances
9 of the top 10 “most interesting” games fall into this category since leverage is higher in the fourth quarter. These games were close almost all the way through – even some of the games which got away, like the Footscray-Geelong game on July 6th, were close for almost the entire game.
Most interesting game: July 20, Fremantle-Sydney (most interesting game of the year)
Least interesting game: July 6, Footscray-Geelong (75th most interesting)
Most games played in this category: 7, Melbourne and Essendon
Fewest games played in this category: 2, Adelaide, Footscray, West Coast, Richmond, Hawthorn

2134 – The second-quarter blowout – 30 instances
Similar to the early blowout, almost none of these games had interesting second halves, but the winning team didn’t quite pull away as quickly as they could have. A perfect example is the Richmond-Brisbane game the first week of finals where Richmond increased their winning percentage from the 30% range all the way up to 80% by the end of the second quarter. Many of these games have very similar leverages between the first and second quarters, implying the first quarter was either fairly even or the underdog stayed in the game. Some of these games even had fairly interesting third quarters, but whoever was in front always pulled away.
Most interesting game: July 20, Carlton-Gold Coast (100th most interesting game of the year)
Least interesting game: August 10, Port Adelaide-Sydney (161st most interesting game of the year)
Most games played in this category: 7, Richmond (Port and Essendon had 5)
Fewest games played in this category: 1, North Melbourne

3214 – the third-quarter pull-away – 26 instances
Yes, more games with boring fourth quarters, though these games were a bit more exciting generally. These games stayed close until the third quarter, when one team ran away with it – the Brisbane-Melbourne game is a perfect example, with Brisbane starting the quarter six points behind and finishing it with a 14-point lead after kicking 6.2 to 3.1, then piling on in the final term. These games can even have what appear to be exciting fourth quarters ahead – for instance, the Hawthorn-North Melbourne game where the Hawks were ahead by a point entering the final quarter and then kicked five of the next six goals had a fairly decent leverage score in the 4th quarter.
Most interesting game: April 7, Hawthorn-North Melbourne (61st overall)
Least interesting game: July 13, Geelong-St. Kilda (136th overall)
Most games played in this category: 6, St. Kilda
Fewest games played in this category: 0, Carlton (Essendon and Gold Coast on 1)

4312 – the boring second quarter – 10 instances
A very interesting category! Consider the Carlton-Fremantle game, where Freo were up 29 after the first quarter, up 10 at halftime, and lost by 4. It doesn’t seem like a candidate for a lower-leverage second quarter, but the simulator had Fremantle as fairly heavy favourites at 78%, which meant their win percentage at quarter time was greater than 90% – the Blues’ second-quarter mini-comeback, leaving them down 10 at halftime, only got the win percentage back to about 85%. Or take the Collingwood-Footscray game from April, where both teams had a relatively quiet first half – the score at quarter time was 1.2.8 to 0.3.3, and the Magpies’ lead remained between five and 17 points the entire second quarter. The game was close late until Collingwood broke it open about halfway through the fourth quarter. There really wasn’t much difference between the Q1 and Q2 leverage in that game, but the fairly even scoreline and the favourite slowly staying ahead meant the Q2 leverage was a bit lower.
Most interesting game: June 29, Hawthorn-West Coast (8th overall)
Least interesting game: June 30, Carlton-Fremantle (79th – a bit low, but consider the game wasn’t close until late)
Most games played in this category: 4, Collingwood (3 each for Fremantle and Carlton)
Fewest games played in this category: several teams avoided this entirely, including Melbourne, Adelaide, Gold Coast, GWS, and Port Adelaide.

The 16 other types of AFL games (which occurred in 2019)

The 2314 – 9 instances – decently entertaining games with generally uninteresting final quarters, either the losing team kept the game close enough in the third to stage a comeback, or the underdog ended up with a blowout.
The 1324 – 8 instances – whoever is getting blown out gets blown out quickly but shows just a glimmer of life in the 3rd quarter (June 8, North Melbourne-Gold Coast, in part due to a late Gold Coast second quarter run, meaning if they had kept it up in the third the game would have been a toss-up.)
The 4213 – 7 instances – exceptional games, as the fourth quarter’s quite exciting but the third quarter isn’t. Think St Kilda-Hawthorn where the underdog comes back to win in the final quarter. The favourite/team in the lead at halftime has a good third quarter so the leverage lowers, and winning percentage assumes the team in the lead will go on and win, so these games aren’t rated very highly compared to the games which are close throughout – but these are games you’d like to watch no matter which team you support.
The 3241 – 7 instances – These games weren’t evenly matched but stay close until the late third or fourth quarter – not dissimilar to 4321s with a tight first half but where the fourth quarter isn’t close.
The 3412 – 6 instances, and the 3421 – 6 instances
Functionally equivalent instances such as the Carlton-Footscray game from June where the underdog has a big third quarter after a quiet first half, or the Geelong-West Coast final where the underdog has a big third quarter after a quiet first half, or the North Melbourne-Adelaide game where the teams were roughly equal but North came back to win with a big third quarter, or the Adelaide-Geelong game from April where Adelaide lost by four goals but had a big third quarter – sensing a pattern yet? The “big third quarter” game, typically by the team who was no longer favoured to win at half-time.
The 3124 – 4 instances – think that Richmond-GWS game from July where the teams were evenly matched, Richmond jumps to a 31-point lead in the mid-second quarter, GWS get back within 10 in the 3rd, and then Richmond close the game out. Seemingly involves only Richmond or Fremantle (2 games each.) Yawn.
The 1243 – 4 instances – the team getting blown out maybe gets a couple goals in the 4th. Yawn.
The 4132 and 4123 – 2 instances each – Round 1. Footscray-Sydney. Footscray are up 32 and win by 17, but not before Sydney got within 4 with only a few minutes to play. Melbourne are up 30 on Carlton at Q3 time. Carlton storm back, take the lead at 99-98, then still lose. The Collingwood-West Coast rematch where Collingwood, down 16, kick 2.6 to 0.1 in the final quarter to win. In short, the big 4th quarter comebacks.
The 2143 – 2 instances – These were both comebacks (Adelaide-Richmond in June and Footscray-Hawthorn in March.)
The 2341 – 2 instances – One blowout and one game where the losing team made a run in the second quarter but lost decisively.
The 4231 – 1 instance – the Brisbane-GWS final had a slightly more interesting 2nd quarter.
The 3142 – 1 instance – this was the Richmond-Carlton season opener. The leverage was always so low throughout the entire game, the result is effectively just noise.
The 1432 – 1 instance – Gold Coast won the second half against West Coast. They still lost by 23.
The 1342 – 1 instance – That West Coast-Fremantle game where Freo never came close to leading but still got within 10 points in the 3rd.

You may have realised there are 24 possible outcomes (4x3x2x1 = 24) and I’ve only given you 21 – the three outcomes which didn’t appear in 2019 were the 1423, the 2413, and the 2431. These should all be rare occurrences where a team which will be blown out nevertheless makes just a bit of a run in the fourth quarter.
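The bookkeeping here can be checked mechanically. Two assumptions in the snippet below: the five types covered earlier in the article aren't named in this list, so they're inferred by elimination, and one of the two "1324" entries above is read as the otherwise-absent 1342.

```python
from itertools import permutations

# All 24 possible quarter-leverage orderings (4 x 3 x 2 x 1)
all_orderings = {"".join(p) for p in permutations("1234")}

# The 21 orderings the article accounts for: the 16 listed above
# (reading the duplicated 1324 as 1342) plus the five covered earlier,
# inferred here by elimination
observed = {
    "1234", "2134", "3214", "4312", "4321",  # assumed earlier five
    "2314", "1324", "4213", "3241", "3412", "3421", "3124", "1243",
    "4132", "4123", "2143", "2341", "4231", "3142", "1432", "1342",
}

missing = sorted(all_orderings - observed)
print(missing)  # prints ['1423', '2413', '2431']
```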

Conclusions

The sheer number of AFL games which aren't ultimately close strikes me more than anything. Leverage should naturally be highest in the fourth quarter, as a goal in a close game will have a large impact on the winning percentage. The top 366 individual scoring plays by leverage all occurred in the 4th quarter (out of 9265 scoring plays, including rushed behinds.) Individual scoring plays give us a sense of the maximum leverage by quarter: the highest-leverage scoring play was worth 100% in the 4th quarter (after the siren for the win), 32% in the 3rd quarter, 23.4% in the 2nd quarter, and 20.4% in the 1st quarter.

However, the 4th quarter in 2019 had the highest leverage only 25.1% of the time, meaning only one in four games has a relatively close final outcome. The third quarter had the highest leverage 58 times, the first quarter 56, the fourth quarter 52 and the second quarter 41.

Switching things around, in spite of its built-in advantage, the fourth quarter had the lowest leverage 119 times out of 207, or 57.5% of the time!

There’s a point in every AFL game, and probably every timed sporting event, where the lead becomes unassailable. By the time the final goal of May’s Carlton-Collingwood game had been kicked – a Will Hoskin-Elliott goal which put the Magpies up by 3.1 (19 points) with practically no time left – even a Carlton goal would not have changed the game’s outcome.

To calculate this, I looked at the point where the leverage remained zero – i.e., there was a less than 1-in-10,000 chance the losing team would come back to win. I’ve also looked at where the leverage was less than 1%, which should indicate a point in the game where the losing team’s pretty much cooked but still has a chance to come back.

This point tends to happen at some point during the 4th quarter.
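As a sketch of that calculation – the cutoffs are the ones described above, but the leverage series itself is hypothetical, one value per game minute:

```python
def decided_at(leverage_by_minute, threshold=0.0001):
    """Return the index of the first minute after which leverage never
    again exceeds the threshold - the point where the game is effectively
    decided. The 0.0001 default mirrors the 1-in-10,000 cutoff; pass
    0.01 for the "less than 1%" version."""
    decided = len(leverage_by_minute)  # default: never decided before the end
    for minute in range(len(leverage_by_minute) - 1, -1, -1):
        if leverage_by_minute[minute] > threshold:
            break
        decided = minute
    return decided

# A hypothetical 8-minute leverage series for illustration
lev = [0.10, 0.15, 0.20, 0.30, 0.005, 0.00005, 0.0, 0.0]
print(decided_at(lev))        # prints 5: dead once leverage stays at ~zero
print(decided_at(lev, 0.01))  # prints 4: "pretty much cooked" a minute earlier
```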

There were 50 games where the leverage in the final minute was greater than zero. In 16 of those, a comeback was unlikely (needing two scores to win.) In 34 of those games, the leverage reached above 75%, meaning the game was functionally within a goal. This means there’s about a 1-in-4 chance an AFL game will be decided in the final two minutes.

There’s about a 1-in-21 chance the game will be entirely over before the 4th quarter, but when looking at games where comebacks are unlikely, this rises to almost 19% of all matches. Half of all matches have a sure winner with five minutes to go.

Understanding how games unfold may have its largest use in betting and prediction markets. The Squiggle average winning percentage for favourites in the 2019 season was 64.8%, which my simulator translates to roughly a 10-point favourite, or less than two goals. It may be interesting for those creating models to try to predict how the game will be won, as opposed to just looking at the likely margins.

AFL: 1990 and 1999 grand final in-game winning percentages

I’m excited about the article you’re currently reading, in part because I get to advertise a couple of new developments in footy statistics.

First is The Arc AFL’s complete AFL history. Using Elo ratings (which admittedly aren’t my favourite way to determine winners and losers, but they serve their purpose well here) you can easily determine the winning percentage for each game in AFL/VFL history.

Second is the newly coded 1990 and 1999 grand finals. These games are on YouTube in full and appear to be presented in real time, meaning the scoring progression can be coded based on the time of the game. I went back and coded the scoring progression and sent it in to AFL Tables. Unfortunately, I don’t know how many games this can be done with based on the coverage that’s currently available, though it looks like St Kilda’s win back in the 1960s may have full television coverage. The most important thing when looking for games to code has been unedited, continuous coverage – for instance, I have the 1989 grand final mostly coded, but the footage jumps ahead about five minutes in the third quarter. I also have the 1975 grand final on DVD; I’m not sure the full game is on there, but I’ll have to look – that would be a good one to add to the collection.

The other advantage of looking at historical footage: advanced AFL stats only go back a few years. There’s a very good chance the finals could be coded to add statistics such as marks inside 50 just by watching the television coverage (I’m particularly interested in how many Gary Ablett had in 1989.) It’s potentially a bit optimistic, but I think it’s definitely worth exploring.

I have it on good authority that the AFL has a lot of archival footage, so even though most of the grand finals currently on YouTube have either been edited or miss some of the coverage, I’m hoping the league will be interested in a project to retroactively code as many grand finals as possible. At the very least, the league would benefit from giving broadcasters a more comprehensive statistical package to discuss on game day.

While neither the 1990 nor the 1999 grand finals were particularly interesting to the neutral, I’ve attached the leverage and winning percentages below. These required both the Arc’s winning percentages and the newly coded game to produce. Hopefully there will be more of these in the future (imagine the commentary at the 2020 grand final: “of course, Ben Brown’s 8 inside 50s so far pale in comparison to Gary Ablett Sr. in 1989…”) Enjoy if you like!

AFL: The most important goals of 2019, ranked by math

Recently I put my AFL leverage simulator through its paces, generating both winning percentages and leverage for the entire 2019 season. The leverage simulator simulates the remainder of each AFL game from any given minute to determine how often a team wins from that point. More importantly, it also determines the game’s leverage, or how much the winning percentage would change if a goal were to be kicked in that given minute.

Combining the play-by-play with the leverage can give an indication of which scoring shots were kicked at the most important points of the match.

There’s a couple of caveats here. The most important kick of the season belongs to Michael Walters, whose after-the-siren kick nicked the post to give Fremantle the win over Brisbane – but as a behind, it would only show up 9th in the list had I included all behinds. Also, the simulator only knows a goal was kicked in a given minute, not when in that minute it was kicked, so no goal will move the leverage needle a full 100%.

For each minute in each game, the simulator checks to see if any scoring shots were generated, and then simulates the game from there. The leverage gets calculated from two different simulations: the home team’s winning percentage if they were up an additional goal, and the home team’s winning percentage if they were down an additional goal. While I would have liked to write the simulation to use seconds instead of minutes, minutes were functionally practical for two reasons: first, each scoring shot’s time is only estimated, and second, seconds would increase the length of time a game takes to run from three to four minutes to (does some scratch math) approximately three hours. It might be worth going back and simulating the final two minutes of every close game down to the second, but I have not yet done so. Based on previous simulations looking at close games at the seconds level, this would definitely change where goals sit in the rankings. I still consider there to be value in this list, as it allows the comparison of late-game goals to other scoring shots – the likely topic of a future blog post.

Also, if multiple scoring shots occurred in the same minute, they have the same leverage – as noted above, the simulator currently works by minute, not by scoring shot. It’s significantly easier this way.

Walters’ scoring shot is a great example of how the simulator works – at the start of the minute in which the goal was kicked, with the scores locked at 72 apiece, Fremantle had an 85.6% chance of winning if they were a goal up with a minute to play, and a 0% chance of winning if they were a goal down. This is a bit low for two reasons. Keep in mind the simulator treats draws as “no values”: if two teams tie in a simulation, neither of them wins and the simulation is discarded. Also, the simulator estimates the chance that multiple scoring shots will be kicked in the last minute as effectively zero, so it basically saw no way for Fremantle to win if they were down by six points. In truth, Fremantle would have had a very, very small chance of winning from that point.
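The mechanics above can be sketched in a few lines. This is a deliberately simplified stand-in for the real simulator – the per-minute goal probability `p_goal` is an assumed constant rather than anything fitted to actual AFL scoring, and behinds are ignored – but it shows how the two-simulation leverage calculation and the discarding of drawn simulations fit together:

```python
import random

def win_pct(margin, minutes_left, p_goal=0.08, sims=20000, seed=1):
    """Simulate the remainder of a game and return the home team's winning
    percentage. Each remaining minute, either team kicks a goal with
    probability p_goal (an assumed, simplified scoring rate). Drawn
    simulations are discarded as "no values"."""
    rng = random.Random(seed)
    home_wins = decided = 0
    for _ in range(sims):
        m = margin
        for _ in range(minutes_left):
            if rng.random() < p_goal:
                m += 6  # home goal
            if rng.random() < p_goal:
                m -= 6  # away goal
        if m != 0:  # throw out the draws
            decided += 1
            if m > 0:
                home_wins += 1
    return home_wins / decided if decided else 0.5

def leverage(margin, minutes_left):
    # Leverage: how much the home winning percentage moves between
    # being a goal better off and a goal worse off right now
    return win_pct(margin + 6, minutes_left) - win_pct(margin - 6, minutes_left)

print(leverage(0, 1))   # tied with a minute left: prints 1.0 (maximum leverage)
print(leverage(30, 1))  # five goals up with a minute left: prints 0.0
```

With only one simulated minute, a goal either decides the game or cannot matter, so the toy model hits the extremes; the real simulator's richer scoring model is what keeps any single goal from moving the needle a full 100%.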

Obviously, this will be weighted towards the scores which occurred late in the game as the leverage reached toward its limit of 1. I’ve also removed behinds – the list was full of them, but Walters aside, many of them had no direct impact on the outcome of the game.

Also, all minutes here are, unfortunately, estimated. If you’ve been sitting here thinking, “what about THAT goal?”, get in touch and I’ll take a look – I’ve noticed some odd things around draws and occasionally around when a goal was estimated to have been kicked.

And now, to the rankings:

10. Lincoln McCarthy, Brisbane vs Geelong – 75-73, 78th minute – 75.0%
This game-winning goal settled the battle of 1st and 2nd on the ladder at the Gabba, though it left enough time on the clock for Esava Ratugolea to get a point back.

9. Shaun McKernan, Essendon vs GWS – 71-71, 77th minute – 75.6%
McKernan tied the game for the Bombers – a GWS goal would have won the game for them, but instead McKernan turned it into a 50-50 contest. This is the highest-ranked goal which didn’t directly win a game. If you forgot who won, hang on – you’ll remember shortly.

8. Michael Walters, Fremantle at Collingwood – 79-75, 80th minute – 77.4%
Walters’ goal clinched victory for Fremantle at the MCG – Fremantle weren’t given much of a chance to win this game (19%) and as such the simulator thought Collingwood very likely to kick an additional goal, lowering the leverage a bit. Another game where a by-second simulation would increase the percentage a bit.

7. Anthony McDonald-Tipungwuti, Essendon vs North Melbourne – 86-81, 80th minute – 83.1%
Up 18 at three-quarter time, Essendon gave up 21 straight points to the ‘Roos but McDonald-Tipungwuti’s goal clinched the win.

6. Marty Hore, Melbourne at Gold Coast, May 11 – 60-60, 80th minute – 83.3%
Tom McDonald’s behind won the match, but Marty Hore’s game-tying goal happened at a crucial juncture of the match. This missed the top three mostly because Melbourne were expected to beat the Suns by a greater margin, so the simulator assumed the favourites would kick goals more often.

5. Josh Bruce, St Kilda vs Fremantle – 72-69, 80th minute – 83.6%
Bruce’s game winner came with 42 seconds left.

4. Jack Bowes, Gold Coast vs Carlton, April 14 – 59-57, 80th minute – 84.0%
Bowes scored with 13 seconds left on the game clock to lead Gold Coast over the Blues – the actual leverage of this kick was probably closer to 99.8%, but it still shows up as hugely important in the rankings.

3. Eddie Betts, Adelaide at Brisbane, May 18 – 92-93, 80th minute – 84.8%
Betts scored with about three seconds left in the game, so the leverage here assumes there was time to kick another scoring shot – there wasn’t. The leverage here is a little higher than Bowes because of the one-point margin at the start of the last minute. If the simulator looked at seconds instead of minutes, I’m nearly certain Eddie wouldn’t make the list at all.

2. Cale Hooker, Essendon vs GWS – 77-71, 80th minute – 86.3%
Hooker’s kick won Essendon the match in spite of the odds (well, Essendon’s winning percentage was about 49% at that point.) I had to manually add this kick to the top of the list due to a double issue: how the game was coded, and the simulator’s assumption about the game being tied if GWS were given a goal (remember, it throws out drawn games.)

1. Marc Murphy, Carlton at Fremantle, June 30 – 79-75, 80th minute – 87.4%

Win probability and leverage for Murphy’s winning goal against Fremantle

Marc Murphy didn’t kick after the siren, but his very late goal put Carlton up by 4 with only a few seconds left to play to cap off the upset – Fremantle were given a 78% chance of winning by the pundits.

And of course, a special additional mention for Michael Walters, who really should be all the way down here with that after-the-siren behind.

 

MLB: The odds of a perfect game

As I sat with my friend in the second deck of Dodger Stadium on September 4, 2017, we began to contemplate the impossible.

Heading into the bottom of the sixth inning, Robbie Ray had retired the first 15 batters he faced without allowing a baserunner. The Diamondbacks were up 2-0, J.D. Martinez having hit a two-run homer in the 4th.

Leading off the bottom of the 6th, Logan Forsythe would put a temporary end to our dream of seeing a historical baseball evening with a hard-hit single to center. Ray’s perfect game had suffered an early defeat.

Unlike most baseball nights, the dream of seeing history would later be resurrected by Kristopher Negron’s (remember him?) ninth-inning double to left, virtually guaranteeing Martinez would get his chance to hit his fourth home run of the night.

Unlike the perfect game, the four-home-run night builds: it’s not until the third home run that you consider yourself a potential witness to history, while a perfect game is conspicuous by the three consecutive zeros on the scoreboard from start to finish.

With the scoreboard still showing those zeros in the fifth, I began to wonder: how rare an event is it to retire the first 15 batters of a game? And while the odds of a perfect game are easily found, surely if you’ve gone slightly more than halfway through a game without allowing a baserunner, your odds of finishing the night without one must be better than the norm, right?

I couldn’t find this information anywhere, in spite of getting to Google Page #17 on a fairly specific search. As such, I decided to write a little computer program (a Python script, for you fellow nerds) which parsed retrosheet.org play-by-play data to figure out how rare the Robbie Ray 15-straight-to-start-the-game outing was, and to determine the odds of pitching a perfect game by the number of outs from the top of the first.

I took all the play-by-play files from retrosheet.org from 1930 onward – not all the games are complete, there may be small errors here and there which I didn’t account for, and I didn’t check for pitching changes – and ran them through a parser. The parser determined at which out the first batter reached base: if, say, a walk happened with 2 outs in the first, I marked the game as a “2”.

It turns out Robbie Ray’s 15 straight was a decently special event, occurring only 939 times in the 304,933 team play-by-plays analyzed (the odd total is due to several bad play-by-play files being removed.) If it were a batting average, seeing 15 in a row mowed down is akin to seeing a player with a .003 batting average get a hit. It happens about once every 162.5 games, so a starter for each team should do it once a year.

However, to demonstrate the difficulty of pitching a perfect game, Ray still had only a .019 chance of making history at that point. He had already pitched a game in the 99.7th percentile in terms of keeping runners off base from the start of the game, and he still had only a very small chance of making history.

Statistically, a pitcher has about a 28% chance of getting through the first inning without allowing a baserunner, but he only increases his odds of pitching a perfect game to .0002.
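These conditional odds fall straight out of the “Total occurrences” column of the table at the bottom of this post: of the games still perfect after N outs, 18 eventually reached all 27. A sketch using a few of those counts:

```python
# "Total occurrences" counts from the table below, for a few outs
total_occurrences = {3: 85830, 15: 939, 21: 125, 26: 29, 27: 18}

def p_perfect_given(outs):
    # Chance of a perfect game, given N straight outs to start the game:
    # games that went the full 27 divided by games still perfect at N
    return total_occurrences[27] / total_occurrences[outs]

print(round(p_perfect_given(3), 4))   # perfect 1st inning: prints 0.0002
print(round(p_perfect_given(15), 3))  # Ray's 15 straight: prints 0.019
print(round(p_perfect_given(26), 3))  # one out away: prints 0.621
```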

The probabilities finally start looking better for a pitcher around the seventh inning, which makes sense: the pitcher will have gone through what should be the best part of the lineup three times and is on the home stretch. The pitcher’s slot in the National League also shows up in the numbers – more perfect games are lost with 9 outs gone than with 8, and even with 18 gone than with 17 (8 and 17 representing two outs in the 3rd and 6th, respectively, when the opposing pitcher is often the batter due up.)

The first time a pitcher has a >10% chance of a perfect game is with 21 outs gone, or through seven innings. From there, the odds begin increasing dramatically: 20.7% after 22, 29.5% after 23, 41.9% after 24, 51.4% after 25, and 62.1% after 26 outs. That’s right: 11 of the 29 times a pitcher got to 26 outs, batter number 27 reached base (including, remember, Yusmeiro Petit.)

Though the sample size is very, very small, to the point where trying to divine meaning doesn’t make much sense, it’s still interesting that the 29 batters representing the final out in a perfect game have reached base a higher percentage of the time (.379) than the 304,933 leadoff batters (.345). Even more interesting, the biggest jump in historic probability occurs between outs 23 and 24. Batters have broken up a perfect game only 14 times in the bottom of the 9th with zero or one out, compared to 11 times with two outs. In fact, the rarest situation is the perfect game broken up after one out in the 9th – the 26th batter has only made it on base 6 times in 35 opportunities, for a .171 OBP (well, an OBP which includes errors.)
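Those per-out OBP figures are just times-ended divided by total occurrences at that out. Pulling three pairs of counts from the table below:

```python
# "Times ended" / "Total occurrences" pairs from the table below
ended = {0: 105159, 25: 6, 26: 11}
total = {0: 304933, 25: 35, 26: 29}

def obp_at(out):
    # On-base (with error) percentage for the batter facing this out count
    return round(ended[out] / total[out], 3)

print(obp_at(0))   # leadoff batter: prints 0.345
print(obp_at(25))  # one out in the 9th: prints 0.171
print(obp_at(26))  # two outs in the 9th: prints 0.379
```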

If you see a pitcher retire the first three batters, that’s between a 1-in-3 and 1-in-4 occurrence. If he makes it through two innings, that’s approximately a 1-in-11. Retiring nine straight to start the game happens roughly once every 31 games – by that point, the pitcher has done better than 97.9% of all pitchers since 1930, in games without corrupted play-by-play data.

Another interesting statistic: the on-base-with-error percentages are highest for outs 27 (.379), 2 (.355), 20 (.349), 1 (.345), and 11 (.339). Generally, with the exception of the number three hitter, the chance a batter will reach base in a given situation decreases as the game goes on – which is in line with what we’d expect from a pitcher capable of retiring 18 straight – until, of course, that final out.

Oh, and Dennis Martinez’s 1991 perfecto is the only game in the play-by-play I looked at where both pitchers took perfectos into the sixth inning – fortunately for Dennis, Mike Morgan gave up a no-out sixth-inning hit to Tucson, Arizona native Ron Hassey, the only catcher in history to catch more than one perfect game.

So, your conclusion: enjoy the games where the pitcher gets into the fourth or fifth inning without allowing any baserunners. The fun’s likely to end soon, but you’ve still witnessed a fairly rare event. If a pitcher is perfect through seven, you’re unlikely to see a perfect game, but you’ve now got a 1-in-7 chance of witnessing history. And be nervous with two outs in the ninth, since the only person in the stadium who isn’t nervous is probably Eric Chavez.

I have included the table and the Python script below. (Sorry about the headers. An hour of CSS design didn’t do much.)

| Outs | Times ended | Total occurrences | % chance of perfect game | % chance of retiring X straight to start | Odds of being broken up after this out | Percentile | Odds change between outs | OBP (with error) by out | 1 in X games* |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 105159 | 304933 | .000 | 1.000 | .345 | .345 | .000 | .345 | 1 |
| 1 | 66685 | 199774 | .000 | .655 | .219 | .564 | .000 | .334 | 2 |
| 2 | 47259 | 133089 | .000 | .436 | .155 | .719 | .000 | .355 | 2 |
| 3 | 29771 | 85830 | .000 | .281 | .098 | .816 | .000 | .347 | 4 |
| 4 | 18385 | 56059 | .000 | .184 | .060 | .876 | .000 | .328 | 5 |
| 5 | 12020 | 37674 | .000 | .124 | .039 | .916 | .000 | .319 | 8 |
| 6 | 7902 | 25654 | .001 | .084 | .026 | .942 | .000 | .308 | 12 |
| 7 | 5202 | 17752 | .001 | .058 | .017 | .959 | .000 | .293 | 17 |
| 8 | 2760 | 12550 | .001 | .041 | .009 | .968 | .000 | .220 | 24 |
| 9 | 3246 | 9790 | .002 | .032 | .011 | .979 | .001 | .332 | 31 |
| 10 | 2185 | 6544 | .003 | .021 | .007 | .986 | .001 | .334 | 47 |
| 11 | 1476 | 4359 | .004 | .014 | .005 | .991 | .002 | .339 | 70 |
| 12 | 923 | 2883 | .006 | .009 | .003 | .994 | .003 | .320 | 106 |
| 13 | 604 | 1960 | .009 | .006 | .002 | .996 | .004 | .308 | 156 |
| 14 | 417 | 1356 | .013 | .004 | .001 | .997 | .006 | .308 | 225 |
| 15 | 261 | 939 | .019 | .003 | .001 | .998 | .007 | .278 | 325 |
| 16 | 190 | 678 | .027 | .002 | .001 | .998 | .010 | .280 | 450 |
| 17 | 107 | 488 | .037 | .002 | .000 | .999 | .010 | .219 | 625 |
| 18 | 116 | 381 | .047 | .001 | .000 | .999 | .021 | .304 | 800 |
| 19 | 73 | 265 | .068 | .001 | .000 | .999 | .026 | .275 | 1151 |
| 20 | 67 | 192 | .094 | .001 | .000 | 1.000 | .050 | .349 | 1588 |
| 21 | 38 | 125 | .144 | .000 | .000 | 1.000 | .063 | .304 | 2439 |
| 22 | 26 | 87 | .207 | .000 | .000 | 1.000 | .088 | .299 | 3505 |
| 23 | 18 | 61 | .295 | .000 | .000 | 1.000 | .124 | .295 | 4999 |
| 24 | 8 | 43 | .419 | .000 | .000 | 1.000 | .096 | .186 | 7091 |
| 25 | 6 | 35 | .514 | .000 | .000 | 1.000 | .106 | .171 | 8712 |
| 26 | 11 | 29 | .621 | .000 | .000 | 1.000 | .379 | .379 | 10515 |
| 27 | 18 | 18 | 1.000 | .000 | .000 | 1.000 | -1.000 | 1.000 | 16941 |

*divide by two to get the actual number, since we “count” each game twice: once for each team.

And the python script:


#© 2018 John Holden
#You are free to use and distribute this script as long as you do not charge and credit is given to the author
#Script to determine play-by-play
import csv
import os

#printMe function -
#instead of printing all files to the console, just print a couple games to check to see it’s working properly
#otherwise it gets super unwieldy super quickly
#also check to see what games don’t catch and give you >27 outs
def printMe(printString):
    if activegame == 17 or firstout1 > 27 or firstout2 > 27:
        print(printString)

#define which variables we will be using
activegame = 0    
onbase1 = 0    #flag for the away team to see if anyone has been on base
onbase2 = 0    #flag for the home team to see if anyone has been on base
firstout1 = 0    #variable to store how many outs there were when person reached base
firstout2 = 0
ider = ''    #for display purposes – gets the ID of the game
outs = {}    # dictionary to track the number of times a perfecto was broken up at a specific number of outs

for csvFilename in os.listdir('PerfectGame'):
    if csvFilename.endswith('.EVN') or csvFilename.endswith('.EVA'):
        with open('PerfectGame/' + csvFilename, newline='') as csvfile:    #text mode with newline='' for Python 3's csv module
            pbp = csv.reader(csvfile, delimiter=',', quotechar='"')
            for row in pbp:
                if row[0] == "id":
                    printMe("new game")
                    if activegame > 0:
                        outs[firstout1] = outs.get(firstout1, 0) + 1
                        outs[firstout2] = outs.get(firstout2, 0) + 1
                    if firstout1 > 25 or firstout2 > 25:
                        #print all the games that went into the ninth inning to make sure this is working properly
                        print(ider + " " + str(firstout1) + " " + str(firstout2))
                    #we have a new game going so reset all the variables
                    firstout1 = 0
                    firstout2 = 0
                    onbase1 = 0
                    onbase2 = 0
                    activegame = activegame + 1
                    ider = row[1]
                if row[0] == "play":
                    pbpres = row[6]    #the Retrosheet event string, e.g. "S8" or "K"
                    printMe(row)
                    #did the batter reach base? S/D/T = single/double/triple,
                    #H = home run (or hit by pitch), W = walk, E = error
                    if onbase1 == 0 and row[2] == '0':    #away team batting
                        if pbpres[:1] in ("H", "T", "D", "S", "W", "E"):
                            onbase1 = 1
                            printMe("game1 killed " + str(firstout1))
                        elif pbpres[:2] != "NP":    #"NP" is a no-play; anything else counts as an out
                            firstout1 = firstout1 + 1
                    if onbase2 == 0 and row[2] == '1':    #home team batting – this could have been looped
                        if pbpres[:1] in ("H", "T", "D", "S", "W", "E"):
                            onbase2 = 1
                            printMe("game2 killed " + str(firstout2))
                        elif pbpres[:2] != "NP":
                            firstout2 = firstout2 + 1
#record the very last game – the loop above only logs a game when the
#next "id" row appears, so the final game would otherwise be dropped
if activegame > 0:
    outs[firstout1] = outs.get(firstout1, 0) + 1
    outs[firstout2] = outs.get(firstout2, 0) + 1
print(outs)