Feb 25, 2023; Atlanta, Georgia, USA; San Jose Earthquakes defender Jonathan Mensah (4) controls the ball in the first half against the Atlanta United FC at Mercedes-Benz Stadium. Mandatory Credit: Brett Davis-USA TODAY Sports
Special acknowledgment to Quakes Epicenter patron Trevor Wojcik, a data scientist in the economics field, for constructing a season and player simulation technique for me that will be used in this series and beyond. Trevor’s insightful work and feedback made this preseason series concept for 2023 possible.
Introduction
I’m still shell-shocked. Every weekend in MLS feels like it has wild finishes, but not often quite like this. In the first game of Luchi Gonzalez as the head coach of the San Jose Earthquakes, we saw the Quakes involved in a finish that we maybe haven’t seen at this level since Shea Salinas and Alan Gordon in 2013 at the Cali Clasico. Two goals in stoppage time to win a game…but in reverse Goonies style. The Quakes were on the wrong end of a soccer gods decision at Mercedes-Benz Stadium on Saturday night, and many of us are struggling to make sense of it.
This fan base has been through a lot in the past few years. 2022 saw as little hype around the club as I’ve seen since I began following it at the end of 2011. Between the Stahre and Almeyda eras, the fans have felt crushed by defeat after defeat, and only in the middle of 2019 were they able to feel that the tide had turned in their favor. That was three-and-a-half years ago.
In the aftermath of Saturday’s game, I couldn’t tweet. I didn’t even have time to put out a tweet. I didn’t have to talk to the fans since we had four amazing hosts I could lean on to do that, but I struggled with trying to produce The Aftershock postgame show in the face of such a finish and then the changes thrust on us at the last second due to the new Apple TV/MLS Season Pass plans for postgame press conferences.
While I love what we do here, there was a moment I felt like walking away. “Is this worth it?” was a question I hadn’t really asked myself through all the losses of the past five years, but I asked it last night in the face of an avalanche of negative outcomes within the span of about 20 minutes.
While I pondered that question, I saw two different responses from the analysts and the fan base on Twitter and The Aftershock Postgame Show:
- “Same ol’ Quakes” from fans who feel like they have again been killed by the hope of a new season, a new coach, and a few new players.
- “This one was different, though” from fans who tried to keep the perspective that this was just one game — just the very first game — of a very long season.
“The die has been cast.” Same ol’ Quakes.
“The die has not been cast.” Today was not a different result, but what we saw was different.
Part 3 of this series here was supposed to be a final preseason article, but personal circumstances prevented me from writing it. The big plans I had to work again with Trevor Wojcik on a player evaluation model vanished. And then that Atlanta United game happened. How could I write this pretending it didn’t?
Dice and Probabilities
According to dictionary.com, the phrase “the die has been cast” is said to have originated from the Latin “Iacta alea est” which is alleged to have been said by Julius Caesar when he crossed the Rubicon and invaded Italy in 49 BC. Dice themselves originate between 2000 and 3000 BC in both Egypt and China. The word “die” in this context originates from Old French and Latin words referring to “something which is given or played”.
While I’ve never been a Dungeons and Dragons player, I have some experience with different types of dice due to playing sports simulation games as a child and teenager. It won’t surprise you that I played these games with both six-sided and 20-sided dice as a youngster. I even created a couple of them myself.
Regular readers of Quakes Epicenter know that I’m a probabilities guy. The nuance is that I don’t just look at data the way a data analyst does; I also observe situations, work out how to use data to analyze them individually or in the aggregate, and look for repeatable patterns.
Expected goals, for example, are an imperfect way of determining quality because there are many observable elements of how shots happen that are not found in the data because they are not recorded. For example, what was the shooter’s body positioning? What part of the foot and the ball did they strike? Did the player move into the ball? Did the ball get caught under their body? Was the player leaning forward or leaning back? Was the keeper’s view obstructed? Was the keeper leaning in a particular direction? Did the keeper have to take a step first to reach the ball’s trajectory? etc. etc. etc.
I could go on and on…and on about how the observable universe is not fully captured in the data. Data science experts in sports analytics tell us that about 70%-80% of the things that matter the most about a shot’s probability are recorded in the data.
However, as imperfect as they are, we know that expected goals predict future goals better than past goals do. It’s been proven many times, although the data for MLS is particularly noisy, as the referenced American Soccer Analysis article demonstrates.
Regardless, there are certain truisms in the sport of football/soccer:
- Non-penalty shots are converted 10% of the time (home 1% better than away)
- Penalty shots are converted ~75% of the time
- Expected goals follow conversion rates using decimal places: 0.1 for non-penalty shots
- Home teams get 14 non-penalty shots per game, away teams get 11
- Only about 35% of shots are on target (resulting in a goal or a save) regardless of home or away
In MLS specifically:
- Home teams win ~50%, draw ~25%, and lose ~25% with some season-over-season variation
This type of understanding at a very basic level of how the sport works grounds us in the basics of simulation and probabilities, such as we’ve done all preseason long. No matter how much nuance and complexity we add to how we view the sport, the underlying values never change very much at all, allowing us to confidently depend on them when viewed under the right lens.
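As a rough illustration of how far those basics go, here is a minimal sketch that turns the shot counts and conversion rates above into home win/draw/loss probabilities. It is not the simulation technique Trevor built for this series. It assumes every shot is an independent coin flip at a flat conversion rate, it ignores penalties and own goals, and the 10.5%/9.5% split is only one assumed reading of "home 1% better than away." The function names are made up for this example.

```python
from math import comb


def goal_distribution(shots: int, conversion: float) -> list[float]:
    """Binomial probability of scoring 0..shots goals from independent shots."""
    return [
        comb(shots, k) * conversion**k * (1 - conversion) ** (shots - k)
        for k in range(shots + 1)
    ]


def home_result_probabilities(
    home_shots: int = 14,      # non-penalty shots per game for the home team
    away_shots: int = 11,      # non-penalty shots per game for the away team
    home_conv: float = 0.105,  # ASSUMPTION: 10% baseline, home slightly better
    away_conv: float = 0.095,  # ASSUMPTION: 10% baseline, away slightly worse
) -> tuple[float, float, float]:
    """Return (home win, draw, home loss) probabilities for one game."""
    home = goal_distribution(home_shots, home_conv)
    away = goal_distribution(away_shots, away_conv)
    win = draw = loss = 0.0
    # Walk every possible scoreline and add its probability to the right bucket.
    for h, p_h in enumerate(home):
        for a, p_a in enumerate(away):
            p = p_h * p_a
            if h > a:
                win += p
            elif h == a:
                draw += p
            else:
                loss += p
    return win, draw, loss


if __name__ == "__main__":
    win, draw, loss = home_result_probabilities()
    print(f"home win {win:.1%}, draw {draw:.1%}, home loss {loss:.1%}")
```

Running it and comparing the output with the MLS home split listed above is a quick way to see how much of the league's home advantage a simple shot-count edge can account for.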
There is a die we can cast.
Will Things Get Better Under Luchi?
I mean this is the real question, isn’t it? And the answer flatly is, “I don’t know.” I sure hope so for your sake, dear reader.
We only have one game to draw from. It was an away game. It was the first game of the season, and players are still getting full fitness, requiring interesting and maybe not-typical substitution decisions. It took two world-class shots from perhaps the most world-class player in MLS to win it.
Currently, the Quakes are on pace to score 34 and give up 68. Well, that’s one goal better on the defensive side than in 2022 already. We’ll know a good bit better after 5 games, and then 10 games as expected goals and other metrics become their most predictive.
That said, we can look at certain aspects of this single away game by comparing it to recent away seasons and then using these metrics that we can depend on, such as shots per game and xG, along with expected away performances, to analyze true performance throughout the 2023 season.
We can see from this table of away game non-penalty for/against shot metrics that in Luchi’s first game with the Quakes, they gave up 19 shots. Not great! In fact, it’s more than the average in any other season listed. But here’s what happened: the Quakes forced Atlanta into some pretty terrible shots — about half the usual xG Against (xGA) per shot the last five seasons. Do you know what you call a shot against you that doesn’t go in the net (unless they get a rebound shot or a deflection for a corner)? A turnover.
Also, the Quakes only managed 10 shots, which is below their away average since 2018. Also not great! But here’s what happened: the Quakes improved their xG For (xGF) per shot by about 3-5% and grabbed a 1-0 lead by outperforming Atlanta United for the first 60′ of the game, despite the unfortunate handball call on Marie.
To state the obvious, 10 shots at 0.121 xGF per shot (about 1.21 xG) add up to more than 20 shots at only 0.055 xGA per shot (about 1.10 xG). It’s simple multiplication and subtraction.
xGD per Game = (Shots For per Game * xGF/shot) - (Shots Against per Game * xGA/shot)
What does this mean? Well, if this pattern continued exactly as is (it won’t), the Quakes would likely win more than they lose on the road and grab a bunch of draws along the way. But since it won’t continue, what we want to see is something closer to 2019, where the Away xGD per Game column is a positive number or very close to it. In 2019, Almeyda did this by peppering the opposition with tons of low-quality shots. In 2023, I expect Luchi to do the opposite of Almeyda and plan to shoot higher-quality shots, even if that means fewer of them. Fortunately for us, this simple formula can help us track how it’s going.
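For anyone who wants to track this at home, the formula can be sketched in a few lines of Python. The function name is made up for this example, and the inputs are the Atlanta figures quoted alongside the formula (10 shots at 0.121 xGF per shot, 20 shots at 0.055 xGA per shot).

```python
def xgd_per_game(
    shots_for: float,
    xgf_per_shot: float,
    shots_against: float,
    xga_per_shot: float,
) -> float:
    """xGD per Game = (Shots For * xGF/shot) - (Shots Against * xGA/shot)."""
    return shots_for * xgf_per_shot - shots_against * xga_per_shot


# Away game at Atlanta, using the per-shot figures quoted with the formula.
atlanta_away = xgd_per_game(
    shots_for=10, xgf_per_shot=0.121, shots_against=20, xga_per_shot=0.055
)
print(f"Away xGD per game: {atlanta_away:+.2f}")
```

Feeding in season-to-date averages instead of single-game numbers gives the running Away xGD per Game value, and the same function works for the home side once those games are played.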
After this next weekend, we can start tracking the same metrics for the home side of the equation.
Yeah, but They Lost
Yes, and so did most of the away teams on Saturday and Sunday. So far home teams have won eight, and away teams have won four. Vancouver also lost to RSL after blowing a 1-0 second-half lead, conceding two goals in two minutes (sound familiar?). STL City only won because Austin gifted them goals. A bad Charlotte team lost 1-0 to the Revs, and a bad Minnesota United team played poorly but got a goal in Dallas while FCD managed only one shot on frame.
But since 2013, in games where the away team holds a one-goal lead and a second goal is scored, the home team gets that second goal 64% of the time. Away teams get out-shot by a similar margin. The away team only finishes with a 1-0 win 27% of the time. Home teams come back to get at least a draw almost 50% of the time.
Luchi seems to have a good sense of this dynamic if not the actual probabilities. His substitutions were intended to create a mid-to-low block that would be difficult to play through and force Atlanta United into desperation long shots.
“I felt that we had the match in control. I was very confident that we were going to win the match. We put up a good mid-to-low block, using a 4-5-1 or 4-1-4-1 [formation], and it was hard to break through,” Luchi Gonzalez said in his postgame press conference. “Even though there were shots made, they were blocked or from a distance that we think we can handle. [Atlanta United FC] was good at making combinations centrally, so that is why we tried to load the middle defensively — play compact. But they sprayed the ball wide a few times with some interesting crosses, and the corner came from a cross.”
This setup resulted in Atlanta United shooting as poorly as they did throughout the game, and particularly in the second half. It explains Luchi’s substitutions. And it worked, mostly.
The problem is that once one goal goes in, desperation turns to confidence for the home side.
The Final Simulation (of this series)
But, in typical Jamon-article fashion, we are going to replay the end of the Quakes game. We are going to do it by casting a die. We are going to give Atlanta United 11 shots after the 58th minute as the Quakes did, and they are going to have to hit a 1 on a 20-sided die given their penchant in the game for 0.05 xG shots. The die is going to be cast. A lot.
Given 11 shots with a 20-sided die and needing a “1” to score a goal, at least one goal will be scored (thrown) about 43% of the time (1 − 0.95^11). You can test this yourself with a probability calculator, or if you have the time, a 20-sided die.
The probability calculator gives the Quakes about a 57% chance of holding on to win with a 1-0 lead while giving up 11 0.05 xG shots, assuming they don’t score again themselves.
I asked Trevor Wojcik to simulate this, you guessed it, 10,000 times.
For Atlanta United to score at least two goals over those 11 shots, it was a 10% chance. For Thiago Almada to score two in a row on under 0.05 xG shots to end the game is like throwing snake eyes with two 20-sided dice. It’s a 1-in-400 chance. It’s less “same ol’ Quakes” and more “holy crap, Almada!”.
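If you don’t have a 20-sided die handy, the replay can be sketched in Python. To be clear, this is not Trevor’s simulation, just a minimal stand-in under stated assumptions: every Atlanta shot is an independent 1-in-20 roll, the Quakes never score again, and the function names are invented for this example. Under the exact 1-in-20 assumption, the binomial formula gives roughly 43% for at least one goal and roughly 10% for at least two. Swap in the 0.055 xGA per shot quoted earlier and the first figure rises to about 46%.

```python
import random
from math import comb


def prob_at_least(goals: int, shots: int = 11, p: float = 1 / 20) -> float:
    """Exact binomial probability of at least `goals` goals from `shots` shots."""
    return sum(
        comb(shots, k) * p**k * (1 - p) ** (shots - k)
        for k in range(goals, shots + 1)
    )


def replay_finish(
    trials: int = 10_000, shots: int = 11, sides: int = 20, seed: int = 2023
) -> dict[str, float]:
    """Cast the die `shots` times per trial; a roll of 1 is an Atlanta goal.

    ASSUMPTION: the Quakes start 1-0 up and do not score again, so
    0 goals against = win, 1 = draw, 2 or more = loss.
    """
    rng = random.Random(seed)  # fixed seed so reruns give the same result
    counts = {"win": 0, "draw": 0, "loss": 0}
    for _ in range(trials):
        goals = sum(rng.randint(1, sides) == 1 for _ in range(shots))
        if goals == 0:
            counts["win"] += 1
        elif goals == 1:
            counts["draw"] += 1
        else:
            counts["loss"] += 1
    return {result: n / trials for result, n in counts.items()}


if __name__ == "__main__":
    print(f"exact P(at least 1 goal):  {prob_at_least(1):.1%}")
    print(f"exact P(at least 2 goals): {prob_at_least(2):.1%}")
    print(f"two 1s in a row:           {(1 / 20) ** 2:.4f}")
    print("simulated:", replay_finish())
```

With 10,000 trials, the simulated shares should land within about a percentage point of the exact values, which is why a probability calculator and a big pile of die rolls tell the same story.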
Some may say Daniel should have done better with Almada’s first shot, but when the goalkeepers come to Daniel’s defense, we should know better. It was a poor game to make any judgment on the new goalkeeper given he faced only gimmes and worldies.
In truth, the Quakes would almost certainly have their best road record since at least 2012 if they carried 1-0 leads into the 58′ and then let the opponent shoot 11 shots from outside the box while they packed it in every single away game. There’s a reason teams do that. They could only be so lucky to have that chance in all 17 away games in 2023.
Conclusion
This article was originally supposed to be about whether the Quakes will make the playoffs or not, but it’s really still a “let’s wait and see.”
Only one die has been cast, and 33 more are to come. Let’s cast 10 and look at this again. See you Saturday at PayPal Park.