SamSuka
Excel LADZ
Excel LADZ

patreon


Master Sports Betting with Monte Carlo Simulations!

G'day lads! My next video on YouTube will be an explanation of how you can use Monte Carlo simulations in Excel to compute probabilities for betting markets in Excel 🔥 I'm hoping this video will be filmed and released on YouTube within the next week.

Meanwhile, attached is the script and a model projecting the winner of MLS. The file does not have the accompanying 10,000 simulations as it would lag WAY too much. I froze the simulations by copying and pasting over the top, so that the model is easier to use and look around. Let me know if you have any questions below lads!

Note: The model uses a LET function that is only available to Excel 2019 and later users. A 'Legacy' version of the file with a slightly reduced capability will be released shortly. Enjoy 💪

Video Introduction & Futures Betting


G’day lads! In this video I’ll take you through how to use Monte Carlo Simulations in Excel so that you can price markets in sports betting. Right now, the MLS is about half-way through their season, and Messi’s Inter Miami are sitting first in their conference. Bet365 have them as clear favourites to win the MLS Cup, at $2.62.


This is an example of a futures market; it’s an event that is happening in the future. Now lads, whenever you place a bet, you want to be sure that it has a positive expected value. As you saw, Miami is priced at $2.62 to win the MLS. This implies that it has a probability of 38.2%. If you learn that the event is actually 50% likely to occur, then “fair odds” would be $2.00. As a result, a bookmaker offering $2.62 would give you a positive expected value, meaning that, in the long term, you are expected to profit from the bet.

So, to find the expected value of Inter Miami winning the MLS, we need to find the event’s probability. Unfortunately, there isn’t a straightforward way to do this. There are about 228 games left until the MLS Cup, the final of the competition.

With three possible outcomes in every match (a win, draw and a loss), the number of potential outcomes is 3^[228]; and this doesn’t even account for the range of possible score lines in a match. Clearly, there’s too many variables for a simple formula to calculate Miami’s probability.


Monte Carlo Simulations

However, we can simulate these 228 games numerous times to see the distribution of results. This process is known as a Monte Carlo Simulation.

To illustrate, consider rolling a six-sided die. Suppose you don’t know the probability of rolling a six. By using a Monte Carlo Simulation, we can simulate rolling the die 100,000 times to see how often a six appears. By dividing the number of sixes rolled by the total number of rolls, we can find the experimental probability of rolling a six. The more trials we conduct, the closer this experimental probability will be to the “true” probability of one-sixth.

We can apply the same principle to the MLS competition. By simulating the rest of the season, we can determine a “simulated winner”. Running this process 10,000 times gives us a distribution of possible outcomes. The good news is that this process is very easy to do in Excel. 


Poisson Process for Football Games

Firstly, I’ve imported a live MLS table and fixtures/results list from the website Football Reference. I’ve done this by connecting Football Reference to my Excel file with a Power Query, so that the stats update whenever I select the “Refresh All” icon in the ‘Data’ tab.

Using this data, I can create a worksheet that summarises each regular season game into columns named ‘Game Number’, ‘Home Team’, ‘Home Score’, ‘Away Score’ and ‘Away Team’ using an XLOOKUP function. As only 265 games have been played, the remaining 228 games have a blank cell for the Scoreline columns.

In order to simulate a season, these remaining games need to be simulated so that a random score is given to each team in the match. These scores can’t be completely random though; they have to be based on the relative strength of each team.

We can start by determining the Attack and Defence ratings for each team in a matchup. The Attack rating is calculated by dividing a team’s average goals scored per match by the league’s average goals per match. Similarly, the Defence rating is determined by dividing a team’s average goals conceded per match by the league’s average goals conceded per match.

Next, we need to calculate the expected number of goals in a match (xG). This is done by multiplying the attacking rating of one team by the defensive rating of the opposing team, and then multiplying by the league’s average goals per game. I’m also going to add a 20% xG advantage to the home team, which has been consistent throughout this season. This can be done by multiplying the home team’s xG by the square root of 1.2, and then multiplying the away team’s xG by the reciprocal. This gives us the xG for each team in the matchup.

To simulate the actual goals scored using these xG values, we can use the BINOM.INV function in Excel. The formula looks like this: BINOM.INV(10000, xG/10000, RAND()). This function simulates the number of goals scored by a team, treating each potential goal as an independent trial with a small probability.

The BINOM.INV function models a binomial distribution, which is suitable for a situation where you have a fixed number of trials, each with a constant probability of success. When the number of trials (in this case, 10000) becomes very large and the probability of success (xG/10000) becomes very small, the binomial distribution approximates the Poisson distribution. And lads, the Poisson distribution is ideal for modelling the number of goals in a soccer match because it describes the probability of a given number of events happening within a fixed interval; that is, the number of goals scored in 90 minutes.

So by using the BINOM.INV function with 10,000 trials, we effectively simulate the goals scored in a way that closely aligns with the Poisson distribution, providing a realistic outcome based on the teams’ attacking and defensive strengths. 

So that the model is stochastic - meaning that the attacking and defensive strengths of each team update after every simulation - I’m including the BINOM.INV formula within a huge LET function. This formula will retain the scoreline of completed games while simulating matches yet to be played. By dragging this formula down for all matches, the remainder of the regular season unfolds with simulated scorelines. Each time I refresh the worksheet, these simulations automatically update.

Next, I’ll summarise these outcomes by creating a dynamic table for both the Eastern and Western Conferences. To do this, I’ll use a series of SUMPRODUCT functions to grab a team’s results, and then I’ll rank them using the SORT function above. This table adjusts in real-time whenever I refresh the simulations, reflecting a new simulated season. Now that each team has been seeded, it’s time to simulate the postseason to find a winner of the competition.


MLS Postseason Scenario

The playoffs start with a wild-card match between the 8th and 9th seeds of each conference, decided by penalties after normal time. The winners of these matches progress to the ‘Round One Best-of-3 Series’ to play their respective seeds in a best-of-three format.

The winners advance to the ‘Conference Semifinals’, which are single elimination games hosted by the higher seeds. The victors then proceed to the ‘Conference Finals’; also single elimination games. Finally, the conference winners compete in the ‘MLS Cup Final’ to decide the champion. All of this needs to be set up in our spreadsheet.

To simplify things, I’ll use each team’s final Attack and Defence ratings from the regular season as constants throughout the playoffs. These ratings will update with each new simulated season. Using these fixed ratings, I’ll create a new formula using the LET function to end the game after 90 minutes. If the game is tied, the RAND function will give each team a 50% chance of winning the penalty shootout.

The winners of these wild card matches move onto the ‘Round One Best-of-Three Series’ to face the first seed in their conference. Second hosts seventh, third plays sixth, and fourth plays fifth. Although in real life a third game isn’t played if a team is up 2-0 in the series, I’ll include it to simplify the following formulas. The winners of each series advance to the Semi-Finals. Here, I’ll modify the LET function again to include 30 minutes of extra time if the game is tied after normal time. This involves dividing a team’s xG by three to simulate 30 minutes of extra time instead of 90 minutes. If still tied, penalties will decide the winner.

The winners move on to the Conference Finals, and the winners of those go to the MLS Cup. From there, I can determine the winner from the simulated score. And as you can see lads, we have a winner for a single simulated season, in this case it’s [Inter Miami].

Now, to find the probability of each team winning the MLS Cup, we need to run the entire season simulation 10,000 times to get a distribution of results.


Data Table + Analysis

To do this, I’ll create a new worksheet called “Season Simulations” and list each team horizontally along the first row. Using an IF function, I’ll return a 1 if the team has won the MLS (that is, they appear in cell L5 of the “Post Season Simulations” worksheet) and a 0 if they haven’t. I’ll drag this formula across for every team. Since there should always be only one winner each season, there will be a single 1 every time I update the simulations.

In column A, I’ll create a sequence from 1 to 10,000. Beside this, I’ll run the formulas to pick up the competition winner for all 10,000 rows. To run these simulations, highlight cells A2: AD100002. Then, go to the ‘Data’ tab, select the ‘What-if Analysis’ icon, and choose Data Table. Skip the “row input cell” and for the “column input cell” select a random cell outside the dataset. Press “OK”, and 10,000 simulated seasons will be generated.

Next, we need to analyse these simulations. Create a new worksheet called “Analysis” and, using an INDEX MATCH formula, I’ll count the number of instances where a team has a 1 in their simulations, indicating they were champions, and divide by the number of trials (10,000) to get a probability. To find the “fair” decimal odds from these probabilities, I’ll divide 1 by the percentages in the next column.

Comparing the odds offered by Bet365, our model indicates that Messi’s Inter Miami is significantly overvalued by the bookmaker. According to the simulations, the fair odds for Inter Miami should be around $7.37. This difference is probably due to the popularity of the team, considering Messi, Suarez, Busquets and Alba are now playing for them. When bookmakers anticipate a large volume of bets on a particular team, they’ll reduce their odds for that market to lower their potential payout. This is a risk management technique called “balancing the book”.

Excluding Inter Miami, we can use the expected value (EV) formula to identify undervalued bets. By considering trades with a minimum 10% expected value, we find that profitable teams include Real Salt Lake, the Columbus Crew, LAFC, Seattle Sounders, and FC Dallas.

Just keep in mind lads that these probabilities were generated using a basic Poisson Attack/Defence model. They can be updated with a more accurate formula that takes into account player changes for each match, previous fixture difficulty and other factors. However, the process for generating the probabilities using Monte Carlo simulations remains the same.

NBA Example

Also lads, Monte Carlo simulations aren’t just useful for futures. In another video, I’ve used Monte Carlo simulations to generate probabilities for the Spread, Moneyline, and Under/Over markets in NBA games. In this model, I treated a player’s simulated simulated points as the sum of their simulated Free Throws, 2-Pointers, and 3-Pointers in a match using the binomial distribution. The sum of every player’s scores in a match was the team’s score, which was then simulated 10,000 times to get a distribution of points for each team.



More Creators