A statistical approach to sports betting
While gambling on sports fixtures is a popular activity, for the majority of gamblers it is not a profitable one. To make a consistent profit, a gambler must be able to assess accurate probabilities for the outcomes of the events on which bets are placed. A small number of gamblers are able to do this through experience of betting, familiarity with certain sports and a natural aptitude for estimating probabilities. This thesis attempts to achieve the same but through purely scientific means.
Three main areas are covered in this thesis: the market for red and yellow cards in Premier League soccer, the market for scores in American football (NFL) and the market for scores in US basketball (NBA). Several issues must be considered when attempting to fit a statistical model to any of these betting markets. These are introduced in the early stages of the thesis, along with some previously suggested solutions. Among them, for example, is the importance of obtaining estimates of team characteristics that reflect the belief that these characteristics change over time. It is also important to devise measures for evaluating the success of any model and to be able to compare the predictive abilities of different models for the same market.
A general method is described which is suitable for modelling the sporting markets featured in this thesis. This method is adapted from a previous study on UK soccer results and involves the maximisation of a likelihood function. To make predictions that have any chance of competing with the odds supplied by professional bookmakers, this modelling process must be expanded to reflect the idiosyncrasies of each sport. For the market for red and yellow cards in Premier League soccer matches, one must consider the effect of the referee in addition to the characteristics of the two teams in the match. It is also found that the average booking rate for Premier League soccer matches varies significantly over the course of a season.
The unusual scoring system used in the NFL means that a histogram of final scores does not resemble any standard statistical distribution. A wealth of data besides the final score is also available for every NFL match, and it is worth investigating whether exploiting this additional past data yields more accurate predictions for future matches.
The analysis of basketball considers the busier schedule of games that NBA teams face compared with NFL or Premier League soccer teams. The result of a match may plausibly be affected by the number of games that a team has had to play in the days immediately before it. Furthermore, data are available giving the score at various stages of each match. Using these data, one can assess to what extent, and in which situations, the scoring rate varies during a match.
These issues, among many others, are addressed in this thesis. In each case a model is devised and a betting strategy is simulated by comparing model predictions with odds that were supplied by professional bookmakers prior to fixtures. The limitations of each model are discussed and possible extensions of the analysis are suggested throughout.