PREDICTING SOCCER MATCH RESULTS USING REGRESSION MODELS FOR COUNT DATA
Brasileirão 2019; count data; La Liga 2018-19; regression models; prediction; overdispersion.
Football is the most popular sport in many countries. Bettors are often interested in how many goals will be scored in a match and in the outcome probabilities predicted by statistical models before making decisions. The Poisson regression model is the most common tool for predicting the number of goals scored by each team and appears in the vast majority of the related literature (articles, dissertations, monographs, among others). However, when count data exhibit overdispersion, the negative binomial model is an alternative. This work aims to expand the set of models available for predicting football match results by estimating the probability of each possible result (home win, draw and away win) and by presenting the first applications to football data of some models, such as the COM-Poisson, Bell and Poisson-Shanker models. Bivariate versions of the Poisson and negative binomial models were also considered. The models were applied to the matches of the 2019 Brazilian Championship Series A and of the 2018-19 Spanish La Liga. The models considered in this work showed good predictive performance, as measured by the Brier score.
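To illustrate how outcome probabilities and the Brier score relate, the following minimal sketch assumes the simplest case: goals for the home and away teams follow independent Poisson distributions with given means. The function names, the example means, the truncation at `max_goals` and the multi-category form of the Brier score (sum of squared differences over the three outcomes, without averaging over matches) are assumptions made for illustration and are not taken from this work; in practice the means would come from a fitted regression model.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k goals under a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def outcome_probs(lam_home, lam_away, max_goals=15):
    """Return (P(home win), P(draw), P(away win)) for independent Poisson goals.

    The score grid is truncated at max_goals (an illustrative choice) and the
    three probabilities are renormalised so that they sum to one.
    """
    home = draw = away = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, lam_home) * poisson_pmf(a, lam_away)
            if h > a:
                home += p
            elif h == a:
                draw += p
            else:
                away += p
    total = home + draw + away
    return home / total, draw / total, away / total


def brier_score(probs, outcome_index):
    """Multi-category Brier score for one match (0 is a perfect forecast).

    probs is (home, draw, away); outcome_index is 0, 1 or 2 for the observed
    result. Lower values indicate better forecasts.
    """
    return sum(
        (p - (1.0 if i == outcome_index else 0.0)) ** 2
        for i, p in enumerate(probs)
    )


if __name__ == "__main__":
    # Hypothetical expected goals, not estimates from any fitted model.
    probs = outcome_probs(1.6, 1.1)
    print("home, draw, away:", probs)
    print("Brier score if the home team wins:", brier_score(probs, 0))
```

Averaging `brier_score` over all matches in a season would give a single summary value for comparing models under these assumptions.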