We would rather have an outcome predicted than wait to see it. We would love it if someone could predict anything correctly. Analysts have been paid millions to predict the outcome of a game or an election. Surely none of them were as accurate as the octopus of the 2010 World Cup, who correctly predicted 8 games. So what made the sea dweller more accurate than the most intelligent species on the planet? Certainly his instinct. But as humans, can we develop something that is reasonably accurate at predicting the outcome of a competitive match or an election?
The answer is yes, we can, and quite easily.
I started learning machine learning a few months ago and did one college project on it. One day I was going through Stanford’s final-year projects on machine learning, and I was quite amazed when my eyes fell on the title "Predicting Soccer Match Results in the English Premier League" by Ben Ulmer and Matthew Fernandez.
I read the project report and was motivated to build a model that could predict the outcome of English Premier League matches. By instinct and by inspecting the data, I picked a few features: a team’s recent form, whether it is playing home or away, its standing last season, its standing in the current season, and so on.
Again going by instinct, I planned to use two methods to classify the results (win, loss, draw): one was a Support Vector Machine (SVM) and the other was Naive Bayes classification.
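To make the two-classifier setup concrete, here is a minimal sketch using scikit-learn. It is not my actual project code: the feature matrix and the labels are random synthetic stand-ins, the 240/60 split is arbitrary, and GaussianNB is an assumed choice of Naive Bayes variant.

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for the real feature table: 300 matches, 4 numeric
# features (for example home form, away form, home standing, away standing).
X = rng.normal(size=(300, 4))
# Synthetic labels from the home team's point of view; NOT real match results.
y = rng.choice(["win", "draw", "loss"], size=300)

# Simple chronological-style split: first 240 rows to train, last 60 to test.
X_train, X_test = X[:240], X[240:]
y_train, y_test = y[:240], y[240:]

models = {
    "svm": SVC(),                 # library defaults (RBF kernel)
    "naive_bayes": GaussianNB(),  # assumed variant for numeric features
}

predictions = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    predictions[name] = model.predict(X_test)
```

Because the labels here are random, neither model can learn anything useful; the sketch only shows how both classifiers share the same fit and predict workflow on a three-class target.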
I extracted data for 2200 Premier League matches from the last 10 seasons and organized them into feature columns. Features like home team, away team, and standings were easily available online, whereas features like form had to be calculated on the dataframe itself.
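A derived feature like form could be computed roughly as follows. This is a dependency-free sketch rather than the dataframe code I used: the dictionary keys, the five-match window, and the 3/1/0 points scheme are all assumptions made for illustration.

```python
from collections import defaultdict, deque

# Points awarded per result from a team's point of view (assumed scheme).
POINTS = {"W": 3, "D": 1, "L": 0}

def add_form(matches, window=5):
    """Attach each side's form (points from its previous `window` games).

    `matches` must be in chronological order. Each match is a dict with the
    hypothetical keys 'home', 'away', 'home_goals' and 'away_goals'.
    """
    recent = defaultdict(lambda: deque(maxlen=window))
    rows = []
    for m in matches:
        home, away = m["home"], m["away"]
        # Form is read BEFORE this match is recorded, so the feature never
        # leaks the result it is meant to help predict.
        rows.append(dict(m, home_form=sum(recent[home]),
                         away_form=sum(recent[away])))
        if m["home_goals"] > m["away_goals"]:
            h, a = "W", "L"
        elif m["home_goals"] < m["away_goals"]:
            h, a = "L", "W"
        else:
            h, a = "D", "D"
        recent[home].append(POINTS[h])
        recent[away].append(POINTS[a])
    return rows

# Made-up fixtures, only to exercise the function.
matches = [
    {"home": "A", "away": "B", "home_goals": 2, "away_goals": 0},
    {"home": "B", "away": "C", "home_goals": 1, "away_goals": 1},
    {"home": "C", "away": "A", "home_goals": 0, "away_goals": 3},
]
rows = add_form(matches)
```

The important design choice is the ordering inside the loop: a match only counts toward form for later fixtures, never for itself.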
The dataframe was preprocessed and fed into the model. After some trial and error and a few adjustments to the model, I found that an SVM with an RBF kernel (C=1, gamma=auto) gave the best train score of 0.78. The score used here is an F1 score, which can be interpreted as a weighted average of precision (also called positive predictive value) and recall (also called sensitivity); it reaches its best value at 1 and its worst at 0. A score of 0.78 simply means that most of the training matches were classified correctly; it is not the probability of any single outcome.
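To show what a score built from precision and recall measures, here is a small hand-rolled sketch of a multi-class F1. It assumes per-class F1 values are averaged with weights proportional to how often each class occurs; I am not claiming this is the exact averaging behind the 0.78 figure, and the labels below are made up.

```python
from collections import Counter

def weighted_f1(y_true, y_pred):
    """Per-class F1, averaged with weights equal to each class's share of y_true."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, count in support.items():
        # True positives: predicted `label` and the match really was `label`.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        predicted = sum(1 for p in y_pred if p == label)
        precision = tp / predicted if predicted else 0.0
        recall = tp / count
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        score += f1 * count / total
    return score

# Made-up labels, only to exercise the function.
y_true = ["win", "win", "draw", "loss", "loss", "win"]
y_pred = ["win", "win", "win", "loss", "draw", "win"]
score = weighted_f1(y_true, y_pred)
```

In this toy example the draw is never predicted correctly, so its F1 of 0 pulls the overall score down even though every win is caught.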
I did this project to satisfy my curiosity about whether data analysis and machine learning can be as accurate as the octopus at predicting match outcomes. Unpredictability is one of the best features of the game. Which machine, human, octopus, or squirrel would have predicted Leicester City F.C. becoming champions of England? So there can only ever be a best model for predicting an outcome, never a 100% accurate one.
So, I will be back to predict the outcome of a game using deep learning. Deep learning, itself a branch of machine learning, often does better than classical models when working with huge amounts of data.
If you have any queries or need any help on the subject, feel free to comment.