Football’s just a branch of science: how the beautiful game is turning into a numbers game

“We should be playing with two up front”, “We shouldn’t have subbed him off”, “I can’t believe they’ve left him on the bench”. Andre Doerr of Exasol writes.

  • Monday, 19th November 2018. Posted by Phil Alsop

For most football fans, phrases like these should be fairly familiar. Whether we’re watching the match in the stadium, tuning in on the TV or radio, or reading the post-match report in the papers, we always think we know best. We see our teams play every week, and so naturally, we think we know those players inside out. Believing that we can do the job better than the managers is just part and parcel of the beautiful game.

 

But the opinions we have are subjective at best. As armchair managers, we base our decisions on ‘gut instinct’, or make them simply to vent our frustration at how the match is going. And while those decisions are frequently wrong, they are never put to the test, so there’s never a chance of that coming to light.

 

For Premier League managers, of course, those decisions count. The substitutions, the formations, the starting team, the set pieces, and the playmaking are all out in the open for everyone to see – to praise and to criticise. Managers have their reputation, their job, their trophies, and club funding on the line each and every season. With the stakes so high, decisions clouded by emotion and unconscious human biases simply aren’t acceptable in the modern game – which is why, nowadays, sports management is turning into something of a science.

 

Most of the choices top teams now make can be traced directly to data analytics. Winning teams understand the power of big data, and that the smallest nugget of information within these vast data pools can make the difference between a win, a draw, and a loss. In recent years, we’ve become accustomed to this data-centric analysis of football: numbers and percentages relating to shots on goal, interceptions, and cards are incorporated into live TV. It’s easy to forget that clubs are relying on this data too.

 

By way of cameras and wearable tech, individual players can be monitored in training so that fatigue, heart rate, speed, the number of shots, passes completed, fouls conceded, and times dispossessed can all be collated and analysed to establish the strengths and weaknesses of a particular player. Keeping a tally of the direction of a player’s penalty shots can even reveal an unconscious bias towards one side of the goal. With this knowledge, a goalkeeper can increase the chances of correctly “guessing” which way to dive.
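
The penalty tally described above can be sketched in a few lines of Python. This is a minimal illustration, not any club's actual tooling: the function name `preferred_side`, the direction labels, and the sample history are all hypothetical.

```python
from collections import Counter


def preferred_side(penalty_directions):
    """Return (side, share) for the most frequent direction in a penalty history.

    `penalty_directions` is a list of labels such as "left", "centre", "right".
    The labels and this function are illustrative assumptions.
    """
    if not penalty_directions:
        raise ValueError("no penalties recorded")
    counts = Counter(penalty_directions)
    side, n = counts.most_common(1)[0]
    return side, n / len(penalty_directions)


# Hypothetical penalty history for one player (made-up data).
history = ["left", "left", "right", "left", "centre", "left", "right", "left"]

side, share = preferred_side(history)
print(f"Most frequent side: {side} ({share:.1%} of {len(history)} penalties)")
```

With only a handful of penalties the share is a rough guide at best; a real analysis would want a larger sample and some check that the bias is not down to chance.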

 

To aggregate and apply all this data, teams and bookies often draw on a “second team” of data scientists. They need to process vast amounts of data quickly, work with flexible data structures, and run calculations mid-match with low latency and high performance, so a fast in-memory database is preferable – and the results are astonishing.

 

Descriptive analytics were at the heart of Leicester City’s success in the 2015/16 season, as the English team overcame odds of 5000/1 to win the Premier League title. Players were able to access interactive reports created by the team’s data scientists both before and after matches via tablets. This allowed them to see statistics and subjective comments, watch video footage, and adapt their training and play going forward. The use of analytics also enabled the team to assess injury concerns and achieve the lowest injury rate in the league, meaning Leicester City were able to field their strongest players week in, week out.

 

As analytical models become more informative, the transfer system has also become less reliant on human perception. Talent scouts used to be the conventional way to gather intelligence about potential purchases, and they would spend days watching unknown players in the pouring rain to find the next star.

 

While there is still a role for the traditional talent scout, most clubs’ recruitment policies now lean heavily on data analysis. Today, scouts will often watch players on a computer screen from the comfort of their desk, and matches will be supplemented by software which describes players, teams or matches, however obscure, around the world. Here, efficient in-memory databases make it possible to aggregate and analyse data from thousands of players at once, rather than having to “roll up” first – the data analyst’s term for selecting a manageable portion of a dataset, often to ensure their visualisations load quickly enough that their reports are responsive.

 

Data can be leveraged by external pundits, too. Of course, bookies have been using analytical modelling for quite some time, but now that this data is more widely available, others are able to make data-driven predictions too.

 

Lloyd’s of London, for example, correctly predicted, before kick-off, that France would win the 2018 World Cup final. The firm used players’ wages and endorsement incomes, alongside a collection of additional indicators, to construct an economic model which estimates players’ incomes until retirement. It was then able to rank players according to insurable value, with the French team coming out on top. On a separate occasion, a team of data scientists studied 10 years’ worth of data on nearly half a million football matches and the associated odds offered by 32 bookmakers between January 2005 and June 2015. Applying an odds-averaging formula to upcoming football matches, the team made a profit of $957.50 after five months.
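
The formula itself is not given here, so the following Python sketch shows only one plausible reading of “odds averaging”, under stated assumptions: treat the inverse of the mean decimal odds across bookmakers as a consensus probability, then flag a bet when the best available odds imply an expected profit above a chosen margin. The function names, the 5% margin, and the sample odds are hypothetical, and the estimate ignores the bookmakers’ built-in margin.

```python
from statistics import mean


def consensus_probability(decimal_odds):
    """Estimate an outcome's probability as 1 / (mean decimal odds).

    Assumption for illustration: the average across bookmakers is a
    reasonable consensus; bookmaker margins are ignored.
    """
    return 1.0 / mean(decimal_odds)


def expected_profit_per_unit(best_odds, probability):
    """Expected profit on a 1-unit stake at decimal odds `best_odds`."""
    return probability * best_odds - 1.0


def value_bet(decimal_odds, margin=0.05):
    """Return the best odds if their expected profit exceeds `margin`, else None.

    `margin` is an assumed threshold, not a figure from the study.
    """
    p = consensus_probability(decimal_odds)
    best = max(decimal_odds)
    if expected_profit_per_unit(best, p) > margin:
        return best
    return None


# Hypothetical decimal odds from four bookmakers for the same outcome.
outlier_market = [2.0, 2.1, 1.9, 2.6]     # one bookmaker well above the rest
tight_market = [2.0, 2.05, 1.95, 2.0]     # all bookmakers roughly agree

print(value_bet(outlier_market))
print(value_bet(tight_market))
```

The idea is that a single bookmaker offering odds well above the average may be mispricing the outcome, whereas a tight market leaves nothing to exploit.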

 

As data analytics runs the gamut of the professional football scene, its influence can be seen in everything from mid-match decisions to talent scouting, and its impact is attested by the victories it has produced. In a sport often decided by a single goal, clubs need all the help they can get. And with the new season of the Premier League heating up, the team that takes “the extra time” to make the most of its data analytics capabilities may well win it.