They said it could never be done. The game was too fluid, too chaotic. The players’ movements too difficult to track reliably. But, decades after sports like baseball first embraced statistics, football too is starting to play the data game. 

“We used to hear that football was too complex and free-flowing to apply data to it, but that’s not something that really gets said these days,” says Ted Knutson, co-founder and chief executive of StatsBomb, one of the football analytics companies at the heart of the sport’s long-overdue data revolution.

Numbers are not entirely new in the sport: for decades commentators have painstakingly compiled statistics on everything from winning streaks to the most crosses ever delivered in one match. But over the past decade a far more scientific operation has emerged, changing not only teams’ results but also how money is deployed on recruiting new talent. 

Football’s analytics era began in earnest with granular “event data” — detailed records of every on-the-ball action in a match. In 2006, the team of event-coders at London-based Opta Sports were tapping buttons to record the time and location of every pass, shot, tackle and dribble. Today, each Opta-coded match contains around 2,000 data points.

Then followed the incorporation of “expected goals” — a system for calculating the likelihood of any shot being scored, based on its distance and angle from goal. The concept went mainstream in 2017 when it was introduced to the Premier League’s flagship television highlights show, Match of the Day.
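
As a rough illustration of how a distance-and-angle model of this kind can work, the sketch below turns a shot's location into a scoring probability with a logistic function. It is not any provider's actual model: the coefficients, the coordinate convention and the function names are assumptions made up for this example, and commercial models are fitted to large sets of historical shots.

```python
import math

GOAL_WIDTH_M = 7.32  # distance between the posts, in metres


def shot_distance_and_angle(x, y):
    """Return (distance to goal centre, angle subtended by the goal mouth).

    Assumed convention: x is metres out from the goal line, y is metres
    sideways from the centre of the goal.
    """
    distance = math.hypot(x, y)
    half_width = GOAL_WIDTH_M / 2
    # Angle between the lines from the shot location to each post, in radians.
    angle = math.atan2(GOAL_WIDTH_M * x, x * x + y * y - half_width ** 2)
    return distance, angle


# Hypothetical coefficients, chosen by hand for illustration only.
INTERCEPT = -1.0
DISTANCE_COEF = -0.12  # longer shots are less likely to be scored
ANGLE_COEF = 1.5       # a wider view of goal makes scoring more likely


def expected_goal(x, y):
    """Estimate the probability (0 to 1) that a shot from (x, y) is scored."""
    distance, angle = shot_distance_and_angle(x, y)
    z = INTERCEPT + DISTANCE_COEF * distance + ANGLE_COEF * angle
    return 1 / (1 + math.exp(-z))
```

Under these assumed coefficients, a shot from the penalty spot, `expected_goal(11, 0)`, rates higher than one from 25 metres out or one from the same depth but 15 metres wide of centre. A team's expected goals for a match is then the sum of these values over all its shots.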

The biggest impact of this sudden wealth of player-level data has been on recruitment and retention. 

Clubs can now draw up a shortlist of players whose statistics match the profile of their ideal target signing, all without leaving the training ground. Scouts can then assess matches and video footage of a smaller pool of players, saving time and money.
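
The shortlisting step can be pictured as a simple filter over a table of player statistics. The sketch below is a minimal example under stated assumptions: the `PlayerStats` fields, the thresholds and the player names are all invented for illustration and do not come from any club or data provider.

```python
from dataclasses import dataclass


@dataclass
class PlayerStats:
    # Hypothetical fields; real datasets carry many more metrics.
    name: str
    age: int
    xg_per_90: float        # expected goals per 90 minutes played
    pass_completion: float  # share of passes completed, 0 to 1


def shortlist(players, max_age, min_xg_per_90, min_pass_completion):
    """Return the players matching the target profile, best xG first."""
    matches = [
        p for p in players
        if p.age <= max_age
        and p.xg_per_90 >= min_xg_per_90
        and p.pass_completion >= min_pass_completion
    ]
    return sorted(matches, key=lambda p: p.xg_per_90, reverse=True)


# Invented sample data for the example.
players = [
    PlayerStats("Player A", 22, 0.45, 0.81),
    PlayerStats("Player B", 29, 0.60, 0.85),
    PlayerStats("Player C", 21, 0.52, 0.78),
    PlayerStats("Player D", 23, 0.20, 0.90),
]

targets = shortlist(players, max_age=24, min_xg_per_90=0.4,
                    min_pass_completion=0.75)
names = [p.name for p in targets]
```

Here `names` holds Player C and Player A, in that order: Player B is too old for the profile and Player D falls below the expected-goals threshold. Scouts would then review footage of only the players who pass the filter.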

One company in the recruitment field is 21st Club. The consultancy’s tool calculates the historical link between players’ actions on the pitch and their team’s overall performance level, and assigns each player a rating. Clubs can use the data to see whether a player would strengthen, weaken or make little difference to their team’s overall performance level.

Earlier this year, 21st Club used the tool to identify a potential new player for a club in eastern Europe. The footballer was earning 25 per cent less than the average player on the club’s existing 20-strong shortlist; the valuation model estimated the player to be better than all but one of those 20.

“This shows the power of smart use of data,” said Omar Chaudhuri, 21st Club’s head of football intelligence. “The player was from a market that they [the club] didn’t necessarily have the resource to scout in great detail, but by using data we were able to highlight a specific player worth looking at.”

Context is important, though. A player does not make the same decision when in yards of space as when encircled by opponents, so StatsBomb’s event-coders note whether or not a player is under pressure when playing a pass or shot. They also record the goalkeeper’s position and the location of defenders between a player and the goal when a shot is struck.

But what is now needed, Mr Knutson says, are “complex models to help evaluate what is certainly the most important skill in the game” — passing the ball. StatsBomb is one of dozens of analysts that build upon simple passing totals with information on the difficulty of the passes attempted.

Some elite clubs’ buying patterns, of signing fewer but more suitable players, suggest such tools are having an impact. “Liverpool are the clear case study,” says Mr Knutson. “But Manchester City just don’t sign bad attackers any more either; they signed De Bruyne, Sterling, and Sané back-to-back-to-back, who were among the very best choices [according to the data] each time.”

Inside clubs, technical staff say these new techniques aid and empower coaches and scouts, helping them do their jobs better, faster and smarter.

Javier Fernández, data scientist at FC Barcelona, says: “Most of the interesting questions we get from coaches are things event data doesn’t cover. Coaches talk about space — creating space, getting into space. So we realised we needed more fine-grained ways of understanding space on the pitch.”

In the most cutting-edge development, data scientists have developed a technique called “ghosting”, where algorithms predict the most likely actions players will take in certain situations.

“You identify a specific scenario that tends to disrupt the opponent, giving you, say, a 30-second window where the opponent is disorganised,” says Paul Power, an AI scientist at STATS, an analytics company. “That’s the 30 seconds you then focus on in training.”

Again, these techniques have huge value in player recruitment. 

Models of players’ movements can be aggregated into playing styles for whole squads. This allows scouts to home in on players with similar styles to their own teams, ensuring tactical compatibility. 

They can use ghosting to model the impact of swapping a target player into their own team, looking not only at abstract measures such as points added per season, but also at how a player changes the team’s ability to execute specific moves.

Twenty years ago there was still a case to be made that football was too fluid for data to provide real value, but advancements in technology have crushed that argument.

The evolution of tracking from a glorified pedometer to tools that can predict an opponent’s next move has created a data ecosystem worthy of the beautiful game.

Copyright The Financial Times Limited 2023. All rights reserved.