It is not often that a new profession springs up almost overnight.
It is also unusual for many of the people who find their way into this new field to do it without the formal training provided by the normal institutions of higher education.
Machine learning, as well as the allied field of data science, is developing in a way that looks unlike most of the professional career paths that preceded it. It represents both one of the most promising employment opportunities of the next few years and a model for how people entering the workforce today will have to adapt to changing employment demands in future.
The term is given to the marriage of statistics and computer science that has been revolutionising the field of artificial intelligence. It depends both on a new class of learning algorithms that improve over time and on the availability of large bodies of data to train the systems.
Many organisations have invested in the IT infrastructure needed to amass “big data”, often breaking down internal silos to pool it in more centralised data stores. The need to make more use of that asset has created a sudden demand for specialists, far outstripping the output of traditional computer science courses.
This is not a discipline open only to computer scientists. Revolving around the collection, collation and analysis of data, it spans several fields. Maths, statistics and programming all play a part.
Many non-specialist executives will also find themselves in need of more than a passing familiarity with the field, as they interact with machine learning experts on the front lines of business.
There is no single job description that encompasses this emerging field. Many people who have traditionally been called “data analysts” aspire these days to the title of “data scientist”, says Anthony Goldbloom, founder and chief executive of Kaggle. His company, which was acquired this year by Google, maintains an informal network of experts around the world. He adds that data scientists, in turn, aspire to become machine learning experts.
The explosion in demand for these skills has happened far more quickly than traditional academic courses have been able to respond. According to Stack Overflow, which runs an online developer community and carries out one of the most extensive annual surveys of the field, data scientists, machine learning experts and developers with a statistics or maths background are three of the four highest-paying job descriptions in software.
People are finding their way into this field through unconventional routes. A recent survey of 16,000 Kaggle users found that only 30 per cent studied machine learning or data science as part of their formal college education.
By contrast, 66 per cent described themselves as self-taught. And a little more than half said they had used online courses to learn the new disciplines.
People from many different areas have been drawn in. Goldbloom lists physics, computer science, classical statistics, bioinformatics and chemical engineering. This makes machine learning the first new discipline to demonstrate the importance of lifetime learning, he says: it simply will not be possible to spend an entire working life without re-skilling to adapt to new opportunities like these.
The speed with which people from other fields are adapting to machine learning also reflects the nature of the discipline.
Andrew Ng, one of the pioneers of a technique known as deep learning while a professor at Stanford University, says that as the field advances, it is actually becoming easier for non-specialists to break in.
“What surprised me is how easy it is to get into AI. With the rise of deep learning, the algorithms we have are getting simpler because we are relying more on data,” he says. “After a few weeks, you can read leading research papers and cutting edge ideas in the area.”
Ng was also a founder of Coursera, the platform for Moocs, or massive open online courses. His own AI course was the first to draw a huge audience over the internet, though very few of the people who started it actually got to the end, and Moocs have been succeeded by more structured online courses with more narrowly defined goals. “Knowledge spreads much faster,” he says.
The phenomenon of a new discipline taking shape so rapidly raises questions for the companies trying to harness these skills. How do they design jobs that make the most of this new breed of data expert? And how do they meet the aspirations of these new workers and develop career paths to draw them into upper levels of management, where a lack of technical skills is already beginning to look like a problem?
The bad news is that, at least for now, many companies seem to be failing. According to Kaggle’s survey, most people working in the field say they spend 1-2 hours a week looking for a new job, says Goldbloom.
This is borne out by the Stack Overflow data, which is based on a survey of 64,000 developers. Machine learning specialists topped its list of developers who said they were looking for a new job, at 14.3 per cent. Data scientists were a close second, at 13.2 per cent.
People working in this field experience many frustrations, says Goldbloom. Bad data are one of the main ones: their employers cannot provide the essential raw material for them to obtain results.
Some also complain of not being given clear questions to answer. Companies may sense the opportunity, but they often do not know enough to get the most from their data assets. This also highlights the lack of technical knowledge among non-specialist managers who work alongside data scientists and machine learning experts.
And then there is the frustration that comes from being in the vanguard of any new profession. People complain of a lack of other talent to collaborate with, says Goldbloom.
Companies that grew up on the internet, collecting masses of data about their users’ behaviour and using techniques such as A/B testing to constantly improve their services, provide natural homes for workers like these. If they are to compete for some of tomorrow’s key talent, other companies need to make such techniques core to their own operations.
Which countries are becoming the first centres of machine learning?
A quarter of the users of Kaggle, which runs an online network for data scientists and machine learning experts, are in the US, with people in India representing the second biggest contingent, at 16 per cent.
The global picture is not complete: it does not reflect what is happening in China, for instance, where the Great Firewall blocks citizens from joining. But with 165,000 people a month using the site, based on figures for September, it provides an early snapshot of the emerging discipline.
The biggest group, 27 per cent of users, work for tech and internet companies. But 15 per cent are in academia and 14 per cent in finance and insurance.
Kaggle helps companies connect with talent in this emerging field by posting open data science competitions, with prizes for the top performers. One interesting finding: only 2.5 per cent of Kaggle users are based in Russia, but nine of its 94 “grandmasters”, the highest ranking on the site, are based there.