Graphic: How machines learn

How does machine learning work?

Machine learning is a subset of artificial intelligence (AI) in which a computer imitates the way humans learn from experience.

It involves training a computer to make predictions or decisions using data, rather than explicitly programming it to follow instructions. This training is repeated to gradually improve the computer’s performance.

Machine learning is already used in internet search engines, in email filters to detect spam, in music and video streaming services to make personalised recommendations, in banking software to detect unusual transactions, and in camera technology for image recognition.

Let’s see how it works with image recognition . . .

Here is an example of how machine learning can enable a computer to distinguish a cat from a dog, and the steps involved in “training” it.

1 Collect the data

Data is collected on 2 features that distinguish cats from dogs:

● Ear shape

● Nose size (relative to head)

. . . from a data set of 100,000 photographs

2 Prepare the data

Data is de-duplicated, error-corrected and randomised, so that the order of the data does not affect the learning process. It is split into “training” data (about 80 per cent) and “evaluation” data (about 20 per cent). In this example, the training data is labelled with the correct answers, so the model can measure its accuracy in recognising cats and dogs and gradually learn over time. This is known as supervised learning.
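The preparation step can be sketched in a few lines of Python. This is a minimal illustration, not the method behind the graphic: the record layout (ear shape, nose size, label), the numeric feature values and the function name `prepare` are all assumptions made for the example, and error correction is left out.

```python
import random

def prepare(records, train_fraction=0.8, seed=0):
    """De-duplicate, shuffle and split labelled records (illustrative helper)."""
    # dict.fromkeys drops exact duplicates while keeping first-seen order
    unique = list(dict.fromkeys(records))
    # Shuffle with a fixed seed so the original ordering cannot influence training
    rng = random.Random(seed)
    rng.shuffle(unique)
    # Roughly 80 per cent for training, the rest held back for evaluation
    cut = int(len(unique) * train_fraction)
    return unique[:cut], unique[cut:]

# Hypothetical records: (ear_shape, nose_size, label)
records = [(0.1 * i, 0.05 * i, "cat" if i % 2 else "dog") for i in range(10)]
records.append(records[0])  # an exact duplicate, removed by prepare()

train, evaluation = prepare(records)
```

The two returned lists never share a record, which is what keeps the later evaluation honest.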

Data visualisation may also be used to identify any relationships between variables, or any imbalances in the data that may cause the model to be biased.

Relationships between variables and imbalances in data can be spotted early in the training process

If there are far more dog photos than cat photos, the model may become biased towards predicting “Dog”, because that will be the correct answer most of the time.
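One simple way to spot such an imbalance is to count the labels before training. The sketch below assumes a hypothetical data set of 90 dog labels and 10 cat labels; the helper name `class_shares` is made up for the example.

```python
from collections import Counter

def class_shares(labels):
    """Return the fraction of the data set that carries each label."""
    counts = Counter(labels)
    total = len(labels)
    return {label: count / total for label, count in counts.items()}

# Hypothetical, deliberately imbalanced labels
labels = ["dog"] * 90 + ["cat"] * 10
shares = class_shares(labels)

# A model that always answers with the most common label would already
# score this well, without having learned anything about ears or noses
majority_baseline = max(shares.values())
```

Comparing a trained model's accuracy with this baseline shows whether it has learned more than the label frequencies.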

3 Choose a model

In this example, there are only 2 features that the computer needs to learn from: ear shape and nose size. So a small linear model can be used to generate the outputs — or answers — from the data.

A small linear model can be used when learning from just 2 features
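A linear model of this kind can be sketched as a weighted sum of the two features plus a bias. Passing the sum through a logistic (sigmoid) function to get a probability is an assumption made here; the graphic does not say how the score is turned into an answer. The function name and feature scaling are also hypothetical.

```python
import math

def predict_dog_probability(ear_shape, nose_size, weights, bias):
    """Estimate how likely a photo is to show a dog (illustrative only)."""
    # Weighted sum of the two features plus a bias term
    score = weights[0] * ear_shape + weights[1] * nose_size + bias
    # The logistic function squashes any score into the range 0 to 1
    return 1.0 / (1.0 + math.exp(-score))

# With all-zero weights the model has no opinion either way
undecided = predict_dog_probability(0.5, 0.5, weights=(0.0, 0.0), bias=0.0)

# Positive weights push photos with high feature values towards "dog"
confident = predict_dog_probability(0.9, 0.8, weights=(4.0, 4.0), bias=-4.0)
```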

4 Train the model

The model starts by applying random weights and biases to the feature data, to predict the answers. At first, these answers will not be very accurate. But they are then compared with the correct answers, and the weights and biases are adjusted to generate more accurate predictions.

Algorithmic models can be adjusted to generate more accurate predictions

This process is repeated to incrementally improve the model’s ability to make an accurate prediction. Each repetition is a training “step”.
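The training loop described above might look like the following sketch. It assumes gradient descent on a logistic model, which is one common way to adjust weights and biases but is not named in the graphic. The toy data, learning rate and step count are hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(examples, steps=1000, learning_rate=0.5, seed=0):
    """Fit two weights and a bias by repeated small corrections."""
    rng = random.Random(seed)
    # Start from random weights and a random bias
    w1, w2, b = rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-1, 1)
    n = len(examples)
    for _ in range(steps):  # each pass through this loop is one training step
        g1 = g2 = gb = 0.0
        for (x1, x2), label in examples:
            # Difference between the prediction and the correct answer
            error = sigmoid(w1 * x1 + w2 * x2 + b) - label
            g1 += error * x1
            g2 += error * x2
            gb += error
        # Nudge each parameter in the direction that reduces the average error
        w1 -= learning_rate * g1 / n
        w2 -= learning_rate * g2 / n
        b -= learning_rate * gb / n
    return w1, w2, b

# Hypothetical toy data: ((ear_shape, nose_size), label), 1 = dog, 0 = cat
examples = [((0.9, 0.8), 1), ((0.8, 0.9), 1), ((0.7, 0.7), 1),
            ((0.2, 0.1), 0), ((0.1, 0.3), 0), ((0.3, 0.2), 0)]
w1, w2, b = train(examples)
```

The learning rate controls how large each adjustment is, and the number of steps controls how many times the corrections are applied.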

5 Evaluate the model

An evaluation data set is used to test the model

The model is then tested using the evaluation data set, which was kept separate at the preparation stage, so the model was not trained on it. Predictions are compared with the correct answers to test the model’s performance on new data.

New evaluation data → Model with adjusted weights and biases → Prediction → Evaluation against correct answer
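The evaluation flow can be expressed as a short accuracy calculation. In this sketch the `predict` function is a hypothetical stand-in for a trained model, and the four evaluation examples are invented for illustration.

```python
def accuracy(predict, evaluation_set):
    """Share of held-back examples on which the prediction matches the label."""
    correct = sum(1 for features, label in evaluation_set
                  if predict(features) == label)
    return correct / len(evaluation_set)

def predict(features):
    # Hypothetical stand-in for a trained model: a fixed decision rule
    ear_shape, nose_size = features
    return "dog" if ear_shape + nose_size > 1.0 else "cat"

# Hypothetical evaluation examples the model never saw during training
evaluation_set = [
    ((0.9, 0.8), "dog"),
    ((0.2, 0.1), "cat"),
    ((0.6, 0.3), "dog"),  # the stand-in rule gets this one wrong
    ((0.1, 0.4), "cat"),
]
score = accuracy(predict, evaluation_set)
```

Because none of these examples were used in training, the score reflects how the model handles new data rather than how well it memorised old data.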

6 Tune the parameters

To further improve the training, parameters such as the number of times the training data set is used (often called the number of epochs) and the amount of adjustment in each training step (the learning rate) may be fine-tuned to achieve greater accuracy.

Adjustments are then made to the algorithmic model
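One common way to tune these settings is to try several combinations and keep the one that scores best on held-back data. The sketch below assumes a simple grid search over the number of passes through the training data and the learning rate. The candidate values, the toy data and the separate `validation_set` are all hypothetical.

```python
import itertools
import math
import random

def train(examples, epochs, learning_rate, seed=0):
    """Train a two-feature logistic model, updating after every example."""
    rng = random.Random(seed)
    w = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
    b = rng.uniform(-1, 1)
    for _ in range(epochs):  # one epoch = one pass over the training data
        for (x1, x2), label in examples:
            p = 1.0 / (1.0 + math.exp(-(w[0] * x1 + w[1] * x2 + b)))
            error = p - label
            w[0] -= learning_rate * error * x1
            w[1] -= learning_rate * error * x2
            b -= learning_rate * error
    return w, b

def accuracy(model, examples):
    w, b = model
    hits = sum((w[0] * x1 + w[1] * x2 + b > 0) == bool(label)
               for (x1, x2), label in examples)
    return hits / len(examples)

# Hypothetical toy data: ((ear_shape, nose_size), label), 1 = dog, 0 = cat
train_set = [((0.9, 0.8), 1), ((0.8, 0.9), 1), ((0.7, 0.7), 1),
             ((0.2, 0.1), 0), ((0.1, 0.3), 0), ((0.3, 0.2), 0)]
validation_set = [((0.85, 0.75), 1), ((0.75, 0.95), 1),
                  ((0.15, 0.2), 0), ((0.25, 0.1), 0)]

# Try every combination of epoch count and learning rate
grid = itertools.product([1, 10, 100], [0.01, 0.1, 1.0])
results = {(epochs, lr): accuracy(train(train_set, epochs, lr), validation_set)
           for epochs, lr in grid}
best_params = max(results, key=results.get)
```

Scoring the candidates on data that is separate from both the training and the final evaluation sets is a design choice made here, so that the final evaluation still uses data the tuning never touched.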

7 Use the model to make predictions/decisions

Now the model has learned how to make accurate predictions

Once the parameters are tuned and the training is optimised, the model can be used to provide the outputs, or answers, from the inputs, or data. It has learned how to do this from examples, without being explicitly programmed with rules for telling cats from dogs.
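Using the finished model then amounts to applying the learned numbers to features from a new photo. In this sketch the weights, bias and threshold are hypothetical values standing in for whatever training produced, and the logistic form is an assumption.

```python
import math

# Hypothetical parameters, standing in for the result of training
WEIGHTS = (4.0, 4.0)
BIAS = -4.0

def classify(ear_shape, nose_size, threshold=0.5):
    """Label a new photo from its two features (illustrative only)."""
    score = WEIGHTS[0] * ear_shape + WEIGHTS[1] * nose_size + BIAS
    probability_dog = 1.0 / (1.0 + math.exp(-score))
    label = "dog" if probability_dog >= threshold else "cat"
    return label, probability_dog

label, probability = classify(0.9, 0.8)
```

No rule about ears or noses appears in the code. The behaviour comes entirely from the stored numbers.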