An attendee holds her new country's flag and her naturalization papers as she is sworn in during a US citizenship ceremony in Los Angeles on July 18, 2017.
The US plans to include a question about citizenship status in its next census, but critics argue it will dissuade people from responding to the survey © Reuters

Governments have always relied on data to shape policy, but the digital age is allowing them to collect information with far more precision, joining up and cross-referencing different data sets.

This is posing a new set of ethical dilemmas for policymakers, and some commentators are asking whether there should be limits on the use of new data sources.

The earliest examples of government data collection relate to taxation, notably the Roman census of Judea documented in the Bible, and the English Domesday Book of 1086.

Yet this long heritage of harnessing data does not make the question of who and what to count any less controversial — as a continuing row over the US Census illustrates.

For the first time since 1950, the US government plans to include a question about citizenship status in the next census, which is due to take place in 2020.

A group of cities, states and civil liberty organisations are suing the government to prevent this, arguing that it would discourage households that include non-citizens from responding to the survey. This, they say, would undermine its accuracy and have far-reaching consequences for government spending and policy decisions.

It is just one example of the way in which data, far from being an esoteric and technical subject, can at its heart be deeply political.

The Australian census also ran into controversy two years ago, when plans to hold on to the data for several years for research purposes triggered an outcry about privacy and security. The Australian Bureau of Statistics had planned to match the records with data held by government departments.

“There needs to be a debate among politicians and civil society,” says Peter Wells, head of policy at the Open Data Institute, a non-profit company that advocates for the use of open data. 

“The vast majority of people don’t understand how their data can be used — we need to agree some rules of the road.”

Although decadal censuses have until recently been the backbone of most governments’ data collection, they are increasingly being joined by other data sources.

The UK’s work and pensions department, for example, is using benefit claims data to link up services and target policies to subsets of claimants.

This poses challenges, however, says Daniel Zeichner MP, chair of Westminster’s all-party parliamentary group on data analytics.

“Data should be a force for good but when you start joining things up you’ve got a whole new range of ethical dilemmas,” he says. “It is beginning to enter every part of public policymaking, but I fear we will see quite a few cases of unintended consequences.”

It is not only government policymakers who are tapping data to inform their work — monetary policymakers are, too. The Bank of England set up a data analytics division four years ago.

It has used a database of mortgage applications to influence lending regulation and analysed job advertisements to examine the changing demand for labour. It is also using machine learning to hone the language that its Prudential Regulation Authority uses in letters to the financial companies it oversees.

“There are many potential areas where these new sources and new techniques could be expanded to improve the bank’s understanding of the economic and financial system,” said the bank’s chief economist Andy Haldane in a speech earlier this year.

Are there limits to the extent to which new data sources can help policymakers, though?

The UK is on course to find out. Its Office for National Statistics is, at the government’s request, exploring the use of data sets such as mobile phone locations, property transactions, tax and social benefits systems, education and healthcare to replace traditional census questions.

The potential benefits include more up-to-date information and greater detail. A possible snag, however, evident in the census controversies in the US and Australia, is public reaction.

The greatest limit on the use of the wealth of data that most people now generate in their daily lives is the right to privacy. Policymakers’ biggest challenge in utilising Big Data is therefore not technical but, at heart, political.

The revolution in data provision “offers analytical riches”, according to Mr Haldane, but it also requires “considerable care . . . issues of data privacy loom much larger with granular, in some cases personalised, data”.

Many countries’ existing government departments are “not really structured to deal with a lot of these issues”, says Mr Zeichner, who adds that it is “up to us [politicians] to spend time trying to figure out how to balance the trade-offs that have to be made” between access to data and privacy rights.

But policymakers themselves are in desperate need of greater data literacy, he adds: “We are struggling woefully to keep up.”

“I do worry that [politicians] are not really engaged on this [and] it’s not yet become a doorstep issue,” says the ODI’s Mr Wells. 

Data collection and analysis have been contentious from their earliest days. As data sources proliferate, politicians and other decision makers urgently need to start a conversation with voters about how they are used.


Copyright The Financial Times Limited 2020. All rights reserved.