The hidden ‘replication crisis’ of finance
It may sound like a low-budget Blade Runner rip-off, but over the past decade the scientific world has been gripped by a “replication crisis” — the findings of many seminal studies cannot be repeated, with huge implications. Is investing suffering from something similar?
That is the incendiary argument of Campbell Harvey, professor of finance at Duke University. He reckons that at least half of the 400 supposedly market-beating strategies identified in top financial journals over the years are bogus. Worse, he worries that many fellow academics are in denial about this.
“It’s a huge issue,” he says. “Step one in dealing with the replication crisis in finance is to accept that there is a crisis. And right now, many of my colleagues are not there yet.”
Harvey is not some obscure outsider or performative contrarian attempting to gain attention through needless controversy. He is the former editor of the Journal of Finance, a former president of the American Finance Association, and an adviser to investment firms like Research Affiliates and Man Group.
He has written more than 150 papers on finance, several of which have won prestigious prizes. In fact, Harvey’s 1986 PhD thesis first showed how the shape of the bond market’s yield curve can predict recessions. In other words, this is not like a child saying the emperor has no clothes. Harvey’s escalating criticism of the rigour of financial academia since 2015 is more akin to the emperor regretfully proclaiming his own nudity.
To understand what the ‘replication crisis’ is, how it has happened and its implications for finance, it helps to start at its broader genesis.
In 2005, Stanford medical professor John Ioannidis published a bombshell essay titled “Why Most Published Research Findings Are False”, which noted that the results of many medical research papers could not be replicated by other researchers. Subsequently, several other fields have turned a harsh eye on themselves and come to similar conclusions. The heart of the issue is a phenomenon that researchers call “p-hacking”.
In statistics, a p-value is the probability of seeing a result at least as striking as the one observed if pure chance alone were at work (a simple data oddity, like the correlation of Nicolas Cage films to US swimming pool drownings). When that probability is low enough, conventionally below 5 per cent, the finding is deemed “statistically significant”. P-values are used to judge whether a certain drug really does help, or if cheap stocks do outperform over time.
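To make the idea of a p-value concrete, here is a minimal sketch in Python. It assumes a hypothetical strategy that beat the market in 60 of 100 months (illustrative numbers, not taken from any study) and asks how often a coin-flip strategy would do at least that well. The function name `one_sided_p_value` is invented for this example.

```python
from math import comb

def one_sided_p_value(wins, trials, chance=0.5):
    """Probability of at least `wins` successes in `trials` tries
    if each try succeeds with probability `chance` (pure luck)."""
    return sum(
        comb(trials, k) * chance**k * (1 - chance) ** (trials - k)
        for k in range(wins, trials + 1)
    )

if __name__ == "__main__":
    # Hypothetical track record: 60 winning months out of 100.
    p = one_sided_p_value(60, 100)
    print(f"p-value: {p:.4f}")
```

The result falls below the conventional 5 per cent threshold, so this made-up track record would count as "statistically significant" on its own. Whether it is meaningful is a separate question.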
P-hacking is when researchers overtly or subconsciously twist the data to find a superficially compelling but ultimately spurious relationship between variables. It can be done by cherry-picking what metrics to measure, or subtly changing the time period used. Just because something is narrowly statistically significant does not mean it is actually meaningful. A trading strategy that looks golden on paper might turn up nothing but lumps of coal when actually implemented.
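Why trying enough variations almost guarantees a "discovery" can be shown with a small simulation. The sketch below is an illustration under stated assumptions, not a reconstruction of Harvey's method: it invents 400 strategies whose returns are pure noise with a true average of zero, tests each one over 240 hypothetical months, and counts how many clear the usual significance bar (a t-statistic beyond 1.96). The function names and every parameter value are chosen only for the example.

```python
import math
import random
import statistics

def t_statistic(returns):
    """t-statistic for the hypothesis that the average return is zero."""
    mean = statistics.fmean(returns)
    std_err = statistics.stdev(returns) / math.sqrt(len(returns))
    return mean / std_err

def count_false_discoveries(n_strategies=400, n_months=240,
                            threshold=1.96, seed=0):
    """Count worthless strategies that still look 'significant'."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_strategies):
        # Pure noise: monthly returns whose true mean is exactly zero.
        returns = [rng.gauss(0.0, 0.05) for _ in range(n_months)]
        if abs(t_statistic(returns)) > threshold:
            hits += 1
    return hits

if __name__ == "__main__":
    hits = count_false_discoveries()
    print(f"{hits} of 400 worthless strategies pass the significance test")
```

With a 5 per cent threshold, roughly one in 20 of these worthless strategies should pass by luck alone. A researcher who reports only the winners, and not the number of attempts, would appear to have found something real.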
Harvey attributes the scourge of p-hacking to incentives in academia. Getting a paper with a sensational finding published in a prestigious journal can earn an ambitious young professor the ultimate prize — tenure. Wasting months of work on a theory that does not hold up to scrutiny would frustrate anyone. It is therefore tempting to torture the data until it yields something interesting, even if other researchers are later unable to duplicate the results.
Obviously, the stakes of the replication crisis are much higher in medicine, where lives can be in play. But it is not something that remains confined to the ivory towers of business schools, as investment groups often smell an opportunity to sell products based on apparently market-beating factors, Harvey argues. “It filters into the real world,” he says. “It definitely makes it into people’s portfolios.”
AQR, a prominent quant investment group, is also sceptical that there are hundreds of durable and successful factors that can help investors beat markets, but argues that the “replication crisis” brouhaha is overdone. Earlier this year it published a paper that concluded that not only could the majority of the studies it examined be replicated, they still worked “out of sample” — in actual live trading — and were actually further corroborated by international data.
Harvey is unconvinced by the riposte, and will square up to the AQR paper’s authors at the American Finance Association’s annual meeting in early January. “That’s going to be a very interesting discussion,” he promises.
Many of the industry’s geekier members will be rubbing their hands at the prospect of a gladiatorial, if cerebral, showdown to kick off 2022.