The Financial Times gave part of my job to a robot last week. For years I have been making podcast versions of my column, but now I am faced with stiff competition — in the shape of Experimental Amy.
She is vastly undercutting me on price, is a quick learner and always does precisely what she is told.
On the downside, I daresay she is a less convivial colleague than I am — but then you cannot have everything.
Being replaced by a robot is every worker’s worst nightmare, and when I discovered that she was muscling in on my act I was understandably distressed. Yet once I got over the outrage and sat down and listened to her work, I started to feel better.
I know it is early days for her, but at the moment Amy is no match for me: instead, according to my partial ear, she is absolutely useless. If you don’t believe me, listen. Click on the arrow at the top of this column to hear what Amy has to say, and then click here to hear my own version. Don’t read the words at the same time, just listen.
To be fair, Amy does have some things going for her. For a start, she has a great voice.
When I started recording my columns a decade ago, one listener wrote in to complain that my “nasal Estuarine twang” meant he had to stop listening at once. By contrast, Amy’s voice has an agreeably low timbre and is smooth as velvet.
Her second advantage is that she is practically free. She is part of a new service from Amazon that turns text to speech, and which costs nearly nothing — at least by comparison with what the FT pays me.
Even more impressive is her speed. Less than two seconds after receiving my written text she has supplied a spoken version of it. Which means by the time I have cleared my throat and started to read: “Last Monday the Finan . . . ” she has already finished.
In her case there is no kerfuffle involved and she does the job single-handedly. By contrast, my recording involves a producer, the use of a studio, the necessity of the two of us exchanging emails to confirm a mutually convenient time and then some idle pleasantries when we meet. It involves setting up equipment and then editing the clip to iron out all my stumbling. It takes half an hour of the producer’s time and about 15 minutes of mine.
That would swing it if what Amy produced were halfway decent — but it is not. She keeps putting her full stops in the wrong places. She runs words together when they should be kept apart. Her grasp of syntax is patchy.
Listening to her is not like listening to a non-English speaker read aloud, but to someone without brain, or heart, or sense of humour. Indeed her delivery is so poor that I do not even understand the column when she reads it — which is saying something given that I wrote it.
Amy’s learning curve is very steep. A couple of years ago mass-market voice bots sounded like Stephen Hawking. Every day Amy’s learning algorithms help her improve. Her weird timing will be fixed. Her intonation will get better. She will be able to do ersatz emotion and some jokes.
But Amy will never be able to read with understanding. Amy will never know when to pause and when to sneer. Amy will never do irony. She will continue to get it wrong.
In the last she is not alone. I also make mistakes when I read. Sometimes there is a clanging in the background. Sometimes I read too fast or am a bit too emphatic. But I fancy that listeners do not treat our failings equally.
When a human screws up the audience understands why. Quite often a mistake makes us feel more closely tied to the person who has made it. But when a robot makes a mistake, we do not sympathise and are likely to lose faith in the whole undertaking.
In the end, I do not resent Amy for being about to steal my job. But I do dislike her for reading my columns like that. Put through her mangle, they sound like the most impenetrable, dreariest things ever written.
Amy could make a decent stab at reading the shipping forecast or the football results. She will very soon be good enough to read anything predictable. But that is the point about a decent column. If it is predictable, it is not good enough.