This is an audio transcript of the FT News Briefing podcast episode: ‘The trials and tribulations of AI voice tech’

Marc Filippino 
Good morning from the Financial Times. Today is Wednesday, June 21st, and this is your FT News Briefing. 

[MUSIC PLAYING] 

Marc Filippino
Singapore’s sovereign wealth fund wants to spend big in the US. And President Joe Biden’s son is facing federal charges. Plus, the FT’s Madhumita Murgia tests out AI voice technology. 

Madhumita Murgia
I hear kind of flashes of real Madhu in there, but it still feels like I can tell the difference. 

Marc Filippino
I’m Marc Filippino and here’s the news you need to start your day.

[MUSIC PLAYING]

Marc Filippino
Singapore’s sovereign wealth fund is accelerating its dealmaking in the US. The fund is called GIC. It told private equity and venture capital executives recently that it wants to increase exposure to US-focused funds, that’s according to FT sources. Like a lot of global investors, GIC is trying to expand beyond China. It’s worried about rising geopolitical tensions as well as an economic slowdown and a crackdown on business in China. In the US, GIC is focused on investing in venture capital funds and technology companies. So it looks like there’s still optimism about the US tech industry despite the big sell-off last year.

[MUSIC PLAYING] 

Marc Filippino
US president Joe Biden’s son has been hit with federal charges. Hunter Biden has agreed to plead guilty to failing to pay income tax. He’s also reached a deal with prosecutors over a separate charge that accuses him of possessing a firearm while he was addicted to drugs. The FT’s deputy Washington bureau chief Lauren Fedor has more.

Lauren Fedor
The Department of Justice has been investigating Hunter Biden for several years now. On top of that, Republicans have made a lot of noise about Hunter Biden, whether it’s about his tax affairs, whether it’s about this gun charge, whether it’s about his business dealings overseas. This development is in some ways a big moment. And Hunter Biden and the White House are hoping to move on. But, you know, we still have House Republicans controlling that chamber of Congress, and they are very adamant that they’re going to continue to investigate Hunter Biden through congressional committees. So I wouldn’t say that it’s necessarily the end of Hunter Biden being a talking point in Washington and beyond. 

Marc Filippino
And because of that, you might think this could affect Joe Biden’s 2024 re-election campaign. But Lauren says that’s not necessarily a given. 

Lauren Fedor
Hunter Biden’s personal issues, his issues with drugs, addiction, some of these investigations, they have been going on for a really long time. If you think back to the 2020 election, questions about Hunter Biden were part of the conversation then, and we all know how that election ended. You know, fast forward now we’re having echoes of 2020. You know, we’ll have to see how it plays out. But for now, it may be kind of baked-in in terms of voters’ perceptions, both on the right and the left. 

Marc Filippino
Lauren Fedor is the FT’s deputy Washington bureau chief.

[MUSIC PLAYING] 

Marc Filippino
Artificial intelligence voice technology is already being used for a bunch of things — audio books, video narrations, customer service interactions. ElevenLabs is one of the start-ups selling synthetic voice generation. They rolled out an early version of their product in January and ElevenLabs just raised $19mn, which gives it about a $100mn valuation. This technology is popular because it can do things like mimic the voices of political figures and celebrities. But because of that, it’s also running up against ethical and legal issues. The FT’s AI editor Madhumita Murgia reports. 

Madhumita Murgia
This clip of Nightmare on Elm Street

Clip from Nightmare on Elm Street
[Words are in Polish] 

Madhumita Murgia
Dubbed into Polish is the kind of movie experience Mati Staniszewski had to endure when he was growing up in Poland. 

Clip from Nightmare on Elm Street
[Words are in Polish]. 

Mati Staniszewski
Every movie, every American or British movie that you watch relies not even on dubbing, but on single narrator voice over. 

Madhumita Murgia
Freddy Krueger, the screaming teenager and the policeman — they’re all being done by the same person, so they all have the same voice. 

Mati Staniszewski
And as you can imagine, it’s pretty bad experience. 

Madhumita Murgia
The 28-year-old engineer eventually moved from Poland to London. But his moviegoing experience is one of the reasons he started his AI voice company. Generative AI technology like ChatGPT has the ability to create new text, music or images. And AI voice technology has the capacity to mimic and then create any voice, which would make it a lot easier to distinguish between Freddy Krueger and his victims. Mati co-founded ElevenLabs, along with another frustrated Polish movie buff and it’s now one of the leading text-to-speech AI start-ups. I met up with Mati at a co-working space in London. In the glass conference room, he opens his laptop to show me how the technology works. 

Mati Staniszewski
This got instant voice cloning. We’ll clone Madhu’s voice with her permission. I’m going to call it AI Madhu. And we’ll upload the file that Madhu provided to us. 

Madhumita Murgia
I’d already sent Mati an audio file of me speaking. He just needed a minute of my voice. His software program extracts certain qualities of my speech and recreates my voice so that now, it can be used to say basically anything. To show me what that’s like, Mati pulls up the FT’s Wikipedia page, copies some text and pastes it into his AI text-to-speech program. 

Mati Staniszewski
And all we need to click is this just to generate, and let’s see how that comes out. 

Madhumita Murgia
The Financial Times is a British daily business newspaper printed in broadsheet and published digitally that focuses on business and economic current affairs. 

Mati Staniszewski
What do you think? 

Madhumita Murgia
It’s funny because the accent is definitely different and I’m not great on differentiating regional British accents either because I didn’t grow up here. But I can tell that and I hear kind of flashes of real Madhu in there, but it still feels like I can tell the difference. 

Madhumita Murgia
I’ll admit it wasn’t perfect. They’re still trying to get accents right. But the way his software captures intonation and pacing is a big step forward. Another big advance is that it uses the text or the meaning of what’s written to adjust the emotion of the delivery. We typed in a super happy sounding sentence and here’s what came out of the laptop. 

Madhumita Murgia
It’s so beautiful and sunny in London today. I’m so excited. I had tons of sleep and I am really looking forward to the long weekend. I can’t wait. 

Madhumita Murgia
So I was really excited there. I thought the ending bit was quite good. The “I can’t wait” was pitched I think in quite the right way. 

Madhumita Murgia
This potential to capture feeling is what makes ElevenLabs’ software really attractive to companies that make, for example, audiobooks or provide real-time customer service. But the more advanced the technology becomes, the more it can be misused as well. ElevenLabs admits that its software has been misused, but it won’t say exactly how. We do know that AI voice technology has been used in phone scams and bank fraud. And users of ElevenLabs have spoken about the program’s ability to make deepfake voices of celebrities and politicians. Mati’s response to all of this is similar to other technologists. He acknowledges the risks but downplays them. He says they’re already working on technological solutions to spot AI in the wild.

Mati Staniszewski
Already now, like every single audio that’s produced has a hidden, inaudible signal that we can decode and know that this is coming from ElevenLabs or not. So everything that we produce, everything we’ve produced today can be tracked to the account, and we can take action in case you do something that’s against our terms of service or is illegal.

Madhumita Murgia
Mati says they’ve ramped up their efforts to deactivate accounts of users who violate their policies. But AI companies in general are getting nervous enough to seek legal advice. 

Sophie Goossens
For the last six months, it’s almost been one a week, and I think the last two months it has been one a day. 

Madhumita Murgia
Sophie Goossens is a lawyer specialising in technology and copyright. She doesn’t represent ElevenLabs, but she says one of the biggest risks that AI voice companies face is copyright violation. 

Sophie Goossens
Because AI engines, they need a gigantic amount of data to learn from. And when the data that they learn from is protected by copyright, you always need to ask yourself whether or not you need permission to use that data. Another issue would be privacy and data protection. You, the human being with that voice, are in a position to control what is being done with your voice, even if it’s a machine that is generating it. 

Mati Staniszewski
Now, I think it’s going to be the easiest. 

Madhumita Murgia
But the risks of AI voice technology don’t stop at the legal grey areas. There are some serious ethical issues that come with the prospect of being able to speak using someone else’s voice. Even if imperfect, listening to a synthetic version of my own voice was unsettling. Despite the risks, Mati remains focused on getting his technology out there. He says at this year’s Venice Film Festival, one of the movies being screened uses voices produced entirely with their AI technology. 

Mati Staniszewski
That’s just a biggest surprise, like stretching the platform to what it wasn’t designed. An entire movie that’s produced with voices from ElevenLabs. We have those dialogue scenes, all synthetic and going to be the first foray into that. 

Madhumita Murgia
So we might one day see AI dubbing on Polish versions of 80s horror flicks like A Nightmare on Elm Street. But to get there, young AI companies like ElevenLabs will have to stay one step ahead of the legal and moral challenges that this technology brings. 

Madhumita Murgia
This is Madhu reporting for the FT News Briefing. Actually, that was my synthetic voice. This is the real Madhumita Murgia reporting for the FT News Briefing.

[MUSIC PLAYING] 

Marc Filippino
You can read more on all of these stories at FT.com. This has been your daily FT News Briefing. Make sure you check back tomorrow for the latest business news. 

Copyright The Financial Times Limited 2024. All rights reserved.