There is a lot of money in voice cloning.
Case in point: ElevenLabs, a startup that develops artificial intelligence-powered tools for creating and editing synthetic voices, announced today that it has closed an $80 million Series B round led by prominent investors including Andreessen Horowitz, former GitHub CEO Nat Friedman and entrepreneur Daniel Gross.
The round, which also had participation from Sequoia Capital, Smash Capital, SV Angel, BroadLight Capital and Credo Ventures, brings ElevenLabs’ total raised to $101 million and values the company at over $1 billion (up from ~$100 million last June). CEO Mati Staniszewski says the new cash will go toward product development, expanding ElevenLabs’ infrastructure and team, AI research, and “strengthening security measures to ensure the responsible and ethical development of AI technology.”
“We raised the new money to solidify ElevenLabs’ position as a global leader in voice AI research and product development,” Staniszewski told TechCrunch in an email interview.
Founded in 2022 by Piotr Dabkowski, a former Google machine learning engineer, and Staniszewski, a former deployment strategist at Palantir, ElevenLabs launched in beta about a year ago. Staniszewski says he and Dabkowski, who grew up in Poland, were inspired to create voice cloning tools by poorly dubbed American movies. AI could do better, they thought.
Today, ElevenLabs is perhaps best known for its browser-based speech generation app, which can create lifelike voices with adjustable toggles for pitch, emotion, rhythm and other key vocal characteristics. For free, users can enter text and get a recording of that text read aloud by one of several default voices. Paying customers can upload voice samples to create new styles using ElevenLabs’ voice cloning.
Increasingly, ElevenLabs is investing in versions of its speech production technology aimed at creating audiobooks and dubbing movies and TV shows, as well as creating character voices for games and marketing activities.
Last year, the company released a “speech-to-speech” tool that attempts to preserve a speaker’s voice, prosody and intonation while automatically removing background noise and, in the case of movies and TV shows, translating and synchronizing speech with the source footage. On the roadmap for the coming weeks is a new dubbing studio workflow with tools to create and edit transcriptions and translations, and a subscription-based mobile app that narrates web pages and text using ElevenLabs voices.
ElevenLabs’ innovations have won the startup customers including Paradox Interactive, the game developer whose recent projects include Cities: Skylines 2 and Stellaris, and The Washington Post, among other publishing, media and entertainment companies. Staniszewski claims that ElevenLabs users have created the equivalent of more than 100 years of audio, and that the platform is used by employees at 41% of Fortune 500 companies.
But the publicity hasn’t been entirely positive.
Users of the infamous message board 4chan, known for its conspiratorial content, used ElevenLabs’ tools to share hateful messages impersonating celebrities like actress Emma Watson. The Verge’s James Vincent was able to tap ElevenLabs to maliciously clone voices in seconds, creating samples containing everything from threats of violence to racist and transphobic comments. And at Vice, reporter Joseph Cox documented creating a clone convincing enough to fool a bank’s authentication system.
In response, ElevenLabs has sought to weed out users who repeatedly violate its terms of service, which prohibit abuse, and released a tool to detect speech generated by its platform. This year, ElevenLabs plans to improve the detection tool to flag audio from other voice-producing AI models and work with unnamed “distribution players” to make the tool available on third-party platforms, Staniszewski says.
ElevenLabs offers a number of different voices, some synthetic, some cloned from voice actors.
ElevenLabs has also faced criticism from voice actors who claim the company is using samples of their voices without their consent, samples that could be leveraged to promote content they don’t support or to spread misinformation and disinformation. In a recent Vice article, victims recount how ElevenLabs was used in harassment campaigns against them, in one instance to share an actor’s personal information (their home address) using a cloned voice.
Then there’s the elephant in the room: the existential threat platforms like ElevenLabs pose to the voice acting industry.
Motherboard writes about how voice actors are increasingly being asked to sign over rights to their voices so that clients can use artificial intelligence to create synthetic versions that could eventually replace them — sometimes without commensurate compensation. The fear is that voice work — especially cheap, entry-level work — will eventually be replaced by AI-generated voiceovers, and that actors will have no recourse.
Some platforms are trying to strike a balance. Earlier this month, Replica Studios, a competitor of ElevenLabs, signed an agreement with SAG-AFTRA to create and license digital copies of the voices of members of the media artists union. In a press release, the organizations said the agreement established “fair” and “ethical” terms and conditions to secure performers’ consent and to negotiate terms for the use of digital voice doubles in new projects.
However, even that didn’t sit well with some voice actors, including SAG-AFTRA’s own members.
ElevenLabs’ solution is a marketplace for voices. Currently in alpha and set to become more widely available in the coming weeks, the marketplace allows users to create a voice, verify it and share it. When others use a voice, the original creators receive compensation, Staniszewski says.
“Users always remain in control of the availability and compensation terms of their voice,” he added. “The marketplace is designed as a step toward aligning AI advances with established industry practices while bringing a diverse set of voices to the ElevenLabs platform.”
Naysayers might take issue with the fact that ElevenLabs doesn’t pay in cash, though, at least not for now. The current setup has creators receiving credit toward ElevenLabs’ premium services (which some people find ironic, I’d bet).
Perhaps that will change in the future, as ElevenLabs, now among the best-funded synthetic voice startups, attempts to beat new competition from the likes of Papercup, Deepdub, Acapela, Respeecher and Voice.ai, as well as established Big Tech companies such as Amazon, Microsoft and Google. Either way, ElevenLabs, which plans to grow its workforce from 40 to 100 by the end of the year, intends to stick around, and make waves, in the fast-growing synthetic voice market.
