AI language learning app Speak is on a tear.
Since launching in its first market, South Korea, in 2019, Speak has grown to over 10 million users, CEO and co-founder Connor Zwick told TechCrunch. Its user base has doubled every year for the past five years, and Speak now has customers in more than 40 countries.
Eager to see Speak’s expansion continue, investors are now committing additional cash to the startup.
The company this week closed a $20 million Series B expansion led by Buckley Ventures, with participation from OpenAI Startup Fund, Khosla Ventures, Y Combinator co-founder Paul Graham and LinkedIn executive chairman Jeff Weiner. The capital injection brings Speak’s total fundraising to $84 million and doubles the startup’s valuation to half a billion dollars.
Launched in 2014 by Zwick and Andrew Hsu, who met while on Thiel Fellowships, Speak is designed to teach language by having users learn speech patterns and practice repetition in structured lessons instead of memorizing vocabulary and grammar. In this way, it’s not unlike Duolingo, particularly with Duolingo’s newer AI features. But true to its namesake verb, Speak emphasizes the verbal above all else.
“Our core philosophy is focused on getting users to speak out loud as much as possible,” Zwick said. “Acquiring fluency helps people build bonds, bridge cultures and create economic opportunities. It remains the most important part of language learning for humans, but historically the least supported by technology.”
Speak started with English and has since launched lessons in Spanish, backed by a speech recognition model trained on internal data. Next up is French, but Zwick didn’t say exactly when those lessons will arrive.
Speak makes money by charging $20 a month, or $99 a year, for access to all of the app’s features, including review material and individual lessons.
With a workforce of 75 across offices in San Francisco, Seoul, Tokyo and Ljubljana, Slovenia, Speak’s near-term roadmap centers on developing new models that provide better real-time feedback on tone and pronunciation, Zwick said.