When the Apache Kafka open source streaming service was created in 2011 at LinkedIn, it was a different world. Most companies were still operating on premises. The concept of cloud computing was just beginning to emerge. WarpStream, an early-stage startup, sees the value of streaming in a cloud-native environment and has built a new solution from the ground up based on the Apache Kafka protocol, but designed for the cloud era.
Today the company announced a $20 million investment.
WarpStream CEO Richard Artoul says he and co-founder and CTO Ryan Worl found in their previous jobs that getting data into Kafka was complicated and expensive, and they wanted to change that. “If you were to build something today that looked like Apache Kafka like we did with WarpStream, you’d be taking a really different approach than they did in 2011 when Kafka was first designed, and that’s why we think now is a really good time for us to build something new that can significantly reduce the cost and operational burden for people,” Artoul told TechCrunch.
They do this by taking advantage of today’s cloud environment to separate compute from storage using an object storage service like Amazon S3. This approach eliminates cross-zone networking costs, which often account for 80% or more of the total cost of running large-scale Kafka workloads, according to the company.
“When you interact and store data in cloud object storage, you bypass all these networking charges that plague these big data systems when they’re lifted and shifted to the cloud,” he said. “And a lot of the really hard problems around data persistence and data replication that Kafka had to solve on its own by replicating the data three times, duplicating it and making sure the data is never lost, we’re able to offload those problems to the object storage layer itself, and that ends up making the system much easier and cheaper to run.”
Artoul and Worl were working together at Datadog when they helped develop a storage system called Husky. Today, if a Datadog customer searches their application logs, they are actually using Husky. Datadog was also a big user of Kafka. “Based on our experience building the kind of storage system on top of object storage that we had built at Datadog, we felt that streaming systems should work the same way. And so last year we left Datadog to start working on it,” he said.
WarpStream comes in two versions: a bring-your-own-cloud (BYOC) version, where customers install WarpStream in their own cloud, and a fully managed serverless option run by the company. The BYOC version is available starting today. The company has also included a calculator right on the pricing page to work out how much it will cost to run WarpStream.
The founders brought in some of the people who helped build Husky to build the new system, and today they have nine employees. The good news is that they are hiring and hope to double the number of employees by the end of the year.
The $20 million investment was made by Greylock and Amplify Partners, with some luminaries from the data industry also participating with angel investments.