Inception, a new Palo Alto-based company started by Stanford computer science professor Stefano Ermon, claims to have developed a novel AI model based on “diffusion” technology. Inception calls it a diffusion-based large language model, or a “DLM” for short.
The generative AI models receiving the most attention today can be broadly divided into two types: large language models (LLMs) and diffusion models. LLMs are used for text generation. Diffusion models, which power AI systems such as Midjourney and OpenAI’s Sora, are mainly used to create images, video, and audio.
Inception’s model offers the capabilities of traditional LLMs, including code generation and question answering, but with significantly faster performance and reduced computing costs, according to the company.
Ermon told TechCrunch that he has long studied how to apply diffusion models to text in his Stanford lab. His research was based on the idea that traditional LLMs are relatively slow compared to diffusion technology.
With LLMs, “you can’t generate the second word until you’ve generated the first, and you can’t generate the third until you’ve generated the first two,” Ermon said.
Ermon was looking for a way to apply a diffusion approach to text because, unlike LLMs, which work sequentially, diffusion models start with a rough estimate of the data they’re generating (e.g., a picture) and then bring the data into focus all at once.
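To make that contrast concrete, here is a toy Python sketch. It is not Inception’s actual method; random choices stand in for real model predictions. It simply contrasts left-to-right autoregressive decoding with diffusion-style refinement of every position at once.

```python
# Toy illustration only: random picks stand in for real model predictions.
import random

VOCAB = ["the", "cat", "sat", "on", "mat"]

def autoregressive_generate(length):
    """Generate one token at a time; each new token conditions on all previous ones."""
    tokens = []
    for _ in range(length):
        # In a real LLM this would be a forward pass conditioned on `tokens`.
        tokens.append(random.choice(VOCAB))
    return tokens

def diffusion_generate(length, steps=4):
    """Start from a fully masked (noisy) sequence and refine every position in parallel."""
    tokens = ["<mask>"] * length
    for _ in range(steps):
        # In a real DLM one forward pass proposes updates for all positions at once;
        # here we just randomly "denoise" some of the masked slots each step.
        tokens = [random.choice(VOCAB) if t == "<mask>" and random.random() < 0.5 else t
                  for t in tokens]
    return tokens

print(autoregressive_generate(5))  # built left to right, one token per step
print(diffusion_generate(5))       # built coarse to fine, all positions together
```

The point of the sketch is the loop structure: the autoregressive path needs one step per token, while the diffusion path takes a fixed number of refinement steps regardless of sequence length, which is where the claimed speed advantage comes from.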
Ermon hypothesized that generating and modifying large blocks of text was also possible with diffusion models. After years of trying, Ermon and one of his students achieved a major breakthrough, which they detailed in a research paper published last year.
Recognizing the potential of the advance, Ermon founded Inception last summer, tapping two former students, UCLA professor Aditya Grover and Cornell professor Volodymyr Kuleshov, to co-lead the company.
While Ermon refused to discuss Inception’s funding, TechCrunch understands that the Mayfield Fund has invested.
Inception has already secured several customers, including unnamed Fortune 100 companies, by addressing their acute need for reduced AI latency and increased speed, Ermon said.
“What we have found is that our models can leverage the GPUs much more efficiently,” Ermon said, referring to the computer chips commonly used to run models in production. “I think this is a big deal. This will change the way people build language models.”
Inception offers an API as well as on-premises and edge device deployment options, support for model fine-tuning, and a suite of out-of-the-box DLMs for various use cases. The company claims its DLMs can run up to 10x faster than traditional LLMs while costing 10x less.
“Our ‘small’ coding model is as good as [OpenAI’s] GPT-4o mini while more than 10 times as fast,” a company spokesperson told TechCrunch. “Our ‘mini’ model outperforms small open-source models like [Meta’s] Llama 3.1 8B and achieves more than 1,000 tokens per second.”
“Tokens” is industry parlance for bits of raw data. One thousand tokens per second is indeed an impressive speed, assuming Inception’s claims hold up.
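For a rough sense of how such a throughput figure could be sanity-checked, here is a minimal Python sketch. The `call_model` function is a hypothetical placeholder for any text-generation endpoint, and splitting on whitespace is only a crude proxy for real tokenization, which would come from the provider’s tokenizer or usage metadata.

```python
# Rough throughput sanity check; numbers are meaningless with the placeholder below.
import time

def call_model(prompt: str) -> str:
    # Hypothetical stand-in for a request to a DLM or LLM endpoint.
    return "word " * 500

start = time.perf_counter()
output = call_model("Write a short story about a cat.")
elapsed = time.perf_counter() - start

# Crude approximation: treat whitespace-separated words as tokens.
approx_tokens = len(output.split())
print(f"~{approx_tokens / elapsed:.0f} tokens/second")
```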
