At the GTC conference, Nvidia today announced Nvidia NIM, a new software platform designed to streamline the deployment of custom and pre-trained AI models in production environments. NIM takes the software work Nvidia has done on model inference and optimization and makes it easily accessible by combining a given model with an optimized inference engine, then packaging the two into a container exposed as a microservice.
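Because each NIM container exposes the model behind a standard web API, a developer can query a deployed microservice over HTTP rather than wiring up an inference stack by hand. As a minimal sketch, assuming a NIM container running locally on port 8000 and serving an OpenAI-compatible chat-completions endpoint (the host, port, path, and model name here are illustrative, not confirmed by the announcement):

```python
import json
import urllib.request

# Hypothetical endpoint: a locally deployed NIM container serving an
# OpenAI-compatible chat API. Host, port, and model name are assumptions.
NIM_URL = "http://localhost:8000/v1/chat/completions"

def build_request(model: str, prompt: str) -> urllib.request.Request:
    """Construct an HTTP POST request for a chat completion."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 64,
    }
    return urllib.request.Request(
        NIM_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_request("meta/llama2-70b", "Summarize NIM in one sentence.")

# Actually sending the request requires a running container:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

The point of the container approach is exactly this: the application code stays a few lines of generic HTTP, while the model weights, runtime, and GPU optimizations live inside the image.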
Nvidia claims it would typically take developers weeks, if not months, to ship similar containers, and that's assuming a company has any in-house AI talent at all. With NIM, Nvidia is clearly aiming to create an ecosystem of AI containers that use its hardware as the foundational layer, with these curated microservices as the core software layer for companies looking to accelerate their AI roadmaps.
NIM currently includes support for models from Nvidia, AI21, Adept, Cohere, Getty Images and Shutterstock, as well as open models from Google, Hugging Face, Meta, Microsoft, Mistral AI and Stability AI. Nvidia is already working with Amazon, Google and Microsoft to make these NIM microservices available on SageMaker, Kubernetes Engine and Azure AI, respectively. They will also be integrated into frameworks such as Deepset, LangChain and LlamaIndex.
“We believe that Nvidia’s GPU is the best place to run inference of these models on […], and we believe that NVIDIA NIM is the best software package, the best runtime, for developers to build on top of, so that they can focus on the enterprise applications and just let Nvidia do the work to produce these models for them in the most efficient, enterprise-grade way, so that they can just do the rest of their work,” said Manuvir Das, head of Nvidia’s enterprise computing division, during a press conference ahead of today’s announcements.
As for the inference engine, Nvidia will use Triton Inference Server, TensorRT and TensorRT-LLM. Some of the Nvidia microservices available through NIM will include Riva for customizing speech and translation models, cuOpt for routing optimizations, and the Earth-2 model for weather and climate simulations.
The company plans to add additional features over time, including, for example, making the Nvidia RAG LLM operator available as a NIM, which promises to make building AI chatbots that can pull in custom data much easier.
This wouldn’t be a developer conference without a few customer and partner announcements. Current NIM users include Box, Cloudera, Cohesity, Datastax, Dropbox and NetApp.
“Established enterprise platforms are sitting on a goldmine of data that can be transformed into generative AI copilots,” said Jensen Huang, founder and CEO of Nvidia. “Created with our partner ecosystem, these containerized AI microservices are the building blocks for enterprises in every industry to become AI companies.”