Nvidia announced today it’s releasing a family of foundational AI models called Cosmos that can be used to train humanoids, industrial robots, and self-driving cars. While language models learn how to generate text by training on copious amounts of books, articles, and social media posts, Cosmos is designed to generate images and 3D models of the physical world.
During a keynote presentation at the annual CES conference in Las Vegas, Nvidia CEO Jensen Huang showed examples of Cosmos being used to simulate activities inside warehouses. Cosmos was trained on 20 million hours of real footage of “humans walking, hands moving, manipulating things,” Huang said. “It’s not about generating creative content, but teaching the AI to understand the physical world.”
Researchers and startups hope that these kinds of foundational models could give robots used in factories and homes more sophisticated capabilities. Cosmos can, for example, generate realistic video footage of boxes falling from shelves inside a warehouse, which can be used to train a robot to recognize accidents. Users can also fine-tune the models using their own data.
A number of companies are already using Cosmos, Nvidia says, including humanoid robot startups Agility and Figure AI as well as self-driving car companies like Uber, Waabi, and Wayve.
Nvidia also announced software designed to help different kinds of robots learn to perform new tasks more efficiently. The new feature, part of Nvidia’s existing Isaac robot simulation platform, will allow robot builders to take a small number of examples of a desired task, like grasping a particular object, and generate large amounts of synthetic training data from them.
Nvidia hopes that Cosmos and Isaac will appeal to companies looking to build and use humanoid robots. Huang was joined on stage at CES by life-sized images of 14 different humanoid robots developed by companies including Tesla, Boston Dynamics, Agility, and Figure.
Along with Cosmos, Nvidia also announced Project Digits, a $3,000 “personal AI supercomputer” that can run a large language model of up to 200 billion parameters without the need for cloud services from the likes of AWS or Microsoft. It also announced its highly anticipated next-generation RTX Blackwell GPUs, and forthcoming software tools to help build AI agents.