By Consultants Review Team
Nvidia has unveiled a new artificial intelligence (AI) model aimed at improving the training of AI-powered robotic systems via simulation. Cosmos-Transfer1, a recently released diffusion-based world generation model, aims to provide granular control over simulation environments, making it a valuable tool for developers working on robotics training.
The tech behemoth released the model as an open-source resource under a permissive license, making it available to interested developers and researchers via popular repositories such as GitHub and Hugging Face. The model is the latest addition to Nvidia's Cosmos World Foundation Models (WFMs), a collection of AI models designed to improve simulation-based robotics training.
Simulation-based training is becoming increasingly popular in the robotics industry, particularly for developing hardware that relies on AI as its core controller. Whereas traditional factory robots are designed to perform specific tasks, this approach enables machines to be trained for a wider range of real-world scenarios, significantly increasing their versatility.
Nvidia's Cosmos-Transfer1 generates high-quality, photorealistic video outputs from structured video inputs such as segmentation maps, depth maps, and lidar scans. These outputs can then serve as training environments for AI-powered robots, allowing them to learn from diverse simulated scenes.
According to a paper the company published on the arXiv preprint server, Cosmos-Transfer1 provides better customization than previous models. "It enables varying the weight of different conditional inputs based on spatial location, allowing developers to create highly controllable simulation environments," the graphics chip maker said.
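To make the idea of location-dependent weighting concrete, here is a toy sketch in plain Python. It is not Nvidia's implementation: the function names, the per-pixel normalization, and the tiny 2x2 "feature maps" are all assumptions chosen for illustration. Each conditional input gets its own weight map, and the combined signal at every pixel is the weighted sum of the inputs at that pixel.

```python
# Toy illustration only: per-pixel weighted blending of control signals.
# This is NOT Nvidia's code; every name below is hypothetical.

def normalize_weights(weight_maps):
    """Scale the per-pixel weights so they sum to 1 at every location."""
    names = list(weight_maps)
    height = len(weight_maps[names[0]])
    width = len(weight_maps[names[0]][0])
    normalized = {name: [[0.0] * width for _ in range(height)] for name in names}
    for y in range(height):
        for x in range(width):
            total = sum(weight_maps[name][y][x] for name in names)
            for name in names:
                # Fall back to equal weights where every map is zero.
                normalized[name][y][x] = (
                    weight_maps[name][y][x] / total if total > 0 else 1.0 / len(names)
                )
    return normalized


def blend_controls(control_features, weight_maps):
    """Combine control features with a location-dependent weighted sum."""
    weights = normalize_weights(weight_maps)
    names = list(control_features)
    height = len(control_features[names[0]])
    width = len(control_features[names[0]][0])
    return [
        [
            sum(weights[n][y][x] * control_features[n][y][x] for n in names)
            for x in range(width)
        ]
        for y in range(height)
    ]


# 2x2 example: depth dominates the top-left pixel, segmentation the top-right.
features = {
    "depth": [[1.0, 1.0], [1.0, 1.0]],
    "segmentation": [[3.0, 3.0], [3.0, 3.0]],
}
weights = {
    "depth": [[1.0, 0.0], [3.0, 1.0]],
    "segmentation": [[0.0, 1.0], [1.0, 1.0]],
}
blended = blend_controls(features, weights)
```

In this sketch, a developer could raise the depth weight over a foreground region and the segmentation weight over the background, so each part of the scene follows the control signal best suited to it.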
Cosmos-Transfer1 is a diffusion-based model with seven billion parameters that is tuned for video denoising in the latent space. Its control branch enables the model to accept text and video inputs while producing photorealistic output videos. Four different types of control input videos are supported: Canny edge, blurred RGB, segmentation mask, and depth map.
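As a rough illustration of what a "blurred" control input looks like, the sketch below applies a simple box blur to one channel of a tiny frame. This is an assumption made for illustration only: the article does not say which blur Nvidia uses, and `box_blur` is a hypothetical helper, not part of any Cosmos tooling. For an RGB frame, the same operation would be applied to each color channel.

```python
# Toy illustration only: a box blur over a single image channel.
# The blur actually used to prepare Cosmos-Transfer1 inputs is not
# specified in the article; this helper is hypothetical.

def box_blur(channel, radius=1):
    """Average each pixel with its neighbors in a (2*radius+1) square window."""
    height, width = len(channel), len(channel[0])
    blurred = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Clamp the window at the image borders.
            ys = range(max(0, y - radius), min(height, y + radius + 1))
            xs = range(max(0, x - radius), min(width, x + radius + 1))
            values = [channel[j][i] for j in ys for i in xs]
            blurred[y][x] = sum(values) / len(values)
    return blurred


# A single bright pixel in the center of a 3x3 frame gets spread out.
frame = [
    [0, 0, 0],
    [0, 9, 0],
    [0, 0, 0],
]
blurred_frame = box_blur(frame)
```

The blurred frame keeps the coarse layout and brightness of the scene while discarding fine detail, which is the kind of loose guidance a blurred control video would give a generator.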
The AI model was tested on Nvidia's Blackwell and Hopper series GPUs, with inference performed under the Linux operating system. Its design enables real-time world generation, making AI training more efficient and diverse.
Nvidia has made the Cosmos-Transfer1 AI model available under the Nvidia Open Model License Agreement, which allows for academic and commercial use. Developers and researchers can download the model from Nvidia's GitHub and Hugging Face pages.