VLA-JEPA Collection VLA-JEPA model checkpoints (LIBERO, Pretrain, SimplerEnv) • 3 items • Updated 14 days ago • 12
Article Introducing NVIDIA Cosmos Policy for Advanced Robot Control nvidia • Jan 29 • 48
VST Collection A comprehensive framework designed to cultivate VLMs with human-like visuospatial abilities. • 6 items • Updated Feb 1 • 6
MolmoAct Data Mixture Collection All datasets for the MolmoAct (Multimodal Open Language Model for Action) release. • 4 items • Updated Dec 23, 2025 • 20
MolmoAct Collection All models for the MolmoAct (Multimodal Open Language Model for Action) release. • 10 items • Updated May 4 • 37
Article Welcome Gemma 3: Google's all new multimodal, multilingual, long context open LLM +2 ariG23498, merve, pcuenq, reach-vb • Mar 12, 2025 • 497
Cohere Labs Aya Vision Collection Aya Vision is a state-of-the-art family of vision models that brings multimodal capabilities to 23 languages. • 5 items • Updated Jul 31, 2025 • 74
Cosmos-Predict1 Collection ⚠️ This collection is archived. 👉 https://huggingface.co/collections/nvidia/cosmos3 • 14 items • Updated 4 days ago • 304
Llama 3.2 Collection This collection hosts the transformers and original repos of the Llama 3.2 and Llama Guard 3 • 15 items • Updated Dec 6, 2024 • 674
Molmo Collection Artifacts for open multimodal language models. • 5 items • Updated Dec 23, 2025 • 310
Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions Paper • 2309.10150 • Published Sep 18, 2023 • 26
Language Embedded Radiance Fields for Zero-Shot Task-Oriented Grasping Paper • 2309.07970 • Published Sep 14, 2023 • 8
Goal Representations for Instruction Following: A Semi-Supervised Language Interface to Control Paper • 2307.00117 • Published Jun 30, 2023 • 6