Meta AI releases joint embedding predictive world models (JEPA-WMs) for physical planning

According to an April 3 (UTC+8) news report, the Meta AI research team released JEPA-WMs, joint embedding predictive world models for physical planning, along with the accompanying research. The study investigates the key factors behind the models' success, and the release provides a complete PyTorch implementation, datasets, and pretrained models. Alongside the core JEPA-WM, the release includes DINO-WM and V-JEPA-2-AC(fixed) models as baselines, covering multiple robotic manipulation and navigation environments such as DROID & RoboCasa, Metaworld, Push-T, PointMaze, and Wall. The models use visual encoders such as DINOv3 ViT-L/16, DINOv2 ViT-S/14, and V-JEPA-2 ViT-G/16, with input image resolutions of mainly 224×224 or 256×256. The project also provides an optional VM2M decoder head for visualization and trajectory decoding, but emphasizes that this decoder is not necessary for training a world model or running planning evaluations. All resources have been made public on GitHub, Hugging Face, and arXiv. (Source: InfoQ)
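For readers unfamiliar with the setup, the general pattern behind such models is to encode images into a latent space with a frozen visual encoder (e.g. DINOv2), learn a dynamics model that predicts future latents from actions, and then plan by searching over action sequences entirely in latent space, which is why a pixel decoder is optional. The PyTorch sketch below illustrates that pattern only; it is not the released repository's API. The class and function names (`LatentDynamics`, `cem_plan`) are hypothetical, and the cross-entropy-method planner shown is one common choice for this setting, not necessarily the one used in the release.

```python
# Minimal sketch of latent-space planning with a learned world model.
# All names are illustrative; this does not reflect the JEPA-WMs codebase.
import torch
import torch.nn as nn


class LatentDynamics(nn.Module):
    """Hypothetical world model: predicts the next latent state from the
    current latent state and an action (a stand-in for a JEPA-style predictor)."""

    def __init__(self, latent_dim: int, action_dim: int, hidden: int = 512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + action_dim, hidden),
            nn.GELU(),
            nn.Linear(hidden, latent_dim),
        )

    def forward(self, z: torch.Tensor, a: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([z, a], dim=-1))


@torch.no_grad()
def cem_plan(model, z0, z_goal, action_dim, horizon=8,
             pop=256, elites=32, iters=5):
    """Cross-entropy-method planning in latent space: sample action sequences,
    roll them through the dynamics model, and refit the sampling distribution
    to the sequences whose final latent lands closest to the goal latent."""
    mean = torch.zeros(horizon, action_dim)
    std = torch.ones(horizon, action_dim)
    for _ in range(iters):
        actions = mean + std * torch.randn(pop, horizon, action_dim)
        z = z0.expand(pop, -1)
        for t in range(horizon):
            z = model(z, actions[:, t])
        cost = (z - z_goal).pow(2).sum(dim=-1)  # distance to goal latent
        elite = actions[cost.topk(elites, largest=False).indices]
        mean, std = elite.mean(dim=0), elite.std(dim=0) + 1e-6
    return mean  # planned action sequence, shape (horizon, action_dim)


if __name__ == "__main__":
    latent_dim, action_dim = 384, 4  # 384 matches DINOv2 ViT-S/14's embedding width
    model = LatentDynamics(latent_dim, action_dim)
    # In practice z0 / z_goal would come from encoding the current and goal
    # images with the frozen encoder; random tensors stand in here.
    z0, z_goal = torch.randn(1, latent_dim), torch.randn(1, latent_dim)
    plan = cem_plan(model, z0, z_goal, action_dim)
    print(plan.shape)  # torch.Size([8, 4])
```

Because both the cost and the rollout live in latent space, no decoder is ever invoked during planning, which matches the release's note that the VM2M decoder head is only needed for visualization.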
