SpatialLM: Large Language Model for Spatial Understanding
Updated Mar 28, 2025 (Python)
[NeurIPS 2024] Official code for HourVideo: 1-Hour Video Language Understanding
[ICLR 2025] SPA: 3D Spatial-Awareness Enables Effective Embodied Representation
Large-scale photo-realistic virtual worlds for embodied AI
[CVPR 2025] Source codes for the paper "3D-Mem: 3D Scene Memory for Embodied Exploration and Reasoning"
[CVPR 2025] Code for "StarGen: A Spatiotemporal Autoregression Framework with Video Diffusion Model for Scalable and Controllable Scene Generation".
Multimodal datasets for spatial intelligence
"Gradio" Interface for SpatialLM Model | A 3D Large Language Model for Structured Scene Understanding, Processing Point Cloud Data from Monocular Videos, RGBD Images, and LiDAR.
Trying out SpatialLM (SpatialLM: Large Language Model for Spatial Understanding). Impressed with results 💖