DrivePI: Spatial-aware 4D MLLM for Unified Autonomous Driving Understanding, Perception, Prediction and Planning
Presents DrivePI, a 4D (3D + time) multimodal large model for autonomous driving that unifies perception, prediction, and planning. Instead of separate stacks, DrivePI treats driving as a holistic spatial-temporal understanding problem, ingesting sensor data and outputting both scene interpretations and future trajectories. It’s another sign that end-to-end or semi end-to-end ‘driving MLLMs’ are becoming a serious research direction.
Zhe Liu, Runhui Huang