Coupling WorldWarp with a global spatial memory module would enable loop-consistent generation over effectively unbounded horizons, where the model recognizes and re-renders previously visited locations without drift.
Motivation
Autoregressive video generation suffers from 'dream drift,' in which the environment morphs uncontrollably over long horizons. WorldWarp handles short-term consistency via warping, but it likely lacks global consistency when the camera returns to a previously visited location. Integrating a persistent, SLAM-like map would let the model recall geometry observed thousands of frames earlier.
Proposed Method
Extend WorldWarp's short-term point cloud buffer to a global voxel map or persistent point cloud stored on disk. Implement a retrieval mechanism that fetches historical geometry whenever the camera trajectory loops back to a visited coordinate. Modify the diffusion conditioning to blend warped recent frames with retrieved historical frames, using the noise schedule to resolve conflicts between the two and favoring historical data so the loop closes; a sketch of this pipeline follows.
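To make the retrieval and blending steps concrete, here is a minimal sketch, assuming a simple voxel-hash store keyed by quantized world coordinates and a linear schedule-dependent blend. All names (`GlobalVoxelMap`, `query_frustum`, `blend_conditioning`), the voxel resolution, and the radius-based lookup are hypothetical illustrations, not part of WorldWarp's released code or its actual conditioning scheme.

```python
import numpy as np
from collections import defaultdict

VOXEL_SIZE = 0.25  # assumed map resolution in meters


class GlobalVoxelMap:
    """Persistent spatial memory: quantized world coordinates -> stored point
    observations. A stand-in for extending WorldWarp's short-term point cloud
    buffer to a global, disk-backed store."""

    def __init__(self, voxel_size: float = VOXEL_SIZE):
        self.voxel_size = voxel_size
        self.voxels = defaultdict(list)  # voxel index -> list of (xyz, rgb)

    def _key(self, xyz: np.ndarray) -> tuple:
        return tuple(np.floor(xyz / self.voxel_size).astype(int))

    def insert(self, points: np.ndarray, colors: np.ndarray) -> None:
        """Add unprojected points (N, 3) and their colors (N, 3) from a new frame."""
        for p, c in zip(points, colors):
            self.voxels[self._key(p)].append((p, c))

    def query_frustum(self, cam_center: np.ndarray, radius: float) -> np.ndarray:
        """Retrieve stored points within `radius` of the camera center.
        A real system would cull by the full view frustum; a radius query
        keeps the sketch simple."""
        hits = []
        r_vox = int(np.ceil(radius / self.voxel_size))
        bx, by, bz = self._key(cam_center)
        for dx in range(-r_vox, r_vox + 1):
            for dy in range(-r_vox, r_vox + 1):
                for dz in range(-r_vox, r_vox + 1):
                    for p, c in self.voxels.get((bx + dx, by + dy, bz + dz), []):
                        if np.linalg.norm(p - cam_center) <= radius:
                            hits.append(np.concatenate([p, c]))
        return np.array(hits) if hits else np.empty((0, 6))


def blend_conditioning(warped_recent, rendered_historical, t, t_max):
    """Blend the two conditioning images per diffusion timestep.
    High-noise steps lean on retrieved historical geometry to lock in loop
    closure; low-noise steps lean on recently warped frames for fine detail.
    The linear weighting is an assumption for illustration."""
    w_hist = t / t_max  # 1.0 at the noisiest step, 0.0 at the final step
    return w_hist * rendered_historical + (1.0 - w_hist) * warped_recent


if __name__ == "__main__":
    # Toy usage: insert geometry at a visited location, then "return" nearby
    # and retrieve it as historical conditioning.
    memory = GlobalVoxelMap()
    memory.insert(np.random.randn(500, 3), np.random.rand(500, 3))

    revisited = memory.query_frustum(cam_center=np.zeros(3), radius=2.0)
    print(f"retrieved {len(revisited)} historical points at loop closure")

    recent = np.zeros((64, 64, 3))       # placeholder warped recent frame
    historical = np.ones((64, 64, 3))    # placeholder rendered historical frame
    cond = blend_conditioning(recent, historical, t=900, t_max=1000)
    print("historical weight at t=900/1000:", cond.mean())
```

Weighting historical data more heavily at high noise levels is one plausible way to realize "favoring historical data to close the loop": the early denoising steps fix global layout from the map, while later steps refine appearance from the recent warps.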
Expected Contribution
Demonstration of consistent environment exploration over arbitrarily long horizons, addressing the loop-closure problem in generative video models.
Required Resources
Significant memory and storage for spatial maps; SLAM expertise; compute for long-sequence video generation.
Source Paper
WorldWarp: Propagating 3D Geometry with Asynchronous Video Diffusion