ImplicitRDP: An End-to-End Visual-Force Diffusion Policy with Structural Slow-Fast Learning

7.20 · arXiv:2512.10946 · 2025-12-11

Authors

Wendi Chen; Han Xue; Yi Wang; Fangyuan Zhou; Jun Lv; Yang Jin; Shirun Tang; Chuan Wen; Cewu Lu

Scores

Novelty: 7.7
Technical: 7.7
Transferability: 6.0
Momentum: 7.3
Evidence: 7.0
Breakthrough: 7.0

Rationale

The paper integrates visual and force modalities in a unified end-to-end diffusion policy, addressing a significant challenge in contact-rich manipulation. The Structural Slow-Fast Learning mechanism and Virtual-target-based Representation Regularization are innovative contributions that improve model performance. The work is technically significant because it improves reactivity and robustness, both major bottlenecks in robotic manipulation. Although the method primarily targets robotics, it could transfer to other domains that require multi-modal integration. It aligns closely with ongoing research in multi-modal learning and robotics. The empirical evidence is solid: the method outperforms the baselines, and the released code and video support reproducibility. The approach has a good chance of influencing future work on multi-modal AI systems.
