
A closed-loop 'Failure-Driven Synthesis' pipeline, in which synthetic videos are generated specifically to reduce policy uncertainty or correct observed failure modes, will yield higher success rates with fewer generated samples than uniform data augmentation.

Feasibility: 8 · Novelty: 7

Motivation

Current generative augmentation strategies typically expand datasets uniformly or randomly. However, robot policies tend to fail in specific 'long-tail' scenarios that uniform sampling rarely covers. Generating data that targets the policy's current weaknesses, in the spirit of active learning, should make the synthesis process markedly more sample-efficient and effective.

Proposed Method

1. Train an initial policy on real data with an uncertainty-estimation head (e.g., ensemble variance).
2. During evaluation, identify states where the policy is highly uncertain or outright fails.
3. Use each such failure state as the initial condition (first frame) for RoboVIP, prompting it to generate a successful task-completion video from that specific viewpoint.
4. Add these 'corrective' videos to the training set, fine-tune the policy, and repeat the loop.

A minimal code sketch of one round of this loop follows.
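The sketch below illustrates one round of the loop in PyTorch. Everything beyond the four steps above is an assumption introduced for illustration: the ensemble architecture and dimensions, the simulator API `env.reset()`/`env.step()` (assumed here to also return the current camera frame and a failure flag), the `generate_corrective_video` wrapper (RoboVIP's real generation interface may differ), and the `label_actions` routine that turns generated videos into (observation, action) pairs, e.g., via an inverse-dynamics model.

```python
# Minimal sketch of one failure-driven synthesis round (steps 1-4).
# All APIs, names, and dimensions below are assumptions, not the
# source paper's interface.
import torch
import torch.nn as nn

OBS_DIM, ACT_DIM, N_MODELS = 32, 7, 5  # toy sizes, assumed


def make_policy():
    return nn.Sequential(nn.Linear(OBS_DIM, 128), nn.ReLU(),
                         nn.Linear(128, ACT_DIM))


class EnsemblePolicy(nn.Module):
    """Step 1: uncertainty head via ensemble variance over K member policies."""

    def __init__(self, k=N_MODELS):
        super().__init__()
        self.members = nn.ModuleList([make_policy() for _ in range(k)])

    def forward(self, obs):
        preds = torch.stack([m(obs) for m in self.members])  # (K, B, ACT_DIM)
        mean = preds.mean(0)
        # Scalar epistemic-uncertainty proxy: variance across members,
        # averaged over action dimensions.
        uncertainty = preds.var(0).mean(-1)                   # (B,)
        return mean, uncertainty


def generate_corrective_video(first_frame, task_prompt):
    """Step 3 (hypothetical wrapper): condition RoboVIP on the failure
    state's first frame and prompt for a successful task completion.
    The real interface may differ; this signature is assumed."""
    raise NotImplementedError("call RoboVIP here")


def failure_driven_round(policy, env, label_actions, task_prompt,
                         threshold=0.05, budget=16):
    """One closed-loop round: roll out, harvest high-uncertainty or failed
    states (step 2), synthesize corrective videos (step 3), fine-tune (step 4)."""
    corrective = []
    for _ in range(budget):
        obs, frame = env.reset()                              # assumed API
        done, failed = False, False
        while not done:
            action, unc = policy(torch.as_tensor(obs).float().unsqueeze(0))
            if unc.item() > threshold:                        # uncertain state found
                failed = True
                break
            obs, frame, done, failed = env.step(
                action.squeeze(0).detach().numpy())           # assumed API
        if failed:
            corrective.append(generate_corrective_video(frame, task_prompt))

    # Step 4: fine-tune on (observation, action) pairs extracted from the
    # generated videos; `label_actions` is assumed to be an inverse-dynamics
    # model or the generator's own action track.
    data = label_actions(corrective)                          # [(obs_b, act_b), ...]
    opt = torch.optim.Adam(policy.parameters(), lr=1e-4)
    for obs_b, act_b in data:
        pred, _ = policy(obs_b)
        loss = nn.functional.mse_loss(pred, act_b)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return policy
```

The gating decision is the core of the idea: only states whose ensemble variance exceeds the threshold are sent to the generator, so a fixed generation budget concentrates on the policy's current blind spots rather than being spread uniformly over the state space.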

Expected Contribution

Demonstration of an active-learning framework for generative video in robotics, showing that failure-targeted synthesis outperforms uniform ('blind') augmentation at resolving edge-case failures.

Required Resources

RoboVIP model access, robot simulation environment (or real hardware) for evaluation, compute for iterative policy training and video generation.

Source Paper

RoboVIP: Multi-View Video Generation with Visual Identity Prompting Augments Robot Manipulation
