Integrating a differentiable physics simulation layer into the decoupled pose estimation module will significantly reduce inter-object penetration and gravity-defying artifacts in generated scenes without requiring annotated physical data.

Feasibility: 8 Novelty: 7

Motivation

While SceneMaker decouples de-occlusion and pose estimation to improve geometry, it likely treats scene composition as a purely visual or geometric task. Such a treatment can produce physically implausible results, such as floating or interpenetrating objects, particularly in open-set generation where training data coverage is sparse. A physics-aware constraint would bridge the gap between visual generation and physical realism.

Proposed Method

Extend the SceneMaker framework with a differentiable physics layer (e.g., by adding physics-based penalty terms to the pose estimation loss). After the initial pose estimation and de-occlusion pass, run a short simulation step to detect collisions and unstable equilibria. Backpropagate the resulting physical error terms (penetration depth, potential energy) to update the pose parameters iteratively, keeping the de-occluded geometry frozen to maintain visual fidelity.
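
The refinement loop described above can be illustrated with a minimal sketch. This is not SceneMaker's code or the API of any physics library: it assumes each object is approximated by a sphere proxy resting on a ground plane at z = 0, it uses hand-derived gradients in place of the automatic differentiation that a library such as Brax or Warp would provide, and it updates translations only (no rotations). The names `physical_loss_and_grad` and `refine_poses`, and the weights `w_pen` and `w_grav`, are hypothetical choices made for illustration.

```python
import math

def physical_loss_and_grad(centers, radii, w_pen=10.0, w_grav=0.1):
    """Return (loss, grad) for sphere proxies above the ground plane z = 0.

    The loss sums squared penetration depths (object-object and
    object-ground) and a potential-energy term proportional to height.
    """
    n = len(centers)
    loss = 0.0
    grad = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        # Potential energy (unit mass, unit gravity): pulls objects downward.
        loss += w_grav * centers[i][2]
        grad[i][2] += w_grav
        # Ground penetration: positive when the sphere dips below z = 0.
        depth = radii[i] - centers[i][2]
        if depth > 0.0:
            loss += w_pen * depth ** 2
            grad[i][2] -= 2.0 * w_pen * depth
        # Pairwise penetration between sphere i and every later sphere j.
        for j in range(i + 1, n):
            diff = [centers[i][k] - centers[j][k] for k in range(3)]
            dist = math.sqrt(sum(d * d for d in diff))
            depth = radii[i] + radii[j] - dist
            if depth > 0.0 and dist > 1e-9:
                loss += w_pen * depth ** 2
                for k in range(3):
                    g = -2.0 * w_pen * depth * diff[k] / dist
                    grad[i][k] += g
                    grad[j][k] -= g
    return loss, grad

def refine_poses(centers, radii, steps=2000, lr=0.02):
    """Gradient descent on positions only; radii (the geometry) stay frozen."""
    centers = [list(c) for c in centers]
    for _ in range(steps):
        _, grad = physical_loss_and_grad(centers, radii)
        for c, g in zip(centers, grad):
            for k in range(3):
                c[k] -= lr * g[k]
    return centers

# Toy scene: two overlapping objects on the ground and one floating object.
radii = [0.5, 0.5, 0.5]
initial = [(0.0, 0.0, 0.5), (0.6, 0.0, 0.5), (3.0, 0.0, 2.0)]
refined = refine_poses(initial, radii)
loss_before, _ = physical_loss_and_grad(initial, radii)
loss_after, _ = physical_loss_and_grad(refined, radii)
```

In this toy setup the overlapping pair is pushed apart and the floating object settles toward the ground. Because the penalty is soft, objects come to rest with a small residual ground penetration of roughly `w_grav / (2 * w_pen)`, so the weights trade off contact accuracy against how quickly floating objects descend. A full implementation would need mesh-level or signed-distance collision terms and rotational degrees of freedom, which this sketch omits.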

Expected Contribution

A self-supervised method for improving physical plausibility in generative 3D scenes, enhancing their utility for simulation and robotics.

Required Resources

High-end GPUs (e.g., A100), differentiable physics library (e.g., Brax or Warp), SceneMaker codebase.

Source Paper

SceneMaker: Open-set 3D Scene Generation with Decoupled De-occlusion and Pose Estimation Model
