Hybrid AR-NAR Decoding via Dynamic Entropy Thresholding
Motivation
Non-autoregressive (NAR) generation is fast because it predicts many tokens in parallel, while autoregressive (AR) generation is more coherent because each token conditions on the tokens before it. Denoising Entropy provides a signal, available during decoding, of where the model is uncertain. That signal can drive a dynamic switch: decode easy spans with fast parallel unmasking and fall back to slower serial decoding only where uncertainty is high.
Proposed Method
Develop a hybrid decoding algorithm for a text-based masked diffusion model (MDM). During the parallel unmasking steps, monitor the Denoising Entropy of the positions that are still masked. If the entropy of a span exceeds a learned threshold $\tau$, switch to a localized autoregressive mode for that span, decoding it one token at a time while conditioning on the left context. Once the span is filled, return to parallel decoding for the remainder of the sequence.
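The switching rule can be illustrated with a minimal sketch. It rests on several assumptions that are not in the proposal: Denoising Entropy is approximated here by the per-position Shannon entropy of the predicted token distribution, $\tau$ is a fixed constant rather than a learned value, tokens are chosen greedily, and `predict` and `toy_predict` are hypothetical stand-ins for a real MDM forward pass. Note also that the serial fallback still passes the full partially filled sequence to the model, so it conditions on any right-hand tokens already decoded, not only on the left context.

```python
import math

MASK = None  # marker for a position that is still masked


def entropy(probs):
    """Shannon entropy (nats) of one categorical distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def argmax(probs):
    """Index of the largest probability (first one on ties)."""
    return max(range(len(probs)), key=probs.__getitem__)


def hybrid_decode(predict, length, tau):
    """Entropy-gated hybrid decoding (illustrative only).

    predict(tokens) is a hypothetical model call returning one
    probability list per position. Returns the decoded tokens plus the
    number of parallel steps and serial (one-token) steps taken.
    """
    tokens = [MASK] * length
    parallel_steps = 0
    serial_steps = 0
    while any(t is MASK for t in tokens):
        dists = predict(tokens)
        masked = [i for i, t in enumerate(tokens) if t is MASK]
        confident = [i for i in masked if entropy(dists[i]) <= tau]
        if confident:
            # Parallel mode: unmask every low-entropy position at once.
            for i in confident:
                tokens[i] = argmax(dists[i])
            parallel_steps += 1
            continue
        # Serial fallback: all remaining masked positions exceed tau.
        # Take the leftmost contiguous masked span and fill it left to
        # right, re-querying the model after each new token.
        start = end = masked[0]
        while end + 1 < length and tokens[end + 1] is MASK:
            end += 1
        for i in range(start, end + 1):
            tokens[i] = argmax(predict(tokens)[i])
            serial_steps += 1
        # The loop then resumes parallel decoding for what is left.
    return tokens, parallel_steps, serial_steps


def toy_predict(tokens, hard=(2, 3), vocab=3):
    """Hypothetical toy model: sharp everywhere except 'hard' positions."""
    dists = []
    for i in range(len(tokens)):
        if i in hard:
            dists.append([0.4, 0.35, 0.25])  # high entropy, about 1.08 nats
        else:
            d = [0.01] * vocab
            d[i % vocab] = 0.98  # low entropy, about 0.11 nats
            dists.append(d)
    return dists


if __name__ == "__main__":
    print(hybrid_decode(toy_predict, length=6, tau=0.5))
```

On the toy model, one parallel step fills the four confident positions, and the two-token uncertain span is then decoded serially. In a real system, the ratio of serial steps to parallel steps would be the main quantity to track, since it determines how much of the NAR speed advantage survives for a given $\tau$.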
Expected Contribution
A 'best of both worlds' decoding strategy that aims for near-NAR speed with near-AR coherence, with the specific goal of reducing hallucinations in factual text generation.
Required Resources
An MDM for text (e.g., MDLM) and evaluation metrics for text quality and consistency (e.g., perplexity, factual accuracy metrics).
Source Paper
Optimizing Decoding Paths in Masked Diffusion Models by Quantifying Uncertainty