Trust-Region Noise Search for Black-Box Alignment of Diffusion and Flow Models

TU Munich
TRS Alignment Teaser
TRS balances global exploration of noise space with local refinement around good regions.
Inference-time alignment overview
Searching over noise at inference time can substantially improve the outputs of even smaller base models.

In a Nutshell

Diffusion and flow models generate outputs from random noise. What if we could search for better noise? TRS does exactly that: it treats the generative model and any reward function as black boxes, and uses a trust-region search to find noise inputs that produce higher-reward outputs, without retraining the model or backpropagating through the model or the reward.

Key highlights

  • Outperforms noise-optimization and even full trajectory-optimization baselines under identical compute
  • Tested across text-to-image (SD1.5, SDXL), molecule design, and protein design
  • Works with any reward: differentiable, non-differentiable, expensive, or human-proxy
  • Stays on the data manifold — no gradient drift, stable even with many optimization steps
  • Simple algorithm, minimal hyperparameter tuning across all tasks

How It Works

A generative model maps a noise vector to an output — an image, a molecule, a protein. Change the noise and you change the output. Some noise inputs lead to outputs that score much higher on a given reward. TRS finds them.
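To make the black-box view concrete, here is a minimal sketch of the objective being searched: noise goes in, a scalar reward comes out, and nothing in between is inspected. The names `toy_generator`, `toy_reward`, `objective`, and `best_of_n` are hypothetical stand-ins invented for this illustration; a real setup would call a frozen diffusion or flow sampler and a real reward model instead. The plain random-search loop is only a reference point, not TRS itself.

```python
import math
import random


def toy_generator(z):
    # Hypothetical stand-in for a frozen generative model: a fixed map from
    # a noise vector to an "output". A real model would run its sampler here.
    return [math.tanh(v) for v in z]


def toy_reward(x):
    # Hypothetical reward: highest (zero) when every output value equals 0.5.
    return -sum((v - 0.5) ** 2 for v in x)


def objective(z):
    # The search only ever sees this composition: noise in, scalar reward out.
    return toy_reward(toy_generator(z))


def best_of_n(objective_fn, dim, n, seed=0):
    # Plain random search: draw n Gaussian noise vectors and keep the best one.
    rng = random.Random(seed)
    best_z, best_r = None, float("-inf")
    for _ in range(n):
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        r = objective_fn(z)
        if r > best_r:
            best_z, best_r = z, r
    return best_z, best_r
```

Calling `best_of_n(objective, dim=4, n=64)` returns the best of 64 random draws. With a fixed seed, a larger `n` can never return a lower reward than a smaller one, because the earlier draws are identical.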

TRS starts by sampling random noise vectors, evaluating them, and selecting the top-k as starting points. It then maintains a “trust region” around each — a local neighborhood where it proposes new candidates by perturbing a random subset of noise dimensions. Regions that find improvements expand; those that stall contract. After each iteration, all regions re-center on the globally best noise vectors, naturally shifting from broad exploration toward focused exploitation.
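The loop described above can be sketched in a few dozen lines. This is an illustrative sketch under stated assumptions, not the authors' implementation: the additive Gaussian perturbation, the doubling and halving of the radius, the radius bounds, the default parameter values, and keeping radii attached to region slots after re-centering are all choices made for this example. `toy_objective` is a hypothetical stand-in for "generate from noise, then score with the reward".

```python
import math
import random


def toy_objective(z):
    # Hypothetical black box standing in for reward(generator(z)).
    return -sum((math.tanh(v) - 0.5) ** 2 for v in z)


def trust_region_search(objective_fn, dim, n_init=32, k=4, iters=20,
                        per_region=8, init_radius=0.8, min_radius=0.05,
                        max_radius=1.6, perturb_prob=0.3, seed=0):
    rng = random.Random(seed)

    # 1) Global exploration: evaluate random noise vectors, keep the top-k.
    pool = []
    for _ in range(n_init):
        z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        pool.append((objective_fn(z), z))
    pool.sort(key=lambda p: p[0], reverse=True)
    centers = pool[:k]
    radii = [init_radius] * k          # one radius per region slot (assumption)
    history = [centers[0][0]]          # best reward seen after each round

    for _ in range(iters):
        candidates = []
        for i, (r_center, z_center) in enumerate(centers):
            improved = False
            for _ in range(per_region):
                # 2) Perturb only a random subset of the noise dimensions.
                mask = [rng.random() < perturb_prob for _ in range(dim)]
                if not any(mask):
                    mask[rng.randrange(dim)] = True
                z_new = [v + radii[i] * rng.gauss(0.0, 1.0) if m else v
                         for v, m in zip(z_center, mask)]
                r_new = objective_fn(z_new)
                candidates.append((r_new, z_new))
                improved = improved or r_new > r_center
            # 3) Expand regions that improved, contract those that stalled.
            if improved:
                radii[i] = min(2.0 * radii[i], max_radius)
            else:
                radii[i] = max(0.5 * radii[i], min_radius)
        # 4) Re-center every region on the globally best vectors so far.
        merged = centers + candidates
        merged.sort(key=lambda p: p[0], reverse=True)
        centers = merged[:k]
        history.append(centers[0][0])

    best_r, best_z = centers[0]
    return best_z, best_r, history
```

With these defaults the sketch uses `n_init + iters * k * per_region` objective evaluations, which is the quantity to hold fixed when comparing against another black-box search. Because previous centers stay in the candidate pool at re-centering, the best reward in `history` never decreases.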

Because TRS only modifies the source noise and never touches the model internals, generated samples stay on the learned data manifold, avoiding the off-manifold drift that gradient-based methods can suffer from.

Trust-region search algorithm
Trust regions expand around improving noise vectors and contract when stalled, balancing exploration and exploitation.

Results

Text-to-Image

Text-to-image visual examples
Optimized SDXL-Lightning samples. TRS outputs match the prompt more faithfully — correct counts, readable text, accurate layout.

Molecule & Protein Design

Molecule and protein visual examples
Randomly selected optimized molecules and proteins. TRS samples land closer to target properties and achieve higher designability.

Quantitative Comparison

Quantitative text-to-image results
Molecule optimization results
Protein optimization results

Mean best rewards and scaling curves. TRS consistently scores highest under the same compute budget across all domains.

BibTeX

@misc{schweiger2026trustregionnoisesearchblackbox,
      title={Trust-Region Noise Search for Black-Box Alignment of Diffusion and Flow Models},
      author={Niklas Schweiger and Daniel Cremers and Karnik Ram},
      year={2026},
      eprint={2603.14504},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2603.14504},
}