Trust-Region Noise Search (TRS)

Publication · Black-box alignment for diffusion and flow models.
Niklas Schweiger, K. Ram, Daniel Cremers
Accepted @ ReALM-GEN Workshop, ICLR 2026.

We propose a simple trust-region-based search algorithm (TRS) that treats the pre-trained generative and reward models as black boxes and optimizes only the source noise to achieve reward-agnostic alignment.

TRS allows for reward-agnostic alignment across diverse domains including image generation, molecule design, and protein engineering.

Abstract

Aligning generative models (like diffusion or flow models) with specific user preferences often requires differentiable reward functions or expensive fine-tuning. We propose Trust-Region Noise Search (TRS), a simple yet effective black-box alignment algorithm.

TRS treats the generative model and the reward function as completely opaque. Instead of updating model weights, it optimizes the source noise (the latent input from which sampling starts) using a trust-region search. This makes it applicable to non-differentiable rewards and avoids the catastrophic forgetting associated with full fine-tuning.

Key Methodology

The core idea of TRS is to iteratively explore the noise space $\mathbb{R}^M$ to find regions that map to samples with higher rewards.
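One way to write this search down, as a sketch rather than the paper's own formulation: let $G$ denote the frozen generative model, $r$ the black-box reward, $z_t$ the current best noise, and $\Delta_t$ the trust-region radius. These symbols and the choice of norm are assumptions introduced here for illustration.

$$
z^\star = \arg\max_{z \in \mathbb{R}^M} r\big(G(z)\big), \qquad z_{t+1} \in \{\, z : \lVert z - z_t \rVert_\infty \le \Delta_t \,\}
$$

Only forward evaluations of $r(G(z))$ are needed; no gradient of $r$ or $G$ appears.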

The TRS pipeline: (a) Mapping noise to data manifold, (b) Generating new candidates, (c) Perturbation strategy, and (d) Dynamic trust-region updates.
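The four stages above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `trust_region_noise_search`, the uniform box perturbation, the grow/shrink factors, and the toy `toy_generate` and `toy_reward` functions are all assumptions chosen to keep the example short and self-contained. The generator and the reward are only ever called, never differentiated.

```python
import numpy as np

def trust_region_noise_search(generate, reward, dim, n_iters=50, n_candidates=16,
                              radius=1.0, grow=1.5, shrink=0.5,
                              min_radius=1e-3, max_radius=4.0, seed=0):
    """Hypothetical sketch of a trust-region search over source noise."""
    rng = np.random.default_rng(seed)
    center = rng.standard_normal(dim)            # initial source noise
    best = float(reward(generate(center)))       # (a) map noise to a sample and score it
    for _ in range(n_iters):
        # (b)+(c) propose candidates by perturbing the center inside a box of
        # half-width `radius` (an assumed perturbation strategy)
        steps = rng.uniform(-radius, radius, size=(n_candidates, dim))
        candidates = center + steps
        scores = np.array([reward(generate(z)) for z in candidates])
        i = int(np.argmax(scores))
        # (d) dynamic trust-region update: expand on success, shrink on failure
        if scores[i] > best:
            center, best = candidates[i].copy(), float(scores[i])
            radius = min(radius * grow, max_radius)
        else:
            radius = max(radius * shrink, min_radius)
    return center, best

# Toy stand-ins for the black boxes (assumptions, not real models).
def toy_generate(z):
    return np.tanh(z)                            # pretend "generative model"

def toy_reward(x):
    return float(np.sum(x > 0.5))                # non-differentiable reward

z_best, r_best = trust_region_noise_search(toy_generate, toy_reward, dim=8)
```

Because the reward here is a piecewise-constant count, gradient-based alignment would receive no signal, while the search still works from forward evaluations alone. In a real setting `generate` would wrap a full diffusion or flow sampling run, so the number of candidates per iteration directly sets the compute budget.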

Impact

Our results demonstrate that TRS can steer models towards high aesthetic scores in text-to-image tasks and optimal docking scores in molecule design, all without ever calculating a single gradient through the generative model itself.