Trust-Region Noise Search (TRS)

TRS enables reward-agnostic alignment across diverse domains, including image generation, molecule design, and protein engineering.

Abstract

Aligning generative models, such as diffusion or flow models, with specific user preferences often requires differentiable reward functions or expensive fine-tuning. We propose Trust-Region Noise Search (TRS), a simple yet effective black-box alignment algorithm.

TRS treats the generative model and the reward function as completely opaque. Instead of updating model weights, it optimizes the source noise (the model's latent input) using a trust-region search. Because only reward values are needed, TRS applies to non-differentiable rewards, and because the weights stay frozen, it avoids the catastrophic forgetting associated with full fine-tuning.

Key Methodology

The core idea of TRS is to iteratively explore the noise space $\mathbb{R}^M$ to find regions that map to samples with higher rewards.
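One way to write this objective down, using notation that is an assumption here rather than taken from the paper: let $G$ denote the frozen generative model mapping noise $z \in \mathbb{R}^M$ to a sample, $r$ the black-box reward, $z_c$ the current best noise vector, and $\Delta_t$ the trust-region radius at iteration $t$.

$$
z^\star = \arg\max_{z \in \mathbb{R}^M} r\big(G(z)\big), \qquad z' = z_c + \delta, \quad \|\delta\| \le \Delta_t
$$

Under this reading, each candidate $z'$ is scored only by evaluating $r(G(z'))$, so no gradient of $r$ or $G$ is required.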

The TRS pipeline: (a) mapping noise to the data manifold, (b) generating new candidates, (c) the perturbation strategy, and (d) dynamic trust-region updates.
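The four pipeline stages can be illustrated with a minimal sketch of a generic trust-region random search over noise. It is not the paper's exact algorithm. The names `toy_generator` and `toy_reward`, the grow/shrink rule, and every hyperparameter value are hypothetical choices made for illustration. The real generator would be a diffusion or flow model and the real reward an aesthetic or docking score.

```python
import numpy as np


def toy_generator(z):
    """Hypothetical stand-in for a frozen generative model: noise -> sample."""
    return np.tanh(z)


def toy_reward(x):
    """Hypothetical black-box reward; only its scalar value is ever used."""
    target = np.linspace(-0.5, 0.5, x.shape[-1])
    return -float(np.sum((x - target) ** 2))


def trust_region_noise_search(generator, reward, dim, n_iters=50,
                              n_candidates=16, radius=1.0, grow=1.5,
                              shrink=0.5, min_radius=1e-3, max_radius=4.0,
                              seed=0):
    """Generic trust-region random search in noise space (illustrative only)."""
    rng = np.random.default_rng(seed)

    # (a) Start from a noise vector and map it to a sample to get a baseline.
    center = rng.standard_normal(dim)
    best = reward(generator(center))
    history = [best]

    for _ in range(n_iters):
        # (b)+(c) Perturb the center: random unit directions with lengths
        # drawn uniformly in [0, radius], so candidates stay in the region.
        directions = rng.standard_normal((n_candidates, dim))
        directions /= np.linalg.norm(directions, axis=1, keepdims=True)
        lengths = radius * rng.uniform(0.0, 1.0, size=(n_candidates, 1))
        candidates = center + lengths * directions

        # Black-box evaluation: forward passes only, no gradients.
        scores = np.array([reward(generator(c)) for c in candidates])
        i = int(np.argmax(scores))

        # (d) Assumed update rule: expand the region on improvement,
        # contract it otherwise.
        if scores[i] > best:
            center, best = candidates[i], float(scores[i])
            radius = min(radius * grow, max_radius)
        else:
            radius = max(radius * shrink, min_radius)
        history.append(best)

    return center, best, history


z_best, r_best, rewards = trust_region_noise_search(
    toy_generator, toy_reward, dim=8, n_iters=30
)
print(f"reward improved from {rewards[0]:.4f} to {r_best:.4f}")
```

Because the center only moves when a candidate scores strictly higher, the best reward in this sketch never decreases across iterations. The cost is `n_candidates` generator and reward evaluations per iteration.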

Impact

Our results demonstrate that TRS can steer models towards high aesthetic scores in text-to-image tasks and optimal docking scores in molecule design, all without ever calculating a single gradient through the generative model itself.

Status: Accepted @ ReALM-GEN Workshop @ ICLR 2026 (Rio de Janeiro)
Collaborators: K. Ram, Prof. Daniel Cremers
Focus: High-efficiency alignment, non-differentiable rewards, black-box optimization.