# Controlled Generation with Equivariant Variational Flow Matching

Floor Eijkelboom 1, Heiko Zimmermann 2, Sharvaree Vadgama 2, Erik J. Bekkers 2, Max Welling 1, Christian A. Naesseth 1, Jan-Willem van de Meent 1

*Equal contribution. 1 Bosch-Delta Lab, 2 AMLab. Correspondence to: Floor Eijkelboom.

Proceedings of the 42nd International Conference on Machine Learning, Vancouver, Canada. PMLR 267, 2025. Copyright 2025 by the author(s).

## Abstract

We derive a controlled generation objective within the framework of Variational Flow Matching (VFM), which casts flow matching as a variational inference problem. We demonstrate that controlled generation can be implemented in two ways: (1) by end-to-end training of conditional generative models, or (2) as a Bayesian inference problem, enabling post hoc control of unconditional models without retraining. Furthermore, we establish the conditions required for equivariant generation and provide an equivariant formulation of VFM tailored for molecular generation, ensuring invariance to rotations, translations, and permutations. We evaluate our approach on both uncontrolled and controlled molecular generation, achieving state-of-the-art performance on uncontrolled generation and outperforming state-of-the-art models in controlled generation, both with end-to-end training and in the Bayesian inference setting. This work strengthens the connection between flow-based generative modeling and Bayesian inference, offering a scalable and principled framework for constraint-driven and symmetry-aware generation.

## 1. Introduction

Generative modeling has seen remarkable advances in recent years, particularly in image generation (Ramesh et al., 2022; Rombach et al., 2022), where diffusion-based approaches based on score matching (Vincent, 2011) have proven highly effective (Ho et al., 2020a; Song et al., 2020b). However, these methods rely on stochastic dynamics that require iterative denoising steps during sampling, leading to significant computational overhead (Song et al., 2020a; Zhang & Chen, 2022). An alternative approach, continuous normalizing flows (CNFs) (Chen et al., 2018), models a continuous-time transformation between distributions (Song et al., 2021), enabling direct sampling without Markov chain steps. Yet, CNFs have historically been hindered by their reliance on solving high-dimensional ordinary differential equations (ODEs), making both training and sampling computationally expensive (Ben-Hamu et al., 2022; Rozen et al., 2021; Grathwohl et al., 2019).

To address these challenges, Lipman et al. (2023) introduced Flow Matching (FM), a simulation-free method for training CNFs by regressing onto vector fields that define probability paths between noise and data distributions. Unlike traditional CNF training, which requires maximum likelihood estimation through ODE solvers, FM directly learns vector fields through a per-sample objective, enabling scalable training without numerical integration. FM generalizes beyond diffusion methods by accommodating arbitrary probability paths, including those based on optimal transport (Chen & Lipman, 2024; Klein et al., 2023), enabling faster training while maintaining expressiveness. Empirically, FM has outperformed diffusion models in likelihood estimation and sample quality on datasets such as ImageNet (Wildberger et al., 2024; Dao et al., 2023; Köhler et al., 2023).

Variational Flow Matching (VFM) (Eijkelboom et al., 2024) frames flow matching as posterior inference in the distribution over trajectories induced by the chosen interpolation. The key idea is to approximate the posterior probability path, i.e. the probability distributions (for different points in time) over endpoints given the current point in space, using a sequence of variational distributions.
For generation, VFM approximates the true vector field with the expected vector field under the learned variational approximation. This approach achieves state-of-the-art results in categorical data generation while maintaining the computational efficiency that makes FM practical. While VFM has demonstrated considerable success in categorical data generation and shows promising initial results on general geometries and molecular generation tasks (Zaghen et al., 2025; Guzmán-Cordero et al., 2025), its broader potential, both methodologically and in terms of real-world applications, has not been explored. In particular, a key distinguishing feature of VFM is its ability to directly reason about (approximate) posterior probability paths, which makes it well suited to address two fundamental challenges in flow-matching-based generative modeling: controlled generation and the incorporation of inductive biases such as symmetries.

Controlled generation is a fundamental challenge in generative modeling, requiring models to produce outputs that satisfy specific constraints while maintaining natural variation in unconstrained aspects. This is particularly crucial for applications like molecular design, where the interplay between discrete (atom types, bonds) and continuous (spatial positions) properties typically necessitates combining multiple generative approaches. VFM's unified treatment of mixed modalities could potentially address this challenge, though developing effective control mechanisms remains an open problem. Additionally, many real-world applications exhibit inherent symmetries: for instance, molecular structures should remain valid under rotations, translations, and permutations. Incorporating such domain-specific constraints into VFM is essential for producing outputs with consistent structure and improved generalization.
Thus, extending VFM to handle both controlled generation and symmetry constraints represents a key direction for developing more practical and reliable generative models. In this work, we address both of these challenges. We extend VFM to controlled generation, deriving a principled formulation that enables generative models to satisfy explicit constraints. We show that this formulation emerges naturally from the connection between flow matching and variational inference (Eijkelboom et al., 2024), allowing conditional generation to be understood as a Bayesian inference problem. Additionally, this perspective enables post-hoc control of pretrained generative models without requiring conditional training, providing a cheap and flexible alternative to standard end-to-end approaches. Furthermore, we show how to design variational approximations that are group-equivariant in their expectation, which ensures that the generative dynamics respect key invariances. We demonstrate the utility of these advances on problems in molecular generation, a field that has recently received great attention and, as a result, seen many advances in the state of the art. Our key contributions are:

- **Controlled Generation as Variational Inference:** We derive a controlled generation objective within VFM and show that it can be used in two ways: (1) for end-to-end training of conditional generative models, or (2) as a Bayesian inference problem, enabling post hoc control of unconditional models without retraining.
- **Equivariant Formulation:** We establish the conditions required for equivariant generation and provide an equivariant formulation of VFM, ensuring that the generative process respects symmetries such as rotations, translations, and permutations, which are critical for molecular modeling.
- **Results for Molecular Generation:** We validate our method on both unconditional and controlled molecular generation, achieving state-of-the-art results on unconditional tasks and significantly outperforming existing models in conditional generation, both with and without end-to-end training. Notably, post hoc Bayesian inference matches or surpasses explicitly trained conditional models, demonstrating its flexibility and scalability.

These advances establish VFM as a robust and efficient framework for constraint-driven generative modeling, with broad applications in molecule generation, e.g. material design and drug discovery, and beyond.

## 2. Background

### 2.1. Transport Dynamics for Generative Modeling

Generative modeling through transport dynamics is a flexible framework for approximating complex distributions, such as distributions over valid molecular structures, by transforming a simple distribution $p_0$ (e.g., a standard Gaussian) into a target distribution $p_1$. This transformation is typically described by a time-dependent process governed by an ordinary differential equation (ODE):

$$\frac{\mathrm{d}}{\mathrm{d}t}\varphi_t(x) = u_t(\varphi_t(x)), \quad \text{with } \varphi_0(x) = x, \tag{1}$$

where $u_t : [0,1] \times \mathbb{R}^D \to \mathbb{R}^D$ is a velocity field that guides the evolution over time. The task is to approximate $u_t$ using a parameterized model $v_t^\theta(x)$, such as a neural network. While the resulting flow is invertible (under Lipschitz continuity of $u_t$), which defines a likelihood through a change-of-variables computation, solving ODEs during training is computationally expensive. Flow Matching addresses this limitation by directly learning the time-dependent vector field $u_t$ on $[0,1]$ through:

$$\mathcal{L}_{\mathrm{FM}}(\theta) := \mathbb{E}_{t,x}\left[ \left\| u_t(x) - v_t^\theta(x) \right\|^2 \right]. \tag{2}$$

While $u_t$ is intractable, we can make an assumption on the velocity field $u_t(x \mid x_1)$ for a generative process that is conditioned on a specific endpoint $x_1$ (e.g., a target molecule).
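To make the transport dynamics of Eq. (1) concrete, the sketch below integrates the ODE with a simple Euler scheme. The straight-line conditional velocity field and the step count are illustrative assumptions for this toy example, not the paper's method; for this particular field, Euler integration transports $x_0$ exactly onto the endpoint $x_1$ at $t = 1$.

```python
import numpy as np

def straight_line_velocity(t: float, x: np.ndarray, x1: np.ndarray) -> np.ndarray:
    # Conditional field u_t(x | x1) = (x1 - x) / (1 - t),
    # which transports any point onto the endpoint x1 as t -> 1.
    return (x1 - x) / (1.0 - t)

def integrate_flow(x0: np.ndarray, x1: np.ndarray, n_steps: int = 100) -> np.ndarray:
    # Euler discretization of d/dt phi_t(x) = u_t(phi_t(x)), phi_0(x) = x.
    x, dt = x0.copy(), 1.0 / n_steps
    for k in range(n_steps):
        t = k * dt
        x = x + dt * straight_line_velocity(t, x, x1)
    return x

x0 = np.zeros(3)
x1 = np.array([1.0, -2.0, 0.5])
xT = integrate_flow(x0, x1)  # follows the straight-line path to x1
```

In practice the conditional field is replaced by the learned marginal model $v_t^\theta$, and generation amounts to integrating the same ODE starting from a sample $x_0 \sim p_0$.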
The marginal field $u_t(x)$ can then be expressed as an expectation with respect to the posterior probability path $p_t(x_1 \mid x)$, which defines the distribution over possible endpoints for interpolations that intersect $x$ at time $t$:

$$u_t(x) = \mathbb{E}_{p_t(x_1 \mid x)}\left[ u_t(x \mid x_1) \right], \tag{3}$$

enabling efficient estimation through conditional samples. A key insight is that minimizing the loss for the conditional velocity field yields the same gradient as minimizing it for the marginal field, leading to the conditional flow matching loss:

$$\mathcal{L}_{\mathrm{CFM}}(\theta) := \mathbb{E}_{t,x_1,x}\left[ \left\| u_t(x \mid x_1) - v_t^\theta(x) \right\|^2 \right]. \tag{4}$$

### 2.2. Variational Flow Matching

Variational Flow Matching (VFM) extends Flow Matching by introducing a variational perspective. Instead of directly regressing to the true vector field, VFM parameterizes it through a variational distribution $q_t^\theta(x_1 \mid x)$:

$$v_t^\theta(x) := \mathbb{E}_{q_t^\theta(x_1 \mid x)}\left[ u_t(x \mid x_1) \right]. \tag{5}$$

This reformulation transforms the problem into one of variational inference, minimizing the Kullback-Leibler divergence between the true posterior $p_t(x_1 \mid x)$ and its variational approximation $q_t^\theta(x_1 \mid x)$:

$$\mathcal{L}_{\mathrm{VFM}}(\theta) := \mathbb{E}_t\left[ \mathrm{KL}\left( p_t(x_1, x) \,\|\, q_t^\theta(x_1, x) \right) \right] \tag{6}$$
$$= -\mathbb{E}_{t,x_1,x}\left[ \log q_t^\theta(x_1 \mid x) \right] + \text{const.} \tag{7}$$

When the conditional velocity field is linear in $x_1$ (e.g. straight-line interpolation, diffusion), this objective simplifies to matching only the posterior mean, since then

$$\mathbb{E}_{p_t(x_1 \mid x)}\left[ u_t(x \mid x_1) \right] = u_t\left( x \mid \mathbb{E}_{p_t(x_1 \mid x)}[x_1] \right). \tag{8}$$

Moreover, the latter expectation depends only element-wise on the marginal expectations, implying we can use a fully factorized variational form without loss of generality, which results in the simplified mean-field objective:

$$\mathcal{L}_{\mathrm{MF\text{-}VFM}}(\theta) = -\mathbb{E}_{t,x_1,x}\left[ \sum_{d=1}^{D} \log q_t^\theta(x_1^d \mid x) \right]. \tag{9}$$

Notice that this reduces learning a single high-dimensional distribution to learning $D$ univariate distributions, a much simpler task.
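The identity in Eq. (8) can be checked numerically: for a velocity field linear in $x_1$, the Monte Carlo average of the conditional velocity under any endpoint distribution coincides with the velocity evaluated at that distribution's mean. The diagonal-Gaussian "posterior" below is a stand-in for a learned $q_t^\theta$, chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
t, x = 0.3, np.array([0.2, -0.1])

# Stand-in variational posterior over endpoints x1: a diagonal Gaussian.
mu, sigma = np.array([1.0, 2.0]), 0.5
x1_samples = mu + sigma * rng.standard_normal((10_000, 2))

def u_cond(t: float, x: np.ndarray, x1: np.ndarray) -> np.ndarray:
    # Straight-line conditional velocity; note it is linear in x1.
    return (x1 - x) / (1.0 - t)

# Monte Carlo estimate of E_q[u_t(x | x1)] ...
v_mc = u_cond(t, x, x1_samples).mean(axis=0)
# ... agrees (up to sampling error) with the field at the posterior mean, Eq. (8).
v_mean = u_cond(t, x, mu)
```

This is precisely why VFM only needs to match the posterior mean in the linear case: the expected velocity, and hence the generative dynamics, depends on $q_t^\theta$ only through $\mathbb{E}_{q_t^\theta}[x_1]$.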
VFM's flexibility in the choice of variational distribution $q_t$ makes it particularly well-suited for molecular generation tasks. For instance, using categorical factors enables modeling of discrete molecular features like atom types and bond orders, which can be combined with Gaussian factors to represent continuous atomic coordinates. This unified treatment of discrete and continuous variables, combined with the efficiency inherited from Flow Matching, makes VFM attractive for mixed-modality tasks.

## 3. Controlled and Equivariant VFM

Controlled generation is crucial for practical applications in generative modeling. In this section, we (1) extend VFM to controlled generation by deriving a unified objective for both end-to-end training and control via post-hoc Bayesian inference, and (2) develop a fully equivariant framework that ensures invariance to key symmetries.

### 3.1. Controlled Variational Flow Matching

Controlled generation extends generative modeling by guiding the generative process to satisfy constraints imposed by conditioning on additional information $y$. In Flow Matching, the primary goal is to learn a sequence of distributions (pt)0