# Learning Reduced Fluid Dynamics

Zherong Pan, Xifeng Gao, Kui Wu
Lightspeed Studios
{zrpan,xifgao,kwwu}@global.tecent.com

Predicting the state evolution of ultra-high-dimensional, time-reversible fluid dynamic systems is a crucial but computationally expensive task. Existing physics-informed neural networks either incur a high inference cost or cannot preserve the time-reversible nature of the underlying dynamic system. We propose a model-based approach to identify low-dimensional, time-reversible, nonlinear fluid dynamic systems. Our method utilizes the symplectic structure of reduced Eulerian fluid and uses stochastic Riemannian optimization to obtain low-dimensional bases that minimize the expected trajectory-wise dimension-reduction error over a given distribution of initial conditions. We show that such minimization is well-defined, since the reduced trajectories are differentiable with respect to the subspace bases over the entire Grassmannian manifold, under proper choices of timestep sizes and numerical integrators. Finally, we propose a loss function measuring the trajectory-wise discrepancy between the original and reduced models. By tensor precomputation, we show that the gradient of this loss function can be evaluated efficiently over a long trajectory without time-integrating the high-dimensional dynamic system. Through evaluations on a suite of simulation benchmarks, we show that our method reduces the discrepancy by 50% to 90% over conventional reduced models, and we outperform PINNs by exactly preserving the time reversibility.

Introduction

High-dimensional Partial Differential Equations (PDE), especially fluid dynamic systems, find vast applications in scientific computation (Moin and Mahesh 1998; Alfonsi 2009), PDE-constrained optimization (Biegler et al. 2003; Herzog and Kunisch 2010), design prototyping (Baysal and Eleshaky 1992; Zang and Green 1999), fluidic device design (Du et al.
2020; Li et al. 2022), and digital entertainment (Bridson and Batty 2010; Bridson 2015), to name a few. A fundamental task of all these applications lies in the efficient prediction of numerical solutions over a long horizon. In design prototyping, for example, a designer needs to quickly preview the fluid flow surrounding an aerial vehicle in order to refine its form factor. In a game engine, a fluid simulator needs to achieve real-time performance to provide interactive special effects for players. Copyright 2024, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved. Although abundant numerical tools (Petrila and Trif 2004; Demkowicz et al. 1989) have been developed over the past decades with improved efficacy, their algorithmic complexity is still challenging the limits of current computational resources. On a parallel front, the idealized, incompressible, inviscid Eulerian fluid should be time reversible and energy preserving (Duponcheel, Orlandi, and Winckelmans 2008), and dedicated numerical schemes are proposed to faithfully preserve these properties in a discrete setting (Rowley and Marsden 2002; Pavlov et al. 2011). This implies that the initial condition of a trajectory can be recovered from any state thereafter and the discrete total energy is a constant throughout the predicted trajectory. Although idealized fluid models are not pursued in applications, their accurate prediction is an important criterion of reliable numerical schemes. Recently, a row of approaches have been proposed to identify high-dimensional fluid dynamic systems using compact learnable models. A large body of prior works fall into the category of non-intrusive approaches, which parameterize fluid dynamic transfer function using general learning models such as radial basis functions (Zhang, Kou, and Wang 2016), feed-forward networks (Hsieh and Tang 1998), recurrent networks (Pearlmutter 1989; Wang et al. 
2018), convolutional autoencoders (Wu et al. 2021; Hasegawa et al. 2020), etc. Unfortunately, all these non-intrusive learning techniques cannot preserve the time-reversible property of idealized Eulerian fluid, potentially leading to large prediction errors or requiring a large dataset. Recent works propose physics-informed neural networks (PINNs) (Raissi, Perdikaris, and Karniadakis 2019), which minimize the physics-rule violation as much as possible; but since the underlying learning models are not exactly time-reversible, energy loss can still occur and even accumulate over long trajectories. To ensure exact time reversibility, the seminal work (Greydanus, Dzamba, and Yosinski 2019) proposes to learn the Hamiltonian operator and then uses a symplectic integrator to predict the trajectory. Although this method could be applied to fluid mechanics, its computational cost is as high as that of a conventional fluid simulator (if not higher), since the Hamiltonian network needs to be evaluated separately for each fluid particle. So far, we are still lacking a learning-based fluid dynamic model that both preserves exact time reversibility and reduces the dimension.

The Thirty-Eighth AAAI Conference on Artificial Intelligence (AAAI-24)

| Method | Time-Reversible | Low-Dimensional |
| --- | --- | --- |
| PINNs | ✗ | ✓ |
| HNN | ✓ | ✗ |
| Ours | ✓ | ✓ |

Table 1: Features of representative methods.

On the other hand, model reduction approaches (Berkooz, Holmes, and Lumley 1993; Rowley 2005) have shown great potential in compressing high-dimensional PDE, but their connection with learning-based approaches has been largely ignored. The earliest data-driven method, Proper Orthogonal Decomposition (POD) (Berkooz, Holmes, and Lumley 1993), finds the optimal linear subspace that best explains the variation of the state distribution. However, POD does not consider time-dependency, which is remedied by Dynamic Mode Decomposition (DMD) (Schmid 2010), which finds the optimal linear subspace that best approximates the Koopman operator.
In comparison, data-free model reduction approaches, such as balanced POD (Rowley 2005), H2-optimization (Gugercin, Beattie, and Antoulas 2006), and modal analysis (Taira et al. 2017), identify bases corresponding to intrinsic properties of the PDE by analyzing the system transfer matrices in the frequency domain, and are thus independent of data. Unfortunately, these techniques are largely limited to linear systems, and their extensions to nonlinear fluid dynamics, such as (Yang, Jiang, and Xu 2019), are in their infancy. We propose a learning-based model reduction technique to identify low-dimensional, exactly time-reversible fluid dynamic systems, and we summarize our features in Table 1. We first interpret the linear subspace of fluid velocities as a point on the Grassmannian manifold and study the dependence of reduced trajectories on the choice of subspace. Thanks to the time reversibility, we show that the map from the subspace bases to reduced trajectories is globally differentiable, which allows us to optimize the reduced model via gradient-based Riemannian optimization. We further propose a trajectory-wise discrepancy loss that penalizes the difference between the full-order and reduced trajectories. To make the optimization tractable, we propose a tensor precomputation scheme to accelerate the back-propagation of gradient information. Our high-level idea is to fine-tune the reduced fluid model to minimize the expected trajectory-wise discrepancy loss over the distribution of initial conditions. In essence, our method extends prior optimal reduced-basis construction algorithms (Berkooz, Holmes, and Lumley 1993; Schmid 2010) to the nonlinear, idealized fluid dynamic model. As an intrusive approach, our method preserves the desirable property of time reversibility. When compared with a POD-type reduced-model baseline on a suite of idealized fluid simulation benchmarks, our method lowers the discrepancy by 50% to 90%.
We further show that general learnable models can incur energy loss that accumulates over time, while our method exactly preserves the total energy. Related Work We review related works on machine learning for solving ODE and PDE, reduced physics models beyond fluid dynamics, and finally learning under hard constraints. Learning for Solving ODE and PDE: Machine learning can identify complex behaviors of dynamic systems from groundtruth data. (Chen et al. 2018) propose to learn such dynamics as a general Ordinary Differential Equation (ODE) with the time derivative of state predicted via a neural network. However, it does not reflect the spatial and temporal structures of certain systems, which limits its accuracy, data-efficacy, and scalability to high-dimensional systems such as fluids. Several follow-up works improve the network architecture to reflect additional structures. For example, the inter-dependency among spatial variables is oftentimes local and sparse, which could be modeled via neighborhood message passing (Battaglia et al. 2016; Li et al. 2019). Hamiltonian dynamics are time reversible and energy preserving, which is modeled by learning the Hamiltonian operator in canonical coordinates (Greydanus, Dzamba, and Yosinski 2019), generalized coordinates (Cranmer et al. 2020), or ambient space with additional constraints (Finzi, Wang, and Wilson 2020). However, the above techniques are using Lagrangian coordinates, while fluid mechanics are oftentimes modeled via an Eulerian grid, see e.g. (Takahashi et al. 2021), which is a major point of difference from our method. Parallel efforts have been made to learn Eulerian fluid mechanics (Um et al. 2020; Takahashi et al. 2021; Holl, Thuerey, and Koltun 2020; Prantl, Bonev, and Thuerey 2019; Kim et al. 2019). Some of these works (Um et al. 2020; Takahashi et al. 2021; Holl, Thuerey, and Koltun 2020) learn to control fluids via differentiable simulators but the dynamic systems are not learned. 
Other works (Prantl, Bonev, and Thuerey 2019; Kim et al. 2019) learn to predict short future trajectories of free-surface flows. As the major difference from these techniques, our goal is to predict arbitrarily long trajectories by utilizing the time-reversible structure of the dynamic system to guarantee stability. On the downside, however, our method cannot predict free-surface flows.

Learning Reduced Physical Models: Model reduction is a special kind of dimension reduction technique dealing with time-series datasets, and we refer readers to (Rowley and Dawson 2017) for a review of its application in fluid mechanics. Beyond fluids, reduced models have been adopted to predict the behaviors of solids (Sampaio and Soize 2007), electromagnetic fields (Ralph-Uwe, Ernst, and Spitzer 2008), quantum and molecular mechanics (Mohan and Fredrickson 2020), neuron propagations (Amsallem and Nordstrom 2016), etc. Conventional techniques for model reduction are restricted to linear dynamic systems, for which the optimal linear subspace can be identified via POD or DMD (Berkooz, Holmes, and Lumley 1993; Rowley 2005) and the projected dynamic system can be precomputed via Galerkin projection. More general machine learning techniques have been proposed as extensions to nonlinear dynamics. For example, convolutional autoencoders have been used to identify nonlinear subspaces (Wu et al. 2021; Hasegawa et al. 2020).

Figure 1: Energy dissipation caused by a viscous term under µ = 0, 1, 10, 100, simulated using our learned reduced model (a) and the ground-truth full-space model (b).

The ROM-net (Daniel et al. 2020) learns to select a suitable subspace from a dictionary. (Li et al. 2017) proposes to represent the linear subspace bases as the output of a universal neural network.
In order to efficiently project the nonlinear dynamic system into the subspace, the Discrete Empirical Interpolation Method (DEIM) (Chaturantabut and Sorensen 2010) has been proposed to select a sparse set of interpolation points. The interpolation points are then contracted with the subspace bases in an intrusive manner. Non-intrusive approaches use universal neural networks to learn the entire nonlinear transfer function (Wu et al. 2021; Hasegawa et al. 2020; Lee et al. 2021) or part of the nonlinear terms (Maulik et al. 2019). It has been noticed in (Amsallem and Nordstrom 2016; Liu et al. 2015) that time reversibility and energy preservation can be retained by using an intrusive approach, which is a main reason behind our technical choice.

Learning Under Hard Constraints: Our work deals with idealized fluid satisfying two hard constraints: 1) incompressibility and 2) time reversibility. Since prominent training algorithms (Duchi, Hazan, and Singer 2011; Kingma and Ba 2014) and neural network architectures are designed for unconstrained optimization, dealing with hard constraints has been a long-standing problem (Márquez-Neila, Salzmann, and Fua 2017). There are two general approaches to inform a learned model of hard constraints: softening and constraint layers. Softening transforms the hard constraint into soft losses and relies on unconstrained optimization. In the learning of physical models, softening has been adopted to enforce physical correctness (Sirignano and Spiliopoulos 2018; Ober-Blöbaum and Offen 2022), fluid incompressibility (Ajuria Illarramendi et al. 2020), and collision-free constraints (Tan et al. 2022), and data augmentation has been used to enforce invariance to rigid (Morozov, Zgyatti, and Popov 2021) and Galilean transformations (Ling, Jones, and Templeton 2016). A common problem with all these approaches lies in the unpredictable constraint violation in regions of insufficient data coverage.
To exactly impose hard constraints, a series of works (Amos and Kolter 2017; Agrawal et al. 2019) propose to formulate the constrained optimization as a differentiable layer in the neural network architecture. In particular, the entire fluid simulator has been formulated as a differentiable layer (Schenck and Fox 2018; Takahashi et al. 2021) for model-based control and system identification. The incompressibility constraint has also been formulated as an elliptic PDE solver layer in (Mohan et al. 2020). Although these techniques can enforce hard constraints, the cost of forward- and back-propagation through these layers is prohibitive. Our method uses the constraint-layer approach to enforce fluid incompressibility and time reversibility, by incorporating the reduced model (Liu et al. 2015) as our differentiable layer. However, we encode the constraint property into the reduced bases, which are fixed at test time, leading to a low computational cost of trajectory prediction.

Time Reversible Reduced Fluid Model

We briefly review the underlying geometric structure and associated computational model of idealized, incompressible, inviscid fluid (Pavlov et al. 2011). Given a simulation domain $M$, the fluid configuration can be described as a vector field $v \in \mathcal{V}(M)$, where $v(x)$ for any $x \in M$ represents the velocity of fluid at $x$. The governing equation for $v$ is:
$$\dot{v} + v\cdot\nabla v + \nabla\lambda = 0 \quad \text{s.t.} \quad \nabla\cdot v = 0, \qquad (1)$$
where $\lambda$ is the pressure field, which is also the Lagrange multiplier for the divergence-free constraint $\nabla\cdot v = 0$. The above system is closed with appropriate initial and boundary conditions. (Pavlov et al. 2011) proposed time-reversible, energy-preserving spatial and temporal discretization schemes for Equation 1. However, directly time-integrating the discrete system requires solving large-scale nonlinear system equations. The reduced-order model (Liu et al.
2015) scales down the cost by embedding $v$ into a $p$-dimensional subspace with divergence-free, orthogonal bases $U$, giving $v = Uz$, where $z$ is the coefficient vector. The reduced-order governing equation can be derived via Galerkin projection:
$$\dot{z} + \int_M U^T \left[(Uz)\cdot\nabla\right](Uz) = 0, \qquad (2)$$
where the second term is the reduced-order advector, which can be succinctly written as a contraction with a third-order tensor $C_{kij}$:
$$\dot{z}_k + \sum_{ij} C_{kij} z_i z_j = 0 \quad \text{s.t.} \quad C_{kij} \triangleq \int_M \left\langle U_k, (U_i\cdot\nabla) U_j \right\rangle, \qquad (3)$$
where we use $z_k$ (resp. $U_k$) to denote the $k$th element (resp. column). For fast reduced trajectory prediction, the tensor $C_{kij}$ is precomputed, and a small $p$ is used. An essential feature of $C_{kij}$ is antisymmetry: $C_{kij} = -C_{jik}$, which implies that the continuous-time reduced system is also energy-preserving:
$$\frac{d}{dt}\|z\|^2 = -2\sum_{kij} C_{kij} z_k z_i z_j = -\sum_{kij} (C_{kij} + C_{jik}) z_k z_i z_j = 0.$$
Using a variational integrator, e.g. the trapezoidal rule, the energy is also conserved in the time-discrete computational model. We use a superscript $+$ to denote variables at the next time instance, a superscript $d$ to denote variables at the $d$th timestep, and $\delta t$ to denote the timestep size. The trapezoidal rule relates $z^+$ and $z$ by:
$$\frac{z^+ - z}{\delta t} + C(z^+) = 0 \quad \text{s.t.} \quad C(z^+)_k \triangleq \sum_{ij} C_{kij}\,\frac{z_i^+ + z_i}{2}\,\frac{z_j^+ + z_j}{2}, \qquad (4)$$
from which $z^+$ can be solved via the Newton-Raphson method to satisfy $\|z^+\|^2 = \|z\|^2$, i.e. energy conservation, as well as discrete-time reversibility. These features, originally discovered in (Pavlov et al. 2011; Liu et al. 2015), achieve an ideal balance between computational efficacy and numerical stability. As pointed out by (Pavlov et al. 2011), although real-world flows are not ideally energy-preserving, simulating ideal flows is a crucial benchmark for evaluating the stability and fidelity of a simulator. More general non-reversible flows can be modeled by adding additional constitutive terms.
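To make the discrete update concrete, here is a small self-contained sketch (our illustration, not the authors' code) of the trapezoidal update in Equation 4: the tensor `C` below is a randomly generated antisymmetric stand-in rather than a Galerkin-projected advector, and the step is solved with Newton-Raphson. Since $z^+ - z = -\delta t\, C(z^+)$ at convergence, the midpoint antisymmetry argument gives $\|z^+\|^2 = \|z\|^2$ up to solver tolerance.

```python
import numpy as np

def advect(C, z):
    # reduced-order advector: f(z)_k = sum_ij C[k, i, j] z_i z_j
    return np.einsum('kij,i,j->k', C, z, z)

def trapezoid_step(C, z, dt, iters=20):
    """One step of dz/dt = -f(z) under the trapezoidal rule (Equation 4),
    solving for z+ with Newton-Raphson."""
    zp = z.copy()
    for _ in range(iters):
        zm = 0.5 * (zp + z)                    # midpoint state
        r = (zp - z) / dt + advect(C, zm)      # residual of Equation 4
        # Jacobian of the residual with respect to z+ (chain rule through zm)
        J = np.eye(len(z)) / dt + 0.5 * (
            np.einsum('klj,j->kl', C, zm) + np.einsum('kil,i->kl', C, zm))
        zp -= np.linalg.solve(J, r)
    return zp

rng = np.random.default_rng(0)
p = 6
A = rng.standard_normal((p, p, p))
C = A - A.transpose(2, 1, 0)       # enforce the antisymmetry C_kij = -C_jik
z = rng.standard_normal(p)
zp = trapezoid_step(C, z, dt=1e-2)
energy_drift = abs(zp @ zp - z @ z)  # vanishes up to solver tolerance
```

Because the residual is driven to near machine precision, the reduced kinetic energy is conserved independently of the timestep size; an explicit integrator such as forward Euler would not have this property.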
As an example, we could add a viscous term $\mu\Delta v$ to model energy dissipation, and this term can be projected into the reduced space via Galerkin projection. In Figure 1, we plot the procedure of energy dissipation under different $\mu$ using both our learned reduced model and the ground-truth full-space model. We formalize and prove these properties in Appendix 2 and Appendix 3. In particular, Equation 4 defines a unique $z^+$ given $z$ and a sufficiently small $\delta t$, so we define the function $z^+(z)$ by a slight abuse of notation. The accuracy of a reduced model relies on a proper choice of the basis $U$, which remains a difficult but underappreciated problem.

Reduced Model Optimization

We propose to identify reduced-order fluid models via gradient-based optimization of $U$ to minimize the trajectory-wise discrepancy between the reduced-order model (Equation 3) and the full-order model (Equation 1). In this section, we first discretize the spatial computational domain, then propose our discrepancy loss function, and finally discuss our optimization algorithm.

Spatial Discretization

We assume that $M$ is discretized using a tetrahedral mesh or a rectangular grid via Discrete Exterior Calculus (DEC) as in (Pavlov et al. 2011). As a result, each vector field has a finite dimension $n \gg p$. We use a bar to denote discrete variables, so $\bar v \in \mathbb{R}^n$. $\bar U$ belongs to the intersection of the Stiefel manifold $St(n,p)$ and the divergence-free subspace: $D(n,p) = \{\bar U \in \mathbb{R}^{n\times p} \mid \nabla\cdot\bar U = 0\}$, where $\nabla\cdot \in \mathbb{R}^{(n-m)\times n}$ is the discrete divergence operator and $m \gg p$ is the dimension of the divergence-free velocity subspace. The elements of $D(n,p) \cap St(n,p)$ can also be identified with the elements of $St(m,p)$. Indeed, we can find a set of unit, orthogonal bases $\bar D \in \mathbb{R}^{n\times m}$ spanning the subspace of divergence-free velocity fields. For each $\bar U$, we can identify some $\bar U_m \in St(m,p)$ such that $\bar U = \bar D \bar U_m$.
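The identification $\bar U = \bar D \bar U_m$ can be sketched numerically as follows (our illustration; the divergence operator below is a random stand-in for the actual DEC operator): an orthonormal basis $\bar D$ of the divergence-free subspace is extracted from the null space of the divergence operator, and any Stiefel point $\bar U_m$ then lifts to bases that are simultaneously orthonormal and divergence-free.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 12, 3
Bdiv = rng.standard_normal((4, n))  # stand-in for the discrete divergence operator

# Orthonormal basis D of ker(Bdiv): the rows of Vt beyond rank(Bdiv) in the SVD.
_, s, Vt = np.linalg.svd(Bdiv)
D = Vt[len(s):].T                   # shape (n, m) with m = n - 4 here

# Any point Um on St(m, p) lifts to U = D @ Um in D(n,p) ∩ St(n,p).
Um = np.linalg.qr(rng.standard_normal((D.shape[1], p)))[0]
U = D @ Um

divergence = np.abs(Bdiv @ U).max()                  # ~0: U is divergence-free
orthonormality = np.abs(U.T @ U - np.eye(p)).max()   # ~0: U has orthonormal columns
```

Optimizing over $\bar U_m$ instead of $\bar U$ keeps the divergence-free constraint satisfied by construction, which is why the text can restrict attention to $St(m,p)$.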
As illustrated in Figure 1, a point on $St(n,p)$ gives the bases of a $p$-dimensional velocity-field subspace, while a point on $St(m,p)$ gives the bases of a $p$-dimensional divergence-free velocity-field subspace. Since we merely use $\bar U$ to project the velocity field into a subspace, we are only interested in the lower-dimensional Grassmannian manifold (the manifold of velocity subspaces, irrespective of the particular bases), but we use the Stiefel representation for better memory and computational efficacy. In other words, we treat $\bar U$ as our decision variable. We further write the tensor coefficient $C_{kij}$ as a function $C(\bar U_k, \bar U_i, \bar U_j)$, which is derived by discretizing the continuous definition of $C_{kij}$ in Equation 3 using DEC.

Lifting the Transfer Function to Full-Space

In order to optimize the accuracy of the reduced dynamic system, we first need to compare simulated trajectories generated by different bases $\bar U$. However, the coordinate vectors $z$ of different $\bar U$ are incomparable, as they reside in different linear subspaces. To resolve this problem, we propose to lift $z$ to $\bar v = \bar U z$ in the ambient space $\mathbb{R}^n$, so that two vectors can be compared by the induced metric in Euclidean space. Further, we can smoothly extend the reduced-order simulator function to the ambient space using the projection operators $P = \bar U \bar U^T$ and $P^\perp = I - P$:
$$\bar v^+(\bar v, \bar U) \triangleq \bar U z^+(\bar U^T \bar v) + P^\perp \bar v. \qquad (5)$$
In other words, the velocity component orthogonal to the subspace is stationary, and the tangential velocity is governed by the reduced dynamic system. As detailed in Appendix 4, the above extension can be written as a function defined on the Grassmannian manifold: $\bar v^+(\bar v, \bar P_m): \mathbb{R}^n \times Gr(m,p) \to \mathbb{R}^n$, where we denote $\bar P_m = \bar U_m \bar U_m^T$. With the smooth extension, we can evaluate the derivatives of $\bar v^+$ with respect to $\bar v$ and the subspace. We can also compare two velocity fields generated by reduced-order simulators using different subspaces. Note that the full-order dynamics (Equation 1) can be identified with $\bar U_m = I_{m\times m}$.
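A minimal sketch of the lifted transfer function in Equation 5 (our illustration; `reduced_step` stands for any reduced one-step map $z \mapsto z^+$):

```python
import numpy as np

def lift_step(v, U, reduced_step):
    """Lift a reduced one-step map to the ambient space (Equation 5):
    the component of v inside span(U) evolves under the reduced dynamics,
    while the orthogonal component P_perp v is held stationary."""
    z = U.T @ v              # reduced coordinates of the tangential part
    v_perp = v - U @ z       # P_perp v = (I - U U^T) v
    return U @ reduced_step(z) + v_perp

# tiny usage check with an orthonormal U and a norm-preserving reduced map
rng = np.random.default_rng(0)
n, p = 10, 3
U = np.linalg.qr(rng.standard_normal((n, p)))[0]
v = rng.standard_normal(n)
v_next = lift_step(v, U, lambda z: -z)   # reversing z preserves ||z||
```

If the reduced step preserves $\|z\|$, the lifted step preserves $\|\bar v\|^2 = \|z\|^2 + \|P^\perp \bar v\|^2$, mirroring the energy argument of the previous section.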
The above lifting is not unique, and a useful alternative is to discard the orthogonal component, i.e. setting $P^\perp \bar v^+ = 0$, which is discussed in Appendix 4. As our major contribution, we show in Appendix 4 that the above function $\bar v^+$ is a well-defined smooth function on $Gr(m,p)$. We further show that, for any differentiable loss function $L(\bar v^+)$, its derivatives with respect to the bases can be efficiently computed under a proper representation of $\bar U$ as a manifold point.

Reduced Discrepancy Loss

The differentiable structure of reduced fluid allows us to minimize the discrepancy between the reduced- and full-order models in an efficient model-based manner. Given two velocity fields $\bar v$ and $\bar v^+$, a full-order model should satisfy the governing equation of motion, which inspires the following discrepancy measure:
$$L_{dyn}(\bar v^+, \bar v) \triangleq \left\|\frac{\bar v^+ - \bar v}{\delta t} + C\!\left(\bar D \bar D^T, \frac{\bar v^+ + \bar v}{2}, \frac{\bar v^+ + \bar v}{2}\right)\right\|^2. \qquad (6)$$
This is similar to the physics-correctness loss used in (Sirignano and Spiliopoulos 2018; Ober-Blöbaum and Offen 2022), and we absorb the linear divergence-free constraint by using the projection operator $\bar D \bar D^T$. Again, evaluating $L_{dyn}$ involves a sparse linear solve for each of the $T$ timesteps. But we can accelerate this computation thanks to the low-rank property of the velocity fields. Since $\bar v$ and $\bar v^+$ both reside in low-rank subspaces, we can write:
$$C\!\left(\bar D \bar D^T, \frac{\bar v^+ + \bar v}{2}, \frac{\bar v^+ + \bar v}{2}\right) = \sum_{ij} C(\bar D \bar D^T, \bar U_i, \bar U_j)\,\frac{z_i^+ + z_i}{2}\,\frac{z_j^+ + z_j}{2}$$
and precompute the tensor $C(\bar D \bar D^T, \bar U_i, \bar U_j)$ via $p^2$ sparse linear solves at a cost of $O(n^\omega p^2)$. For a trajectory with $T \gg p^2$ timesteps, this reduces the cost of evaluating $L_{dyn}$ from $O(n^\omega T)$ to $O(n^\omega p^2 + T n p^2)$.

Stochastic Riemannian Optimization

Using a low-dimensional subspace, it is impossible to approximate all fluid simulation trajectories with sufficient accuracy. Instead, reduced models are designed to optimize a subset of trajectories with a given distribution $I$ of initial conditions, i.e.
$\bar v^0 \sim I$, and our goal is to solve the following problem via stochastic Riemannian optimization:
$$\mathop{\mathrm{argmin}}_{\bar U \in D(n,p)\,\cap\, St(n,p)} \; \mathbb{E}_{\bar v^0 \sim I}\left[\sum_{d=1}^{T} \gamma^d\, L_{dyn}(\bar v^d, \bar v^{d-1})\right], \qquad (7)$$
where $T$ is the horizon of the trajectory and $\gamma \in (0,1]$ is a constant discount factor. Riemannian optimization is a well-studied problem in both deterministic and stochastic settings, and we use the RAMSGRAD algorithm proposed in (Becigneul and Ganea 2019). This algorithm requires both the retraction and parallel transport operators on $St(n,p)$. We use QR factorization for the retraction operator (Bendokat, Zimmermann, and Absil 2020). Unfortunately, there is no efficient way to compute the parallel transport operator (Edelman, Arias, and Smith 1998), so we approximate it by projecting out the non-tangential component. This corresponds to using a single step of the forward Euler integrator to solve the ODE associated with the transport operator. Again, due to time reversibility, the objective function is globally differentiable with respect to $\bar U$ under a compact $I$ and sufficiently small $\delta t$. We outline our forward-backward gradient propagation and adapted RAMSGRAD algorithm in Appendix 1. These algorithms are well-defined due to the following lemma:

Lemma 0.1. For any compact initial distribution $I$, there exists a sufficiently small $\delta t$ such that the objective function $\sum_{d=1}^{T} \gamma^d L_{dyn}(\bar v^d, \bar v^{d-1})$ is globally differentiable, i.e. for any $\bar v^0 \in I$ and $\bar U \in D(n,p) \cap St(n,p)$.

Proof. Since $I$ is compact, $\|\bar v^0\|$ is uniformly upper bounded by some $r$, and $\|z^0\| = \|\bar U^T \bar v^0\| \le r$. By Corollary 2.5, there exists a sufficiently small $\delta t$ making any $z^d$ a differentiable, reversible function of $z^0$. This also implies that $\bar v^d$ is a differentiable, reversible function of $\bar v^0$ under the definition of Equation 5, and our result follows.

Figure 2: The cost of evaluating $z^+(z)$ plotted against $p$.
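The two manifold primitives the optimizer above relies on can be sketched as follows (our illustration, assuming the standard embedded geometry of $St(n,p)$): a QR-based retraction, and the tangent-space projection that also serves as the cheap surrogate for parallel transport.

```python
import numpy as np

def qr_retract(U, X):
    """QR-based retraction on the Stiefel manifold St(n, p): map the point
    U + X (a step along tangent direction X) back onto the manifold."""
    Q, R = np.linalg.qr(U + X)
    return Q * np.sign(np.diag(R))   # fix QR's column-sign ambiguity

def project_tangent(U, G):
    """Project an ambient vector G onto the tangent space of St(n, p) at U;
    dropping the non-tangential component like this is the cheap
    first-order surrogate for parallel transport used in the text."""
    S = U.T @ G
    return G - U @ (S + S.T) / 2

rng = np.random.default_rng(0)
n, p = 12, 4
U = np.linalg.qr(rng.standard_normal((n, p)))[0]
G = rng.standard_normal((n, p))      # e.g. a Euclidean gradient of the loss
X = project_tangent(U, G)            # Riemannian gradient direction
U_next = qr_retract(U, 0.1 * X)      # one retraction step of size 0.1
```

A tangent vector $X$ at $U$ satisfies that $U^T X$ is skew-symmetric, and the retraction returns a matrix with orthonormal columns, so iterates never leave the manifold.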
Evaluation

We implement our method using PyTorch with a fluid simulator implemented in native C++ with CPU parallelization, and perform all computations on an AMD Threadripper 3970X CPU with 32 cores. We initialize our method using a conventional POD-type algorithm. Given $I$, we first sample a set of $N$ trajectories using the full-order dynamics (Equation 1) and then perform a POD-type basis extraction. The number of extracted bases is determined by truncating the eigenvalues below $\epsilon$ times the largest eigenvalue. We always use a batch size of 1. The performance of our method is summarized in Table 2. We consider two variants of our method: the coupled case, where $C_{kij}$ is treated as a function $C(\bar U_k, \bar U_i, \bar U_j)$ as discussed above, and the decoupled case, where $C_{kij}$ is treated as an antisymmetric, independent decision variable. Our main experiments are performed with the coupled case. Experiments with the decoupled case and a summary of decision variables are included in Appendix 5. The efficacy of trajectory prediction using a reduced-order model depends on $p$, as illustrated in Figure 2, so the runtime performance of both the POD baseline and our method is the same, while the cost of evaluating the full-order model is 252 ms (26× slower than the reduced-order model with $p = 49$).

Our first benchmark is Taylor vortices (Pavlov et al. 2011), where two vortices are separated by a distance slightly larger than the critical threshold. We use a velocity field discretized on a 64×64 rectangular grid with the periodic boundary condition, leading to $n = 8192$. This is a single trajectory ($I$ is deterministic), and we set $T = 500$, $\delta t = 0.01$. We experiment with four parameters $\epsilon = 0.05, 0.01, 0.001, 0.0001$, for which the number of bases is $p = 8, 11, 16, 25$, correspondingly. With each $U$ as the initial guess, we run our optimizer for 24 hours. In Figure 4bc, we plot the trajectory-wise discrepancy loss against the number of bases $p$ and the convergence history of our method.
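The POD-type initialization described above can be sketched as follows (our illustration, not the authors' code): stack snapshot velocity fields as columns and keep the modes whose covariance eigenvalues stay above $\epsilon$ times the largest one.

```python
import numpy as np

def pod_basis(snapshots, eps):
    """POD bases from a snapshot matrix (one velocity field per column):
    keep left singular vectors whose eigenvalues (squared singular values
    of the snapshot matrix) are at least eps times the largest one."""
    Uf, s, _ = np.linalg.svd(snapshots, full_matrices=False)
    lam = s ** 2                       # eigenvalues of the snapshot covariance
    return Uf[:, lam >= eps * lam[0]]

# a rank-2 snapshot matrix with eigenvalue ratio 9, built for illustration
rng = np.random.default_rng(0)
n, T = 16, 40
Q = np.linalg.qr(rng.standard_normal((T, 2)))[0]   # orthonormal time modes
X = np.eye(n)[:, :2] @ np.diag([3.0, 1.0]) @ Q.T   # singular values 3 and 1
U = pod_basis(X, eps=0.05)
```

With eigenvalues 9 and 1, `eps=0.05` keeps both modes, whereas `eps=0.2` would truncate the second; this is the knob the benchmarks below sweep over.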
Compared with POD bases, our method reduces the discrepancy loss by 87.93%, 90.12%, 91.47%, and 90.16%, respectively. Snapshots are shown in Figure 4a, where our method predicts a velocity field closer to the full-order ground truth. Our second benchmark involves a smoke plume rising at a constant speed. We use a rectangular domain of $[0,1]^2$ with all-Dirichlet boundary conditions. The region $[0.25, 0.75] \times [0.125, 0.375]$ is occupied by smoke with a constant velocity $(0, 1)$, the remaining regions have zero velocity, and we use $T = 1000$. All other settings are the same as in our first benchmark.

Figure 3: (a): The trajectory-wise discrepancy with respect to $\theta$ for our third benchmark. (b, c): The initial (b) and final (c) trajectory-wise discrepancy with respect to $\theta_1, \theta_2$ for our fourth benchmark.

| Benchmark | ϵ = 0.05 (p / Loss-POD / Loss-Ours) | ϵ = 0.01 (p / Loss-POD / Loss-Ours) | ϵ = 0.001 (p / Loss-POD / Loss-Ours) | ϵ = 0.0001 (p / Loss-POD / Loss-Ours) |
| --- | --- | --- | --- | --- |
| Taylor Vortices | 8 / 4.84 / 0.58 | 11 / 3.93 / 0.39 | 16 / 1.99 / 0.17 | 25 / 0.70 / 0.07 |
| Plume Rise | 6 / 57.04 / 6.37 | 9 / 28.23 / 5.38 | 15 / 18.53 / 2.30 | 26 / 5.96 / 1.44 |
| Plume Rise+Obs. | 5 / 133.16 / 10.48 | 8 / 46.47 / 8.60 | 16 / 17.57 / 2.57 | 30 / 5.71 / 1.05 |
| Spherical Plume | - | 36 / 120.89 / 44.89 | - | - |
| Two Plume | - | 59 / 103.23 / 49.22 | - | - |

Table 2: Summary of benchmarks comparing POD and our method under different ϵ and p.

The discrepancy loss and convergence history are plotted in Figure 6bc. We experiment with four parameters $\epsilon = 0.05, 0.01, 0.001, 0.0001$; the corresponding numbers of bases $p$ are 6, 9, 15, and 26, respectively. Our method reduces the discrepancy loss by 88.82%, 80.94%, 87.60%, and 75.79%, respectively. We have also tested a variant of our method with an obstacle in the simulation domain, where our method reduces the discrepancy loss by 92.13%, 81.49%, 85.38%, and 81.70%, respectively.
Snapshots of our second benchmark are shown in Figure 6a and Figure 7 of Appendix 9. In our first benchmark, Taylor vortices (Pavlov et al. 2011), we further analyze the sensitivity of our method with respect to the initial guess $\bar U$. To this end, we first compute $\bar U$ via POD and then corrupt it using random noise bases $\tilde U$, with each element sampled from the normal distribution with $\mu = 0$, $\sigma = 1$ truncated to the range $[-1, 1]$. We then use the following initial guess: $\mathrm{Retract}(\bar U, \bar D \bar D^T \tilde U \Sigma)$, where $\Sigma$ is a scaling diagonal matrix such that each column of $\tilde U \Sigma$ has $\ell_2$-norm equal to some $\epsilon$, so $\epsilon$ controls the magnitude of the random noise. Here, multiplying by $\bar D \bar D^T$ ensures that our noise is divergence-free. In Figure 5, we profile the convergence history with $\epsilon = 0.01, 0.05, 0.25, 0.5$. Although the noise can drastically change the initial discrepancy loss, all four instances reduce the loss to similar levels after sufficiently many iterations. Our analysis also implies that the POD baseline provides a good initial guess of $\bar U$, because a fully noisy initialization of $\bar U$ leads to a worse result.

In the recent work (Brandstetter, Worrall, and Welling 2022), the authors propose two training modes for learning neural PDE solvers: one-step training and full-unrolling. One-step training cuts off the gradient after a single timestep, while the full-unrolling mode considers the full gradient of Equation 7 over the entire trajectory. We compare the two modes in Figure 6 in terms of the trajectory-wise discrepancy loss, using our second benchmark scenario, the rising smoke plume. Both modes can reduce the loss after 3000 iterations, although there is some initial fluctuation in one-step training, while full-unrolling leads to significantly faster convergence. We use the full-unrolling mode for all other examples.

Our third benchmark involves a spherical smoke plume, with initial diameter 1/3 and speed 1.0, located in the center of a $[0,1]^2$ domain and moving in varying directions.
We assume the direction of motion is parameterized by the angle θ ∈ [0, 2π], sampled from the initial distribution I = U([0, 2π]). We use a velocity field discretized on a 64 × 64 rectangular grid with Dirichlet boundary conditions (n = 8320). Our training dataset for the POD baseline contains N = 8 trajectories with evenly sampled θ = 0°, 45°, 90°, …. With T = 500, δt = 0.01, ϵ = 0.01, p = 36, we run our method for 12200 iterations, taking 72 hours to converge. We then test our method on another 24 evenly sampled θ = 7.5°, 22.5°, 30°, …, which are not covered by the training dataset (some snapshots can be found in Figure 8 of Appendix 9). As plotted in Figure 3a, our method reduces the discrepancy by 54.65% on average. Our fourth benchmark extends the third one by involving two smoke plumes, located at (0.5, 0.25) and (0.5, 0.75). The directions of motion θ1, θ2 ∈ [0, π] are sampled from the initial distribution I = U([0, π]²) and we set ϵ = 0.01, p = 59. Our training dataset for the POD baseline contains N = 25 trajectories with 5 evenly sampled θ1,2 = 0°, 72°, 144°, 216°, 288°. Other parameters are the same as those of our third benchmark. We run our method for 18000 iterations, taking 72 hours to converge (some snapshots can be found in Figure 9 of Appendix 9). Afterwards, we test our method on another 25 evenly sampled θ1,2 = 36°, 108°, 180°, 252°, 324° that are not covered by the training dataset. As plotted in Figures 3b and 3c, our method reduces the discrepancy by 59.28% on average.

The Thirty-Eighth AAAI Conference on Artificial Intelligence (AAAI-24)

Figure 4: (a) Velocity magnitude field snapshots of the Taylor vortices benchmark, generated by the full-order model (top row), our method with ϵ = 0.0001 and p = 25 (middle row), and POD with ϵ = 0.0001 and p = 25 (bottom row). (b) Trajectory-wise discrepancy loss of POD and our method, under different p (x-axis: #Bases). (c) The convergence history of our method over 24 hours (x-axis: elapsed time (hr); curves: coupled(8), coupled(11), coupled(16), coupled(25)). [Plots omitted.]

ϵ       Loss-Init.   Loss-8k-Iter.   Loss-12k-Iter.
0.01    4.84         0.79            N/A
0.05    5.04         0.83            N/A
0.25    10.16        1.23            N/A
0.5     22.77        1.51            1.08
Noise   21.06        2.37            N/A

Figure 5: The convergence history of four instances of learning reduced Taylor vortices with ϵ = 0.05, p = 8, and different noise levels ϵ = 0.01, 0.05, 0.25, 0.5 (x-axis: #Iteration; y-axis: discrepancy). We first run the four training instances for 8000 iterations, which already brings the ultimate discrepancy loss down to similarly low levels. We then give the ϵ = 0.5 instance another 4500 iterations (purple after red curve), after which it outperforms the ϵ = 0.25 instance. Finally, we tried a fully noisy initialization of U, and the result is much worse than the other instances. [Plot omitted.]

Mode             ϵ/p      Loss-POD   Loss-3k-Iter.
One-Step         0.05/6   57.39      24.12
Full-Unrolling   0.05/6   57.39      6.37
One-Step         0.01/9   28.23      15.71
Full-Unrolling   0.01/9   28.23      5.38

Figure 6: The convergence history over 3000 iterations of four instances of learning the reduced rising-smoke-plume trajectory (x-axis: #Iteration; y-axis: discrepancy). We use two sets of instances: ϵ = 0.05, p = 6 and ϵ = 0.01, p = 9. For each set, we compare the one-step and full-unrolling modes of training. [Plot omitted.]

Conclusion

We propose a model-based approach to fine-tune reduced fluid dynamic systems. By evaluating several simulation benchmarks, we show that our method outperforms the POD baseline. On the downside, our trajectory prediction has sequential dependence and cannot exploit GPU parallelization. Even with our tensor precomputation technique, the training still takes hours on a desktop machine, which is much slower than the simple POD or DMD methods.
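The distinction between the two training modes can be made concrete on a toy reduced model. The sketch below is ours, not the paper's code: it substitutes a generic linear reduced step z_{t+1} = A z_t for the actual reduced fluid dynamics and a sum-of-squares trajectory discrepancy for Equation 7. One-step training detaches the rolled-out state before each step, while full-unrolling backpropagates through the whole trajectory via a reverse-mode adjoint.

```python
import numpy as np

def rollout(A, z0, T):
    """Integrate the toy reduced model z_{t+1} = A z_t for T steps."""
    zs = [z0]
    for _ in range(T):
        zs.append(A @ zs[-1])
    return zs

def one_step_grad(A, z0, targets):
    """One-step mode: the incoming state zs[t] is treated as a constant
    (gradient cut after a single timestep)."""
    zs = rollout(A, z0, len(targets))
    g = np.zeros_like(A)
    for t, target in enumerate(targets):
        r = zs[t + 1] - target          # residual after one step from zs[t]
        g += 2.0 * np.outer(r, zs[t])   # d||A zs[t] - target||^2 / dA
    return g

def full_unroll_grad(A, z0, targets):
    """Full-unrolling mode: exact gradient of the trajectory-wise loss
    sum_t ||z_t - target_t||^2, accumulated by a backward adjoint sweep."""
    zs = rollout(A, z0, len(targets))
    g = np.zeros_like(A)
    adj = np.zeros_like(z0)             # adjoint dL/dz_t, built backwards
    for t in range(len(targets), 0, -1):
        adj = adj + 2.0 * (zs[t] - targets[t - 1])
        g += np.outer(adj, zs[t - 1])   # contribution through z_t = A z_{t-1}
        adj = A.T @ adj                 # propagate adjoint one step back
    return g
```

The one-step gradient is cheaper and avoids long-horizon sensitivity, but it is biased with respect to the trajectory-wise loss; the adjoint sweep recovers the exact gradient, consistent with the faster convergence of full-unrolling reported above.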
Further, our method uses a linear subspace with limited expressivity compared with the universal neural networks (Wu et al. 2021; Hasegawa et al. 2020; Lee et al. 2021) used by non-intrusive model reduction techniques. We speculate that using neural networks to represent the reduced bases U is possible, as done in (Li et al. 2017).

References

Agrawal, A.; Amos, B.; Barratt, S.; Boyd, S.; Diamond, S.; and Kolter, J. Z. 2019. Differentiable convex optimization layers. Advances in Neural Information Processing Systems, 32.
Ajuria Illarramendi, E.; Alguacil, A.; Bauerheim, M.; Misdariis, A.; Cuenot, B.; and Benazera, E. 2020. Towards an hybrid computational strategy based on deep learning for incompressible flows. In AIAA AVIATION 2020 FORUM, 3058.
Alfonsi, G. 2009. Reynolds-averaged Navier-Stokes equations for turbulence modeling. Applied Mechanics Reviews, 62(4).
Amos, B.; and Kolter, J. Z. 2017. OptNet: Differentiable optimization as a layer in neural networks. In International Conference on Machine Learning, 136-145. PMLR.
Amsallem, D.; and Nordstrom, J. 2016. Energy stable model reduction of neurons by nonnegative discrete empirical interpolation. SIAM Journal on Scientific Computing, 38(2): B297-B326.
Battaglia, P.; Pascanu, R.; Lai, M.; Jimenez Rezende, D.; et al. 2016. Interaction networks for learning about objects, relations and physics. Advances in Neural Information Processing Systems, 29.
Baysal, O.; and Eleshaky, M. E. 1992. Aerodynamic design optimization using sensitivity analysis and computational fluid dynamics. AIAA Journal, 30(3): 718-725.
Bécigneul, G.; and Ganea, O.-E. 2019. Riemannian Adaptive Optimization Methods. In International Conference on Learning Representations.
Bendokat, T.; Zimmermann, R.; and Absil, P.-A. 2020. A Grassmann manifold handbook: Basic geometry and computational aspects. arXiv preprint arXiv:2011.13699.
Berkooz, G.; Holmes, P.; and Lumley, J. L. 1993. The proper orthogonal decomposition in the analysis of turbulent flows. Annual Review of Fluid Mechanics, 25(1): 539-575.
Biegler, L. T.; Ghattas, O.; Heinkenschloss, M.; and Bloemen Waanders, B. v. 2003. Large-scale PDE-constrained optimization: an introduction. In Large-Scale PDE-Constrained Optimization, 3-13. Springer.
Brandstetter, J.; Worrall, D.; and Welling, M. 2022. Message passing neural PDE solvers. arXiv preprint arXiv:2202.03376.
Bridson, R. 2015. Fluid simulation for computer graphics. AK Peters/CRC Press.
Bridson, R.; and Batty, C. 2010. Computational physics in film. Science, 330(6012): 1756-1757.
Chaturantabut, S.; and Sorensen, D. C. 2010. Nonlinear model reduction via discrete empirical interpolation. SIAM Journal on Scientific Computing, 32(5): 2737-2764.
Chen, R. T.; Rubanova, Y.; Bettencourt, J.; and Duvenaud, D. K. 2018. Neural ordinary differential equations. Advances in Neural Information Processing Systems, 31.
Cranmer, M.; Greydanus, S.; Hoyer, S.; Battaglia, P.; Spergel, D.; and Ho, S. 2020. Lagrangian neural networks. arXiv preprint arXiv:2003.04630.
Daniel, T.; Casenave, F.; Akkari, N.; and Ryckelynck, D. 2020. Model order reduction assisted by deep neural networks (ROM-net). Advanced Modeling and Simulation in Engineering Sciences, 7(1): 1-27.
Demkowicz, L.; Oden, J. T.; Rachowicz, W.; and Hardy, O. 1989. Toward a universal hp adaptive finite element strategy, Part 1. Constrained approximation and data structure. Computer Methods in Applied Mechanics and Engineering, 77(1-2): 79-112.
Du, T.; Wu, K.; Spielberg, A.; Matusik, W.; Zhu, B.; and Sifakis, E. 2020. Functional Optimization of Fluidic Devices with Differentiable Stokes Flow. ACM Trans. Graph., 39(6).
Duchi, J.; Hazan, E.; and Singer, Y. 2011. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12(7).
Duponcheel, M.; Orlandi, P.; and Winckelmans, G. 2008. Time-reversibility of the Euler equations as a benchmark for energy conserving schemes. Journal of Computational Physics, 227(19): 8736-8752.
Edelman, A.; Arias, T. A.; and Smith, S. T. 1998. The geometry of algorithms with orthogonality constraints. SIAM Journal on Matrix Analysis and Applications, 20(2): 303-353.
Finzi, M.; Wang, K. A.; and Wilson, A. G. 2020. Simplifying Hamiltonian and Lagrangian neural networks via explicit constraints. Advances in Neural Information Processing Systems, 33: 13880-13889.
Greydanus, S.; Dzamba, M.; and Yosinski, J. 2019. Hamiltonian neural networks. Advances in Neural Information Processing Systems, 32.
Gugercin, S.; Beattie, C.; and Antoulas, A. 2006. Rational Krylov methods for optimal H2 model reduction. Submitted for publication.
Hasegawa, K.; Fukami, K.; Murata, T.; and Fukagata, K. 2020. CNN-LSTM based reduced order modeling of two-dimensional unsteady flows around a circular cylinder at different Reynolds numbers. Fluid Dynamics Research, 52(6): 065501.
Herzog, R.; and Kunisch, K. 2010. Algorithms for PDE-constrained optimization. GAMM-Mitteilungen, 33(2): 163-176.
Holl, P.; Thuerey, N.; and Koltun, V. 2020. Learning to Control PDEs with Differentiable Physics. In International Conference on Learning Representations.
Hsieh, W. W.; and Tang, B. 1998. Applying neural network models to prediction and data analysis in meteorology and oceanography. Bulletin of the American Meteorological Society, 79(9): 1855-1870.
Kim, B.; Azevedo, V. C.; Thuerey, N.; Kim, T.; Gross, M.; and Solenthaler, B. 2019. Deep fluids: A generative network for parameterized fluid simulations. In Computer Graphics Forum, volume 38, 59-70. Wiley Online Library.
Kingma, D. P.; and Ba, J. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.
Lee, S.; Jang, K.; Cho, H.; Kim, H.; and Shin, S. 2021. Parametric non-intrusive model order reduction for flow-fields using unsupervised machine learning. Computer Methods in Applied Mechanics and Engineering, 384: 113999.
Li, Q.; Dietrich, F.; Bollt, E. M.; and Kevrekidis, I. G. 2017. Extended dynamic mode decomposition with dictionary learning: A data-driven adaptive spectral decomposition of the Koopman operator. Chaos: An Interdisciplinary Journal of Nonlinear Science, 27(10): 103111.
Li, Y.; Du, T.; Grama Srinivasan, S.; Wu, K.; Zhu, B.; Sifakis, E.; and Matusik, W. 2022. Fluidic Topology Optimization with an Anisotropic Mixture Model. ACM Trans. Graph.
Li, Y.; Wu, J.; Zhu, J.-Y.; Tenenbaum, J. B.; Torralba, A.; and Tedrake, R. 2019. Propagation networks for model-based control under partial observation. In 2019 International Conference on Robotics and Automation (ICRA), 1205-1211. IEEE.
Ling, J.; Jones, R.; and Templeton, J. 2016. Machine learning strategies for systems with invariance properties. Journal of Computational Physics, 318: 22-35.
Liu, B.; Mason, G.; Hodgson, J.; Tong, Y.; and Desbrun, M. 2015. Model-Reduced Variational Fluid Simulation. ACM Trans. Graph., 34(6).
Márquez-Neila, P.; Salzmann, M.; and Fua, P. 2017. Imposing hard constraints on deep networks: Promises and limitations. arXiv preprint arXiv:1706.02025.
Maulik, R.; Rao, V.; Madireddy, S.; Lusch, B.; and Balaprakash, P. 2019. Using recurrent neural networks for nonlinear component computation in advection-dominated reduced-order models. arXiv preprint arXiv:1909.09144.
Mohan, A.; and Fredrickson, G. H. 2020. Reduced Order Computational Model for the Molecular Dynamics Simulation of Entangled Polymers. arXiv preprint arXiv:2009.00216.
Mohan, A. T.; Lubbers, N.; Livescu, D.; and Chertkov, M. 2020. Embedding hard physical constraints in neural network coarse-graining of 3D turbulence. arXiv preprint arXiv:2002.00021.
Moin, P.; and Mahesh, K. 1998. Direct numerical simulation: a tool in turbulence research. Annual Review of Fluid Mechanics, 30(1): 539-578.
Morozov, A.; Zgyatti, D.; and Popov, P. 2021. Equidistant and Uniform Data Augmentation for 3D Objects. IEEE Access, 10: 3766-3774.
Ober-Blöbaum, S.; and Offen, C. 2022. Variational Learning of Euler-Lagrange Dynamics from Data. arXiv preprint arXiv:2112.12619.
Pavlov, D.; Mullen, P.; Tong, Y.; Kanso, E.; Marsden, J. E.; and Desbrun, M. 2011. Structure-preserving discretization of incompressible fluids. Physica D: Nonlinear Phenomena, 240(6): 443-458.
Pearlmutter, B. A. 1989. Learning state space trajectories in recurrent neural networks. Neural Computation, 1(2): 263-269.
Petrila, T.; and Trif, D. 2004. Basics of fluid mechanics and introduction to computational fluid dynamics, volume 3. Springer Science & Business Media.
Prantl, L.; Bonev, B.; and Thuerey, N. 2019. Generating Liquid Simulations with Deformation-aware Neural Networks. In International Conference on Learning Representations.
Raissi, M.; Perdikaris, P.; and Karniadakis, G. E. 2019. Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations. Journal of Computational Physics, 378: 686-707.
Ralph-Uwe, B.; Ernst, O. G.; and Spitzer, K. 2008. Fast 3D simulation of transient electromagnetic fields by model reduction in the frequency domain using Krylov subspace projection. Geophysical Journal International, 173(3): 766-780.
Rowley, C. W. 2005. Model reduction for fluids, using balanced proper orthogonal decomposition. International Journal of Bifurcation and Chaos, 15(03): 997-1013.
Rowley, C. W.; and Dawson, S. T. 2017. Model reduction for flow analysis and control. Annu. Rev. Fluid Mech., 49(1): 387-417.
Rowley, C. W.; and Marsden, J. E. 2002. Variational integrators for degenerate Lagrangians, with application to point vortices. In Proceedings of the 41st IEEE Conference on Decision and Control, 2002, volume 2, 1521-1527. IEEE.
Sampaio, R.; and Soize, C. 2007. Remarks on the efficiency of POD for model reduction in non-linear dynamics of continuous elastic systems. International Journal for Numerical Methods in Engineering, 72(1): 22-45.
Schenck, C.; and Fox, D. 2018. SPNets: Differentiable fluid dynamics for deep neural networks. In Conference on Robot Learning, 317-335. PMLR.
Schmid, P. J. 2010. Dynamic mode decomposition of numerical and experimental data. Journal of Fluid Mechanics, 656: 5-28.
Sirignano, J.; and Spiliopoulos, K. 2018. DGM: A deep learning algorithm for solving partial differential equations. Journal of Computational Physics, 375: 1339-1364.
Taira, K.; Brunton, S. L.; Dawson, S. T.; Rowley, C. W.; Colonius, T.; McKeon, B. J.; Schmidt, O. T.; Gordeyev, S.; Theofilis, V.; and Ukeiley, L. S. 2017. Modal analysis of fluid flows: An overview. AIAA Journal, 55(12): 4013-4041.
Takahashi, T.; Liang, J.; Qiao, Y.-L.; and Lin, M. C. 2021. Differentiable fluids with solid coupling for learning and control. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, 6138-6146.
Tan, Q.; Pan, Z.; Smith, B.; Shiratori, T.; and Manocha, D. 2022. N-Penetrate: Active learning of neural collision handler for complex 3D mesh deformations. In International Conference on Machine Learning, 21037-21049. PMLR.
Um, K.; Brand, R.; Fei, Y. R.; Holl, P.; and Thuerey, N. 2020. Solver-in-the-loop: Learning from differentiable physics to interact with iterative PDE-solvers. Advances in Neural Information Processing Systems, 33: 6111-6122.
Wang, Z.; Xiao, D.; Fang, F.; Govindan, R.; Pain, C. C.; and Guo, Y. 2018. Model identification of reduced order fluid dynamics systems using deep learning. International Journal for Numerical Methods in Fluids, 86(4): 255-268.
Wu, P.; Gong, S.; Pan, K.; Qiu, F.; Feng, W.; and Pain, C. 2021. Reduced order model using convolutional autoencoder with self-attention. Physics of Fluids, 33(7): 077107.
Yang, P.; Jiang, Y.-L.; and Xu, K.-L. 2019. A trust-region method for H2 model reduction of bilinear systems on the Stiefel manifold. Journal of the Franklin Institute, 356(4): 2258-2273.
Zang, T.; and Green, L. 1999. Multidisciplinary design optimization techniques: Implications and opportunities for fluid dynamics research. In 30th Fluid Dynamics Conference, 3798.
Zhang, W.; Kou, J.; and Wang, Z. 2016. Nonlinear aerodynamic reduced-order model for limit-cycle oscillation and flutter. AIAA Journal, 54(10): 3304-3311.