# learnability_of_linear_porthamiltonian_systems__ab180216.pdf

Journal of Machine Learning Research 25 (2024) 1-56 Submitted 4/23; Published 2/24

Learnability of Linear Port-Hamiltonian Systems

Juan-Pablo Ortega Juan-Pablo.Ortega@ntu.edu.sg Division of Mathematical Sciences Nanyang Technological University, Singapore

Daiying Yin yind0004@e.ntu.edu.sg Division of Mathematical Sciences Nanyang Technological University, Singapore

Editor: Maxim Raginsky

A complete structure-preserving learning scheme for single-input/single-output (SISO) linear port-Hamiltonian systems is proposed. The construction is based on the solution, when possible, of the unique identification problem for these systems, in ways that reveal fundamental relationships between classical notions in control theory and crucial properties in the machine learning context, like structure-preservation and expressive power. In the canonical case, it is shown that, up to initializations, the set of uniquely identified systems can be explicitly characterized as a smooth manifold endowed with global Euclidean coordinates, which allows concluding that the parameter complexity necessary for the replication of the dynamics is only $O(n)$ and not $O(n^2)$, as suggested by the standard parametrization of these systems. Furthermore, it is shown that linear port-Hamiltonian systems can be learned while remaining agnostic about the dimension of the underlying data-generating system. Numerical experiments show that this methodology can be used to efficiently estimate linear port-Hamiltonian systems out of input-output realizations, making the contributions in this paper the first example of a structure-preserving machine learning paradigm for linear port-Hamiltonian systems based on explicit representations of this model category.

Keywords: Linear port-Hamiltonian system, machine learning, structure-preserving algorithm, systems theory, physics-informed machine learning, unique identiﬁcation problem, controllable representation, observable representation, canonical representation.

1 Introduction

2 Preliminaries
   2.1 State-space systems and morphisms
   2.2 Hamiltonian and port-Hamiltonian systems
   2.3 Controllability and observability
   2.4 The symplectic Lie group and its Lie algebra
   2.5 Williamson's normal form

© 2024 Juan-Pablo Ortega and Daiying Yin.

License: CC-BY 4.0, see https://creativecommons.org/licenses/by/4.0/. Attribution requirements are provided at http://jmlr.org/papers/v25/23-0450.html.

3 Controllable and observable Hamiltonian representations

4 Unique identification of linear port-Hamiltonian systems
   4.1 The unique identification problem for filters in $\mathcal{PH}_n$
   4.2 Equivalence classes of port-Hamiltonian systems by system isomorphisms
   4.3 The quotient spaces as groupoid orbit spaces
   4.4 Characterization of canonical port-Hamiltonian systems
   4.5 The unique identifiability space for canonical port-Hamiltonian systems as a group orbit space
   4.6 Global Euclidean coordinates for the unique identifiability space of canonical port-Hamiltonian systems

5 Linear port-Hamiltonian systems in normal form are restrictions of higher dimensional ones

6 Practical implementation of the results

7 Numerical illustrations
   7.1 Non-dissipative circuit
   7.2 Positive definite Frenkel-Kontorova model

8 Conclusions

Acknowledgments

Glossary of Symbols

Bibliography

9 Appendices
   9.1 Proof of Theorem 7 (i)
   9.2 Proof of Theorem 7 (ii)
   9.3 Proof of Proposition 18
   9.4 Proof of Theorem 22
   9.5 Proof of Proposition 24
   9.6 Proof of Proposition 30
   9.7 Proof of Proposition 31
   9.8 Proof of Proposition 32
   9.9 Proof of Proposition 33
   9.10 Proof of Theorem 34
   9.11 Proof of Proposition 35
   9.12 Proof of Proposition 36
   9.13 Proof of Proposition 37
   9.14 A note on the design of discrete integrators on the transformed space

1. Introduction

Machine learning has experienced substantial development in recent years due to signiﬁcant advances in algorithmics and a fast growth in computational power. The universal approximation properties of neural networks (Cybenko (1989); Hornik et al. (1989)) and other similar families make it possible for them to learn any function with very few prior assumptions. A typical modus operandi in supervised machine learning is ﬁrst to choose a neural network architecture, to perform forward propagation using available data, to compute some loss function, and then to carry out backward propagation, that is, gradient descent, to recursively optimize the parameters. This paradigm has proved to be very successful in the learning of numerous complicated tasks, including time-series forecasting (Hochreiter and Schmidhuber (1997)), computer vision (Krizhevsky et al. (2012)), and natural language processing (Devlin et al. (2018)).

In physics and engineering, machine learning is called to play an essential role in predicting and integrating the equations associated with physical dynamical systems. Physical systems are primarily formulated in terms of ordinary, time-delay, and partial differential equations that can be deduced mostly from variational principles. Consequently, some researchers propose to learn adequately discretized versions of their corresponding vector fields (see, for instance, Raissi and Karniadakis (2017), Qin et al. (2019), Long et al. (2018), and references therein). In addition to vector field learning, researchers have proposed model-free methods like transformers (Shalova and Oseledets (2020); Acciaio et al. (2022)), reservoir computing (Jaeger and Haas (2004); Lu et al. (2018); Pathak et al. (2018a,b)), recurrent neural networks (Bailer-Jones et al. (1998)), convolutional neural networks (Mukhopadhyay and Banerjee (2020)), or LSTMs (Wang (2017)).

Various universal approximation properties theoretically explain the empirical success of some of these learning paradigms (see, for instance, Grigoryeva and Ortega (2018a,b); Gonon and Ortega (2020, 2021)). Nevertheless, for physics-related problems, like in mechanics or optics, it is natural to build into the learning algorithm any prior knowledge that we may have about the system based on physics first principles. This may include specific forms of the laws of motion, conservation laws, symmetry invariance, as well as other underlying geometric and variational structures. This observation regarding the construction of structure-preserving schemes was exploited extensively, and with much success, in the field of numerical integration before the emergence of machine learning (Gonzalez (2000); Marsden and West (2001); Leimkuhler and Reich (2004); McLachlan and Quispel (2006)). Many examples in that context show how the failure to maintain specific conservation laws can lead to physically inconsistent solutions.

The translation of this idea to the context of machine learning has led to the emergence of a new domain collectively known as physics-informed machine learning (see Raissi et al. (2017); Wu et al. (2018); Karniadakis et al. (2021) and references therein). In the specific case of Hamiltonian systems, the two main structural constraints are that the flow is symplectic and that the energy, that is, the Hamiltonian, is conserved along the flow. Additionally, symmetries are frequently present, which leads to additional conserved quantities in the form of the so-called momentum maps via Noether's Theorem (Abraham and Marsden (1978); Marsden and Ratiu (1999); Ortega and Ratiu (2004)). These are all examples of qualitative properties to be preserved by the learning algorithms. Needless to say, the above-mentioned model-free approaches generically fail to preserve all these structures. With these in mind, several attempts have been made in the literature to develop tailor-made learning algorithms for Hamiltonian systems. For example, in Greydanus et al. (2019); Celledoni et al. (2023), neural methods are proposed to learn the Hamiltonian function directly. In Chen et al. (2020), a symplectic recurrent neural network is proposed that uses symplectic integration while matching the predictions and observations and leads to a structure-preserving paradigm. Other structure-preserving methods include the so-called SympNet (Jin et al. (2020)), the generating function neural networks (GFNN) in Chen and Tao (2021), and the symplectic reversible neural networks in Valperga et al. (2022). SympNet constructs a universal approximating family of symplectic maps, while GFNN applies a modified KAM theory to control long-term prediction error. Symplectic reversible neural networks are also proposed as a family of universal approximating maps that concern, in particular, reversible symplectic dynamics. In Zhong et al. (2020), a parametric framework of learning Hamiltonian state dynamics with control is proposed, assuming that the Hamiltonian is separable. Under the same assumption, Tong et al. (2021) proposes to learn with a parametrized Hamiltonian in a Taylor series form.

This paper's focus differs from the references mentioned above in two ways. First, these methods are designed to learn the state evolution of Hamiltonian systems, whereas our approach focuses on learning the input-output dynamics of port-Hamiltonian systems while remaining agnostic about the physical state space. As will be introduced later, these systems have an underlying Dirac structure that describes the geometry of numerous physical systems with external inputs (van der Schaft and Jeltsema (2014)) and includes the dynamics of the observations of Hamiltonian systems as a particular case. Even though various learning schemes for these systems have already been proposed in the literature (Nageshrao et al. (2015); Cherifi (2020); Desai et al. (2021); Beckers et al. (2022)), most works on the learning of Hamiltonian systems deal with autonomous (separable) Hamiltonian systems on which one assumes access to the entire phase space and not only to its observations. Second, instead of a general nonlinear system for which only the approximation error can possibly be estimated, we consider, as a first approach, exclusively linear systems. In that case, we can obtain explicit representations of linear port-Hamiltonian systems in normal form and characterize the symmetries and quotient spaces associated with the invariance by system automorphisms. Thereby, we propose a structure-preserving learning paradigm with a provably minimal parameter space.

The contributions in this paper are contained in several results that we briefly introduce in the following lines. In Section 2, we define the notion of linear port-Hamiltonian systems in normal form and present some necessary introductory concepts. We start in Theorem 7 by introducing system morphisms that allow us to represent any linear port-Hamiltonian system in normal form as the image of another linear system of the same dimension in which the state equation is in controllable canonical form. Since the constructed linear system and the original port-Hamiltonian system are linked by a system morphism, the input/output relations of the former are input/output relations of the latter once the initial state conditions have been properly set up. In particular, the new system can be used to learn to reproduce the input/output dynamics of the original port-Hamiltonian system (for a subspace of initial conditions) and this learning paradigm is structure-preserving by construction. Similarly, Theorem 7 also contains another type of system morphisms that link any linear port-Hamiltonian system in normal form to some linear system of the same dimension in observable canonical form. Consequently, the input/output relations of the original port-Hamiltonian system with respect to any initial condition can be captured by the observable Hamiltonian representation. Both representations are derived based on classical techniques from control theory, the Cayley-Hamilton theorem, and are ultimately corollaries of the Williamson normal form (Williamson (1936, 1937); Ikramov (2018)). We show that the controllable and observable representations are closely related to each other, and both system morphisms become isomorphisms for canonical port-Hamiltonian systems. However, for the purpose of learning a general port-Hamiltonian system that may not be canonical, we reveal that there is a trade-off between the structure-preserving property and the expressive power. These results establish a strong link between classical notions in control theory, that is, controllability and observability, and those in machine learning, namely, structure-preservation and expressive power.

Based on these explicit constructions and using the parametrizations that come with them, we aim to tackle in Section 4 the unique identifiability of input-output dynamics of linear port-Hamiltonian systems in normal form. Such a characterization is needed to solve the model estimation problem since, in applications, we only have access to input/output data, and different state space systems can induce the same filter that produces that data. This fact has important implications when it comes to the learning of port-Hamiltonian systems out of finite-sample realizations of a given data-generating process because such degeneracy makes its exact recovery impossible. Said differently, it is not the space of port-Hamiltonian systems that needs to be characterized but its quotient space with respect to the equivalence relation defined by the constraint on inducing the same input/output filter. We shall see in Subsection 4.1 that the presence of non-canonical systems in $PH_n$ and possible initialization inconsistencies make it, in general, difficult to directly characterize that quotient space by filter-equivalence. We shall settle for the closest to it that we can get, namely, the quotient space by system automorphisms that, as it will be justified, approximates the general case in a certain sense and admits an explicit characterization as a Lie groupoid orbit space (Subsection 4.3). In Subsection 4.4, we restrict our identification analysis to canonical port-Hamiltonian systems and show, first, that in that situation eliminating the system isomorphisms completely identifies the set of input/output systems up to state initializations (Sussmann (1976)), and second, that the corresponding quotient spaces can be characterized as orbit spaces with respect to a group (as opposed to a groupoid in the general unrestricted case) action, where the group is explicitly given by a semi-direct product. Moreover (see Subsection 4.6), this orbit space can be explicitly endowed with a smooth manifold structure that has global Euclidean coordinates that can be used at the time of constructing estimation algorithms. Consequently, up to state initializations, canonical port-Hamiltonian dynamics can be identified fully and explicitly in either the controllable or the observable Hamiltonian representations and learned by estimating an initial state condition and a unique set of parameters in a smooth manifold that is obtained as a group orbit space.

Another learning-related problem that we tackle is that, in applications, one is obliged to remain agnostic as to the dimension of the underlying data-generating port-Hamiltonian system. This leads to the difficulty of choosing the dimension of the controllable/observable Hamiltonian representations. We solve this issue by proving in Theorem 34 that, for $m \geq n$, any $2n$-dimensional linear port-Hamiltonian system in normal form can be regarded as the restriction of a $2m$-dimensional one to some subspace. This fact, together with some subsequent results, guarantees theoretically that we can choose a sufficiently large $m$ in practice, parametrize the observable Hamiltonian representation in dimension $2m$, and use it for learning without assuming any knowledge about the dimension of the data-generating system. The paper concludes with some numerical examples in Section 7 that illustrate the viability of the method that we propose in systems with various levels of complexity and dimensions, as well as the computational advantages associated with using the parameter space in which unique identification is guaranteed. For the reader's convenience, the Python code necessary to reproduce these numerics is public and can be found at https://github.com/YINDAIYING/Learnability-of-Linear-Port-Hamiltonian-Systems.

2. Preliminaries

In this section, we introduce various notions and preliminary results necessary to understand the context and the contributions of the paper.

2.1 State-space systems and morphisms

A continuous-time state-space system is given by the following two equations

$$
\begin{cases}
\dot{z} = F(z, u),\\
y = h(z),
\end{cases} \tag{1}
$$

where $u \in U$ is the input, $z \in Z$ is the internal state, and $F : Z \times U \to Z$ is called the state map. The first equation is called the state equation while the second one is usually referred to as the observation equation. The solutions of (1) (when available and unique) yield an input/output map that is by construction causal and time-invariant. State-space systems will sometimes be denoted using the triplet $(Z, F, h)$.

Definition 1 A map $f : Z_1 \to Z_2$ is called a system morphism (see Grigoryeva and Ortega (2021)) between the continuous-time state-space systems $(Z_1, F_1, h_1)$ and $(Z_2, F_2, h_2)$ if it satisfies the following two properties:

(i) System equivariance: $f(F_1(z_1, u)) = F_2(f(z_1), u)$, for all $z_1 \in Z_1$ and $u \in U$.

(ii) Readout invariance: $h_1(z_1) = h_2(f(z_1))$ for all $z_1 \in Z_1$.

As a direct consequence of this definition, the composition of system morphisms is again a system morphism. In the case $f$ is invertible and $f^{-1}$ is also a morphism, we say that $f$ is a system isomorphism. An elementary but very important fact is that if $f : Z_1 \to Z_2$ is a linear system-equivariant map between $(Z_1, F_1, h_1)$ and $(Z_2, F_2, h_2)$ ($Z_1$ and $Z_2$ are in this case vector spaces) then, for any solution $z_1 \in C^1(I, Z_1)$ of the state equation associated to $F_1$ and to the input $u \in C^1(I, U)$, with $I \subset \mathbb{R}$ an interval, its image $f \circ z_1 \in C^1(I, Z_2)$ is a solution for the state space system associated to $F_2$ with the same input. Indeed, for any $t \in I$ we have, by the linearity and the system equivariance of $f$:

$$
\frac{d}{dt}\left[f(z_1(t))\right] = Df(z_1(t)) \cdot \dot{z}_1(t) = f(\dot{z}_1(t)) = f(F_1(z_1(t), u(t))) = F_2(f(z_1(t)), u(t)).
$$
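As an illustration of Definition 1 in the linear case, the following is a minimal sketch (not taken from the paper's repository) that checks system equivariance and readout invariance numerically for a linear change of coordinates $f(z) = Tz$. The matrices `A1`, `B1`, `C1`, and `T` are arbitrary values chosen for the example, and the second system is assumed to be $(T A_1 T^{-1}, T B_1, T^{-\top} C_1)$.

```python
# Sketch: f(z) = T z is a system morphism between the linear systems
# (A1, B1, C1) and (A2, B2, C2) = (T A1 T^{-1}, T B1, T^{-T} C1).
# All numerical values are hypothetical and chosen only for illustration.

def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def matmul(M, N):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*N)] for row in M]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

A1 = [[0.0, 2.0], [-2.0, 0.0]]
B1 = [1.0, 0.5]
C1 = [2.0, 1.0]

T = [[1.0, 1.0], [0.0, 2.0]]          # invertible change of coordinates
T_inv = [[1.0, -0.5], [0.0, 0.5]]     # its inverse, written out by hand

A2 = matmul(matmul(T, A1), T_inv)
B2 = matvec(T, B1)
C2 = matvec([list(col) for col in zip(*T_inv)], C1)   # C2 = T^{-T} C1

def F1(z, u):
    return [a + b * u for a, b in zip(matvec(A1, z), B1)]

def F2(z, u):
    return [a + b * u for a, b in zip(matvec(A2, z), B2)]

def h1(z):
    return dot(C1, z)

def h2(z):
    return dot(C2, z)

def f(z):
    return matvec(T, z)

z1, u = [0.3, -1.2], 0.7
# (i) system equivariance: f(F1(z1, u)) = F2(f(z1), u)
equivariance_gap = max(abs(a - b) for a, b in zip(f(F1(z1, u)), F2(f(z1), u)))
# (ii) readout invariance: h1(z1) = h2(f(z1))
readout_gap = abs(h1(z1) - h2(f(z1)))
print(equivariance_gap, readout_gap)
```

Both gaps should vanish up to floating-point rounding, which is the finite-dimensional content of the two conditions in the definition.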

Notice that if at time t = 0, the output of both systems (Z1, F1, h1) and (Z2, F2, h2) are the same, that is, the initial conditions z1(0) and z2(0) at the time of integrating (1) are chosen so that h1(z1(0)) = h2(f(z1(0))), then the two systems (Z1, F1, h1) and (Z2, F2, h2) have the same associated input/output relation, in the sense that we introduce later on in deﬁnition (7). This observation has an important consequence, namely that, in general, input/output systems are not uniquely identiﬁed since all the system-isomorphic state-space systems with appropriate initializations yield the same input/output map.

2.2 Hamiltonian and port-Hamiltonian systems

Hamiltonian systems are dynamical systems whose behavior is governed by Hamilton's variational principle. Even though these autonomous systems can be in general formulated on any symplectic manifold (Abraham and Marsden (1978)), we will restrict in this paper to the case in which the phase space is the even-dimensional vector space $\mathbb{R}^{2n}$ endowed with the Darboux canonical symplectic form. In this case, the Hamiltonian system determined by the Hamiltonian function $H \in C^1(\mathbb{R}^{2n})$ is given by the differential equation

$$
\dot{z} = J \frac{\partial H}{\partial z}(z), \tag{2}
$$

where

$$
J = \begin{pmatrix} 0 & I_n \\ -I_n & 0 \end{pmatrix}
$$

is the so-called canonical symplectic matrix. Note that $-J = J^\top = J^{-1}$ and hence $J$ also endows $\mathbb{R}^{2n}$ with a complex structure. In this paper, we will denote the canonical symplectic matrix as $J$, unless the context requires specifying the dimension, in which case we denote it by $J_n$.

A linear Hamiltonian system is determined by a quadratic Hamiltonian function $H(z) = \frac{1}{2} z^\top Q z$, where $z \in \mathbb{R}^{2n}$ and $Q \in M_{2n}$ is a square matrix that without loss of generality can be assumed to be symmetric. In this case, Hamilton's equations (2) reduce to

$$
\dot{z} = JQz. \tag{3}
$$
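A short numerical check can make the conservation of the quadratic Hamiltonian along (3) concrete: since $\frac{dH}{dt} = (Qz)^\top (JQz)$ and $J$ is skew-symmetric, the rate vanishes. The sketch below assumes the block convention $J = [[0, I_n], [-I_n, 0]]$; the matrix `Q` and the state `z` are hypothetical values.

```python
# Sketch: energy conservation for dz/dt = J Q z, assuming J = [[0, I], [-I, 0]].
# Q (symmetric positive-definite) and z are hypothetical example values.

def hamiltonian_vector_field(Q, z):
    """Return J Q z, with Q given as a list of rows."""
    n = len(z) // 2
    grad = [sum(q * zi for q, zi in zip(row, z)) for row in Q]   # grad H = Q z
    return grad[n:] + [-g for g in grad[:n]]                     # J (Q z)

def energy_rate(Q, z):
    """dH/dt = (Q z)^T (J Q z) along a solution of dz/dt = J Q z."""
    grad = [sum(q * zi for q, zi in zip(row, z)) for row in Q]
    zdot = hamiltonian_vector_field(Q, z)
    return sum(g * v for g, v in zip(grad, zdot))

Q = [[2.0, 0.5, 0.0, 0.1],
     [0.5, 1.0, 0.2, 0.0],
     [0.0, 0.2, 3.0, 0.4],
     [0.1, 0.0, 0.4, 1.5]]
z = [0.3, -1.2, 0.8, 0.5]

rate = energy_rate(Q, z)
print(rate)   # zero up to floating-point rounding
```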

Port-Hamiltonian systems (see van der Schaft and Jeltsema (2014)) are state-space systems that generalize autonomous Hamiltonian systems to the case in which external signals or inputs control in a time-varying way the dynamical behavior of the Hamiltonian system. The family of input-state-output port-Hamiltonian systems are those port-Hamiltonian systems with no algebraic constraints on the state-space variables, and where the ﬂow and eﬀort variables of the resistive, control and interaction ports are split into conjugated pairs. In such cases, the implicit representation may be proved (see van der Schaft and Jeltsema (2014)) to be equivalent to the following explicit form:

$$
\begin{cases}
\dot{x} = [J(x) - R(x)] \dfrac{\partial H}{\partial x}(x) + g(x) u,\\[4pt]
y = g^\top(x) \dfrac{\partial H}{\partial x}(x),
\end{cases} \tag{4}
$$

where $(u, y)$ is the input-output pair (corresponding to the control and output conjugated ports), $J(x)$ is a skew-symmetric interconnection structure and $R(x)$ is a symmetric positive-definite dissipation matrix. Our work concerns linear port-Hamiltonian systems in normal form, which we define now: a linear port-Hamiltonian system (4) is in normal form if the skew-symmetric matrix $J(x)$ is constant and equal to the canonical symplectic matrix $J$, the Hamiltonian matrix $Q$ is symmetric positive-definite, and the energy dissipation matrix $R = 0$, in which case (4) takes the form:

$$
\begin{cases}
\dot{z} = JQz + Bu,\\
y = B^\top Q z,
\end{cases} \tag{5}
$$

with $z \in \mathbb{R}^{2n}$, $u, y \in \mathbb{R}$, and where $B \in \mathbb{R}^{2n}$ specifies the interconnection structure simultaneously at the input and output levels. By definition, such systems are fully determined by the pair $(Q, B)$, and hence we denote by

$$
\Theta_{PH_n} := \left\{ (Q, B) \mid 0 < Q \in M_{2n},\ Q = Q^\top,\ B \in \mathbb{R}^{2n} \right\} \tag{6}
$$

the space of parameters of (5). Let $\theta_{PH_n} : \Theta_{PH_n} \to PH_n$ be the map that associates to the parameter $(Q, B) \in \Theta_{PH_n}$ the corresponding port-Hamiltonian state space system. For convenience, we shall often use $(Q, B)$ to denote elements in $PH_n$ unless there is a risk of confusion. Note that the condition $Q > 0$ implies that the origin is a Lyapunov stable equilibrium of (3). All these systems have the existence and uniqueness of solutions property and hence determine a family of input/output systems, also known as filters, that will be denoted by $\mathcal{PH}_n$. More specifically, the elements in $\mathcal{PH}_n$ are maps $U_{(Q,B)} : C^1([0, 1]) \times \mathbb{R}^{2n} \to C^1([0, 1])$ given by

$$
\begin{array}{rcl}
U_{(Q,B)} : C^1([0, 1]) \times \mathbb{R}^{2n} & \longrightarrow & C^1([0, 1])\\[2pt]
(u, x_0) & \longmapsto & U_{(Q,B)}(u, x_0)_t = B^\top Q\, e^{JQt} \left( \displaystyle\int_0^t e^{-JQs} B u(s)\, ds + x_0 \right),
\end{array} \tag{7}
$$

$t \in [0, 1]$. Note that $\mathcal{PH}_n$ includes as a special case linear observations of autonomous linear Hamiltonian systems (case $B = 0$). Note that as a manifold $\Theta_{PH_n} = S^+_{2n} \times \mathbb{R}^{2n}$, where $S^+_{2n}$ denotes the space of symmetric positive-definite (SPD) matrices. We recall that $S^+_{2n}$ has a natural differentiable manifold structure whose tangent space at any point is the vector space of symmetric matrices $S_{2n}$ (see Quang Minh and Murino (2018), and references therein).
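To see how the filter (7) relates to the state equation (5), the following sketch evaluates the variation-of-constants formula by quadrature and compares it with a direct Runge-Kutta integration of the state equation. It is only an illustration under simplifying assumptions: $n = 1$, $Q = \omega I_2$ (so that $e^{JQt}$ is a plane rotation), $J = [[0, 1], [-1, 0]]$, and a hypothetical input $u(t) = \sin(3t)$. The names `OMEGA`, `B`, `filter_formula`, and `filter_rk4` are introduced here and are not part of the paper.

```python
import math

# Assumptions for this sketch: n = 1, Q = OMEGA * I_2, J = [[0, 1], [-1, 0]].
OMEGA = 2.0
B = (1.0, 0.5)

def exp_JQ(t):
    """exp(J Q t) = cos(OMEGA t) I + sin(OMEGA t) J for Q = OMEGA * I_2."""
    c, s = math.cos(OMEGA * t), math.sin(OMEGA * t)
    return ((c, s), (-s, c))

def apply(M, x):
    return (M[0][0] * x[0] + M[0][1] * x[1], M[1][0] * x[0] + M[1][1] * x[1])

def u(t):
    return math.sin(3.0 * t)   # hypothetical input signal

def filter_formula(t, x0, steps=2000):
    """Evaluate (7) with the trapezoid rule for the integral term."""
    h = t / steps
    acc = [0.0, 0.0]
    for i in range(steps + 1):
        s = i * h
        w = 0.5 if i in (0, steps) else 1.0
        e = apply(exp_JQ(-s), B)              # exp(-J Q s) B
        acc[0] += w * h * e[0] * u(s)
        acc[1] += w * h * e[1] * u(s)
    z = apply(exp_JQ(t), (acc[0] + x0[0], acc[1] + x0[1]))
    return OMEGA * (B[0] * z[0] + B[1] * z[1])   # y = B^T Q z

def rhs(t, z):
    """Right-hand side of dz/dt = J Q z + B u."""
    return (OMEGA * z[1] + B[0] * u(t), -OMEGA * z[0] + B[1] * u(t))

def filter_rk4(t_end, x0, steps=2000):
    """Integrate the state equation with classical RK4 and read out y."""
    h = t_end / steps
    t, z = 0.0, x0
    for _ in range(steps):
        k1 = rhs(t, z)
        k2 = rhs(t + h / 2, (z[0] + h / 2 * k1[0], z[1] + h / 2 * k1[1]))
        k3 = rhs(t + h / 2, (z[0] + h / 2 * k2[0], z[1] + h / 2 * k2[1]))
        k4 = rhs(t + h, (z[0] + h * k3[0], z[1] + h * k3[1]))
        z = (z[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
             z[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
        t += h
    return OMEGA * (B[0] * z[0] + B[1] * z[1])

x0 = (0.2, -0.1)
y_formula = filter_formula(1.0, x0)
y_rk4 = filter_rk4(1.0, x0)
print(y_formula, y_rk4)
```

The two values should agree up to the discretization error of the quadrature, which is what one expects if (7) is the output of (5) started at $x_0$.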

Port-Hamiltonian systems are also closely linked to the so-called affine Hamiltonian input-output systems that have been considered as a natural extension of Hamiltonian systems with external forces and studied extensively in the literature (see Crouch and van der Schaft (1987) for the deterministic case and Bismut (1982); Lázaro-Camí and Ortega (2008) for stochastic extensions), which take the form

$$
\begin{cases}
\dot{x} = X_H(x) + X_g(x) u,\\
y = g(x),
\end{cases} \tag{8}
$$

where $X_H$ and $X_g$ are the Hamiltonian vector fields of $H, g \in C^1(\mathbb{R}^{2n})$. In the linear case, (8) reduces to

$$
\begin{cases}
\dot{z} = JQz - JBu,\\
y = B^\top z.
\end{cases} \tag{9}
$$

The relation between (9) and (5) is that $\dot{y} = B^\top \dot{z} = B^\top JQz = (-JB)^\top Qz$, showing that the time derivative of the affine Hamiltonian input-output system has a port-Hamiltonian structure. Note that in the second equality, we used that $B^\top JB = 0$ since $J$ is antisymmetric.

Consider now a general linear single-input/single-output system that takes the form

$$
\begin{cases}
\dot{x} = Ax + Bu,\\
y = C^\top x,
\end{cases} \tag{10}
$$

where $A \in M_n$, $B, C \in \mathbb{R}^n$. Very often in control theory, it is the so-called transfer matrix rather than the input/output system which is studied. The transfer matrix $G(s)$ of (10) is defined as $G(s) = C^\top (Is - A)^{-1} B$ and converts the differential equations in the time domain to an algebraic equation in the Laplace frequency domain. It can be proved that the transfer matrix of the port-Hamiltonian systems (5) satisfies $G(s) = -G(-s)$ and that of systems of the type (9) satisfies $G(s) = G(-s)$. The converse statements also hold for canonical realizations (see the definition in the next section and Brockett and Rahimi (1972), Maschke and van der Schaft (1992)). These facts are a strong indication that the systems (5) and (9) carry intrinsic symmetries that should be explicitly characterized. We shall do so in Section 4 for port-Hamiltonian systems but only using the original state-space representation.
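The two symmetry statements for the transfer function can be checked numerically in the smallest case. The sketch below assumes $n = 1$ and $J = [[0, 1], [-1, 0]]$, evaluates $G(s) = C^\top (Is - A)^{-1} B$ at a complex frequency, and compares $G(s)$ with $G(-s)$ for a port-Hamiltonian system of the form (5) (input $B$, output vector $QB$) and for an affine system of the form (9) (input $-JB$, output vector $B$). The numerical values of `Q`, `B`, and `s` are hypothetical.

```python
# Sketch: parity of the transfer function for 2-dimensional systems,
# assuming J = [[0, 1], [-1, 0]]. Q, B and s are hypothetical values.

def transfer(s, A, b_in, c_out):
    """G(s) = c_out^T (s I - A)^{-1} b_in for a 2x2 matrix A and complex s."""
    m = [[s - A[0][0], -A[0][1]], [-A[1][0], s - A[1][1]]]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    x0 = (m[1][1] * b_in[0] - m[0][1] * b_in[1]) / det
    x1 = (-m[1][0] * b_in[0] + m[0][0] * b_in[1]) / det
    return c_out[0] * x0 + c_out[1] * x1

Q = [[2.0, 0.5], [0.5, 1.0]]
B = [1.0, -0.3]

A = [[Q[1][0], Q[1][1]], [-Q[0][0], -Q[0][1]]]                           # A = J Q
QB = [Q[0][0] * B[0] + Q[0][1] * B[1], Q[1][0] * B[0] + Q[1][1] * B[1]]  # Q B
minus_JB = [-B[1], B[0]]                                                 # -J B

s = 0.7 + 1.3j
# Port-Hamiltonian system (5): expected to be odd, G(s) = -G(-s).
odd_gap = abs(transfer(s, A, B, QB) + transfer(-s, A, B, QB))
# Affine Hamiltonian system (9): expected to be even, G(s) = G(-s).
even_gap = abs(transfer(s, A, minus_JB, B) - transfer(-s, A, minus_JB, B))
print(odd_gap, even_gap)
```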

2.3 Controllability and observability

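Given a general linear system like (10), we recall that its controllability and observability matrices are defined by

$$
\left( B \mid AB \mid \cdots \mid A^{n-1}B \right)
\quad \text{and} \quad
\begin{pmatrix} C^\top \\ C^\top A \\ \vdots \\ C^\top A^{n-1} \end{pmatrix},
\quad \text{respectively.}
$$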

The system is called controllable (respectively, observable) if its controllability (respectively, observability) matrix has full rank. Any linear controllable (respectively, observable) system can be transformed into the so-called controllable (respectively, observable) canonical forms by using appropriate linear system isomorphisms (see Polderman and Willems (1998)). Conversely, systems in these canonical forms are automatically controllable (respectively, observable). In the next section, we characterize the controllable/observable/canonical systems in the linear port-Hamiltonian category.
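The rank conditions above can be tested directly on small port-Hamiltonian examples. The following sketch builds the Krylov vectors $B, AB, \ldots$ and $C, A^\top C, \ldots$ for a system of the form (5) with a block-diagonal Hamiltonian matrix $Q = \mathrm{diag}(d, d)$ (a simplifying assumption made only for this example, together with $J = [[0, I], [-I, 0]]$) and computes their rank by Gaussian elimination. The helper names `rank`, `krylov`, and `is_canonical` are hypothetical.

```python
# Sketch: Kalman rank tests for dz/dt = J Q z + B u, y = B^T Q z,
# assuming Q = diag(d_1, ..., d_n, d_1, ..., d_n) and J = [[0, I], [-I, 0]].

def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]

def rank(vectors, tol=1e-10):
    """Rank of a list of vectors via Gaussian elimination with partial pivoting."""
    M = [list(v) for v in vectors]
    rows, cols, r = len(M), len(M[0]), 0
    for c in range(cols):
        if r == rows:
            break
        pivot = max(range(r, rows), key=lambda i: abs(M[i][c]))
        if abs(M[pivot][c]) < tol:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, rows):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def krylov(A, b, size):
    """Return [b, A b, ..., A^{size-1} b]."""
    out = [list(b)]
    for _ in range(size - 1):
        out.append(matvec(A, out[-1]))
    return out

def is_canonical(d, B):
    """True if the system is both controllable and observable."""
    n, size = len(d), len(B)
    A = [[0.0] * size for _ in range(size)]
    for i in range(n):
        A[i][n + i] = d[i]        # upper-right block of J Q
        A[n + i][i] = -d[i]       # lower-left block of J Q
    C = [d[i % n] * B[i] for i in range(size)]       # C = Q B
    A_T = [list(col) for col in zip(*A)]
    controllable = rank(krylov(A, B, size)) == size
    observable = rank(krylov(A_T, C, size)) == size  # rows C^T A^k, transposed
    return controllable and observable

print(is_canonical([1.0, 2.0], [1.0, 1.0, 0.0, 0.0]))   # distinct d_i
print(is_canonical([1.0, 1.0], [1.0, 1.0, 0.0, 0.0]))   # repeated d_i
```

In this example, the system with distinct entries in `d` passes both rank tests, while the one with a repeated entry does not, which is one simple way in which a port-Hamiltonian system can fail to be canonical.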

Controllability and observability are intertwined concepts in the linear port-Hamiltonian category. Indeed, it can be proved (see Medianu et al. (2013)) that if a linear port-Hamiltonian system without dissipation is controllable and $\det(Q) \neq 0$, then it is also observable. Conversely, if it is observable, then this implies that $\det(Q) \neq 0$ and it is also controllable (see Medianu et al. (2013)). As it is customary in systems theory, we say a linear port-Hamiltonian system in normal form is canonical if it is both controllable and observable. In view of the results that we just recalled, if $\det(Q) \neq 0$, then either controllability or observability is equivalent to the system being canonical. Furthermore, it can be shown that being canonical is a generic property, that is, the set of canonical systems forms an open and dense subset. We shall denote by $PH_n^{can} \subset PH_n$ the subset of $PH_n$ made of canonical linear port-Hamiltonian systems. Later on in the paper, the significance of these observations will become apparent.

2.4 The symplectic Lie group and its Lie algebra

A square matrix $S \in M_{2n}$ in dimension $2n$ is called symplectic if it satisfies $S^\top J S = J$. The set of all symplectic matrices forms a Lie group denoted by $Sp(2n, \mathbb{R})$. It is well-known that if $S \in Sp(2n, \mathbb{R})$ then $\det(S) = 1$ and hence $Sp(2n, \mathbb{R})$ is a subgroup of the general linear group $GL(2n, \mathbb{R})$. The Lie algebra $\mathfrak{sp}(2n, \mathbb{R})$ of $Sp(2n, \mathbb{R})$ is given by the matrices $A \in M_{2n}$ that satisfy the identity $A^\top J + JA = 0$. Equivalently, $A \in \mathfrak{sp}(2n, \mathbb{R})$ if and only if $A = JR$, where $R \in M_{2n}$ is symmetric. We will refer to the elements in $Sp(2n, \mathbb{R})$ as symplectic matrices and to those in $\mathfrak{sp}(2n, \mathbb{R})$ as infinitesimally symplectic.

Notably, the eigenvalues of the elements in $\mathfrak{sp}(2n, \mathbb{R})$ appear in specific patterns that are spelled out in the following classical proposition (see (Abraham and Marsden, 1978, Section 3.1)).

Proposition 2 The characteristic polynomial of any matrix $A \in \mathfrak{sp}(2n, \mathbb{R})$ is even. Thus, if $\lambda$ is an eigenvalue of $A$ then so are $-\lambda$, $\bar{\lambda}$, and $-\bar{\lambda}$.

The importance of this group in our developments is that the (constant) vector field associated with Hamilton's equations (3) is an element in $\mathfrak{sp}(2n, \mathbb{R})$. Its flow determines a one-parameter subgroup of elements in $Sp(2n, \mathbb{R})$. We also introduce the unitary group $U(n, \mathbb{C})$, which consists of matrices $U \in M_n(\mathbb{C})$ with $UU^* = U^*U = I_n$, where $U^*$ denotes the conjugate transpose of $U$. We denote by $U(n)$ (see De Gosson (2006)) the image of $U(n, \mathbb{C})$ in $Sp(2n, \mathbb{R})$ by the monomorphism

$$
A + iB \longmapsto \begin{pmatrix} A & -B \\ B & A \end{pmatrix}.
$$

The so-called 2-out-of-3 property (Arnold (1989)) implies that $U(n) = O(2n, \mathbb{R}) \cap GL(n, \mathbb{C}) \cap Sp(2n, \mathbb{R})$, and it is indeed the intersection of any two out of the three groups.

2.5 Williamson s normal form

The following classical result can be found in Williamson (1936, 1937); Ikramov (2018); De Gosson (2006).

Theorem 3 Let $M \in M_{2n}$ be a positive-definite symmetric real matrix. Then

(i) There exists a symplectic matrix $S \in Sp(2n, \mathbb{R})$ such that $M = S^\top \begin{pmatrix} D & 0 \\ 0 & D \end{pmatrix} S$, with $D = \mathrm{diag}(d)$ an $n$-dimensional diagonal matrix with positive entries and $d = (d_1, \ldots, d_n)^\top$.

(ii) The values $d_1, \ldots, d_n$ are independent, up to reordering, of the choice of the symplectic matrix $S$ used to diagonalize $M$.

(iii) Assume $S$ and $S'$ are two elements of $Sp(2n, \mathbb{R})$ such that $M = S^\top \begin{pmatrix} D & 0 \\ 0 & D \end{pmatrix} S = S'^\top \begin{pmatrix} D & 0 \\ 0 & D \end{pmatrix} S'$, where $D$ is as above; then $S (S')^{-1} \in U(n)$.

Later in this paper, we always use the notation $D = \mathrm{diag}(d)$ to denote that $D$ is a diagonal matrix with diagonal entries given by the vector $d = (d_1, \ldots, d_n)^\top$. The elements $d_i$ in the above theorem are called the symplectic eigenvalues of $M$ since the eigenvalues of $JM$ are $\pm i d_1, \ldots, \pm i d_n$.
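In the lowest-dimensional case $n = 1$, Williamson's normal form can be written down by hand, which gives a quick sanity check of the statement. The sketch below relies on two facts valid for $2 \times 2$ matrices: the symplectic group $Sp(2, \mathbb{R})$ coincides with the matrices of determinant one, and the single symplectic eigenvalue of a symmetric positive-definite $M$ is $d = \sqrt{\det M}$. The function name `williamson_2x2` and the test matrix are hypothetical, and the upper-triangular choice of $S$ is just one of many valid ones.

```python
import math

def williamson_2x2(M):
    """Return (d, S) with M = d * S^T S and det(S) = 1, for a 2x2 SPD matrix M."""
    (m11, m12), (_, m22) = M
    d = math.sqrt(m11 * m22 - m12 * m12)    # symplectic eigenvalue, sqrt(det M)
    a, b = m11 / d, m12 / d                 # entries of M / d, which has determinant 1
    r = math.sqrt(a)
    S = [[r, b / r], [0.0, 1.0 / r]]        # upper-triangular factor of M / d
    return d, S

M = [[2.0, 1.0], [1.0, 3.0]]
d, S = williamson_2x2(M)

det_S = S[0][0] * S[1][1] - S[0][1] * S[1][0]
# Reconstruct d * S^T S entry by entry and compare with M.
recon = [[d * sum(S[k][i] * S[k][j] for k in range(2)) for j in range(2)]
         for i in range(2)]
recon_gap = max(abs(recon[i][j] - M[i][j]) for i in range(2) for j in range(2))
print(d, det_S, recon_gap)
```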

Remark 4 The above theorem can be generalized to positive-semidefinite real symmetric matrices. Indeed, it can first be shown that if the kernel of $M$ is a symplectic subspace of $\mathbb{R}^{2n}$ of dimension $2m$, then the statement of Theorem 3 still holds with the only added feature that exactly $m$ of the diagonal entries in $D$ are equal to $0$ (see Son and Stykel (2022)). More generally, without the symplecticity assumption, all that can be said is that there exists $S \in Sp(2n, \mathbb{R})$ such that $M = S^\top \begin{pmatrix} D_1 & 0 \\ 0 & D_2 \end{pmatrix} S$, where $D_1$ and $D_2$ may contain diagonal zero entries (see Idel et al. (2017); Egusquiza and Parra-Rodriguez (2022)).

3. Controllable and observable Hamiltonian representations

In this section, we state two representation results for linear port-Hamiltonian systems in normal form, which are the main building blocks in our learnability results. More precisely, we deﬁne two subfamilies of linear systems of the type (10), that are respectively called controllable/observable Hamiltonian representations, that are by construction controllable/observable (Deﬁnition 5). We subsequently show in Theorem 7 that morphisms can be established between the elements in these families and those in the category PHn of normal form port-Hamiltonian systems.

As it will be spelled out later on in detail, the existence of these morphisms immediately guarantees that the complexity of the family of filters $\mathcal{PH}_n$ is actually not $O(n^2)$, as it could be guessed from (5), but $O(n)$. However, our proposed representations have certain limitations for non-canonical port-Hamiltonian systems. For example, the observable representation is guaranteed to capture all possible input-output dynamics of port-Hamiltonian systems (full expressive power), but it does not always produce port-Hamiltonian dynamics (fails to be structure-preserving). In the controllable case, structure preservation is guaranteed, but there is, in general, no full expressive power. Fortunately, for canonical port-Hamiltonian systems, all the morphisms that we shall introduce become isomorphisms, meaning that they are both structure-preserving and have full expressive power. Roughly speaking, the more canonical a port-Hamiltonian system is, the better the corresponding representations behave in terms of structure-preserving properties and expressive power.

The representations introduced below can be seen as a reparametrization of the elements $(Q, B) \in PH_n$ in terms of a diagonal matrix $D = \operatorname{diag}(d) \in M_n$, $d \in \mathbb{R}^n$, and a vector $v \in \mathbb{R}^{2n}$, where $D$ is obtained from Williamson's Theorem 3 as

$$Q = S^\top \begin{pmatrix} D & 0 \\ 0 & D \end{pmatrix} S, \qquad v = S B.$$

This makes it obvious that the learning problem for port-Hamiltonian systems has parameter complexity of at most $O(n)$ even if the Hamiltonian matrix has complexity $O(n^2)$.
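The following worked computation is a sketch of why only $(d, v)$ enters the input-output behavior. It assumes that the normal form (5) reads $\dot z = JQz + Bu$, $y = B^\top Q z$ and that $J = \begin{pmatrix} 0 & I_n \\ -I_n & 0 \end{pmatrix}$; the symbols $\hat D$, $\tilde z$ and $H(\lambda)$ (the transfer function) are introduced here only for illustration.

```latex
% Assumptions: \hat D := diag(D, D), Q = S^\top \hat D S, v = S B, \tilde z := S z, S J S^\top = J.
\begin{align*}
\dot{\tilde z} &= S J S^\top \hat D\, \tilde z + S B\, u = J \hat D\, \tilde z + v\, u,
\qquad y = B^\top S^\top \hat D\, S z = v^\top \hat D\, \tilde z, \\
H(\lambda) &= v^\top \hat D\, (\lambda I_{2n} - J \hat D)^{-1} v
 = v^\top \hat D\, (\lambda I_{2n} + J \hat D)(\lambda^2 I_{2n} + \hat D^2)^{-1} v
 && \text{since } (J\hat D)^2 = -\hat D^2, \\
 &= \sum_{l=1}^{n} \frac{\lambda\, d_l\, (v_l^2 + v_{n+l}^2)}{\lambda^2 + d_l^2}
 && \text{since } \hat D J \hat D\, (\lambda^2 I_{2n} + \hat D^2)^{-1} \text{ is skew-symmetric.}
\end{align*}
```

Under these assumptions the transfer function depends on the $n$ entries of $d$ and on $v$ only through the $n$ quantities $v_l^2 + v_{n+l}^2$, which is consistent with the $O(n)$ parameter count stated above.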

We emphasize that even in the canonical situation, the availability of the controllable/observable representations does not yet provide a well-speciﬁed learning problem for this category since the invariance of these systems under system automorphisms implies the existence of symmetries (or degeneracies) in the parametrizations, which will be the focus of the next section.

The proofs of all our results are provided in the appendices.

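Definition 5 Given $d = (d_1, \ldots, d_n)^\top \in \mathbb{R}^n$, with $d_i > 0$, and $v \in \mathbb{R}^{2n}$, we say that a $2n$-dimensional linear state space system is a controllable Hamiltonian (respectively, observable Hamiltonian) representation if it takes the form

$$\begin{cases} \dot s = g^{ctr}_1(d)\, s + (0, 0, \ldots, 0, 1)^\top u, \\ y = g^{ctr}_2(d, v)\, s, \end{cases} \qquad \left(\text{respectively, } \begin{cases} \dot s = g^{obs}_1(d)\, s + g^{obs}_2(d, v)\, u, \\ y = (0, 0, \ldots, 0, 1)\, s, \end{cases}\right) \tag{12}$$

where $g^{ctr}_1(d) \in M_{2n}$ and $g^{ctr}_2(d, v) \in M_{1,2n}$ (respectively, $g^{obs}_1(d) \in M_{2n}$ and $g^{obs}_2(d, v) \in \mathbb{R}^{2n}$) are constructed as follows:

(i) Given $d \in \mathbb{R}^n$, let $\{a_0, a_1, \ldots, a_{2n-1}\}$ be the real coefficients that make

$$\lambda^{2n} + \sum_{i=0}^{2n-1} a_i \lambda^i = (\lambda^2 + d_1^2)(\lambda^2 + d_2^2) \cdots (\lambda^2 + d_n^2)$$

an equality between the two polynomials in $\lambda$. Let $a_{2n} = 1$ by convention. Note that the entries $a_i$ with an odd index $i$ are zero. Define:

$$g^{ctr}_1(d) := \begin{pmatrix} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \\ -a_0 & -a_1 & -a_2 & \cdots & -a_{2n-1} \end{pmatrix} \qquad \left(\text{respectively, } g^{obs}_1(d) = g^{ctr}_1(d)^\top\right).$$

(ii) Given $d$ and $v$, then

$$g^{ctr}_2(d, v) := \begin{pmatrix} 0 & c_{2n-1} & 0 & c_{2n-3} & \cdots & 0 & c_1 \end{pmatrix}, \qquad \left(\text{resp., } g^{obs}_2(d, v) = g^{ctr}_2(d, v)^\top\right),$$

where

$$c_{2k+1} = v^\top \begin{pmatrix} F_k & 0 \\ 0 & F_k \end{pmatrix} v \quad \text{for } k = 0, \ldots, n-1, \quad \text{and} \quad F_k = \operatorname{diag}(f_1, f_2, \ldots, f_n),$$

with

$$f_l = d_l \cdot \sum_{\substack{j_1, \ldots, j_k \neq l \\ 1 \leq j_1 < \cdots < j_k \leq n}} (d_{j_1} d_{j_2} \cdots d_{j_k})^2, \qquad l = 1, \ldots, n.$$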


We denote by $CH_n$ (respectively, $OH_n$) the set of all systems of the form (12), and we call them controllable Hamiltonian (respectively, observable Hamiltonian) representations. The symbol $\mathcal{CH}_n$ (respectively, $\mathcal{OH}_n$) denotes the set of input/output systems induced by the state space systems in $CH_n$ (respectively, $OH_n$). We emphasize that the elements of both $CH_n$ and $OH_n$ can be parameterized with the set

$$\Theta_{CH_n} = \Theta_{OH_n} := \left\{ (d, v) \mid d_i > 0,\ v \in \mathbb{R}^{2n} \right\}.$$

Sometimes later on in the paper we shall write ai(d) and cj(d, v) to indicate that ai and cj are functions of d and v.
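As an illustration of Definition 5, the following minimal sketch builds $g^{ctr}_1(d)$ and $g^{ctr}_2(d, v)$ with NumPy and compares the resulting transfer function with that of a port-Hamiltonian system. The function names, the sample values of $d$ and $v$, the choice $S = I_{2n}$ (so that $Q = \operatorname{diag}(d, d)$ and $B = v$), the sign convention $J = \begin{pmatrix} 0 & I_n \\ -I_n & 0 \end{pmatrix}$, and the form $\dot z = JQz + Bu$, $y = B^\top Q z$ of (5) are all assumptions made for this example.

```python
import numpy as np
from itertools import combinations

def a_coeffs(d):
    """a_0, ..., a_{2n-1} of prod_l (lam^2 + d_l^2), lowest degree first."""
    p = np.array([1.0])
    for dl in d:
        p = np.convolve(p, [1.0, 0.0, dl ** 2])  # coefficients, highest degree first
    return p[::-1][:-1]  # ascending order, monic leading coefficient dropped

def c_coeffs(d, v):
    """c_{2k+1} = v^T diag(F_k, F_k) v for k = 0, ..., n-1."""
    d, v = np.asarray(d, float), np.asarray(v, float)
    n = len(d)
    w = v[:n] ** 2 + v[n:] ** 2
    c = np.zeros(n)
    for k in range(n):
        for l in range(n):
            others = [d[j] ** 2 for j in range(n) if j != l]
            # elementary symmetric polynomial of degree k in the d_j^2, j != l
            e_k = sum(np.prod(t) for t in combinations(others, k))
            c[k] += d[l] * w[l] * e_k
    return c

def g1_ctr(d):
    """Companion matrix with last row (-a_0, ..., -a_{2n-1})."""
    a = a_coeffs(d)
    A = np.diag(np.ones(len(a) - 1), k=1)
    A[-1, :] = -a
    return A

def g2_ctr(d, v):
    """Row vector (0, c_{2n-1}, 0, c_{2n-3}, ..., 0, c_1)."""
    c = c_coeffs(d, v)
    n = len(c)
    row = np.zeros(2 * n)
    row[2 * n - 1 - 2 * np.arange(n)] = c  # c_{2k+1} sits at 0-based column 2n-1-2k
    return row

def transfer(A, B, C, lam):
    """Transfer function C (lam I - A)^{-1} B at a complex point lam."""
    return C @ np.linalg.solve(lam * np.eye(len(B)) - A, B)

# Hypothetical sample parameters
d = np.array([1.0, 2.0])
v = np.array([1.0, 0.5, -0.3, 2.0])
n = len(d)
J = np.block([[np.zeros((n, n)), np.eye(n)], [-np.eye(n), np.zeros((n, n))]])
Q = np.diag(np.concatenate([d, d]))  # S = I, so Q = diag(D, D) and B = v
e_last = np.zeros(2 * n)
e_last[-1] = 1.0

lam = 0.7 + 0.3j
H_ph = transfer(J @ Q, v, v @ Q, lam)                   # port-Hamiltonian system
H_ch = transfer(g1_ctr(d), e_last, g2_ctr(d, v), lam)   # controllable Hamiltonian representation
print(abs(H_ph - H_ch))
```

The nested loop over index subsets is exponential in $n$ and is used here only because it mirrors the formula for $f_l$ literally; for larger $n$ the same coefficients can be obtained by polynomial multiplication.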

Remark 6 Observe that the controllable and the observable Hamiltonian representations of port-Hamiltonian systems are closely related to each other. The controllable Hamiltonian matrix $g^{ctr}_1$ is the transpose of the observable Hamiltonian matrix $g^{obs}_1$. Moreover, as can be directly observed from the construction, the input and readout matrices of the two representations, that is, $g^{obs}_2$ and $g^{ctr}_2$, are transposes of each other.

Consider now the maps $\theta_{CH_n} : \Theta_{CH_n} \to CH_n$ and $\theta_{OH_n} : \Theta_{OH_n} \to OH_n$ that associate to each parameter value the corresponding state-space system. Note that the elements in $CH_n$ (respectively, in $OH_n$) of the form (12) are in canonical controllable (respectively, observable) form in the sense of Sontag (1998) and they are hence controllable (respectively, observable). Our main result below establishes a relationship between port-Hamiltonian systems and controllable (respectively, observable) Hamiltonian representations as defined above, which will be used later on for considerations on the structure preservation and expressiveness in the modeling of $PH_n$.

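Theorem 7 (i) There exists, for each $S \in \operatorname{Sp}(2n, \mathbb{R})$, a map

$$\phi_S : CH_n \to PH_n, \qquad \theta_{CH_n}(d, v) \mapsto \theta_{PH_n}\!\left(S^\top \begin{pmatrix} D & 0 \\ 0 & D \end{pmatrix} S,\ S^{-1} v\right),$$

with $D = \operatorname{diag}(d)$, such that the controllable Hamiltonian system $\theta_{CH_n}(d, v) \in CH_n$ and the port-Hamiltonian image $\phi_S(\theta_{CH_n}(d, v)) \in PH_n$ are linked by a linear system morphism $f^{(d,v)}_S : \mathbb{R}^{2n} \to \mathbb{R}^{2n}$.

(ii) Given a port-Hamiltonian system $\theta_{PH_n}(Q, B) \in PH_n$, there exists an explicit linear system morphism $f^{(Q,B)} : \mathbb{R}^{2n} \to \mathbb{R}^{2n}$ between the state space of $\theta_{PH_n}(Q, B) \in PH_n$ and that of an observable Hamiltonian system $\theta_{OH_n}(d, v) \in OH_n$, where $(d, v) \in \Theta_{OH_n}$ is determined by the Williamson normal form decomposition of $Q$ given by some $S \in \operatorname{Sp}(2n, \mathbb{R})$, that is,

$$Q = S^\top \begin{pmatrix} D & 0 \\ 0 & D \end{pmatrix} S, \qquad D = \operatorname{diag}(d), \qquad v = S B.$$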

Remark 8 We emphasize that given $(Q, B) \in \Theta_{PH_n}$, the pair $(d, v) \in \Theta_{CH_n} = \Theta_{OH_n}$ is not uniquely determined by Williamson's decomposition. This can be seen from Theorem 3 because the element $S \in \operatorname{Sp}(2n, \mathbb{R})$ in its statement is not unique and the entries $d_i$ of $d$ are independent of $S$ only up to their ordering.
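For readers who want to compute a pair $(d, v)$ from a given $(Q, B)$, the following is a minimal numerical sketch of one standard way to obtain a Williamson decomposition of a symmetric positive-definite $Q$. It is not taken from the paper: the function names are hypothetical, the construction goes through the skew-symmetric matrix $Q^{1/2} J Q^{1/2}$, and the convention $J = \begin{pmatrix} 0 & I_n \\ -I_n & 0 \end{pmatrix}$ is assumed. As noted in Remark 8, the returned $S$ is only one of many valid choices.

```python
import numpy as np

def symplectic_J(n):
    """Canonical symplectic matrix J = [[0, I], [-I, 0]] (sign convention assumed)."""
    Z, I = np.zeros((n, n)), np.eye(n)
    return np.block([[Z, I], [-I, Z]])

def williamson(Q):
    """Return (d, S) with S symplectic and Q = S.T @ diag(d, d) @ S.

    Q must be symmetric positive definite of even size 2n.
    """
    Q = np.asarray(Q, dtype=float)
    n = Q.shape[0] // 2
    J = symplectic_J(n)
    lam, U = np.linalg.eigh(Q)
    Q_half = (U * np.sqrt(lam)) @ U.T            # symmetric square root of Q
    K = Q_half @ J @ Q_half                      # real skew-symmetric matrix
    mu, W = np.linalg.eigh(1j * K)               # Hermitian; eigenvalues are -d_j and +d_j
    d = -mu[:n]                                  # the n negative eigenvalues (d in descending order)
    Wn = W[:, :n]
    # Real and imaginary parts of the eigenvectors give an orthogonal O
    # with O.T @ K @ O = [[0, D], [-D, 0]].
    O = np.sqrt(2.0) * np.hstack([Wn.real, Wn.imag])
    S = (O.T @ Q_half) / np.sqrt(np.concatenate([d, d]))[:, None]
    return d, S

# Hypothetical example: a random symmetric positive-definite Q with n = 3
rng = np.random.default_rng(0)
M = rng.standard_normal((6, 6))
Q = M @ M.T + 6 * np.eye(6)
B = rng.standard_normal(6)

d, S = williamson(Q)
v = S @ B                                        # v = S B as in Theorem 7 (ii)
D_hat = np.diag(np.concatenate([d, d]))
J = symplectic_J(3)
print(np.linalg.norm(S.T @ D_hat @ S - Q), np.linalg.norm(S @ J @ S.T - J))
```

The entries of $d$ can be cross-checked against the spectrum of $JQ$, whose eigenvalues are $\pm i d_j$.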


Remark 9 (Controllability, observability, and invertibility)

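(i) In the proof of the theorem above (available in the Appendix), we define the linear system morphism $f^{(d,v)}_S : \mathbb{R}^{2n} \to \mathbb{R}^{2n}$ as $z = f^{(d,v)}_S(s) := Ls$, and an explicit construction of the matrix $L$ is provided. It turns out that the matrix $L$ is invertible if and only if the image port-Hamiltonian system (5) is controllable or, equivalently, observable. Indeed, using the same notation as in the proof of Theorem 7, we have

$$L = S^{-1}\left( L_1 v \mid L_2 v \mid \cdots \mid L_{2n} v \right) = \left( S^{-1}L_1 v \mid S^{-1}L_2 v \mid \cdots \mid S^{-1}L_{2n} v \right),$$

where

$$\begin{aligned}
S^{-1}L_{2n-k}\, v &= S^{-1}\left[ \left(J_n \begin{pmatrix} D & 0 \\ 0 & D \end{pmatrix}\right)^{k} + a_{2n-1} \left(J_n \begin{pmatrix} D & 0 \\ 0 & D \end{pmatrix}\right)^{k-1} + \cdots + a_{2n-k} I_{2n} \right] v \\
&= S^{-1}\left[ (J_n S^{-\top} Q S^{-1})^{k} + a_{2n-1} (J_n S^{-\top} Q S^{-1})^{k-1} + \cdots + a_{2n-k} I_{2n} \right] v \\
&= S^{-1}\left[ (S J_n Q S^{-1})^{k} + a_{2n-1} (S J_n Q S^{-1})^{k-1} + \cdots + a_{2n-k} I_{2n} \right] v \\
&= \left[ (J_n Q)^{k} + a_{2n-1} (J_n Q)^{k-1} + \cdots + a_{2n-k} I_{2n} \right] S^{-1} v,
\end{aligned}$$

and $S^{-1}v$ is the input vector $B$ of the image system. Therefore, $L$ can be transformed by elementary column operations into the controllability matrix of (5) and hence $L$ being invertible, that is, the two systems being isomorphic, is equivalent to the controllability matrix of (5) having full rank (regardless of the choice of $S \in \operatorname{Sp}(2n, \mathbb{R})$), which is again equivalent to (5) being canonical. Additionally, the condition for $f^{(d,v)}_S$ to be invertible can also be formulated in terms of $D$ and $v$ directly, which we will discuss in Subsection 4.4.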

(ii) Systems in $CH_n$ are by construction in controllable canonical form, and are therefore always controllable. If the image system (5) by $\phi_S$ that we want to learn is controllable (or equivalently, observable), then by the previous point $L$ is necessarily an invertible matrix, which means that (12) and (5) are isomorphic systems by construction. As a consequence, (12) is not only controllable but also observable.
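The relation between $L$, the controllability of (5) and the companion form can be checked numerically. The sketch below assumes $S = I_{2n}$ (so $Q = \operatorname{diag}(d, d)$, $B = v$), the convention $J = \begin{pmatrix} 0 & I_n \\ -I_n & 0 \end{pmatrix}$, and the form $\dot z = JQz + Bu$, $y = B^\top Q z$ of (5); the helper names are hypothetical. The columns of $L$ are built with a Horner-type recursion that is equivalent to the polynomial expressions in $J_nQ$ displayed in point (i).

```python
import numpy as np

def a_coeffs(d):
    """a_0, ..., a_{2n-1} of prod_l (lam^2 + d_l^2), lowest degree first."""
    p = np.array([1.0])
    for dl in d:
        p = np.convolve(p, [1.0, 0.0, dl ** 2])
    return p[::-1][:-1]

def companion(a):
    """Controllable canonical (companion) matrix with last row -a."""
    A = np.diag(np.ones(len(a) - 1), k=1)
    A[-1, :] = -np.asarray(a)
    return A

def ph_system(d, v):
    """Port-Hamiltonian data (JQ, B, Q) for S = I, i.e. Q = diag(d, d), B = v."""
    n = len(d)
    J = np.block([[np.zeros((n, n)), np.eye(n)], [-np.eye(n), np.zeros((n, n))]])
    Q = np.diag(np.concatenate([d, d]))
    return J @ Q, np.asarray(v, float), Q

def morphism_L(A, B, a):
    """Columns t_{2n} = B and t_{j-1} = A t_j + a_{j-1} B (1-based indices)."""
    N = len(a)
    cols = [None] * N
    cols[N - 1] = np.array(B, dtype=float)
    for i in range(N - 1, 0, -1):
        cols[i - 1] = A @ cols[i] + a[i] * cols[N - 1]
    return np.column_stack(cols)

v = np.array([1.0, 0.5, -0.3, 2.0])

# Canonical case: distinct entries in d
d = np.array([1.0, 2.0])
A, B, Q = ph_system(d, v)
a = a_coeffs(d)
L = morphism_L(A, B, a)
print(np.linalg.matrix_rank(L))          # full rank: L is a system isomorphism

# Resonant case: repeated entries in d, the image system is not controllable
d_res = np.array([1.0, 1.0])
A_res, B_res, Q_res = ph_system(d_res, v)
a_res = a_coeffs(d_res)
L_res = morphism_L(A_res, B_res, a_res)
print(np.linalg.matrix_rank(L_res))      # rank deficient: only a morphism
```

In both cases $L$ intertwines the two state equations, that is, $JQ\,L = L\,g^{ctr}_1(d)$ and the last column of $L$ equals $B$; only in the canonical case can $L$ be inverted.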

Remark 10 (Application to structure-preserving system learning) As a corollary of the previous result, we can use controllable Hamiltonian representations to learn port-Hamiltonian systems in an efficient and structure-preserving fashion. Indeed, given a realization of a port-Hamiltonian system, a system of the type $\theta_{CH_n}(d, v) \in CH_n$ can be estimated using an appropriate loss (see Section 7). A representation of this type is more advantageous than the original port-Hamiltonian one for two reasons:

(i) The model complexity of the controllable Hamiltonian representation is only of order $O(n)$, as opposed to $O(n^2)$ for the original port-Hamiltonian one.

(ii) This learning scheme is automatically structure-preserving. Indeed, once a system $\theta_{CH_n}(d, v) \in CH_n$ has been estimated for a given realization, we have shown that there exists a family of linear morphisms, each of which is between the state space of $\theta_{CH_n}(d, v) \in CH_n$ and some $\theta_{PH_n}(Q, B) \in PH_n$, such that any solution of (12) is automatically a solution of some system in $PH_n$. Hence, even in the presence of estimation errors for $(d, v) \in \Theta_{CH_n}$, the solutions of $\theta_{CH_n}(d, v)$ still correspond to a port-Hamiltonian system and hence this structure is preserved by the learning scheme.

Remark 11 (System learning and expressive power) Expressive power is an important property of any machine learning paradigm. As a continuation of the previous remarks, we emphasize that there is an important relation between the controllability of a system in $PH_n$ and the expressive power of the corresponding representation in $CH_n$. Indeed, if (5) is controllable, by point (ii) in Remark 9, the corresponding preimage system $\theta_{CH_n}(d, v) \in CH_n$ can capture all possible solutions of (5), which amounts to the learning scheme based on $\Theta_{CH_n}$ having full expressive power. To see this, let $z_0$ be an initial state of the controllable system $\theta_{PH_n}(Q, B) \in PH_n$ in (5). Since in that case we can find an invertible system isomorphism $f^{(d,v)}_S$ that links it to some $\theta_{CH_n}(d, v) \in CH_n$, there exists a corresponding initial state $s_0 = \left(f^{(d,v)}_S\right)^{-1}(z_0)$. Then, by Theorem 7 and the uniqueness of the solutions of ODEs, the solution of (12) with initial state $s_0$ is a representation of the solution of (5) with initial state $z_0$. However, if (5) fails to be controllable (that is, $f^{(d,v)}_S$ is not invertible), then such an initial condition $s_0$ may not exist. As a rule of thumb, the more controllable a system of the type (5) is, the higher the rank of $f^{(d,v)}_S$ is, and the more expressive the corresponding controllable Hamiltonian representations are.

Remark 12 (Expressive power and structure-preservation) We emphasize that systems in $OH_n$ always have full expressive power, which is guaranteed by the system morphism in Theorem 7. This implies that any input-output dynamics generated by the original port-Hamiltonian system will be captured by some of the observable Hamiltonian representations in the statement. However, unlike in the controllable case, the system morphism goes from $\theta_{PH_n}(Q, B) \in PH_n$ to $\theta_{OH_n}(d, v) \in OH_n$. Therefore, unless $(Q, B)$ is canonical, in which case the morphism becomes an isomorphism, we cannot, in general, assert the structure-preserving property of this representation.

Remark 13 (Positive semi-definite Hamiltonians) The above results can be easily generalized to positive semi-definite (PSD) Hamiltonians with the aid of the generalized Williamson's theorem in the references Son and Stykel (2022); Idel et al. (2017); Egusquiza and Parra-Rodriguez (2022) that we briefly discussed in Section 2.5. In general, the number of unknown parameters in the vector $d$ is doubled (because of the matrices $D_1$ and $D_2$ that appear in this case), and their relation with the coefficients $\{a_0, a_1, \ldots, a_{2n-1}\}$ has to be modified accordingly, that is,

$$\lambda^{2n} + \sum_{i=0}^{2n-1} a_i \lambda^i = (\lambda^2 + d_1 d_{n+1})(\lambda^2 + d_2 d_{n+2}) \cdots (\lambda^2 + d_n d_{2n}),$$

where some of the $d_i$'s could be 0. The expression of $g^{ctr}_1(d)$ remains the same, whereas in the expression of $g^{ctr}_2(d, v)$ the coefficients become

$$c_{2k+1} = v^\top \begin{pmatrix} F_{k,0} & 0 \\ 0 & F_{k,1} \end{pmatrix} v, \qquad F_{k,p} = \operatorname{diag}(f_{1,p}, f_{2,p}, \ldots, f_{n,p}),$$

and

$$f_{l,p} = d_{np+l} \cdot \sum_{\substack{j_1, \ldots, j_k \neq l \\ 1 \leq j_1 < \cdots < j_k \leq n}} d_{j_1} d_{j_2} \cdots d_{j_k}\, d_{j_1+n} d_{j_2+n} \cdots d_{j_k+n}$$

for $p = 0, 1$, and $l = 1, \ldots, n$. In this paper, we mainly deal with positive definite $Q$, since the possible degeneracy of a positive semi-definite $Q$ destroys the symmetries studied later on in Section 4.

Remark 14 (Symmetries of the Hamiltonian representations) The parameterizations of the systems in $CH_n$ and $OH_n$ exhibit obvious symmetries. For example, the functions $g^{ctr}_1(d)$ and $g^{obs}_1(d)$ are invariant under the permutation of the diagonal entries $d_i$. Moreover, $g^{ctr}_2(d, v)$ (similarly for $g^{obs}_2(d, v)$) contains entries $c_{2k+1}$ of the form

$$v^\top \begin{pmatrix} F_k & 0 \\ 0 & F_k \end{pmatrix} v,$$

which is in particular invariant under the rotation of the planes spanned by the $i$-th and $(n + i)$-th entries of $v$. These observations will be central in the next section, in which we shall show that these and other symmetries of the representations in $CH_n$ or $OH_n$ are closely related to the system automorphism group of the space $PH_n$.
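The symmetries listed in Remark 14 can be verified numerically. The sketch below uses hypothetical helper names and sample values, and it computes the coefficients $c_{2k+1}$ through the elementary symmetric polynomials that appear in the entries $f_l$ of Definition 5. It checks invariance under a permutation of $d$, under plane rotations of $v$, and under a simultaneous permutation of $d$ and of both halves of $v$.

```python
import numpy as np
from itertools import combinations

def a_coeffs(d):
    """a_0, ..., a_{2n-1} of prod_l (lam^2 + d_l^2), lowest degree first."""
    p = np.array([1.0])
    for dl in d:
        p = np.convolve(p, [1.0, 0.0, dl ** 2])
    return p[::-1][:-1]

def c_coeffs(d, v):
    """c_{2k+1} = v^T diag(F_k, F_k) v for k = 0, ..., n-1."""
    d, v = np.asarray(d, float), np.asarray(v, float)
    n = len(d)
    w = v[:n] ** 2 + v[n:] ** 2
    c = np.zeros(n)
    for k in range(n):
        for l in range(n):
            others = [d[j] ** 2 for j in range(n) if j != l]
            c[k] += d[l] * w[l] * sum(np.prod(t) for t in combinations(others, k))
    return c

def permute(d, v, perm):
    """Permute d and, simultaneously, both halves of v."""
    n = len(d)
    perm = np.asarray(perm)
    return d[perm], np.concatenate([v[:n][perm], v[n:][perm]])

def rotate_planes(v, theta):
    """Rotate the plane spanned by entries i and n+i of v by the angle theta[i]."""
    n = len(theta)
    x, y = v[:n], v[n:]
    return np.concatenate([np.cos(theta) * x - np.sin(theta) * y,
                           np.sin(theta) * x + np.cos(theta) * y])

# Hypothetical sample parameters (n = 3)
d = np.array([0.5, 1.0, 2.0])
v = np.array([1.0, -0.4, 0.3, 0.2, 2.0, -1.5])
theta = np.array([0.3, -1.1, 2.0])
perm = [2, 0, 1]

d_perm, v_perm = permute(d, v, perm)
v_rot = rotate_planes(v, theta)

print(np.allclose(a_coeffs(d), a_coeffs(d_perm)))            # g1 depends on d only as a set
print(np.allclose(c_coeffs(d, v), c_coeffs(d, v_rot)))       # plane rotations leave g2 unchanged
print(np.allclose(c_coeffs(d, v), c_coeffs(d_perm, v_perm))) # joint permutation leaves g2 unchanged
```

Permuting $d$ alone, without permuting $v$ accordingly, generally changes the coefficients $c_{2k+1}$, which is why the permutation has to act on the pair $(d, v)$.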

4. Unique identiﬁcation of linear port-Hamiltonian systems

In this section, we study the unique identifiability of input-output dynamics of linear port-Hamiltonian systems in normal form. Such a characterization is needed to solve the model estimation problem. The rationale is that, in applications, we only have access to input/output data, and different state space systems in $PH_n$ can induce the same filter that produces that data. This fact has important implications when it comes to the learning of port-Hamiltonian systems out of finite-sample realizations of a given data-generating process $(Q, B) \in PH_n$ because such degeneracy makes the exact recovery of $(Q, B) \in PH_n$ impossible in that context, no matter how good the properties of the algorithm used for that task are or how much data we have at our disposal. This observation indicates that unique identification should not be sought in the space $PH_n$ but in the quotient space associated to $PH_n$ with respect to a certain equivalence relation $\sim_{filter}$ that uniquely identifies port-Hamiltonian filters, that is, $\mathcal{PH}_n = PH_n/\sim_{filter}$. However, as we shall see later on in Subsection 4.1, the presence of non-canonical systems in $PH_n$ makes it, in general, difficult to directly characterize the quotient space $PH_n/\sim_{filter}$.

As we pointed out after Definition 1, all the system-isomorphic state-space systems with corresponding initializations yield the same filters or input/output map. However, we emphasize that a given filter can be realized by state-space systems that are not even system-isomorphic (see Example 1 later on). On the other hand, $\sim_{filter}$-equivalence requires that the outputs of the two systems at time $t = 0$ are consistent with exactly the same initializations, whereas this is not part of the definition of $\sim_{sys}$-equivalence. Motivated by this fact, we study in Subsection 4.1 how $\sim_{filter}$ and $\sim_{sys}$ are related in terms of $PH_n$ and the controllable Hamiltonian representations $CH_n$ (which by Theorem 7 automatically induce port-Hamiltonian dynamics). In Subsection 4.2, we lower our expectations and characterize $PH_n/\sim_{sys}$ as an approximation to $PH_n/\sim_{filter}$. The term approximation in this sentence is justified because $\sim_{filter}$ and $\sim_{sys}$ coincide on the set of canonical Hamiltonian representations $CH^{can}_n$, which is system-isomorphic as a set to $PH^{can}_n$, which is open and dense in $PH_n$. In particular, unique identifiability can be achieved in $CH^{can}_n$ by studying $\sim_{sys}$, that is, $\Theta^{can}_{CH_n}/\sim_{filter} = \Theta^{can}_{CH_n}/\sim_{sys}$. In addition to the discussion regarding $\sim_{filter}$ and $\sim_{sys}$, recall that in the previous section, we have established a link between $PH_n$ and the representation spaces $CH_n$ and $OH_n$ which, as we saw in Definition 5, are both parametrized by the set

$$\Theta_{CH_n} = \Theta_{OH_n} = \left\{ (d, v) \mid v \in \mathbb{R}^{2n},\ d \in \mathbb{R}^n,\ d_i > 0,\ i \in \{1, \ldots, n\} \right\}.$$

Now, it is natural to ask what equivalence relation corresponds to $\sim_{sys}$ on the parameter space $\Theta_{CH_n}$, and whether it is possible to explicitly characterize the quotient space $PH_n/\sim_{sys}$ on $\Theta_{CH_n}$ in a certain sense. All these questions are addressed step-by-step in the following subsections.

In Subsection 4.1, we show that two canonical controllable Hamiltonian representations are $\sim_{filter}$-equivalent if and only if they are $\sim_{sys}$-equivalent. In Subsection 4.2, we define an equivalence relation $\sim_\star$ on $\Theta_{CH_n}$ and we show that $PH_n/\sim_{sys} = \Theta_{CH_n}/\sim_\star$ (see Theorem 22). In Subsection 4.3, we characterize the equivalence classes $PH_n/\sim_{sys}$ and $\Theta_{CH_n}/\sim_\star$ as Lie groupoid orbit spaces.

In Subsection 4.4, we exclusively restrict our analysis to canonical port-Hamiltonian systems $PH^{can}_n$. We first show that the parameter subset $\Theta^{can}_{CH_n} \subset \Theta_{CH_n}$ that corresponds to $PH^{can}_n$ is open and dense in $\Theta_{CH_n}$ as it is determined by certain generic non-resonance and nondegeneracy conditions. If we define on $\Theta_{CH_n}$ the equivalence relation $\sim_{sys}$ of system automorphisms of the corresponding controllable/observable Hamiltonian representations (see Definition 17), then it can be proved that, restricted to the canonical subset $\Theta^{can}_{CH_n}$, the equivalence relation $\sim_\star$ coincides with $\sim_{sys}$, and hence

$$PH^{can}_n/\sim_{sys} \;=\; \Theta^{can}_{CH_n}/\sim_\star \;=\; \Theta^{can}_{CH_n}/\sim_{sys}.$$

In Subsection 4.5, we prove that the fact that we restricted the above equivalence relations to canonical subsets allows us to characterize the corresponding quotients as orbit spaces with respect to a group action (as opposed to a groupoid action in the general unrestricted case), where the group is given by a semi-direct product $S_n \rtimes_\phi \mathbb{T}^n$ that will be specified in detail later on. Finally, in Subsection 4.6, we show that the orbit space $\Theta^{can}_{CH_n}/(S_n \rtimes_\phi \mathbb{T}^n)$ can be explicitly identified as a smooth manifold $\mathcal{X}^n_{\uparrow} \times \mathbb{R}^n_{+}$ endowed with global Euclidean coordinates, and hence

$$PH^{can}_n/\sim_{sys} \;=\; \Theta^{can}_{CH_n}/\sim_\star \;=\; \Theta^{can}_{CH_n}/\sim_{filter} \;=\; \Theta^{can}_{CH_n}/\sim_{sys} \;=\; \Theta^{can}_{CH_n}/(S_n \rtimes_\phi \mathbb{T}^n) \;=\; \mathcal{X}^n_{\uparrow} \times \mathbb{R}^n_{+}.$$


Consequently, up to initializations, canonical port-Hamiltonian dynamics can be identiﬁed fully and explicitly in either the controllable or the observable Hamiltonian representations (12) and learned by estimating an initial state condition and a unique set of parameters in a smooth manifold that is obtained as a group orbit space.
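To make the idea of a unique parameter per orbit concrete, the following sketch computes one natural set of orbit invariants. It rests on assumptions that are ours and not statements of the text: the group is taken to act by simultaneous permutations of $d$ and of both halves of $v$, and by rotations of the planes spanned by the $i$-th and $(n+i)$-th entries of $v$ (as in Remark 14), and the invariants chosen are the entries of $d$ sorted increasingly together with the matching radii $r_l = \sqrt{v_l^2 + v_{n+l}^2}$. All function names are hypothetical.

```python
import numpy as np

def orbit_invariants(d, v):
    """Sorted d and the matching plane radii; invariant under the assumed group action."""
    d, v = np.asarray(d, float), np.asarray(v, float)
    n = len(d)
    r = np.hypot(v[:n], v[n:])
    order = np.argsort(d)
    return d[order], r[order]

def representative(d, v):
    """A distinguished point of the orbit: d sorted and v = (r, 0)."""
    d_sorted, r = orbit_invariants(d, v)
    return d_sorted, np.concatenate([r, np.zeros_like(r)])

def act(d, v, perm, theta):
    """Apply plane rotations by theta, then a simultaneous permutation of d and v."""
    n = len(d)
    x, y = v[:n], v[n:]
    xr = np.cos(theta) * x - np.sin(theta) * y
    yr = np.sin(theta) * x + np.cos(theta) * y
    perm = np.asarray(perm)
    return d[perm], np.concatenate([xr[perm], yr[perm]])

# Hypothetical canonical parameters: distinct d_i and nonzero radii
d = np.array([2.0, 0.5, 1.0])
v = np.array([1.0, -0.4, 0.3, 0.2, 2.0, -1.5])

d2, v2 = act(d, v, perm=[1, 2, 0], theta=np.array([0.7, -0.2, 1.9]))
inv1 = orbit_invariants(d, v)
inv2 = orbit_invariants(d2, v2)
print(inv1, inv2)
```

Because the entries of $d$ are distinct in the canonical case, sorting them fixes the permutation, and the radii are positive by nondegeneracy, so each orbit is labeled by $n$ ordered positive numbers and $n$ positive radii.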

4.1 The unique identiﬁcation problem for ﬁlters in PHn

In the context of model estimation/machine learning, we would like to characterize and identify the filters that constitute the elements in $\mathcal{PH}_n$. In Section 2.1, we have seen that two systems that are system isomorphic and are initialized according to the isomorphism induce the same input-output dynamics, which indicates that these isomorphisms are redundancies/symmetries in $PH_n$. Our aim is to quotient out the symmetries given by system automorphisms and to investigate whether the quotient space uniquely identifies the filters in $\mathcal{PH}_n$.

Definition 15 ($PH_n$ with equivalence relations $\sim_{sys}$ and $\sim_{filter}$)

(i) The fact that two systems $\theta_{PH_n}(Q_1, B_1)$ and $\theta_{PH_n}(Q_2, B_2)$ in $PH_n$ induce the same filter defines an equivalence relation in $PH_n$, which we denote by $(Q_1, B_1) \sim_{filter} (Q_2, B_2)$. Consequently, we have by definition $\mathcal{PH}_n = PH_n/\sim_{filter}$, which we call the unique identifiability space.

(ii) We observe that $\theta_{PH_n}(Q_1, B_1)$ and $\theta_{PH_n}(Q_2, B_2)$ in $PH_n$ are linearly system isomorphic according to Definition 1 if and only if there exists an invertible matrix $L$ such that

$$L J Q_1 = J Q_2 L, \qquad L B_1 = B_2, \qquad B_1^\top Q_1 = B_2^\top Q_2 L. \tag{14}$$

It is straightforward to check that system isomorphisms determine an equivalence relation on $PH_n$. If $\theta_{PH_n}(Q_1, B_1)$ and $\theta_{PH_n}(Q_2, B_2)$ are system isomorphic, we write $(Q_1, B_1) \sim_{sys} (Q_2, B_2)$. We denote by $PH_n/\sim_{sys}$ the quotient space. The equivalence class in $PH_n/\sim_{sys}$ that contains the element $\theta_{PH_n}(Q, B)$ is denoted by $[Q, B] \in PH_n/\sim_{sys}$.

It is natural to ask about the relation between $PH_n/\sim_{sys}$ and $PH_n/\sim_{filter}$, and whether they are the same. However, in general, neither of the two equivalence relations $\sim_{sys}$ and $\sim_{filter}$ implies the other. To see that $\sim_{filter}$ does not imply $\sim_{sys}$, we note that in Example 1 below, a filter in $\mathcal{PH}_n$ is realized by two elements in $PH_n$ that are not $\sim_{sys}$-equivalent, since filters identify exclusively the canonical part (that is, the minimal realization, see Kalman (1963)). To see the other direction, that is, that $\sim_{sys}$ does not imply $\sim_{filter}$, we can simply consider the value of the filter at time $t = 0$ induced by two systems $(Q_1, B_1) \sim_{filter} (Q_2, B_2)$ from (7), which gives $B_1^\top Q_1 z_0 = B_2^\top Q_2 z_0$ for any $z_0$, leading to $B_1^\top Q_1 = B_2^\top Q_2$. On the other hand, we have seen in (14) that $(Q_1, B_1) \sim_{sys} (Q_2, B_2)$ only guarantees $B_1^\top Q_1 = B_2^\top Q_2 L$, but not $B_1^\top Q_1 = B_2^\top Q_2$, unless $L$ can be shown to be the identity matrix.


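Example 1 Consider two systems $\theta_{PH_n}(Q_1, B_1), \theta_{PH_n}(Q_2, B_2) \in PH_n$ where

$$Q_1 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \qquad Q_2 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 3 \end{pmatrix}, \qquad B_1 = B_2 = \begin{pmatrix} 1 \\ 0 \\ 0 \\ 0 \end{pmatrix}.$$

Both systems induce the same filter

$$y(u, z_0)_t = \int_0^t \cos(t - s)\, u(s)\, ds + (\cos t, 0, \sin t, 0)\, z_0,$$

where $z_0$ is the initial state. However, these two systems cannot be system isomorphic, since by (14) in that case there would exist an invertible $L$ such that $L J Q_1 = J Q_2 L$, and hence $J Q_1$ would have the same set of eigenvalues as $J Q_2$, which is not the case.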
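The claims of Example 1 can be checked numerically with the sketch below. It assumes $J = \begin{pmatrix} 0 & I_2 \\ -I_2 & 0 \end{pmatrix}$, the form $\dot z = JQz + Bu$, $y = B^\top Q z$ of the systems, and $B_1 = B_2 = (1, 0, 0, 0)^\top$, which is the input vector consistent with the filter written in the example; the function name is hypothetical. The two filters coincide when the row vectors $B^\top Q (JQ)^k$ agree for all $k$, and comparing the first $8$ of them suffices here because both sequences satisfy a common linear recurrence of order at most $8$.

```python
import numpy as np

I2, Z2 = np.eye(2), np.zeros((2, 2))
J = np.block([[Z2, I2], [-I2, Z2]])   # sign convention assumed

Q1 = np.eye(4)
Q2 = np.diag([1.0, 2.0, 1.0, 3.0])
B = np.array([1.0, 0.0, 0.0, 0.0])    # B1 = B2

def readout_rows(Q, B, kmax=8):
    """Rows B^T Q (JQ)^k, k = 0, ..., kmax-1; they determine both the free
    response B^T Q exp(JQ t) z0 and the impulse response of the filter."""
    A, C = J @ Q, B @ Q
    return np.array([C @ np.linalg.matrix_power(A, k) for k in range(kmax)])

rows1 = readout_rows(Q1, B)
rows2 = readout_rows(Q2, B)
print(np.allclose(rows1, rows2))      # same filter, including the dependence on z0

# The spectra of J Q1 and J Q2 differ, so no invertible L with L J Q1 = J Q2 L exists
freqs1 = np.sort(np.abs(np.linalg.eigvals(J @ Q1).imag))
freqs2 = np.sort(np.abs(np.linalg.eigvals(J @ Q2).imag))
print(freqs1, freqs2)
```

The second system carries an additional oscillator with frequency $\sqrt{6}$ that is neither excited by the input nor seen by the output, which is why it leaves the filter unchanged.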

We have seen from the above that, in general, $PH_n/\sim_{filter}$ and $PH_n/\sim_{sys}$ are different objects, and neither one is a subset of the other. In practice, we are more interested in characterizing the former, which appears to be difficult due to issues that involve initialization consistency. Nevertheless, we can partially solve the problem by restricting to the generic subset of canonical port-Hamiltonian systems $PH^{can}_n$ and considering the controllable Hamiltonian representations $\theta_{CH_n}(\Theta^{can}_{CH_n})$ that correspond to them by system isomorphisms. On those representations, the two equivalence relations coincide exactly, that is, $\Theta^{can}_{CH_n}/\sim_{filter} = \Theta^{can}_{CH_n}/\sim_{sys}$, which we ultimately characterize in Section 4.5. We present rigorous definitions of these equivalence relations on the parameter space before we state our main result, Theorem 19.

Lemma 16 For $(d_1, v_1)$ and $(d_2, v_2) \in \Theta_{CH_n} = \Theta_{OH_n}$, $\theta_{CH_n}(d_1, v_1) \sim_{sys} \theta_{CH_n}(d_2, v_2)$ if and only if $\theta_{OH_n}(d_1, v_1) \sim_{sys} \theta_{OH_n}(d_2, v_2)$.

Proof The proof is basically a restatement of the fact that $g^{ctr}_1(d) = g^{obs}_1(d)^\top$ and $g^{ctr}_2(d, v) = g^{obs}_2(d, v)^\top$.

Definition 17 ($\Theta_{CH_n}$ with equivalence relations $\sim_{sys}$ and $\sim_{filter}$)

(i) We shall denote $(d_1, v_1) \sim_{sys} (d_2, v_2)$ if $\theta_{CH_n}(d_1, v_1)$ and $\theta_{CH_n}(d_2, v_2)$ are system isomorphic for $(d_1, v_1), (d_2, v_2) \in \Theta_{CH_n}$. Note that system isomorphisms for controllable/observable Hamiltonian representations are indeed equivalent, as we showed in Lemma 16.

(ii) We shall denote $(d_1, v_1) \sim_{filter} (d_2, v_2)$ if $\theta_{CH_n}(d_1, v_1)$ and $\theta_{CH_n}(d_2, v_2)$ induce the same filter for $(d_1, v_1), (d_2, v_2) \in \Theta_{CH_n}$. Note that, unlike $\sim_{sys}$, $\sim_{filter}$ is defined specifically for $\Theta_{CH_n}$, and could be different if one replaces $\Theta_{CH_n}$ with $\Theta_{OH_n}$.

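Proposition 18 Given $(d_1, v_1)$ and $(d_2, v_2)$ in $\Theta_{CH_n}$, then

(I) $\theta_{CH_n}(d_1, v_1) \sim_{sys} \theta_{CH_n}(d_2, v_2)$ if and only if $a_i(d_1) = a_i(d_2)$ and $c_i(d_1, v_1) = c_i(d_2, v_2)$ for all $i = 1, \ldots, n$. In other words, there exists a permutation matrix $P_\sigma \in M_n$ such that, for $D_i = \operatorname{diag}(d_i)$, $i \in \{1, 2\}$, and $P = \begin{pmatrix} P_\sigma & 0 \\ 0 & P_\sigma \end{pmatrix}$, the following conditions hold true:

$$\begin{pmatrix} D_2 & 0 \\ 0 & D_2 \end{pmatrix} = P \begin{pmatrix} D_1 & 0 \\ 0 & D_1 \end{pmatrix} P^\top, \qquad v_1^\top \begin{pmatrix} (F_1)_k & 0 \\ 0 & (F_1)_k \end{pmatrix} v_1 = v_2^\top \begin{pmatrix} (F_2)_k & 0 \\ 0 & (F_2)_k \end{pmatrix} v_2, \quad k = 0, \ldots, n-1.$$

The matrices $F_i$ are defined in Theorem 7.

(II) $\theta_{CH_n}(d_1, v_1) \sim_{filter} \theta_{CH_n}(d_2, v_2)$ if and only if $c_i(d_1, v_1) = c_i(d_2, v_2)$ and $e_i(d_1, v_1) = e_i(d_2, v_2)$ for all $i = 1, \ldots, n$, where the scalar functions $e_i$ are defined recursively as

$$\begin{aligned}
e_1 &= c_1, \\
e_2 &= c_3 - a_{2n-2}\, e_1, \\
e_3 &= c_5 - a_{2n-2}\, e_2 - a_{2n-4}\, e_1, \\
&\ \ \vdots \\
e_n &= c_{2n-1} - a_{2n-2}\, e_{n-1} - a_{2n-4}\, e_{n-2} - \cdots - a_2\, e_1.
\end{aligned}$$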
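The recursion for the $e_i$ can be sketched in a few lines. The code below reads the recursion with the signs $e_m = c_{2m-1} - \sum_{j=1}^{m-1} a_{2n-2j}\, e_{m-j}$, which is the form obtained when dividing the numerator of the transfer function of (12) by its denominator; under that reading, which is our interpretation and not a statement of the text, the $e_m$ are the Markov parameters $g^{ctr}_2\, (g^{ctr}_1)^{2m-2}\, (0, \ldots, 0, 1)^\top$. Function names and sample values are hypothetical.

```python
import numpy as np
from itertools import combinations

def a_coeffs(d):
    """a_0, ..., a_{2n-1} of prod_l (lam^2 + d_l^2), lowest degree first."""
    p = np.array([1.0])
    for dl in d:
        p = np.convolve(p, [1.0, 0.0, dl ** 2])
    return p[::-1][:-1]

def c_coeffs(d, v):
    """c_{2k+1} for k = 0, ..., n-1 (stored as c[k])."""
    d, v = np.asarray(d, float), np.asarray(v, float)
    n = len(d)
    w = v[:n] ** 2 + v[n:] ** 2
    c = np.zeros(n)
    for k in range(n):
        for l in range(n):
            others = [d[j] ** 2 for j in range(n) if j != l]
            c[k] += d[l] * w[l] * sum(np.prod(t) for t in combinations(others, k))
    return c

def e_recursion(a, c):
    """e_m = c_{2m-1} - sum_{j=1}^{m-1} a_{2n-2j} e_{m-j}, stored as e[m-1]."""
    n = len(c)
    e = np.zeros(n)
    for i in range(n):
        e[i] = c[i] - sum(a[2 * n - 2 * j] * e[i - j] for j in range(1, i + 1))
    return e

def markov_odd(a, c):
    """Markov parameters C A^(2m-2) b of the controllable canonical form, m = 1..n."""
    n = len(c)
    A = np.diag(np.ones(2 * n - 1), k=1)
    A[-1, :] = -a
    C = np.zeros(2 * n)
    C[2 * n - 1 - 2 * np.arange(n)] = c
    b = np.zeros(2 * n)
    b[-1] = 1.0
    return np.array([C @ np.linalg.matrix_power(A, 2 * m) @ b for m in range(n)])

# Hypothetical sample parameters
d = np.array([1.0, 2.0])
v = np.array([1.0, 0.5, -0.3, 2.0])
a, c = a_coeffs(d), c_coeffs(d, v)
e = e_recursion(a, c)
print(e, markov_odd(a, c))
```

Since $e_1 = c_1 = \sum_l d_l (v_l^2 + v_{n+l}^2)$ is strictly positive whenever $v \neq 0$, the recursion can be solved for the coefficients $a_{2n-2}, a_{2n-4}, \ldots$ once the $c_i$ and $e_i$ are known.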

Theorem 19 Given $(d_1, v_1)$ and $(d_2, v_2)$ in $\Theta^{can}_{CH_n}$, then $\theta_{CH_n}(d_1, v_1) \sim_{filter} \theta_{CH_n}(d_2, v_2)$ if and only if $\theta_{CH_n}(d_1, v_1) \sim_{sys} \theta_{CH_n}(d_2, v_2)$, that is, $\Theta^{can}_{CH_n}/\sim_{filter} = \Theta^{can}_{CH_n}/\sim_{sys}$.

Proof The first part of the statement immediately follows from Proposition 18 and the fact that $e_1 = c_1 \neq 0$, which is guaranteed by the fact that we are considering canonical systems; see the characterizations in Section 4.4.

4.2 Equivalence classes of port-Hamiltonian systems by system isomorphisms

We have seen that $PH_n/\sim_{sys}$ is not the set of port-Hamiltonian filters due to the presence of non-canonical systems and possible initialization inconsistencies. However, it is still informative to study the quotient space $PH_n/\sim_{sys}$ because when restricted to the canonical systems, $PH^{can}_n/\sim_{sys}$ uniquely identifies canonical port-Hamiltonian dynamics up to initializations. Furthermore, $PH^{can}_n/\sim_{sys}$ is isomorphic to $\Theta^{can}_{CH_n}/\sim_{filter}$. In other words, $PH^{can}_n/\sim_{sys}$ uniquely identifies the set of canonical controllable Hamiltonian representations $CH^{can}_n$. We shall make this point clearer in Sections 4.5 and 4.6. In this section, we introduce a manageable characterization of the quotient space $PH_n/\sim_{sys}$ by using parameter spaces. First, motivated by Williamson's theorem, we consider the space $\Theta_{CH_n}$ defined before as the set of all pairs of the form $(d, v)$, where $d = (d_1, d_2, \ldots, d_n)^\top$ with $d_i > 0$, and $v = (v_1, v_2, \ldots, v_{2n})^\top \in \mathbb{R}^{2n}$. Inspired by the representation results, we now define an equivalence relation $\sim_\star$ on $\Theta_{CH_n}$ whose equivalence classes are denoted by $[d, v]_\star$. The importance of the next definition is that, as we shall prove in Theorem 22, the relation $\sim_\star$ on $\Theta_{CH_n}$ plays the same role as $\sim_{sys}$ on $PH_n$.

Definition 20 The pairs $(d_1, v_1)$ and $(d_2, v_2)$ in $\Theta_{CH_n}$ are $\sim_\star$-equivalent, that is, $(d_1, v_1) \sim_\star (d_2, v_2)$, if there exists a permutation matrix $P_\sigma \in M_n$ and an invertible matrix $A$ such that, for $D_i = \operatorname{diag}(d_i)$, $i \in \{1, 2\}$, and $P = \begin{pmatrix} P_\sigma & 0 \\ 0 & P_\sigma \end{pmatrix}$, the following conditions hold

(iv) v2 = PAv1.

Proposition 21 The relation $\sim_\star$ defined in Definition 20 is an equivalence relation on $\Theta_{CH_n}$.

In the next subsection, we shall give meaning to $\sim_\star$ in terms of groupoid orbits. Now, we aim to characterize the $\sim_{sys}$ equivalence relation on $PH_n$ as the equivalence relation $\sim_\star$ on the space $\Theta_{CH_n}$ of $(d, v)$-pairs, that is, we shall prove that $\Theta_{CH_n}/\sim_\star = PH_n/\sim_{sys}$. This will be proved in three steps. First, we show that for an arbitrary $S \in \operatorname{Sp}(2n, \mathbb{R})$, the map $\phi_S$ defined in Theorem 7 composed with $\theta_{CH_n}$ is compatible with the equivalence relations $\sim_\star$ and $\sim_{sys}$, that is, $(d_1, v_1) \sim_\star (d_2, v_2)$ if and only if $\phi_S(\theta_{CH_n}(d_1, v_1)) \sim_{sys} \phi_S(\theta_{CH_n}(d_2, v_2))$. Then, we show that the unique map $\psi_S$ induced by $\phi_S \circ \theta_{CH_n}$ on the quotient spaces does not depend on the choice of $S$. Finally, we conclude that the family of maps $\psi_S$ parameterized by $S \in \operatorname{Sp}(2n, \mathbb{R})$ induces a unique map $\Phi : \Theta_{CH_n}/\sim_\star \to PH_n/\sim_{sys}$ which is a homeomorphism.

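Theorem 22 (Characterization of $PH_n/\sim_{sys}$ as $\Theta_{CH_n}/\sim_\star$) Given any arbitrary $S \in \operatorname{Sp}(2n, \mathbb{R})$, the map $\phi_S \circ \theta_{CH_n}$ induces on the quotient spaces a map $\Phi : \Theta_{CH_n}/\sim_\star \to PH_n/\sim_{sys}$ which does not depend on $S \in \operatorname{Sp}(2n, \mathbb{R})$ and is given by

$$\Phi([d, v]_\star) = \left[ S^\top \begin{pmatrix} D & 0 \\ 0 & D \end{pmatrix} S,\ S^{-1} v \right],$$

where $D = \operatorname{diag}(d)$. Moreover, $\Phi$ is a homeomorphism with respect to the quotient topologies.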

4.3 The quotient spaces as groupoid orbit spaces

Recall that from a category theory point of view, a group can be seen as a category with a single object where all morphisms are invertible. Groupoids are a natural generalization of this notion and refer to categories with possibly more than one object, where again all morphisms are invertible (see Mackenzie (2005) for a comprehensive introduction). As is customary, groupoids will be denoted with the symbol $\alpha, \beta : G \rightrightarrows M$ (or simply $G \rightrightarrows M$), where $\alpha$ and $\beta$ are the target and the source maps, respectively. Given $m \in M$, the groupoid orbit that contains this point is given by $\mathcal{O}_m = \alpha(\beta^{-1}(m)) \subset M$. The orbit space associated to $G \rightrightarrows M$ is denoted by $M/G$.


In this section, we provide an alternative point of view for Theorem 22 in terms of groupoid orbits. More precisely, we show first that the set of equivalence classes $PH_n/\sim_{sys}$ (resp. $\Theta_{CH_n}/\sim_\star$) is the orbit space $\Theta_{PH_n}/G_n$ (resp. $\Theta_{CH_n}/H_n$) of a groupoid $G_n \rightrightarrows \Theta_{PH_n}$ (resp. $H_n \rightrightarrows \Theta_{CH_n}$) which we construct in the following paragraphs. In a second step, we show that the statement in Theorem 22 is equivalent to saying that the orbit spaces $PH_n/\sim_{sys}$ and $\Theta_{CH_n}/H_n$ of the two groupoids coincide.

Definition 23

1. Let $G_n := \{(L, (Q, B)) \mid L \in \operatorname{GL}(2n, \mathbb{R}),\ (Q, B) \in \Theta_{PH_n}$ such that (i) $J^\top L J Q L^{-1}$ is symmetric positive-definite and (ii) $B = J^\top L^\top J L B\}$.

2. Let the target and source maps $\alpha, \beta : G_n \to \Theta_{PH_n}$ be defined as $\alpha(L, (Q, B)) := (J^\top L J Q L^{-1}, LB)$ and $\beta(L, (Q, B)) := (Q, B)$.

3. Define the set of composable pairs as $G^{(2)}_n := \{((L_1, (Q_1, B_1)), (L_2, (Q_2, B_2))) \mid \beta((L_1, (Q_1, B_1))) = \alpha((L_2, (Q_2, B_2)))\}$.

4. Let the multiplication map $m : G^{(2)}_n \to G_n$ be defined as $m((L_1, (Q_1, B_1)), (L_2, (Q_2, B_2))) = (L_1 L_2, (Q_2, B_2))$.

5. Let the identity section $\epsilon : \Theta_{PH_n} \to G_n$ be defined as $\epsilon(Q, B) := (I_{2n}, (Q, B))$.

6. Let the inversion map $i : G_n \to G_n$ be defined as $i(L, (Q, B)) := (L^{-1}, (J^\top L J Q L^{-1}, LB))$.

Proposition 24 The definition above determines a Lie groupoid $G_n \rightrightarrows \Theta_{PH_n}$ with $G_n$ the total space, $\Theta_{PH_n}$ the base space, and structure maps $\alpha, \beta, m, \epsilon, i$. We refer to $G_n \rightrightarrows \Theta_{PH_n}$ as the port-Hamiltonian groupoid. The orbit space of this groupoid $\Theta_{PH_n}/G_n$ coincides with $PH_n/\sim_{sys}$.

Deﬁnition 25

1. Let Hn :=

((Pσ, A), (d, v)) |Pσ Mn is a permutation matrix, A GL(2n, R),

(d, v) ΘCHn, such that (i) AT

A, where D = diag(d)

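2. Let the target and source maps $\alpha, \beta : H_n \to \Theta_{CH_n}$ be defined as $\alpha((P_\sigma, A), (d, v)) := (d, v)$ and $\beta((P_\sigma, A), (d, v)) := (P_\sigma d, PAv)$, where $P = \begin{pmatrix} P_\sigma & 0 \\ 0 & P_\sigma \end{pmatrix}$.

3. Define the set of composable pairs as

$$H^{(2)}_n := \left\{ \Big( ((P_{\sigma,1}, A_1), (d_1, v_1)),\ ((P_{\sigma,2}, A_2), (d_2, v_2)) \Big) \ \Big|\ \beta((P_{\sigma,2}, A_2), (d_2, v_2)) = \alpha((P_{\sigma,1}, A_1), (d_1, v_1)) \right\}.$$

4. Let the multiplication map $m : H^{(2)}_n \to H_n$ be defined as

$$m\Big( ((P_{\sigma,1}, A_1), (d_1, v_1)),\ ((P_{\sigma,2}, A_2), (d_2, v_2)) \Big) = \big( (P_{\sigma,2} P_{\sigma,1},\ P_{\sigma,1}^\top A_2 P_{\sigma,1} A_1),\ (d_1, v_1) \big).$$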


5. Let the identity section $\epsilon : \Theta_{CH_n} \to H_n$ be defined as $\epsilon(d, v) := ((I_n, I_{2n}), (d, v))$.

6. Let the inversion map $i : H_n \to H_n$ be defined as

$$i((P_\sigma, A), (d, v)) := \big( (P_\sigma^\top,\ P_\sigma A^{-1} P_\sigma^\top),\ (P_\sigma d, PAv) \big).$$

Proposition 26 The definition above determines a Lie groupoid $H_n \rightrightarrows \Theta_{CH_n}$ with $H_n$ the total space, $\Theta_{CH_n}$ the base space, and structure maps $\alpha, \beta, m, \epsilon, i$. We refer to $H_n \rightrightarrows \Theta_{CH_n}$ as the reduced port-Hamiltonian groupoid. The orbit space of this groupoid $\Theta_{CH_n}/H_n$ coincides with $\Theta_{CH_n}/\sim_\star$.

Theorem 22 can now be restated in terms of the elements that we just introduced.

Theorem 27 The orbit spaces of the Lie groupoids $G_n \rightrightarrows \Theta_{PH_n}$ and $H_n \rightrightarrows \Theta_{CH_n}$ are isomorphic.

4.4 Characterization of canonical port-Hamiltonian systems

In Subsections 4.2 and 4.3 we have provided a characterization of $PH_n/\sim_{sys}$ in terms of $\Theta_{CH_n}/\sim_\star$ and groupoid orbit spaces. Recall from Subsection 4.1 that the difficulty of the unique identifiability of filters in $\mathcal{PH}_n$ comes from two sources: the possible presence of non-canonical systems, and the possible initialization inconsistency. We have shown in Subsection 4.1 that, by restricting to canonical systems, the filters induced by controllable Hamiltonian representations $CH^{can}_n$ can be uniquely identified, even though we still cannot do the same for $PH^{can}_n$. Hence, it is worth studying what the quotient spaces above look like when restricted to the subset that contains only canonical port-Hamiltonian systems. In this section, we take a step in that direction.

Recall that a port-Hamiltonian system in $PH_n$ of the form (5) is controllable (or equivalently, observable/canonical) if and only if

$$\operatorname{rank}\left( B \mid JQB \mid \cdots \mid (JQ)^{2n-1}B \right) = 2n. \tag{16}$$

Using the Williamson decomposition of $Q$ into $D$ and $S$, and $v := SB$, this is equivalent to

$$\det\left( v \;\Big|\; J\begin{pmatrix} D & 0 \\ 0 & D \end{pmatrix} v \;\Big|\; \cdots \;\Big|\; \left(J\begin{pmatrix} D & 0 \\ 0 & D \end{pmatrix}\right)^{2n-1} v \right) \neq 0. \tag{17}$$

By definition, we have that $PH^{can}_n$ (respectively, $\Theta^{can}_{CH_n}$) is a subset of $PH_n$ (respectively, $\Theta_{CH_n}$) made of systems that satisfy (16) (respectively, (17)). We now characterize the space of pairs $(d, v) \in \Theta_{CH_n}$ that correspond to canonical port-Hamiltonian systems in normal form. The calculation of the determinant in (17) yields

$$\prod_{l=1}^{n} d_l \left(v_l^2 + v_{n+l}^2\right) \prod_{1 \leq j < k \leq n} (d_j + d_k)^2 (d_j - d_k)^2$$

up to the sign. Therefore,

$$\Theta^{can}_{CH_n} = \left\{ (d, v) \in \Theta_{CH_n} \mid \text{entries of } d \text{ are distinct and } v_l^2 + v_{n+l}^2 > 0 \text{ for } l \in \{1, \ldots, n\} \right\}.$$

We shall refer to the statement on the entries of $d$ being all different as the non-resonance condition and to $v_l^2 + v_{n+l}^2 > 0$ for all $l \in \{1, \ldots, n\}$ as the nondegeneracy condition. There might be a concern about whether different choices of the matrix $S$ lead to different vectors $v$ and hence the notion of nondegeneracy would be ill-defined. This is indeed not a problem since, as we show in Remark 28 below, once the non-resonance condition is assumed, different vectors $v$ are obtained by rotating the planes spanned by each and every pair of $l$-th and $(n + l)$-th entries, which preserves the value of $v_l^2 + v_{n+l}^2$. Thus, the nondegeneracy condition is actually based on the non-resonance condition.
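The non-resonance and nondegeneracy conditions, and the product formula for the determinant, can be compared numerically with the rank of the controllability matrix. The sketch below assumes $J = \begin{pmatrix} 0 & I_n \\ -I_n & 0 \end{pmatrix}$ and works directly in Williamson coordinates, that is, with the pair $\left(J \operatorname{diag}(D, D),\ v\right)$; the function names, tolerances and sample values are hypothetical.

```python
import numpy as np

def ctrb(A, b):
    """Controllability matrix (b | A b | ... | A^(N-1) b)."""
    cols = [np.asarray(b, float)]
    for _ in range(len(b) - 1):
        cols.append(A @ cols[-1])
    return np.column_stack(cols)

def williamson_pair(d, v):
    """The pair (J diag(D, D), v) associated with parameters (d, v)."""
    n = len(d)
    J = np.block([[np.zeros((n, n)), np.eye(n)], [-np.eye(n), np.zeros((n, n))]])
    return J @ np.diag(np.concatenate([d, d])), np.asarray(v, float)

def is_canonical(d, v, tol=1e-12):
    """Non-resonance (distinct d_l) and nondegeneracy (v_l^2 + v_{n+l}^2 > 0)."""
    d, v = np.asarray(d, float), np.asarray(v, float)
    n = len(d)
    gaps = np.abs(d[:, None] - d[None, :])[~np.eye(n, dtype=bool)]
    radii2 = v[:n] ** 2 + v[n:] ** 2
    return bool(np.all(gaps > tol) and np.all(radii2 > tol))

def det_formula(d, v):
    """prod_l d_l (v_l^2 + v_{n+l}^2) * prod_{j<k} (d_j + d_k)^2 (d_j - d_k)^2."""
    d, v = np.asarray(d, float), np.asarray(v, float)
    n = len(d)
    val = np.prod(d * (v[:n] ** 2 + v[n:] ** 2))
    for j in range(n):
        for k in range(j + 1, n):
            val *= (d[j] + d[k]) ** 2 * (d[j] - d[k]) ** 2
    return val

# Hypothetical sample parameters
d = np.array([1.0, 2.0])
v = np.array([1.0, 0.5, -0.3, 2.0])
C = ctrb(*williamson_pair(d, v))
print(is_canonical(d, v), abs(np.linalg.det(C)), det_formula(d, v))

# Two ways to leave the canonical set: a resonance, or a vanishing plane of v
d_res = np.array([1.0, 1.0])
v_deg = np.array([1.0, 0.0, -0.3, 0.0])
rank_res = np.linalg.matrix_rank(ctrb(*williamson_pair(d_res, v)))
rank_deg = np.linalg.matrix_rank(ctrb(*williamson_pair(d, v_deg)))
print(rank_res, rank_deg)
```

Checking the two explicit conditions is numerically more robust than thresholding the determinant, whose magnitude scales with high powers of the entries of $d$.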

Remark 28 (Williamson's decomposition in the canonical case) We have mentioned in Theorem 3 (iii) that two symplectic matrices $S$ and $S'$ that Williamson decompose the same $Q$ differ by a unitary matrix. We now note that for an element $Q$ that satisfies the non-resonance condition, $S$ and $S'$ do not only differ by an arbitrary $U \in U(n)$ (see (11) for the definition of $U(n)$) but by a special one $R$ that has the form

$$R = \begin{pmatrix} \operatorname{diag}(\cos\theta_1, \ldots, \cos\theta_n) & -\operatorname{diag}(\sin\theta_1, \ldots, \sin\theta_n) \\ \operatorname{diag}(\sin\theta_1, \ldots, \sin\theta_n) & \operatorname{diag}(\cos\theta_1, \ldots, \cos\theta_n) \end{pmatrix}.$$

This fact accounts for part of the symmetry that we shall spell out later on. The proof of this fact is purely computational: the assumption that the diagonal entries of $D$ are all positive and distinct, together with the fact that $U$ satisfies the equation $U^\top \begin{pmatrix} D & 0 \\ 0 & D \end{pmatrix} U = \begin{pmatrix} D & 0 \\ 0 & D \end{pmatrix}$ and, at the same time, $U \in U(n) = \operatorname{SO}(2n, \mathbb{R}) \cap \operatorname{Sp}(2n, \mathbb{R})$, guarantees the claim.

Remark 29 (Being canonical is a generic property) It is well-known that the set of canonical systems, as a subset of all linear systems, corresponds to a Zariski open set, which is open and dense in the usual topology (Tcho (1983)). In particular, this also holds for linear port-Hamiltonian systems. Therefore, $PH^{can}_n$ is open and dense in $PH_n$. On the other hand, using the characterization provided above, it is clear that $\Theta^{can}_{CH_n}$ is also open and dense in $\Theta_{CH_n}$.

The isomorphism in Theorem 22 naturally restricts to canonical subsets, that is, $PH^{can}_n/\sim_{sys} = \Theta^{can}_{CH_n}/\sim_\star$. On the other hand, we will see below another isomorphism result involving $PH^{can}_n/\sim_{sys}$.

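Proposition 30 (Characterization of $PH^{can}_n/\sim_{sys}$ as $\Theta^{can}_{CH_n}/\sim_{sys}$) The map $\Phi : \Theta^{can}_{CH_n}/\sim_{sys} \to PH^{can}_n/\sim_{sys}$ defined by

$$\Phi([d, v]_{sys}) = \left[\begin{pmatrix} D & 0 \\ 0 & D \end{pmatrix}, v\right]_{sys},$$

where $D = \mathrm{diag}(d)$, is an isomorphism.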

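We just proved that both $\Theta^{can}_{CH_n}/\sim_\star$ and $\Theta^{can}_{CH_n}/\sim_{sys}$ are isomorphic to $PH^{can}_n/\sim_{sys}$, and even via the same isomorphism $\Phi$. Therefore, the equivalence relations $\sim_\star$ and $\sim_{sys}$ coincide when restricted to $\Theta^{can}_{CH_n}$. To summarize, we have proved in this subsection that

$$PH^{can}_n/\sim_{sys} \;\cong\; \Theta^{can}_{CH_n}/\sim_\star \;=\; \Theta^{can}_{CH_n}/\sim_{sys}.$$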

In the next subsection, we continue the investigation of the above chain of isomorphisms.


4.5 The unique identifiability space for canonical port-Hamiltonian systems as a group orbit space

In Subsection 4.3, it is proved that the quotient space $PH_n/\sim_{sys}$ can be treated as a Lie groupoid orbit space. We now show that the quotient space restricted to canonical port-Hamiltonian systems, that is, $PH^{can}_n/\sim_{sys}$, is isomorphic to the orbit space of a certain group action on $\Theta^{can}_{CH_n}$, where the group is a semi-direct product of the $n$-permutation group and the $n$-torus, that is, $S_n \ltimes_\phi \mathbb{T}^n$. The intuition behind this fact is that restricting to the subset of canonical systems $PH^{can}_n$ removes the degeneracies in $PH_n$, which allows us to reduce the symmetry of the Lie groupoid $G_n \rightrightarrows \Theta_{PH_n}$ to that of the Lie group $S_n \ltimes_\phi \mathbb{T}^n$.

We start by defining the group action. First, let the permutation group $S_n$ act on $\mathbb{R}^n$ by permuting the entries $d_i$ of the vector $d \in \mathbb{R}^n$. For each $i \in \{1, \dots, n\}$ the circle $S^1$ acts on the plane spanned by the $i$-th and $(n+i)$-th entries of $v$ by rotations. More precisely, we define the action of $\sigma \in S_n$ on elements $d$ and $v$ as

$$\Gamma_\sigma\left((d_1, \dots, d_n)^\top\right) = (d_{\sigma(1)}, \dots, d_{\sigma(n)})^\top = P_\sigma (d_1, \dots, d_n)^\top,$$

where $P_\sigma$ is the corresponding permutation matrix and

$$\Gamma_\sigma\left((v_1, \dots, v_{2n})^\top\right) = (v_{\sigma(1)}, \dots, v_{\sigma(n)}, v_{n+\sigma(1)}, \dots, v_{n+\sigma(n)})^\top = \begin{pmatrix} P_\sigma & 0 \\ 0 & P_\sigma \end{pmatrix} (v_1, \dots, v_{2n})^\top,$$

respectively. Then the $\sigma$-action on a pair $(d, v)$ is understood as acting on $d$ and $v$ simultaneously. We also define the action of the $i$-th circle of the torus $\mathbb{T}^n$ as the planar rotation of the space spanned by the $i$-th and $(n+i)$-th entries of $v$. This torus action is understood to leave $d$ invariant. More concretely, it is the action

$$\Gamma_{\theta_i}\left((d_1, \dots, d_n, v_1, \dots, v_{2n})^\top\right) = (d_1, \dots, d_n, v_1, \dots, v_{i-1}, \cos\theta_i\, v_i - \sin\theta_i\, v_{n+i}, v_{i+1}, \dots, v_n, v_{n+1}, \dots, v_{n+i-1}, \sin\theta_i\, v_i + \cos\theta_i\, v_{n+i}, v_{n+i+1}, \dots, v_{2n})^\top.$$

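With these actions of the groups $S_n$ and $\mathbb{T}^n$ on $\Theta_{CH_n}$ we define the map $\Gamma_{(\sigma, (\theta_1, \dots, \theta_n)^\top)} : (\mathbb{R}^n_+ \times \mathbb{R}^{2n}) \to (\mathbb{R}^n_+ \times \mathbb{R}^{2n})$ by

$$\Gamma_{(\sigma, (\theta_1, \dots, \theta_n)^\top)}(d, v) = \Gamma_{\theta_1} \circ \cdots \circ \Gamma_{\theta_n} \circ \Gamma_\sigma (d, v) = \left(P_\sigma d,\ \Gamma_{\theta_1} \circ \cdots \circ \Gamma_{\theta_n}(P v)\right) = (P_\sigma d,\ R P v), \tag{19}$$

which constitutes an action of the semi-direct product group $S_n \ltimes_\phi \mathbb{T}^n$, where $\phi : S_n \to \mathrm{Aut}(\mathbb{T}^n)$ is given by the permutation $\phi(\sigma)((\theta_1, \dots, \theta_n)^\top) = P_\sigma (\theta_1, \dots, \theta_n)^\top$. Note that the matrix of $\Gamma_{\theta_1} \circ \cdots \circ \Gamma_{\theta_n}$ is given by $R$ in (18), $P_\sigma$ is the permutation matrix that corresponds to $\sigma \in S_n$, and $P = \begin{pmatrix} P_\sigma & 0 \\ 0 & P_\sigma \end{pmatrix}$.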

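Proposition 31 The map $\Gamma_{(\sigma, (\theta_1, \dots, \theta_n)^\top)}$ defined in (19) for $\sigma \in S_n$ and $(\theta_1, \dots, \theta_n)^\top \in \mathbb{T}^n$ is a left group action of $S_n \ltimes_\phi \mathbb{T}^n$ on $\Theta_{CH_n}$.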
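The action in (19) can be made concrete with a small numerical sketch. The function names `act` and `radii` and the sample values are illustrative assumptions, not part of the paper; the permutation is stored 0-based, and the rotation sign convention follows the torus action written above. The sketch applies a permutation followed by planar rotations and checks that the quantities $v_i^2 + v_{n+i}^2$ are only permuted, never changed.

```python
import numpy as np

def act(sigma, theta, d, v):
    """Apply (sigma, theta) to (d, v), returning (P_sigma d, R P v) as in (19)."""
    sigma = np.asarray(sigma)              # 0-based: sigma[i] stands for sigma(i+1) - 1
    theta = np.asarray(theta, dtype=float)
    d = np.asarray(d, dtype=float)
    v = np.asarray(v, dtype=float)
    n = d.size
    # P v: permute the first and second halves of v with the same permutation
    a, b = v[:n][sigma], v[n:][sigma]
    # R: rotate each plane spanned by the i-th and (n+i)-th entries by theta_i
    c, s = np.cos(theta), np.sin(theta)
    v_new = np.concatenate([c * a - s * b, s * a + c * b])
    return d[sigma], v_new

def radii(v):
    """Return the vector (v_i^2 + v_{n+i}^2)_{i=1..n}."""
    v = np.asarray(v, dtype=float)
    n = v.size // 2
    return v[:n] ** 2 + v[n:] ** 2

# Illustrative data (assumed values)
d = np.array([3.0, 1.0, 2.0])
v = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
sigma = np.array([2, 0, 1])
theta = np.array([0.3, -1.2, 2.0])

d2, v2 = act(sigma, theta, d, v)
same_orbit_invariants = np.allclose(radii(v2), radii(v)[sigma])
print(d2, same_orbit_invariants)
```

The check mirrors the orbit conditions: the entries of $d$ are permuted, and the squared radii of $v$ are permuted in the same way.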


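Using the definition of the $(S_n \ltimes_\phi \mathbb{T}^n)$-action on $\Theta_{CH_n}$, two elements $(d_1, v_1), (d_2, v_2) \in \Theta_{CH_n}$ are in the same orbit if and only if the following conditions hold true for some $\sigma \in S_n$:

(i) $d_{2,i} = d_{1,\sigma(i)}$,

(ii) $v_{2,i}^2 + v_{2,n+i}^2 = v_{1,\sigma(i)}^2 + v_{1,n+\sigma(i)}^2$, $i = 1, \dots, n$.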

By Proposition 18 (I) parts (i) and (ii), it can be seen that there could be a close relation between the $(S_n \ltimes_\phi \mathbb{T}^n)$-action and the equivalence relation $\sim_{sys}$ on $\Theta_{CH_n}$. The next proposition demonstrates that the orbits of the $(S_n \ltimes_\phi \mathbb{T}^n)$-action coincide with the equivalence classes of the relation $\sim_{sys}$ when we restrict our attention to the subset $\Theta^{can}_{CH_n}$.

Proposition 32 (Characterization of $\Theta^{can}_{CH_n}/\sim_{sys}$ as $\Theta^{can}_{CH_n}/(S_n \ltimes_\phi \mathbb{T}^n)$) Given $(d_1, v_1)$ and $(d_2, v_2)$ in $\Theta^{can}_{CH_n}$, then $(d_1, v_1) \sim_{sys} (d_2, v_2)$ if and only if $(d_1, v_1)$ and $(d_2, v_2)$ lie in the same orbit of the $(S_n \ltimes_\phi \mathbb{T}^n)$-action.

4.6 Global Euclidean coordinates for the unique identifiability space of canonical port-Hamiltonian systems

Recall from Section 4.4 that $\Theta^{can}_{CH_n}$ contains pairs $(d, v)$ where $d \in \mathbb{R}^n_+$ and $v \in \mathbb{R}^{2n}$ are such that the entries $d_l$ are all distinct and $v_l^2 + v_{n+l}^2 > 0$ for all $l = 1, \dots, n$. We define for convenience a function $R : \mathbb{R}^{2n} \to \mathbb{R}^n_{\geq 0}$ as $R((v_1, \dots, v_{2n})^\top) = (v_1^2 + v_{n+1}^2, \dots, v_n^2 + v_{2n}^2)^\top$. Now observe that the quotient space $\Theta^{can}_{CH_n}/(S_n \ltimes_\phi \mathbb{T}^n)$ naturally has a smooth manifold structure. We briefly prove this in the following lines. Note that the torus $\mathbb{T}^n$ is a connected abelian compact Lie group. The symmetric group $S_n$ is a finite group, and hence compact as well. Thus, it is easy to see that the semi-direct product $S_n \ltimes_\phi \mathbb{T}^n$ is also a compact Lie group, and hence its action on $\Theta^{can}_{CH_n}$ is automatically proper. On the other hand, since $\Theta^{can}_{CH_n}$ is the space of pairs $(d, v)$ such that $d$ contains distinct entries and $R(v)_l > 0$ for $l = 1, \dots, n$, the only element in $S_n \ltimes_\phi \mathbb{T}^n$ that can keep any element of $\Theta^{can}_{CH_n}$ invariant is the identity, which implies that the $(S_n \ltimes_\phi \mathbb{T}^n)$-action on $\Theta^{can}_{CH_n}$ is free. Classical results in Lie theory (Ortega and Ratiu, 2004, Proposition 2.3.8) guarantee that $\Theta^{can}_{CH_n}/(S_n \ltimes_\phi \mathbb{T}^n)$ admits a unique smooth structure such that the quotient map $\pi : \Theta^{can}_{CH_n} \to \Theta^{can}_{CH_n}/(S_n \ltimes_\phi \mathbb{T}^n)$ is a submersion. With this as a motivation, we now describe the quotient space explicitly.

For a fixed $d$, we denote by $d^\uparrow$ the reordered vector constructed out of $d$ by placing the entries in increasing order. Denote by $\mathbb{R}^{n\uparrow}_+$ the set of $d \in \mathbb{R}^n_+$ with distinct positive entries in increasing order. We then have the following proposition, which explicitly characterizes the quotient space $\Theta^{can}_{CH_n}/(S_n \ltimes_\phi \mathbb{T}^n)$.

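Proposition 33 (Global Euclidean coordinates for the orbit space $\Theta^{can}_{CH_n}/(S_n \ltimes_\phi \mathbb{T}^n)$) The map $f : \Theta^{can}_{CH_n}/(S_n \ltimes_\phi \mathbb{T}^n) \to \mathbb{R}^{n\uparrow}_+ \times \mathbb{R}^n_+$ defined by $f([d, v]) = (d^\uparrow, R(\Gamma_\sigma(v)))$, where $\sigma \in S_n$ is the unique permutation such that $\Gamma_\sigma(d) = d^\uparrow$, is an isomorphism.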
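As a hedged illustration of the coordinates in Proposition 33, the following sketch computes the sorted vector of symplectic eigenvalues together with the correspondingly reordered squared radii $R(v)$, and checks that two representatives of the same orbit are sent to the same point. The helper name `global_coordinates` and the sample numbers are assumptions made for this example only.

```python
import numpy as np

def global_coordinates(d, v):
    """Map a pair (d, v) with distinct d_i to (sorted d, R(v) in the same order)."""
    d = np.asarray(d, dtype=float)
    v = np.asarray(v, dtype=float)
    n = d.size
    order = np.argsort(d)               # the unique permutation sorting d increasingly
    r = v[:n] ** 2 + v[n:] ** 2         # R(v) = (v_l^2 + v_{n+l}^2)_l
    return d[order], r[order]

# A representative (assumed values) and a second one in the same orbit
d = np.array([3.0, 1.0, 2.0])
v = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
sigma = np.array([2, 0, 1])             # 0-based permutation
theta = np.array([0.3, -1.2, 2.0])      # rotation angles

a, b = v[:3][sigma], v[3:][sigma]
v_moved = np.concatenate([np.cos(theta) * a - np.sin(theta) * b,
                          np.sin(theta) * a + np.cos(theta) * b])
d_moved = d[sigma]

coords = global_coordinates(d, v)
coords_moved = global_coordinates(d_moved, v_moved)
print(coords)
print(coords_moved)
```

Both calls return the same pair, which is what makes these $2n$ numbers usable as unconstrained learning parameters (up to the ordering and positivity constraints).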

5. Linear port-Hamiltonian systems in normal form are restrictions of higher dimensional ones

In this section, we prove a theorem (Theorem 34), inspired by the classical Kalman Decomposition (Jacob and Zwart (2012)), which says that the filter induced by any $(Q, B) \in PH_n$ can be regarded as that induced by some $(Q', B') \in PH_m$, where $m$ can be any integer that is at least $n$. The motivation for these considerations is that in many practical situations in which an input/output system has to be learned, the dimension of the underlying state-space system is not known. In that situation, we may want the flexibility of considering the actual system that needs to be learned as a lower-dimensional restriction of a much larger-dimensional one that we have picked for the learning task.

We shall carry this out by producing an explicit injective system morphism between the state space of $(Q, B)$ and that of $(Q', B')$ in Theorem 34. In Proposition 35, we show that the quotient space $PH_n/\sim_{sys}$ can be characterized as $PH_{m,n}/\sim_{sys}$, where $PH_{m,n} \subset PH_m$ is the space containing all the systems of the form $(Q', B')$. Motivated by the developments in Section 4, we then characterize the pair $(d', v')$ that corresponds to $(Q', B')$ in Proposition 36. Finally, in Proposition 37, we show that the isomorphism $PH_n/\sim_{sys} \cong \Theta_{CH_n}/\sim_\star$ can be lifted to higher dimension as well. We shall comment further at the end of this section on the significance of these results in the context of machine learning.

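The following theorem states that the filter induced by $(Q, B) \in PH_n$ can be reproduced using systems in an arbitrarily higher dimension.

Theorem 34 Given any system $(Q, B) \in PH_n$, then

(i) For any $m \geq n$, there exists an orthogonal matrix $O \in O(2m, \mathbb{R})$ such that the filter induced by

$$(Q', B') = \left(O \begin{pmatrix} Q & 0 \\ 0 & I_{2m-2n} \end{pmatrix} O^\top,\ O \begin{pmatrix} B \\ 0 \end{pmatrix}\right) \in PH_m$$

coincides with that induced by $(Q, B)$.

(ii) The map $f : \mathbb{R}^{2n} \to \mathbb{R}^{2m}$ defined by $f(z) = O \begin{pmatrix} z \\ 0 \end{pmatrix}$ is an injective system morphism between the state spaces of $(Q, B)$ and $(Q', B')$.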

As can be seen in the proof (included in Appendix 9.10), the matrix $O \in O(2m, \mathbb{R})$ above is constructed so that

$$O \begin{pmatrix} J_n & 0 \\ 0 & J_{m-n} \end{pmatrix} O^\top = J_m. \tag{20}$$
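One matrix that satisfies (20) is a permutation matrix that interleaves the original and the added coordinates. The sketch below builds such a matrix and verifies (20) numerically. It is an illustration under assumptions: the sign convention $J_k = \begin{pmatrix} 0 & I_k \\ -I_k & 0 \end{pmatrix}$ and the helper names `canonical_J`, `block_diag2` and `embedding_matrix` are choices made here, and the paper's proof may construct $O$ differently.

```python
import numpy as np

def canonical_J(k):
    """Canonical symplectic matrix of size 2k (assumed convention [[0, I], [-I, 0]])."""
    J = np.zeros((2 * k, 2 * k))
    J[:k, k:] = np.eye(k)
    J[k:, :k] = -np.eye(k)
    return J

def block_diag2(A, B):
    """Block-diagonal matrix diag(A, B)."""
    out = np.zeros((A.shape[0] + B.shape[0], A.shape[1] + B.shape[1]))
    out[:A.shape[0], :A.shape[1]] = A
    out[A.shape[0]:, A.shape[1]:] = B
    return out

def embedding_matrix(n, m):
    """Permutation matrix O with O diag(J_n, J_{m-n}) O^T = J_m."""
    k = m - n
    # Target slot of each coordinate of (q, p, q_extra, p_extra)
    target = np.concatenate([
        np.arange(n),            # q_1..q_n         -> first n position slots
        m + np.arange(n),        # p_1..p_n         -> first n momentum slots
        n + np.arange(k),        # added positions  -> remaining position slots
        m + n + np.arange(k),    # added momenta    -> remaining momentum slots
    ])
    O = np.zeros((2 * m, 2 * m))
    O[target, np.arange(2 * m)] = 1.0
    return O

n, m = 2, 5
O = embedding_matrix(n, m)
lhs = O @ block_diag2(canonical_J(n), canonical_J(m - n)) @ O.T
print(np.allclose(lhs, canonical_J(m)), np.allclose(O @ O.T, np.eye(2 * m)))
```

Because $O$ is a permutation matrix, it is orthogonal, and conjugating by it only relabels coordinates, which is why the extended system keeps the normal form.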

From now on, we denote by $PH_{m,n} \subset PH_m$ the space of linear port-Hamiltonian systems parametrized by pairs $(Q', B')$ of the form

$$(Q', B') = \left(O \begin{pmatrix} Q & 0 \\ 0 & I_{2m-2n} \end{pmatrix} O^\top,\ O \begin{pmatrix} B \\ 0 \end{pmatrix}\right),$$

where $(Q, B) \in PH_n$ and $O \in O(2m, \mathbb{R})$ satisfies (20), and equip it with the system automorphism relation $\sim_{sys}$ defined on $PH_m$. The following proposition states that, up to system isomorphism, $PH_n$ is indeed the same as $PH_{m,n}$. This means that, with appropriate initialization, we can exactly reproduce the input/output dynamics of $2n$-dimensional port-Hamiltonian systems in higher dimension by simply considering the elements $(Q', B')$ in $PH_{m,n}$.

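Proposition 35 The function $f : PH_n/\sim_{sys} \to PH_{m,n}/\sim_{sys}$ defined by

$$f([Q, B]_{sys}) = \left[O \begin{pmatrix} Q & 0 \\ 0 & I_{2m-2n} \end{pmatrix} O^\top,\ O \begin{pmatrix} B \\ 0 \end{pmatrix}\right]_{sys}$$

is an isomorphism, where $O \in O(2m, \mathbb{R})$ is as in Theorem 34 and hence satisfies (20).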


Recall that for a system $(Q, B) \in PH_n$, we derive the corresponding object $(d, v) \in \Theta_{CH_n}$ from Williamson's decomposition $Q = S^\top \begin{pmatrix} D & 0 \\ 0 & D \end{pmatrix} S$ and $v = S B$. We have seen that $(Q', B') \in PH_{m,n} \subset PH_m$ is also a linear port-Hamiltonian system in normal form. Therefore, it makes sense to investigate the relation between $(d, v)$ and the element $(d', v')$ which corresponds to $(Q', B')$. The following proposition asserts that $d'$ can be obtained from $d$ by padding it with ones and, similarly, $v'$ can be obtained by splitting $v$ and padding each segment with zeros.

Proposition 36 (Symplectic eigenvalues of the higher dimensional system) Let $(Q, B)$ and $(Q', B')$ be as in Theorem 34, and let $d$ and $d'$ be their corresponding symplectic eigenvalues. Then, up to reordering, $d' = (d_1, \dots, d_n, 1, 1, \dots, 1)^\top$. Even though $v$ and $v'$ are not uniquely determined (see Remark 8), there exists a choice of $v'$ that is related to $v = (v_1, \dots, v_n, v_{n+1}, \dots, v_{2n})^\top$ via

$$v' = (v_1, \dots, v_n, 0, \dots, 0, v_{n+1}, \dots, v_{2n}, 0, \dots, 0)^\top.$$
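The padding recipe of Proposition 36 is easy to state in code. The sketch below is illustrative only: `extend` and `symplectic_eigenvalues` are names introduced here, the sample values are assumed, and the symplectic eigenvalues are computed through the standard fact that, for positive-definite $Q$, the eigenvalues of $JQ$ are $\pm i d_j$ (with the convention $J = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}$ assumed).

```python
import numpy as np

def extend(d, v, m):
    """Pad d with ones and each half of v with zeros, up to dimension m."""
    d = np.asarray(d, dtype=float)
    v = np.asarray(v, dtype=float)
    n = d.size
    pad = m - n
    if pad < 0:
        raise ValueError("m must be at least n")
    d_ext = np.concatenate([d, np.ones(pad)])
    v_ext = np.concatenate([v[:n], np.zeros(pad), v[n:], np.zeros(pad)])
    return d_ext, v_ext

def symplectic_eigenvalues(Q):
    """Sorted symplectic eigenvalues of a positive-definite Q, via eigenvalues of J Q."""
    k = Q.shape[0] // 2
    J = np.zeros((2 * k, 2 * k))
    J[:k, k:] = np.eye(k)
    J[k:, :k] = -np.eye(k)
    ev = np.linalg.eigvals(J @ Q)
    # eigenvalues come in pairs +/- i d_j; keep one modulus per pair
    return np.sort(np.abs(ev.imag))[::2]

d = np.array([0.5, 2.0])
v = np.array([1.0, 2.0, 3.0, 4.0])
d_ext, v_ext = extend(d, v, m=4)

# Quadratic form of the extended system in Williamson normal form: diag(D', D')
Q_ext = np.diag(np.concatenate([d_ext, d_ext]))
print(d_ext, v_ext, symplectic_eigenvalues(Q_ext))
```

The repeated value 1 in the extended spectrum and the zero pairs in the extended vector are exactly the degeneracies that prevent the lifted system from being canonical.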

From the above proposition, we call $d'$ the extended symplectic eigenvalues and $v'$ the extended vector. Now we define the space $\Theta_{CH_{m,n}}$ as the set of all pairs of the form $(d', v')$ and equip $\Theta_{CH_{m,n}}$ with the equivalence relation $\sim_\star$ as in Definition 20 but in dimension $m$ instead of $n$. Recall that we proved $\Theta_{CH_n}/\sim_\star \cong PH_n/\sim_{sys}$. Now we proceed to show that the above isomorphism in dimension $2n$ can be lifted to dimension $2m$ by considering only the restricted parameter spaces with elements of the form $(d', v')$ and $(Q', B')$.

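Proposition 37 The function $f : \Theta_{CH_{m,n}}/\sim_\star \to PH_{m,n}/\sim_{sys}$ defined by

$$f([d', v']_\star) = \left[\begin{pmatrix} D' & 0 \\ 0 & D' \end{pmatrix}, v'\right]_{sys},$$

where $D' = \mathrm{diag}(d')$, is an isomorphism.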

Note that in general $d'$ contains repeated symplectic eigenvalues because of all the ones used in the extension and that $v_l'^2 + v_{m+l}'^2 = 0$ for $l > n$. Therefore, it is impossible that $\Theta_{CH_{m,n}}$ contains canonical systems for $m > n$. In other words, lifting $PH_n$ to $PH_{m,n}$ introduces degeneracies that exclude the possibility of the systems being canonical.

We emphasize that the above-mentioned series of results is crucial in machine learning applications. Very often in practice, the dimension $2n$ of the underlying data-generating process, that is, the latent port-Hamiltonian system (5), is not known, which causes a problem when choosing the dimension of the controllable/observable Hamiltonian representation for learning. This issue can be solved by composing the morphism in Theorem 34 (ii) (which is injective) and the one in Theorem 7 (not necessarily injective). The composition of system morphisms is still a system morphism, this time between the underlying system $\theta_{PH_n}(Q, B)$ and the observable Hamiltonian representation in an arbitrarily higher dimension $2m \geq 2n$. In this way, the observable Hamiltonian representations in dimension $2m$ still have full expressive power to represent any $2n$-dimensional system in $PH_n$, and hence can be used for learning. Practically, one can choose a sufficiently large $m$, parameterize the observable Hamiltonian representation using $(d, v)$ (we use the notation $(d, v)$ instead of $(d', v')$ because in practice we do not know what $n$ is), and then estimate them. We emphasize that the higher-dimensional port-Hamiltonian systems are in general not canonical, hence the $(d, v)$-pair that corresponds to the data-generating process is not guaranteed to be unique. Still, we always know there is at least one choice of $(d, v)$ that works no matter how large an $m$ we choose, and which is constructed using the recipe in Proposition 36.

6. Practical implementation of the results

We start with a diagram that summarizes the results that we have proved.

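Theorem 38 The following diagram holds true using the isomorphisms explicitly constructed in all the preceding results. We denote the inclusion between one set and the other by a one-directional arrow.

$$\begin{array}{l} \Theta_{CH_{m,n}}/\sim_\star \;\cong\; PH_{m,n}/\sim_{sys} \\[6pt] \Theta_{CH_n}/H_n \;\cong\; \Theta_{CH_n}/\sim_\star \;\cong\; PH_n/\sim_{sys} \;\cong\; \Theta_{PH_n}/G_n \\[6pt] \Theta^{can}_{CH_n}/(S_n \ltimes_\phi \mathbb{T}^n) \;\cong\; \Theta^{can}_{CH_n}/\sim_{sys} \;\cong\; \Theta^{can}_{CH_n}/\sim_{filter} \end{array}$$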

We now comment on how to use the results contained in the diagram above depending on the diﬀerent learning situations that we may encounter. Indeed, we can use our statements to tackle three diﬀerent learning scenarios:

Case 1: The target port-Hamiltonian system (the data generating process that we want to learn) is canonical and its state-space dimension is known, that is, $\theta_{PH_n}(Q, B) \in PH^{can}_n$ with $n$ known. This is the most favorable situation in the sense that we can exactly represent the system $\theta_{PH_n}(Q, B)$ by either the controllable or the observable Hamiltonian representations, which are both isomorphic to the original system. Furthermore, since, in this case, the input/output map can be uniquely identified by properly setting up the initialization, it can be learned by estimating an initial state condition of the representation used and the unique parameters in $\mathbb{R}^{n\uparrow}_+ \times \mathbb{R}^n_+$.

Case 2: The target port-Hamiltonian system is not guaranteed to be canonical but its dimension is known, that is, $\theta_{PH_n}(Q, B) \in PH_n$ with $n$ known. In this case, there is a trade-off between the controllable Hamiltonian representation and the observable one. As mentioned before, the controllable one will be structure-preserving but its expressive power depends on the controllability of the target system $\theta_{PH_n}(Q, B)$. On the other hand, the observable one always possesses full expressive power but does not always guarantee the port-Hamiltonian structure of the induced filter.

Case 3: We are agnostic about the dimension of the target port-Hamiltonian system, that is, we are given $\theta_{PH_n}(Q, B) \in PH_n$ with $n$ unknown. In this case, we need to choose a sufficiently large $m$ so that $m \geq n$. Then, based on the composition of system morphisms, it suffices to learn some $(d, v) \in \Theta_{CH_m}$ and use the $2m$-dimensional observable Hamiltonian representation to reproduce the input-output dynamics of $(Q, B)$. Due to the loss of the canonical property, such a $(d, v)$ pair may not be unique. Additionally, we do not know the dimension $2n$ of the data generating process, and hence we do not know how many ones should be padded into $d$ (and similarly, how many zeros are padded into the vector $v$). However, we do know that an element $(d, v)$ exists in some $\Theta_{CH_{m,n}} \subset \Theta_{CH_m}$ that captures the input/output dynamics, given by Proposition 36.

An important special case is when there is no input to the port-Hamiltonian system, that is, u(t) = 0. In this case, the port-Hamiltonian system reduces to a linear Hamiltonian system with an arbitrary linear readout matrix. We emphasize that the observable Hamiltonian representation in a higher dimension is totally independent of B since it is simply given by

$$y = (0, 0, \dots, 0, 1)\, s. \tag{21}$$

In other words, Hamiltonian systems with linear readout can be learned by adjusting the initial state s0 and symplectic eigenvalues di, without even knowing the linear readout function that yields the observations.

7. Numerical illustrations

In this section, we present two numerical examples to demonstrate the eﬀectiveness of our representation results from a learning point of view.

7.1 Non-dissipative circuit

Similar to an example in Medianu and Lefevre (2021), we consider a circuit consisting of a power source with voltage $V = u(t)$, together with five parallel branches, each of them containing a capacitor $C_i$ with charge $Q_i$ and an inductor $L_i$ with magnetic flux linkage $\phi_i$ for $i = 1, \dots, 5$ (see Figure 1). Using Kirchhoff's laws, we obtain the port-Hamiltonian system in normal form (22) and (23), where the Hamiltonian of the system is

$$H(Q_1, \dots, Q_5, \phi_1, \dots, \phi_5) = \sum_{i=1}^{5} \left(\frac{Q_i^2}{2 C_i} + \frac{\phi_i^2}{2 L_i}\right).$$


H Q1 ... H Q5

Figure 1: Lossless circuit port-Hamiltonian system

This port-Hamiltonian system treats the power supply V = u as input and the current through the power supply, that is y, as output. One veriﬁes that such a system is noncanonical. Our purpose is to learn the input-output behavior of this system without any access to the internal physical state and train only with input-output observations.

In our implementation, we choose for simplicity $C_i = 1$ and $L_i = 1$ for $i = 1, \dots, 5$. We choose to learn with a 10-dimensional observable Hamiltonian representation to show that the dynamics can be captured even in the non-canonical case. (Indeed, with our choice of the $C_i$'s and $L_i$'s, the system is readily checked to be noncanonical.) We randomly generate an initial condition for the ground-truth system and integrate it using Euler's method (see Appendix 9.14 for more sophisticated structure-preserving integration methods) with a discretization step of 0.01 for 1000 time steps. The input is chosen as $u(t) = \sin(t)$. The 1000 pairs of input and output data are used as training data. During the training phase, we estimate the initial state $x \in \mathbb{R}^{10}$ as well as the parameters $d \in \mathbb{R}^5_+$ and $v \in \mathbb{R}^{10}$. This is carried out via gradient descent using a learning rate of $\lambda = 0.1$ for 500 epochs. At each gradient descent iteration we integrate the state-space equations corresponding to the current parameter values over 1000 time steps with Euler's method and then we compute the squared error with respect to the training set.
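The data-generation step described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the normal form $\dot z = JQz + Bu$, $y = B^\top Q z$ with $J = \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix}$, takes $Q = I_{10}$ (which corresponds to $C_i = L_i = 1$), and uses a hypothetical input vector `B`, since the circuit's actual input matrix is not reproduced here. The helper names are likewise assumptions.

```python
import numpy as np

def canonical_J(n):
    """Canonical symplectic matrix of size 2n (assumed convention)."""
    J = np.zeros((2 * n, 2 * n))
    J[:n, n:] = np.eye(n)
    J[n:, :n] = -np.eye(n)
    return J

def simulate(Q, B, z0, u, dt=0.01, steps=1000):
    """Explicit Euler for z' = J Q z + B u(t), with output y = B^T Q z."""
    n = Q.shape[0] // 2
    A = canonical_J(n) @ Q
    C = B @ Q                          # row vector B^T Q
    z = np.array(z0, dtype=float)
    ys = np.empty(steps)
    for k in range(steps):
        ys[k] = C @ z                  # record the output before stepping
        z = z + dt * (A @ z + B * u(k * dt))
    return ys

def controllability_rank(A, B):
    """Rank of the Kalman matrix [B, AB, ..., A^{N-1} B]."""
    cols = [B]
    for _ in range(A.shape[0] - 1):
        cols.append(A @ cols[-1])
    return np.linalg.matrix_rank(np.column_stack(cols))

n = 5
Q = np.eye(2 * n)                      # C_i = L_i = 1
B = np.zeros(2 * n)
B[n:] = 1.0                            # hypothetical input vector, for illustration only
rng = np.random.default_rng(0)
z0 = rng.standard_normal(2 * n)        # random initial condition

ys = simulate(Q, B, z0, np.sin)        # 1000 output samples for u(t) = sin(t)
rank = controllability_rank(canonical_J(n) @ Q, B)
print(ys.shape, rank)
```

With $Q = I$ one has $(JQ)^2 = -I$, so the Kalman matrix is spanned by $B$ and $JB$ and its rank is 2, well below 10, whatever $B$ is. This is consistent with the system being noncanonical for this choice of $C_i$ and $L_i$.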

We set a testing period of 4000 time steps and demonstrate the robustness of our approach by testing our trained model not only on the original input $u(t) = \sin(t)$ but also on three other commonly used input signals (see Figures 2, 3, 4 and 5). The numerical experiments provide a strong indication that the underlying system is learned independently of the input signal and is robust with respect to various forms of inputs.

(a) Input signal u(t) (b) output y(t)

Figure 2: Training and testing on a sinusoidal signal.

(a) Input signal u(t) (b) output y(t)

Figure 3: Testing on a constant signal. The training had been carried out using a sinusoidal signal. See Figure 2.

(a) Input signal u(t) (b) output y(t)

Figure 4: Testing on a square signal. The training had been carried out using a sinusoidal signal. See Figure 2.

(a) Input signal u(t) (b) output y(t)

Figure 5: Testing on a ramp signal. The training had been carried out using a sinusoidal signal. See Figure 2.

7.2 Positive definite Frenkel-Kontorova model

As a second example, we consider a modification of the well-known Frenkel-Kontorova model such that it becomes a linear port-Hamiltonian system with a positive-definite Hamiltonian function. Recall that the general form of the Frenkel-Kontorova model describes the motion of classical particles with nearest neighbor interactions using periodic potentials. The Hamiltonian function can be written as

$$H = \sum_n \left[\frac{1}{2} p_n^2 + (1 - \cos q_n) + \frac{1}{2} g\, (q_{n+1} - q_n - a_0)^2\right].$$

Since we are dealing with linear systems, we remove the periodic potential and rescale the potential coefficient. By fixing $a_0 = 0$, we obtain the Hamiltonian

$$H = \frac{1}{2} \sum_n \left[p_n^2 + (q_{n+1} - q_n)^2\right].$$

In order to consider a Hamiltonian that is strictly positive definite, we add a term $\frac{1}{2} q_1^2$ to the Hamiltonian, which carries the physical meaning that the particle $q_1$ interacts with the origin via a spring. In summary, our model of interest now has the positive-definite Hamiltonian

$$H = \frac{1}{2} q_1^2 + \frac{1}{2} \sum_n \left[p_n^2 + (q_{n+1} - q_n)^2\right].$$

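For the sake of simplicity, we consider a Hamiltonian system with $N = 2$ unit-mass particles (so that $p_i = \dot q_i$) and an external force $F = u$ that is imposed on the first particle. This gives the linear port-Hamiltonian system in normal form below, with the output being the velocity of the first particle.

$$\begin{pmatrix} \dot q_1 \\ \dot q_2 \\ \dot p_1 \\ \dot p_2 \end{pmatrix} = \begin{pmatrix} 0 & I_2 \\ -I_2 & 0 \end{pmatrix} \begin{pmatrix} \partial H/\partial q_1 \\ \partial H/\partial q_2 \\ \partial H/\partial p_1 \\ \partial H/\partial p_2 \end{pmatrix} + \begin{pmatrix} 0 \\ 0 \\ 1 \\ 0 \end{pmatrix} u, \qquad y = (0, 0, 1, 0) \begin{pmatrix} \partial H/\partial q_1 \\ \partial H/\partial q_2 \\ \partial H/\partial p_1 \\ \partial H/\partial p_2 \end{pmatrix}.$$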

In contrast to the first example, this system is canonical. Therefore, based on our theoretical results, any input-output dynamics can be captured by either a controllable or an observable Hamiltonian representation, and furthermore, it is possible to uniquely identify the system by learning an initial condition and the parameters in the quotient space $\mathbb{R}^{2\uparrow}_+ \times \mathbb{R}^2_+$. For the sake of numerical illustration, we choose the initial state condition $x = (2, 1, 3, 3)^\top$ for the ground-truth system and integrate it for 1000 time steps using Euler's method with a step of 0.01 (see Appendix 9.14 for more sophisticated structure-preserving integration methods), where the input is chosen as $u(t) = \sin(t)$. The 1000 pairs of input and output data are then used as training data.
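The claim that this two-particle system is canonical can be checked numerically with the Kalman rank conditions. The sketch below is an illustration under explicit assumptions: it takes the Hamiltonian to be $H = \frac{1}{2}(p_1^2 + p_2^2) + \frac{1}{2} q_1^2 + \frac{1}{2}(q_2 - q_1)^2$ in the coordinates $z = (q_1, q_2, p_1, p_2)$, the normal form $\dot z = JQz + Bu$, $y = B^\top Q z$, and $B = (0, 0, 1, 0)^\top$ for a force on the first particle. The helper name `krylov_rank` is introduced here.

```python
import numpy as np

# Canonical symplectic matrix for two degrees of freedom (assumed convention)
J = np.block([[np.zeros((2, 2)), np.eye(2)],
              [-np.eye(2), np.zeros((2, 2))]])

# Quadratic form of the assumed Hamiltonian H(z) = z^T Q z / 2
Q = np.array([[2.0, -1.0, 0.0, 0.0],
              [-1.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])
B = np.array([0.0, 0.0, 1.0, 0.0])   # force acts on the first momentum

A = J @ Q                            # state matrix
C = B @ Q                            # output row B^T Q

def krylov_rank(M, w):
    """Rank of [w, M w, ..., M^{N-1} w]."""
    cols = [w]
    for _ in range(M.shape[0] - 1):
        cols.append(M @ cols[-1])
    return np.linalg.matrix_rank(np.column_stack(cols))

ctrb_rank = krylov_rank(A, B)        # controllability of (A, B)
obsv_rank = krylov_rank(A.T, C)      # observability of (A, C), by duality

# Symplectic eigenvalues: moduli of the eigenvalues +/- i d_j of J Q
symp = np.sort(np.abs(np.linalg.eigvals(A).imag))[::2]
print(ctrb_rank, obsv_rank, symp)
```

Under these assumptions both ranks equal 4, so the system is controllable and observable, and the two symplectic eigenvalues are distinct, namely $(\sqrt{5} - 1)/2$ and $(\sqrt{5} + 1)/2$, since they are the square roots of the eigenvalues of $\begin{pmatrix} 2 & -1 \\ -1 & 1 \end{pmatrix}$.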

As motivated above, we apply two different training mechanisms in which we learn the initial state condition and the parameter values of the model using both the natural parameters from $\Theta_{OH_n}$ of the observable Hamiltonian representation and those in the unique identifiability space $\mathbb{R}^{2\uparrow}_+ \times \mathbb{R}^2_+$. As in the previous example, we carry out the training using gradient descent with a learning rate of $\lambda = 0.02$ over 1500 epochs, starting from randomly chosen initial values for the initial state condition and the model parameters in $\Theta_{CH_n}$ and $\mathbb{R}^{2\uparrow}_+ \times \mathbb{R}^2_+$. We record the validation error during the 1500 gradient descent iterations of both training mechanisms to compare their convergence rates. Heuristically, it should be expected that the rate of convergence is faster when the models are trained using the coordinates that provide unique identifiability. This is empirically confirmed in Figure 6 (indeed, unique identifiability provides exponentially faster convergence). After 1500 iterations, the prediction accuracy when training was carried out using the unique identifiability space significantly outperforms the other setting, as can be seen in Figure 7. Moreover, we found that the learned parameters $d \in \mathbb{R}^{2\uparrow}_+$ are exactly the same as the eigenvalues of the Hamiltonian matrix, which is theoretically guaranteed by the unique identifiability. It is worth emphasizing that, despite the difference in the convergence rates, both mechanisms eventually lead to perfect path continuations of the input-output dynamics after enough training iterations.


Figure 6: Logarithm of validation errors of the two training mechanisms based on using the natural parameters of the observable representation and the unique identiﬁability space

(a) Using observable representation (b) Using unique identiﬁability space

Figure 7: Training and testing performance of the two training mechanisms after 1500 gradient descent iterations based on using the natural parameters of the observable representation (pane (a)) and the unique identiﬁability space (pane (b))

8. Conclusions

In this paper, we have introduced a complete structure-preserving learning scheme for single-input/single-output (SISO) linear port-Hamiltonian systems. The construction is based on the solution, when possible, of the unique identification problem for these systems, in ways that reveal fundamental relationships between classical notions in control theory and crucial properties in the machine learning context, like structure-preservation and expressive power.

The main building block in our construction is a representation result that we introduced for linear port-Hamiltonian systems in normal form that provides two subfamilies of linear systems that are by construction controllable and observable (Deﬁnition 5). We showed that morphisms can be established between the elements in these families and those in the category of normal form port-Hamiltonian systems. The existence of these morphisms immediately guarantees that the complexity of a generic subset of the family of


port-Hamiltonian ﬁlters is actually not O(n2), as it could be guessed from the standard parametrization of this family, but O(n). We showed that the expressive power of our proposed representations is limited for non-canonical port-Hamiltonian systems. Indeed, we saw that the observable representation is guaranteed to capture all possible input-output dynamics of port-Hamiltonian systems (full expressive power), but it does not always produce port-Hamiltonian dynamics (fails to be structure-preserving). In the controllable case, structure preservation is guaranteed, but there is, in general, no full expressive power. For canonical port-Hamiltonian systems, these representations are both structure-preserving and have full expressive power.

We saw that even in the canonical situation, the availability of the controllable/observable representations did not yet provide a well-speciﬁed learning problem for this category since the invariance of these systems under system automorphisms implies the existence of symmetries (or degeneracies) in those parametrizations. We tackled this problem by solving the unique identiﬁability of input-output dynamics of linear port-Hamiltonian systems in normal form up to initializations by characterizing the quotient space by system automorphisms as a Lie groupoid orbit space. Moreover, we showed that in the canonical case the corresponding quotient spaces can be characterized as orbit spaces with respect to an explicit group action and can be explicitly endowed with a smooth manifold structure that has global Euclidean coordinates that can be used at the time of constructing estimation algorithms. Consequently, we showed that canonical port-Hamiltonian dynamics can be identiﬁed fully and explicitly in either the controllable or the observable Hamiltonian representations and learned by estimating an initial state condition and a unique set of parameters in a smooth manifold obtained as a group orbit space. Additionally, we complemented this learning scheme with results that allow us to extend it to situations where we remain agnostic regarding the dimension of the underlying data-generating port-Hamiltonian system.

We concluded the paper with some numerical examples that illustrate the viability of the method we propose in systems with various levels of complexity and dimensions and the computational advantages associated with using the parameter space in which unique identiﬁcation is guaranteed.

Acknowledgments

The authors thank Lyudmila Grigoryeva for helpful discussions and remarks and acknowledge partial financial support from the Swiss National Science Foundation (grant number 175801/1) and the School of Physical and Mathematical Sciences of the Nanyang Technological University. DY is funded by the Nanyang President's Graduate Scholarship of Nanyang Technological University.

Glossary of Symbols

$\Theta_{CH_{m,n}}$ The space of parameters $(d', v')$ for $PH_{m,n}$

$\mathbb{R}^{n\uparrow}_+$ The set of $n$-tuples of distinct positive real numbers in increasing order

$\mathbb{T}^n$ The $n$-torus

$\mathcal{CH}_n/\mathcal{OH}_n$ The space of filters induced by $CH_n/OH_n$

$G_n \rightrightarrows \Theta_{PH_n}$ Port-Hamiltonian groupoid, see Proposition 24

$H_n \rightrightarrows \Theta_{CH_n}$ Reduced port-Hamiltonian groupoid, see Proposition 26

$\mathcal{PH}_n$ The space of input-output dynamics/filters induced by systems in $PH_n$

$\mathcal{PH}^{can}_n$ The space of input-output dynamics/filters induced by systems in $PH^{can}_n$

$\mathfrak{sp}(2n, \mathbb{R})$ Lie algebra of the symplectic group

$\sim_\star$ An equivalence relation defined on $\Theta_{CH_n}$

$\sim_{filter}$ The equivalence relation of inducing the same filter

$\sim_{sys}$ The equivalence relation of system automorphism

$\Theta_{CH_n}/\Theta_{OH_n}$ The space of parameters $(d, v)$ for $CH_n$ and/or $OH_n$, which are the same

$\theta_{CH_n}/\theta_{OH_n}$ The map that sends parameters in $\Theta_{CH_n}/\Theta_{OH_n}$ to the corresponding state space system in $CH_n/OH_n$

$\Theta^{can}_{CH_n}$ The subset of $\Theta_{CH_n}$ that corresponds to canonical systems

$\Theta_{PH_n}$ The space of parameters $(Q, B)$ for $PH_n$

$\theta_{PH_n}$ The map that sends parameters in $\Theta_{PH_n}$ to the corresponding state space system in $PH_n$

$B$ Input matrix of a port-Hamiltonian system in normal form

$CH_n/OH_n$ The space of $2n$-dimensional controllable/observable Hamiltonian representations

$F : Z \times U \to Z$ State equation

$H : \mathbb{R}^{2n} \to \mathbb{R}$ Hamiltonian function

$PH_n$ The space of $2n$-dimensional linear normal form port-Hamiltonian systems (5)

$PH^{can}_n$ The subspace of $PH_n$ consisting of canonical linear normal form port-Hamiltonian systems

$PH_{m,n}$ The subspace of $PH_m$ containing all $(Q', B') = \left(O \begin{pmatrix} Q & 0 \\ 0 & I_{2m-2n} \end{pmatrix} O^\top, O \begin{pmatrix} B \\ 0 \end{pmatrix}\right)$, with $(Q, B) \in PH_n$, $O \in O(2m, \mathbb{R})$

$Q$ Quadratic form that determines a linear Hamiltonian system

$S_n$ Permutation group of $n$ elements

$Sp(2n, \mathbb{R})$ Symplectic group

$J$ Canonical symplectic matrix

References

Ralph Abraham and Jerrold E. Marsden. Foundations of Mechanics. Addison-Wesley, Reading, MA, 2nd edition, 1978.

Ralph Abraham, Jerrold E. Marsden, and Tudor S. Ratiu. Manifolds, Tensor Analysis, and Applications, volume 75. Applied Mathematical Sciences. Springer-Verlag, 1988.

Beatrice Acciaio, Anastasis Kratsios, and Gudmund Pammer. Metric hypertransformers are universal adapted maps. arXiv preprint arXiv:2201.13094, 2022.

V. I. Arnold. Mathematical Methods of Classical Mechanics. Springer, 1989.

Coryn A L Bailer-Jones, David J C MacKay, and Philip J Withers. A recurrent neural network for modelling dynamical systems. Network: Computation in Neural Systems, 9(4):531–547, 1998.

Thomas Beckers, Jacob Seidman, Paris Perdikaris, and George J Pappas. Gaussian process port-Hamiltonian systems: Bayesian learning with physics prior. In 2022 IEEE 61st Conference on Decision and Control (CDC), pages 1447–1453. IEEE, 2022.

Jean-Michel Bismut. Mécanique aléatoire. Springer, 1982.

Roger W Brockett and Abdolhossein Rahimi. Lie algebras and linear differential equations. Ordinary Differential Equations, pages 379–386, 1972.

Elena Celledoni, Andrea Leone, Davide Murari, and Brynjulf Owren. Learning hamiltonians of constrained mechanical systems. Journal of Computational and Applied Mathematics, 417:114608, 2023.

Renyi Chen and Molei Tao. Data-driven prediction of general Hamiltonian dynamics via learning exactly-symplectic maps. In Marina Meila and Tong Zhang, editors, Proceedings of the 38th International Conference on Machine Learning, volume 139 of Proceedings of Machine Learning Research, pages 1717–1727. PMLR, 2021.

Zhengdao Chen, Jianyu Zhang, Martin Arjovsky, and Léon Bottou. Symplectic recurrent neural networks. In International Conference on Learning Representations, 2020.

Karim Cherifi. An overview on recent machine learning techniques for Port Hamiltonian systems. Physica D: Nonlinear Phenomena, 411:132620, 2020.

Peter E. Crouch and Arjan van der Schaft. Variational and hamiltonian control systems.

G. Cybenko. Approximation by superpositions of a sigmoidal function. Mathematics of Control, Signals, and Systems, 2(4):303–314, dec 1989.

Maurice A De Gosson. Symplectic geometry and quantum mechanics, volume 166. Springer Science & Business Media, 2006.

Shaan A Desai, Marios Mattheakis, David Sondak, Pavlos Protopapas, and Stephen J Roberts. Port-Hamiltonian neural networks for learning explicit time-dependent dynamical systems. Physical Review E, 104(3):34312, 2021.

Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding. CoRR, abs/1810.0, 2018.

I L Egusquiza and A Parra-Rodriguez. Algebraic canonical quantization of lumped superconducting networks. Physical Review B, 106(2):24510, jul 2022.

Learnability of Linear Port-Hamiltonian Systems

Lukas Gonon and Juan-Pablo Ortega. Reservoir computing universality with stochastic inputs. IEEE Transactions on Neural Networks and Learning Systems, 31(1):100-112, 2020.

Lukas Gonon and Juan-Pablo Ortega. Fading memory echo state networks are universal. Neural Networks, 138:10-13, 2021.

Oscar Gonzalez. Time integration and discrete Hamiltonian systems. In Mechanics: from theory to computation, pages 257-275. Springer, 2000.

Samuel Greydanus, Misko Dzamba, and Jason Yosinski. Hamiltonian neural networks. In Advances in Neural Information Processing Systems, pages 15353-15363, 2019.

Lyudmila Grigoryeva and Juan-Pablo Ortega. Universal discrete-time reservoir computers with stochastic inputs and linear readouts using non-homogeneous state-affine systems. Journal of Machine Learning Research, 19(24):1-40, 2018a.

Lyudmila Grigoryeva and Juan-Pablo Ortega. Echo state networks are universal. Neural Networks, 108:495-508, 2018b.

Lyudmila Grigoryeva and Juan-Pablo Ortega. Dimension reduction in recurrent networks by canonicalization. Journal of Geometric Mechanics, 13(4):647-677, 2021.

Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural Computation, 9(8):1735-1780, 1997.

Kurt Hornik, Maxwell Stinchcombe, and Halbert White. Multilayer feedforward networks are universal approximators. Neural Networks, 2(5):359-366, 1989.

Martin Idel, Sebastián Soto Gaona, and Michael M Wolf. Perturbation bounds for Williamson's symplectic normal form. Linear Algebra and its Applications, 525:45-58, 2017.

Kh. D Ikramov. On the symplectic eigenvalues of positive definite matrices. Moscow University Computational Mathematics and Cybernetics, 42:1-4, 2018.

Birgit Jacob and Hans Zwart. Linear Port-Hamiltonian Systems on Infinite-dimensional Spaces. Birkhäuser, 2012.

Herbert Jaeger and Harald Haas. Harnessing Nonlinearity: Predicting Chaotic Systems and Saving Energy in Wireless Communication. Science, 304(5667):78-80, 2004.

Pengzhan Jin, Zhen Zhang, Aiqing Zhu, Yifa Tang, and George Em Karniadakis. SympNets: Intrinsic structure-preserving symplectic networks for identifying Hamiltonian systems. Neural Networks, 132:166-179, 2020.

R. E. Kalman. Mathematical description of linear dynamical systems. Journal of the Society for Industrial and Applied Mathematics Series A Control, 1(2):152-192, 1963.

George Em Karniadakis, Ioannis G Kevrekidis, Lu Lu, Paris Perdikaris, Sifan Wang, and Liu Yang. Physics-informed machine learning. Nature Reviews Physics, 3(6):422-440, 2021.

Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. ImageNet classification with deep convolutional neural networks. In F Pereira, C J Burges, L Bottou, and K Q Weinberger, editors, Advances in Neural Information Processing Systems, volume 25. Curran Associates, Inc., 2012.

Joan-Andreu Lázaro-Camí and Juan-Pablo Ortega. Stochastic hamiltonian dynamical systems. Reports on Mathematical Physics, 61(1):65-122, 2008.

Benedict Leimkuhler and Sebastian Reich. Simulating Hamiltonian Dynamics. Cambridge University Press, 2004.

Zichao Long, Yiping Lu, Xianzhong Ma, and Bin Dong. PDE-Net: Learning PDEs from Data. In Jennifer Dy and Andreas Krause, editors, Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, pages 3208-3216. PMLR, 2018.

Zhixin Lu, Brian R. Hunt, and Edward Ott. Attractor reconstruction by machine learning. Chaos, 28(6), 2018.

Kirill C H Mackenzie. General theory of Lie groupoids and Lie algebroids. Number 213. Cambridge University Press, 2005.

Jerrold E. Marsden and Tudor S. Ratiu. Introduction to mechanics and symmetry. Springer-Verlag, New York, second edition, 1999.

Jerrold E. Marsden and Matthew West. Discrete mechanics and variational integrators. Acta Numerica, 10:357-514, 2001.

B.M. Maschke and A.J. van der Schaft. Port-controlled hamiltonian systems: Modelling origins and systemtheoretic properties. IFAC Proceedings Volumes, 25(13):359-365, 1992. 2nd IFAC Symposium on Nonlinear Control Systems Design 1992, Bordeaux, France, 24-26 June.

Robert I McLachlan and G Reinout W Quispel. Geometric integrators for ODEs. Journal of Physics A: Mathematical and General, 39(19):5251, 2006.

Silviu Medianu and Laurent Lefevre. Structural identifiability of linear port hamiltonian systems. Systems & Control Letters, 151:104915, 2021.

Silviu Medianu, Laurent Lefevre, and Dan Stefanoiu. Identifiability of linear lossless Port-controlled Hamiltonian systems. In 2nd International Conference on Systems and Computer Science, pages 56-61, 2013.

Sumona Mukhopadhyay and Santo Banerjee. Learning dynamical systems in noise using convolutional neural networks. Chaos: An Interdisciplinary Journal of Nonlinear Science, 30(10):103125, 2020.

S P Nageshrao, G A D Lopes, D Jeltsema, and R Babuska. Adaptive and learning control of port-Hamiltonian systems: a survey. IEEE Transactions on Automatic Control, page 37, 2015.

Juan-Pablo Ortega and Tudor S. Ratiu. Momentum Maps and Hamiltonian Reduction. Birkhäuser Verlag, 2004.

Jaideep Pathak, Brian Hunt, Michelle Girvan, Zhixin Lu, and Edward Ott. Model-Free Prediction of Large Spatiotemporally Chaotic Systems from Data: A Reservoir Computing Approach. Physical Review Letters, 120(2):24102, 2018a.

Jaideep Pathak, Alexander Wikner, Rebeckah Fussell, Sarthak Chandra, Brian R. Hunt, Michelle Girvan, and Edward Ott. Hybrid forecasting of chaotic processes: Using machine learning in conjunction with a knowledge-based model. Chaos, 28(4), 2018b.

Jan Willem Polderman and Jan C. Willems. Introduction to Mathematical Systems Theory. Springer New York, NY, 1998.

Tong Qin, Kailiang Wu, and Dongbin Xiu. Data driven governing equations approximation using deep neural networks. Journal of Computational Physics, 395:620-635, Oct 2019.

Hà Quang Minh and Vittorio Murino. Covariances in Computer Vision and Machine Learning. Morgan and Claypool Publishers, 2018.

Maziar Raissi and George E Karniadakis. Hidden physics models: machine learning of nonlinear partial differential equations. CoRR, abs/1708.0, 2017. URL http://arxiv.org/abs/1708.00588.

Maziar Raissi, Paris Perdikaris, and George Em Karniadakis. Physics informed deep learning (part i): Data-driven solutions of nonlinear partial differential equations. arXiv preprint arXiv:1711.10561, 2017.

Anna Shalova and Ivan Oseledets. Tensorized transformer for dynamical systems modeling. arXiv preprint arXiv:2006.03445, 2020.

Nguyen Son and Tatjana Stykel. Symplectic eigenvalues of positive-semidefinite matrices and the trace minimization theorem. 08 2022. doi: 10.48550/arXiv.2208.05291.

Eduardo Sontag. Mathematical Control Theory: Deterministic Finite Dimensional Systems. Springer-Verlag, 1998.

Hector J. Sussmann. Existence and uniqueness of minimal realizations of nonlinear systems. Mathematical systems theory, 1976.

Krzysztof Tcho. On generic properties of linear systems: An overview. Kybernetika, 19, 01

Yunjin Tong, Shiying Xiong, Xingzhe He, Guanghan Pan, and Bo Zhu. Symplectic neural networks in taylor series form for hamiltonian systems. Journal of Computational Physics, 437:110325, 2021.

Riccardo Valperga, Kevin Webster, Dmitry Turaev, Victoria Klein, and Jeroen Lamb. Learning reversible symplectic dynamics. In Roya Firoozi, Negar Mehr, Esen Yel, Rika Antonova, Jeannette Bohg, Mac Schwager, and Mykel Kochenderfer, editors, Proceedings of The 4th Annual Learning for Dynamics and Control Conference, volume 168 of Proceedings of Machine Learning Research, pages 906-916. PMLR, 23-24 Jun 2022.

Arjan van der Schaft and Dimitri Jeltsema. Port-hamiltonian systems theory: an introductory overview. Foundations and Trends in Systems and Control, 1(2-3):173-378, 2014.

Yu Wang. A new concept using LSTM Neural Networks for dynamic system identification. 2017 American Control Conference (ACC), pages 5324-5329, 2017.

John Williamson. On the algebraic problem concerning the normal forms of linear dynamical systems. American Journal of Mathematics, 58(1):141-163, 1936.

John Williamson. On the normal forms of linear canonical transformations in dynamics. American Journal of Mathematics, 59(3):599-617, 1937.

Jin-Long Wu, Heng Xiao, and Eric Paterson. Physics-informed machine learning approach for augmenting turbulence models: A comprehensive framework. Physical Review Fluids, 3(7):74602, 2018.

Yaofeng Desmond Zhong, Biswadip Dey, and Amit Chakraborty. Symplectic ode-net: Learning hamiltonian dynamics with control. In International Conference on Learning Representations, 2020.

9. Appendices

9.1 Proof of Theorem 7 (i)

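Let $(d, v) \in \Theta_{CH_n}$ and let

$$\dot{s} = g^{ctr}_1(d)\, s + (0, 0, \ldots, 0, 1)^\top u, \qquad y = g^{ctr}_2(d, v)\, s, \tag{26}$$

be the corresponding linear controllable state-space system. In the following paragraphs, we construct, for every $S \in Sp(2n, \mathbb{R})$, a linear system morphism $f_S^{(d,v)} : \mathbb{R}^{2n} \to \mathbb{R}^{2n}$ between (26) and the port-Hamiltonian system $(Q, B) = \varphi_S(\theta_{CH_n}(d, v)) \in PH_n$ in the statement. Notice, first of all, that $Q$ is by construction symmetric and positive-definite. Let now $L \in M_{2n}$ be the matrix implementing the linear map $f_S^{(d,v)}$, that is, $f_S^{(d,v)}(s) = Ls$, $s \in \mathbb{R}^{2n}$. We now explicitly construct $L$ and prove that it provides a system morphism.

We start by denoting $A := J_n \operatorname{diag}(D, D)$, with $D = \operatorname{diag}(d)$, and define, for each $k = 1, \ldots, 2n$, a matrix $L_k \in M_{2n}$ by

$$L_{2n-k} := A^k + a_{2n-1} A^{k-1} + \cdots + a_{2n-k} I_{2n}, \qquad k = 0, \ldots, 2n-1.$$

In particular, $L_{2n} = I_{2n}$. Then, $L$ is constructed as

$$L' := \begin{pmatrix} L_1 v & L_2 v & \cdots & L_{2n} v \end{pmatrix}, \qquad L := S^{-1} L'.$$

We now check that $f_S^{(d,v)}(s) = Ls$ is indeed a system morphism between (26) and the port-Hamiltonian system (5) with $Q = S^\top \operatorname{diag}(D, D)\, S$ and $B = S^{-1} v$. This amounts to checking that

(i) $L\, g^{ctr}_1(d) = J_n Q L$,

(ii) $L\, (0, 0, \ldots, 0, 1)^\top = B$,

(iii) $g^{ctr}_2(d, v) = B^\top Q L$.

We note that (ii) trivially holds, since the last column of $L'$ is $L_{2n} v = v$. Now, (i) is equivalent to

$$S^{-1} L'\, g^{ctr}_1(d) = J_n S^\top \operatorname{diag}(D, D)\, S S^{-1} L' = S^{-1} J_n \operatorname{diag}(D, D)\, L',$$

where the last equality uses that $S$ is symplectic, that is,

$$\begin{pmatrix} L_1 v & L_2 v & \cdots & L_{2n} v \end{pmatrix}
\begin{pmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
-a_0 & -a_1 & -a_2 & \cdots & -a_{2n-1}
\end{pmatrix}
= A \begin{pmatrix} L_1 v & L_2 v & \cdots & L_{2n} v \end{pmatrix}.$$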

We compare the $k$-th columns of the left- and right-hand sides of this equality. When $k = 1$, the difference between the first columns of the two sides is

$$A L_1 v + a_0 v = A(A^{2n-1} + a_{2n-1} A^{2n-2} + \cdots + a_1 I_{2n})v + a_0 v = (A^{2n} + a_{2n-1}A^{2n-1} + \cdots + a_1 A + a_0 I_{2n})v = 0. \tag{27}$$

The last equality holds as a consequence of the Cayley-Hamilton theorem. Indeed, by the definition of the entries $\{a_0, a_1, \ldots, a_{2n-1}\}$, the characteristic polynomial of $A$ is

$$\det(\lambda I_{2n} - A) = \det \begin{pmatrix} \lambda I_n & -D \\ D & \lambda I_n \end{pmatrix} = \det(\lambda I_n) \det\left(\lambda I_n + \lambda^{-1} D^2\right) = (\lambda^2 + d_1^2)(\lambda^2 + d_2^2) \cdots (\lambda^2 + d_n^2).$$

Consequently, since by the Cayley-Hamilton theorem $A$ satisfies its own characteristic polynomial, we can conclude that $A^{2n} + a_{2n-1}A^{2n-1} + \cdots + a_1 A + a_0 I_{2n} = 0$, and hence (27) follows. When $1 < k \leq 2n$, the difference between the $k$-th columns of the left- and right-hand sides is

$$(L_{k-1} v - a_{k-1} v) - A L_k v = (L_{k-1} - a_{k-1} I_{2n} - A L_k) v = 0,$$

because

$$L_{k-1} - a_{k-1} I_{2n} - A L_k = (A^{2n-k+1} + a_{2n-1} A^{2n-k} + \cdots + a_{k-1} I_{2n}) - a_{k-1} I_{2n} - A(A^{2n-k} + a_{2n-1} A^{2n-k-1} + \cdots + a_k I_{2n}) = 0.$$

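We have hence proved that (i) holds. We now proceed to check (iii). This amounts to computing

$$B^\top Q L = (S^{-1} v)^\top S^\top \operatorname{diag}(D, D)\, S S^{-1} \begin{pmatrix} L_1 v & L_2 v & \cdots & L_{2n} v \end{pmatrix} = v^\top \operatorname{diag}(D, D) \begin{pmatrix} L_1 v & L_2 v & \cdots & L_{2n} v \end{pmatrix}.$$

Let us denote

$$\begin{pmatrix} c_{2n} & c_{2n-1} & c_{2n-2} & \cdots & c_2 & c_1 \end{pmatrix} := v^\top \operatorname{diag}(D, D) \begin{pmatrix} L_1 v & L_2 v & \cdots & L_{2n} v \end{pmatrix}.$$

Then we observe that, for $k = 1, \ldots, n$,

$$\begin{aligned}
c_{2k} &= v^\top \operatorname{diag}(D, D) \left(A^{2k-1} + a_{2n-1} A^{2k-2} + \cdots + a_{2n-2k+1} I_{2n}\right) v \\
&= v^\top \operatorname{diag}(D, D) \left(A^{2k-1} + a_{2n-2} A^{2k-3} + \cdots + a_{2n-2k+2} A\right) v \\
&= v^\top \left(J_n^{2k-1} \operatorname{diag}(D, D)^{2k} + a_{2n-2} J_n^{2k-3} \operatorname{diag}(D, D)^{2k-2} + \cdots + a_{2n-2k+2} J_n \operatorname{diag}(D, D)^{2}\right) v = 0.
\end{aligned} \tag{28}$$

The second equality uses that the coefficients $a_j$ with odd index $j$ vanish, and the last equation follows from the fact that each summand is a skew-symmetric matrix. On the other hand, for $k = 0, \ldots, n-1$,

$$\begin{aligned}
c_{2k+1} &= v^\top \operatorname{diag}(D, D) \left(A^{2k} + a_{2n-1} A^{2k-1} + \cdots + a_{2n-2k} I_{2n}\right) v \\
&= v^\top \operatorname{diag}(D, D) \left(A^{2k} + a_{2n-2} A^{2k-2} + \cdots + a_{2n-2k} I_{2n}\right) v \\
&= v^\top \left((-1)^k \operatorname{diag}(D, D)^{2k+1} + a_{2n-2} (-1)^{k-1} \operatorname{diag}(D, D)^{2k-1} + \cdots + a_{2n-2k} \operatorname{diag}(D, D)\right) v.
\end{aligned} \tag{29}$$

Substituting the values of the coefficients $a_{2k}$ as expressions in terms of the $d_i$'s, we obtain that

$$c_{2k+1} = v^\top \begin{pmatrix} F_k & 0 \\ 0 & F_k \end{pmatrix} v$$

for $k = 0, \ldots, n-1$, and

$$F_k = \begin{pmatrix} f_1 & 0 & \cdots & 0 \\ 0 & f_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & f_n \end{pmatrix}, \qquad f_l = d_l \sum_{\substack{j_1, \ldots, j_k \neq l \\ 1 \leq j_1 < \cdots < j_k \leq n}} d_{j_1}^2 \cdots d_{j_k}^2, \quad l = 1, \ldots, n.$$

This is exactly how we define $g^{ctr}_2(d, v)$. Hence, (iii) is also verified.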
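
The construction in this proof can be checked numerically. The following NumPy sketch builds $L = S^{-1}(L_1 v \mid \cdots \mid L_{2n} v)$ through the recursion $L_{k-1} = A L_k + a_{k-1} I_{2n}$ and tests conditions (i) and (ii), together with the vanishing of the even-indexed $c_{2k}$. The helper names (`char_coeffs`, `g_ctr_1`, `morphism`) and the test data for $n = 2$ are illustrative assumptions, not part of the paper.

```python
import numpy as np

def char_coeffs(d):
    """Coefficients (a_0, ..., a_{2n-1}) of prod_i (lambda^2 + d_i^2)."""
    p = np.array([1.0])
    for di in d:
        p = np.convolve(p, [1.0, 0.0, di ** 2])
    return p[1:][::-1]  # p runs from lambda^{2n} down to lambda^0

def g_ctr_1(a):
    """Companion matrix with last row (-a_0, ..., -a_{2n-1})."""
    G = np.diag(np.ones(len(a) - 1), k=1)
    G[-1, :] = -a
    return G

def morphism(d, v, S):
    """Return L = S^{-1} [L_1 v | ... | L_{2n} v] plus J, diag(D, D) and a."""
    n = len(d)
    I, Z = np.eye(n), np.zeros((n, n))
    J = np.block([[Z, I], [-I, Z]])
    DD = np.diag(np.concatenate([d, d]))
    A, a = J @ DD, char_coeffs(d)
    Lk, cols = np.eye(2 * n), [None] * (2 * n)  # start from L_{2n} = I
    for k in range(2 * n, 0, -1):
        cols[k - 1] = Lk @ v
        Lk = A @ Lk + a[k - 1] * np.eye(2 * n)  # L_{k-1} = A L_k + a_{k-1} I
    return np.linalg.solve(S, np.column_stack(cols)), J, DD, a

# Hypothetical test data; S is a product of two elementary symplectic factors.
d = np.array([1.0, 2.0])
v = np.array([0.5, -1.0, 2.0, 0.3])
M = np.array([[2.0, 1.0], [0.0, 1.0]])
C = np.array([[1.0, 0.5], [0.5, -1.0]])  # symmetric
Z2, I2 = np.zeros((2, 2)), np.eye(2)
S = np.block([[M, Z2], [Z2, np.linalg.inv(M).T]]) @ np.block([[I2, C], [Z2, I2]])

L, J, DD, a = morphism(d, v, S)
Q = S.T @ DD @ S
B = np.linalg.solve(S, v)
g2 = B @ Q @ L  # row vector (c_{2n}, ..., c_1)

ok_i = np.allclose(L @ g_ctr_1(a), J @ Q @ L)    # condition (i)
ok_ii = np.allclose(L[:, -1], B)                 # condition (ii)
ok_even = np.allclose(g2[0::2], 0.0, atol=1e-6)  # c_{2k} = 0
print(ok_i, ok_ii, ok_even)
```

The recursion is a Horner-type evaluation of the polynomials $L_k$ in $A$, which avoids forming high matrix powers explicitly.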

9.2 Proof of Theorem 7 (ii)

Let $(Q, B) \in PH_n$. Obtain $d$ and $v$ from $(Q, B)$ as in the statement of the theorem. We aim to construct a linear system morphism $f_S^{(Q,B)} : \mathbb{R}^{2n} \to \mathbb{R}^{2n}$ between the port-Hamiltonian system $(Q, B) \in PH_n$ and the observable Hamiltonian representation associated to $(d, v) \in \Theta_{OH_n}$, that is,

$$\dot{s} = g^{obs}_1(d)\, s + g^{obs}_2(d, v)\, u, \qquad y = (0, 0, \ldots, 0, 1)\, s. \tag{30}$$

Denote by $L \in M_{2n}$ the matrix implementing the linear map $f_S^{(Q,B)}$, that is, $f_S^{(Q,B)}(s) = Ls$, $s \in \mathbb{R}^{2n}$.

We now construct an $L$ which yields a system morphism. We start by writing $A := J_n \operatorname{diag}(D, D)$ and define, for each $k = 1, \ldots, 2n$, a matrix $L_k \in M_{2n}$ by

$$\begin{aligned}
L_{2n-k} &:= (J_n Q)^k + a_{2n-1} (J_n Q)^{k-1} + \cdots + a_{2n-k} I_{2n} \\
&= (S^{-1} A S)^k + a_{2n-1} (S^{-1} A S)^{k-1} + \cdots + a_{2n-k} I_{2n} \\
&= S^{-1}(A^k + a_{2n-1} A^{k-1} + \cdots + a_{2n-k} I_{2n})\, S, \qquad k = 0, \ldots, 2n-1.
\end{aligned}$$

In particular, $L_{2n} = I_{2n}$. Then, define $L$ as

$$L := \begin{pmatrix} B^\top Q L_1 \\ B^\top Q L_2 \\ \vdots \\ B^\top Q L_{2n} \end{pmatrix}.$$

We now check that $f_S^{(Q,B)}(s) = Ls$ is indeed a system morphism between the port-Hamiltonian system (5) and the observable Hamiltonian representation (30) with $Q = S^\top \operatorname{diag}(D, D)\, S$ and $B = S^{-1} v$. This amounts to checking that

(i) $g^{obs}_1(d)\, L = L J_n Q$,

(ii) $L B = g^{obs}_2(d, v)$,

(iii) $B^\top Q = (0, 0, \ldots, 0, 1)\, L$.

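We note that (iii) is straightforward, since the last row of $L$ is $B^\top Q L_{2n} = B^\top Q$. Now, (i) is equivalent to

$$\begin{pmatrix}
0 & 0 & \cdots & 0 & -a_0 \\
1 & 0 & \cdots & 0 & -a_1 \\
0 & 1 & \cdots & 0 & -a_2 \\
\vdots & & \ddots & & \vdots \\
0 & 0 & \cdots & 1 & -a_{2n-1}
\end{pmatrix}
\begin{pmatrix} B^\top Q L_1 \\ B^\top Q L_2 \\ \vdots \\ B^\top Q L_{2n} \end{pmatrix}
=
\begin{pmatrix} B^\top Q L_1 \\ B^\top Q L_2 \\ \vdots \\ B^\top Q L_{2n} \end{pmatrix} S^{-1} A S.$$

Compare now the $k$-th rows of the left- and right-hand sides of this equality. When $k = 1$, the difference between the first rows of the two sides is

$$B^\top Q L_1 S^{-1} A S + a_0 B^\top Q L_{2n} = B^\top Q (L_1 S^{-1} A S + a_0 I_{2n}) = B^\top Q S^{-1}(A^{2n} + a_{2n-1} A^{2n-1} + \cdots + a_1 A + a_0 I_{2n}) S = B^\top Q \cdot 0 = 0.$$

The last equality follows, as in the proof of Theorem 7 (i), from the Cayley-Hamilton theorem.

When $1 < k \leq 2n$, the difference between the $k$-th rows of the left- and right-hand sides is

$$\begin{aligned}
B^\top Q L_{k-1} - a_{k-1} B^\top Q L_{2n} - B^\top Q L_k S^{-1} A S &= B^\top Q (L_{k-1} - a_{k-1} I_{2n} - L_k S^{-1} A S) \\
&= B^\top Q S^{-1} \big[ (A^{2n-k+1} + a_{2n-1} A^{2n-k} + \cdots + a_{k-1} I_{2n}) - a_{k-1} I_{2n} \\
&\qquad\qquad - (A^{2n-k} + a_{2n-1} A^{2n-k-1} + \cdots + a_k I_{2n}) A \big] S = 0,
\end{aligned}$$

which shows that (i) holds. We now proceed to check (ii). This is equivalent to computing

$$L B = \begin{pmatrix} B^\top Q L_1 \\ B^\top Q L_2 \\ \vdots \\ B^\top Q L_{2n} \end{pmatrix} B.$$

Let us denote $L B = \begin{pmatrix} c_{2n} & c_{2n-1} & c_{2n-2} & \cdots & c_2 & c_1 \end{pmatrix}^\top$. Then we have, for $k = 1, \ldots, 2n$,

$$\begin{aligned}
c_{2n-k+1} = B^\top Q L_k B &= (S^{-1} v)^\top S^\top \operatorname{diag}(D, D)\, S S^{-1}(A^{2n-k} + a_{2n-1} A^{2n-k-1} + \cdots + a_k I_{2n}) S S^{-1} v \\
&= v^\top \operatorname{diag}(D, D)\, (A^{2n-k} + a_{2n-1} A^{2n-k-1} + \cdots + a_k I_{2n})\, v,
\end{aligned}$$

which coincides exactly with the expression of $c_{2n-k+1}$ in the equations (28) and (29) that we provided in the controllable Hamiltonian case. Thus, for (ii) to hold, we simply need to require that $g^{obs}_2(d, v) = (g^{ctr}_2(d, v))^\top$.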

9.3 Proof of Proposition 18

Proof of part (i). We have that $(d_1, v_1) \sim_{sys} (d_2, v_2)$ implies the existence of an invertible matrix $L$ such that

$$\begin{aligned}
L\, g^{ctr}_1(d_1) &= g^{ctr}_1(d_2)\, L, \\
L\, (0, 0, \ldots, 0, 1)^\top &= (0, 0, \ldots, 0, 1)^\top, \\
g^{ctr}_2(d_1, v_1) &= g^{ctr}_2(d_2, v_2)\, L.
\end{aligned}$$

The first condition implies that $\det(\lambda I - g^{ctr}_1(d_1)) = \det(\lambda I - g^{ctr}_1(d_2))$, meaning that

$$\prod_{i=1}^{n} (\lambda^2 + d_{1,i}^2) = \prod_{i=1}^{n} (\lambda^2 + d_{2,i}^2).$$

Therefore, (i) is clear. With the symmetry in (i), it is clear that $g^{ctr}_1(d_1) = g^{ctr}_1(d_2)$. Note that the second condition says that the last column of $L$ is $(0, 0, \ldots, 0, 1)^\top$. Substituting both facts into the first condition $L\, g^{ctr}_1(d_1) = g^{ctr}_1(d_2)\, L$ and comparing both sides, we deduce that $L$ can only be the identity. Thus the third condition becomes $g^{ctr}_2(d_1, v_1) = g^{ctr}_2(d_2, v_2)$, which is exactly (ii). Conversely, if (i) and (ii) hold, one checks that taking $L$ to be the identity works. Thus, $(d_1, v_1) \sim_{sys} (d_2, v_2)$.

Proof of part (ii). Since $\theta_{CH_n}(d_i, v_i)$ ($i = 1, 2$) are linear systems, we can explicitly write down the filters as

$$(y_i(u), z_0)_t = \int_0^t g^{ctr}_2(d_i, v_i)\, e^{g^{ctr}_1(d_i)(t-s)} (0, 0, \ldots, 0, 1)^\top u(s)\, ds + g^{ctr}_2(d_i, v_i)\, e^{g^{ctr}_1(d_i) t} z_0.$$

We consider two special cases. In the first case, we take time $t = 0$; then $(y_i(u), z_0)_0 = g^{ctr}_2(d_i, v_i)\, z_0$. Since $z_0$ can be arbitrary, the two filters coincide if and only if $c_j(d_1, v_1) = c_j(d_2, v_2)$ for all $j = 1, \ldots, n$. In the second case, we take $z_0 = 0$; then

$$(y_i(u), 0)_t = \int_0^t g^{ctr}_2(d_i, v_i) \left( I_{2n} + g^{ctr}_1(d_i)(t-s) + \frac{(g^{ctr}_1(d_i))^2 (t-s)^2}{2!} + \cdots \right) (0, 0, \ldots, 0, 1)^\top u(s)\, ds.$$

By differentiating the above with respect to $t$, and using the fact that the input $u(t)$ is arbitrary (and hence $u(0)$ can be chosen arbitrarily), we see that $y_1$ and $y_2$ coincide as filters if and only if $g^{ctr}_2(d_1, v_1)(g^{ctr}_1(d_1))^k (0, 0, \ldots, 0, 1)^\top = g^{ctr}_2(d_2, v_2)(g^{ctr}_1(d_2))^k (0, 0, \ldots, 0, 1)^\top$ for all $k \in \mathbb{N}$. Moreover, one verifies that $g^{ctr}_2(d, v)(g^{ctr}_1(d))^k (0, 0, \ldots, 0, 1)^\top = 0$ for odd $k$. Thus, if we define $e_i(d, v) = g^{ctr}_2(d, v)(g^{ctr}_1(d))^{2i} (0, 0, \ldots, 0, 1)^\top$, then $y_1 = y_2$ is equivalent to $e_i(d_1, v_1) = e_i(d_2, v_2)$ for all $i \in \mathbb{N}$. Now, one finds a recursion in the values of the $e_i$'s, that is, for all $m \geq n$,

$$e_m(d, v) = -a_{2n-2}\, e_{m-1}(d, v) - a_{2n-4}\, e_{m-2}(d, v) - \cdots - a_2\, e_{m-n+1}(d, v) - a_0\, e_{m-n}(d, v).$$

More precisely, it can be checked that the following recursion holds true:

$$\begin{aligned}
e_1 &= c_1, \\
e_2 &= c_3 - a_{2n-2}\, e_1, \\
e_3 &= c_5 - a_{2n-2}\, e_2 - a_{2n-4}\, e_1, \\
&\ \ \vdots \\
e_n &= c_{2n-1} - a_{2n-2}\, e_{n-1} - a_{2n-4}\, e_{n-2} - \cdots - a_2\, e_1.
\end{aligned}$$

On the other hand, $(a_{2n-2}, \ldots, a_2, a_0)$ happen to be the coefficients of the characteristic polynomial of $g^{ctr}_1(d)$; therefore, by the Cayley-Hamilton theorem, $e_m(d, v) = 0$ for all $m > n$.

In conclusion, by combining the two special cases, we see that $\theta_{CH_n}(d_1, v_1)$ and $\theta_{CH_n}(d_2, v_2)$ induce the same filter if and only if $c_i(d_1, v_1) = c_i(d_2, v_2)$ and $e_i(d_1, v_1) = e_i(d_2, v_2)$ for all $0 \leq i \leq n-1$.
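
The role of the Markov parameters in this argument can be illustrated numerically. The sketch below works directly with a port-Hamiltonian system in normal form, assuming $Q = \operatorname{diag}(D, D)$ and $B = v$ (that is, $S = I_{2n}$), and computes $B^\top Q (J_n Q)^k B$. It checks that the odd-order parameters vanish and that rotating each pair $(v_i, v_{n+i})$ leaves all of them unchanged. The function names `markov` and `rotate` and the data are hypothetical.

```python
import numpy as np

def markov(d, v, K):
    """Markov parameters B^T Q (JQ)^k B, k = 0..K-1, for Q = diag(d, d), B = v."""
    n = len(d)
    I, Z = np.eye(n), np.zeros((n, n))
    J = np.block([[Z, I], [-I, Z]])
    Q = np.diag(np.concatenate([d, d]))
    out, w = [], v.copy()
    for _ in range(K):
        out.append(v @ Q @ w)  # v^T Q (JQ)^k v
        w = J @ Q @ w
    return np.array(out)

def rotate(v, theta):
    """Rotate each pair (v_i, v_{n+i}) by the angle theta_i."""
    n = len(theta)
    c, s = np.cos(theta), np.sin(theta)
    return np.concatenate([c * v[:n] + s * v[n:], -s * v[:n] + c * v[n:]])

# Hypothetical data with n = 3 and distinct symplectic eigenvalues.
d = np.array([1.0, 2.0, 3.5])
v1 = np.array([1.0, -0.5, 2.0, 0.3, 1.5, -1.0])
v2 = rotate(v1, np.array([0.4, -1.1, 2.0]))

m1, m2 = markov(d, v1, 8), markov(d, v2, 8)
odd_vanish = np.allclose(m1[1::2], 0.0, atol=1e-6)
same_markov = np.allclose(m1, m2)
print(odd_vanish, same_markov)
```

The invariance under rotations reflects the fact that the nonzero parameters depend on $v$ only through the sums $v_i^2 + v_{n+i}^2$.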

9.4 Proof of Theorem 22

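$\varphi_S \circ \theta_{CH_n}$ is compatible with $\sim_\star$ and $\sim_{sys}$. Fix a choice of $S \in Sp(2n, \mathbb{R})$. We need to show that $(d_1, v_1) \sim_\star (d_2, v_2)$ if and only if

$$\left(S^\top \operatorname{diag}(D_1, D_1)\, S,\ S^{-1} v_1\right) =: (Q_1, B_1) \sim_{sys} (Q_2, B_2) := \left(S^\top \operatorname{diag}(D_2, D_2)\, S,\ S^{-1} v_2\right).$$

This means there exists an invertible $L$ such that (14) holds. We claim that $L = S^{-1} P A S$ does the job, where $P$ and $A$ are given by Definition 20.

The first condition is

$$\begin{aligned}
L J Q_1 = J Q_2 L &\iff L J Q_1 L^{-1} = J Q_2 \\
&\iff S^{-1} P A S\, J S^\top \operatorname{diag}(D_1, D_1)\, S S^{-1} A^{-1} P^{-1} S = J S^\top \operatorname{diag}(D_2, D_2)\, S \\
&\iff S^{-1} P A\, J \operatorname{diag}(D_1, D_1)\, A^{-1} P^{-1} S = S^{-1} J \operatorname{diag}(D_2, D_2)\, S \\
&\iff A\, J \operatorname{diag}(D_1, D_1)\, A^{-1} = P^\top J \operatorname{diag}(D_2, D_2)\, P.
\end{aligned}$$

The second condition is true by construction, namely

$$L B_1 = B_2 \iff S^{-1} P A S S^{-1} v_1 = S^{-1} v_2 \iff v_2 = P A v_1.$$

The third condition is

$$B_1^\top Q_1 = B_2^\top Q_2 L \iff v_1^\top \operatorname{diag}(D_1, D_1)\, S = v_2^\top \operatorname{diag}(D_2, D_2)\, P A S \iff v_1^\top \operatorname{diag}(D_1, D_1) = v_2^\top \operatorname{diag}(D_2, D_2)\, P A.$$

Based on the compatibility result above, we know that $\varphi_S \circ \theta_{CH_n}$ induces a unique map $\Phi_S : \Theta_{CH_n}/\!\sim_\star\ \to PH_n/\!\sim_{sys}$ defined as $\Phi_S([d, v]_\star) = \left[S^\top \operatorname{diag}(D, D)\, S,\ S^{-1} v\right]_{sys}$. We now verify that $\Phi_S$ does not depend on the choice of $S \in Sp(2n, \mathbb{R})$.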

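$\Phi_S$ is independent of $S$. It suffices to check that, for $S_1 \neq S_2$, we have

$$(Q_1, B_1) := \left(S_1^\top \operatorname{diag}(D, D)\, S_1,\ S_1^{-1} v\right) \sim_{sys} \left(S_2^\top \operatorname{diag}(D, D)\, S_2,\ S_2^{-1} v\right) =: (Q_2, B_2),$$

which again goes back to checking that (14) holds for some invertible $L$. We claim that $L = S_2^{-1} S_1$ does the job. The first condition is

$$L J Q_1 = S_2^{-1} S_1 J S_1^\top \operatorname{diag}(D, D)\, S_1 = S_2^{-1} J \operatorname{diag}(D, D)\, S_1 = J S_2^\top \operatorname{diag}(D, D)\, S_2 S_2^{-1} S_1 = J Q_2 L.$$

The second condition is

$$L B_1 = S_2^{-1} S_1 S_1^{-1} v = S_2^{-1} v = B_2.$$

The third condition is

$$B_2^\top Q_2 L = v^\top S_2^{-\top} S_2^\top \operatorname{diag}(D, D)\, S_2 S_2^{-1} S_1 = v^\top \operatorname{diag}(D, D)\, S_1 = v^\top S_1^{-\top} S_1^\top \operatorname{diag}(D, D)\, S_1 = B_1^\top Q_1.$$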
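
As a numerical illustration of the independence of $S$, the following sketch checks that $L = S_2^{-1} S_1$ intertwines the two port-Hamiltonian systems obtained from the same $(d, v)$ with two different symplectic matrices. It assumes that (14) consists of the three conditions $L J Q_1 = J Q_2 L$, $L B_1 = B_2$ and $B_1^\top Q_1 = B_2^\top Q_2 L$; the matrices and vectors are hypothetical test data.

```python
import numpy as np

n = 2
I, Z = np.eye(n), np.zeros((n, n))
J = np.block([[Z, I], [-I, Z]])
d = np.array([0.7, 2.0])
DD = np.diag(np.concatenate([d, d]))
v = np.array([1.0, -2.0, 0.5, 3.0])

# Two different symplectic matrices (assumed examples).
C = np.array([[1.0, -0.5], [-0.5, 2.0]])  # symmetric
M = np.array([[1.0, 2.0], [0.0, 1.0]])    # invertible
S1 = np.block([[I, C], [Z, I]])
S2 = np.block([[M, Z], [Z, np.linalg.inv(M).T]])

# Sanity check: both matrices satisfy S^T J S = J.
symplectic = all(np.allclose(S.T @ J @ S, J) for S in (S1, S2))

Q1, B1 = S1.T @ DD @ S1, np.linalg.solve(S1, v)
Q2, B2 = S2.T @ DD @ S2, np.linalg.solve(S2, v)
L = np.linalg.solve(S2, S1)  # L = S2^{-1} S1

first = np.allclose(L @ J @ Q1, J @ Q2 @ L)
second = np.allclose(L @ B1, B2)
third = np.allclose(B1 @ Q1, B2 @ Q2 @ L)
print(symplectic, first, second, third)
```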

Since ΦS does not depend on S Sp(2n, R), we may as well choose S = Jn and call it Φ. Then Φ has the

expression Φ([d, v] ) =

. We now verify that Φ is injective and surjective, and hence an

isomorphism.

$\Phi$ is surjective. For an arbitrary choice $[Q, B]_{sys}$ of equivalence class, we take a representative $Q$ and $B$. Since $Q$ is symmetric positive-definite, by Williamson's theorem, $Q = S^\top \operatorname{diag}(D, D)\, S$ for some $S \in Sp(2n, \mathbb{R})$, and the diagonal entries of $D$ are nonnegative and can be identified with $d$. Let $v = S B$. Then we have $\Phi_S([d, v]_\star) = [Q, B]_{sys}$. Given that $\Phi_S = \Phi$ for any $S$, it holds that $\Phi([d, v]_\star) = [Q, B]_{sys}$. This shows that $\Phi$ is surjective.

$\Phi$ is injective. If $\Phi([d_1, v_1]_\star) = \Phi([d_2, v_2]_\star)$, then there exists some invertible $L$ such that the conditions in (14) are all satisfied. We aim to show that $(d_1, v_1) \sim_\star (d_2, v_2)$. The first condition gives

Therefore, {d1,i|i = 1, . . . , n} is the same as {d2,i|i = 1, . . . , n} as a set, and this implies the existence of some σ Sn such that d2,i = d1,σ(i) for i = 1, . . . , n. In other words, there exists some permutation matrix

Pσ such that P

. Thus, (i) of Deﬁnition 20 holds. Further, we have

if we denote A := P T L. Thus, (iii) of Deﬁnition 20 holds true. The second condition of (14) says v2 = Lv1 = PAv1. Thus, (iv) of Deﬁnition 20 holds true. Lastly, the third condition implies

Thus, (ii) in Deﬁnition 20 holds. We conclude that Φ is injective.

$\Phi$ is a homeomorphism with respect to the quotient topology. Before we prove this statement, we first quote a lemma (see, for instance, Abraham et al. (1988)).

Lemma 39 Let $X$ and $Y$ be sets equipped with equivalence relations $\sim_X$ and $\sim_Y$, respectively. If $\phi : X \to Y$ is a map such that, for any $x_1, x_2 \in X$, $x_1 \sim_X x_2$ if and only if $\phi(x_1) \sim_Y \phi(x_2)$, then $\phi$ projects to a unique map $\bar{\phi} : X/\!\sim_X\ \to Y/\!\sim_Y$ between the quotient spaces given by $\bar{\phi}([x]_{\sim_X}) = [\phi(x)]_{\sim_Y}$ and such that the following diagram commutes. In particular, if $\phi$ is a homeomorphism between two topological spaces $X$ and $Y$, then $\bar{\phi}$ is also a homeomorphism.

We now proceed with the proof.

(i) If $(Q_1, B_1)$ and $(Q_2, B_2) \in PH_n$ are linked by some linear symplectic map $S \in Sp(2n, \mathbb{R})$ by $(Q_2, B_2) = (S^{-\top} Q_1 S^{-1}, S B_1)$, then $(Q_1, B_1) \sim_{sys} (Q_2, B_2)$. Therefore, as an immediate consequence of Williamson's normal form, we have $PH_n/\!\sim_{sys}\ = PH^{diag}_n/\!\sim_{sys}$, where

$$PH^{diag}_n := \left\{ \left(\operatorname{diag}(D, D),\ v\right) \ \middle|\ D = \operatorname{diag}(d),\ d_i > 0,\ v \in \mathbb{R}^{2n} \right\}.$$

(ii) There is an obvious homeomorphism $\varphi : \Theta_{CH_n} \to PH^{diag}_n$ given by $\varphi(d, v) = \left(\operatorname{diag}(D, D),\ v\right)$. Therefore, by identifying $PH_n/\!\sim_{sys}$ with $PH^{diag}_n/\!\sim_{sys}$, the induced map of $\varphi$ on the quotients is exactly $\Phi$. By Lemma 39, $\Phi$ is also a homeomorphism.

To summarize, we have that the following diagram commutes.

ΘCHn/ PHn/ sys

9.5 Proof of Proposition 24

The axioms of being a groupoid mostly follow from the definition. Here, we only check the closure of the multiplication operation $m$, that is, $(L_1 L_2, (Q_2, B_2)) \in G_n$. Note that

$$J^\top (L_1 L_2) J Q_2 (L_1 L_2)^{-1} = J^\top L_1 J \left(J^\top L_2 J Q_2 L_2^{-1}\right) L_1^{-1} = J^\top L_1 J Q_1 L_1^{-1}$$

is symmetric positive-definite. On the other hand, we have

$$J^\top (L_1 L_2)^\top J (L_1 L_2) B_2 = J^\top L_2^\top L_1^\top J L_1 L_2 B_2 = J^\top L_2^\top J \left(J^\top L_1^\top J L_1 B_1\right) = J^\top L_2^\top J B_1 = J^\top L_2^\top J L_2 B_2 = B_2.$$

Thus, closure of multiplication is proved. We also need to show that $\alpha$ and $\beta$ are submersions. Indeed, for $(L, (Q, B)) \in G_n$ and $(N, (P, C)) \in T_{(L,(Q,B))} G_n$, it holds that

$$\begin{aligned}
T_{(L,(Q,B))}\alpha\,(N, (P, C)) &= \left.\frac{d}{dt}\right|_{t=0} \left( J^\top (L + tN) J (Q + tP)(L + tN)^{-1},\ (L + tN)(B + tC) \right) \\
&= \left( J^\top N J Q L^{-1} + J^\top L J P L^{-1} - J^\top L J Q L^{-1} N L^{-1},\ LC + NB \right).
\end{aligned}$$

Obviously, $LC + NB$ can traverse $\mathbb{R}^{2n}$ with varying $N \in GL(2n, \mathbb{R})$ and $C \in \mathbb{R}^{2n}$. For the first component, we can take $N = L$ such that it becomes $J^\top L J P L^{-1}$. Since the tangent space of an open submanifold can be identified with the tangent space of the whole manifold, plus the fact that the tangent space of a vector space can be identified with itself, we naturally conclude that $T_{(L,(Q,B))}\alpha$ is surjective and hence $\alpha$ is a submersion. Similarly, one checks that $\beta$ is a submersion. Then, the orbit of the groupoid containing $(Q, B)$ is given by

$$\begin{aligned}
\alpha(\beta^{-1}(Q, B)) &= \alpha\left(\{(L, (Q, B)) \mid L \text{ satisfies 1.(i) and 1.(ii) in Definition 23}\}\right) \\
&= \left\{ (J^\top L J Q L^{-1}, LB) \mid L \text{ satisfies 1.(i) and 1.(ii) in Definition 23} \right\} \\
&= \left\{ (Q', B') \mid (Q', B') \sim_{sys} (Q, B) \right\}.
\end{aligned}$$

9.6 Proof of Proposition 30

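$f$ is well-defined. If $(d_1, v_1) \sim_{sys} (d_2, v_2)$, then there exists an invertible matrix $L_0$ such that

$$\begin{aligned}
L_0\, g^{ctr}_1(d_1) &= g^{ctr}_1(d_2)\, L_0, \\
L_0\, (0, 0, \ldots, 0, 1)^\top &= (0, 0, \ldots, 0, 1)^\top, \\
g^{ctr}_2(d_1, v_1) &= g^{ctr}_2(d_2, v_2)\, L_0.
\end{aligned}$$

Since we are restricting to canonical systems, we apply the representation theorem to deduce the existence of some invertible matrices $L_i$, $i = 1, 2$, such that

$$\begin{aligned}
L_i\, g^{ctr}_1(d_i) &= J Q_i L_i, \\
L_i\, (0, 0, \ldots, 0, 1)^\top &= B_i, \\
g^{ctr}_2(d_i, v_i) &= B_i^\top Q_i L_i.
\end{aligned}$$

Now, check that $L = L_2 L_0 L_1^{-1}$ is invertible and satisfies

$$L J Q_1 = J Q_2 L, \qquad L B_1 = B_2, \qquad B_1^\top Q_1 = B_2^\top Q_2 L.$$

Therefore, $(Q_1, B_1) \sim_{sys} (Q_2, B_2)$.

$f$ is surjective. This is obvious, see the proof above.

$f$ is injective. Given that all the matrices are invertible, this can be shown by essentially reversing the proof of $f$ being well-defined.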

9.7 Proof of Proposition 31

We directly verify that

Γ(σ,(θ1,...,θn)T ) ( σ,( θ1,..., θn)T )(d, v)

= Γ(σ σ,(θ1,...,θn)T +Pσ ( θ1,..., θn)T )(d, v)

= (Pσ σ d, Γ(θ1,...,θn)T ΓPσ ( θ1,..., θn)T

Pσ σ 0 0 Pσ σ

= (PσP σ d,

Γ(θ1,...,θn)T

cos θσ(1) 0 sin θσ(1) 0 ... ... 0 cos θσ(n) 0 sin θσ(n) sin θσ(1) 0 cos θσ(1) 0 ... ... 0 sin θσ(n) 0 cos θσ(n)

P σ 0 0 P σ

= (PσP σ d, Γ(θ1,...,θn)T

Γ( θ1,..., θn)T

P σ 0 0 P σ

= Γ(σ,(θ1,...,θn)T )(Γ( σ,( θ1,..., θn)T )(d, v)).

9.8 Proof of Proposition 32

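Recall that $(d_1, v_1)$ and $(d_2, v_2)$ lie in the same $(S_n \ltimes_\phi \mathbb{T}^n)$-orbit if and only if, for some $\sigma \in S_n$ and $(\theta_1, \ldots, \theta_n) \in \mathbb{T}^n$,

(i) $d_{2,i} = d_{1,\sigma(i)}$, $i = 1, \ldots, n$,

(ii) $v_{1,\sigma(i)}^2 + v_{1,n+\sigma(i)}^2 = v_{2,i}^2 + v_{2,n+i}^2$, $i = 1, \ldots, n$.

Clearly, (i) above is equivalent to Proposition 18 (i). Moreover, Proposition 18 (ii) implies that, for $k = 0, \ldots, n-1$,

$$v_1^\top \begin{pmatrix} F_{1,k} & 0 \\ 0 & F_{1,k} \end{pmatrix} v_1 = (P^\top v_2)^\top \begin{pmatrix} F_{1,k} & 0 \\ 0 & F_{1,k} \end{pmatrix} (P^\top v_2),$$

that is,

$$\sum_{i=1}^{n} (F_{1,k})_{ii} \left(v_{1,i}^2 + v_{1,n+i}^2\right) = \sum_{i=1}^{n} (F_{1,k})_{ii} \left(v_{2,\sigma^{-1}(i)}^2 + v_{2,n+\sigma^{-1}(i)}^2\right).$$

Now, let $R_1 = (R_{1,1}, \ldots, R_{1,n})^\top$, where $R_{1,i} = v_{1,i}^2 + v_{1,n+i}^2$. Let $R_2 = (R_{2,1}, \ldots, R_{2,n})^\top$, where $R_{2,i} = v_{2,\sigma^{-1}(i)}^2 + v_{2,n+\sigma^{-1}(i)}^2$. Identify the diagonal matrix $F_{1,k}$ with a row vector in $\mathbb{R}^n$. Then, the above is equivalent to saying that the inner products of $F_{1,k}$ with $R_1$ and $R_2$ are the same for all $k = 0, \ldots, n-1$. Rewriting these inner products as a matrix multiplication gives

$$\begin{pmatrix} F_{1,0} \\ F_{1,1} \\ \vdots \\ F_{1,n-1} \end{pmatrix} (R_1 - R_2) = 0.$$

The determinant of this matrix is, up to sign,

$$\prod_{i=1}^{n} d_{1,i} \prod_{1 \leq i < j \leq n} \left(d_{1,i}^2 - d_{1,j}^2\right).$$

Since there are no repeated symplectic eigenvalues, we must have $R_1 = R_2$, namely $v_{1,i}^2 + v_{1,n+i}^2 = v_{2,\sigma^{-1}(i)}^2 + v_{2,n+\sigma^{-1}(i)}^2$ for all $i = 1, \ldots, n$. Thus, (ii) holds by inverting the permutation $\sigma$. The converse is clearly true.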

9.9 Proof of Proposition 33

f is well-deﬁned. Let (d1, v1) and (d2, v2) be in the same orbit of the (Sn φ Tn)-action. This means there exists σ Sn and Θ Tn such that Γσ(d1) = d2 and ΓΘ(Γσ(v1)) = v2. This immediately implies (d1) = (d2) , as well as R(v2) = R(Γσ(v1)). Moreover, let σi Sn be the unique permutation such that Γσi(di) = (di) , i = 1, 2. Then we have,

2 ((d2) ) = Γσ 1

Since all the entries of d are distinct, we necessarily have σ = σ 1

2 σ1. We want to show R(Γσ1(v1)) = R(Γσ2(v2)), but since R and Γσ commutes for any σ Sn, this is equivalent to

Γσ1(R(v1)) = Γσ2(R(v2))

Γσ1(R(v1)) = Γσ2(R(Γσ(v1)))

Γσ1(R(v1)) = Γσ2(R(Γσ 1

Γσ1(R(v1)) = R(Γσ1(v1)),

which is clearly true. f is surjective. This is obvious. f is injective. Now suppose ((d1) , R(Γσ1(v1))) = ((d2) , R(Γσ2(v2))). This immediately implies the existence of some σ Sn such that Γσ(d1) = d2. On the other hand, since di = Γσ 1

i (di) , i = 1, 2,

we have σ = σ 1

2 σ1 and hence d2 = Γσ 1

2 σ1(d1). On the other hand, R(Γσ1(v1)) = R(Γσ2(v2)) implies R(Γσ 1

2 σ1(v1)) = R(v2), which further implies the existence of some Θ Tn such that v2 = ΓΘ(Γσ 1

2 σ1(v1)). This concludes that (d1, v1) and (d2, v2) lie in the same orbit.

9.10 Proof of Theorem 34

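Proof of part (i). Say we are given a latent system

$$\dot{z} = J_n Q z + B u, \qquad y = B^\top Q z, \tag{31}$$

with $B \in \mathbb{R}^{2n}$ and $Q$ a $2n \times 2n$ symmetric, positive-definite matrix. Consider the matrix

$$\begin{pmatrix} J_n & 0 \\ 0 & J_{m-n} \end{pmatrix} = \begin{pmatrix} 0 & I_n & 0 & 0 \\ -I_n & 0 & 0 & 0 \\ 0 & 0 & 0 & I_{m-n} \\ 0 & 0 & -I_{m-n} & 0 \end{pmatrix}.$$

There exists a conjugation by an orthogonal matrix that turns this matrix into $J_m$, since only elementary row (column) permutation matrices are involved, and these elementary matrices are themselves orthogonal. That is, there exists $O$ with $O O^\top = O^\top O = I_{2m}$ such that $O \begin{pmatrix} J_n & 0 \\ 0 & J_{m-n} \end{pmatrix} O^\top = J_m$. Now, consider the following linear port-Hamiltonian system in normal form

$$\dot{z}' = J_m\, O \begin{pmatrix} Q & 0 \\ 0 & I_{2m-2n} \end{pmatrix} O^\top z' + O \begin{pmatrix} B \\ 0 \end{pmatrix} u, \qquad y = \left( O \begin{pmatrix} B \\ 0 \end{pmatrix} \right)^{\!\top} O \begin{pmatrix} Q & 0 \\ 0 & I_{2m-2n} \end{pmatrix} O^\top z',$$

with the change of variable $\tilde{z} = O^\top z'$, which is equivalent to

$$\dot{\tilde{z}} = \begin{pmatrix} J_n Q & 0 \\ 0 & J_{m-n} \end{pmatrix} \tilde{z} + \begin{pmatrix} B \\ 0 \end{pmatrix} u, \qquad y = \begin{pmatrix} B^\top Q & 0 \end{pmatrix} \tilde{z},$$

which, restricted to the upper subspace, coincides with (31). Moreover, the matrix $O \begin{pmatrix} Q & 0 \\ 0 & I_{2m-2n} \end{pmatrix} O^\top$ is again symmetric positive-definite by construction.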
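
The embedding into a higher-dimensional port-Hamiltonian system can be sketched numerically as follows. The code builds a permutation matrix $O$ that conjugates $\operatorname{diag}(J_n, J_{m-n})$ into $J_m$, forms the extended pair $(O \operatorname{diag}(Q, I_{2m-2n}) O^\top, O (B, 0)^\top)$, and checks that the extended system has the same Markov parameters as the original one. The specific $Q$, $B$, the dimensions $n = 2$, $m = 4$, and the helper names are illustrative assumptions.

```python
import numpy as np

def J_mat(k):
    """Canonical symplectic matrix J_k = [[0, I_k], [-I_k, 0]]."""
    I, Z = np.eye(k), np.zeros((k, k))
    return np.block([[Z, I], [-I, Z]])

def markov(J, Q, B, K):
    """Markov parameters B^T Q (JQ)^k B for k = 0..K-1."""
    out, w = [], B.copy()
    for _ in range(K):
        out.append(B @ Q @ w)
        w = J @ Q @ w
    return np.array(out)

n, m = 2, 4
# Hypothetical latent system: Q is symmetric and strictly diagonally dominant.
Q = np.array([[2.0, 0.5, 0.0, 0.0],
              [0.5, 1.0, 0.2, 0.0],
              [0.0, 0.2, 3.0, 0.1],
              [0.0, 0.0, 0.1, 1.5]])
B = np.array([1.0, -1.0, 0.5, 2.0])

# Reorder coordinates (q, p, q', p') -> (q, q', p, p') with a permutation matrix.
perm = np.r_[0:n, 2 * n:n + m, n:2 * n, n + m:2 * m]
O = np.eye(2 * m)[perm]

k = 2 * (m - n)
J_blk = np.block([[J_mat(n), np.zeros((2 * n, k))],
                  [np.zeros((k, 2 * n)), J_mat(m - n)]])
Q_blk = np.block([[Q, np.zeros((2 * n, k))],
                  [np.zeros((k, 2 * n)), np.eye(k)]])

Q_ext = O @ Q_blk @ O.T
B_ext = O @ np.concatenate([B, np.zeros(k)])

conj_ok = np.allclose(O @ J_blk @ O.T, J_mat(m))
spd_ok = np.allclose(Q_ext, Q_ext.T) and np.all(np.linalg.eigvalsh(Q_ext) > 0)
io_ok = np.allclose(markov(J_mat(n), Q, B, 6), markov(J_mat(m), Q_ext, B_ext, 6))
print(conj_ok, spd_ok, io_ok)
```

Equality of Markov parameters means that both systems have the same transfer function, which is one way to see that the added coordinates do not affect the input-output behavior.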

Proof of part (ii). According to the system morphism conditions, we just need to check

LJn Q = Jm O

Q 0 0 I2m 2n

The ﬁrst condition is

LJn Q = Jm O

Q 0 0 I2m 2n

OT LJn Q = OT Jm O

Q 0 0 I2m 2n

Jn 0 0 Jm n

Q 0 0 I2m 2n

The second and third conditions are clear with L = O

9.11 Proof of Proposition 35

f is well-deﬁned. Given (Q1, B1) sys (Q2, B2), there exists an invertible L R2n such that (14) is

satisﬁed. Let L = O

L 0 0 I2m 2n

OT . Check that L satisﬁes the conditions (14) together with (Q

2). Therefore, (Q

f is surjective. This is clear from deﬁnition of (Q , B ).

f is injective. Given (Q

2), it means there exists an invertible L R2m such that L

satisﬁes the conditions in (14) together with (Q

2). Write the matrix OT L O in the form L1 L2 L3 L4

, where L1 R2n. Then check L1 satisﬁes the conditions (14) together with (Q1, B1) and (Q2, B2).

Therefore, (Q1, B1) sys (Q2, B2).

9.12 Proof of Proposition 36

Clearly, Q is also symmetric and positive-deﬁnite. Thus, again by Williamson s theorem, Q = (S )T

As before, we have

λI2m (S ) 1

λI2m Jm(S )T

= det(λI2m Jm Q )

Q 0 0 I2m 2n

Q 0 0 I2m 2n

Q 0 0 I2m 2n

λ(Jm O) 1Jm(Jm O) +

Q 0 0 I2m 2n

Q 0 0 I2m 2n

λJn 0 0 λJm n

Q 0 0 I2m 2n

= det(λJm n + I2m 2n) det(λJn + Q)

= (λ2 + 1)m n det(λI2n Jn Q)

= (λ2 + 1)m n

If we ﬁxed the order of symplectic eigenvalues d according to d = (d1, . . . , dn, 1, . . . , 1), then

Q 0 0 I2m 2n

ST 0 0 Jm n

D 0 0 D 0 0 Im n 0 0 Im n

D 0 0 Im n 0 0 D 0 0 Im n

Now, we check the matrix O

OT is symplectic, that is,

Jn 0 0 Jm n

Jn 0 0 Jm n

Therefore, O

OT is a symplectic matrix diagonalizing Q in Williamson s theorem. Then, we

v = S B = O

9.13 Proof of Proposition 37

Similar to the proof of Theorem 22, simply replace Q with O

Q 0 0 I2m 2n

OT , B with O

OT , D with

upper 0m n v T

9.14 A note on the design of discrete integrators on the transformed space

Even though we used just a simple Euler integration scheme in the numerical illustration, structure-preserving integration algorithms could have been used. In particular, we could have used an implicit midpoint rule, which is symplectic (see Marsden and West (2001)), that is, it preserves the symplectic form $dq \wedge dp$. Recall that if $L_{Lag}(q, \dot{q})$ is the Lagrangian function of the system of interest, then the midpoint integrator is obtained by using the discrete Lagrangian

$$L_d^{\alpha}(q_0, q_1, h) = h\, L_{Lag}\left((1-\alpha) q_0 + \alpha q_1,\ \frac{q_1 - q_0}{h}\right)$$

with $\alpha = \tfrac{1}{2}$ to approximate the exact discrete Lagrangian

$$L_d^{E}(q_0, q_1, h) = \int_0^h L_{Lag}(q_{0,1}(t), \dot{q}_{0,1}(t))\, dt.$$

Explicitly, the midpoint integrator for a linear autonomous Hamiltonian system is

$$z_{n+1} - z_n = h J Q\, \frac{z_n + z_{n+1}}{2},$$

which in terms of the controllable Hamiltonian representation reads

$$L(s_{n+1} - s_n) = \frac{h}{2} J Q L (s_{n+1} + s_n) = \frac{h}{2} L\, g^{ctr}_1(d)(s_{n+1} + s_n), \tag{32}$$

where the second equality holds by the construction of $L$ in the proof of Theorem 7 part (i).

Thus, for the symplectic structure to be preserved in the original space, we can merely integrate by requiring $s_{n+1} - s_n = \frac{h}{2}\, g^{ctr}_1(d)(s_{n+1} + s_n)$, where $g^{ctr}_1(d)$, as we have seen, takes the form

$$g^{ctr}_1(d) = \begin{pmatrix}
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1 \\
-a_0 & -a_1 & -a_2 & \cdots & -a_{2n-1}
\end{pmatrix}.$$

Therefore, the integrator is given by

$$s_{n+1} = \left(I_{2n} - \frac{h}{2}\, g^{ctr}_1(d)\right)^{-1} \left(I_{2n} + \frac{h}{2}\, g^{ctr}_1(d)\right) s_n,$$

where the matrix inverse is well-defined for a sufficiently small time step $h$. Indeed, the integrator can be defined on the quotient space of $L$, since by (32), we may as well choose $s_{n+1}$ such that

$$s_{n+1} - s_n = \frac{h}{2}\, g^{ctr}_1(d)(s_{n+1} + s_n) + s_{ker}$$

for an arbitrary $s_{ker} \in \ker(L)$.
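
A minimal sketch of the implicit midpoint update follows, written for a generic linear vector field $\dot{x} = G x$ so that it applies both to $G = J Q$ in the original coordinates and to $G = g^{ctr}_1(d)$ in the transformed ones. Here it is applied to a hypothetical $Q$ with $n = 2$, and the code checks that the one-step map is symplectic and that the Hamiltonian $\tfrac{1}{2} z^\top Q z$ is preserved up to round-off. The function name `midpoint_matrix`, the step size and the data are assumptions for illustration.

```python
import numpy as np

def midpoint_matrix(G, h):
    """One-step map (I - h/2 G)^{-1} (I + h/2 G) of the implicit midpoint rule."""
    I = np.eye(G.shape[0])
    return np.linalg.solve(I - 0.5 * h * G, I + 0.5 * h * G)

n = 2
I, Z = np.eye(n), np.zeros((n, n))
J = np.block([[Z, I], [-I, Z]])
# Hypothetical symmetric positive-definite Q (strictly diagonally dominant).
Q = np.array([[2.0, 0.5, 0.0, 0.0],
              [0.5, 1.0, 0.2, 0.0],
              [0.0, 0.2, 3.0, 0.1],
              [0.0, 0.0, 0.1, 1.5]])
h = 0.1
M = midpoint_matrix(J @ Q, h)

# Iterate the map and compare the energy before and after.
z = np.array([1.0, 0.0, -0.5, 2.0])
H0 = 0.5 * z @ Q @ z
for _ in range(1000):
    z = M @ z
H1 = 0.5 * z @ Q @ z

is_symplectic = np.allclose(M.T @ J @ M, J)
energy_drift = abs(H1 - H0)
print(is_symplectic, energy_drift)
```

Using `np.linalg.solve` instead of forming the inverse explicitly is a common numerical choice; the one-step matrix only needs to be computed once because the system is linear and autonomous.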

By a similar argument, the midpoint rule in terms of the observable Hamiltonian representation reads

$$s_{n+1} - s_n = L(z_{n+1} - z_n) = \frac{h}{2} L J Q (z_{n+1} + z_n) = \frac{h}{2}\, g^{obs}_1(d)(s_{n+1} + s_n), \tag{33}$$

where the last equality holds by the construction of $L$ in Theorem 7 part (ii).

Therefore, the integrator is

$$s_{n+1} = \left(I_{2n} - \frac{h}{2}\, g^{obs}_1(d)\right)^{-1} \left(I_{2n} + \frac{h}{2}\, g^{obs}_1(d)\right) s_n.$$

In the case of a port-Hamiltonian system, if the system is driven by some fiber-preserving external force $f_H$, that is, some input as in our case, then the discrete Lagrange-d'Alembert principle can be used to construct variational integrators so that all the correspondence relationships and error analysis of standard variational integrators still hold (see Marsden and West (2001)). For example, the midpoint rule applied to the controllable Hamiltonian representation becomes

zn+1 zn = h JQ

L(sn+1 sn) = h

1 (d)(sn+1 + sn) +

Note that this structure-preserving integrator is not explicit in general.