# Clifford Group Equivariant Neural Networks

David Ruhe (AI4Science Lab, AMLab, API, University of Amsterdam), david.ruhe@gmail.com
Johannes Brandstetter (Microsoft Research AI4Science), brandstetter@ml.jku.at
Patrick Forré (AI4Science Lab, AMLab, University of Amsterdam), p.d.forre@uva.nl

37th Conference on Neural Information Processing Systems (NeurIPS 2023). Code is available at https://github.com/DavidRuhe/clifford-group-equivariant-neural-networks.

**Abstract**

We introduce Clifford Group Equivariant Neural Networks: a novel approach for constructing O(n)- and E(n)-equivariant models. We identify and study the Clifford group: a subgroup inside the Clifford algebra tailored to achieve several favorable properties. Primarily, the group's action forms an orthogonal automorphism that extends beyond the typical vector space to the entire Clifford algebra while respecting the multivector grading. This leads to several non-equivalent subrepresentations corresponding to the multivector decomposition. Furthermore, we prove that the action respects not just the vector space structure of the Clifford algebra but also its multiplicative structure, i.e., the geometric product. These findings imply that every polynomial in multivectors, including their grade projections, constitutes an equivariant map with respect to the Clifford group, allowing us to parameterize equivariant neural network layers. An advantage worth mentioning is that we obtain expressive layers that can elegantly generalize to inner-product spaces of any dimension. We demonstrate, notably from a single core implementation, state-of-the-art performance on several distinct tasks, including a three-dimensional n-body experiment, a four-dimensional Lorentz-equivariant high-energy physics experiment, and a five-dimensional convex hull experiment.

## 1 Introduction

Incorporating group equivariance to ensure symmetry constraints in neural networks has been a highly fruitful line of research [CW16, WGTB17, CGKW18, KT18, WC19, WGW+18, Est20, BBCV21, WFVW21, CLW22]. Besides translation and permutation equivariance [ZKR+17, SGT+08], rotation equivariance proved to be vitally important for many graph-structured problems as encountered in, e.g., the natural sciences. Applications of such methods include modeling the dynamics of complex physical systems or motion trajectories [KFW+18, BHP+22]; studying or generating molecules, proteins, and crystals [RDRL14, GLB+16, CTS+17, SSK+18, ZCD+20, TVS+21, AGB22]; and point cloud analysis [WSK+15, UPH+19]. Note that many of these focus on three-dimensional problems involving rotation, reflection, or translation equivariance by considering the groups O(3), SO(3), E(3), or SE(3).

Such equivariant neural networks can be broadly divided into three categories: approaches that scalarize geometric quantities, methods employing regular group representations, and those utilizing irreducible representations, often of O(3) [HRXH22].

$$\underbrace{1}_{\text{Scalars}} \;+\; \underbrace{e_1 + e_2 + e_3}_{\text{Vectors}} \;+\; \underbrace{e_{12} + e_{13} + e_{23}}_{\text{Bivectors}} \;+\; \underbrace{e_{123}}_{\text{Trivectors}}$$

Figure 1: CGENNs (represented with $\phi$) are able to operate on multivectors (elements of the Clifford algebra) in an O(n)- or E(n)-equivariant way. Specifically, when an action $\rho(w)$ of the Clifford group, representing an orthogonal transformation such as a rotation, is applied to the data, the model's representations corotate. Multivectors can be decomposed into scalar, vector, bivector, trivector, and even higher-order components. These elements can represent geometric quantities such as (oriented) areas or volumes. The action $\rho(w)$ is designed to respect these structures when acting on them.
Scalarization methods operate exclusively on scalar features or manipulate higher-order geometric quantities such as vectors via scalar multiplication [SSK+18, CCG18, KGG20, KKN20, SHW21, JES+21, GBG21, SUG21, DLD+21, HHR+22, TF22]. They can be limited by the fact that they do not extract all directional information. Regular representation methods construct equivariant maps through an integral over the respective group [CW16, KT18]. For continuous Lie groups, however, this integral is intractable and requires coarse approximation [FSIW20, Bek19]. Methods of the third category employ the irreducible representations of O(3) (the Wigner-D matrices) and operate in a steerable spherical harmonics basis [TSK+18, AHK19, FWFW20, BHP+22, BMS+22]. This basis allows a decomposition into type-$l$ vector subspaces that transform under $D^l$: the type-$l$ matrix representation of O(3) [Len90, FA+91]. Through tensor products decomposed using Clebsch-Gordan coefficients (the Clebsch-Gordan tensor product), vectors (of different types) interact equivariantly. These tensor products can be parameterized using learnable weights. Key limitations of such methods include the necessity for an alternative basis, along with acquiring the Clebsch-Gordan coefficients, which, although known for unitary groups of any dimension [KR67], are not trivial to obtain [AKHvD11].

We propose Clifford Group Equivariant Neural Networks (CGENNs): an equivariant parameterization of neural networks based on Clifford algebras. Inside the algebra, we identify the Clifford group and its action, termed the (adjusted) twisted conjugation, which has several advantageous properties. Unlike classical approaches that represent these groups on their corresponding vector spaces, we carefully extend the action to the entire Clifford algebra. There, it automatically acts as an orthogonal automorphism that respects the multivector grading, enabling nontrivial subrepresentations that operate on the algebra subspaces. Furthermore, the twisted conjugation respects the Clifford algebra's multiplicative structure, i.e., the geometric product, allowing us to bypass the need for explicit tensor product representations. As a result, we obtain two remarkable properties. First, all polynomials in multivectors generate Clifford group equivariant maps from the Clifford algebra to itself. Additionally, grade projections are equivariant, allowing for a denser parameterization of such polynomials. We then demonstrate how to construct parameterizable neural network layers using these properties.

Our method comes with several advantages. First, instead of operating on alternative basis representations such as the spherical harmonics, CGENNs (similarly to scalarization methods) directly transform data in a vector basis. Second, multivector representations allow a (geometrically meaningful) product structure while maintaining a finite dimensionality, as opposed to tensor product representations. Through geometric products, we can transform vector-valued information, resulting in a more accurate and nuanced interpretation of the underlying structures compared to scalarization methods. Further, we can represent exotic geometric objects such as pseudovectors, encountered in certain physics problems, which transform in a nonstandard manner.
Third, our method readily generalizes to orthogonal groups regardless of the dimension or metric signature of the space, thereby attaining O(n)- or E(n)-equivariance. These advantages are demonstrated on equivariance benchmarks of different dimensionality. Note that specialized tools were developed for several of these tasks, while CGENNs can be applied more generally.

## 2 The Clifford Algebra

Clifford algebras (also known as geometric algebras) are powerful mathematical objects with applications in various areas of science and engineering. For a complete formal construction, we refer the reader to Appendix D. Let $V$ be a finite-dimensional vector space over a field $\mathbb{F}$ equipped with a quadratic form $q: V \to \mathbb{F}$. The Clifford algebra $\mathrm{Cl}(V, q)$ is the unital, associative, noncommutative algebra generated by $V$ such that for every $v \in V$ the relation $v^2 = q(v)$ holds, i.e., vectors square to scalars. This single property alone generates a unique mathematical theory that underpins many applications. Note that every element $x$ of the Clifford algebra $\mathrm{Cl}(V, q)$ is a linear combination of (formal, noncommutative) products of vectors modulo the condition that every appearing square $v^2$ gets identified with the scalar $q(v)$: $x = \sum_{i \in I} c_i \, v_{i,1} \cdots v_{i,k_i}$ (if $k_i = 0$, the empty product of vectors denotes the unit $1 \in \mathrm{Cl}(V, q)$). Here, the index set $I$ is finite, $c_i \in \mathbb{F}$, $k_i \in \mathbb{N}_0$, $v_{i,j} \in V$. The Clifford algebra's associated bilinear form $b(v_1, v_2) := \tfrac{1}{2}\left(q(v_1 + v_2) - q(v_1) - q(v_2)\right)$ yields the fundamental Clifford identity: $v_1 v_2 + v_2 v_1 = 2 b(v_1, v_2)$ for $v_1, v_2 \in V$ (Lemma D.3). In this context, the quantity $v_1 v_2$ represents the geometric product, which is aptly named for its ability to compute geometric properties and facilitate various transformations. Note that when $v_1, v_2$ are orthogonal (e.g., for orthogonal basis vectors), $b(v_1, v_2) = 0$, in which case $v_1 v_2 = -v_2 v_1$.

The dimensionality of the algebra is $2^n$, where $n := \dim V$ (Theorem D.15). Let $e_1, \ldots, e_n$ be an orthogonal basis of $V$. The tuple $(e_A)_{A \subseteq [n]}$, $[n] := \{1, \ldots, n\}$, is an orthogonal basis for $\mathrm{Cl}(V, q)$, where for all such $A$ the product $e_A := \prod_{i \in A} e_i$ is taken in increasing order (Theorem D.26). We will see below that we can decompose the algebra into vector subspaces $\mathrm{Cl}^{(m)}(V, q)$, $m = 0, \ldots, n$, called grades, where $\dim \mathrm{Cl}^{(m)}(V, q) = \binom{n}{m}$. Elements of grade $m = 0$ and $m = 1$ are scalars ($\mathrm{Cl}^{(0)}(V, q) = \mathbb{F}$) and vectors ($\mathrm{Cl}^{(1)}(V, q) = V$), respectively, while elements of grade $m = 2$ and $m = 3$ are referred to as bivectors and trivectors. Similar terms are used for elements of even higher grade. These higher-order grades can represent (oriented) points, areas, and volumes, as depicted in Figure 1.

Clifford algebras provide tools that allow for meaningful algebraic representation and manipulation of geometric quantities, including areas and volumes [RDK21, DM02, DFM09]. In addition, they offer generalizations such as extensions of the exterior and Grassmannian algebras, along with the natural inclusion of complex numbers and Hamilton's quaternions [Gra62, Ham66]. Applications of Clifford algebras can be found in robotics [BCRLZE06, HZBC08], computer graphics [WCL05, BTH22], signal processing [Hit12, BMQQS+21], and animation [HP10, CC12]. In the context of machine learning, Clifford algebras and hypercomplex numbers have been employed to improve the performance of algorithms in various tasks. For example, [MFW21] learn an equivariant embedding using geometric neurons for classification tasks. Further, [Spe21] introduce geometric algebra attention networks for point cloud problems in physics, chemistry, and biology.
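To make the basis blades $e_A$ and the geometric product concrete, here is a minimal Python/NumPy sketch (not taken from the paper's code release) for a diagonal metric: blades are encoded as bitmasks over the orthogonal basis, and multiplication tracks the sign from reordering generators together with the squares $e_i^2 = q(e_i)$.

```python
# Sketch only: geometric product of basis blades of Cl(R^n, q) for a diagonal metric.
import numpy as np

def blade_mul(a, b, metric):
    """Return (sign, blade) such that e_a * e_b = sign * e_blade, for bitmask blades."""
    sign = 1.0
    for i in range(len(metric)):
        if not (b >> i) & 1:
            continue
        # e_i must pass every generator of `a` with a larger index; each swap flips the sign.
        if bin(a >> (i + 1)).count("1") % 2:
            sign = -sign
        if (a >> i) & 1:            # e_i e_i = q(e_i)
            sign *= metric[i]
            a &= ~(1 << i)
        else:
            a |= 1 << i
    return sign, a

def geometric_product(x, y, metric):
    """x, y: length-2^n coefficient arrays, one entry per basis blade."""
    out = np.zeros_like(x)
    for a, xa in enumerate(x):
        for b, yb in enumerate(y):
            if xa and yb:
                sign, blade = blade_mul(a, b, metric)
                out[blade] += sign * xa * yb
    return out

metric = [1.0, 1.0, 1.0]                           # Euclidean R^3
v = np.zeros(8); v[0b001], v[0b010] = 2.0, 3.0     # v = 2 e1 + 3 e2
print(geometric_product(v, v, metric)[0])          # v^2 = q(v) = 13, a scalar

e1, e2 = np.zeros(8), np.zeros(8)
e1[0b001] = e2[0b010] = 1.0
# orthogonal vectors anticommute: e1 e2 = -e2 e1
assert np.allclose(geometric_product(e1, e2, metric), -geometric_product(e2, e1, metric))
```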
More recently, [BBWG22] introduce Clifford neural layers and Clifford Fourier transforms for accelerating the solution of partial differential equations. [RGdK+23] continue this direction, strengthening the geometric inductive bias through the introduction of geometric templates. Concurrently with this work, [BDHBC23] develop the geometric algebra transformer. Further, [LK14, PRM+18, ZXXC18] introduce complex-valued and quaternion-valued networks. Finally, [KIdHN23] define normalizing flows on the group of unit quaternions for sampling molecular crystal structures.

## 3 Theoretical Results

In order to construct equivariant multivector-valued neural networks, we outline our theoretical results. We first introduce the following theorem regarding the multivector grading of the Clifford algebra, which is well-known in case the algebra's metric is non-degenerate. Although the general case, including a potentially degenerate metric, appears to be accepted, we were unable to find a self-contained proof during our studies. Hence, we include it here for completeness.

Theorem 3.1 (The multivector grading of the Clifford algebra). Let $e_1, \ldots, e_n$ and $b_1, \ldots, b_n$ be two orthogonal bases of $(V, q)$. Then the following sub-vector spaces $\mathrm{Cl}^{(m)}(V, q)$ of $\mathrm{Cl}(V, q)$, $m = 0, \ldots, n$, are independent of the choice of the orthogonal basis, i.e.,

$$\mathrm{Cl}^{(m)}(V, q) := \mathrm{span}\left\{ e_A \mid A \subseteq [n], |A| = m \right\} \overset{!}{=} \mathrm{span}\left\{ b_A \mid A \subseteq [n], |A| = m \right\}. \tag{1}$$

The proof can be found in Theorem D.27. Intuitively, this means that the claims made in the following (and the supplementary material) are not dependent on the chosen frame of reference, even in the degenerate setting.

We now declare that the Clifford algebra $\mathrm{Cl}(V, q)$ decomposes into an orthogonal direct sum of the vector subspaces $\mathrm{Cl}^{(m)}(V, q)$. To this end, we need to extend the bilinear form $b$ from $V$ to $\mathrm{Cl}(V, q)$. For elements $x_1, x_2, x \in \mathrm{Cl}(V, q)$, the extended bilinear form $\bar{b}$ and the extended quadratic form $\bar{q}$ are given via the projection onto the zero-component (explained below), where $\beta: \mathrm{Cl}(V, q) \to \mathrm{Cl}(V, q)$ denotes the main anti-involution of $\mathrm{Cl}(V, q)$, i.e., the map reversing the order of vector factors, $\beta\left(\sum_{i \in I} c_i\, v_{i,1} \cdots v_{i,k_i}\right) := \sum_{i \in I} c_i\, v_{i,k_i} \cdots v_{i,1}$ (Definition D.18):

$$\bar{b}: \mathrm{Cl}(V, q) \times \mathrm{Cl}(V, q) \to \mathbb{F}, \qquad \bar{b}(x_1, x_2) := \left(\beta(x_1)\, x_2\right)^{(0)}, \qquad \bar{q}(x) := \bar{b}(x, x). \tag{2}$$

Note that by construction, both $\bar{b}$ and $\bar{q}$ reduce to their original versions when restricted to $V$. Using the extended quadratic form, the tuple $(\mathrm{Cl}(V, q), \bar{q})$ turns into a quadratic vector space in itself. As a corollary (see Corollary D.30), the Clifford algebra has an orthogonal-sum decomposition w.r.t. the extended bilinear form $\bar{b}$:

$$\mathrm{Cl}(V, q) = \bigoplus_{m=0}^{n} \mathrm{Cl}^{(m)}(V, q), \qquad \dim \mathrm{Cl}^{(m)}(V, q) = \binom{n}{m}. \tag{3}$$

This result implies that we can always write $x \in \mathrm{Cl}(V, q)$ as $x = x^{(0)} + x^{(1)} + \cdots + x^{(n)}$, where $x^{(m)} \in \mathrm{Cl}^{(m)}(V, q)$ denotes the grade-$m$ part of $x$. Selecting a grade defines an orthogonal projection:

$$(\_)^{(m)}: \mathrm{Cl}(V, q) \to \mathrm{Cl}^{(m)}(V, q), \qquad x \mapsto x^{(m)}, \qquad m = 0, \ldots, n. \tag{4}$$

Let us further introduce the notation $\mathrm{Cl}^{[0]}(V, q) := \bigoplus_{m\ \mathrm{even}} \mathrm{Cl}^{(m)}(V, q)$, whose elements are of even parity, and $\mathrm{Cl}^{[1]}(V, q) := \bigoplus_{m\ \mathrm{odd}} \mathrm{Cl}^{(m)}(V, q)$ for those of odd parity. We use $x = x^{[0]} + x^{[1]}$ to denote the parity decomposition of a multivector.
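Continuing the bitmask sketch from Section 2 (again an illustration, not the paper's implementation), grade projection, the parity split, and the extended quadratic form take only a few lines for a diagonal metric:

```python
# Sketch: grade/parity projections and the extended quadratic form q-bar of Eq. (2),
# for multivectors stored as length-2^n coefficient arrays indexed by blade bitmask.
import numpy as np

def grade(blade):
    """|A|: the number of generators in the basis blade e_A."""
    return bin(blade).count("1")

def grade_project(x, m):
    """(_)^(m) of Eq. (4): keep only coefficients of blades with grade m."""
    out = np.zeros_like(x)
    for blade, coeff in enumerate(x):
        if grade(blade) == m:
            out[blade] = coeff
    return out

def parity_split(x):
    """x = x^[0] + x^[1]: even- and odd-grade parts."""
    n = int(np.log2(len(x)))
    even = sum(grade_project(x, m) for m in range(0, n + 1, 2))
    odd = sum(grade_project(x, m) for m in range(1, n + 1, 2))
    return even, odd

def extended_q(x, metric):
    """q-bar(x) = b-bar(x, x): for an orthogonal basis, blade e_A contributes its
    squared coefficient times prod_{i in A} q(e_i)."""
    total = 0.0
    for blade, coeff in enumerate(x):
        weight = 1.0
        for i in range(len(metric)):
            if (blade >> i) & 1:
                weight *= metric[i]
        total += weight * coeff ** 2
    return total

metric = [1.0, 1.0, 1.0]
x = np.arange(8.0)                  # some multivector in Cl(R^3)
assert np.allclose(sum(grade_project(x, m) for m in range(4)), x)   # Eq. (3)
print(extended_q(x, metric))        # Euclidean case: sum of squared coefficients
```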
### 3.1 The Clifford Group and its Clifford Algebra Representations

Let $\mathrm{Cl}^{\times}(V, q)$ denote the group of invertible elements of the Clifford algebra, i.e., the set of those elements $w \in \mathrm{Cl}(V, q)$ that have an inverse $w^{-1} \in \mathrm{Cl}(V, q)$: $w^{-1} w = w w^{-1} = 1$. For $w \in \mathrm{Cl}^{\times}(V, q)$, we then define the (adjusted) twisted conjugation as follows:

$$\rho(w): \mathrm{Cl}(V, q) \to \mathrm{Cl}(V, q), \qquad \rho(w)(x) := w\, x^{[0]}\, w^{-1} + \alpha(w)\, x^{[1]}\, w^{-1}, \tag{5}$$

where $\alpha$ is the main involution of $\mathrm{Cl}(V, q)$, given by $\alpha(w) := w^{[0]} - w^{[1]}$. This map $\rho(w): \mathrm{Cl}(V, q) \to \mathrm{Cl}(V, q)$, notably not just from $V \to V$, will be essential for constructing equivariant neural networks operating on the Clifford algebra. In general, $\rho$ and similar maps defined in the literature do not always possess the required properties (see Motivation E.1). However, when our $\rho$ is restricted to a carefully chosen subgroup of $\mathrm{Cl}^{\times}(V, q)$, many desirable characteristics emerge. This subgroup will be called the Clifford group of $\mathrm{Cl}(V, q)$ (we elaborate on the term "Clifford group", in contrast to similar definitions, in Remark E.14), and we define it as:

$$\Gamma(V, q) := \left\{ w \in \mathrm{Cl}^{\times}(V, q) \cap \left( \mathrm{Cl}^{[0]}(V, q) \cup \mathrm{Cl}^{[1]}(V, q) \right) \;\middle|\; \forall v \in V:\ \rho(w)(v) \in V \right\}. \tag{6}$$

In words, $\Gamma(V, q)$ contains all invertible (parity) homogeneous elements that preserve vectors ($m = 1$ elements) via $\rho$. The special Clifford group is defined as $\Gamma^{[0]}(V, q) := \Gamma(V, q) \cap \mathrm{Cl}^{[0]}(V, q)$.

Regarding the twisted conjugation, $\rho(w)$ was ensured to reduce to a reflection when restricted to $V$, a property that we will conveniently use in the upcoming section. Specifically, when $w, x \in \mathrm{Cl}^{(1)}(V, q) = V$, $w \in \mathrm{Cl}^{\times}(V, q)$, then $\rho(w)$ reflects $x$ in the hyperplane normal to $w$:

$$\rho(w)(x) = -w x w^{-1} \overset{!}{=} x - 2\, \frac{b(w, x)}{b(w, w)}\, w. \tag{7}$$

Figure 2: Commutative diagrams expressing Clifford group equivariance with respect to the main operations: polynomials $F$ (left) and grade projections $(\_)^{(m)}$ (right).

Next, we collect several advantageous identities of $\rho$ in the following theorem, which we elaborate on afterwards. For proofs, consider Lemma E.8, Theorem E.10, and Theorem E.29.

Theorem 3.2. Let $w_1, w_2, w \in \Gamma(V, q)$, $x_1, x_2, x \in \mathrm{Cl}(V, q)$, $c \in \mathbb{F}$, $m \in \{0, \ldots, n\}$. Then $\rho$ satisfies:

1. Additivity: $\rho(w)(x_1 + x_2) = \rho(w)(x_1) + \rho(w)(x_2)$,
2. Multiplicativity: $\rho(w)(x_1 x_2) = \rho(w)(x_1)\, \rho(w)(x_2)$, and: $\rho(w)(c) = c$,
3. Invertibility: $\rho(w^{-1})(x) = \rho(w)^{-1}(x)$,
4. Composition: $\rho(w_2)\left(\rho(w_1)(x)\right) = \rho(w_2 w_1)(x)$, and: $\rho(c)(x) = x$ for $c \neq 0$,
5. Orthogonality: $\bar{b}\left(\rho(w)(x_1), \rho(w)(x_2)\right) = \bar{b}(x_1, x_2)$.

The first two properties state that $\rho(w)$ is not only additive, but even multiplicative with respect to the geometric product. The third states that $\rho(w)$ is invertible, making it an algebra automorphism of $\mathrm{Cl}(V, q)$. The fourth property then states that $\rho: \Gamma(V, q) \to \mathrm{Aut}_{\mathrm{Alg}}(\mathrm{Cl}(V, q))$ is also a group homomorphism to the group of all algebra automorphisms. In other words, $\mathrm{Cl}(V, q)$ is a linear representation of $\Gamma(V, q)$ and, moreover, it is also an algebra representation of $\Gamma(V, q)$. Finally, the last point shows that each $\rho(w)$ generates an orthogonal map with respect to the extended bilinear form $\bar{b}$.
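As a numerical sanity check of Eq. (7), the following sketch (reusing the bitmask blade encoding from Section 2; not the paper's code) applies the twisted conjugation of a vector $w$ to a vector $x$ and compares it with the reflection formula.

```python
# Sketch: for a vector w, rho(w)(x) = alpha(w) x w^{-1} reduces on V to the reflection of Eq. (7).
import numpy as np

def blade_mul(a, b, metric):
    sign = 1.0
    for i in range(len(metric)):
        if (b >> i) & 1:
            if bin(a >> (i + 1)).count("1") % 2:
                sign = -sign
            if (a >> i) & 1:
                sign *= metric[i]; a &= ~(1 << i)
            else:
                a |= 1 << i
    return sign, a

def gp(x, y, metric):
    out = np.zeros_like(x)
    for a, xa in enumerate(x):
        for b, yb in enumerate(y):
            if xa and yb:
                s, c = blade_mul(a, b, metric)
                out[c] += s * xa * yb
    return out

def embed(v):
    """Embed a 3-vector into the grade-1 slots of Cl(R^3)."""
    mv = np.zeros(8)
    mv[[0b001, 0b010, 0b100]] = v
    return mv

metric = [1.0, 1.0, 1.0]
rng = np.random.default_rng(0)
w_vec, x_vec = rng.normal(size=3), rng.normal(size=3)
w, x = embed(w_vec), embed(x_vec)

w_inv = w / (w_vec @ w_vec)                      # for a vector, w^{-1} = w / q(w)
rho_wx = gp(gp(-w, x, metric), w_inv, metric)    # alpha(w) = -w, since w is odd
reflection = x_vec - 2 * (w_vec @ x_vec) / (w_vec @ w_vec) * w_vec

assert np.allclose(rho_wx[[0b001, 0b010, 0b100]], reflection)   # Eq. (7)
assert np.allclose(np.delete(rho_wx, [1, 2, 4]), 0.0)           # the result stays in V
```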
These properties yield the following results (see also Theorem E.16, Corollary E.18, Figure 2).

Corollary 3.3 (All grade projections are Clifford group equivariant). For $w \in \Gamma(V, q)$, $x \in \mathrm{Cl}(V, q)$ and $m = 0, \ldots, n$ we have the following equivariance property:

$$\rho(w)\left(x^{(m)}\right) = \left(\rho(w)(x)\right)^{(m)}. \tag{8}$$

In particular, for $x \in \mathrm{Cl}^{(m)}(V, q)$ we also have $\rho(w)(x) \in \mathrm{Cl}^{(m)}(V, q)$. This implies that the grade projections $\mathrm{Cl}(V, q) \to \mathrm{Cl}^{(m)}(V, q)$ are $\Gamma(V, q)$-equivariant maps, and that each $\mathrm{Cl}^{(m)}(V, q)$ constitutes an orthogonal representation of $\Gamma(V, q)$. The latter means that $\rho$ induces a group homomorphism from the Clifford group to the group of all orthogonal invertible linear transformations of $\mathrm{Cl}^{(m)}(V, q)$:

$$\rho^{(m)}: \Gamma(V, q) \to \mathrm{O}\left(\mathrm{Cl}^{(m)}(V, q), \bar{q}\right), \qquad \rho^{(m)}(w) := \rho(w)\big|_{\mathrm{Cl}^{(m)}(V, q)}. \tag{9}$$

Corollary 3.4 (All polynomials are Clifford group equivariant). Let $F \in \mathbb{F}[T_1, \ldots, T_\ell]$ be a polynomial in $\ell$ variables with coefficients in $\mathbb{F}$, and let $w \in \Gamma(V, q)$. Further, consider $\ell$ elements $x_1, \ldots, x_\ell \in \mathrm{Cl}(V, q)$. Then we have the following equivariance property:

$$\rho(w)\left(F(x_1, \ldots, x_\ell)\right) = F\left(\rho(w)(x_1), \ldots, \rho(w)(x_\ell)\right). \tag{10}$$

To prove that $\rho(w)$ distributes over any polynomial, we use both its additivity and multiplicativity with respect to the geometric product. By noting that one can learn the coefficients of such polynomials, we can build flexible parameterizations of Clifford group equivariant layers for quadratic vector spaces of any dimension or metric. We involve grade projections to achieve denser parameterizations. More details regarding neural network constructions follow in Section 4.

### 3.2 Orthogonal Representations

To relate to the orthogonal group $\mathrm{O}(V, q)$, the set of invertible linear maps $\Phi: V \to V$ preserving $q$, first note that Equation (5) shows that for every $w \in \mathbb{F}^{\times}$ we always have $\rho(w) = \mathrm{id}_{\mathrm{Cl}(V, q)}$. So the action $\rho$ on $\mathrm{Cl}(V, q)$ can already be defined on the quotient group $\Gamma(V, q)/\mathbb{F}^{\times}$. Moreover, we have:

Theorem 3.5 (See also Remark E.31 and Corollary E.32). If $(V, q)$ is non-degenerate, then $\rho$ induces a well-defined group isomorphism:

$$\rho^{(1)}: \Gamma(V, q)/\mathbb{F}^{\times} \to \mathrm{O}(V, q), \qquad [w] \mapsto \rho(w)\big|_V. \tag{11}$$

Here we restrict $(V, q)$ to be non-degenerate, as we never consider degenerate quadratic forms in our experiments; in the supplementary material, we generalize this also to the degenerate case by carefully taking the radical subspace $R$ of $(V, q)$ into account on both sides of the isomorphism.

The above implies that $\mathrm{O}(V, q)$ acts on the whole of $\mathrm{Cl}(V, q)$ in a well-defined way. Concretely, if $x \in \mathrm{Cl}(V, q)$ is of the form $x = \sum_{i \in I} c_i\, v_{i,1} \cdots v_{i,k_i}$ with $v_{i,j} \in V$, $c_i \in \mathbb{F}$, and $\Phi \in \mathrm{O}(V, q)$ is given by $\rho^{(1)}([w]) = \Phi$ with $w \in \Gamma(V, q)$, then we have:

$$\rho(w)(x) = \sum_{i \in I} c_i\, \rho(w)(v_{i,1}) \cdots \rho(w)(v_{i,k_i}) = \sum_{i \in I} c_i\, \Phi(v_{i,1}) \cdots \Phi(v_{i,k_i}). \tag{12}$$

This means that $\Phi$ acts on a multivector $x$ by applying an orthogonal transformation to all its vector components $v_{i,j}$, as illustrated in Figure 1. Theorem 3.5 implies that a map $f: \mathrm{Cl}(V, q)^{\ell_{\mathrm{in}}} \to \mathrm{Cl}(V, q)^{\ell_{\mathrm{out}}}$ is equivariant to the Clifford group $\Gamma(V, q)$ if and only if it is equivariant to the orthogonal group $\mathrm{O}(V, q)$ when both use the actions on $\mathrm{Cl}(V, q)$ described above (for each component). To leverage these results for constructing $\mathrm{O}(V, q)$-equivariant neural networks acting on vector features, i.e., $x \in V^{\ell_{\mathrm{in}}}$, we refer to Section 4.

To prove Theorem 3.5, we invoke the Cartan-Dieudonné theorem (Theorem C.13), stating that every orthogonal transformation decomposes into compositions of reflections, and recall that $\rho$ reduces to a reflection when restricted to $V$, see Theorem E.25. Theorem 3.5 also allows one to provide explicit descriptions of the elements of the Clifford group, see Corollary E.27. Finally, it is worth noting that our method is also Pin and Spin group-equivariant (we discuss a general definition of these groups in Appendix E.6). These groups, sharing properties with the Clifford group, are also often studied in relation to the orthogonal group.
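The Cartan-Dieudonné argument can be made concrete with a short numerical check (a sketch, not from the paper): the reflection of Eq. (7), here specialized to the Euclidean bilinear form, composes into orthogonal maps.

```python
# Sketch: composing two reflections of Eq. (7) (Euclidean case) yields a rotation,
# in line with the Cartan-Dieudonne theorem that reflections generate O(n).
import numpy as np

def reflection_matrix(w):
    """x -> x - 2 b(w, x) / b(w, w) * w for the Euclidean bilinear form b."""
    w = np.asarray(w, dtype=float)
    return np.eye(len(w)) - 2.0 * np.outer(w, w) / (w @ w)

rng = np.random.default_rng(0)
w1, w2 = rng.normal(size=3), rng.normal(size=3)
R = reflection_matrix(w2) @ reflection_matrix(w1)   # two reflections compose to a rotation
assert np.allclose(R.T @ R, np.eye(3))              # orthogonal
assert np.isclose(np.linalg.det(R), 1.0)            # orientation-preserving
```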
## 4 Methodology

We restrict our layers to use $\mathbb{F} := \mathbb{R}$. Our method is most similar to steerable methods such as [BHP+22]. However, unlike these works, we do not require an alternative basis representation based on spherical harmonics, nor do we need to worry about Clebsch-Gordan coefficients. Instead, we simply consider a steerable vector basis for $V$, which then automatically induces a steerable multivector basis for $\mathrm{Cl}(V, q)$ and its transformation kernels. By steerability, we mean that this basis can be transformed in a predictable way under an action from the Clifford group, which acts orthogonally on both $V$ and $\mathrm{Cl}(V, q)$ (see Figure 1). We present layers yielding Clifford group-equivariant optimizable transformations. All the main ideas are based on Corollary 3.3 and Corollary 3.4. It is worth mentioning that the methods presented here form a first exploration of applying our theoretical results, making future optimizations rather likely.

**Linear Layers.** Let $x_1, \ldots, x_\ell \in \mathrm{Cl}(V, q)$ be a tuple of multivectors expressed in a steerable basis, where $\ell$ represents the number of input channels. Using the fact that a polynomial restricted to the first order constitutes a linear map, we can construct a linear layer by setting

$$y^{(k)}_{c_{\mathrm{out}}} := T^{\mathrm{lin}}_{\phi_{c_{\mathrm{out}}}}(x_1, \ldots, x_\ell)^{(k)} := \sum_{c_{\mathrm{in}}=1}^{\ell} \phi_{c_{\mathrm{out}} c_{\mathrm{in}} k}\, x^{(k)}_{c_{\mathrm{in}}}, \tag{13}$$

where $\phi_{c_{\mathrm{out}} c_{\mathrm{in}} k} \in \mathbb{R}$ are optimizable coefficients and $c_{\mathrm{in}}, c_{\mathrm{out}}$ denote the input and output channel, respectively. As such, $T_\phi: \mathrm{Cl}(V, q)^{\ell} \to \mathrm{Cl}(V, q)$ is a linear transformation in each algebra subspace $k$. Recall that this is possible due to the result that $\rho(w)$ respects the multivector subspaces. This computes a transformation for the output channel $c_{\mathrm{out}}$; the map can be repeated (using different sets of parameters) for other output channels, similar to classical neural network linear layers. For $y^{(0)}_{c_{\mathrm{out}}} \in \mathbb{R}$ (the scalar part of the Clifford algebra), we can additionally learn an invariant bias parameter.

**Geometric Product Layers.** A core strength of our method comes from the fact that we can also parameterize interaction terms. In this work, we only consider layers up to second order; higher-order interactions are indirectly modeled via multiple successive layers. As an example, we take the pair $x_1, x_2$. Their interaction terms take the form $\left(x_1^{(i)} x_2^{(j)}\right)^{(k)}$, $i, j, k = 0, \ldots, n$, where we again make use of the fact that $\rho(w)$ respects grade projections. As such, all the grade-$k$ terms resulting from the interaction of $x_1$ and $x_2$ are parameterized with

$$P_\phi(x_1, x_2)^{(k)} := \sum_{i=0}^{n} \sum_{j=0}^{n} \phi_{ijk}\left(x_1^{(i)}\, x_2^{(j)}\right)^{(k)}, \tag{14}$$

where $P_\phi: \mathrm{Cl}(V, q) \times \mathrm{Cl}(V, q) \to \mathrm{Cl}(V, q)$. This means that we get $(n + 1)^3$ parameters for every geometric product between a pair of multivectors (in practice, we use fewer parameters due to the sparsity of the geometric product, which implies that many interactions are invariably zero, making their parameterization redundant). Parameterizing and computing all second-order terms amounts to $\ell^2$ such operations, which can be computationally expensive given a reasonable number of channels $\ell$. Instead, we first apply a linear map to obtain $y_1, \ldots, y_\ell \in \mathrm{Cl}(V, q)$. Through this map, the mixing (i.e., the terms that will get multiplied) gets learned. That is, we only get $\ell$ pairs $(x_1, y_1), \ldots, (x_\ell, y_\ell)$, from which we then compute $z^{(k)}_{c_{\mathrm{out}}} := P_{\phi_{c_{\mathrm{out}}}}(x_{c_{\mathrm{in}}}, y_{c_{\mathrm{in}}})^{(k)}$. Note that here we have $c_{\mathrm{in}} = c_{\mathrm{out}}$, i.e., the number of channels does not change. Hence, we refer to this layer as the element-wise geometric product layer. We can obtain a more expressive (yet more expensive) parameterization by linearly combining such products, computing

$$z^{(k)}_{c_{\mathrm{out}}} := T^{\mathrm{prod}}_{\phi_{c_{\mathrm{out}}}}(x_1, \ldots, x_\ell, y_1, \ldots, y_\ell)^{(k)} := \sum_{c_{\mathrm{in}}=1}^{\ell} P_{\phi_{c_{\mathrm{out}} c_{\mathrm{in}}}}(x_{c_{\mathrm{in}}}, y_{c_{\mathrm{in}}})^{(k)}, \tag{15}$$

which we call the fully-connected geometric product layer. Computational feasibility and experimental verification should determine which parameterization is preferred.
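As an illustration of how Eqs. (13) and (14) can be parameterized, here is a small NumPy sketch (hypothetical helper names, not the released implementation), reusing the bitmask blade encoding from the earlier sketches: the linear layer shares one weight per grade and channel pair, and the element-wise geometric product layer weights each grade triple $(i, j, k)$.

```python
# Sketch of Eqs. (13) and (14); loops are kept explicit for clarity.
import numpy as np

def blade_mul(a, b, metric):
    sign = 1.0
    for i in range(len(metric)):
        if (b >> i) & 1:
            if bin(a >> (i + 1)).count("1") % 2:
                sign = -sign
            if (a >> i) & 1:
                sign *= metric[i]; a &= ~(1 << i)
            else:
                a |= 1 << i
    return sign, a

def geometric_product(x, y, metric):
    out = np.zeros_like(x)
    for a, xa in enumerate(x):
        for b, yb in enumerate(y):
            if xa and yb:
                s, c = blade_mul(a, b, metric)
                out[c] += s * xa * yb
    return out

def linear_layer(x, phi, grades):
    """Eq. (13): x is (l_in, 2^n); phi is (l_out, l_in, n+1), one weight per grade."""
    return np.einsum("oib,ib->ob", phi[:, :, grades], x)

def elementwise_geometric_product_layer(x, y, phi, grades, metric):
    """Eq. (14), element-wise variant: phi is (channels, n+1, n+1, n+1), entries phi_{ijk}."""
    out = np.zeros_like(x)
    n = len(metric)
    for c in range(x.shape[0]):
        for i in range(n + 1):
            for j in range(n + 1):
                prod = geometric_product(x[c] * (grades == i), y[c] * (grades == j), metric)
                for k in range(n + 1):
                    out[c] += phi[c, i, j, k] * prod * (grades == k)
    return out

n, l_in, l_out = 3, 4, 5
metric = [1.0] * n
grades = np.array([bin(b).count("1") for b in range(2 ** n)])
x = np.random.randn(l_in, 2 ** n)
y = linear_layer(x, np.random.randn(l_in, l_in, n + 1), grades)   # learned mixing (see text)
z = elementwise_geometric_product_layer(x, y, np.random.randn(l_in, n + 1, n + 1, n + 1), grades, metric)
```

In an optimized implementation, the blade-level products would typically be precomputed into a sparse multiplication table instead of looped over, but the parameter sharing per grade (triple) is the point being illustrated here.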
**Normalization and Nonlinearities.** Since our layers involve quadratic and potentially higher-order interaction terms, we need to ensure numerical stability. In order to do so, we use a normalization operating on each multivector subspace before computing geometric products by putting

$$x^{(m)} \mapsto \frac{x^{(m)}}{\sigma(a_m)\left(\bar{q}\left(x^{(m)}\right) - 1\right) + 1}, \tag{16}$$

where $x^{(m)} \in \mathrm{Cl}^{(m)}(V, q)$. Here, $\sigma$ denotes the logistic sigmoid function, and $a_m \in \mathbb{R}$ is a learned scalar. The denominator interpolates between $1$ and the quadratic form $\bar{q}\left(x^{(m)}\right)$, normalizing the magnitude of $x^{(m)}$. This ensures that the geometric products do not cause numerical instabilities without losing information about the magnitude of $x^{(m)}$, where the learned scalar interpolates between both regimes. Note that by Theorem 3.2, $\bar{q}\left(x^{(m)}\right)$ is invariant under the action of the Clifford group, rendering Equation (16) an equivariant map. Next, we use the layer-wide normalization scheme proposed by [RGdK+23], which, since it is also based on the extended quadratic form, is likewise equivariant with respect to the twisted conjugation.

Regarding nonlinearities, we use a slightly adjusted version of the units proposed by [RGdK+23]. Since the scalar subspace $\mathrm{Cl}^{(0)}(V, q)$ is always invariant with respect to the twisted conjugation, we can apply $x^{(m)} \mapsto \mathrm{ReLU}\left(x^{(m)}\right)$ when $m = 0$ and $x^{(m)} \mapsto \sigma_\phi\left(\bar{q}\left(x^{(m)}\right)\right)\, x^{(m)}$ otherwise. We can replace ReLU with any common scalar activation function. Here, $\sigma_\phi$ represents a potentially parameterized nonlinear function; usually, however, we restrict it to be the sigmoid function. Since we modify $x^{(m)}$ by an invariant scalar quantity, we retain equivariance. Such gating activations are commonly used in the equivariance literature [WGW+18, GS22].

Figure 3: Left: test mean-squared errors on the O(3) signed volume task as functions of the number of training data. Note that due to identical performance, some baselines are not clearly visible. Right: same, but for the O(5) convex hull task.

Figure 4: Test mean-squared errors on the O(5) regression task.

**Embedding Data in the Clifford Algebra.** In this work, we consider data only from the vector space $V$ or the scalars $\mathbb{R}$, although generally one might also encounter, e.g., bivector data. That is, we have some scalar features $h_1, \ldots, h_k \in \mathbb{R}$ and some vector features $h_{k+1}, \ldots, h_\ell \in V$. Typical examples of scalar features include properties like mass, charge, temperature, and so on. Additionally, one-hot encoded categorical features are also included because $\{0, 1\} \subset \mathbb{R}$ and they also transform trivially. Vector features include positions, velocities, and the like. Then, using the identifications $\mathrm{Cl}^{(0)}(V, q) = \mathbb{R}$ and $\mathrm{Cl}^{(1)}(V, q) = V$, we can embed the data into the scalar and vector subspaces of the Clifford algebra to obtain Clifford features $x_1, \ldots, x_\ell \in \mathrm{Cl}(V, q)$. Similarly, we can predict scalar- or vector-valued data as output of our model by grade-projecting onto the scalar or vector parts, respectively. We can then directly compare these quantities with ground-truth vector- or scalar-valued data through a loss function and use standard automatic differentiation techniques to optimize the model. Note that invariant predictions are obtained by predicting scalar quantities.
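The normalization of Eq. (16) and the gated nonlinearity above only require the invariant extended quadratic form of each grade component. A minimal sketch, assuming a Euclidean metric (so that $\bar{q}$ of a grade component is simply the sum of its squared blade coefficients) and using illustrative names:

```python
# Sketch: grade-wise normalization (Eq. (16)) and gated nonlinearity, Euclidean metric assumed.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grade_normalize(x, a, grades, n):
    """Scale each grade-m part by 1 / (sigmoid(a_m) * (q(x^(m)) - 1) + 1)."""
    out = np.zeros_like(x)
    for m in range(n + 1):
        mask = grades == m
        q_m = np.sum(x[mask] ** 2)          # invariant extended quadratic form of the grade-m part
        out[mask] = x[mask] / (sigmoid(a[m]) * (q_m - 1.0) + 1.0)
    return out

def gated_nonlinearity(x, grades, n):
    """ReLU on the invariant scalar part; sigmoid(q(x^(m)))-gating on higher grades."""
    out = x.copy()
    out[grades == 0] = np.maximum(out[grades == 0], 0.0)
    for m in range(1, n + 1):
        mask = grades == m
        out[mask] = sigmoid(np.sum(x[mask] ** 2)) * x[mask]
    return out

n = 3
grades = np.array([bin(b).count("1") for b in range(2 ** n)])
x = np.random.randn(2 ** n)
x = grade_normalize(x, a=np.zeros(n + 1), grades=grades, n=n)
x = gated_nonlinearity(x, grades, n)
```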
## 5 Experiments

Here, we show that CGENNs excel across tasks, attaining top performance in several distinct contexts. Parameter budgets as well as training setups are kept as similar as possible to the baseline references. All further experimental details can be found in the public code release.

### 5.1 Estimating Volumetric Quantities

**O(3) Experiment: Signed Volumes.** This task highlights the fact that equivariant architectures based on scalarization are not able to extract some essential geometric properties from input data. In a synthetic setting, we simulate a dataset consisting of random three-dimensional tetrahedra. A main advantage of our method is that it can extract covariant quantities including (among others) signed volumes, which we demonstrate in this task. Signed volumes are geometrically significant because they capture the orientation of geometric objects in multidimensional spaces. For instance, in computer graphics, they can determine whether a 3D object is facing towards or away from the camera, enabling proper rendering. The input to the network is the point cloud, and the loss function is the mean-squared error between the predicted signed volume and its true value. Note that we are predicting a covariant (as opposed to invariant) scalar quantity (also known as a pseudoscalar) under O(3) transformations using a positive-definite (Euclidean) metric.

The results are displayed in the left part of Figure 3. We compare against a standard multilayer perceptron (MLP), an MLP version of the E(n)-GNN [SHW21] which uses neural networks to update positions with scalar multiplication, Vector Neurons (VN) [DLD+21], and Geometric Vector Perceptrons (GVP) [JES+21]. We see that the scalarization methods fail to access the features necessary for this task, as evidenced by their test loss not improving even with more available data. The multilayer perceptron, although a universal approximator, lacks the correct inductive biases. Our model, however, has the correct inductive biases (e.g., the equivariance property) and can also access the signed volume. Note that we do not take the permutation invariance of this task into account, as we are interested in comparing our standard feed-forward architectures against similar baselines.

**O(5) Experiment: Convex Hulls.** We go a step further and consider a five-dimensional Euclidean space, showcasing our model's ability to generalize to higher dimensions. We also make the experiment more challenging by including more points and estimating the volume of the convex hull generated by these points, a task that requires sophisticated algorithms in the classical case. Note that some points may lie inside the hull and do not contribute to the volume. We use the same network architectures as before (but now embedded in a five-dimensional space) and present the results in Figure 3. We report error bars for CGENNs, representing three times the standard deviation of the results of eight runs with varying seeds. Volume (unsigned) is an invariant quantity, enabling the baseline methods to approximate its value. However, we still see that CGENNs outperform the other methods, the only exception being the low-data regime of only 256 available data points. We attribute this to our method being slightly more flexible, making it slightly more prone to overfitting. To mitigate this issue, future work could explore regularization techniques or other methods to reduce overfitting in low-data scenarios.
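For reference, a signed-volume target of the kind used above can be computed from a tetrahedron's edge vectors. The sketch below is an illustration only (the exact dataset construction is in the code release); it also shows the pseudoscalar behavior: the sign flips under a reflection.

```python
# Sketch: signed volume of a tetrahedron, a covariant (pseudoscalar) quantity under O(3).
import numpy as np

def signed_volume(p):
    """p: (4, 3) array of vertices; determinant of the edge vectors divided by 6."""
    edges = p[1:] - p[0]
    return np.linalg.det(edges) / 6.0

rng = np.random.default_rng(0)
tet = rng.normal(size=(4, 3))
reflection = np.diag([1.0, 1.0, -1.0])      # an O(3) element with determinant -1
# A pseudoscalar flips sign under reflections, unlike an invariant (unsigned) volume.
assert np.isclose(signed_volume(tet @ reflection.T), -signed_volume(tet))
```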
### 5.2 O(5) Experiment: Regression

We compare against the methods presented by [FWW21], who propose an O(5)-invariant regression problem. The task is to estimate the function

$$f(x_1, x_2) := \sin(\|x_1\|) - \frac{\|x_2\|^3}{2} + \frac{x_1^\top x_2}{\|x_1\|\,\|x_2\|},$$

where the five-dimensional vectors $x_1, x_2$ are sampled from a standard Gaussian distribution in order to simulate train, test, and validation datasets. The results are shown in Figure 4. We use baselines from [FWW21], including an MLP, an MLP with augmentation (MLP+Aug), and the O(5)- and SO(5)-MLP architectures. We maintain the same number of parameters for all data regimes. For extremely small datasets (30 and 100 samples), we observe some overfitting tendencies that can be countered with regularization (e.g., weight decay) or by using a smaller model. For larger data regimes (300 samples and onward), CGENNs start to significantly outperform the baselines.
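A small sketch of this target (under the assumption that the reconstruction of $f$ above matches [FWW21] up to notation), together with a check of its O(5)-invariance:

```python
# Sketch: the O(5)-invariant regression target and an invariance check with a random rotation.
import numpy as np

def f(x1, x2):
    n1, n2 = np.linalg.norm(x1), np.linalg.norm(x2)
    return np.sin(n1) - n2 ** 3 / 2.0 + (x1 @ x2) / (n1 * n2)

rng = np.random.default_rng(0)
x1, x2 = rng.normal(size=5), rng.normal(size=5)
Q, _ = np.linalg.qr(rng.normal(size=(5, 5)))      # a random orthogonal 5x5 matrix
assert np.isclose(f(x1, x2), f(Q @ x1, Q @ x2))   # the target is invariant under O(5)
```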
### 5.3 E(3) Experiment: n-Body System

The n-body experiment [KFW+18] serves as a benchmark for assessing the performance of equivariant (graph) neural networks in simulating physical systems [HRXH22]. In this experiment, the dynamics of $n = 5$ charged particles in a three-dimensional space are simulated. Given the initial positions and velocities of these particles, the task is to accurately estimate their positions after 1,000 timesteps. To address this challenge, we construct a graph neural network (GNN) using the Clifford equivariant layers introduced in the previous section. We use a standard message-passing algorithm [GSR+17] where the message and update networks are CGENNs. So long as the message aggregator is equivariant, the end-to-end model also maintains equivariance. The input to the network consists of the mean-subtracted positions of the particles (to achieve translation invariance) and their velocities. The model's output is the estimated displacement, which is added to the input to achieve translation-equivariant estimated target positions. We include the invariant charges as part of the input and their products as edge attributes.

We compare against the steerable SE(3)-Transformers [FWFW20], Tensor Field Networks [TSK+18], and SEGNN [BHP+22]. Scalarization baselines include Radial Field [KKN20] and EGNN [SHW21]. Finally, NMP [GSR+17] is not an E(3)-equivariant method. The number of parameters in our model is kept similar to the EGNN and SEGNN baselines to ensure a fair comparison. Results of our experiment are presented in Table 1, where for CGENN we also report three times the standard deviation over three identical runs with different seeds. Our approach clearly outperforms earlier methods and is significantly better than SEGNN [BHP+22]. This experiment again demonstrates the advantage of leveraging covariant information in addition to scalar quantities, as it allows for a more accurate representation of the underlying physics and leads to better predictions.

| Method | MSE (↓) |
|---|---|
| SE(3)-Tr. | 0.0244 |
| TFN | 0.0155 |
| NMP | 0.0107 |
| Radial Field | 0.0104 |
| EGNN | 0.0070 |
| SEGNN | 0.0043 |
| CGENN | 0.0039 ± 0.0001 |

Table 1: Mean-squared error (MSE) on the n-body system experiment.

### 5.4 O(1, 3) Experiment: Top Tagging

Jet tagging in collider physics is a technique used to identify and categorize high-energy jets produced in particle collisions, as measured by, e.g., CERN's ATLAS detector [BTB+08]. By combining information from various parts of the detector, it is possible to trace back the origins of these jets [KNS+19, ATL17]. The current experiment seeks to tag jets arising from the heaviest particles of the standard model: the top quarks [IQWW09]. A jet tag should be invariant with respect to the global reference frame, which can transform under Lorentz boosts due to the relativistic nature of the particles. A Lorentz boost is a transformation that relates the space and time coordinates of an event as seen from two inertial reference frames. The defining characteristic of these transformations is that they preserve the Minkowski metric, which is given by $\gamma(ct, x, y, z) := (ct)^2 - x^2 - y^2 - z^2$. Note the difference with the standard positive-definite Euclidean metric, as used in the previous experiments. The set of all such transformations is captured by the orthogonal group O(1, 3); therefore, our method is fully compatible with modeling this problem.

We evaluate our model on a top tagging benchmark published by [KPTR19]. It contains 1.2M training entries, 400k validation entries, and 400k testing entries. For each jet, the energy-momentum 4-vectors are available for up to 200 constituent particles, making this a much larger-scale experiment than the ones presented earlier. Again, we employ a standard message-passing graph neural network [GSR+17] using CGENNs as message and update networks. The baselines include ResNeXt [XGD+17], P-CNN [CMS17], PFN [KMT19], ParticleNet [QG20], LGN [BAO+20], EGNN [SHW21], and the more recent LorentzNet [GMZ+22]. Among these, LGN is a steerable method, whereas EGNN and LorentzNet are scalarization methods. The other methods are not Lorentz-equivariant. Among the performance metrics are classification accuracy, Area Under the Receiver Operating Characteristic Curve (AUC), and the background rejection rate $1/\epsilon_B$ at signal efficiencies of $\epsilon_S = 0.3$ and $\epsilon_S = 0.5$, where $\epsilon_B$ and $\epsilon_S$ are the false positive and true positive rates, respectively.

| Model | Accuracy (↑) | AUC (↑) | 1/ϵ_B (↑) (ϵ_S = 0.5) | 1/ϵ_B (↑) (ϵ_S = 0.3) |
|---|---|---|---|---|
| ResNeXt [XGD+17] | 0.936 | 0.9837 | 302 | 1147 |
| P-CNN [CMS17] | 0.930 | 0.9803 | 201 | 759 |
| PFN [KMT19] | 0.932 | 0.9819 | 247 | 888 |
| ParticleNet [QG20] | 0.940 | 0.9858 | 397 | 1615 |
| EGNN [SHW21] | 0.922 | 0.9760 | 148 | 540 |
| LGN [BAO+20] | 0.929 | 0.9640 | 124 | 435 |
| LorentzNet [GMZ+22] | 0.942 | 0.9868 | 498 | 2195 |
| CGENN | 0.942 | 0.9869 | 500 | 2172 |

Table 2: Performance comparison between our proposed method and alternative algorithms on the top tagging experiment. We present the accuracy, Area Under the Receiver Operating Characteristic Curve (AUC), and background rejection $1/\epsilon_B$ at signal efficiencies of $\epsilon_S = 0.3$ and $\epsilon_S = 0.5$.

We observe that LorentzNet, a method that uses invariant quantities, is an extremely competitive baseline that was optimized for this task. Despite this, CGENNs are able to match its performance while maintaining the same core implementation.
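To make the metric signature concrete, the following sketch (plain NumPy, illustrative only) verifies that a Lorentz boost preserves the Minkowski quadratic form; instantiating $(V, q)$ with this signature is what allows CGENNs to be applied to this O(1, 3)-equivariant task.

```python
# Sketch: the Minkowski quadratic form of signature (1, 3) is preserved by Lorentz boosts.
import numpy as np

ETA = np.diag([1.0, -1.0, -1.0, -1.0])           # metric signature (1, 3)

def minkowski_q(p):
    """gamma(p) = (ct)^2 - x^2 - y^2 - z^2 for a 4-vector p = (ct, x, y, z)."""
    return p @ ETA @ p

def boost_x(beta):
    """Lorentz boost along the x-axis with velocity beta (in units of c)."""
    gamma = 1.0 / np.sqrt(1.0 - beta ** 2)
    L = np.eye(4)
    L[0, 0] = L[1, 1] = gamma
    L[0, 1] = L[1, 0] = -gamma * beta
    return L

p = np.array([5.0, 1.0, 2.0, 3.0])               # an energy-momentum 4-vector
assert np.isclose(minkowski_q(boost_x(0.6) @ p), minkowski_q(p))
```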
## 6 Conclusion

We presented a novel approach for constructing O(n)- and E(n)-equivariant neural networks based on Clifford algebras. After establishing the required theoretical results, we proposed parameterizations of nonlinear multivector-valued maps that exhibit versatility and applicability across scenarios varying in dimension. This was achieved by the core insight that polynomials in multivectors are O(n)-equivariant functions. Theoretical results were empirically substantiated in three distinct experiments, outperforming or matching baselines that were sometimes specifically designed for these tasks. CGENNs induce a (non-prohibitive) degree of computational overhead similar to other steerable methods. On the plus side, we believe that improved code implementations such as custom GPU kernels or alternative parameterizations to the current ones can significantly alleviate this issue, potentially also resulting in improved performance on benchmark datasets. This work provides solid theoretical and experimental foundations for such developments.

## References

[AGB22] Simon Axelrod and Rafael Gómez-Bombarelli, GEOM, energy-annotated molecular conformations for property prediction and molecular generation, Scientific Data, Springer Science and Business Media LLC, 2022.
[AHK19] Brandon M. Anderson, Truong-Son Hy, and Risi Kondor, Cormorant: Covariant Molecular Neural Networks, Conference on Neural Information Processing Systems (NeurIPS), 2019, pp. 14510–14519.
[AKHvD11] Arne Alex, Matthias Kalus, Alan Huckleberry, and Jan von Delft, A numerical algorithm for the explicit calculation of SU(N) and SL(N, C) Clebsch-Gordan coefficients, Journal of Mathematical Physics 52 (2011), no. 2, 023507.
[Art57] Emil Artin, Geometric Algebra, Interscience Tracts in Pure and Applied Mathematics, no. 3, Interscience Publishers, Inc., 1957.
[ATL17] ATLAS, Jet energy scale measurements and their systematic uncertainties in proton-proton collisions at √s = 13 TeV with the ATLAS detector, arXiv preprint arXiv:1703.09665, 2017.
[BAO+20] Alexander Bogatskiy, Brandon M. Anderson, Jan T. Offermann, Marwah Roussi, David W. Miller, and Risi Kondor, Lorentz Group Equivariant Neural Network for Particle Physics, International Conference on Machine Learning (ICML), 2020, pp. 992–1002.
[BBCV21] M. Bronstein, Joan Bruna, Taco Cohen, and Petar Veličković, Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges, arXiv, 2021.
[BBWG22] Johannes Brandstetter, Rianne van den Berg, Max Welling, and Jayesh K Gupta, Clifford neural layers for PDE modeling, arXiv preprint arXiv:2209.04934, 2022.
[BCRLZE06] Eduardo Bayro-Corrochano, Leo Reyes-Lozano, and Julio Zamora-Esquivel, Conformal geometric algebra for robotic vision, Journal of Mathematical Imaging and Vision 24 (2006), 55–81.
[BDHBC23] Johann Brehmer, Pim De Haan, Sönke Behrends, and Taco Cohen, Geometric algebra transformers, arXiv preprint arXiv:2305.18415, 2023.
[BEdL+16] Henri Bal, Dick Epema, Cees de Laat, Rob van Nieuwpoort, John Romein, Frank Seinstra, Cees Snoek, and Harry Wijshoff, A medium-scale distributed system for computer science research: Infrastructure for the long term, IEEE Computer 49 (2016), no. 5, 54–63.
[Bek19] Erik J Bekkers, B-spline CNNs on Lie groups, arXiv preprint arXiv:1909.12057, 2019.
[BHP+22] Johannes Brandstetter, Rob Hesselink, Elise van der Pol, Erik J. Bekkers, and Max Welling, Geometric and Physical Quantities improve E(3) Equivariant Message Passing, International Conference on Learning Representations (ICLR), 2022.
[Bie20] Lukas Biewald, Experiment tracking with Weights and Biases, 2020, software available from wandb.com.
[BMQQS+21] Uzair Aslam Bhatti, Zhou Ming-Quan, Huo Qing-Song, Sajid Ali, Aamir Hussain, Yan Yuhuan, Zhaoyuan Yu, Linwang Yuan, and Saqib Ali Nawaz, Advanced color edge detection using Clifford algebra in satellite images, IEEE Photonics Journal 13 (2021), no. 2, 1–20.
[BMS+22] Simon Batzner, Albert Musaelian, Lixin Sun, Mario Geiger, Jonathan P. Mailoa, Mordechai Kornbluth, Nicola Molinari, Tess E. Smidt, and Boris Kozinsky, E(3)- equivariant graph neural networks for data-efficient and accurate interatomic potentials, Nature Communications, 2022. [BTB+08] JM Butterworth, J Thion, U Bratzler, PN Ratoff, RB Nickerson, JM Seixas, I Grabowska-Bold, F Meisel, S Lokwitz, et al., The atlas experiment at the cern large hadron collider, Jinst 3 (2008), S08003. [BTH22] Stephane Breuils, Kanta Tachibana, and Eckhard Hitzer, New Applications of Clifford s Geometric Algebra, Advances in Applied Clifford Algebras, Springer Science and Business Media LLC, 2022. [CC12] John Stephen Roy Chisholm and Alan K Common, Clifford algebras and their applications in mathematical physics, vol. 183, Springer Science & Business Media, 2012. [CCG18] Benjamin Coors, Alexandru Paul Condurache, and Andreas Geiger, Spherenet: Learning Spherical Representations for Detection and Classification in Omnidirectional Images., European Conference on Computer Vision (ECCV), Springer International Publishing, 2018, pp. 525 541. [CGKW18] Taco S. Cohen, Mario Geiger, Jonas Köhler, and Max Welling, Spherical CNNs., International Conference on Learning Representations (ICLR), 2018. [CLW22] Gabriele Cesa, Leon Lang, and Maurice Weiler, A Program to Build E(N)-Equivariant Steerable CNNs., International Conference on Learning Representations (ICLR), 2022. [CMS17] CMS, Boosted jet identification using particle candidates and deep neural networks, Detector Performance Figures: CMS-DP-17-049, 2017. [Cru80] Albert Crumeyrolle, Algèbres de clifford dégénérées et revêtements des groupes conformes affines orthogonaux et symplectiques, Annales De L Institut Henri Poincarephysique Theorique 33 (1980), 235 249. [Cru90] Albert Crumeyrolle, Orthogonal and Symplectic Clifford Algebras: Spinor Structures, vol. 57, Springer Science & Business Media, 1990. [CTS+17] Stefan Chmiela, Alexandre Tkatchenko, Huziel E. Sauceda, Igor Poltavsky, Kristof T. Schütt, and Klaus-Robert Müller, Machine learning of accurate energy-conserving molecular force fields, Science Advances, American Association for the Advancement of Science (AAAS), 2017. [CW16] Taco Cohen and Max Welling, Group Equivariant Convolutional Networks., International Conference on Machine Learning (ICML), 2016, pp. 2990 2999. [DFM09] Leo Dorst, Daniel Fontijne, and Stephen Mann, Geometric algebra for computer science (revised edition): An object-oriented approach to geometry, Morgan Kaufmann, 2009. [DKL10] Tekin Dereli, Sahin Koçak, and Murat Limoncu, Degenerate Spin Groups as Semi Direct Products, Advances in Applied Clifford Algebras 20 (2010), no. 3-4, 565 573. [DLD+21] Congyue Deng, Or Litany, Yueqi Duan, Adrien Poulenard, Andrea Tagliasacchi, and Leonidas J. Guibas, Vector Neurons: A General Framework for SO(3)-Equivariant Networks., IEEE International Conference on Computer Vision (ICCV), IEEE, 2021, pp. 12180 12189. [DM02] Leo Dorst and Stephen Mann, Geometric algebra: a computational framework for geometrical applications, IEEE Computer Graphics and Applications 22 (2002), no. 3, 24 31. [Est20] Carlos Esteves, Theoretical aspects of group equivariant neural networks, ar Xiv, 2020. [FA+91] William T Freeman, Edward H Adelson, et al., The design and use of steerable filters, IEEE Transactions on Pattern analysis and machine intelligence 13 (1991), no. 9, 891 906. 
[FSIW20] Marc Finzi, Samuel Stanton, Pavel Izmailov, and Andrew Gordon Wilson, Generalizing Convolutional Neural Networks for Equivariance to Lie Groups on Arbitrary Continuous Data., International Conference on Machine Learning (ICML), 2020, pp. 3165 3176. [FWFW20] Fabian Fuchs, Daniel E. Worrall, Volker Fischer, and Max Welling, Se(3)- Transformers: 3d Roto-Translation Equivariant Attention Networks., Conference on Neural Information Processing Systems (Neur IPS), 2020. [FWW21] Marc Finzi, Max Welling, and Andrew Gordon Wilson, A practical method for constructing equivariant multilayer perceptrons for arbitrary matrix groups, International Conference on Machine Learning, PMLR, 2021, pp. 3318 3328. [GBG21] Johannes Gasteiger, Florian Becker, and Stephan Günnemann, Gemnet: Universal Directional Graph Neural Networks for Molecules., Conference on Neural Information Processing Systems (Neur IPS), 2021, pp. 6790 6802. [GLB+16] Richard Gowers, Max Linke, Jonathan Barnoud, Tyler Reddy, Manuel Melo, Sean Seyler, Jan Doma nski, David Dotson, Sébastien Buchoux, Ian Kenney, and Oliver Beckstein, Mdanalysis: A Python Package for the Rapid Analysis of Molecular Dynamics Simulations, Proceedings of the Python in Science Conference, Sci Py, 2016. [GMZ+22] Shiqi Gong, Qi Meng, Jue Zhang, Huilin Qu, Congqiao Li, Sitian Qian, Weitao Du, Zhi-Ming Ma, and Tie-Yan Liu, An efficient Lorentz equivariant graph neural network for jet tagging, Journal of High Energy Physics, Springer Science and Business Media LLC, 2022. [Gra62] Hermann Grassmann, Die ausdehnungslehre, vol. 1, Enslin, 1862. [GS22] Mario Geiger and Tess Smidt, e3nn: Euclidean neural networks, 2022. [GSR+17] Justin Gilmer, Samuel S Schoenholz, Patrick F Riley, Oriol Vinyals, and George E Dahl, Neural message passing for quantum chemistry, International conference on machine learning, PMLR, 2017, pp. 1263 1272. [Ham66] William Rowan Hamilton, Elements of quaternions, Longmans, Green, & Company, 1866. [HHR+22] Wenbing Huang, Jiaqi Han, Yu Rong, Tingyang Xu, Fuchun Sun, and Junzhou Huang, Equivariant Graph Mechanics Networks with Constraints., International Conference on Learning Representations (ICLR), 2022. [Hit12] Eckhard Hitzer, The clifford fourier transform in real clifford algebras, 19th International Conference on the Application of Computer Science and Mathematics in Architecture and Civil Engineering, 2012. [HP10] Eckhard Hitzer and Christian Perwass, Interactive 3d space group visualization with clucalc and the clifford geometric algebra description of space groups, Advances in applied Clifford algebras 20 (2010), 631 658. [HRXH22] Jiaqi Han, Yu Rong, Tingyang Xu, and Wenbing Huang, Geometrically equivariant graph neural networks: A survey, ar Xiv, 2022. [Hun07] John D Hunter, Matplotlib: A 2d graphics environment, Computing in science & engineering 9 (2007), no. 03, 90 95. [HZBC08] Dietmar Hildenbrand, Julio Zamora, and Eduardo Bayro-Corrochano, Inverse kinematics computation in computer graphics and robotics using conformal geometric algebra, Advances in applied Clifford algebras 18 (2008), 699 713. [IQWW09] Joseph R Incandela, Arnulf Quadt, Wolfgang Wagner, and Daniel Wicke, Status and prospects of top-quark physics, Progress in Particle and Nuclear Physics 63 (2009), no. 2, 239 292. [JES+21] Bowen Jing, Stephan Eismann, Patricia Suriana, Raphael John Lamarre Townshend, and Ron O. Dror, Learning from Protein Structure with Geometric Vector Perceptrons., International Conference on Learning Representations (ICLR), 2021. [KFW+18] Thomas N. 
Kipf, Ethan Fetaya, Kuan-Chieh Wang, Max Welling, and Richard S. Zemel, Neural Relational Inference for Interacting Systems., International Conference on Machine Learning (ICML), 2018, pp. 2693 2702. [KGG20] Johannes Klicpera, Janek Groß, and Stephan Günnemann, Directional Message Passing for Molecular Graphs., International Conference on Learning Representations (ICLR), 2020. [KId HN23] Jonas Köhler, Michele Invernizzi, Pim de Haan, and Frank Noé, Rigid body flows for sampling molecular crystal structures, ar Xiv preprint ar Xiv:2301.11355, 2023. [KKN20] Jonas Köhler, Leon Klein, and Frank Noé, Equivariant Flows: Exact Likelihood Generative Learning for Symmetric Densities., International Conference on Machine Learning (ICML), 2020, pp. 5361 5370. [KMT19] Patrick T. Komiske, Eric M. Metodiev, and Jesse Thaler, Energy flow networks: deep sets for particle jets, Journal of High Energy Physics, Springer Science and Business Media LLC, 2019. [KNS+19] Roman Kogler, Benjamin Nachman, Alexander Schmidt, Lily Asquith, Emma Winkels, Mario Campanelli, Chris Delitzsch, Philip Harris, Andreas Hinzmann, Deepak Kar, et al., Jet substructure at the large hadron collider, Reviews of Modern Physics 91 (2019), no. 4, 045003. [KPTR19] Gregor Kasieczka, Tilman Plehn, Jennifer Thompson, and Michael Russel, Top quark tagging reference dataset, March 2019. [KR67] LM Kaplan and M Resnikoff, Matrix products and the explicit 3, 6, 9, and 12-j coefficients of the regular representation of su (n), Journal of Mathematical Physics 8 (1967), no. 11, 2194 2205. [KT18] Risi Kondor and Shubhendu Trivedi, On the Generalization of Equivariance and Convolution in Neural Networks to the Action of Compact Groups., International Conference on Machine Learning (ICML), 2018, pp. 2752 2760. [Len90] Reiner Lenz, Group theoretical methods in image processing, vol. 413, Springer, 1990. [LK14] Sergey Levine and Vladlen Koltun, Learning complex neural network policies with trajectory optimization, International Conference on Machine Learning, PMLR, 2014, pp. 829 837. [LS09] Douglas Lundholm and Lars Svensson, Clifford Algebra, Geometric Algebra, and Applications, ar Xiv preprint ar Xiv:0907.5356, 2009. [MFW21] Pavlo Melnyk, Michael Felsberg, and Mårten Wadenbäck, Embed me if you can: A geometric perceptron, Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021, pp. 1276 1284. [PGM+19] Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, et al., Pytorch: An imperative style, high-performance deep learning library, Advances in neural information processing systems, vol. 32, 2019. [PRM+18] Titouan Parcollet, Mirco Ravanelli, Mohamed Morchid, Georges Linarès, Chiheb Trabelsi, Renato De Mori, and Yoshua Bengio, Quaternion recurrent neural networks, ar Xiv preprint ar Xiv:1806.04418, 2018. [PVG+11] Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel, Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss, Vincent Dubourg, et al., Scikit-learn: Machine learning in python, the Journal of machine Learning research 12 (2011), 2825 2830. [QG20] Huilin Qu and Loukas Gouskos, Jet tagging via particle clouds, Physical Review D, American Physical Society (APS), 2020. [RDK21] Martin Roelfs and Steven De Keninck, Graded symmetry groups: plane and simple, ar Xiv preprint ar Xiv:2107.03771, 2021. [RDRL14] Raghunathan Ramakrishnan, Pavlo O. Dral, Matthias Rupp, and O. 
Anatole von Lilienfeld, Quantum chemistry structures and properties of 134 kilo molecules, Scientific Data, Springer Science and Business Media LLC, 2014. [RGd K+23] David Ruhe, Jayesh K Gupta, Steven de Keninck, Max Welling, and Johannes Brandstetter, Geometric clifford algebra networks, ar Xiv preprint ar Xiv:2302.06594, 2023. [Ser12] Jean-Pierre Serre, A Course in Arithmetic, vol. 7, Springer Science & Business Media, 2012. [SGT+08] Franco Scarselli, Marco Gori, Ah Chung Tsoi, Markus Hagenbuchner, and Gabriele Monfardini, The graph neural network model, IEEE transactions on neural networks 20 (2008), no. 1, 61 80. [SHW21] Victor Garcia Satorras, E. Hoogeboom, and M. Welling, E(n) Equivariant Graph Neural Networks, International Conference on Machine Learning, 2021. [Spe21] Matthew Spellings, Geometric algebra attention networks for small point clouds, ar Xiv preprint ar Xiv:2110.02393, 2021. [SSK+18] K. T. Schütt, H. E. Sauceda, P.-J. Kindermans, A. Tkatchenko, and K.-R. Müller, Schnet A deep learning architecture for molecules and materials, The Journal of Chemical Physics 148 (2018), no. 24, 241722. [SUG21] Kristof Schütt, Oliver T. Unke, and Michael Gastegger, Equivariant message passing for the prediction of tensorial properties and molecular spectra., International Conference on Machine Learning (ICML), 2021, pp. 9377 9388. [Syl52] James Joseph Sylvester, XIX. A demonstration of the theorem that every homogeneous quadratic polynomial is reducible by real orthogonal substitutions to the form of a sum of positive and negative squares, The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science 4 (1852), no. 23, 138 142. [TF22] Philipp Thölke and G. D. Fabritiis, Torch MD-NET: Equivariant Transformers for Neural Network based Molecular Potentials, Ar Xiv, vol. abs/2202.02541, 2022. [TSK+18] Nathaniel Thomas, T. Smidt, S. Kearnes, Lusann Yang, Li Li, Kai Kohlhoff, and Patrick F. Riley, Tensor Field Networks: Rotationand Translation-Equivariant Neural Networks for 3d Point Clouds, ar Xiv, 2018. [TVS+21] Raphael J. L. Townshend, Martin Vögele, Patricia Suriana, Alexander Derry, Alexander Powers, Yianni Laloudakis, Sidhika Balachandar, Bowen Jing, Brandon M. Anderson, Stephan Eismann, Risi Kondor, Russ B. Altman, and Ron O. Dror, Atom3d: Tasks on Molecules in Three Dimensions., Conference on Neural Information Processing Systems (Neur IPS), 2021. [UPH+19] Mikaela Angelina Uy, Quang-Hieu Pham, Binh-Son Hua, Duc Thanh Nguyen, and Sai-Kit Yeung, Revisiting Point Cloud Classification: A New Benchmark Dataset and Classification Model on Real-World Data., IEEE International Conference on Computer Vision (ICCV), 2019, pp. 1588 1597. [VDWCV11] Stefan Van Der Walt, S Chris Colbert, and Gael Varoquaux, The numpy array: a structure for efficient numerical computation, Computing in science & engineering 13 (2011), no. 2, 22 30. [VGO+20] Pauli Virtanen, Ralf Gommers, Travis E Oliphant, Matt Haberland, Tyler Reddy, David Cournapeau, Evgeni Burovski, Pearu Peterson, Warren Weckesser, Jonathan Bright, et al., Scipy 1.0: fundamental algorithms for scientific computing in python, Nature methods 17 (2020), no. 3, 261 272. [Was21] Michael L Waskom, Seaborn: statistical data visualization, Journal of Open Source Software 6 (2021), no. 60, 3021. [WC19] Maurice Weiler and Gabriele Cesa, General E(2)-Equivariant Steerable CNNs., Conference on Neural Information Processing Systems (Neur IPS), 2019, pp. 14334 14345. 
[WCL05] Rich Wareham, Jonathan Cameron, and Joan Lasenby, Applications of conformal geometric algebra in computer vision and graphics, Computer Algebra and Geometric Algebra with Applications: 6th International Workshop, IWMM 2004, Shanghai, China, May 19-21, 2004 and International Workshop, GIAE 2004, Xian, China, May 24-28, 2004, Revised Selected Papers, Springer, 2005, pp. 329 349. [WFVW21] Maurice Weiler, Patrick Forré, Erik Verlinde, and Max Welling, Coordinate independent convolutional networks isometry and gauge equivariant convolutions on riemannian manifolds, ar Xiv preprint ar Xiv:2106.06020, 2021. [WGTB17] Daniel E. Worrall, Stephan J. Garbin, Daniyar Turmukhambetov, and Gabriel J. Brostow, Harmonic Networks: Deep Translation and Rotation Equivariance., Computer Vision and Pattern Recognition (CVPR), 2017, pp. 7168 7177. [WGW+18] Maurice Weiler, Mario Geiger, Max Welling, Wouter Boomsma, and Taco Cohen, 3d Steerable CNNs: Learning Rotationally Equivariant Features in Volumetric Data., Conference on Neural Information Processing Systems (Neur IPS), 2018, pp. 10402 10413. [WSK+15] Zhirong Wu, Shuran Song, Aditya Khosla, Fisher Yu, Linguang Zhang, Xiaoou Tang, and Jianxiong Xiao, 3d Shape Nets: A deep representation for volumetric shapes., Computer Vision and Pattern Recognition (CVPR), 2015, pp. 1912 1920. [XGD+17] Saining Xie, Ross B. Girshick, Piotr Dollár, Zhuowen Tu, and Kaiming He, Aggregated Residual Transformations for Deep Neural Networks., Computer Vision and Pattern Recognition (CVPR), 2017, pp. 5987 5995. [YCo20] YCor, Analogue of the special orthogonal group for singular quadratic forms, Math Overflow, 2020, url:https://mathoverflow.net/q/378828 (version: 2020-12-13). [ZCD+20] C Lawrence Zitnick, Lowik Chanussot, Abhishek Das, Siddharth Goyal, Javier Heras Domingo, Caleb Ho, Weihua Hu, Thibaut Lavril, Aini Palizhati, Morgane Riviere, and others, An introduction to electrocatalyst design using machine learning for renewable energy storage, ar Xiv, 2020. [ZKR+17] Manzil Zaheer, Satwik Kottur, Siamak Ravanbakhsh, Barnabas Poczos, Russ R Salakhutdinov, and Alexander J Smola, Deep sets, Advances in neural information processing systems, vol. 30, 2017. [ZXXC18] Xuanyu Zhu, Yi Xu, Hongteng Xu, and Changjian Chen, Quaternion convolutional neural networks, Proceedings of the European Conference on Computer Vision (ECCV), 2018, pp. 631 647. Acknowledgments We thank Leo Dorst and Steven de Keninck for insightful discussions about Clifford (geometric) algebras and their applications as well as pointing us to the relevant literature. Further, we thank Robert-Jan Schlimbach and Jayesh Gupta for their help with computational infrastructure and scaling up the experiments. This work used the Dutch national e-infrastructure with the support of the SURF Cooperative using grant no. EINF-5757 as well as the DAS-5 and DAS-6 clusters [BEd L+16]. Broader Impact Statement The advantages of more effective equivariant graph neural networks could be vast, particularly in the physical sciences. Improved equivariant neural networks could revolutionize fields like molecular biology, astrophysics, and materials science by accurately predicting system behaviors under various transformations. They could further enable more precise simulations of physical systems. Such new knowledge could drive forward scientific discovery, enabling innovations in healthcare, materials engineering, and our understanding of the universe. 
However, a significant drawback of relying heavily on advanced equivariant neural networks in the physical sciences is the potential for unchecked errors or biases to propagate when these systems are not thoroughly cross-validated using various methods. This could potentially lead to flawed scientific theories or dangerous real-world applications.

Reproducibility Statement
We have released the code to reproduce the experimental results, including hyperparameters, network architectures, and so on, at https://github.com/DavidRuhe/clifford-group-equivariant-neural-networks.

Computational Resources
The volumetric quantities and regression experiments were carried out on 1× 11 GB NVIDIA GeForce GTX 1080 Ti and 1× 11 GB NVIDIA GeForce RTX 2080 Ti instances. The n-body experiment ran on 1× 24 GB NVIDIA GeForce RTX 3090 and 1× 24 GB NVIDIA RTX A5000 nodes. Finally, the top tagging experiment was conducted on 4× 40 GB NVIDIA A100 (Ampere) instances.

Licenses
We would like to thank the scientific software development community, without whom this work would not be possible. This work made use of Python (PSF License Agreement), NumPy [VDWCV11] (BSD-3-Clause), PyTorch [PGM+19] (BSD-3-Clause), CUDA (proprietary license), Weights and Biases [Bie20] (MIT), scikit-learn [PVG+11] (BSD-3-Clause), Seaborn [Was21] (BSD-3-Clause), Matplotlib [Hun07] (PSF), and SciPy [VGO+20] (BSD-3-Clause). The top tagging dataset is licensed under CC-BY 4.0 (Attribution 4.0 International).

Contents
1 Introduction
2 The Clifford Algebra
3 Theoretical Results
 3.1 The Clifford Group and its Clifford Algebra Representations
 3.2 Orthogonal Representations
4 Methodology
5 Experiments
 5.1 Estimating Volumetric Quantities
 5.2 O(5) Experiment: Regression
 5.3 E(3) Experiment: n-Body System
 5.4 O(1, 3) Experiment: Top Tagging
6 Conclusion
References
Acknowledgements
Broader Impact Statement
Reproducibility Statement
Computational Resources
Licenses
A Glossary
B Supplementary Material: Introduction
C Quadratic Vector Spaces and the Orthogonal Group
D The Clifford Algebra and Typical Constructions
 D.1 The Clifford Algebra and its Universal Property
 D.2 The Multivector Filtration and the Grade
 D.3 The Parity Grading
 D.4 The Dimension of the Clifford Algebra
 D.5 Extending the Quadratic Form to the Clifford Algebra
 D.6 The Multivector Grading
 D.7 The Radical Subalgebra of the Clifford Algebra
 D.8 The Center of the Clifford Algebra
 D.9 The Twisted Center of the Clifford Algebra
E The Clifford Group and its Clifford Algebra Representations
 E.1 Adjusting the Twisted Conjugation
 E.2 The Clifford Group
 E.3 The Structure of the Clifford Group
 E.4 Orthogonal Representations of the Clifford Group
 E.5 The Spinor Norm and the Clifford Norm
 E.6 The Pin Group and the Spin Group

A Glossary

Notation used throughout this work:
O(n): The orthogonal group acting on an n-dimensional vector space.
SO(n): The special orthogonal group acting on an n-dimensional vector space.
E(n): The Euclidean group acting on an n-dimensional vector space.
SE(n): The special Euclidean group acting on an n-dimensional vector space.
GL(n): The general linear group acting on an n-dimensional vector space.
O(V, q): The orthogonal group of the vector space V with respect to a quadratic form q.
O_R(V, q): The orthogonal group of the vector space V with respect to a quadratic form q that acts as the identity on R.
V: A vector space.
R: The radical subspace of V, such that for any f ∈ R, b(f, v) = 0 for all v ∈ V.
F: A field.
R: The real numbers.
N: The natural numbers.
Z: The integers.
[n]: The set of integers {1, . . . , n}.
q: A quadratic form, q : V → F.
b: A bilinear form, b : V × V → F.
Cl(V, q): The Clifford algebra over a vector space V with quadratic form q.
Cl×(V, q): The group of invertible Clifford algebra elements.
Cl[×](V, q): The group of invertible parity-homogeneous Clifford algebra elements, Cl[×](V, q) := {w ∈ Cl×(V, q) | η(w) ∈ {±1}}.
Cl[0](V, q): The even subalgebra of the Clifford algebra, Cl[0](V, q) := ⊕_{m even} Cl(m)(V, q).
Cl[1](V, q): The odd part of the Clifford algebra, Cl[1](V, q) := ⊕_{m odd} Cl(m)(V, q).
Cl(m)(V, q): The grade-m subspace of the Clifford algebra.
(_)(m): Grade projection, (_)(m) : Cl(V, q) → Cl(m)(V, q).
ζ: Projection onto the zero grade, ζ : Cl(V, q) → Cl(0)(V, q).
ei: A basis vector, ei ∈ V.
fi: A basis vector of R.
e_A: A Clifford basis element (a product of basis vectors) with e_A ∈ Cl(V, q), A ⊆ {1, . . . , n}.
b̄: The extended bilinear form on the Clifford algebra, b̄ : Cl(V, q) × Cl(V, q) → F.
q̄: The extended quadratic form on the Clifford algebra, q̄ : Cl(V, q) → F.
α: The Clifford main involution, α : Cl(V, q) → Cl(V, q).
β: The main Clifford anti-involution, also known as reversion, β : Cl(V, q) → Cl(V, q).
γ: Clifford conjugation, γ : Cl(V, q) → Cl(V, q).
η: Coboundary of α. For w ∈ Cl×(V, q), η(w) ∈ {±1} if and only if w is parity-homogeneous.
ρ(w): The (adjusted) twisted conjugation, used as the action of the Clifford group, ρ(w) : Cl(V, q) → Cl(V, q), w ∈ Cl×(V, q).
Γ(V, q): The Clifford group of Cl(V, q).
Γ[0](V, q): The special Clifford group; it excludes orientation-reversing (odd) elements, Γ[0](V, q) := Γ(V, q) ∩ Cl[0](V, q).
⋀(R): The radical subalgebra of Cl(V, q), i.e., those elements that have zero q. It is equal to the exterior algebra of R.
⋀[i](R): The even or odd (i ∈ {0, 1}) part of ⋀(R), ⋀[i](R) := ⋀(R) ∩ Cl[i](V, q).
⋀(≥1)(R): The part of ⋀(R) of grade greater than or equal to one, ⋀(≥1)(R) := span{f1 · · · fk | k ≥ 1, fl ∈ R}.
⋀×(R): The group of invertible elements of ⋀(R), ⋀×(R) = F× + ⋀(≥1)(R).
⋀[×](R): The group of invertible elements of ⋀(R) with even grades, ⋀[×](R) := F× + span{f1 · · · fk | k ≥ 2 even, fl ∈ R}.
The same two groups with unit scalar part: as ⋀×(R) and ⋀[×](R), respectively, but with the scalar component F× set to 1.
SN: Spinor norm, SN : Cl(V, q) → F.
CN: Clifford norm, CN : Cl(V, q) → F.
B Supplementary Material: Introduction

The existing literature on Clifford algebras uses varying notations and conventions and focuses on a number of different applications. We gathered previous definitions and included independently derived results to achieve the outcomes needed for this work and to provide proofs for the most general cases, including, e.g., potential degeneracy of the metric. As such, this appendix acts as a comprehensive resource on quadratic spaces, orthogonal groups, Clifford algebras, their constructions, and specific groups represented within Clifford algebras.

We start in Appendix C with a primer on quadratic spaces and the orthogonal group. In particular, we specify how the orthogonal group is defined on a quadratic space and investigate its action. We pay special attention to the case of spaces with degenerate quadratic forms and to what we call radical-preserving orthogonal automorphisms. The section concludes with the presentation of the Cartan-Dieudonné theorem.

In Appendix D, we introduce the definition of the Clifford algebra as a quotient of the tensor algebra. We investigate several key properties of the Clifford algebra. Then, we introduce its parity grading, which in turn is used to prove that the dimension of the algebra is 2^n, where n is the dimension of the vector space on which it is defined. This allows us to construct an algebra basis. Subsequently, we extend the quadratic and bilinear forms of the vector space to their Clifford algebra counterparts. This then leads to the construction of an orthogonal basis for the algebra. Furthermore, the multivector grading of the algebra is shown to be basis independent, leading to the orthogonal sum decomposition into the usual subspaces commonly referred to as scalars, vectors, bivectors, and so on. Finally, we investigate some additional properties of the algebra, such as its center, its radical subalgebra, and its twisted center.

In Appendix E we introduce, motivate, and adjust the twisted conjugation map. We show that it constitutes an algebra automorphism, thereby respecting the algebra's vector space structure as well as its multiplicative structure. Moreover, it can serve as a representation of the Clifford algebra's group of invertible elements. We then identify what we call the Clifford group, whose action under the twisted conjugation also respects the multivector decomposition. We investigate properties of the Clifford group and its action. Specifically, we show that its action yields a radical-preserving orthogonal automorphism. Moreover, it acts orthogonally on each individual subspace. Finally, after introducing the spinor and Clifford norms, the section concludes with the pursuit of a general definition of the Pin and Spin groups, which are used in practice more often than the Clifford group. This, however, turns out to be somewhat problematic when generalizing to fields beyond R. We motivate several choices and outline their (dis)advantages. For R, we finally settle on definitions that are compatible with the existing literature.

C Quadratic Vector Spaces and the Orthogonal Group

We provide a short introduction to quadratic spaces and define the orthogonal group of a (not necessarily definite) quadratic space. We use these definitions in our analysis of how the Clifford group relates to the orthogonal group. We will always denote by F a field of characteristic different from 2, char(F) ≠ 2. Sometimes we will specialize to the real numbers F = R. Let V be a vector space over F of finite dimension dim_F V = n.
We will follow [Ser12].

Definition C.1 (Quadratic forms and quadratic vector spaces). A map q : V → F will be called a quadratic form on V if for all c ∈ F and v ∈ V:

q(c · v) = c^2 · q(v), (17)

and

b(v1, v2) := 1/2 (q(v1 + v2) − q(v1) − q(v2)), (18)

is a bilinear form over F in v1, v2 ∈ V, i.e., it is separately F-linear in each of the arguments v1 and v2 when the other one is fixed. The tuple (V, q) will then be called a quadratic (vector) space.

Remark C.2. 1. Note that we explicitly do not make assumptions about the non-degeneracy of b. Even the extreme case with constant q = 0 is allowed and of interest. 2. Further note that b will automatically be a symmetric bilinear form.

Definition C.3 (The radical subspace). Now consider the quadratic space (V, q). We then call the subspace:

R := {f ∈ V | ∀v ∈ V: b(f, v) = 0}, (19)

the radical subspace of (V, q).

Remark C.4. 1. The radical subspace of (V, q) is the biggest subspace of V on which q is degenerate. Note that this space is orthogonal to all other subspaces of V. 2. If W is any complementary subspace of R in V, so that V = R ⊕ W, then q restricted to W is non-degenerate.

Definition C.5 (Orthogonal basis). A basis e1, . . . , en of V is called an orthogonal basis of V if for all i ≠ j we have:

b(ei, ej) = 0. (20)

It is called an orthonormal basis if, in addition, q(ei) ∈ {−1, 0, +1} for all i = 1, . . . , n.

Remark C.6. Note that every quadratic space (V, q) has an orthogonal basis by [Ser12] p. 30, Thm. 1, but not necessarily an orthonormal basis. However, if F = R, then (V, q) has an orthonormal basis by Sylvester's law of inertia, see [Syl52].

Definition C.7 (The orthogonal group). For a quadratic space (V, q) we define the orthogonal group of (V, q) as follows:

O(q) := O(V, q) := {Φ : V → V | Φ an F-linear automorphism s.t. ∀v ∈ V: q(Φ(v)) = q(v)}. (21)

If (V, q) = R^(p,q,r), i.e., if F = R and V = R^n and q has the signature (p, q, r) with p + q + r = n, then we define the group of orthogonal matrices of signature (p, q, r) as follows:

O(p, q, r) := {O ∈ GL(n) | O^T Δ_(p,q,r) O = Δ_(p,q,r)}, (22)

where we used the (n × n)-diagonal signature matrix:

Δ_(p,q,r) := diag(+1, . . . , +1, −1, . . . , −1, 0, . . . , 0), with +1 appearing p times, −1 appearing q times, and 0 appearing r times.

Theorem C.8 (See [YCo20]). Let (V, q) be a finite dimensional quadratic space over a field F with char(F) ≠ 2. Let R ⊆ V be the radical subspace, r := dim R, and W ⊆ V a complementary subspace, m := dim W. Then we get an isomorphism:

O(V, q) ≅ \begin{pmatrix} O(W, q|_W) & 0_{m×r} \\ M(r, m) & GL(r) \end{pmatrix} := { \begin{pmatrix} O & 0 \\ M & G \end{pmatrix} | O ∈ O(W, q|_W), M ∈ M(r, m), G ∈ GL(r) },

where M(r, m) := F^{r×m} is the additive group of all (r × m)-matrices with coefficients in F and where GL(r) is the multiplicative group of all invertible (r × r)-matrices with coefficients in F.

Proof. Let e1, . . . , em be an orthogonal basis for W and f1, . . . , fr a basis for R. Then the associated bilinear form b of q has the matrix representation \begin{pmatrix} Q & 0 \\ 0 & 0 \end{pmatrix} with an invertible diagonal (m × m)-matrix Q. Writing the matrix of any orthogonal automorphism Φ of V in the corresponding block form \begin{pmatrix} A & B \\ C & D \end{pmatrix}, we get the necessary and sufficient condition:

\begin{pmatrix} Q & 0 \\ 0 & 0 \end{pmatrix} = \begin{pmatrix} A^T & C^T \\ B^T & D^T \end{pmatrix} \begin{pmatrix} Q & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} A & B \\ C & D \end{pmatrix} = \begin{pmatrix} A^T Q A & A^T Q B \\ B^T Q A & B^T Q B \end{pmatrix}.

This is equivalent to the conditions:

Q = A^T Q A, 0 = A^T Q B, 0 = B^T Q B. (28)

This shows that A is the matrix of an orthogonal automorphism of W (w.r.t. b|_W), which, since Q is invertible, is also invertible. The second equation then shows that necessarily B = 0. Since the whole matrix needs to be invertible, D must also be invertible. Furthermore, there are no constraints on C. Conversely, if all those conditions are satisfied, the whole matrix satisfies the orthogonality constraints from above.
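To make the block decomposition of Theorem C.8 (and Corollary C.9 below) concrete, the following NumPy sketch checks it numerically for one degenerate real signature. It is illustrative only and not part of the released implementation; the chosen signature, the matrix name Delta, and the helper preserves_q are assumptions made for this example.

```python
import numpy as np

# Illustrative check of Theorem C.8 / Corollary C.9: block matrices of the form
# [[O, 0], [M, G]] preserve a degenerate quadratic form of signature (p, q, r),
# while a nonzero upper-right block cannot.

p, q_, r = 2, 1, 1                       # signature (p, q, r); q_ avoids clashing with the form q
Delta = np.diag([1.0] * p + [-1.0] * q_ + [0.0] * r)   # diagonal signature matrix of R^(p,q,r)

def preserves_q(Phi):
    """Phi lies in O(p, q, r) iff Phi^T Delta Phi = Delta."""
    return np.allclose(Phi.T @ Delta @ Phi, Delta)

rng = np.random.default_rng(0)
t = rng.normal()

# O in O(p, q): a hyperbolic rotation mixing the last + direction with the - direction.
O = np.array([[1.0, 0.0,        0.0       ],
              [0.0, np.cosh(t), np.sinh(t)],
              [0.0, np.sinh(t), np.cosh(t)]])
M = rng.normal(size=(r, p + q_))         # arbitrary: no constraint on this block
G = rng.normal(size=(r, r)) + np.eye(r)  # generically invertible block acting on the radical part

Phi = np.block([[O, np.zeros((p + q_, r))],
                [M, G]])
assert preserves_q(Phi)                  # the block structure suffices, whatever M and G are

Phi_bad = Phi.copy()
Phi_bad[0, p + q_] = 1.0                 # a nonzero upper-right block violates A^T Q B = 0
assert not preserves_q(Phi_bad)
```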
Corollary C.9. For R^(p,q,r) we get:

O(p, q, r) ≅ \begin{pmatrix} O(p, q) & 0_{(p+q)×r} \\ M(r, p+q) & GL(r) \end{pmatrix},

where O(p, q) := O(p, q, 0).

Remark C.10. Note that the composition Φ1 ∘ Φ2 of orthogonal automorphisms of (V, q) corresponds to matrix multiplication as follows:

\begin{pmatrix} O_1 & 0 \\ M_1 & G_1 \end{pmatrix} \begin{pmatrix} O_2 & 0 \\ M_2 & G_2 \end{pmatrix} = \begin{pmatrix} O_1 O_2 & 0 \\ M_1 O_2 + G_1 M_2 & G_1 G_2 \end{pmatrix}.

Definition C.11 (Radical-preserving orthogonal automorphisms). For a quadratic space (V, q) with radical subspace R ⊆ V we define the group of radical-preserving orthogonal automorphisms as follows:

O_R(V, q) := {Φ ∈ O(V, q) | Φ|_R = id_R}. (31)

If (V, q) = R^(p,q,r), i.e., if F = R and V = R^n and q has the signature (p, q, r) with p + q + r = n, then we define the group of radical-preserving orthogonal matrices of signature (p, q, r) as follows:

O_R(p, q, r) := {O ∈ O(p, q, r) | bottom-right (r × r)-corner of O = I_r} ≅ \begin{pmatrix} O(p, q) & 0_{(p+q)×r} \\ M(r, p+q) & I_r \end{pmatrix},

where I_r is the (r × r)-identity matrix.

Remark C.12. Note that the matrix representation of Φ ∈ O_R(q) w.r.t. an orthogonal basis as in the proof of Theorem C.8 is of the form:

\begin{pmatrix} O & 0_{m×r} \\ M & I_r \end{pmatrix},

with O ∈ O(W, q|_W), M ∈ M(r, m), and where I_r is the (r × r)-identity matrix. The composition Φ1 ∘ Φ2 of Φ1, Φ2 ∈ O_R(q) is then given by the corresponding matrix multiplication:

\begin{pmatrix} O_1 & 0 \\ M_1 & I_r \end{pmatrix} \begin{pmatrix} O_2 & 0 \\ M_2 & I_r \end{pmatrix} = \begin{pmatrix} O_1 O_2 & 0 \\ M_1 O_2 + M_2 & I_r \end{pmatrix}.

By observing the left column (the only part that does not transform trivially), we see that O(q|_W) acts on M(r, m) just by matrix multiplication from the right (in the corresponding basis):

(O_1, M_1) ∘ (O_2, M_2) = (O_1 O_2, M_1 O_2 + M_2). (35)

This immediately shows that we can write O_R(q) as the semi-direct product:

O_R(q) ≅ O(q|_W) ⋉ M(r, m). (36)

In the special case of R^(p,q,r) we get:

O_R(p, q, r) ≅ O(p, q) ⋉ M(r, p + q). (37)

We conclude this chapter by citing the theorem of Cartan and Dieudonné about the structure of orthogonal groups in the non-degenerate case, but still for arbitrary fields F with char(F) ≠ 2.

Theorem C.13 (Theorem of Cartan-Dieudonné, see [Art57] Thm. 3.20). Let (V, q) be a non-degenerate quadratic space of finite dimension dim V = n < ∞ over a field F of char(F) ≠ 2. Then every element g ∈ O(V, q) can be written as:

g = r_1 ∘ · · · ∘ r_k, (38)

with 1 ≤ k ≤ n, where the r_i are reflections w.r.t. non-singular hyperplanes.

D The Clifford Algebra and Typical Constructions

In this section, we provide the required definitions, constructions, and derivations leading to the results stated in the main paper. We start with a general introduction to the Clifford algebra.

D.1 The Clifford Algebra and its Universal Property

We follow [LS09, Cru90]. Let (V, q) be a finite dimensional quadratic space over a field F (also denoted an F-vector space) with char(F) ≠ 2. We abbreviate the corresponding bilinear form b on vectors v1, v2 ∈ V as follows:

b(v1, v2) := 1/2 (q(v1 + v2) − q(v1) − q(v2)). (39)

Definition D.1. To define the Clifford algebra Cl(V, q) we first consider the tensor algebra of V:

T(V) := ⊕_{m≥0} V^{⊗m} = span{v1 ⊗ · · · ⊗ vm | m ≥ 0, vi ∈ V}, (40)

V^{⊗m} := V ⊗ · · · ⊗ V (m times), V^{⊗0} := F, (41)

and the following two-sided ideal9:

I(q) := ⟨v ⊗ v − q(v) · 1_{T(V)} | v ∈ V⟩. (42)

Then we define the Clifford algebra Cl(V, q) as the following quotient:

Cl(V, q) := T(V)/I(q). (43)

In words, we identify the square of a vector with its quadratic form. We also denote the canonical algebra quotient map as:

π : T(V) → Cl(V, q). (44)

Remark D.2. 1. It is not easy to see, but always true, that dim Cl(V, q) = 2^n, where n := dim V, see Theorem D.15. 2. If e1, . . . , en is any basis of (V, q) then (e_A)_{A⊆[n]} is a basis for Cl(V, q), where we put for a subset A ⊆ [n] := {1, . . . , n}:

e_A := ∏_{i∈A} ei, e_∅ := 1_{Cl(V,q)},
(45) where the product is taken in increasing order of the indices i A, see Corollary D.16. 3. If e1, . . . , en is any orthogonal basis of (V, q), then one can even show that (e A)A [n] is an orthogonal basis for Cl(V, q) w.r.t. an extension of the bilinear form b from V to Cl(V, q), see Theorem D.26. Lemma D.3 (The fundamental identity). Note that for v1, v2 V , we always have the fundamental identity in Cl(V, q): v1v2 + v2v1 = 2b(v1, v2). (46) Proof. By definition of the Clifford algebra we have the identities: q(v1) + v1v2 + v2v1 + q(v2) = v1v1 + v1v2 + v2v1 + v2v2 (47) = (v1 + v2)(v1 + v2) (48) = q(v1 + v2) (49) = b(v1 + v2, v1 + v2) (50) = b(v1, v1) + b(v1, v2) + b(v2, v1) + b(v2, v2) (51) = q(v1) + 2b(v1, v2) + q(v2). (52) Substracting q(v1) + q(v2) on both sides gives the claim. 9The ideal ensures that for all elements containing v v (i.e., linear combinations, multiplications on the left, and multiplications on the right), v v is identified with q(v); e.g. x v v y x (q(v) 1T(V )) y for every x, y T(V ). Further, the Clifford algebra Cl(V, q) is fully characterized by the following property. Theorem D.4 (The universal property of the Clifford algebra). For every F-algebra10 (an algebra over field F) A and every F-linear map f : V A such that for all v V we have: f(v)2 = q(v) 1A, (53) there exists a unique F-algebra homomorphism11 f : Cl(V, q) A such that f(v) = f(v) for all v V . More explicitely, if f satisfies equation 53 and x Cl(V, q). Then we can take any representation of x of the following form: i I ci vi,1 vi,ki, (54) with finite index sets I and ki N and coefficients c0, ci F and vectors vi,j V , j = 1, . . . , ki, i I, and, then we can compute f(x) by the following formula (without ambiguity): f(x) = c0 1A + X i I ci f(vi,1) f(vi,ki). (55) In the following, we will often denote f again with f without further indication. D.2 The Multivector Filtration and the Grade Note that the Clifford algebra is a filtered algebra. Definition D.5. We define the multivector filtration12 of Cl(V, q) for grade m N0 as follows: Cl( m)(V, q) := π T( m)(V ) , T( m)(V ) := l=0 V l. (56) Remark D.6. Note that we really get a filtration on the space Cl(V, q): F = Cl( 0)(V, q) Cl( 1)(V, q) Cl( 2)(V, q) . . . Cl( n)(V, q) = Cl(V, q). (57) Furthermore, note that this is compatible with the algebra structure of Cl(V, q), i.e. for i, j 0 we get: x Cl( i)(V, q) y Cl( j)(V, q) = xy Cl( i+j)(V, q). (58) Together with the equality: π(T( m)(V )) = Cl( m)(V, q) for all m, we see that the natural map: π : T(V ) Cl(V, q), (59) is a surjective homomorphism of filtered algebras. Indeed, since Cl(V, q) is a well-defined algebra, since we modded out a two-sided ideal, π is clearly a homomorphism of filtered algebras. As a quotient map π is automatically surjective. Definition D.7 (The grade of an element). For x Cl(V, q) \ {0} we define its grade through the following condition: grd x := k such that x Cl( k)(V, q) \ Cl( k 1)(V, q), (60) grd 0 := . (61) 10For the purpose of this text an algebra is always considered to be associative and unital (containing an identity element), but not necessarily commutative. 11An algebra homomorphism is both linear and multiplicative. 12A filtration F is an indexed family (Ai)i I (I is an ordered index set) of subsets of an algebraic structure A such that for i j : Ai Aj. D.3 The Parity Grading In this section, we introduce the parity grading of the Clifford algebra. 
We will use the parity grading to later construct the (adjusted) twisted conjugation map, which will be used as a group action on the Clifford algebra. Definition D.8 (The main involution). The linear map: α : V Cl(V, q), α(v) := v, (62) satisfies ( v)2 = v2 = q(v). The universal property of Cl(V, q) thus extends α to a unique algebra homomorphism: α : Cl(V, q) Cl(V, q), α i I ci vi,1 vi,ki i I ci α(vi,1) α(vi,ki) (64) i I ( 1)ki ci vi,1 vi,ki, (65) for any finite sum representation with vi,j V and ci F. This extension α will be called the main involution13 of Cl(V, q). Definition D.9 (Parity grading). The main involution α of Cl(V, q) now defines the parity grading of Cl(V, q) via the following homogeneous parts: Cl[0](V, q) := {x Cl(V, q) | α(x) = x} , (66) Cl[1](V, q) := {x Cl(V, q) | α(x) = x} . (67) With this we get the direct sum decomposition: Cl(V, q) = Cl[0](V, q) Cl[1](V, q), (68) x = x[0] + x[1], x[0] := 1 2(x + α(x)), x[1] := 1 2(x α(x)), (69) with the homogeneous parts x[0] Cl[0](V, q) and x[1] Cl[1](V, q). We define the parity of an (homogeneous) element x Cl(V, q) as follows: ( 0 if x Cl[0](V, q), 1 if x Cl[1](V, q). (70) Definition D.10 (Z/2Z-graded algebras). An F-algebra (A, +, ) together with a direct sum decomposition of sub-vector spaces: A = A[0] A[1], (71) is called a Z/2Z-graded algebra if for every i, j Z/2Z we always have: x A[i] y A[j] = x y A[i+j]. (72) Note that [i + j] is meant here to be computed modulo 2. The Z/2Z-grade of an (homogeneous) element x A will also be called the parity of x: prt(x) := 0 if x A[0], 1 if x A[1]. (73) The above requirement then implies for homogeneous elements x, y A the relation: prt(x y) = prt(x) + prt(y) mod 2. (74) We can now summarize the results of this section as follows: Theorem D.11. The Clifford algebra Cl(V, q) is a Z/2Z-graded algebra in its parity grading. 13An involution is a map that is its own inverse. D.4 The Dimension of the Clifford Algebra In this subsection we determine the dimension of the Clifford algebra, which allows us to construct bases for the Clifford algebra. In the following, we again let F be any field of char(F) = 2. Definition/Lemma D.12 (The twisted tensor product of Z/2Z-graded F-algebras). Let A and B be two Z/2Z-graded algebras over F. Then their twisted tensor product Aˆ B is defined via the usual tensor product A B of F-vector spaces, but where the product is defined on homogeneous elements a1, a2 A, b1, b2 B via: (a1 ˆ b1) (a2 ˆ b2) := ( 1)prt(b1) prt(a2)(a1a2)ˆ (b1b2). (75) This turns Aˆ B also into a Z/2Z-graded algebra over F with the Z/2Z-grading: (Aˆ B)[0] := A[0] B[0] A[1] B[1] , (76) (Aˆ B)[1] := A[0] B[1] A[1] B[0] . (77) In particular, if a A, b B are homogeneous elements then we have: prt A ˆ B(aˆ b) = prt A(a) + prt B(b) mod 2. (78) Proof. By definition we already know that Aˆ B is an F-vector space. So we only need to investigate the multiplication and Z/2Z-grading. For homogenous elements a1, a2, a3 A, b1, b2, b3 B we have: (a1 ˆ b1) (a2 ˆ b2) (a3 ˆ b3) (79) = ( 1)prt(b1) prt(a2) (a1a2)ˆ (b1b2) (a3 ˆ b3) (80) = ( 1)prt(b1) prt(a2) ( 1)prt(b1b2) prt(a3)(a1a2a3)ˆ (b1b2b3) (81) = ( 1)prt(b1) prt(a2)+prt(b1) prt(a3)+prt(b2) prt(a3)(a1a2a3)ˆ (b1b2b3) (82) = ( 1)prt(b1) prt(a2a3) ( 1)prt(b2) prt(a3)(a1a2a3)ˆ (b1b2b3) (83) = ( 1)prt(b2) prt(a3)(a1 ˆ b1) (a2a3)ˆ (b2b3) (84) = (a1 ˆ b1) (a2 ˆ b2) (a3 ˆ b3) . (85) This shows associativity of multiplication on homogeneous elements, which extends by linearity to general elements. The distributive law is clear. 
To check that we have a Z/2Z-grading, note that for homogeneous elements a1, a2 A, b1, b2 B we have: prt A ˆ B (a1 ˆ b1) (a2 ˆ b2) = prt A ˆ B (( 1)prt(a2) prt(b1)a1a2)ˆ (b1b2) (86) = prt A(a1a2) + prt B(b1b2) (87) = prt A(a1) + prt A(a2) + prt B(b1) + prt B(b2) (88) = prt A ˆ B a1 ˆ b1 + prt A ˆ B a2 ˆ b2 mod 2. (89) The general case follows by linear combinations. Remark D.13 (The universal property of the twisted tensor product of Z/2Z-graded F-algebras). Let A1, A2, B be Z/2Z-graded F-algebras. Consider Z/2Z-graded F-algebra homomorphisms: ψ1 : A1 B, ψ2 : A2 B, (90) such that for all homogeneous elements a1 A1 and a2 A2 we have: ψ1(a1) ψ2(a2) = ( 1)prt(a1) prt(a2) ψ2(a2) ψ1(a1). (91) Then there exists a unique Z/2Z-graded F-algebra homomorphism: ψ : A1 ˆ A2 B, (92) ψ ϕ1 = ψ1, ψ ϕ2 = ψ2, (93) where ϕ1, ϕ2 are the following Z/2Z-graded F-algebra homomorphisms: ϕ1 : A1 A1 ˆ A2, a1 7 a1 ˆ 1, (94) ϕ2 : A2 A1 ˆ A2, a2 7 1ˆ a2, (95) which satisfy for all homogeneous elements a1 A1 and a2 A2: ϕ1(a1) ϕ2(a2) = ( 1)prt(a1) prt(a2) ϕ2(a2) ϕ1(a1). (96) Furthermore, A1 ˆ A2 together with ϕ1, ϕ2 is uniquely characterized as a Z/2Z-graded F-algebra by the above property (when considering all possible such B and ψ1, ψ2). Proposition D.14. Let (V, q) be a finite dimensional quadratic vector space over F, char(F) = 2, with an orthogonal sum decomposition: (V, q) = (V1, q1) (V2, q2). (97) Then the inclusion maps: Cl(Vi, qi) Cl(V, q) induce an isomorphism of Z/2Z-graded F-algebras: Cl(V, q) = Cl(V1, q1)ˆ Cl(V2, q2). (98) In particular: dim Cl(V, q) = dim Cl(V1, q1) dim Cl(V2, q2). (99) Proof. First consider the canonical inclusion maps, l = 1, 2: ϕl : Cl(Vl, ql) Cl(V, q), xl 7 xl. (100) These satisfy for all (parity) homogeneous elements xl Cl(Vl, ql), l = 1, 2, the condition: ϕ1(x1)ϕ2(x2) = x1x2 != ( 1)prt(x1) prt(x2) x2x1 = ( 1)prt(x1) prt(x2) ϕ2(x2)ϕ1(x1). (101) Indeed, we can by linearity reduce to the case that x1 and x2 are products of vectors v1 V1 and v2 V2, resp. Since V1 and V2 are orthogonal to each other by assumption, for those elements we get by the identities D.3: v1v2 = v2v1 + 2 b(v1, v2) | {z } =0 = v2v1. (102) This shows the condition above for x1 and x2. We then define the F-bilinear map: Cl(V1, q1) Cl(V2, q2) Cl(V, q), (x1, x2) 7 x1x2, (103) which thus factorizes through the F-linear map: ϕ : Cl(V1, q1)ˆ Cl(V2, q2) Cl(V, q), x1 ˆ x2 7 x1x2. (104) We now show that ϕ also respects multiplication. For this let x1, y1 Cl(V1, q1) and x2, y2 Cl(V2, q2) (parity) homogeneous elements. We then get: ϕ (x1 ˆ x2) (y1 ˆ y2) (105) = ϕ ( 1)prt(x2) prt(y1) (x1y1)ˆ (x2y2) (106) = ( 1)prt(x2) prt(y1) x1y1x2y2 (107) = x1x2y1y2 (108) = ϕ(x1 ˆ x2) ϕ(y1 ˆ y2). (109) This shows the multiplicativity of ϕ on homogeneous elements. The general case follows by the F-linearity of ϕ. Note that ϕ also respects the parity grading. Now consider the Z/2Z-graded F-algebra Cl(V1, q1)ˆ Cl(V2, q2) and Z/2Z-graded F-algebra homomorphisms: ψ1 : Cl(V1, q1) Cl(V1, q1)ˆ Cl(V2, q2), x1 7 x1 ˆ 1, (110) ψ2 : Cl(V2, q2) Cl(V1, q1)ˆ Cl(V2, q2), x2 7 1ˆ x2. (111) Note that for all homogeneous elements xl Cl(Vl, ql), l = 1, 2, we have: ψ2(x2) ψ1(x1) = (1ˆ x2) (x1 ˆ 1) (112) = ( 1)prt(x1) prt(x2) x1 ˆ x2 (113) = ( 1)prt(x1) prt(x2) (x1 ˆ 1) (1ˆ x2) (114) = ( 1)prt(x1) prt(x2) ψ1(x1) ψ2(x2). (115) We then define the F-linear map: ψ : V = V1 V2 Cl(V1, q1)ˆ Cl(V2, q2), v = v1 + v2 7 ψ1(v1) + ψ2(v2) =: ψ(v). 
(116) Note that we have: ψ(v)2 = (ψ1(v1) + ψ2(v2)) (ψ1(v1) + ψ2(v2)) (117) = ψ1(v1)2 + ψ2(v2)2 + ψ1(v1) ψ2(v2) + ψ2(v2) ψ1(v1) (118) = ψ1(v1)2 + ψ2(v2)2 + ψ1(v1) ψ2(v2) + ( 1)prt(v1) prt(v2)ψ1(v1) ψ2(v2) | {z } =0 = ψ1(v2 1) + ψ2(v2 2) (120) = q1(v1) ψ1(1Cl(V1,q1)) + q2(v2) ψ2(1Cl(V2,q2)) (121) = q1(v1) 1Cl(V1,q1) ˆ Cl(V2,q2) + q2(v2) 1Cl(V1,q1) ˆ Cl(V2,q2) (122) = (q1(v1) + q2(v2)) 1Cl(V1,q1) ˆ Cl(V2,q2) (123) = q(v) 1Cl(V1,q1) ˆ Cl(V2,q2). (124) By the universal property of the Clifford algebra ψ uniquely extends to an F-algebra homomorphism: ψ : Cl(V, q) Cl(V1, q1)ˆ Cl(V2, q2), (125) vi1 vik 7 (ψ1(vi1,1) + ψ2(vi1,2)) (ψ1(vik,1) + ψ2(vik,2)). (126) One can see from this, by explicit calculation, that ψ also respects the Z/2Z-grading. Furthermore, we see that ψ ϕl = ψl for l = 1, 2, and, also, ϕ ψl = ϕl for l = 1, 2. One easily sees that ϕ and ψ are inverse to each other and the claim follows. Theorem D.15 (The dimension of the Clifford algebra). Let (V, q) be a finite dimensional quadratic vector space over F, char(F) = 2, n := dim V < . Then we have: dim Cl(V, q) = 2n. (127) Proof. Let e1, . . . , en be an orthogonal basis of (V, q) and Vi := span(ei) V , qi := q|Vi. Then we get the orthogonal sum decomposition: (V, q) = (V1, q1) (Vn, qn), (128) and thus by Proposition D.14 the isomorphism of Z/2Z-graded algebras: Cl(V, q) = Cl(V1, q1)ˆ ˆ Cl(Vn, qn), (129) dim Cl(V, q) = dim Cl(V1, q1) dim Cl(Vn, qn), (130) Since dim Vi = 1 and thus: dim Cl(Vi, qi) = dim (F Vi) = 2, (131) for i = 1, . . . , n, we get the claim: dim Cl(V, q) = 2n. (132) Corollary D.16 (Bases for the Clifford algebra). Let (V, q) be a finite dimensional quadratic vector space over F, char(F) = 2, n := dim V < . Let e1, . . . , en be any basis of V , then (e A)A [n] is a basis for Cl(V, q), where we put for a subset A [n] := {1, . . . , n}: i A ei, e := 1Cl(V,q). (133) where the product is taken in increasing order of the indices i A. Proof. Since Cl(V, q) = T(V )/I(q) and T(V ) = span {ei1 eim | m 0, ij [n], j [m]} , (134) we see that: Cl(V, q) = span {ei1 eim | m 0, ij [n], j [m]} . (135) By several applications of the fundamental identities D.3: eikeil = eileik + 2b(eik, eil), eikeik = q(eik), (136) we can turn products ei1 eim into sums of smaller products if some of the occuring indices agree, say ik = il. Furthermore, we can also use those identities to turn the indices in increasing order. This then shows: Cl(V, q) = span {e A | A [n]} . (137) Since # {A [n]} = 2n and dim Cl(V, q) = 2n by Theorem D.15 we see that {e A | A [n]} must already be a basis for Cl(V, q). D.5 Extending the Quadratic Form to the Clifford Algebra We provide the extension of the quadratic form from the vector space to the Clifford algebra, which will lead to the constrution of an orthogonal basis of the Clifford algebra. Definition D.17 (The opposite algebra of an algebra). Let (A, +, ) be an algebra. The opposite algebra (Aop, +, ) is defined to consist of the same underlying vector space (Aop, +) = (A, +), but where the multiplication is reversed in comparison to A, i.e. for x, y Aop we have: x y := y x. (138) Note that this really turns (Aop, +, ) into an algebra. Definition D.18 (The main anti-involution of the Clifford algebra). Consider the following linear map: β : V Cl(V, q)op, v 7 v, (139) which also satisfies v v = vv = q(v). 
By the universal property of the Clifford algebra we get a unique extension to an algebra homomorphism: β : Cl(V, q) Cl(V, q)op, β i I ci vi,1 vi,ki i I ci vi,1 vi,ki (141) i I ci vi,ki vi,1, (142) for any finite sum representation with vi,j V and ci F. We call β the main anti-involution of Cl(V, q). Definition D.19 (The combined anti-involution of the Clifford algebra). The combined anti-involution or Clifford conjugation of Cl(V, q) is defined to be the F-algebra homomorphism: γ : Cl(V, q) Cl(V, q)op, γ(x) := β(α(x)). (143) More explicitely, it is given by the formula: i I ci vi,1 vi,ki i I ( 1)ki ci vi,ki vi,1, (144) for any finite sum representation with vi,j V and ci F. Remark D.20. If e1, . . . , en V is an orthogonal basis for (V, q). Then we have for A [n]: α(e A) = ( 1)|A|e A, β(e A) = ( 1)( |A| 2 )e A, γ(e A) = ( 1)( |A|+1 2 )e A. (145) Recall the definition of a trace of a linear map: Definition D.21 (The trace of an endomorphism). Let Y be a vector space over a field F of dimension dim Y = m < and Φ : Y Y a vector space endomorphism14. Let B = {b1, . . . , bm} be a basis for Y and B = {b 1, . . . , b m} be the corresponding dual basis of Y , defined via: b j(bi) := δi,j. Let A = (ai,j)i=1,...,m, j=1,...,m be the matrix representation of Φ w.r.t. B: j [m]. Φ(bj) = i=1 ai,jbi. (146) Then the trace of Φ is defined via: j=1 b j(Φ(bj)) F. (147) It is a well known fact that Tr(Φ) is not dependent on the initial choice of the basis B. Furthermore, Tr is a well-defined F-linear map (homomorphism of vector spaces): Tr : End F(Y) F, Φ 7 Tr(Φ). (148) We now want to define the projection of x Cl(V, q) onto its zero component x(0) F in a basis independent way. Definition D.22 (The projection onto the zero component). We define the F-linear map: ζ : Cl(V, q) F, ζ(x) := 2 n Tr(x), (149) where n := dim V and Tr(x) := Tr(Lx), where Lx is the endomorphism of Cl(V, q) given by left multiplication with x: Lx : Cl(V, q) Cl(V, q), y 7 Lx(y) := xy. (150) We call ζ the projection onto the zero component. We also often write for x Cl(V, q): x(0) := ζ(x). (151) The name is justified by following property: Lemma D.23. Let e1, . . . , en be a fixed orthogonal basis of (V, q). Then we know that (e A)A [n] is a basis for Cl(V, q). So we can write every x Cl(V, q) as: A [n] x A e A, (152) with x A F, A [n]. The claim is now that we have: ζ(x) != x . (153) Proof. By the linearity of the trace we only need to investigate Tr(e A) for A [n]. For A = , we have e = 1 and we get: B [n] e B(1 e B) = X B [n] 1 = 2n, (154) 14A map from a mathematical object space to itself. which shows: ζ(1) = 2 n Tr(1) = 1. Now, consider A [n] with A = . Let denote the symmetric difference of two sets. Further, we can write to refrain from distinguishing between signs, which will not affect the result. Tr(e A) = X B [n] e B(e Ae B) (155) i A B q(ei) e B(e A B) (156) i A B q(ei) δB,A B (157) i A B q(ei) δ ,A (158) where the third equality follows from the fact that B = A B holds if and only if A = , regardless of B. However, A = was ruled out by assumption, so then in the last equality we always have δ ,A = 0. So for A = we have: ζ(e A) = 2 n Tr(e A) = 0. Altogether we get: ζ(e A) = δA, = 1, if A = , 0, else. (160) With this and linearity we get: A [n] x A e A A [n] x A ζ(e A) = X A [n] x A δA, = x . (161) This shows the claim. Definition D.24 (The bilinear form on the Clifford algebra). 
For our quadratic F-vector space (V, q) with corresponding bilinear form b and corresponding Clifford algebra Cl(V, q) we now define the following F-bilinear form on Cl(V, q): b : Cl(V, q) Cl(V, q) F, b(x, y) := ζ(β(x)y). (162) We also define the corresponding quadratic form on Cl(V, q) via: q : Cl(V, q) F, q(x) := b(x, x) = ζ(β(x)x). (163) We will see below that b and q will agree with b and q, resp., when they are restricted to V . From that point on we will denote b just by b, and q with q, resp., without (much) ambiguity. Lemma D.25. For v, w V we have: b(v, w) = b(v, w). (164) Proof. We pick an orthogonal basis e1, . . . , en for V and write: i=1 ai ei, w = j=1 cj ej. (165) We then get by linearity: j=1 aicj b(ei, ej) (166) j=1 aicj ζ(β(ei)ej) (167) j=1 aicj ζ(eiej) (168) i =j aicj ζ(eiej) | {z } =0 i=j aicj ζ(eiej) (169) i=1 aici q(ei) ζ(1) |{z} =1 q(ei) δi,j= z }| { b(ei, ej) (171) = b(v, w). (173) This shows the claim. Theorem D.26. Let e1, . . . , en be an orthogonal basis for (V, q) then (e A)A [n] is an orthogonal basis for Cl(V, q) w.r.t. the induced bilinear form b. Furthermore, for x, y Cl(V, q) of the form: A [n] x A e A, y = X A [n] y A e A, (174) with x A, y A F we get: b(x, y) = X A [n] x A y A Y i A q(ei), q(x) = X A [n] x2 A Y i A q(ei). (175) Note that: q(e ) = q(1) = 1. Proof. We already know that (e A)A [n] is a basis for Cl(V, q). So we only need to check the orthogonality condition. First note that for e C = ei1 eir we get: q(e C) = ζ(β(e C)e C) = ζ(eir ei1 ei1 eir) = q(ei1) q(eir). (176) Now let A, B [n] with A = B, i.e. with A B = . We then get: b(e A, e B) = ζ(β(e A)e B) = Y i A B q(ei) ζ(e A B) = 0. (177) This shows that (e A)A [n] is an orthogonal basis for Cl(V, q). For x, y Cl(V, q) from above we get: b(x, y) = b A [n] x A e A, X B [n] y B e B B [n] x A y B b(e A, e B) (179) B [n] x A y B q(e A) δA,B (180) A [n] x A y A q(e A) (181) A [n] x A y A Y i A q(ei). (182) This shows the claim. D.6 The Multivector Grading Now that we have an orthogonal basis for the algebra, we show that the Clifford algebra allows a vector space grading that is independent of the chosen orthogonal basis. Let (V, q) be a quadratic space over a field F with char(F) = 2 and dim V = n < . In the following we present a technical proof that works for fields F with char(F) = 2. An alternative, simpler and more structured proof, but for the more restrictive case of char(F) = 0, can be found in Theorem D.37 later. Theorem D.27 (The multivector grading of the Clifford algebra). Let e1, . . . , en be an orthogonal basis of (V, q). Then for every m = 0, . . . , n we define the following sub-vector space of Cl(V, q): Cl(m)(V, q) := span {ei1 eim | 1 i1 < < im n} (183) = span {e A | A [n], |A| = m} , (184) where Cl(0)(V, q) := F. Then the sub-vector spaces Cl(m)(V, q), m = 0, . . . , n, are independent of the choice of the orthogonal basis, i.e. if b1, . . . , bn is another orthogonal basis of (V, q), then: Cl(m)(V, q) = span {bi1 bim | 1 i1 < < im n} . (185) Proof. First note that by the orthogonality and the fundamental relation of the Clifford algebra we have for all i = j: eiej = ejei, bibj = bjbi. (186) We now abbreviate: B(m) := span {bj1 bjm | 1 j1 < < jm n} , (187) and note that: Cl(m)(V, q) = span {ei1 eim | 1 i1 < < im n} (188) = span {ei1 eim | 1 i1, . . . , im n, s = t. is = it} . (189) We want to show that for 1 j1 < < jm n we have that: bj1 bjm Cl(m)(V, q). (190) Since we have two bases we can write an orthogonal change of basis: i=1 ai,jei V = Cl(1)(V, q). 
(191) Using this, we can now write the above product as the sum of two terms: bj1 bjm = X i1,...,im ai1,j1 aim,jm ei1 eim (192) i1,...,im s =t. is =it ai1,j1 aim,jm ei1 eim + X i1,...,im s =t. is=it ai1,j1 aim,jm ei1 eim. Our claim is equivalent to the vanishing of the second term. Note that the above equation for bj already shows the claim for m = 1. The case m = 0 is trivial. We now prove the claim for m = 2 by hand before doing induction after. Recall that j1 = j2: i1,i2 i1 =i2 ai1,j1ai2,j2 ei1ei2 + i=1 ai,j1ai,j2 eiei (194) i1,i2 i1 =i2 ai1,j1ai2,j2 ei1ei2 + i=1 ai,j1 ai,j2 q(ei) | {z } =b(ei,bj2) i1,i2 i1 =i2 ai1,j1ai2,j2 ei1ei2 + b(bj1, bj2) | {z } =0 i1,i2 i1 =i2 ai1,j1ai2,j2 ei1ei2 (197) Cl(2)(V, q). (198) This shows the claim for m = 2. By way of induction we now assume that we have shown the claim until some m 2, i.e. we have: bj1 bjm = X i1,...,im k =l. ik =il ai1,j1 aim,jmei1 eim Cl(m)(V, q). (199) Now consider another bj with j := jm+1 = jk, k = 1, . . . , m. We then get: bj1 bjmbj = i1,...,im k =l. ik =il ai1,j1 aim,jmei1 eim i1,...,im k =l. ik =il ai1,j1 aim,jmai,jei1 eimei (201) i1,...,im k =l. ik =il i/ {i1,...,im} ai1,j1 aim,jmai,jei1 eimei (202) i1,...,im k =l. ik =il i {i1,...,im} ai1,j1 aim,jmai,jei1 eimei (203) i1,...,im,im+1 k =l. ik =il ai1,j1 aim,jmaim+1,jm+1ei1 eimeim+1 (204) i1,...,im k =l. ik =il is=i ai1,j1 aim,jmai,jei1 eimei. (205) We have to show that the last term vanishes. The last term can be written as: n X i1,...,im k =l. ik =il is=i ai1,j1 aim,jmai,j ei1 eimei (206) i1,...,im k =l. ik =il is=i ai1,j1 ais,js aim,jmai,j ei1 eis eimei (207) i1,...,im k =l. ik =il is=i ai1,j1 ais,js aim,jmai,j ei1 eis eimeisei ( 1)m s (208) i1,...,im k =l. ik =il is=i ai1,j1 ais,js aim,jmai,j ei1 eis eim ( 1)m sq(ei) (209) i1,...,im k =l. ik =il is=i ai1,j1 ais,js aim,jmai,j ei1 eis eim ( 1)m sq(ei) (210) i1,...,im k =l. ik =il ai1,j1 ais,js aim,jmais,j ei1 eis eim ( 1)m sq(eis) (211) i1,..., is,...,im k =l. ik =il is=1 is / {i1,., is,.,im} ai1,j1 ais,js aim,jmais,j ei1 eis eim ( 1)m sq(eis) i1,..., is,...,im k =l. ik =il is=1 ai1,j1 ais,js aim,jmais,j ei1 eis eim ( 1)m sq(eis) (213) i1,..., is,...,im k =l. ik =il is {i1,., is,.,im} ai1,j1 ais,js aim,jmais,j ei1 eis eim ( 1)m sq(eis) i1,..., is,...,im k =l. ik =il ai1,j1 ais,js aim,jm ei1 eis eim ( 1)m s n X is=1 ais,jsais,jq(eis) | {z } =b(bj,bjs)=0 (215) i1,..., is,...,im k =l. ik =il is {i1,., is,.,im} ai1,j1 ais,js aim,jmais,j ei1 eis eim ( 1)m sq(eis) i1,..., is,...,im k =l. ik =il i {i1,., is,.,im} ai1,j1 ai,js aim,jmai,j ei1 eis eim ( 1)m sq(ei) i1,., is,.,it,.,im k =l. ik =il ai1,j1 ait,js aim,jmait,j ei1 eis eim ( 1)m sq(eit). Note that the positional index t can occure before or after the positional index s. Depending of its position the elements eit will appear before or after the element eis in the product. We look at both cases separately and suppress the dots in between for readability. First consider t > s: i1,., is,.,it,.,im k =l. ik =il ai1,j1 ait,js ait,jt aim,jmait,j ei1 eis eit eim ( 1)m sq(eit) (219) i1,., is,.,it,.,im k =l. ik =il ai1,j1 ait,js ait,jt aim,jmait,j ( 1)t 2eit ei1 eis eit eim ( 1)m sq(eit) it=1 ait,js ait,jt ait,j( 1)m sq(eit)( 1)teit πs,t(it) (221) with πs,t(i) := X i1,., is,., it,.,im k =l. ik =il k. ik =i ai1,j1 ait,js ait,jt aim,jm ei1 eis eit eim (222) i=1 ( 1)m+s+tai,js ai,jt ai,j q(ei) ei πs,t(i) (223) y(s, t), (224) with y(s, t) := i=1 ( 1)m+s+tai,js ai,jt ai,j q(ei) ei πs,t(i). (225) It is important to note that for all s = t we have: y(s, t) = y(t, s). 
(226) Now consider t < s: i1,.,it,., is,.,im k =l. ik =il ai1,j1 ait,jt ait,js aim,jmait,j ei1 eit eis eim ( 1)m sq(eit) (227) i1,.,it,., is,.,im k =l. ik =il ai1,j1 ait,jt ait,js aim,jmait,j ( 1)t 1eit ei1 eit eis eim ( 1)m sq(eit) it=1 ait,js ait,jt ait,j( 1)m sq(eit)( 1)t 1eit πs,t(it) (229) i=1 ( 1)m+s+t+1ai,js ai,jt ai,j q(ei) ei πs,t(i) (230) y(s, t) (231) y(t, s) (232) y(s, t) (233) y(s, t). (234) In total we see that both terms appear with a different sign and thus cancel out. This shows the claim. Corollary D.28. Let e1, . . . , en and b1, . . . , bn be two orthogonal bases of (V, q) with basis transition matrix C = (ci,j)i [n],j [n]: j [n]. bj = X i [n] ci,j ei. (235) Then C is invertible and we have the following matrix relations: diag(q(b1), . . . , q(bn)) = C diag(q(e1), . . . , q(en))C. (236) Furthermore, for every subset J [n] we have the formula: I [n] |I|=|J| det CI,J e I, (237) with the submatrix: CI,J = (ci,j)i I,j J. Proof. For j, l [n] we have: diag(q(b1), . . . , q(bn))j,l = q(bj) δj,l (238) = b(bj, bl) (239) k=1 ci,j ck,l b(ei, ek) (240) k=1 ci,j ck,l q(ei) δi,k (241) = C diag(q(e1), . . . , q(en))C j,l . (242) This shows the matrix identity: diag(q(b1), . . . , q(bn)) = C diag(q(e1), . . . , q(en))C. (243) Furthermore, we have the following identites: b J = bj1 bjm (244) i1 [n] ci1,j1 ei1 im [n] cim,jm eim i1 [n],...,im [n] (ci1,j1 cim,jm) (ei1 eim) (246) i1 [n],...,im [n] |{i1,...,im}|=m (ci1,j1 cim,jm) (ei1 eim) (247) i1 [n],...,im [n] |{i1,...,im}|=m sgn(i1, . . . , im) (ci1,j1 cim,jm) e{i1,...,im} (248) i1,...,im [n] i1<