Brauer's Group Equivariant Neural Networks

Edward Pearce-Crump 1

1 Department of Computing, Imperial College London, United Kingdom. Correspondence to: Edward Pearce-Crump.

Proceedings of the 40th International Conference on Machine Learning, Honolulu, Hawaii, USA. PMLR 202, 2023. Copyright 2023 by the author(s).

Abstract

We provide a full characterisation of all of the possible group equivariant neural networks whose layers are some tensor power of R^n for three symmetry groups that are missing from the machine learning literature: O(n), the orthogonal group; SO(n), the special orthogonal group; and Sp(n), the symplectic group. In particular, we find a spanning set of matrices for the learnable, linear, equivariant layer functions between such tensor power spaces in the standard basis of R^n when the group is O(n) or SO(n), and in the symplectic basis of R^n when the group is Sp(n).

1. Introduction

Finding neural network architectures that are equivariant to a symmetry group has been an active area of research ever since it was first shown how convolutional neural networks, which are equivariant to translations, could be used to learn from images. Unlike with multilayer perceptrons, however, the requirement that the overall network be equivariant to the symmetry group typically restricts the form of the network itself. Moreover, since these networks exhibit parameter sharing within each layer, they ordinarily have far fewer parameters than multilayer perceptrons. This usually results in simpler, more interpretable models that generalise better to unseen data.

Symmetry groups naturally appear in problems coming from physics, where the data that is generated by a physical process often comes with a certain type of symmetry that is baked into the data itself. The data is typically high dimensional, and it can often be represented in the form of a high order tensor so that complex relationships between different features in the data can be captured. Consequently, it is important to be able to construct neural networks that can learn efficiently from such data.

There are two approaches that are typically used for constructing group equivariant neural networks. The first employs a universal approximation theorem to learn functions that are approximately equivariant, as in (Kumagai & Sannai, 2020). The second involves decomposing tensor product representations of the symmetry group in question into irreducible representations. For example, neural networks that are equivariant to the special orthogonal group SO(3) (Kondor et al., 2018), the special Euclidean group SE(3) (Weiler et al., 2018), and the proper orthochronous Lorentz group SO+(1, 3) (Bogatskiy et al., 2020) all use irreducible decompositions and the resulting change of basis transformations into Fourier space in their implementations. However, for most groups, finding this irreducible decomposition is not trivial, since the relevant Clebsch-Gordan coefficients are typically unknown. Furthermore, even if such a decomposition can be found, the resulting neural networks are often inefficient, since forward and backward Fourier transforms are usually required to perform the calculations, and these come with a high computational cost.
In this paper, we take an entirely different approach, one which results in a full characterisation of all of the possible group equivariant neural networks whose layers are some tensor power of R^n for the following three symmetry groups: O(n), the orthogonal group; SO(n), the special orthogonal group; and Sp(n), the symplectic group.

Our approach is motivated by a mathematical concept that, to the best of our knowledge, has not appeared in any of the machine learning literature to date, other than in (Pearce-Crump, 2022). In that paper, the author uses the fact that the symmetric group is in Schur-Weyl duality with the partition algebra to provide a full characterisation of all of the possible permutation equivariant neural networks whose layers are some tensor power of R^n. In this paper, we use as our motivation the results given in On Algebras Which are Connected with the Semisimple Continuous Groups (Brauer, 1937). Brauer showed that the orthogonal group is in Schur-Weyl duality with an algebra called the Brauer algebra; that the symplectic group is also in Schur-Weyl duality with the Brauer algebra; and that the special orthogonal group is in Schur-Weyl duality with an algebra which we have termed the Brauer-Grood algebra. By adapting the combinatorial diagrams that form a basis for these algebras, we are able to find a spanning set of matrices for the learnable, linear, equivariant layer functions between tensor power spaces of R^n in the standard basis of R^n when the group is O(n) or SO(n), and in the symplectic basis of R^n when the group is Sp(n). In doing so, we avoid having to calculate any irreducible decompositions for the tensor power spaces, and therefore avoid having to perform any Fourier transforms to change the basis accordingly.

The main contributions of this paper are as follows:

1. We are the first to show how the combinatorics underlying the Brauer and Brauer-Grood vector spaces, adapted from the Schur-Weyl dualities established by Brauer (1937), provides the theoretical background for constructing group equivariant neural networks for the orthogonal, special orthogonal, and symplectic groups when the layers are some tensor power of R^n.

2. We find a spanning set of matrices for the learnable, linear, equivariant layer functions between such tensor power spaces in the standard basis of R^n when the group is O(n) or SO(n), and in the symplectic basis of R^n when the group is Sp(n).

3. We generalise our diagrammatical approach to show how to construct neural networks that are equivariant to local symmetries.

4. We suggest that Schur-Weyl duality is a powerful mathematical concept that could be used to characterise other group equivariant neural networks beyond those considered in this paper.

2. Preliminaries

We choose our field of scalars to be R throughout. Tensor products are also taken over R, unless otherwise stated. Also, we let [n] represent the set {1, . . . , n}.

Recall that a representation of a group G is a choice of vector space V over R and a group homomorphism

ρ : G → GL(V)   (1)

We choose to focus on finite-dimensional vector spaces V that are some tensor power of R^n in this paper. We often abuse our terminology by calling V a representation of G, even though the representation is technically the homomorphism ρ. When the homomorphism ρ needs to be emphasised alongside its vector space V, we will use the notation (V, ρ).
3. Group Equivariant Neural Networks

Group equivariant neural networks are constructed by alternately composing linear and non-linear G-equivariant maps between representations of a group G. The following is based on the material presented in (Lim & Nelson, 2022). We first define G-equivariance:

Definition 3.1. Suppose that (V, ρ_V) and (W, ρ_W) are two representations of a group G. A map ϕ : V → W is said to be G-equivariant if, for all g ∈ G and v ∈ V,

ϕ(ρ_V(g)[v]) = ρ_W(g)[ϕ(v)]   (2)

The set of all linear G-equivariant maps between V and W is denoted by Hom_G(V, W). When V = W, we write this set as End_G(V). It can be shown that Hom_G(V, W) is a vector space over R, and that End_G(V) is an algebra over R. See (Segal, 2014) for more details.

A special case of G-equivariance is G-invariance:

Definition 3.2. The map ϕ given in Definition 3.1 is said to be G-invariant if ρ_W is defined to be the 1-dimensional trivial representation of G. As a result, W = R.

We can now define the type of neural network that is the focus of this paper:

Definition 3.3. An L-layer G-equivariant neural network f_NN is a composition of layer functions

f_NN := f_L ∘ · · · ∘ f_l ∘ · · · ∘ f_1   (3)

such that the lth layer function is a map of representations of G

f_l : (V_{l−1}, ρ_{l−1}) → (V_l, ρ_l)   (4)

that is itself a composition

f_l := σ_l ∘ ϕ_l   (5)

of a learnable, linear, G-equivariant function ϕ_l : (V_{l−1}, ρ_{l−1}) → (V_l, ρ_l) together with a fixed, non-linear activation function σ_l : (V_l, ρ_l) → (V_l, ρ_l) such that

1. σ_l is a G-equivariant map, as in (2), and
2. σ_l acts pointwise (after a basis has been chosen for each copy of V_l in σ_l).

We focus on the learnable, linear, G-equivariant functions in this paper because the non-linear functions are fixed.

Remark 3.4. The entire neural network f_NN is itself a G-equivariant function because it can be shown that the composition of any number of G-equivariant functions is itself G-equivariant.

Remark 3.5. One way of making a neural network of the form given in Definition 3.3 G-invariant is to choose the representation in the final layer to be the 1-dimensional trivial representation of G.
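To make Definition 3.1 concrete, the following sketch checks the equivariance condition (2) numerically for the diagonal action of the orthogonal group O(n) (defined in the next section) on (R^n)^{⊗2}, using the contraction map as a G-invariant example and the "swap" map as a G-equivariant one. It is an illustration only; the helper names, the sampling of g, and the use of NumPy are our own choices rather than anything prescribed by the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4

# a random element of O(n), obtained from a QR decomposition
g = np.linalg.qr(rng.normal(size=(n, n)))[0]
rho2 = np.kron(g, g)                      # rho_2(g) acting on (R^n)^{⊗2}, flattened row-major

x = rng.normal(size=n * n)                # an arbitrary vector in (R^n)^{⊗2}

# phi1: contraction (R^n)^{⊗2} -> R, an O(n)-invariant map (Definition 3.2)
phi1 = lambda v: np.trace(v.reshape(n, n))
# phi2: the swap map (R^n)^{⊗2} -> (R^n)^{⊗2}, an O(n)-equivariant map (Definition 3.1)
phi2 = lambda v: v.reshape(n, n).T.reshape(-1)

assert np.isclose(phi1(rho2 @ x), phi1(x))            # phi(rho_V(g)[v]) = phi(v)
assert np.allclose(phi2(rho2 @ x), rho2 @ phi2(x))    # phi(rho_V(g)[v]) = rho_W(g)[phi(v)]
```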
4. The groups G = O(n), SO(n), and Sp(n)

We consider throughout the real vector space R^n. Let GL(n) be the group of invertible linear transformations from R^n to R^n. If we pick a basis for each copy of R^n, then for each linear map in GL(n) we obtain its matrix representation in the bases of R^n that were chosen. Let SL(n) be the subgroup of GL(n) consisting of all invertible linear transformations from R^n to R^n whose determinant is +1.

We can associate to R^n one of the following two bilinear forms:

1. a non-degenerate, symmetric bilinear form ( , ) : R^n × R^n → R;
2. a non-degenerate, skew-symmetric bilinear form ⟨ , ⟩ : R^n × R^n → R. In this case, n must be even, say n = 2m, as a result of applying Jacobi's Theorem. See page 6 of Goodman and Wallach (2009) for more details.

Then we can define the groups O(n), SO(n), and Sp(n) as follows:

1. O(n) := {g ∈ GL(n) | (gx, gy) = (x, y) for all x, y ∈ R^n}
2. SO(n) := O(n) ∩ SL(n)
3. Sp(n) := {g ∈ GL(n) | ⟨gx, gy⟩ = ⟨x, y⟩ for all x, y ∈ R^n}

It can be shown that each of these groups is a subgroup of GL(n).

There are special bases of R^n with respect to each of the forms given above. Firstly, for the form ( , ), by Lemma 1.1.2 on page 4 of Goodman and Wallach (2009), we may assume that there is an ordered basis

B := {e_1, e_2, . . . , e_n}   (6)

of R^n, where e_i has a 1 in the ith position and a 0 elsewhere, which satisfies the relations

(e_i, e_j) = δ_{i,j}   (7)

with respect to the form ( , ). The basis B is called the standard basis for R^n, and, by (7), it is an orthonormal basis of R^n. It is clear that the matrix representation of the form ( , ) in the basis B is the n × n identity matrix. Hence the form ( , ) in the basis B is the Euclidean inner product

(x, y) = x · y for all x, y ∈ R^n   (8)

where x is the column vector (x_1, x_2, . . . , x_n) and y is the column vector (y_1, y_2, . . . , y_n) when expressed in the basis B.

Secondly, for the form ⟨ , ⟩, where n = 2m, by Lemma 1.1.5 on page 7 of Goodman and Wallach (2009), we may assume that there is an ordered basis

B̃ := {e_1, e_1', . . . , e_m, e_m'}   (9)

of R^n, where the ith basis vector in the set has a 1 in the ith position and a 0 elsewhere, which satisfies the relations

⟨e_α, e_β⟩ = ⟨e_α', e_β'⟩ = 0   (10)
⟨e_α, e_β'⟩ = −⟨e_α', e_β⟩ = δ_{α,β}   (11)

with respect to the form ⟨ , ⟩. The basis B̃ is called the symplectic basis for R^n. Hence the form ⟨ , ⟩ in the basis B̃ is the skew product

⟨x, y⟩ = Σ_{r=1}^{m} (x_r y_{r'} − x_{r'} y_r) = Σ_{i,j} ε_{i,j} x_i y_j   (12)

for all x, y ∈ R^n, where x is the column vector (x_1, x_1', . . . , x_m, x_m') expressed in the basis B̃, y is the column vector (y_1, y_1', . . . , y_m, y_m') expressed in the basis B̃, and

ε_{α,β} = ε_{α',β'} = 0   (13)
ε_{α,β'} = −ε_{α',β} = δ_{α,β}   (14)

5. The space (R^n)^{⊗k} as a representation of G

Let G be any of the groups O(n), SO(n), and Sp(n) (where n = 2m for Sp(n)). Then, for any positive integer k, the space (R^n)^{⊗k} is a representation of G, denoted by ρ_k, where

ρ_k(g)(v_1 ⊗ · · · ⊗ v_k) := gv_1 ⊗ · · · ⊗ gv_k   (15)

for all g ∈ G and for all vectors v_i ∈ R^n. We call (R^n)^{⊗k} the k-order tensor power space of R^n.

Moreover, each form R^n × R^n → R induces a non-degenerate bilinear form (R^n)^{⊗k} × (R^n)^{⊗k} → R, given by

(v_1 ⊗ · · · ⊗ v_k, w_1 ⊗ · · · ⊗ w_k) := Π_{r=1}^{k} (v_r, w_r)   (16)

for the symmetric case, and

⟨v_1 ⊗ · · · ⊗ v_k, w_1 ⊗ · · · ⊗ w_k⟩ := Π_{r=1}^{k} ⟨v_r, w_r⟩   (17)

for the skew-symmetric case. Consequently, there is a standard basis for (R^n)^{⊗k} that is induced from the standard basis for R^n in the symmetric case, and, similarly, there is a symplectic basis for (R^n)^{⊗k} that is induced from the symplectic basis for R^n in the skew-symmetric case.

Our goal is to characterise all of the possible learnable, linear, G-equivariant layer functions between any two tensor power spaces of R^n. In doing so, we will be able to characterise and implement all of the possible G-equivariant neural networks whose layers are a tensor power space of R^n. Specifically, we want to find, ideally, a basis, or, at the very least, a spanning set, of matrices for Hom_G((R^n)^{⊗k}, (R^n)^{⊗l}) when either the standard basis (for G = O(n), SO(n)) or the symplectic basis (for G = Sp(n), n = 2m) is chosen for R^n. Note that the G-invariance case is encapsulated within this, since it occurs when l = 0.

Remark 5.1. We assume throughout most of the paper that the feature dimension for all of our representations is one. This is because the group G does not act on the feature space. We relax this assumption in Section 7. Also, the layer functions under consideration do not take into account any bias terms, but we will show in Section 7 that these can easily be introduced.
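The following sketch, under our own naming and sampling choices, verifies the defining conditions of O(n), SO(n) and Sp(n) from Section 4 in the standard/symplectic bases, and builds the tensor power representation ρ_k(g) = g ⊗ · · · ⊗ g of (15) as an iterated Kronecker product. The symplectic element constructed here is deliberately simple (determinant-one 2×2 blocks on each (e_r, e_r') plane) rather than a generic member of Sp(n); it is meant only as a quick numerical check.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 4, 2                                   # n = 2m for Sp(n)

# Gram matrix of the skew form <.,.> of (12)-(14) in the symplectic basis
Omega = np.kron(np.eye(m), np.array([[0., 1.], [-1., 0.]]))

# an element of O(n) (from a QR decomposition) and of SO(n) (force det = +1)
g_o = np.linalg.qr(rng.normal(size=(n, n)))[0]
g_so = g_o if np.linalg.det(g_o) > 0 else g_o @ np.diag([-1.] + [1.] * (n - 1))

# a (non-generic) element of Sp(n): determinant-one 2x2 blocks on each plane
def sl2():
    b = rng.normal(size=(2, 2))
    b[:, 0] *= np.sign(np.linalg.det(b))      # make the determinant positive
    return b / np.sqrt(np.linalg.det(b))      # scale so that det = 1
g_sp = np.block([[sl2(), np.zeros((2, 2))], [np.zeros((2, 2)), sl2()]])

assert np.allclose(g_o.T @ g_o, np.eye(n))        # (gx, gy) = (x, y)
assert np.isclose(np.linalg.det(g_so), 1.0)       # SO(n) = O(n) ∩ SL(n)
assert np.allclose(g_sp.T @ Omega @ g_sp, Omega)  # <gx, gy> = <x, y>

# rho_k(g) = g ⊗ ... ⊗ g acting on (R^n)^{⊗k}, as in (15)
def rho(g, k):
    out = np.eye(1)
    for _ in range(k):
        out = np.kron(out, g)
    return out

k = 3
v = [rng.normal(size=n) for _ in range(k)]
tensor = np.kron(np.kron(v[0], v[1]), v[2])                    # v1 ⊗ v2 ⊗ v3, flattened
lhs = rho(g_o, k) @ tensor                                     # rho_k(g)(v1 ⊗ v2 ⊗ v3)
rhs = np.kron(np.kron(g_o @ v[0], g_o @ v[1]), g_o @ v[2])     # gv1 ⊗ gv2 ⊗ gv3
assert np.allclose(lhs, rhs)
```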
6. A Spanning Set for Hom_G((R^n)^{⊗k}, (R^n)^{⊗l})

Brauer's (1937) paper On Algebras Which are Connected with the Semisimple Continuous Groups focused, in part, on calculating a spanning set of matrices for End_G((R^n)^{⊗k}) in the standard basis of R^n for the groups G = O(n) and SO(n), and in the symplectic basis of R^n for Sp(n). Brauer achieved this by applying the First Fundamental Theorem of Invariant Theory for each of the groups in question (see the Technical Appendix) to a spanning set of invariants, one for each group G, that he showed are in bijective correspondence with the spanning set of matrices for End_G((R^n)^{⊗k}). In particular, for each group G, he associated a diagram with each element of the spanning set of invariants, which we describe in further detail below. In doing so, he constructed a bijective correspondence between a set of such diagrams and a spanning set of matrices for End_G((R^n)^{⊗k}) in the standard/symplectic basis of R^n.

We want to find, instead, a spanning set of matrices for Hom_G((R^n)^{⊗k}, (R^n)^{⊗l}) in the standard/symplectic basis of R^n. The results that we describe in the following contain Brauer's results as a special case, since when l = k, we see that

Hom_G((R^n)^{⊗k}, (R^n)^{⊗l}) = End_G((R^n)^{⊗k})   (18)

Brauer considered two types of diagrams coming from certain set partitions, whose definitions we adapt for our purposes. The first is defined as follows:

Definition 6.1. For any k, l ∈ Z_{≥0}, a (k, l) Brauer partition β is a partitioning of the set [l + k] into a disjoint union of pairs. We call each pair a block. Clearly, if l + k is odd, then no such partitions exist. Therefore, in the following, we assume that l + k is even. By convention, if k = l = 0, then there is just one (0, 0) Brauer partition.

We can represent each (k, l) Brauer partition β by a diagram d_β, called a (k, l) Brauer diagram, consisting of two rows of vertices and edges between vertices such that there are 1) l vertices in the top row, labelled left to right by 1, . . . , l; 2) k vertices in the bottom row, labelled left to right by l + 1, . . . , l + k; and 3) edges between vertices corresponding to the blocks of β. In particular, this means that each vertex is incident to exactly one edge; hence there are precisely (l + k)/2 edges in total in d_β. It is clear that the number of (k, l) Brauer diagrams, if l + k is even, is (l + k − 1)!! := (l + k − 1)(l + k − 3) · · · 5 · 3 · 1, and is 0 otherwise.

For example, we see that

β := {1, 5 | 2, 8 | 3, 4 | 6, 7 | 9, 10}   (19)

is a valid (6, 4) Brauer partition. Figure 1a) shows the (6, 4) Brauer diagram d_β corresponding to β.

Figure 1. a) The diagram d_β corresponding to the (6, 4) Brauer partition β given in (19). b) The diagram d_α corresponding to the (4 + 6)\6 partition α given in (20).

The second type of diagram is what we have decided to call an (l + k)\n diagram (pronounced "l plus k without n"). These diagrams were originally hinted at by Brauer (1937) in the case where l = k, and were looked at in greater detail by Grood (1999), but again only in the case where l = k. In that paper, these diagrams were called k\m diagrams, since only the situation where l = k and n = 2m was considered. We will see that the definition below is equivalent to that definition in this case. Our naming convention for the diagrams makes more sense since they are a generalisation of those considered by Brauer and Grood.
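As an illustration of Definition 6.1, the sketch below enumerates all (k, l) Brauer partitions as perfect matchings of [l + k] and checks the (l + k − 1)!! count against the example β of (19). The function names are our own and the enumeration is naive; it is meant only to make the combinatorics explicit.

```python
from itertools import chain
from math import prod

def pairings(vertices):
    """All ways to partition `vertices` into unordered pairs (the blocks of Definition 6.1)."""
    if not vertices:
        yield []
        return
    first, rest = vertices[0], vertices[1:]
    for i, second in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for sub in pairings(remaining):
            yield [(first, second)] + sub

def brauer_diagrams(k, l):
    """(k, l) Brauer diagrams: vertices 1..l form the top row, l+1..l+k the bottom row."""
    if (l + k) % 2 == 1:
        return []
    return list(pairings(list(range(1, l + k + 1))))

def double_factorial(m):
    return prod(range(m, 0, -2)) if m > 0 else 1

k, l = 6, 4
diagrams = brauer_diagrams(k, l)
assert len(diagrams) == double_factorial(l + k - 1)       # (l + k - 1)!! diagrams in total

# the partition beta of (19) is one of them
beta = [(1, 5), (2, 8), (3, 4), (6, 7), (9, 10)]
assert any(d == beta for d in diagrams)
assert sorted(chain.from_iterable(beta)) == list(range(1, l + k + 1))   # every vertex used once
```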
Definition 6.2. For any k, l, n ∈ Z_{≥0}, an (l + k)\n partition is a partitioning of the set [l + k], with some n elements removed, into a disjoint union of pairs. Once again, each pair is called a block.

An (l + k)\n diagram is the representation of an (l + k)\n partition in its diagram form, constructed in a similar way to the (k, l) Brauer diagrams above: there is still a top row consisting of l vertices and a bottom row consisting of k vertices, but now there are only (l + k − n)/2 edges between pairs of vertices. In this case, the n vertices removed from the set [l + k] are not incident to any edge, and an edge exists between any two vertices that are in the same pair. We call the n vertices whose labels have been removed from [l + k] free vertices.

It is clear that if n > l + k, then no (l + k)\n diagrams exist. Also, no such diagrams exist if n is odd and l + k is even, or if n is even and l + k is odd. Otherwise, that is, if n ≤ l + k and either n and l + k are both even or both odd, the number of (l + k)\n diagrams is (l + k choose n)(l + k − n − 1)!!, since there are (l + k choose n) ways to pick the n free vertices, and (l + k − n − 1)!! ways to pair up the remaining l + k − n vertices.

For example, we see that, if k = 6 and l = 4, then

α := {2, 6 | 8, 9}   (20)

is a (4 + 6)\6 partition. Figure 1b) shows the (4 + 6)\6 diagram d_α corresponding to α.

Remark 6.3. We choose throughout to focus on (k, l) Brauer and (l + k)\n diagrams over their equivalent set partition form. This is because both the diagrams and the matrices that they correspond to have matching shapes. In fact, it will become clear that, using these diagrams, we can view the matrix multiplication of a spanning set element in Hom_G((R^n)^{⊗k}, (R^n)^{⊗l}) with an input vector in (R^n)^{⊗k} as a process represented by the corresponding diagram, where the input vector is passed into the bottom row of the diagram, and an output vector is returned from the top row.

From the two types of diagrams defined above, we form the following two vector spaces.

Definition 6.4. We define the Brauer vector space, B^l_k(n), which exists for any integer n ∈ Z_{≥1} and for any k, l ∈ Z_{≥0}, as follows. Let B^0_0(n) := R. Otherwise, define B^l_k(n) to be the R-linear span of the set of all (k, l) Brauer diagrams.

Definition 6.5. We define the Brauer-Grood vector space, D^l_k(n), which exists for any integer n ∈ Z_{≥1} and for any k, l ∈ Z_{≥0}, as follows. Let D^0_0(n) := R. Otherwise, define D^l_k(n) to be the R-linear span of the set of all (k, l) Brauer diagrams together with the set of all (l + k)\n diagrams. Clearly, if n > l + k, or if n is odd and l + k is even, or if n is even and l + k is odd, then D^l_k(n) = B^l_k(n).

With these two vector spaces, we are now able to find a spanning set of matrices for Hom_G((R^n)^{⊗k}, (R^n)^{⊗l}) in the standard basis (for G = O(n), SO(n)) or the symplectic basis (for G = Sp(n), n = 2m) of R^n. In order to state explicitly what these spanning sets are, we note that, for any k, l ∈ Z_{≥0}, as a result of picking the standard/symplectic basis for R^n, the vector space Hom((R^n)^{⊗k}, (R^n)^{⊗l}) has a standard basis of matrix units

{E_{I,J}}, I ∈ [n]^l, J ∈ [n]^k   (21)

where I is a tuple (i_1, i_2, . . . , i_l) ∈ [n]^l, J is a tuple (j_1, j_2, . . . , j_k) ∈ [n]^k, and E_{I,J} has a 1 in the (I, J) position and is 0 elsewhere. If one or both of k, l is equal to 0, then we replace the tuple that indexes the matrix by a 1. For example, when k = 0 and l ∈ Z_{≥1}, (21) becomes {E_{I,1}}, I ∈ [n]^l.

We obtain the following results, which are given in the following three theorems.
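The matrix units of (21) are indexed by tuples, so any implementation has to fix a flattening of [n]^l and [n]^k. The sketch below uses the row-major (lexicographic) flattening that matches iterated Kronecker products; this convention and the helper names are our own choices for illustration and are not prescribed by the paper.

```python
import numpy as np
from itertools import product

def flat(index_tuple, n):
    """Row-major position of a 1-based tuple (i_1, ..., i_r) in [n]^r."""
    pos = 0
    for i in index_tuple:
        pos = pos * n + (i - 1)
    return pos

def matrix_unit(I, J, n):
    """E_{I,J}: the n^l x n^k matrix with a single 1 in position (I, J), as in (21)."""
    l, k = len(I), len(J)
    E = np.zeros((n ** l, n ** k))
    E[flat(I, n), flat(J, n)] = 1.0
    return E

n, k, l = 2, 2, 2
units = [matrix_unit(I, J, n) for I in product(range(1, n + 1), repeat=l)
                              for J in product(range(1, n + 1), repeat=k)]
assert len(units) == n ** (l + k)                             # a basis of Hom((R^n)^{⊗k}, (R^n)^{⊗l})
assert np.allclose(sum(units), np.ones((n ** l, n ** k)))     # every entry is hit exactly once
```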
Theorem 6.6 (Spanning set when G = O(n)). For any k, l ∈ Z_{≥0} and any n ∈ Z_{≥1}, there is a surjection of vector spaces

Φ^l_{k,n} : B^l_k(n) → Hom_{O(n)}((R^n)^{⊗k}, (R^n)^{⊗l})   (22)

which is defined as follows.

If l + k is odd, then we map the empty set onto the empty set. Hence, in this case, Hom_{O(n)}((R^n)^{⊗k}, (R^n)^{⊗l}) = {0}.

Otherwise, for any k, l ∈ Z_{≥0} and for each (k, l) Brauer diagram d_β, associate the indices i_1, i_2, . . . , i_l with the vertices in the top row of d_β, and j_1, j_2, . . . , j_k with the vertices in the bottom row of d_β. Then, for any n ∈ Z_{≥1}, define

E_β := Σ_{I ∈ [n]^l, J ∈ [n]^k} δ_{r_1,u_1} δ_{r_2,u_2} · · · δ_{r_{(l+k)/2}, u_{(l+k)/2}} E_{I,J}   (23)

where r_1, u_1, . . . , r_{(l+k)/2}, u_{(l+k)/2} is any permutation of the indices i_1, i_2, . . . , i_l, j_1, j_2, . . . , j_k such that the vertices corresponding to r_p, u_p are in the same block of β.

The adapted version of Brauer's Invariant Argument, given in the Technical Appendix, shows that (23) defines a bijective correspondence between the set of all (k, l) Brauer diagrams and a spanning set for Hom_{O(n)}((R^n)^{⊗k}, (R^n)^{⊗l}). Consequently, when l + k is even, the surjection of vector spaces given in (22) is defined by

d_β ↦ E_β   (24)

for all (k, l) Brauer diagrams d_β, and is extended linearly on the basis of such diagrams for B^l_k(n). Hence the set

{E_β | d_β is a (k, l) Brauer diagram}   (25)

is a spanning set for Hom_{O(n)}((R^n)^{⊗k}, (R^n)^{⊗l}) in the standard basis of R^n, of size 0 when l + k is odd, and of size (l + k − 1)!! when l + k is even.

Lehrer and Zhang (2012) showed that when 2n ≥ l + k, Φ^l_{k,n} is an isomorphism of vector spaces, and so the set (25) forms a basis of Hom_{O(n)}((R^n)^{⊗k}, (R^n)^{⊗l}) in this case.

Theorem 6.7 (Spanning set when G = Sp(n), n = 2m). For any k, l ∈ Z_{≥0} and any n ∈ Z_{≥2} such that n = 2m, there is a surjection of vector spaces

X^l_{k,n} : B^l_k(n) → Hom_{Sp(n)}((R^n)^{⊗k}, (R^n)^{⊗l})   (26)

which is defined as follows.

If l + k is odd, then we map the empty set onto the empty set. Hence, in this case, Hom_{Sp(n)}((R^n)^{⊗k}, (R^n)^{⊗l}) = {0}.

Otherwise, for any k, l ∈ Z_{≥0} and for each (k, l) Brauer diagram d_β, associate the indices i_1, i_2, . . . , i_l with the vertices in the top row of d_β, and j_1, j_2, . . . , j_k with the vertices in the bottom row of d_β. Then, for any n ∈ Z_{≥2}, define

F_β := Σ_{I,J} γ_{r_1,u_1} γ_{r_2,u_2} · · · γ_{r_{(l+k)/2}, u_{(l+k)/2}} E_{I,J}   (27)

where the indices i_p, j_p range over 1, 1', . . . , m, m'; where r_1, u_1, . . . , r_{(l+k)/2}, u_{(l+k)/2} is any permutation of the indices i_1, i_2, . . . , i_l, j_1, j_2, . . . , j_k such that the vertices corresponding to r_p, u_p are in the same block of β; and where

γ_{r_p,u_p} := δ_{r_p,u_p} if the vertices corresponding to r_p, u_p are in different rows of d_β, and γ_{r_p,u_p} := ε_{r_p,u_p} if the vertices corresponding to r_p, u_p are in the same row of d_β,   (28)

where ε_{r_p,u_p} was defined in (13) and (14).

The adapted version of Brauer's Invariant Argument, given in the Technical Appendix, shows that (27) defines a bijective correspondence between the set of all (k, l) Brauer diagrams and a spanning set for Hom_{Sp(n)}((R^n)^{⊗k}, (R^n)^{⊗l}). Consequently, when l + k is even, the surjection of vector spaces given in (26) is defined by

d_β ↦ F_β   (29)

for all (k, l) Brauer diagrams d_β, and is extended linearly on the basis of such diagrams for B^l_k(n). Hence the set

{F_β | d_β is a (k, l) Brauer diagram}   (30)

is a spanning set for Hom_{Sp(n)}((R^n)^{⊗k}, (R^n)^{⊗l}), for n = 2m, in the symplectic basis of R^n, of size 0 when l + k is odd, and of size (l + k − 1)!! when l + k is even.

Lehrer and Zhang (2012) showed that when n ≥ l + k, X^l_{k,n} is an isomorphism of vector spaces, and so the set (30) forms a basis of Hom_{Sp(n)}((R^n)^{⊗k}, (R^n)^{⊗l}), for n = 2m, in this case.
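The following is a minimal sketch of how the spanning set elements E_β of Theorem 6.6 and F_β of Theorem 6.7 can be assembled from a Brauer partition, together with a numerical check of their equivariance. The indexing convention (0-based entries, row-major flattening), the helper names, and the particular group elements sampled are our own choices; in the symplectic case, the order we pick inside a same-row block fixes the overall sign of F_β, which only rescales the spanning element.

```python
import numpy as np
from itertools import product

# A (3, 1) Brauer partition: top-row vertex {1} carries index i1, bottom-row
# vertices {2, 3, 4} carry indices j1, j2, j3; its blocks are {1,2} and {3,4}.
k, l = 3, 1
beta = [(1, 2), (3, 4)]
n, m = 4, 2
Omega = np.kron(np.eye(m), np.array([[0., 1.], [-1., 0.]]))    # epsilon of (13)-(14), 0-based

def spanning_matrix(beta, k, l, n, group):
    """E_beta (group='O') or F_beta (group='Sp') built from the blocks of beta."""
    in_top = lambda v: v <= l
    M = np.zeros((n ** l, n ** k))
    for I in product(range(n), repeat=l):
        for J in product(range(n), repeat=k):
            entry = 1.0
            for (p, q) in beta:
                a = I[p - 1] if in_top(p) else J[p - l - 1]    # index carried by vertex p
                b = I[q - 1] if in_top(q) else J[q - l - 1]
                if group == 'Sp' and in_top(p) == in_top(q):
                    entry *= Omega[a, b]                       # same row: epsilon factor
                else:
                    entry *= float(a == b)                     # otherwise: delta factor
            row = sum(i * n ** (l - 1 - t) for t, i in enumerate(I)) if l else 0
            col = sum(j * n ** (k - 1 - t) for t, j in enumerate(J)) if k else 0
            M[row, col] = entry
    return M

def rho(g, r):                                                 # rho_r(g) = g ⊗ ... ⊗ g
    out = np.eye(1)
    for _ in range(r):
        out = np.kron(out, g)
    return out

rng = np.random.default_rng(0)

# O(n): check rho_l(g) E_beta = E_beta rho_k(g) for a random orthogonal g
E = spanning_matrix(beta, k, l, n, 'O')
g = np.linalg.qr(rng.normal(size=(n, n)))[0]
assert np.allclose(rho(g, l) @ E, E @ rho(g, k))

# Sp(n): the same check with a (non-generic) symplectic g
def sl2():
    b = rng.normal(size=(2, 2))
    b[:, 0] *= np.sign(np.linalg.det(b))
    return b / np.sqrt(np.linalg.det(b))

swap = np.kron(np.array([[0., 1.], [1., 0.]]), np.eye(2))      # swaps the two (e_r, e_r') planes
g_sp = swap @ np.block([[sl2(), np.zeros((2, 2))], [np.zeros((2, 2)), sl2()]])
assert np.allclose(g_sp.T @ Omega @ g_sp, Omega)
F = spanning_matrix(beta, k, l, n, 'Sp')
assert np.allclose(rho(g_sp, l) @ F, F @ rho(g_sp, k))
```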
Theorem 6.8 (Spanning set when G = SO(n)). For any k, l ∈ Z_{≥0} and any n ∈ Z_{≥1}, we construct a surjection of vector spaces

Ψ^l_{k,n} : D^l_k(n) → Hom_{SO(n)}((R^n)^{⊗k}, (R^n)^{⊗l})   (31)

as follows.

If n > l + k, or if n is odd and l + k is even, or if n is even and l + k is odd, then we saw that D^l_k(n) = B^l_k(n). Hence, in these cases, Ψ^l_{k,n} = Φ^l_{k,n}, and so Hom_{SO(n)}((R^n)^{⊗k}, (R^n)^{⊗l}) = Hom_{O(n)}((R^n)^{⊗k}, (R^n)^{⊗l}).

Otherwise, that is, if n ≤ l + k and either n and l + k are both even or both odd, there exist (l + k)\n diagrams. For each such diagram d_α, again associate the indices i_1, i_2, . . . , i_l with the vertices in the top row of d_α, and j_1, j_2, . . . , j_k with the vertices in the bottom row of d_α. Suppose that there are s free vertices in the top row. Then there are n − s free vertices in the bottom row. Relabel the s free indices in the top row (from left to right) by t_1, . . . , t_s, and the n − s free indices in the bottom row (from left to right) by b_1, . . . , b_{n−s}. Then define χ(t_1, . . . , t_s, b_1, . . . , b_{n−s}) as follows: it is 0 if the elements t_1, . . . , t_s, b_1, . . . , b_{n−s} are not distinct; otherwise, it is the sign of the permutation of [n] that sends 1, . . . , n to t_1, . . . , t_s, b_1, . . . , b_{n−s}. As a result, for any n ∈ Z_{≥1}, define H_α to be

H_α := Σ_{I ∈ [n]^l, J ∈ [n]^k} χ(t_1, . . . , t_s, b_1, . . . , b_{n−s}) δ(r, u) E_{I,J}   (32)

where

δ(r, u) := δ_{r_1,u_1} δ_{r_2,u_2} · · · δ_{r_{(l+k−n)/2}, u_{(l+k−n)/2}}   (33)

Here, r_1, u_1, . . . , r_{(l+k−n)/2}, u_{(l+k−n)/2} is any permutation of the indices

{i_1, . . . , i_l, j_1, . . . , j_k} \ {t_1, . . . , t_s, b_1, . . . , b_{n−s}}   (34)

such that the vertices corresponding to r_p, u_p are in the same block of α.

The adapted version of Brauer's Invariant Argument, given in the Technical Appendix, shows that (23) and (32) define a bijective correspondence between the set of all (k, l) Brauer diagrams together with the set of all (l + k)\n diagrams, and a spanning set for Hom_{SO(n)}((R^n)^{⊗k}, (R^n)^{⊗l}). Consequently, when n ≤ l + k and both n and l + k are even, the surjection of vector spaces (31) is given by

d_β ↦ E_β   (35)

if d_β is a (k, l) Brauer diagram, where E_β was defined in Theorem 6.6, and by

d_α ↦ H_α   (36)

if d_α is an (l + k)\n diagram, and is extended linearly on the basis of such diagrams for D^l_k(n). When n ≤ l + k and both n and l + k are odd, the surjection of vector spaces (31) is given solely by (36), since no (k, l) Brauer diagrams exist in this case.

Hence, in all cases, the set

{E_β}_β ∪ {H_α}_α   (37)

where d_β is a (k, l) Brauer diagram and d_α is an (l + k)\n diagram, is a spanning set for Hom_{SO(n)}((R^n)^{⊗k}, (R^n)^{⊗l}) in the standard basis of R^n.
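The extra SO(n) spanning elements H_α of Theorem 6.8 differ from the O(n) ones only through the sign χ attached to the free vertices. The sketch below implements χ as stated (zero unless the free index values are distinct, otherwise the sign of the induced permutation of [n]) and checks the simplest case n = 2, k = 0, l = 2, where the resulting H_α is e_1 ⊗ e_2 − e_2 ⊗ e_1. The helper names are our own, used only for illustration.

```python
import numpy as np

def chi(free_values, n):
    """The factor chi of Theorem 6.8: 0 unless the n free index values are distinct,
    otherwise the sign of the permutation of [n] sending 1, ..., n to
    t_1, ..., t_s, b_1, ..., b_{n-s} (top-row values first, then bottom-row values)."""
    if sorted(free_values) != list(range(1, n + 1)):
        return 0
    inversions = sum(1 for a in range(n) for b in range(a + 1, n)
                     if free_values[a] > free_values[b])
    return -1 if inversions % 2 else 1

assert chi([2, 1, 3], 3) == -1     # a single transposition has sign -1
assert chi([1, 1, 3], 3) == 0      # repeated values kill the term

# n = 2, k = 0, l = 2: the only (2+0)\2 diagram has both top vertices free, so
# H_alpha has entries chi([i1, i2], 2), i.e. it sends 1 to e1 ⊗ e2 - e2 ⊗ e1.
n = 2
H = np.array([[chi([i1, i2], n)] for i1 in (1, 2) for i2 in (1, 2)], dtype=float)
theta = 0.7
g = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])     # an element of SO(2)
assert np.allclose(np.kron(g, g) @ H, H)            # rho_2(g) H = H rho_0(g) = H
```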
7. Adding Features and Biases

7.1. Features

In Section 6, we made the assumption that the feature dimension for all of the layers appearing in the neural network was one. This simplified the analysis for the results seen in that section. All of these results can be adapted to the case where the feature dimension of the layers is greater than one.

Suppose that an r-order tensor has a feature space of dimension d_r. We now wish to find a spanning set for

Hom_G((R^n)^{⊗k} ⊗ R^{d_k}, (R^n)^{⊗l} ⊗ R^{d_l})   (38)

in the standard basis of R^n for G = O(n) and SO(n), and in the symplectic basis of R^n for G = Sp(n), where n = 2m. A spanning set in each case can be found by making the following substitutions in Theorems 6.6, 6.7, and 6.8 of Section 6, where now i ∈ [d_l] and j ∈ [d_k]: replace E_{I,J} by E_{I,i,J,j}, E_{I,1} by E_{I,i,1,j}, E_{1,J} by E_{1,i,J,j}, and E_{1,1} by E_{1,i,1,j}, and relabel E_β by E_{β,i,j}, F_β by F_{β,i,j}, and H_α by H_{α,i,j}. Consequently, a spanning set for (38) in the standard/symplectic basis of R^n is given by {E_{β,i,j}}_{β,i,j} for G = O(n), {F_{β,i,j}}_{β,i,j} for G = Sp(n) (where n = 2m), and {E_{β,i,j}}_{β,i,j} ∪ {H_{α,i,j}}_{α,i,j} for G = SO(n), where d_α is an (l + k)\n diagram, d_β is a (k, l) Brauer diagram, i ∈ [d_l], and j ∈ [d_k].

7.2. Biases

Including bias terms in the layer functions of a G-equivariant neural network is harder, but it can be done. For the learnable linear layers of the form Hom_G((R^n)^{⊗k}, (R^n)^{⊗l}), Pearce-Crump (2022) shows that, for the bias function β : ((R^n)^{⊗k}, ρ_k) → ((R^n)^{⊗l}, ρ_l) to be G-equivariant, its bias term c must satisfy

c = ρ_l(g)c   (39)

for all g ∈ G, where c ∈ (R^n)^{⊗l}. Since any c ∈ (R^n)^{⊗l} satisfying (39) can be viewed as an element of Hom_G(R, (R^n)^{⊗l}), to find the matrix form of c, all we need to do is find a spanning set for Hom_G(R, (R^n)^{⊗l}). But this is simply a matter of applying the results of Section 6, namely Theorem 6.6 for G = O(n), Theorem 6.7 for G = Sp(n) with n = 2m, and Theorem 6.8 for G = SO(n), setting k = 0.

8. Equivariance to Local Symmetries

We can extend our results to linear layer functions that are equivariant to a direct product of groups. In this scenario, the data is given on a collection of some p sets of sizes n_1, . . . , n_p, and we require equivariance to the group G(n_i) for the data given on the ith set. Neural networks that are constructed using these layer functions are said to be equivariant to local symmetries.

Specifically, we wish to find a spanning set for

Hom_{G(n_1) × · · · × G(n_p)}(V, W)   (40)

where

V := (R^{n_1})^{⊗k_1} ⊠ · · · ⊠ (R^{n_p})^{⊗k_p}   (41)
W := (R^{n_1})^{⊗l_1} ⊠ · · · ⊠ (R^{n_p})^{⊗l_p}   (42)

and ⊠ is the external tensor product. The Hom-space given in (40) is isomorphic to

⊗_{r=1}^{p} Hom_{G(n_r)}((R^{n_r})^{⊗k_r}, (R^{n_r})^{⊗l_r})   (43)

Consequently, we can construct a surjection of vector spaces, denoted by ⊗_{r=1}^{p} Θ^{l_r}_{k_r,n_r}, from ⊗_{r=1}^{p} A^{l_r}_{k_r}(n_r) to the Hom-space given in (43), where: if G(n_r) = O(n_r), then Θ^{l_r}_{k_r,n_r} = Φ^{l_r}_{k_r,n_r} and A^{l_r}_{k_r}(n_r) = B^{l_r}_{k_r}(n_r); if G(n_r) = Sp(n_r), then Θ^{l_r}_{k_r,n_r} = X^{l_r}_{k_r,n_r} and A^{l_r}_{k_r}(n_r) = B^{l_r}_{k_r}(n_r) (here n_r = 2m_r); and if G(n_r) = SO(n_r), then Θ^{l_r}_{k_r,n_r} = Ψ^{l_r}_{k_r,n_r} and A^{l_r}_{k_r}(n_r) = D^{l_r}_{k_r}(n_r).

As a result, a spanning set for (40) can be found by placing each possible basis diagram for each of the vector spaces A^{l_r}_{k_r}(n_r) side by side, taking the image of each under its map Θ^{l_r}_{k_r,n_r}, and then calculating the Kronecker product of the resulting matrices. Features and biases can be added in exactly the same way as discussed in Section 7.
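A short sketch of the Section 8 recipe: take one spanning element per factor and form their Kronecker product to obtain a spanning element for the direct product group. The particular factors chosen here (a contraction for the first O(n_1) factor and the identity for the second O(n_2) factor) and the helper names are our own illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def rho(g, r):                                   # rho_r(g) = g ⊗ ... ⊗ g
    out = np.eye(1)
    for _ in range(r):
        out = np.kron(out, g)
    return out

def rand_orth(n):
    return np.linalg.qr(rng.normal(size=(n, n)))[0]

# one spanning element per factor:
#   E1: image of the (2, 0) Brauer diagram pairing the two bottom vertices (a contraction)
#   E2: image of the (1, 1) Brauer diagram joining the top and bottom vertex (the identity)
n1, k1, l1 = 3, 2, 0
n2, k2, l2 = 2, 1, 1
E1 = np.eye(n1).reshape(1, n1 * n1)
E2 = np.eye(n2)

E = np.kron(E1, E2)                              # spanning element for O(n1) x O(n2)
g1, g2 = rand_orth(n1), rand_orth(n2)
lhs = np.kron(rho(g1, l1), rho(g2, l2)) @ E
rhs = E @ np.kron(rho(g1, k1), rho(g2, k2))
assert np.allclose(lhs, rhs)                     # equivariance to the direct product group
```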
9. Related Work

The combinatorial representation theory of the Brauer algebra was developed by Brauer (1937) for the purpose of understanding the centraliser algebras of the groups O(n), SO(n) and Sp(n). Brown published two papers (1955; 1956) on the Brauer algebra, showing that it is semisimple if and only if n ≥ k − 1. Weyl (1946) had previously shown that the Brauer algebra is semisimple if n ≥ 2k. After Brown's papers, the Brauer algebra was largely forgotten about until Hanlon and Wales (1989) provided an isomorphism between two versions of the Brauer algebra; these versions share a common basis but have different algebra products defined on them. Grood (1999) investigated the representation theory of what we have termed the Brauer-Grood algebra. Lehrer and Zhang (2012) studied the kernel of the surjection of algebras given in Theorems 6.6 and 6.7 when l = k, showing that, in each case, it is a two-sided ideal generated by a single element of the Brauer algebra.

With regard to the machine learning literature, Maron et al.'s paper (2019) is the closest to ours in terms of how it motivated our research idea. They characterised all of the learnable, linear, equivariant layer functions when the layers are some tensor power of R^n for the symmetric group S_n in the practical cases (specifically, when n ≥ k + l). Pearce-Crump (2022) used the Schur-Weyl duality between the symmetric group and the partition algebra to provide a full characterisation of these layer functions for all values of n and for all orders of the tensor power spaces involved, showing, in particular, that the dimension of the space of layer functions is not independent of n. Finzi et al. (2021) were the first to recognise that the dimension is not independent of n, using a numerical algorithm to calculate the correct values for small values of n, k and l. Their numerical algorithm also enabled them to find a basis for the learnable, linear, equivariant functions for the groups that are the focus of our study, namely O(n), Sp(n) and SO(n), but only for small values of n, k and l, since their algorithm runs out of memory for higher values. In this paper, whilst we have not found a basis in all cases, we have provided a spanning set and an analytic solution for all values of n, k and l, which makes it possible to implement group equivariant neural networks for any such values of n, k and l for the three groups in question.

In writing up this paper, we came across a paper by Villar et al. (2021), in which they focus on designing group equivariant neural networks for O(n) and SO(n), amongst others. They also use the First Fundamental Theorem of Invariant Theory for O(n) and SO(n), but only to characterise all invariant scalar functions (R^n)^{⊗k} → R and all equivariant vector functions (R^n)^{⊗k} → R^n as a sum involving O(n)- or SO(n)-invariant scalar functions. They then use multilayer perceptrons to learn these invariant scalar functions. Our method, by contrast, characterises a wider selection of functions, since we study the linear, learnable, equivariant functions between layers that are any tensor power of R^n for O(n) and SO(n) (as well as for Sp(n)), and we give the exact matrix form of these functions in the standard basis of R^n, meaning that the group equivariant neural network architectures that we provide are exact and do not require any approximations via multilayer perceptrons.

10. Limitations, Discussion, and Future Work

We believe that our approach for characterising all of the possible O(n), Sp(n) and SO(n) equivariant neural networks whose layers are some tensor power of R^n is promising in terms of possible practical applications. A number of papers have appeared recently in the literature in which the authors try to learn group equivariant functions for the groups considered in this paper on tensor power spaces of R^n. However, they were not especially successful in their attempts. We believe that this is a consequence of them using architectures that approximate the group equivariance property of the functions that they wish to learn, rather than guaranteeing it exactly. The results in this paper directly address this problem.

In Finkelshtein et al. (2022), the authors created a tensor product neural network which was approximately equivariant to O(3) × S_n in order to learn from point cloud data.
By combining the results of Pearce-Crump (2022) for S_n with the results in this paper for O(3), we would be able to replace the linear layers in their architecture with exact, parameterisable matrices for the tensor product spaces that are guaranteed to be equivariant to O(3) × S_n. We believe that this could improve the final outcome. In addition, in Villar et al. (2021), the authors explore two numerical experiments involving tensor power representations: the first is an O(5)-invariant task from an order-2 tensor power of R^5 to R, and the second is an S_5 × O(3) task where the O(3) component is equivariant on order-5 tensors of R^3. The authors, however, use standard multilayer perceptrons to learn the functions. We think that they could improve upon their results by using the linear layers that are characterised in this paper. Furthermore, we are of the opinion that by characterising the equivariant neural networks for these groups, we have made it possible for other researchers in the machine learning community to find further uses for these neural networks.

We are aware that equivariance to the symplectic group Sp(n) does not commonly appear in the machine learning literature. However, as stated in Appendix E.6 of Finzi et al. (2021), Sp(n) equivariance is especially relevant in the context of Hamiltonian mechanics and classical physics. Section 7.2 of Finzi et al. (2021) points to the paper by Greydanus et al. (2019), where the authors look to learn the Hamiltonian of a system coming from Hamiltonian mechanics. In particular, the time evolution of the system is expressed in terms of the symplectic basis. The paper by Villar et al. (2021) also highlights how there are many symmetries in physics that are relevant for the machine learning community, including "symplectic symmetry that permits reinterpretations of positions and momenta".

We appreciate that, given the current state of hardware, there will be some computational limitations when implementing the neural networks that appear in this paper in practice, and some engineering may be required to obtain the necessary scale. In particular, it is not a trivial task to store the high order tensors that appear in such neural networks. This was demonstrated by Kondor et al. (2018), where the authors needed to develop custom CUDA kernels in order to implement their tensor product based neural networks. In saying that, however, we feel that, given the ever-increasing availability of computing power, we should see higher-order group equivariant neural networks appear more often in practice. We also note that while the tensor power spaces increase exponentially in dimension as we increase their order, the dimension of the space of equivariant maps between such tensor spaces is typically much smaller, and the matrices themselves are often sparse. Hence, while there may be some technical difficulties in storing such matrices, it should be possible to do so with the current computing power that is available.

We recognise, however, that further work is required in order to demonstrate fully the practical applicability of our results. In particular, we need to conduct practical experiments to assess how these neural networks perform when the order of the tensor power spaces is increased, and we need to show that these neural networks provide sufficient practical advantages over the existing approaches that use irreducible decompositions.
11. Conclusion

We are the first to show how the combinatorics underlying the Brauer and Brauer-Grood vector spaces provides the theoretical background for constructing group equivariant neural networks for the orthogonal, special orthogonal, and symplectic groups when the layers are some tensor power of R^n. We looked at the problem of calculating the matrix form of the linear layer functions between such spaces in the standard/symplectic basis for R^n. We recognised that a solution to this problem would provide a powerful method for constructing group equivariant neural networks for the three groups in question, since we could avoid having to solve the much more difficult problem of decomposing the tensor power spaces of R^n into their irreducible representations, and then avoid having to find the change of basis matrix that would be needed to perform the layer mappings. We saw how a basis of diagrams for the Brauer and Brauer-Grood vector spaces enabled us to find a spanning set of matrices for the layer functions themselves in the standard/symplectic basis for R^n for each of the three groups in question, and how each diagram provided the parameter sharing scheme for its image in the spanning set. As a result, we were able to characterise all of the possible group equivariant neural networks whose layers are some tensor power of R^n for each of the three groups in question. We were also able to generalise this diagrammatic approach to layer functions that are equivariant to local symmetries.

As discussed in the Introduction, our results were motivated by Brauer (1937), who showed that there exists a Schur-Weyl duality for each of the three groups in question with an algebra of diagrams. Consequently, this leads to the following question, which is one for future research: what other Schur-Weyl dualities exist between a group and an algebra of diagrams that would enable us to understand the structure of neural networks that are equivariant to the group?

Acknowledgements

The author would like to thank his PhD supervisor Professor William J. Knottenbelt for being generous with his time throughout the author's period of research prior to the publication of this paper. This work was funded by the Doctoral Scholarship for Applied Research which was awarded to the author under Imperial College London's Department of Computing Applied Research scheme. This work will form part of the author's PhD thesis at Imperial College London.

References

Barcelo, H. and Ram, A. Combinatorial Representation Theory, 1997. arXiv:math/9707221.

Benkart, G. Commuting Actions - a Tale of Two Groups. In Lie Algebras and Their Representations: A Symposium on Lie Algebras and Representation Theory, January 23-27, 1995, Seoul National University, Seoul, Korea, volume 194, pp. 1-46, 1996. URL dx.doi.org/10.1090/conm/194/02387.

Benkart, G. Connecting the McKay correspondence and Schur-Weyl duality. In Proceedings of the International Congress of Mathematicians, August 13-21, 2014, Seoul, Korea, volume I, pp. 633-656, 2014.

Benkart, G. and Halverson, T. Partition algebras P_k(n) with 2k > n and the fundamental theorems of invariant theory for the symmetric group S_n. J. London Math. Soc., 99(2):194-224, 2019a. URL https://doi.org/10.1112/jlms.12175.
Benkart, G. and Halverson, T. Partition Algebras and the Invariant Theory of the Symmetric Group. In Barcelo, H., Karaali, G., and Orellana, R. (eds.), Recent Trends in Algebraic Combinatorics, volume 16 of Association for Women in Mathematics Series, pp. 1-41. Springer, 2019b. URL https://doi.org/10.1007/978-3-030-05141-9.

Benkart, G. and Moon, D. Tensor Product Representations of Temperley-Lieb Algebras and Chebyshev Polynomials. In The Tenth International Conference on Representations of Algebras and Related Topics, July 15 - August 10, 2002, Fields Institute, Toronto, Canada, volume 45, pp. 57-80, 2005.

Bogatskiy, A., Anderson, B., Offermann, J. T., Roussi, M., Miller, D. W., and Kondor, R. Lorentz Group Equivariant Neural Network for Particle Physics. In Proceedings of the 37th International Conference on Machine Learning, ICML'20. JMLR.org, 2020. URL https://proceedings.mlr.press/v119/bogatskiy20a/bogatskiy20a.pdf.

Brauer, R. On Algebras Which Are Connected with the Semisimple Continuous Groups. Ann. Math., 38:857-872, 1937. ISSN 0003486X. URL http://www.jstor.org/stable/1968843.

Brown, W. P. An Algebra Related to the Orthogonal Group. Michigan Mathematical Journal, 3(1):1-22, 1955. URL https://doi.org/10.1307/mmj/1031710528.

Brown, W. P. The Semisimplicity of the Brauer Algebra. Annals of Mathematics, 63(2):324-335, 1956. URL https://www.jstor.org/stable/1969613.

Cohen, T. and Welling, M. Group Equivariant Convolutional Networks. In Balcan, M. F. and Weinberger, K. Q. (eds.), Proceedings of The 33rd International Conference on Machine Learning, volume 48 of Proceedings of Machine Learning Research, pp. 2990-2999, New York, New York, USA, 20-22 Jun 2016. PMLR. URL https://proceedings.mlr.press/v48/cohenc16.html.

Cohen, T. S., Geiger, M., Köhler, J., and Welling, M. Spherical CNNs. In International Conference on Learning Representations, 2018. URL https://openreview.net/forum?id=Hkbd5xZRb.

Colmenarejo, L., Orellana, R., Saliola, F., Schilling, A., and Zabrocki, M. An Insertion Algorithm on Multiset Partitions with Applications to Diagram Algebras. Journal of Algebra, 557:97-128, 2020. ISSN 0021-8693. URL https://doi.org/10.1016/j.jalgebra.2020.04.010.

Finkelshtein, B., Baskin, C., Maron, H., and Dym, N. A Simple and Universal Rotation Equivariant Point-Cloud Network. In Cloninger, A., Doster, T., Emerson, T., Kaul, M., Ktena, I., Kvinge, H., Miolane, N., Rieck, B., Tymochko, S., and Wolf, G. (eds.), Proceedings of Topological, Algebraic, and Geometric Learning Workshops 2022, volume 196 of Proceedings of Machine Learning Research, pp. 107-115. PMLR, 25 Feb-22 Jul 2022. URL https://proceedings.mlr.press/v196/finkelshtein22a.html.

Finzi, M., Welling, M., and Wilson, A. G. G. A Practical Method for Constructing Equivariant Multilayer Perceptrons for Arbitrary Matrix Groups. In Meila, M. and Zhang, T. (eds.), Proceedings of the 38th International Conference on Machine Learning, volume 139 of Proceedings of Machine Learning Research, pp. 3318-3328. PMLR, 18-24 Jul 2021. URL https://proceedings.mlr.press/v139/finzi21a.html.

Goodman, R. and Wallach, N. R. Symmetry, Representations and Invariants. Springer, 2009.

Greydanus, S., Dzamba, M., and Yosinski, J. Hamiltonian Neural Networks. In Wallach, H., Larochelle, H., Beygelzimer, A., d'Alché-Buc, F., Fox, E., and Garnett, R. (eds.), Advances in Neural Information Processing Systems, volume 32. Curran Associates, Inc., 2019. URL https://proceedings.neurips.cc/paper_files/paper/2019/file/26cd8ecadce0d4efd6cc8a8725cbd1f8-Paper.pdf.

Grood, C. Brauer Algebras and Centralizer Algebras for SO(2n, C). Journal of Algebra, 222(2):678-707, 1999. ISSN 0021-8693. URL https://doi.org/10.1006/jabr.1999.8069.
Halverson, T. and Jacobson, T. N. Set-Partition Tableaux and Representations of Diagram Algebras. Algebraic Combinatorics, 3(2):509-538, 2020. URL https://doi.org/10.5802/alco.102.

Halverson, T. and Ram, A. Gems from the Work of Georgia Benkart. Notices of the American Mathematical Society, 69(3):375-384, March 2022. URL https://doi.org/10.1090/noti2447.

Hanlon, P. and Wales, D. On the Decomposition of Brauer's Centralizer Algebras. Journal of Algebra, 121(2):409-445, 1989. ISSN 0021-8693. URL https://doi.org/10.1016/0021-8693(89)90076-8.

Hartford, J. S., Graham, D. R., Leyton-Brown, K., and Ravanbakhsh, S. Deep Models of Interactions Across Sets. In Proceedings of the 35th International Conference on Machine Learning, pp. 1914-1923. PMLR, 2018. URL http://proceedings.mlr.press/v80/hartford18a.html.

Kondor, R., Lin, Z., and Trivedi, S. Clebsch-Gordan Nets: a Fully Fourier Space Spherical Convolutional Neural Network. In Bengio, S., Wallach, H., Larochelle, H., Grauman, K., Cesa-Bianchi, N., and Garnett, R. (eds.), Advances in Neural Information Processing Systems, volume 31. Curran Associates, Inc., 2018. URL https://proceedings.neurips.cc/paper/2018/file/a3fc981af450752046be179185ebc8b5-Paper.pdf.

Kumagai, W. and Sannai, A. Universal Approximation Theorem for Equivariant Maps by Group CNNs, 2020. arXiv:2012.13882.

Lehrer, G. I. and Zhang, R. B. The Brauer category and invariant theory. Journal of the European Mathematical Society, 17:2311-2351, 2012. URL https://ems.press/content/serial-article-files/7287.

Lim, L.-H. and Nelson, B. J. What is an equivariant neural network?, 2022. arXiv:2205.07362.

Maron, H., Ben-Hamu, H., Shamir, N., and Lipman, Y. Invariant and Equivariant Graph Networks. In International Conference on Learning Representations, 2019. URL https://openreview.net/forum?id=Syx72jC9tm.

Pan, H. and Kondor, R. Permutation Equivariant Layers for Higher Order Interactions. In Camps-Valls, G., Ruiz, F. J. R., and Valera, I. (eds.), Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, volume 151 of Proceedings of Machine Learning Research, pp. 5987-6001. PMLR, 28-30 Mar 2022. URL https://proceedings.mlr.press/v151/pan22a.html.

Pearce-Crump, E. Connecting Permutation Equivariant Neural Networks and Partition Diagrams. arXiv:2212.08648, 2022.

Ryan, K. Representation-theoretic approaches to several problems in probability. PhD thesis, Queen Mary University of London, 2021. URL https://qmro.qmul.ac.uk/xmlui/handle/123456789/77236.

Segal, E. Group Representation Theory. Course Notes for Imperial College London, 2014. URL https://www.homepages.ucl.ac.uk/ucaheps/papers/Group%20Representation%20theory%202014.pdf.

Villar, S., Hogg, D. W., Storey-Fisher, K., Yao, W., and Blum-Smith, B. Scalars are universal: Equivariant machine learning, structured like classical physics. In Beygelzimer, A., Dauphin, Y., Liang, P., and Vaughan, J. W. (eds.), Advances in Neural Information Processing Systems, 2021. URL https://openreview.net/forum?id=ba27-RzNaIv.

Weiler, M., Geiger, M., Welling, M., Boomsma, W., and Cohen, T. 3D Steerable CNNs: Learning Rotationally Equivariant Features in Volumetric Data. In Proceedings of the 32nd International Conference on Neural Information Processing Systems, NIPS'18, pp. 10402-10413. Curran Associates Inc., 2018. URL https://dl.acm.org/doi/pdf/10.5555/3327546.3327700.
Wenzl, H. On the Structure of Brauer's Centralizer Algebras. Annals of Mathematics, 128(1):173-193, 1988. URL http://www.jstor.org/stable/1971466.

Weyl, H. The Classical Groups: Their Invariants and Representations. Princeton University Press, 1946.

Zaheer, M., Kottur, S., Ravanbakhsh, S., Poczos, B., Salakhutdinov, R. R., and Smola, A. J. Deep Sets. In Guyon, I., Luxburg, U. V., Bengio, S., Wallach, H., Fergus, R., Vishwanathan, S., and Garnett, R. (eds.), Advances in Neural Information Processing Systems, volume 30. Curran Associates, Inc., 2017. URL https://proceedings.neurips.cc/paper/2017/file/f22e4747da1aa27e363d86d40ff442fe-Paper.pdf.

A. Brauer's Invariant Argument, adapted for Hom_G((R^n)^{⊗k}, (R^n)^{⊗l})

A.1. Some Preliminary Material

We consider throughout the real vector space R^n. Let GL(n) be the group of invertible linear transformations from R^n to R^n. Let G be any subgroup of GL(n).

Recall that the vector space R^n has, associated with it, its dual vector space (R^n)*. Let B := {b_i | i ∈ [n]} be any basis of R^n. It has associated with it the dual basis B* := {b*_i | i ∈ [n]}, a basis of (R^n)*, such that b*_i(b_j) = δ_{i,j}. In particular, coordinates on R^n with respect to the basis B are linear functions, that is, elements of (R^n)*. Indeed, if v = Σ_j x_j b_j, then the coordinate function x_j can be identified with the dual basis vector b*_j, since

b*_j(v) = b*_j(Σ_i x_i b_i) = Σ_i x_i b*_j(b_i) = x_j   (44)

Since G is a subgroup of GL(n), we see that R^n is a representation of G, which we denote by ρ_1 in the following. In fact, for all f ∈ G, we have that ρ_1(f) = f. Consequently, if v is any vector in R^n and f is any element of G, then the linear transformation ρ_1(f) = f : v ↦ f(v) (45) can be expressed in its matrix representation, choosing B as the basis for each copy of R^n, as the matrix a(f) = (a_{i,j}) such that

y_i = Σ_j a_{i,j} x_j   (46)

where, in the basis B, v = Σ_j x_j b_j and f(v) = Σ_j y_j b_j (47) for some coefficients x_j, y_j ∈ R. We will sometimes express (46) in the form y = a(f)x, where x, y are column vectors such that the ith component of each vector is x_i and y_i, respectively.

As (R^n, ρ_1) is a representation of G, we can define another representation of G, called the contragredient representation, on the dual space (R^n)*, as follows:

(ρ_1^{−1})* : G → GL((R^n)*)   (48)

where, for all f ∈ G, (ρ_1^{−1})*(f) : (R^n)* → (R^n)* (49) is defined by

(ρ_1^{−1})*(f)[u] : v ↦ u(ρ_1^{−1}(f)(v))   (50)

One way of understanding the contragredient representation ((R^n)*, (ρ_1^{−1})*) of G is as the action on (R^n)* such that all pairings between (R^n)* and R^n remain invariant under their respective actions. Indeed, if u ∈ (R^n)* and v ∈ R^n, then we see that, for all f ∈ G, v ↦ ρ_1(f)(v) (51) and u ↦ (ρ_1^{−1})*(f)[u] (52), and so

u(v) ↦ (ρ_1^{−1})*(f)[u](ρ_1(f)(v)) = u(ρ_1^{−1}(f)(ρ_1(f)(v))) = u(v)   (53)

Hence, expressing u ∈ (R^n)* in the dual basis B* as u = Σ_j p_j b*_j (54) and (ρ_1^{−1})*(f)[u] ∈ (R^n)* as (ρ_1^{−1})*(f)[u] = Σ_j q_j b*_j (55), we see that, for any f ∈ G, the linear transformation (ρ_1^{−1})*(f) : u ↦ (ρ_1^{−1})*(f)[u] (56) can be expressed in its matrix representation as

p a(f)^{−1} = q   (57)

as a result of (53), and so p = q a(f) (58), that is,

p_i = Σ_j q_j a_{j,i}   (59)

In (58), p, q are row vectors such that the ith component of each is p_i and q_i, respectively.

We also have that (R^n)^{⊗k} is a representation of G, which we denote by ρ_k. In particular, for all f ∈ G, we see that ρ_k(f) = ρ_1(f)^{⊗k} = f^{⊗k}.
Consequently, if v is any vector in (R^n)^{⊗k} and f is any element of G, then the linear transformation ρ_k(f) = f^{⊗k} : v ↦ f^{⊗k}(v) (60) can be expressed in its matrix representation, choosing B as the basis for each copy of R^n, as the matrix A(f) = (A_{I,J}), over all tuples I := (i_1, i_2, . . . , i_k), J := (j_1, j_2, . . . , j_k) ∈ [n]^k, such that

y_I = Σ_{J ∈ [n]^k} A_{I,J} x_J   (61)

where

A_{I,J} = Π_{r=1}^{k} a_{i_r,j_r}   (62)

and v = Σ_J x_J b_J and f^{⊗k}(v) = Σ_J y_J b_J (63) for some coefficients x_J, y_J ∈ R in the basis {b_J | J ∈ [n]^k} of (R^n)^{⊗k}, where

b_J := b_{j_1} ⊗ b_{j_2} ⊗ · · · ⊗ b_{j_k}   (64)

As before, since ((R^n)^{⊗k}, ρ_k) is a representation of G, we obtain the contragredient representation (((R^n)*)^{⊗k}, (ρ_k^{−1})*),

(ρ_k^{−1})* : G → GL(((R^n)*)^{⊗k})   (65)

where (ρ_k^{−1})*(f) : ((R^n)*)^{⊗k} → ((R^n)*)^{⊗k} (66) is defined by

(ρ_k^{−1})*(f)[u] : v ↦ u(ρ_k^{−1}(f)(v))   (67)

In particular, we see that (ρ_k^{−1})* = ((ρ_1^{−1})*)^{⊗k}. Hence, expressing u ∈ ((R^n)*)^{⊗k} in the dual basis {b*_J | J ∈ [n]^k}, where b*_J := b*_{j_1} ⊗ b*_{j_2} ⊗ · · · ⊗ b*_{j_k} (68), as u = Σ_J p_J b*_J (69), and (ρ_k^{−1})*(f)[u] ∈ ((R^n)*)^{⊗k} as (ρ_k^{−1})*(f)[u] = Σ_J q_J b*_J (70), we see that, for any f ∈ G, the linear transformation (ρ_k^{−1})*(f) : u ↦ (ρ_k^{−1})*(f)[u] (71) can be expressed in its matrix representation form as

p_I = Σ_{J} q_J A_{J,I}   (72)

where

A_{J,I} = Π_{r=1}^{k} a_{j_r,i_r}   (73)

A.2. Brauer's Invariant Argument

We adapt the argument given in (Brauer, 1937) to construct an invariant of G that is in bijective correspondence with an element of Hom_G((R^n)^{⊗k}, (R^n)^{⊗l}).

A linear map ϕ : (R^n)^{⊗k} → (R^n)^{⊗l} is an element of Hom_G((R^n)^{⊗k}, (R^n)^{⊗l}) if and only if

ϕ ∘ ρ_k(f) = ρ_l(f) ∘ ϕ   (74)

for all f ∈ G. Choosing the basis B as the basis for each copy of R^n, the matrix representation of (74) is

C(ϕ) K(f) = L(f) C(ϕ)   (75)

where K(f) = (K_{I,J}) and L(f) = (L_{I,J}) are as in (60), and C(ϕ) = (C_{I,J}). Hence, (75) gives

Σ_{J ∈ [n]^k} C_{I,J} K_{J,M} = Σ_{J ∈ [n]^l} L_{I,J} C_{J,M}   (76)

where I ∈ [n]^l and M ∈ [n]^k.

Brauer's trick is as follows. Let v(1), v(2), . . . , v(k) be elements of R^n, and suppose that they are all mapped by the same transformation ρ_1(f), for some f ∈ G. Then, by (47), in the basis B of R^n, we see that v(r) = Σ_j x_j(r) b_j (77) for all r ∈ [k], and so, by (46), we have that y_i(r) = Σ_j a_{i,j} x_j(r) (78) for all r ∈ [k]. Then, considering the tensor product v(1) ⊗ v(2) ⊗ · · · ⊗ v(k), an element of (R^n)^{⊗k}, and considering its transformation under ρ_k(f), for the same f ∈ G, we see that the coefficient of the basis vector b_J for v(1) ⊗ v(2) ⊗ · · · ⊗ v(k), as in (63), is

Π_{r=1}^{k} x_{j_r}(r)   (79)

and the coefficient of the basis vector b_I for ρ_k(f)[v(1) ⊗ v(2) ⊗ · · · ⊗ v(k)] is

Π_{r=1}^{k} y_{i_r}(r)   (80)

and that (62) holds, namely that

K_{I,J} = Π_{r=1}^{k} a_{i_r,j_r}   (81)

Hence, by (61), we have that

Π_{r=1}^{k} y_{i_r}(r) = Σ_{J ∈ [n]^k} K_{I,J} Π_{r=1}^{k} x_{j_r}(r)   (82)

Also, let u(1), u(2), . . . , u(l) be elements of (R^n)*, the dual space of R^n. Then, by (54), in the dual basis B*, we have that u(t) = Σ_j p_j(t) b*_j (83) for all t ∈ [l], and so, by (59), we see that p_i(t) = Σ_j q_j(t) a_{j,i} (84) for all t ∈ [l].
Then, considering the tensor product u(1) ⊗ u(2) ⊗ · · · ⊗ u(l), an element of ((R^n)*)^{⊗l}, and considering its transformation under (ρ_l^{−1})*(f), for the same f ∈ G, we see that the coefficient of the basis vector b*_I for u(1) ⊗ u(2) ⊗ · · · ⊗ u(l), as in (69), is

Π_{t=1}^{l} p_{i_t}(t)   (85)

and the coefficient of the basis vector b*_J for (ρ_l^{−1})*(f)[u(1) ⊗ u(2) ⊗ · · · ⊗ u(l)] is

Π_{t=1}^{l} q_{j_t}(t)   (86)

and that (73) holds, namely that

L_{J,I} = Π_{t=1}^{l} a_{j_t,i_t}   (87)

Hence, by (72), we have that

Π_{t=1}^{l} p_{i_t}(t) = Σ_{J ∈ [n]^l} Π_{t=1}^{l} q_{j_t}(t) L_{J,I}   (88)

Multiplying both sides of (76) by

Π_{t=1}^{l} q_{i_t}(t) Π_{r=1}^{k} x_{m_r}(r)   (89)

adding over all tuples I ∈ [n]^l, M ∈ [n]^k, and applying (82) and (88) gives us, on the LHS,

Σ_{I ∈ [n]^l, J ∈ [n]^k} C_{I,J} Π_{t=1}^{l} q_{i_t}(t) Π_{r=1}^{k} y_{j_r}(r)   (90)

and on the RHS,

Σ_{J ∈ [n]^l, M ∈ [n]^k} C_{J,M} Π_{t=1}^{l} p_{j_t}(t) Π_{r=1}^{k} x_{m_r}(r)   (91)

This means that

Σ_{I ∈ [n]^l, J ∈ [n]^k} C_{I,J} Π_{t=1}^{l} p_{i_t}(t) Π_{r=1}^{k} x_{j_r}(r)   (92)

is an invariant for the group G; that is, it is a linear transformation

((R^n)*)^{⊗l} ⊗ (R^n)^{⊗k} → R   (93)

which maps an element of the form

u(1) ⊗ u(2) ⊗ · · · ⊗ u(l) ⊗ v(1) ⊗ v(2) ⊗ · · · ⊗ v(k)   (94)

to (92) and is invariant under the action of G.

We see that each stage of the argument, from (74) to (94), is an if and only if statement, since any invariant of G which is linear in any subset {v(1), v(2), . . . , v(k)} of k vectors of R^n and any subset {u(1), u(2), . . . , u(l)} of l vectors in (R^n)*, and is a homogeneous function of their union, must be of the form (92): any invariant of these l + k elements must be a linear combination of the elements Π_{t=1}^{l} p_{i_t}(t) Π_{r=1}^{k} x_{j_r}(r), where x_{j_r}(r) is the j_r-th coefficient of v(r) when expressed in some basis B of R^n, and p_{i_t}(t) is the i_t-th coefficient of u(t) when expressed in its dual basis B*. Hence, starting from (92) and running the argument in reverse gives (74), and therefore shows that it is an if and only if statement. In particular, this means that each element of Hom_G((R^n)^{⊗k}, (R^n)^{⊗l}), having chosen the basis B for each copy of R^n, is in bijective correspondence with an invariant

((R^n)*)^{⊗l} ⊗ (R^n)^{⊗k} → R   (95)

of G of the form (92), as desired.

B. First Fundamental Theorems for O(n), Sp(n) and SO(n)

We state, without proof, the First Fundamental Theorems for O(n), Sp(n) and SO(n). See (Goodman & Wallach, 2009) for more details.

Theorem B.1 (First Fundamental Theorem for O(n)). Let n ∈ Z_{≥1}, and suppose that the real vector space R^n has associated with it a non-degenerate, symmetric, bilinear form ( , ), as in Section 4. Let us choose the standard basis for R^n, so that ( , ) becomes the Euclidean inner product on R^n, as defined in (8). If f : (R^n)^{⊗(l+k)} → R is a polynomial function on elements of (R^n)^{⊗(l+k)} of the form

u(1) ⊗ u(2) ⊗ · · · ⊗ u(l) ⊗ v(1) ⊗ v(2) ⊗ · · · ⊗ v(k)   (96)

that is O(n)-invariant, then it must be a polynomial in the Euclidean inner products

(u(i), u(j)), (u(i), v(j)), (v(i), v(j))   (97)

Theorem B.2 (First Fundamental Theorem for Sp(n)). Let n ∈ Z_{≥2} be even, and suppose that the real vector space R^n has associated with it a non-degenerate, skew-symmetric, bilinear form ⟨ , ⟩, as in Section 4. Let us choose the symplectic basis for R^n, so that ⟨ , ⟩ becomes the form given in (12). Note that, in this basis, the non-degenerate, symmetric, bilinear form ( , ), which we can also associate with R^n, becomes the Euclidean inner product on R^n, as defined in (8), since the symplectic basis is standard with respect to ( , ).
B. First Fundamental Theorems for O(n), Sp(n) and SO(n)

We state, without proof, the First Fundamental Theorems for O(n), Sp(n) and SO(n). See (Goodman & Wallach, 2009) for more details.

Theorem B.1 (First Fundamental Theorem for O(n)). Let n ∈ Z_{≥1}, and suppose that the real vector space R^n has associated with it a non-degenerate, symmetric, bilinear form (·, ·), as in Section 4. Let us choose the standard basis for R^n, so that (·, ·) becomes the Euclidean inner product on R^n, as defined in (8). If f : (R^n)^{⊗(l+k)} → R is a polynomial function on elements in (R^n)^{⊗(l+k)} of the form

u(1) ⊗ u(2) ⊗ ... ⊗ u(l) ⊗ v(1) ⊗ v(2) ⊗ ... ⊗ v(k)   (96)

that is O(n)-invariant, then it must be a polynomial of the Euclidean inner products

(u(i), u(j)), (u(i), v(j)), (v(i), v(j))   (97)

Theorem B.2 (First Fundamental Theorem for Sp(n)). Let n ∈ Z_{≥2} be even, and suppose that the real vector space R^n has associated with it a non-degenerate, skew-symmetric, bilinear form ⟨·, ·⟩, as in Section 4. Let us choose the symplectic basis for R^n, so that ⟨·, ·⟩ becomes the form given in (12). Note that, in this basis, the non-degenerate, symmetric, bilinear form (·, ·), which we can also associate with R^n, becomes the Euclidean inner product on R^n, as defined in (8), since the symplectic basis is standard with respect to (·, ·).

If f : (R^n)^{⊗(l+k)} → R is a polynomial function on elements in (R^n)^{⊗(l+k)} of the form

u(1) ⊗ u(2) ⊗ ... ⊗ u(l) ⊗ v(1) ⊗ v(2) ⊗ ... ⊗ v(k)   (98)

that is Sp(n)-invariant, then it must be a polynomial of the Euclidean inner products

(u(i), v(j))   (99)

together with the skew products

⟨u(i), u(j)⟩, ⟨v(i), v(j)⟩   (100)

such that i < j in (100).

Theorem B.3 (First Fundamental Theorem for SO(n)). Let n ∈ Z_{≥1}, and suppose that the real vector space R^n has associated with it a non-degenerate, symmetric, bilinear form (·, ·), as in Section 4. Let us choose the standard basis for R^n, so that (·, ·) becomes the Euclidean inner product on R^n, as defined in (8). If f : (R^n)^{⊗(l+k)} → R is a polynomial function on elements in (R^n)^{⊗(l+k)} of the form

u(1) ⊗ u(2) ⊗ ... ⊗ u(l) ⊗ v(1) ⊗ v(2) ⊗ ... ⊗ v(k)   (101)

that is SO(n)-invariant, then it must be a polynomial of the Euclidean inner products

(u(i), u(j)), (u(i), v(j)), (v(i), v(j))   (102)

together with the n × n subdeterminants of the n × (l + k) matrix M whose columns are u(1), u(2), ..., u(l), v(1), v(2), ..., v(k).

C. Bijective Correspondence between the Brauer and Brauer-Grood vector spaces and the Invariants for O(n), Sp(n) and SO(n)

We now provide a proof of the results given in Theorems 6.6, 6.7, and 6.8.

It can be shown that R^n, as a representation of each of the groups G = O(n), Sp(n) and SO(n) (for Sp(n), n = 2m), is isomorphic to its dual space (R^n)^*, by using the appropriate bilinear form that is used to define each of the groups in question. See (Goodman & Wallach, 2009) for more details. Hence, for each group G, we can apply its First Fundamental Theorem to the invariant given in (92), now considered as a function (R^n)^{⊗(l+k)} → R, noting that each term of the polynomial (92) contains each of the vectors u(1), u(2), ..., u(l), v(1), v(2), ..., v(k) exactly once. Consequently, we obtain a spanning set of invariants (R^n)^{⊗(l+k)} → R for each of O(n), Sp(n) and SO(n) (for Sp(n), n = 2m).

Theorem C.1 (Spanning Set of Invariants (R^n)^{⊗(l+k)} → R for O(n)). The functions

(z(1), z(2))(z(3), z(4)) ... (z(l + k − 1), z(l + k))   (104)

where z(1), ..., z(l + k) is a permutation of u(1), u(2), ..., u(l), v(1), v(2), ..., v(k), form a spanning set of invariants (R^n)^{⊗(l+k)} → R for O(n).

Clearly, functions of the form (104) can only be formed when l + k is even; hence there are no invariants (R^n)^{⊗(l+k)} → R for O(n) when l + k is odd.

Theorem C.2 (Spanning Set of Invariants (R^n)^{⊗(l+k)} → R for Sp(n), n = 2m). The functions

[z(1), z(2)][z(3), z(4)] ... [z(l + k − 1), z(l + k)]   (105)

where z(1), ..., z(l + k) is a permutation of u(1), u(2), ..., u(l), v(1), v(2), ..., v(k), and where

[z(i), z(i + 1)] := (z(i), z(i + 1)) if z(i) = u(j) and z(i + 1) = v(m), or z(i) = v(m) and z(i + 1) = u(j), for some j ∈ [l], m ∈ [k], and [z(i), z(i + 1)] := ⟨z(i), z(i + 1)⟩ otherwise,

form a spanning set of invariants (R^n)^{⊗(l+k)} → R for Sp(n), with n = 2m.

Clearly, functions of the form (105) can only be formed when l + k is even; hence there are no invariants (R^n)^{⊗(l+k)} → R for Sp(n) when l + k is odd.
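As a quick numerical illustration of Theorems C.1 and C.2 (a sketch, not part of the proof), the following checks that a product of Euclidean inner products over a pairing is unchanged by a random orthogonal matrix, and that a product of skew products is unchanged by a symplectic matrix. The matrix J and the block-diagonal family of symplectic matrices used here are convenient assumptions of this sketch; the paper's symplectic basis (12) may be ordered differently.

```python
# A numerical sketch (ours) of the invariants in Theorems C.1 and C.2.
import numpy as np

rng = np.random.default_rng(2)

# O(n): a product of Euclidean inner products over a pairing, as in (104), is invariant.
n = 4
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))            # a random element of O(n)
z = [rng.standard_normal(n) for _ in range(4)]              # z(1), ..., z(4)
before = (z[0] @ z[1]) * (z[2] @ z[3])
after = ((Q @ z[0]) @ (Q @ z[1])) * ((Q @ z[2]) @ (Q @ z[3]))
assert np.isclose(before, after)

# Sp(n), n = 2m: a product of skew products over a pairing, as in (105), is invariant.
m = 2
J = np.block([[np.zeros((m, m)), np.eye(m)],
              [-np.eye(m), np.zeros((m, m))]])              # our choice of skew form
B = np.eye(m) + 0.1 * rng.standard_normal((m, m))           # any invertible m x m block
g = np.block([[B, np.zeros((m, m))],
              [np.zeros((m, m)), np.linalg.inv(B).T]])
assert np.allclose(g.T @ J @ g, J)                          # g preserves the form, so g is symplectic

w = [rng.standard_normal(2 * m) for _ in range(4)]
skew = lambda a, b: a @ J @ b                               # the skew product <a, b>
before = skew(w[0], w[1]) * skew(w[2], w[3])
after = skew(g @ w[0], g @ w[1]) * skew(g @ w[2], g @ w[3])
assert np.isclose(before, after)
```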
Theorem C.3 (Spanning Set of Invariants (R^n)^{⊗(l+k)} → R for SO(n)). Functions of the form (104), together with functions of the form

det(z(1), ..., z(n))(z(n + 1), z(n + 2)) ... (z(l + k − 1), z(l + k))   (107)

where z(1), ..., z(l + k) is a permutation of u(1), u(2), ..., u(l), v(1), v(2), ..., v(k), form a spanning set of invariants (R^n)^{⊗(l+k)} → R for SO(n).

Clearly, functions of the form (107) can only be formed when n ≤ l + k, and either when n is odd and l + k is odd, or when n is even and l + k is even.

We can now construct a bijective correspondence between each of the functions (104), (105), and (107), and either a (k, l) Brauer diagram or an (l + k)\n diagram, as follows.

Indeed, consider the spanning set of invariants of the form (104). Then we can associate a (k, l) Brauer diagram with each element of the set by labelling the top l vertices by u(1), u(2), ..., u(l) and the bottom k vertices by v(1), v(2), ..., v(k), and drawing an edge between two vertices if and only if they appear in the same inner product in (104).

We do a similar thing for the spanning set of invariants of the form (105), associating a (k, l) Brauer diagram with each element of the set, labelling the vertices in the same manner, and drawing an edge between two vertices if and only if they appear in the same inner or skew product in (105).

Finally, consider the functions of the form (107). Then we can associate an (l + k)\n diagram with each element of the set by labelling the top l vertices by u(1), u(2), ..., u(l) and the bottom k vertices by v(1), v(2), ..., v(k), leaving the vertices z(1), ..., z(n) free, and drawing an edge between the other vertices if and only if they appear in the same inner product in (107).

As a result, we have constructed a bijective correspondence between a spanning set of invariants (R^n)^{⊗(l+k)} → R and a set of diagrams: for O(n), with the set of (k, l) Brauer diagrams whose span is the Brauer vector space B^l_k(n); for SO(n), with the set of (k, l) Brauer diagrams together with the set of (l + k)\n diagrams, whose span is the Brauer-Grood vector space D^l_k(n); and for Sp(n) (n = 2m), with the set of (k, l) Brauer diagrams whose span is the Brauer vector space B^l_k(n).

Consequently, for each group G, the bijective correspondence, via (92), that exists between the spanning set of invariants (R^n)^{⊗(l+k)} → R (given in Theorems C.1, C.2, and C.3) and a spanning set of matrices for Hom_G((R^n)^{⊗k}, (R^n)^{⊗l}) in the standard/symplectic basis of R^n, and the bijective correspondence that exists between the spanning set of invariants (R^n)^{⊗(l+k)} → R and the set of diagrams that span either the Brauer vector space B^l_k(n), for O(n) and Sp(n), or the Brauer-Grood vector space D^l_k(n), for SO(n), together prove the results given in Theorems 6.6, 6.7, and 6.8.

D. Examples

In the following, in order to save space, we use some temporary notation to denote a sum of a number of weight parameters: we represent a sum of weight parameters over some index set A by a single parameter that is indexed by the set A itself, that is,

λ_A := Σ_{i ∈ A} λ_i   (108)

For example, λ_{8,11,12} represents the sum λ_8 + λ_11 + λ_12.

D.1.1. A BASIS OF End_{O(2)}((R^2)^{⊗2})

We consider the surjective map

Φ^2_{2,2} : B^2_2(2) → End_{O(2)}((R^2)^{⊗2})   (109)

and apply Theorem 6.6, noting that l + k is even. Also, as 2n ≥ l + k, this map is an isomorphism of vector spaces, hence the images of the basis diagrams of B^2_2(2) form a basis of End_{O(2)}((R^2)^{⊗2}). Figure 2 shows how to find the basis of End_{O(2)}((R^2)^{⊗2}) from the basis of B^2_2(2).

Figure 2. The images under Φ^2_{2,2} of the basis diagrams of B^2_2(2) make up a basis of End_{O(2)}((R^2)^{⊗2}). (The figure tabulates, for each basis diagram d_β, its matrix entries, namely δ_{i_1,i_2} δ_{j_1,j_2}, δ_{i_1,j_1} δ_{i_2,j_2}, and δ_{i_1,j_2} δ_{i_2,j_1}, together with the resulting 4 × 4 matrix over the row and column indices (1,1), (1,2), (2,1), (2,2): respectively the matrix with ones precisely in the four positions where i_1 = i_2 and j_1 = j_2, the identity matrix, and the permutation matrix that swaps (1,2) and (2,1).)

Hence, any element of End_{O(2)}((R^2)^{⊗2}), in the basis of matrix units of End((R^2)^{⊗2}), is of the form

          (1,1)       (1,2)   (2,1)   (2,2)
(1,1)     λ_{1,2,3}   0       0       λ_1
(1,2)     0           λ_2     λ_3     0
(2,1)     0           λ_3     λ_2     0
(2,2)     λ_1         0       0       λ_{1,2,3}

for scalars λ_1, λ_2, λ_3 ∈ R.
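The computation in D.1.1 is easy to reproduce numerically. The sketch below (ours; the helper diagram_matrix and the variable names are illustrative) builds the three matrices of Figure 2 directly from their Kronecker-delta formulas, checks that each commutes with Q ⊗ Q for a random Q ∈ O(2), and forms the general element λ_1 E_1 + λ_2 E_2 + λ_3 E_3 displayed above.

```python
# A sketch (ours) reproducing D.1.1: build the three matrices of Figure 2 and check
# that each lies in End_{O(2)}((R^2)^{⊗2}).
import numpy as np
from itertools import product

rng = np.random.default_rng(3)
n = 2
pairs = list(product(range(n), repeat=2))       # index tuples (i1, i2) and (j1, j2)

def diagram_matrix(entry):
    # entry(i1, i2, j1, j2) gives the matrix entry, e.g. delta_{i1,i2} * delta_{j1,j2}
    return np.array([[entry(*I, *J) for J in pairs] for I in pairs], dtype=float)

E1 = diagram_matrix(lambda i1, i2, j1, j2: (i1 == i2) * (j1 == j2))   # delta_{i1,i2} delta_{j1,j2}
E2 = diagram_matrix(lambda i1, i2, j1, j2: (i1 == j1) * (i2 == j2))   # delta_{i1,j1} delta_{i2,j2}
E3 = diagram_matrix(lambda i1, i2, j1, j2: (i1 == j2) * (i2 == j1))   # delta_{i1,j2} delta_{i2,j1}

Q, _ = np.linalg.qr(rng.standard_normal((n, n)))   # a random element of O(2)
Qk = np.kron(Q, Q)
for E in (E1, E2, E3):
    assert np.allclose(E @ Qk, Qk @ E)             # the equivariance condition (74) with k = l = 2

# The general element is lam1*E1 + lam2*E2 + lam3*E3, which reproduces the lambda-matrix
# displayed above (its ((1,1),(1,1)) entry is lam1 + lam2 + lam3, i.e. lam_{1,2,3}).
lam = rng.standard_normal(3)
W = lam[0] * E1 + lam[1] * E2 + lam[2] * E3
assert np.allclose(W @ Qk, Qk @ W)
```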
D.1.2. A BASIS OF End_{O(3)}((R^3)^{⊗3})

We consider the surjective map

Φ^3_{3,3} : B^3_3(3) → End_{O(3)}((R^3)^{⊗3})   (111)

and apply Theorem 6.6, noting that l + k is even. Also, as 2n ≥ l + k, this map is an isomorphism of vector spaces, hence the images of the basis diagrams of B^3_3(3) form a basis of End_{O(3)}((R^3)^{⊗3}). The basis diagrams of B^3_3(3) are the fifteen (3, 3) Brauer diagrams, that is, the fifteen ways of pairing up the six vertices.

Taking their images under Φ^3_{3,3}, we see that any element of End_{O(3)}((R^3)^{⊗3}), in the basis of matrix units of End((R^3)^{⊗3}), is of the form

Σ_{β=1}^{15} λ_β Φ^3_{3,3}(d_β)   (112)

for scalars λ_1, ..., λ_15 ∈ R. Written out in full, (112) is a 27 × 27 matrix, with rows and columns indexed by the triples (i_1, i_2, i_3), (j_1, j_2, j_3) ∈ [3]^3, whose entries are sums of the weights λ_1, ..., λ_15 in the notation of (108); for example, the diagonal entries in the positions indexed by (1,1,1), (2,2,2) and (3,3,3) are all equal to λ_{1,...,15}. Notice that End_{O(3)}((R^3)^{⊗3}) is a 15-dimensional vector space living inside a 729-dimensional vector space, End((R^3)^{⊗3}).
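The same computation can be automated for D.1.2. The sketch below (ours; it assumes the usual rule that a (3, 3) Brauer diagram maps to the matrix obtained by multiplying one Kronecker delta per edge, as in Figure 2) enumerates the fifteen perfect matchings of the six vertices, builds the corresponding 27 × 27 matrices, and checks that they are linearly independent and O(3)-equivariant, so that they do form a basis of End_{O(3)}((R^3)^{⊗3}).

```python
# A sketch (ours) reproducing D.1.2.
import numpy as np
from itertools import product

def perfect_matchings(verts):
    # All ways of pairing up an even number of vertex labels.
    if not verts:
        yield []
        return
    first, rest = verts[0], verts[1:]
    for idx, other in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for rest_matching in perfect_matchings(remaining):
            yield [(first, other)] + rest_matching

n, k = 3, 3
verts = ['i1', 'i2', 'i3', 'j1', 'j2', 'j3']
triples = list(product(range(n), repeat=k))

def diagram_matrix(matching):
    # One Kronecker delta per edge of the diagram.
    E = np.zeros((n ** k, n ** k))
    for r, I in enumerate(triples):
        for c, J in enumerate(triples):
            labels = dict(zip(verts, I + J))
            E[r, c] = all(labels[a] == labels[b] for a, b in matching)
    return E

mats = [diagram_matrix(m) for m in perfect_matchings(verts)]
assert len(mats) == 15                                                    # the fifteen basis diagrams
assert np.linalg.matrix_rank(np.stack([E.ravel() for E in mats])) == 15   # linearly independent images

rng = np.random.default_rng(4)
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))                          # a random element of O(3)
Q3 = np.kron(np.kron(Q, Q), Q)
assert all(np.allclose(E @ Q3, Q3 @ E) for E in mats)                     # each image is O(3)-equivariant
```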
Recall that the construction for Sp(n) is similar to that for O(n), except that we replace δ by ε whenever there is an edge between two vertices that lie in the same row.

D.2.1. A SPANNING SET FOR Hom_{Sp(2)}((R^2)^{⊗3}, R^2)

For the surjective map

X^1_{3,2} : B^1_3(2) → Hom_{Sp(2)}((R^2)^{⊗3}, R^2)   (113)

we apply Theorem 6.7, noting that l + k = 4, which is even. Figure 3 shows how to find a spanning set for Hom_{Sp(2)}((R^2)^{⊗3}, R^2). This means that any element of Hom_{Sp(2)}((R^2)^{⊗3}, R^2), in the basis of matrix units of Hom((R^2)^{⊗3}, R^2), is a linear combination, with coefficients λ_1, λ_2, λ_3 ∈ R, of the three matrices given in Figure 3; written out, it is a 2 × 8 matrix, with rows indexed by 1, 1′ and columns indexed by the triples in {1, 1′}^3, whose entries are signed sums of the λ_i (for example, the entry in row 1, column (1, 1, 1′) is λ_1 + λ_2).

D.3.1. A SPANNING SET FOR End_{SO(3)}((R^3)^{⊗3})

We apply Theorem 6.8. As n ≤ l + k but n is odd while l + k is even, there are no (l + k)\n diagrams, and so

End_{SO(3)}((R^3)^{⊗3}) = End_{O(3)}((R^3)^{⊗3})

and any element of End_{SO(3)}((R^3)^{⊗3}), in the basis of matrix units of End((R^3)^{⊗3}), is of the form (112).

Figure 3. The images under X^1_{3,2} of the basis diagrams of B^1_3(2) make up a spanning set for Hom_{Sp(2)}((R^2)^{⊗3}, R^2). (The figure tabulates, for each basis diagram d_β, its matrix entries, namely δ_{i_1,j_1} ε_{j_2,j_3}, δ_{i_1,j_2} ε_{j_1,j_3}, and δ_{i_1,j_3} ε_{j_1,j_2}, together with the resulting 2 × 8 matrix over the row indices 1, 1′ and the column indices {1, 1′}^3.)

D.3.2. A SPANNING SET FOR Hom_{SO(2)}((R^2)^{⊗3}, R^2)

Again, we apply Theorem 6.8. As n ≤ l + k, and both n and l + k are even, we see that the three (3, 1) Brauer diagrams and the six (1 + 3)\2 diagrams make up a basis of D^1_3(2). Their images under

Ψ^1_{3,2} : D^1_3(2) → Hom_{SO(2)}((R^2)^{⊗3}, R^2)   (115)

form a spanning set of Hom_{SO(2)}((R^2)^{⊗3}, R^2). Calculating the images of the three (3, 1) Brauer diagrams is the same as for the O(n) case. Figure 4 shows how to find the images of the six (1 + 3)\2 diagrams.

D.4. Local Symmetry Example

As an example of the result given in Section 8, suppose that we want to find a spanning set for

Hom_{SO(3) × SO(3)}((R^3)^{⊗3} ⊗ R^3, (R^3)^{⊗3} ⊗ (R^3)^{⊗2})   (116)

By (43), we know that (116) is isomorphic to

End_{SO(3)}((R^3)^{⊗3}) ⊗ Hom_{SO(3)}(R^3, (R^3)^{⊗2})   (117)

Hence, to find a spanning set, all we need to do is find the images of the basis elements of D^3_3(3) ⊗ D^2_1(3) under Ψ^3_{3,3} ⊗ Ψ^2_{1,3} and take the Kronecker product of the resulting matrices. To save space, we will only show what the basis elements of D^3_3(3) ⊗ D^2_1(3) are. By Theorem 6.8, the basis elements are the diagrams obtained by placing a basis diagram of D^3_3(3) side by side with a basis diagram of D^2_1(3), where we have used a red demarcation line to separate the vertices of the respective diagrams. Note that no edge can cross this red line.

Figure 4. The images under Ψ^1_{3,2} of the six (1 + 3)\2 diagrams in D^1_3(2), together with the images under Ψ^1_{3,2} of the three (3, 1) Brauer diagrams in D^1_3(2) (not shown), make up a spanning set for Hom_{SO(2)}((R^2)^{⊗3}, R^2). (The figure tabulates, for each of the six (1 + 3)\2 diagrams d_α, its matrix entries, namely χ^{1,2}_{j_2,j_3} δ_{i_1,j_1}, χ^{1,2}_{j_1,j_3} δ_{i_1,j_2}, χ^{1,2}_{j_1,j_2} δ_{i_1,j_3}, χ^{1,2}_{i_1,j_3} δ_{j_1,j_2}, χ^{1,2}_{i_1,j_2} δ_{j_1,j_3}, and χ^{1,2}_{i_1,j_1} δ_{j_2,j_3}, together with the resulting 2 × 8 matrix over the row indices 1, 2 and the column indices [2]^3.)
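Finally, here is a sketch (ours) of the Kronecker-product recipe in D.4. It uses one (3, 3) Brauer diagram matrix as an element of End_{SO(3)}((R^3)^{⊗3}), and the Levi-Civita tensor as one convenient element of Hom_{SO(3)}(R^3, (R^3)^{⊗2}); the latter is an assumption of this sketch, since the paper's basis of D^2_1(3) is not reproduced here. The only point being checked is that the Kronecker product of the two equivariant matrices intertwines the SO(3) × SO(3) actions in (116) and (117).

```python
# A sketch (ours) of the Kronecker-product recipe in D.4.
import numpy as np

n = 3
rng = np.random.default_rng(5)

def random_so3():
    Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return Q if np.linalg.det(Q) > 0 else Q[:, [1, 0, 2]]   # swap two columns to force det = +1

# C1: the image of the (3, 3) Brauer diagram pairing i1-i2, j1-j2, i3-j3, i.e. the
# delta_{i1,i2} delta_{j1,j2} delta_{i3,j3} pattern, an element of End_{SO(3)}((R^3)^{⊗3}).
v = np.eye(n).reshape(-1)                                   # coefficients of sum_i e_i ⊗ e_i
C1 = np.kron(np.outer(v, v), np.eye(n))

# C2: (C2)_{(i1,i2),j} = epsilon_{i1 i2 j}, reshaped to a 9 x 3 matrix (our assumed choice
# of an SO(3)-equivariant map R^3 -> (R^3)^{⊗2}).
eps = np.zeros((n, n, n))
for a, b, c, sign in [(0, 1, 2, 1), (1, 2, 0, 1), (2, 0, 1, 1),
                      (0, 2, 1, -1), (2, 1, 0, -1), (1, 0, 2, -1)]:
    eps[a, b, c] = sign
C2 = eps.reshape(n ** 2, n)

Q1, Q2 = random_so3(), random_so3()
Q1_3 = np.kron(np.kron(Q1, Q1), Q1)

# Each factor is equivariant for its own copy of SO(3) ...
assert np.allclose(C1 @ Q1_3, Q1_3 @ C1)
assert np.allclose(C2 @ Q2, np.kron(Q2, Q2) @ C2)

# ... hence their Kronecker product intertwines the two actions, as claimed by (116)-(117).
lhs = np.kron(C1, C2) @ np.kron(Q1_3, Q2)                   # act on (R^3)^{⊗3} ⊗ R^3 first
rhs = np.kron(Q1_3, np.kron(Q2, Q2)) @ np.kron(C1, C2)      # act on the output space instead
assert np.allclose(lhs, rhs)
```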