
Journal of Machine Learning Research 21 (2020) 1-38 Submitted 1/19; Published 3/20

Multiparameter Persistence Landscapes

Oliver Vipond vipond@maths.ox.ac.uk Mathematical Institute University of Oxford Oxford, OX2 7DT, UK

Editor: Sayan Mukherjee

An important problem in the field of Topological Data Analysis is defining topological summaries which can be combined with traditional data analytic tools. In recent work Bubenik introduced the persistence landscape, a stable representation of persistence diagrams amenable to statistical analysis and machine learning tools. In this paper we generalise the persistence landscape to multiparameter persistence modules, providing a stable representation of the rank invariant. We show that multiparameter landscapes are stable with respect to the interleaving distance and persistence weighted Wasserstein distance, and that the collection of multiparameter landscapes faithfully represents the rank invariant. Finally we provide example calculations and statistical tests to demonstrate a range of potential applications and how one can interpret the landscapes associated to a multiparameter module.

Keywords: Topological Data Analysis, Multiparameter Persistence, Persistence Landscapes, Machine Learning, Statistical Topology

1. Introduction

Topological and Geometric Data Analysis (TGDA) describes an emerging set of analytic tools which leverage the underlying shape of data to produce topological summaries. These techniques have been particularly successful at providing new insight for high dimensional data sets, topological data structures and biological data sets (Gameiro et al., 2015; Kelin and Guo-Wei, 2015; Nicolau et al., 2011). An ideal topological summary should discriminate well between different spaces, be stable to perturbations of the initial data, and be amenable to statistical analysis. Persistent homology (PH) has become a ubiquitous tool in the TGDA arsenal. PH studies the homology groups of a family of topological spaces built upon data. Homology is an invariant from algebraic topology which detects holes of different dimensions in a topological space. Degree zero homology detects connected components, degree one homology detects loops, degree two homology detects voids and higher degree homology detects higher dimensional cavities. Persistent homology tracks how these homological features persist throughout the family of topological spaces. The associated topological summary for a 1-parameter family is given in terms of a persistence diagram marking the parameter values for births and deaths of homological features. The persistence diagram may equivalently be thought of as a multiset of points in $\mathbb{R}^2$. For an introduction to persistent homology see Edelsbrunner and Harer (2010) and Oudot (2015). Qualitatively one considers long-lived homological features detected as inherent to the data set and the short-lived features as noise. Nevertheless in various applications it has been revealed that the short-lived features provide important discriminating information in classification (Bubenik et al., 2019). For example in analysing the topology of brain arteries it was found that the 28th longest persisting feature provided the most useful discriminating information (Bendich et al., 2016). We would like therefore to have a statistical framework and topological summary in which one can detect statistically significant topological features of large, medium and small persistence, and which in particular does not discard short-lived features as noise. One can equip the collection of persistence diagrams with a natural pseudo-metric known as the bottleneck distance. The resulting metric space of persistence diagrams does not enjoy the properties desired for traditional statistical analysis. For example, a collection of persistence diagrams may not have a well defined mean. As a result there have been various attempts to vectorize the persistence diagram, in order that the summary is more amenable to statistical analysis and machine learning techniques. The article Bubenik (2015) introduces a stable vectorization of the persistence diagram, the persistence landscape, a function in Lebesgue p-space. The article Bubenik (2018) explores some further properties of the persistence landscape. The persistence landscape naturally enjoys unique means and one can perform traditional hypothesis tests upon the landscape function and numerical statistics derived from this function.

© 2020 Oliver Vipond.

License: CC-BY 4.0, see https://creativecommons.org/licenses/by/4.0/. Attribution requirements are provided at http://jmlr.org/papers/v21/19-054.html.
Another popular stable vectorization of the persistence diagram is the persistence image (Adams et al., 2017) which has been shown to produce favourable classification results when combined with machine learning techniques. There are several natural situations where we may wish to build a richer structure of topological spaces on a data set and track the changes to the homology whilst varying multiple parameters. For example, in Keller et al. (2018) the topology of chemical compounds is studied using 2-parameter filtrations. The PH theory becomes wildly more complicated when the family of topological spaces on our data set is indexed by multiple parameters. The associated family of homology groups is known as a multiparameter persistence module. The theory of multiparameter persistence modules is presented in Carlsson and Zomorodian (2009) and, unlike the single parameter case where we may associate a persistence diagram to a module, there is not an analogous complete discrete invariant in the multiparameter setting. There exist various approaches to define invariants for multiparameter persistence modules in the literature. The rank invariant of a module has been studied in the context of $H_0$-modules and shape matching and has been shown to be stable when endowed with a matching distance (Frosini and Landi, 2011; Cerri et al., 2013). An alternative approach uses algebraic geometry to construct numeric functions on multiparameter persistence modules (Skryzalin and Carlsson, 2017), generalizing the ring of algebraic functions on barcode space for the single parameter case (Adcock et al., 2016). However, as noted by Kališnik (2019), the disadvantage of both of these approaches is that equipped with their vector space norms, these invariants are unstable with respect to the natural distance to consider on multiparameter modules, the interleaving distance. Another approach is to study algebraic invariants associated to the multigraded algebra structure of multiparameter persistence modules (Harrington et al., 2019; Miller, 2017).

In this article we introduce a family of new stable invariants for multiparameter persistence modules, naturally extending the results of Bubenik (2015) from the setting of single parameter persistence modules to multiparameter persistence modules. We shall consider persistence modules indexed by continuous real parameters. Our incomplete invariants, the multiparameter persistence landscapes, are derived from the rank invariant associated to a multiparameter persistence module, and they are continuous functions in Lebesgue p-space. The multiparameter persistence landscape reduces to a family of single parameter persistence landscapes: the multiparameter landscape of a persistence module $M$ indexed over $\mathbb{R}^n$ is characterized by the fact that it restricts to the single parameter landscape function of the restriction of $M$ to every line parallel to the main diagonal vector $\mathbf{1} = (1, 1, \ldots, 1)$. Thus, whilst the single parameter persistence landscape faithfully represents the persistence diagram of a single parameter module (Betthauser et al., 2019), the multiparameter persistence landscape encodes the persistence diagrams of the family of single parameter persistence submodules which lie on the lines of slope 1 through the parameter space. Living in Lebesgue p-space, the multiparameter landscapes are naturally endowed with a distance function and there is a uniquely defined mean associated to multiple landscapes. Considered as Banach space valued random variables, multiparameter landscapes satisfy the Strong Law of Large Numbers and the Central Limit Theorem and are thus well suited to statistical analysis. The natural inner-product structure on the landscape functions gives rise to a positive-definite kernel, which can be leveraged by machine learning algorithms.
The multiparameter landscape functions are sensitive to homological features of large, medium and small persistence. The landscapes also have the advantage of being interpretable since they are closely related to the rank invariant. Moreover one can derive stable $\mathbb{R}$-valued numeric invariants from the landscape functions using the linear functionals in the dual space. We can produce confidence intervals and perform hypothesis tests on these numeric invariants, which are viewed as $\mathbb{R}$-valued random variables. In the 2-parameter case we visualise a multiparameter landscape as a surface $\lambda : \mathbb{R}^2 \to \mathbb{R}$. We shall present computational examples in the 2-parameter case using the RIVET software presented in Lesnick and Wright (2015). These examples illustrate a range of potential applications, demonstrating that the landscapes are sensitive to both the topology and geometry of data.

2. Multiparameter Persistence

The central objects of interest in multiparameter persistence theory are multiparameter persistence modules. These are algebraic objects which can be used to encode rich topological and geometric properties of a data set. The following example gives a construction of a multiparameter persistence module.

Example 1 (Sublevel-set Multiparameter Persistence Module) Let $X$ be a topological space and $f : X \to \mathbb{R}^n$ a function, called a filtering function. We can associate a family of topological subspaces indexed by vectors $a = (a_1, \ldots, a_n) \in \mathbb{R}^n$ induced by $f$:

$$X_a = \{x \in X : f(x)_i < a_i,\ i = 1, \ldots, n\};$$

this is known as the sublevel-set filtration. For any $b \in \mathbb{R}^n$ such that $a_i \le b_i$ for all $i = 1, \ldots, n$, we have an inclusion map $X_a \hookrightarrow X_b$. If we let $H$ denote a singular homology functor with coefficients in a field then applying this functor to the collection $\{X_a\}_{a \in \mathbb{R}^n}$ and the appropriate inclusion maps gives rise to a family of vector spaces $\{H(X_a)\}_{a \in \mathbb{R}^n}$ and linear maps $\{H(X_a) \to H(X_b)\}_{a \le b}$ known as a Sublevel-set Multiparameter Persistence Module.
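To make the sublevel-set construction concrete on a finite sample, the following minimal sketch computes $X_a$ for a toy filtering function and checks the inclusion $X_a \subseteq X_b$ when $a \le b$. The function `toy_filter`, the sample `points` and the helper name `sublevel_set` are hypothetical choices made for illustration only; no homology is computed here.

```python
import math

def sublevel_set(points, f, a):
    """Return the sample points x with f(x)_i < a_i for every coordinate i."""
    return [x for x in points if all(fx < ai for fx, ai in zip(f(x), a))]

def toy_filter(x):
    # Hypothetical filtering function R^2 -> R^2 (an assumption for this sketch):
    # first coordinate is the distance to the origin, second is |x_1|.
    return (math.hypot(x[0], x[1]), abs(x[0]))

points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 4.0)]

a = (1.5, 0.5)
b = (2.5, 1.5)  # a_i <= b_i in every coordinate

X_a = sublevel_set(points, toy_filter, a)
X_b = sublevel_set(points, toy_filter, b)

# Coordinate-wise a <= b forces X_a to be contained in X_b.
print(X_a, X_b, set(X_a) <= set(X_b))
```

The strict inequality in `sublevel_set` mirrors the strict inequality in the definition of $X_a$.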

In the case of point cloud data in Euclidean space we can capture the spatial distribution of the points with our filtering function. Let $P = \{p_1, \ldots, p_n\} \subset \mathbb{R}^N = X$ denote a point cloud in Euclidean space. We may define our filtering function from this point cloud $f^P : \mathbb{R}^N \to \mathbb{R}^n$ to have the first parameter be the distance function from the point cloud, $f^P(x)_1 = \min_{p \in P} \|x - p\|$. The remaining coordinates can be used to filter our space by other parameters of interest. For example, one could choose the second parameter to be a density function in order to introduce robustness to outliers of the point cloud. Alternatively a second parameter could be chosen to track a numeric property associated to the points of the point cloud such as charge or mass in the case that the points in the point cloud represent atoms (Keller et al., 2018). The resulting sublevel-set multiparameter persistence module will then encode the topology of the spatial distribution of the point cloud and its interdependence with other chosen filtration parameters. We provide such constructions and example computations in Section 6.

We can give a compact mathematical description of multiparameter persistence modules as objects of functor categories or graded modules over polynomial rings. Let $P_n$ denote the monoid ring of the monoid $([0, \infty)^n, +)$ over a field $F$. Equivalently one may think of $P_n$ as a pseudo-polynomial ring $F[x_1, \ldots, x_n]$ in which exponents are only required to be non-negative and can be non-integral. Let $A_n$ denote the polynomial ring $F[x_1, \ldots, x_n]$ or analogously the monoid ring of $(\mathbb{N}^n, +)$ over $F$. Let $\mathbf{P}$ denote the category associated to the poset $P$, so that $\mathbf{R}^n$ and $\mathbf{Z}^n$ denote the categories associated to the posets $(\mathbb{R}^n, \le)$ and $(\mathbb{Z}^n, \le)$ under the standard coordinate-wise partial orders. Let $\mathbf{Vect}$ denote the category of vector spaces and linear maps over $F$, and $\mathbf{vect}$ denote the subcategory of finite dimensional vector spaces. Moreover, for a category $\mathbf{C}$, let us denote the $\mathbf{C}$-valued functors on $\mathbf{P}$ by $\mathbf{C}^{\mathbf{P}}$.

Definition 1 (Multiparameter Persistence Module) Let $M$ be a module over the ring $P_n$. We say $M$ is a persistence module if $M$ is an $\mathbb{R}^n$-graded $P_n$-module. That is to say $M$ has a decomposition as an $F$-vector space $M = \bigoplus_{a \in \mathbb{R}^n} M_a$ compatible with the action of $P_n$:

$$m \in M_a \implies x^b \cdot m \in M_{a+b}.$$

We require a morphism of $\mathbb{R}^n$-graded modules $f : M \to N$ to be compatible with the module structure, $f(x^b \cdot m) = x^b \cdot f(m)$, and to respect the grading of the modules so that $m \in M_a$ implies $f(m) \in N_a$.

In the setting of a sublevel-set persistence module the vector space at each grade is $H(X_a)$, and the action of $x^b$ on $H(X_a)$ is given by the linear map of homology groups induced by the inclusion $X_a \hookrightarrow X_{a+b}$.


Definition 2 (Multiparameter Persistence Module) Let $M$ be an element of the functor category $\mathbf{Vect}^{\mathbf{R}^n}$. We say $M$ is a multiparameter persistence module. A morphism of persistence modules is a natural transformation $M \to N$.

Definition 1 and Definition 2 are equivalent as realised by an equivalence of categories between $\mathbb{R}^n$-graded $P_n$-Mod and $\mathbf{Vect}^{\mathbf{R}^n}$ (Lesnick, 2015).

The theory of persistent homology is well developed for topological spaces filtered by a single parameter. Under appropriate finiteness conditions, one can record the homological features of the filtered topological space captured by the single parameter persistence module in a multiset of intervals known as the barcode. Moreover the Krull-Schmidt Decomposition Theorem establishes the barcode as a complete invariant (Crawley-Boevey, 2015). No such discrete complete invariant exists for multiparameter modules. However, associated to a pointwise finite dimensional multiparameter module is a family of single parameter modules whose collection of barcodes is known as the fibered barcode (Lesnick and Wright, 2015). Since we are working with persistence modules induced by real world finite data, the pointwise finite dimensionality assumption we require for the fibered barcode to exist is not a serious restriction. We shall see later that we will be able to reduce the computation of our multiparameter module invariant, the multiparameter persistence landscapes, to queries of the fibered barcode.

Definition 3 (Fibered Barcode) (Lesnick and Wright, 2015) Let $L$ denote the subposet of $\mathbb{R}^n$ corresponding to a positively sloped line $L \subset \mathbb{R}^n$. Let $\iota_L : (\mathbb{R}, \|\cdot\|_\infty) \to (\mathbb{R}^n, \|\cdot\|_\infty)$ denote the isometric embedding with $\iota_L(\mathbb{R}) = L$ and $\iota_L(0) \in \{x_n = 0\}$. For $M \in \mathbf{vect}^{\mathbf{R}^n}$ the composite $M^L = M \circ \iota_L$ is a pointwise finite-dimensional single parameter persistence module, and thus has an associated barcode $B(M^L)$. Let $\mathcal{L}$ denote the set of positively sloped lines. The collection $\{B(M^L) : L \in \mathcal{L}\}$ is known as the fibered barcode of $M$.

In view of applications of multiparameter persistence to data analysis we would like a distance function to compare multiparameter persistence modules. The interleaving distance, $d_I$, is a pseudo-metric on multiparameter persistence modules and provides a notion of algebraic similarity. Details of the interleaving distance can be found in Appendix A.2. This pseudo-metric has been shown to be the most discriminating stable metric on multiparameter persistence modules (Lesnick, 2015). A comprehensive mathematical formulation of metrics for generalized persistence modules is presented in Bubenik et al. (2015). Despite the strong theoretical properties which make the interleaving distance a good candidate to compare multiparameter persistence modules, it has a couple of undesirable properties for data analysis. The interleaving distance has been shown to be NP-hard to compute (Bjerkevik and Botnan, 2018; Bjerkevik et al., 2019) and so is not feasible to compute in applications. Moreover the interleaving distance has the characteristics of an $L^\infty$-style distance, in that it measures the worst place two modules match up over the parameter space. In contrast, an $L^p$-style distance, for $p \in [1, \infty)$, is sensitive to the difference of two modules over the whole parameter space.

2.1. Interval Decomposable Modules

Given the complicated nature of unconstrained multiparameter persistence modules, it is common to consider subclasses of multiparameter persistence modules. These modules admit a discrete complete description in analogy with single parameter modules. We can use the decomposition of this subclass of modules into simple summands to define matching distances between these modules. Recall that we denote the category of finite dimensional vector spaces as $\mathbf{vect}$.

Definition 4 (Interval Decomposable Modules) We define a subposet $I \subset \mathbb{R}^n$ to be an interval if $s, t \in I$ and $s \le r \le t$ implies $r \in I$, and for any $s, t \in I$ there exist $r_i \in I$ connecting $s$ and $t$ by a zigzag, $s = r_0 \le r_1 \ge r_2 \le r_3 \ge \cdots \le r_n = t$. The interval module $\mathbb{1}_I \in \mathbf{vect}^{\mathbf{R}^n}$ associated to an interval $I$ has a one dimensional vector space at each $a \in I$ and internal isomorphisms given by the identity wherever possible. We say a module $M \in \mathbf{vect}^{\mathbf{R}^n}$ is interval decomposable if $M = \bigoplus_{j \in J} \mathbb{1}_{I_j}$, for some indexed set of intervals $\{I_j : j \in J\}$.

The Krull-Schmidt-Remak-Azumaya Theorem guarantees that the decomposition of an interval decomposable module is unique up to reordering. We can thus assign the indexed set of intervals in the decomposition of a module $M$ to be the barcode $B(M) = \{I_j : j \in J\}$.

Definition 5 (ε-Matching) Let $\{I_j : j \in J\}$ and $\{J_k : k \in K\}$ be indexed sets of intervals. We say a partial bijection $\sigma : J \to K$ is an ε-matching if $d_I(\mathbb{1}_{I_j}, \mathbb{1}_{J_{\sigma(j)}}) \le \varepsilon$ for matched intervals and $d_I(\mathbb{1}_{I_j}, 0),\ d_I(0, \mathbb{1}_{J_k}) \le \varepsilon$ for unmatched intervals.

Definition 6 (Bottleneck Distance) Let $M$ and $N$ be interval decomposable modules. The bottleneck distance between the modules is given by

$$d_B(M, N) = \inf\{\varepsilon \ge 0 : B(M), B(N) \text{ admit an } \varepsilon\text{-matching}\}.$$

One would hope to attain a result analogous to the isometry theorem for ordinary one dimensional persistent homology relating the bottleneck distance and the interleaving distance. In the single parameter case an ε-interleaving induces an ε-matching between summands (Bauer and Lesnick, 2014). In contrast, the interleaving distance and bottleneck distance do not coincide for multiparameter interval decomposable modules. Certainly the bottleneck distance provides an upper bound on the interleaving distance. However general interleavings of interval decomposable multiparameter modules do not necessarily induce a matching of interval summands. This is best illustrated by an example provided in Bjerkevik (2016) for which the optimal matching between 1-interleaved modules is a 3-matching. We can further define a Wasserstein distance for interval decomposable modules.

Definition 7 (p-Wasserstein Distance) Let $M, N$ be interval decomposable persistence modules with barcodes $\{I_j : j \in J\}$ and $\{J_\kappa : \kappa \in K\}$ respectively. Append a collection of empty intervals to each barcode of cardinality the size of the other barcode. For a matching $\sigma : J \to K$, let $\varepsilon_j = d_I(\mathbb{1}_{I_j}, \mathbb{1}_{J_{\sigma(j)}})$. The p-Wasserstein distance is given by

$$d_{W_p}(M, N) = \inf_{\sigma : J \to K} \Big( \sum_{j \in J} \varepsilon_j^p \Big)^{1/p}.$$

The bottleneck distance is simply the $\infty$-Wasserstein distance. If we wish to place extra emphasis on intervals with large persistence we may use the persistence weighted p-Wasserstein distance (Definition 9). In order to ensure the persistence weighted p-Wasserstein distance is well defined we need to check that intervals are Lebesgue measurable.

Proposition 8 If $I \subset \mathbb{R}^n$ is an interval of the partially ordered set $\mathbb{R}^n$, then $I$ is Lebesgue measurable.

Proof Recall the following characterisation of Lebesgue measurable sets. A set $I \subset \mathbb{R}^n$ is Lebesgue measurable if and only if for all $\varepsilon > 0$ there exists an open set $U$ and a closed set $C$ such that $U \subset I \subset C$, and the Lebesgue measure $\mu(C \setminus U) < \varepsilon$. To establish that intervals are measurable it suffices to show that for any interval $I \subset \mathbb{R}^n$ there is some open set $U \subset \mathbb{R}^n$ and some closed set $C \subset \mathbb{R}^n$ (open and closed under the standard Euclidean topology) such that $U \subset I \subset C$ and $C \setminus U$ has zero measure. We shall first show that the interior of the closure of an interval, $(\overline{I})^\circ$, coincides with the interior of that interval: $(\overline{I})^\circ = I^\circ$. If $x \in (\overline{I})^\circ$, then there is some $\infty$-norm ball contained in $\overline{I}$ containing $x$ in its interior. Thus there exist $a, b \in \overline{I}$ with $a_i < x_i < b_i$ for all $i = 1, \ldots, n$. The interval $I$ is dense in $\overline{I}$ so we can perturb $a$ and $b$ to $a', b' \in I$ with $a'_i < x_i < b'_i$. Since $I$ is an interval containing $a', b'$ we have that the open set $\{y \in \mathbb{R}^n : a'_i < y_i < b'_i\}$ is contained in $I$, and so $x \in I^\circ$. The set $\overline{I}$ is closed and so it is Lebesgue measurable and in particular $\overline{I} \setminus (\overline{I})^\circ$ has zero measure. Thus $\overline{I} \setminus I^\circ = \overline{I} \setminus (\overline{I})^\circ$ also has zero measure and we have found the requisite $U = I^\circ$ and $C = \overline{I}$ witnessing that $I$ is measurable.

Definition 9 (Persistence Weighted p-Wasserstein Distance) Let $M, N$ be interval decomposable persistence modules with barcodes $\{I_j : j \in J\}$ and $\{J_\kappa : \kappa \in K\}$. For a union of a pair of intervals $I \cup J \subset \mathbb{R}^n$ let $|I \cup J|$ denote the Lebesgue measure. With $\varepsilon_j = d_I(\mathbb{1}_{I_j}, \mathbb{1}_{J_{\sigma(j)}})$ as in Definition 7, the persistence weighted p-Wasserstein distance is given by

$$d_{\overline{W}_p}(M, N) = \inf_{\sigma : J \to K} \Big( \sum_{j \in J} |I_j \cup J_{\sigma(j)}| \, \varepsilon_j^p \Big)^{1/p}.$$

The p-landscape distance we introduce in the following section is similar to the persistence weighted p-Wasserstein distance and can be defined for persistence modules which do not admit an interval decomposition. In Section 4 we will show that our invariant is stable with respect to the interleaving distance and the persistence weighted Wasserstein distance. In particular the $L^p$ vector space norms on multiparameter landscapes provide stable, computable distance functions for multiparameter persistence modules.

3. Persistence Landscapes

In this section we shall recall the definition of the single parameter persistence landscape and its properties. We shall generalize the definition to multiparameter persistence modules and show which properties of the single parameter persistence landscape are preserved. From this point onward all single parameter persistence modules we consider shall be pointwise finite dimensional in order that they admit an interval decomposition (Crawley-Boevey, 2015). The multiparameter persistence modules we consider will be pointwise finite dimensional, but will not necessarily admit an interval decomposition.

3.1. Single Parameter Persistence Landscapes

The persistence landscape associated to a single parameter persistence module is defined in Bubenik (2015). The persistence landscape is derived from the rank invariant of a module.

Definition 10 (Rank Invariant) Let $M \in \mathbf{vect}^{\mathbf{R}}$ be a persistence module. For $a \le b$ the function $\beta^{\cdot,\cdot}$, giving the corresponding Betti number, is the rank invariant of $M$:

$$\beta^{a,b} = \dim(\mathrm{Im}(M_a \to M_b)).$$

Definition 11 (Rank Function) The rank function $\mathrm{rk} : \mathbb{R}^2 \to \mathbb{R}$ is given by

$$\mathrm{rk}(b, d) = \begin{cases} \beta^{b,d} & \text{if } b \le d \\ 0 & \text{otherwise.} \end{cases}$$

Definition 12 (Rescaled Rank Function) The rescaled rank function $r : \mathbb{R}^2 \to \mathbb{R}$ is supported on the upper half plane:

$$r(m, h) = \begin{cases} \beta^{m-h,m+h} & \text{if } h \ge 0 \\ 0 & \text{otherwise.} \end{cases}$$

Observe that the rank function has support contained in the upper triangular half of the plane, with the coordinates corresponding to births and deaths, whilst the rescaled rank function has support contained in the upper half plane, with coordinates corresponding to midpoints and half-lives.
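For an interval decomposable single parameter module the three functions above can be read off the barcode. The sketch below assumes, purely for illustration, that every bar is a half-open interval $[\text{birth}, \text{death})$, so that $\beta^{a,b}$ counts the bars containing both $a$ and $b$; the function names and the sample barcode are hypothetical.

```python
def rank_invariant(barcode, a, b):
    """beta^{a,b} for a <= b: number of bars [birth, death) containing a and b."""
    return sum(1 for birth, death in barcode if birth <= a and b < death)

def rank_function(barcode, b, d):
    """rk(b, d): the rank invariant in birth/death coordinates, 0 when b > d."""
    return rank_invariant(barcode, b, d) if b <= d else 0

def rescaled_rank_function(barcode, m, h):
    """r(m, h): the rank invariant in midpoint/half-life coordinates, 0 when h < 0."""
    return rank_invariant(barcode, m - h, m + h) if h >= 0 else 0

# Hypothetical barcode with two nested bars.
barcode = [(0.0, 4.0), (1.0, 3.0)]

print(rank_function(barcode, 1.0, 2.0))           # both bars contain [1, 2]
print(rank_function(barcode, 2.0, 1.0))           # outside the support b <= d
print(rescaled_rank_function(barcode, 2.0, 1.5))  # only the long bar contains [0.5, 3.5]
```

The change of variables $(b, d) = (m - h, m + h)$ is the only difference between the two functions, which is why `rescaled_rank_function` simply delegates to `rank_invariant`.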

Definition 13 (Persistence Landscape) (Bubenik, 2015) The persistence landscape is a function $\lambda : \mathbb{N} \times \mathbb{R} \to \overline{\mathbb{R}}$, where $\overline{\mathbb{R}}$ denotes the extended real numbers, $[-\infty, \infty]$. The function $\lambda(k, \cdot) : \mathbb{R} \to \overline{\mathbb{R}}$ is defined by

$$\lambda(k, t) = \sup\{h \ge 0 : \beta^{t-h,t+h} \ge k\}.$$

The value $\lambda(k, t)$ gives the maximal radius of an interval centred at $t$ that is contained in at least $k$ intervals of the barcode. The persistence landscape and persistence diagram of a suitably well-behaved single parameter module carry the same information (Betthauser et al., 2019). Alternatively the persistence landscape of a single parameter module can be derived from the landscape functions of the module's interval summands; see Figure 1.

Figure 1: We show the persistence diagram of a single parameter module $M$ on the left and the associated persistence landscapes $\lambda(M)(1, x)$, $\lambda(M)(2, x)$, $\lambda(M)(3, x)$ on the right. One can see that $\lambda(M)(k, t)$ is the kmax of the landscape functions $\lambda((b_j, d_j))(1, t)$ of the interval summands of $M$.

Definition 14 (Persistence Landscape) (Bubenik, 2015) Let $M$ be a single parameter persistence module with associated persistence diagram given by the indexed set $\{(b_j, d_j) : j \in J\}$. The persistence landscape of $M$ may be equivalently defined as:

$$\lambda(k, t) = \operatorname{kmax}_{j \in J} \{\lambda((b_j, d_j))(1, t)\}$$

where kmax denotes the $k$th largest value of the indexed set and $\lambda((b_j, d_j))$ is the landscape associated to the interval module $\mathbb{1}_{(b_j, d_j)}$.
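The kmax formulation translates directly into code. In the minimal sketch below the landscape of a single bar $(b, d)$ is the tent function $t \mapsto \max(0, \min(t - b, d - t))$, and $\lambda(k, t)$ is the $k$th largest tent value; we adopt the convention that the value is 0 when fewer than $k$ bars exist. The function names and the sample barcode are hypothetical.

```python
def tent(birth, death, t):
    """Landscape lambda((birth, death))(1, t) of a single interval module."""
    return max(0.0, min(t - birth, death - t))

def landscape(barcode, k, t):
    """lambda(k, t): the k-th largest tent value at t (k is 1-indexed)."""
    values = sorted((tent(b, d, t) for b, d in barcode), reverse=True)
    return values[k - 1] if k <= len(values) else 0.0

# Hypothetical barcode with two nested bars.
barcode = [(0.0, 4.0), (1.0, 3.0)]

for k in (1, 2, 3):
    print(k, landscape(barcode, k, 2.0))
```

At $t = 2$ the two tents have heights 2 and 1, so the first two landscape layers take these values and the third vanishes.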

Lemma 15 (Bubenik, 2015) The persistence landscape has the following properties:

1. $\lambda(k, t) \ge 0$.

2. $\lambda(k, t) \ge \lambda(k + 1, t)$.

3. $\lambda(k, t)$ is 1-Lipschitz.

The first two properties are immediate from the definition and the third property is proved by Bubenik (2015).

3.2. Multiparameter Persistence Landscapes

Let us define the multiparameter persistence landscape in analogy with the single parameter case. The rank invariant, rank function and rescaled rank function defined above generalise naturally to multiparameter persistence modules:

Definition 16 (Rank Invariant) Let $M \in \mathbf{vect}^{\mathbf{R}^n}$ be a multiparameter persistence module. For $a \le b$ the function $\beta^{\cdot,\cdot}$, giving the corresponding Betti number, is the rank invariant of $M$: $\beta^{a,b} = \dim(\mathrm{Im}(M_a \to M_b))$.

Definition 17 (Multiparameter Rank Function) The rank function $\mathrm{rk} : \mathbb{R}^{2n} \to \mathbb{R}$ is given by

$$\mathrm{rk}(b, d) = \begin{cases} \beta^{b,d} & \text{if } b \le d \\ 0 & \text{otherwise.} \end{cases}$$

Definition 18 (Rescaled Multiparameter Rank Function) The rescaled rank function $r : \mathbb{R}^{2n} \to \mathbb{R}$ is given by

$$r(m, h) = \begin{cases} \beta^{m-h,m+h} & \text{if } h \ge 0 \\ 0 & \text{otherwise.} \end{cases}$$

One could perform statistical analysis directly on the rank function and rescaled rank function. However, endowed with the $L^p$ normed vector space structure, these functions are not stable with respect to the interleaving distance for any $p \in [1, \infty]$.

Example 2 (Rank Function Instability) For $\varepsilon > 0$ and $N \in \mathbb{N}$ consider the multiparameter persistence module $M$ with presentation $M = \langle \{(a_i, 0)\}_{i=1,\ldots,N} \mid \{x^{\varepsilon} a_i\}_{i=1,\ldots,N} \rangle$. The persistence module $M$ is such that $\|\mathrm{rk}\, M\|_\infty = N$, $\|\mathrm{rk}\, M\|_p = \infty$ for all $p \in [1, \infty)$, and $d_I(M, 0) = \varepsilon$. Details of presentations and the interleaving distance $d_I$ for multiparameter persistence modules can be found in Appendix A.

We wish to deﬁne a stable invariant and we derive a landscape function from the rank invariant.

Definition 19 (Multiparameter Persistence Landscape) The multiparameter persistence landscape is the function $\lambda : \mathbb{N} \times \mathbb{R}^n \to \overline{\mathbb{R}}$ which records the maximal radius over which $k$ features persist in every (positive) direction through $x$ in the parameter space:

$$\lambda(k, x) = \sup\{\varepsilon \ge 0 : \beta^{x-h,x+h} \ge k \text{ for all } h \ge 0 \text{ with } \|h\|_\infty \le \varepsilon\}.$$

It is worth noting that when restricted to a single parameter persistence module this definition coincides with the single parameter persistence landscape (Bubenik, 2015). If there are multiple persistence modules under consideration we shall denote the landscape associated to module $M$ as $\lambda(M)(k, x)$.

Lemma 20 The multiparameter persistence landscape has the following properties:

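1. $\lambda(k, x) \ge 0$.

2. $\lambda(k, x) \ge \lambda(k + 1, x)$.

3. $\lambda(k, x)$ is 1-Lipschitz.

Proof The first two properties follow immediately from the definition. Let $x, y \in \mathbb{R}^n$ and let $k \in \mathbb{N}$. Without loss of generality assume that $\lambda(k, x) \ge \lambda(k, y)$ and that also $r = \lambda(k, x) \ge \|x - y\|_\infty = \delta$. We seek to show that $\lambda(k, y) \ge \lambda(k, x) - \|x - y\|_\infty$. For any $\varepsilon \ge 0$ such that $\|\varepsilon\|_\infty \le r - \delta$ let us define $h = (|x_i - y_i| + \varepsilon_i)_i$. We observe that $\|h\|_\infty \le \|x - y\|_\infty + \|\varepsilon\|_\infty \le r$ and in particular that:

$$x - h \le y - \varepsilon \le y + \varepsilon \le x + h.$$

Thus $\|h\|_\infty \le r$ means the map $M(x - h \le x + h)$ has rank at least $k$; since this map factors through $M(y - \varepsilon \le y + \varepsilon)$, the latter also has rank at least $k$. Since $\varepsilon$ was arbitrary we see that $\lambda(k, y) \ge r - \delta$.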


In extending the persistence landscape definition to multiparameter persistence modules we encountered a choice of p-norm for the ball in $\mathbb{R}^n$ over which we ask features persist. Whilst Lemma 20 holds for all choices of p-norm it transpires that the most natural choice is the $\infty$-norm. This choice considerably simplifies computation of the multiparameter persistence landscape and gives rise to a number of further properties which we explore in this section.

Lemma 21 Let $M$ be a multiparameter persistence module with rank invariant $\beta_M^{\cdot,\cdot}$. For all $\mathbf{h} \ge 0$ with $\|\mathbf{h}\|_\infty = h$ we have that $\beta_M^{x-h\mathbf{1},x+h\mathbf{1}} \le \beta_M^{x-\mathbf{h},x+\mathbf{h}}$.

Proof For all $\mathbf{h} \ge 0$ with $\|\mathbf{h}\|_\infty = h$ we have that $x - h\mathbf{1} \le x - \mathbf{h} \le x + \mathbf{h} \le x + h\mathbf{1}$. Hence the linear map $M(x - h\mathbf{1} \le x + h\mathbf{1})$ factors through the map $M(x - \mathbf{h} \le x + \mathbf{h})$ and thus the result follows.

In view of Lemma 21, we see that in order to compute the value of a multiparameter persistence landscape $\lambda(k, x)$ at the point $x = x_0$, we only need to compute $\sup\{\varepsilon \ge 0 : \beta^{x_0-\varepsilon\mathbf{1},x_0+\varepsilon\mathbf{1}} \ge k\}$. Thus we only need to compute a single barcode in the fibered barcode to compute the landscape value at a point.
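Since the rank along the diagonal can only decrease as $\varepsilon$ grows, the supremum above can be approximated by bisection whenever the rank invariant can be queried. The sketch below is one possible way to do this and is not the method used in the text (which queries the fibered barcode). It assumes a callable `beta(a, b)` returning the rank for $a \le b$; the helper `rectangle_rank`, the search bound `eps_max` and the test module are hypothetical and only serve as a toy oracle for direct sums of closed rectangles.

```python
def landscape_from_rank(beta, k, x, eps_max=1e3, tol=1e-9):
    """Approximate sup{eps >= 0 : beta(x - eps*1, x + eps*1) >= k} by bisection."""
    def persists(eps):
        lower = tuple(xi - eps for xi in x)
        upper = tuple(xi + eps for xi in x)
        return beta(lower, upper) >= k

    if not persists(0.0):
        return 0.0      # convention: the landscape is 0 when no radius works
    if persists(eps_max):
        return eps_max  # features persist beyond the search window
    lo, hi = 0.0, eps_max
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if persists(mid):
            lo = mid
        else:
            hi = mid
    return lo

def rectangle_rank(rectangles):
    """Toy rank oracle for a direct sum of closed rectangle interval modules."""
    def beta(a, b):
        return sum(
            1 for lo, hi in rectangles
            if all(l <= ai and bi <= h for l, ai, bi, h in zip(lo, a, b, hi))
        )
    return beta

# Hypothetical 2-parameter module: two nested rectangles.
beta = rectangle_rank([((0.0, 1.0), (10.0, 2.0)), ((4.0, 1.0), (6.0, 2.0))])
print(landscape_from_rank(beta, 1, (5.0, 1.5)))
print(landscape_from_rank(beta, 2, (4.2, 1.5)))
```

Only pairs of the form $(x - \varepsilon\mathbf{1}, x + \varepsilon\mathbf{1})$ are ever queried, which is exactly the reduction that Lemma 21 provides.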

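Proposition 22 Let $M$ be a persistence module and let $L$ be a line of slope 1 through the parameter space $\mathbb{R}^n$. Let $\iota_L : (\mathbb{R}, \|\cdot\|_\infty) \to (\mathbb{R}^n, \|\cdot\|_\infty)$ denote the isometric embedding with $\iota_L(\mathbb{R}) = L$ and $\iota_L(0) \in \{x_n = 0\}$. The restriction of the multiparameter landscape of $M$ to $L$ and the single parameter persistence landscape of the persistence module $M^L$ coincide, $\lambda(M^L)(k, t) = \lambda(M)(k, \iota_L(t))$.

Proof Following the landscape definitions we see that if $\lambda(M)(k, \iota_L(t)) > h$ then we have that $\beta_M^{\iota_L(t)-h\mathbf{1},\iota_L(t)+h\mathbf{1}} = \beta_{M^L}^{t-h,t+h} \ge k$ and so $\lambda(M^L)(k, t) > h$, thus $\lambda(M^L)(k, t) \ge \lambda(M)(k, \iota_L(t))$. Conversely, if $\lambda(M^L)(k, t) > h$ then $\beta_M^{\iota_L(t)-h\mathbf{1},\iota_L(t)+h\mathbf{1}} \ge k$, and so by Lemma 21 we have that $\beta_M^{\iota_L(t)-\mathbf{h},\iota_L(t)+\mathbf{h}} \ge \beta_M^{\iota_L(t)-h\mathbf{1},\iota_L(t)+h\mathbf{1}} \ge k$ for all $\mathbf{h} \ge 0$ with $\|\mathbf{h}\|_\infty = h$ and thus $\lambda(M^L)(k, t) \le \lambda(M)(k, \iota_L(t))$.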
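The statement of Proposition 22 can be checked numerically on a toy module. The sketch below assumes the module is a direct sum of closed rectangle interval modules and parametrises the slope-1 line through a base point $p$ (with $p_n = 0$) as $\iota_L(t) = p + t\mathbf{1}$. It also uses the closed form $\max(0, \min_i \min(x_i - a_i, b_i - x_i))$ for the landscape of a rectangle $[a, b]$, which follows from Lemma 21. All function names and the sample data are hypothetical.

```python
def kth_largest(values, k):
    """k-th largest entry (1-indexed), or 0.0 if there are fewer than k entries."""
    ordered = sorted(values, reverse=True)
    return ordered[k - 1] if k <= len(ordered) else 0.0

def restrict_to_diagonal_line(rectangles, p):
    """Barcode of the restriction to the line t -> p + t*(1, ..., 1)."""
    bars = []
    for lo, hi in rectangles:
        birth = max(l - pi for l, pi in zip(lo, p))
        death = min(h - pi for h, pi in zip(hi, p))
        if birth < death:  # the line actually meets the rectangle
            bars.append((birth, death))
    return bars

def landscape_1d(bars, k, t):
    """Single parameter landscape via the k-th largest tent function."""
    return kth_largest([max(0.0, min(t - b, d - t)) for b, d in bars], k)

def landscape_nd(rectangles, k, x):
    """Multiparameter landscape of a direct sum of closed rectangles."""
    return kth_largest(
        [max(0.0, min(min(xi - l, h - xi) for l, xi, h in zip(lo, x, hi)))
         for lo, hi in rectangles],
        k,
    )

rectangles = [((0.0, 1.0), (10.0, 2.0)), ((4.0, 1.0), (6.0, 2.0))]
p = (3.5, 0.0)  # base point of the line, lying on {x_2 = 0}
bars = restrict_to_diagonal_line(rectangles, p)

t = 1.5
x = tuple(pi + t for pi in p)
print(bars, landscape_1d(bars, 1, t), landscape_nd(rectangles, 1, x))
```

Agreement of the two printed landscape values at every $t$ is what the proposition asserts for this line.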

An immediate consequence of Proposition 22 is that like the single parameter persistence landscape, the multiparameter persistence landscape also admits a decomposition as the kmax of a series of simple landscape functions when our module is interval decomposable.

Proposition 23 The multiparameter persistence landscape of an interval decomposable module $M = \bigoplus_j \mathbb{1}_{I_j}$ can be equivalently defined as:

$$\lambda(M)(k, x) = \operatorname{kmax}_j \{\lambda(\mathbb{1}_{I_j})(1, x)\}.$$

Proposition 24 follows from Proposition 22 and the fact that the map from persistence diagram to persistence landscape is invertible for finitely presented single parameter persistence modules (Betthauser et al., 2019). For completeness, we provide a proof of Proposition 24. See Appendix A.1 and Definition 45 for multiparameter persistence module presentations.
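As an illustrative computation (a sketch under our own assumption that the interval is a closed rectangle $[a, b] = \{y \in \mathbb{R}^n : a \le y \le b\}$, with the convention that the supremum of the empty set is 0), the landscape of a single summand can be written in closed form. By Lemma 21 it suffices to test diagonal perturbations:

```latex
\begin{align*}
\lambda(\mathbb{1}_{[a,b]})(1, x)
  &= \sup\{\varepsilon \ge 0 : x - \varepsilon\mathbf{1} \in [a,b]
       \text{ and } x + \varepsilon\mathbf{1} \in [a,b]\} \\
  &= \sup\{\varepsilon \ge 0 : a_i \le x_i - \varepsilon
       \text{ and } x_i + \varepsilon \le b_i \text{ for all } i\} \\
  &= \max\Big(0,\ \min_{1 \le i \le n} \min(x_i - a_i,\ b_i - x_i)\Big).
\end{align*}
```

For a direct sum of such rectangles, the $k$th landscape at $x$ is then the $k$th largest of these values.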

Figure 2: We illustrate a pair of interval decomposable multiparameter persistence modules, $M, N \in \mathbf{vect}^{\mathbf{R}^2}$, with $M = [(0, 1), (10, 2)] \oplus [(4, 1), (6, 2)]$ and $N = [(0, 1), (6, 2)] \oplus [(4, 1), (10, 2)]$, which have distinct rank invariants but the same multiparameter persistence landscapes. Each module is the direct sum of two rectangular intervals. The first summand of each module is shaded with green dots and the second summand is shaded in solid red.

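Proposition 24 Let $M \in \mathbf{vect}^{\mathbf{R}^n}_{\mathrm{fin}}$ be a finitely presented multiparameter persistence module and let $\lambda : \mathbb{N} \times \mathbb{R}^n \to \overline{\mathbb{R}}$ be the associated multiparameter persistence landscape. Using $\lambda$ we can recover the rank invariant of $M$ on the set of pairs of parameter values which lie on lines of slope 1, $\{(a, a + r\mathbf{1}) : a \in \mathbb{R}^n, r \ge 0\}$.

Proof The rank invariant on the set $\{(a, a + r\mathbf{1}) : a \in \mathbb{R}^n, r \ge 0\}$ is derived from the landscape as follows:

$$\beta^{a,a+r\mathbf{1}} = \limsup_{\delta \to 0^+} \max\Big\{k \in \mathbb{N} : \lambda\Big(k, a + \tfrac{(r+\delta)}{2}\mathbf{1}\Big) \ge \tfrac{(r+\delta)}{2}\Big\}.$$

Consider a pair of parameter values $(a, a + r\mathbf{1})$. If $\beta^{a,a+r\mathbf{1}} \ge k$, since $M$ is finitely presented, $\beta^{a,a+(r+\delta)\mathbf{1}} \ge k$ for all sufficiently small $\delta$. Let $c_\delta = a + \tfrac{(r+\delta)}{2}\mathbf{1}$. By Lemma 21, $\beta_M^{c_\delta-\mathbf{h},c_\delta+\mathbf{h}} \ge k$ for all $\mathbf{h} \ge 0$ with $\|\mathbf{h}\|_\infty \le \tfrac{(r+\delta)}{2}$, and thus we have that $\lambda(k, c_\delta) \ge \tfrac{(r+\delta)}{2}$ for sufficiently small $\delta$. Conversely, suppose that $\lambda(k, c_\delta) \ge \tfrac{(r+\delta)}{2}$ for some small positive $\delta$. This implies that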

βa,a+r1 βa,a+(r+ δ

The following example illustrates two modules with distinct rank invariants which are not distinguished by their multiparameter persistence landscapes.

Example 3 Denote by $[a, b]$ the rectangular interval module with opposite vertices $a, b$. Let $M = [(0, 1), (10, 2)] \oplus [(4, 1), (6, 2)]$ and $N = [(0, 1), (6, 2)] \oplus [(4, 1), (10, 2)]$ be interval decomposable 2-parameter persistence modules. These two modules have different rank invariants ($\beta_M^{(0,1),(10,2)} = 1 \neq 0 = \beta_N^{(0,1),(10,2)}$) but the same multiparameter persistence landscapes, see Figure 2.

One would hope that a multiparameter module invariant could distinguish the modules in the previous example. The multiparameter persistence landscape fails to distinguish these modules since the rank invariants of these modules coincide on all pairs of parameter values lying on lines of slope 1. This in turn occurs since the overlap between the summands in the $x_1$-coordinate is greater than their significance in the $x_2$-coordinate. Thus, altering $x_1, x_2$ at the same rate, we cannot detect the interaction between the summands with the multiparameter persistence landscape. A simple reparametrisation scaling the parameters $x_i$ appropriately would allow us to distinguish these modules and motivates the following definition.

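Definition 25 (w-Weighted Persistence Landscape) Let $w \in \{u \in \mathbb{R}^n : u_i > 0, \|u\|_\infty = 1\}$ be a weighting vector corresponding to a rescaling of the parameter space $\mathbb{R}^n$. Define the $w$-weighted infinity norm to be $\|h\|_w = \|(w_i h_i)_i\|_\infty$. The $w$-Weighted Persistence Landscape is the function $\lambda_w : \mathbb{N} \times \mathbb{R}^n \to \mathbb{R}$ given by

$$\lambda_w(k, x) = \sup\{\varepsilon \geq 0 : \beta^{x - h, x + h} \geq k \text{ for all } h \geq 0 \text{ with } \|h\|_w \leq \varepsilon\}.$$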

Remark 26 The ordinary multiparameter persistence landscape is the 1-weighted persistence landscape.

The example illustrated in Figure 2 exhibits the dependence on the relative scaling of the significant parameter values in a multiparameter persistence module and highlights the importance of normalisation or choice of weighting vectors $w$ in practical applications. Consider the weighted landscape with weighting $w = (\tfrac{1}{10}, 1)$ for the examples in Figure 2. This weighted landscape distinguishes the two modules: indeed $\lambda_w(M)(1, (5, 1.5)) = 0.5$ whereas $\lambda_w(N)(1, (5, 1.5)) = 0.1$. The weighting hyperparameter for multiparameter persistence landscapes is analogous to the Poisson parameter used for the Poisson-weighted persistence landscape kernel introduced in Bubenik (2018), and the weighting function for persistence images introduced in Adams et al. (2017). An appropriate choice of these hyperparameters depends on the applied setting.
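
The values quoted for Example 3 can be checked numerically. The following is a minimal sketch, assuming the module is a direct sum of rectangular interval modules, so that the rank $\beta^{x-h,x+h}$ counts the rectangles containing both $x - h$ and $x + h$ (boundary conventions are ignored). The function name `weighted_landscape` and the tuple encoding of rectangles are illustrative choices of ours, not part of any software used in this paper.

```python
from typing import Sequence, Tuple

# A rectangle is encoded as (lower corner, upper corner); hypothetical encoding.
Rect = Tuple[Sequence[float], Sequence[float]]


def weighted_landscape(rects: Sequence[Rect], w: Sequence[float],
                       k: int, x: Sequence[float]) -> float:
    """lambda_w(k, x) for a direct sum of rectangle modules (Definition 25).

    A rectangle [a, b] contains x - h and x + h when
    h_i <= min(x_i - a_i, b_i - x_i) for every i, so its contribution is
    min_i w_i * min(x_i - a_i, b_i - x_i), clipped below at zero.
    """
    values = []
    for lo, hi in rects:
        margin = min(wi * min(xi - a, b - xi)
                     for wi, xi, a, b in zip(w, x, lo, hi))
        values.append(max(0.0, margin))
    values.sort(reverse=True)
    # The k-th landscape is the k-th largest single-rectangle value.
    return values[k - 1] if k <= len(values) else 0.0


M = [((0, 1), (10, 2)), ((4, 1), (6, 2))]
N = [((0, 1), (6, 2)), ((4, 1), (10, 2))]
x = (5, 1.5)

unweighted = (weighted_landscape(M, (1, 1), 1, x),
              weighted_landscape(N, (1, 1), 1, x))
weighted = (weighted_landscape(M, (0.1, 1), 1, x),
            weighted_landscape(N, (0.1, 1), 1, x))
print(unweighted, weighted)
```

Under these assumptions the unweighted values agree for $M$ and $N$ at this point, while the weighting $w = (\tfrac{1}{10}, 1)$ separates them.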

Definition 27 (w-Rescaling) Let $\phi_w \in \mathrm{Aut}(\mathbb{R}^n)$ denote the invertible rescaling $\phi_w(x) = (w_i x_i)_i$ for $w \in \{u \in \mathbb{R}^n : u_i > 0\}$.

The following proposition makes precise the relationship between the weighted landscape and rescaling the parameter space.

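Proposition 28 Let $w$ be a rescaling vector and let $\lambda_w$ denote the function taking a module to its $w$-weighted persistence landscape. Let $(\phi_w)^*$ denote the pull back of $\phi_w$. The $w$-weighted persistence landscape is given by $\lambda_w = (\mathrm{id} \times \phi_w^{-1})^* \circ \lambda_{\mathbf{1}} \circ (\phi_w)^*$ and so the following diagram commutes:

$$\begin{array}{ccc}
\mathrm{vect}^{\mathbb{R}^n} & \xrightarrow{\;(\phi_w)^*\;} & \mathrm{vect}^{\mathbb{R}^n} \\
{\scriptstyle \lambda_w}\big\downarrow & & \big\downarrow{\scriptstyle \lambda_{\mathbf{1}}} \\
L^p(\mathbb{N} \times \mathbb{R}^n) & \xleftarrow{\;(\mathrm{id} \times \phi_w^{-1})^*\;} & L^p(\mathbb{N} \times \mathbb{R}^n)
\end{array}$$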

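Proof Let $M \in \mathrm{vect}^{\mathbb{R}^n}$. It is a straightforward definition chase to see that the diagram commutes. For convenience let us use the shorthand $a \odot b = (a_i b_i)_i$ for componentwise multiplication. Direct computation yields the result:

$$\begin{aligned}
\lambda_w(M)(k, x) &= \sup\{\varepsilon \geq 0 : \beta_M^{x - h,\, x + h} \geq k, \text{ for all } h \geq 0 \text{ with } \|h\|_w \leq \varepsilon\} \\
&= \sup\{\varepsilon \geq 0 : \beta_{M \circ \phi_w}^{w^{-1} \odot (x - h),\, w^{-1} \odot (x + h)} \geq k, \text{ for all } h \geq 0 \text{ with } \|w^{-1} \odot h\|_w \leq \varepsilon\} \\
&= \sup\{\varepsilon \geq 0 : \beta_{M \circ \phi_w}^{w^{-1} \odot x - t,\, w^{-1} \odot x + t} \geq k, \text{ for all } t \geq 0 \text{ with } \|t\|_\infty \leq \varepsilon\} \\
&= \lambda_{\mathbf{1}}(M \circ \phi_w)(k, w^{-1} \odot x) = \left((\mathrm{id} \times \phi_w^{-1})^* \circ \lambda_{\mathbf{1}} \circ (\phi_w)^*\right)(M)(k, x).
\end{aligned}$$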

4. Stability and Injectivity

In this section we shall show that the multiparameter landscapes are stable with respect to the interleaving distance and persistence-weighted Wasserstein distance. We will then provide an injectivity result that shows the collection of weighted persistence landscapes derived from a persistence module contains almost all the information in the rank invariant of that module.

For each $p \in [1, \infty]$ let us define a $p$-distance on the space of multiparameter landscapes, completely analogously with the definition in Bubenik (2015), where we implicitly view our landscapes as elements of the Lebesgue space $L^p(\mathbb{N} \times \mathbb{R}^n)$. Our landscapes are all measurable since they are continuous; however they may be unbounded. We can either choose to permit infinite distances or alternatively truncate our landscapes to a bounded region if we wish to ensure that our distances are finite.

Definition 29 (p-Landscape Distance) Let $M, N$ be multiparameter persistence modules. The $p$-landscape distance between $M$ and $N$, $d_\lambda^{(p)}(M, N)$, is defined to be:

$$d_\lambda^{(p)}(M, N) = \|\lambda(M) - \lambda(N)\|_p.$$

4.1. Stability

Unlike the inﬁnity norm of rank function and rank invariant, the inﬁnity norm of multiparameter persistence landscapes is stable with respect to the interleaving distance.

Theorem 30 (Multiparameter Persistence Landscape Stability) If $M, N \in \mathrm{vect}^{\mathbb{R}^n}$ are multiparameter persistence modules then the $\infty$-landscape distance of the multiparameter persistence landscapes is bounded by the interleaving distance $d_I$:

$$d_\lambda^{(\infty)}(M, N) \leq d_I(M, N).$$

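Proof Suppose $M, N$ are $\varepsilon$-interleaved. Let $x \in \mathbb{R}^n$ and assume without loss of generality that $r = \lambda(M)(k, x) \geq \lambda(N)(k, x)$ and that also $\lambda(M)(k, x) \geq \varepsilon$. For any $h \geq 0$ with $\|h\|_\infty < r - \varepsilon$ we have that $\|h + \varepsilon\mathbf{1}\|_\infty < r$. Since $r = \lambda(M)(k, x)$ we know that the map $M(x - (h + \varepsilon\mathbf{1}) \leq x + (h + \varepsilon\mathbf{1}))$ has rank at least $k$. An $\varepsilon$-interleaving gives rise to the commutative diagram:

$$\begin{array}{ccc}
M(x - (h + \varepsilon\mathbf{1})) & \longrightarrow & M(x + (h + \varepsilon\mathbf{1})) \\
\big\downarrow & & \big\uparrow \\
N(x - h) & \longrightarrow & N(x + h)
\end{array}$$

Since the top map factors through the bottom map, we see that the map $N(x - h \leq x + h)$ has rank at least $k$. Hence $\lambda(N)(k, x) \geq r - \varepsilon$ and thus the infinity norm distance between $\lambda(M)$ and $\lambda(N)$ is at most $\varepsilon$.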
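
As an informal numerical illustration of Theorem 30 (not a substitute for the proof), the sketch below moves every corner coordinate of the rectangles in a rectangle decomposable module by at most $\varepsilon$, which we assume yields an $\varepsilon$-interleaved module, and compares the first and second landscapes on a grid. The helper names, the grid and the random seed are hypothetical choices.

```python
import random


def landscape(rects, k, x):
    """Unweighted landscape lambda(k, x) of a direct sum of rectangle modules."""
    values = sorted(
        (max(0.0, min(min(xi - a, b - xi) for xi, a, b in zip(x, lo, hi)))
         for lo, hi in rects),
        reverse=True)
    return values[k - 1] if k <= len(values) else 0.0


def perturb(rects, eps, rng):
    """Move every corner coordinate of every rectangle by at most eps."""
    return [(tuple(a + rng.uniform(-eps, eps) for a in lo),
             tuple(b + rng.uniform(-eps, eps) for b in hi))
            for lo, hi in rects]


rng = random.Random(0)
eps = 0.05
M = [((0, 1), (10, 2)), ((4, 1), (6, 2))]
N = perturb(M, eps, rng)

# Sample the parameter region [0, 11] x [0, 3] on a regular grid.
grid = [(0.25 * i, 0.05 * j) for i in range(45) for j in range(61)]
sup_diff = max(abs(landscape(M, k, x) - landscape(N, k, x))
               for k in (1, 2) for x in grid)
print(sup_diff <= eps)
```

The observed supremum over the grid stays below $\varepsilon$, in line with the bound of the theorem for this toy input.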

Corollary 31 (Multiparameter Sublevel-set Landscape Stability Theorem) Let $f, g : X \to \mathbb{R}^n$ be filtering functions and let $M(f), M(g)$ denote the induced sublevel-set multiparameter persistence modules. The sublevel-set persistence modules satisfy:

$$d_\lambda^{(\infty)}(M(f), M(g)) \leq \|f - g\|_\infty.$$

Proof $M(f), M(g)$ are $\|f - g\|_\infty$-interleaved.

In practical applications we may wish to truncate our landscapes to a bounded region of the parameter space $R \subset \mathbb{R}^n$. Since $\|h 1_R\|_p \leq |R|^{1/p} \|h 1_R\|_\infty$, the $\infty$-landscape distance stability result yields a coarse bound for the $p$-landscape distance.

Corollary 32 Let $M, N$ be multiparameter persistence modules and $R \subset \mathbb{N} \times \mathbb{R}^n$ a Lebesgue measurable subset of the parameter space. The $p$-landscape distance restricted to the region $R$ is stable with respect to the interleaving distance:

$$\|(\lambda(M) - \lambda(N)) 1_R\|_p \leq |R|^{1/p}\, d_I(M, N).$$

The weighted landscapes also satisfy stability with respect to the interleaving distance. This can be shown directly or using Proposition 28 and the stability result in the unweighted case.

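Corollary 33 (Multiparameter w-Weighted Landscape Stability) Let $M, N$ be multiparameter persistence modules. For unit weightings $w \in \{u \in \mathbb{R}^n : u_i > 0, \|u\|_\infty = 1\}$ the $\infty$-landscape distance of the weighted multiparameter persistence landscapes is bounded by the interleaving distance:

$$\|\lambda_w(M) - \lambda_w(N)\|_\infty \leq d_I(M, N).$$

Proof Suppose that $M, N$ are $\varepsilon$-interleaved and that $w$ is a unit weighting. Since $w$ is a unit weighting, $(\phi_w)^*(M), (\phi_w)^*(N)$ are also $\varepsilon$-interleaved. We attain that:

$$\begin{aligned}
\|\lambda_w(M) - \lambda_w(N)\|_\infty &= \|(\mathrm{id} \times \phi_w^{-1})^* \lambda_{\mathbf{1}} (\phi_w)^*(M) - (\mathrm{id} \times \phi_w^{-1})^* \lambda_{\mathbf{1}} (\phi_w)^*(N)\|_\infty \\
&= \|\lambda_{\mathbf{1}} (\phi_w)^*(M) - \lambda_{\mathbf{1}} (\phi_w)^*(N)\|_\infty \\
&\leq d_I((\phi_w)^*(M), (\phi_w)^*(N)) \leq d_I(M, N).
\end{aligned}$$

The first equality is a result of Proposition 28 and the penultimate inequality is a direct application of Theorem 30.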

The p-landscape distance restricted to interval decomposable modules is stable with respect to the persistence weighted p-Wasserstein distance.

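Proposition 34 (p-Landscape Distance Stability of Interval Decomposable Modules) Let $M, N$ be interval decomposable multiparameter modules with barcodes consisting of finitely many intervals. The $p$-landscape distance is stable with respect to the persistence weighted $p$-Wasserstein distance: $d_\lambda^{(p)}(M, N) \leq d_{W_p}(M, N)$.

Proof Let us use the shorthand notation $\lambda_M = \lambda(M)$, and suppose $M, N$ have barcodes $\{I_j : j \in J\}$ and $\{J_\kappa : \kappa \in K\}$ with equal cardinality, by appending to each a set of empty intervals of the cardinality of the other set. Recall that the landscape for $M$ can be expressed as a pointwise maximum, $\lambda_M(k, x) = \operatorname{kmax}_{j \in J} \lambda_{1_{I_j}}(1, x)$. Let $\sigma : J \to K$ be any bijection realising the persistence weighted $p$-Wasserstein distance, and let $\varepsilon_j = d_I(1_{I_j}, 1_{J_{\sigma(j)}})$.

$$\begin{aligned}
d_\lambda^{(p)}(M, N)^p = \|\lambda_M - \lambda_N\|_p^p &= \sum_{k=1}^{\infty} \int_{\mathbb{R}^n} |\lambda_M(k, x) - \lambda_N(k, x)|^p \, d\mu \\
&= \int_{\mathbb{R}^n} \sum_{k=1}^{\infty} \Big| \operatorname{kmax}_{j \in J} \lambda_{1_{I_j}}(1, x) - \operatorname{kmax}_{\kappa \in K} \lambda_{1_{J_\kappa}}(1, x) \Big|^p \, d\mu \\
&\leq \int_{\mathbb{R}^n} \sum_{j \in J} |\lambda_{1_{I_j}}(1, x) - \lambda_{1_{J_{\sigma(j)}}}(1, x)|^p \, d\mu \\
&\leq \int_{\mathbb{R}^n} \sum_{j \in J} \varepsilon_j^p \, 1_{\{I_j \cup J_{\sigma(j)}\}} \, d\mu \\
&= \sum_{j \in J} |I_j \cup J_{\sigma(j)}| \, \varepsilon_j^p = d_{W_p}(M, N)^p.
\end{aligned}$$

The inequality between the second and third line follows from the general fact that for any $u, v \in \mathbb{R}^n$ the sum $\sum_i |u_i - v_i|^q$ is minimized by ordering the components of each tuple. The fourth line bounds the third line by Theorem 30 applied to the matched interval summands.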

4.2. Injectivity

We now show that the collection of weighted landscapes associated to a module preserves almost all the information contained in the rank invariant of ﬁnitely presented persistence modules. See Appendix A.1 and Deﬁnition 45 for multiparameter persistence module presentations.

Definition 35 (Weighted Landscape Space) Let $W := \{u \in \mathbb{R}^n : u_i > 0, \|u\|_\infty = 1\}$ denote the set of unit weights. Each multiparameter module $M$ gives rise to a function $\lambda_W(M) : W \to L^\infty(\mathbb{N} \times \mathbb{R}^n)$ mapping each unit weight to the associated weighted landscape of that module. We define weighted landscape space to be this function space equipped with the metric:

$$d_W(\lambda_W(M), \lambda_W(N)) = \|\lambda_W(M) - \lambda_W(N)\|_\infty = \sup_{w \in W} \{\|\lambda_w(M) - \lambda_w(N)\|_\infty\}.$$

Proposition 36 Let $\mathrm{vect}^{\mathbb{R}^n}_{\mathrm{fin}}$ denote the space of finitely presented multiparameter persistence modules. Let us define an equivalence relation on $\mathrm{vect}^{\mathbb{R}^n}_{\mathrm{fin}}$ identifying $M \sim N$ if the rank invariants of $M$ and $N$ coincide. Denote the quotient space of finitely presented multiparameter persistence modules under this equivalence relation by $\mathrm{vect}^{\mathbb{R}^n}_{\mathrm{fin}} / \sim$. The map $\lambda_W : M \mapsto \{(w, \lambda_w(M)) : w_i > 0, \|w\|_\infty = 1\}$ is injective and 1-Lipschitz on $\mathrm{vect}^{\mathbb{R}^n}_{\mathrm{fin}} / \sim$, where we equip the quotient space with the distance induced by the interleaving distance, and the weighted landscape space with the metric $d_W$.

Proof We shall first show that the map $\lambda_W$ is injective. Let $M \in \mathrm{vect}^{\mathbb{R}^n}_{\mathrm{fin}}$ and recall Proposition 24. For all $a < b$ there is some unit rescaling vector $w$ such that the pair of parameter values $(\phi_w(a), \phi_w(b))$ lie on a line of slope 1. Thus we can recover the rank invariant of $M$ from the collection $\{(w, \lambda_w(M))\}$. The fact that the map $\lambda_W$ is 1-Lipschitz is an immediate consequence of Corollary 33 (Multiparameter w-Weighted Landscape Stability).
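
The rescaling vector used in the injectivity argument can be written down explicitly. The sketch below assumes $a < b$ holds strictly in every coordinate and uses the rescaling $\phi_w(x) = (w_i x_i)_i$ of Definition 27; the function name `slope_one_weight` is hypothetical.

```python
def slope_one_weight(a, b):
    """Unit weight w (max entry 1) with w_i * (b_i - a_i) equal for all i."""
    gaps = [bi - ai for ai, bi in zip(a, b)]
    if min(gaps) <= 0:
        raise ValueError("need a < b in every coordinate")
    smallest = min(gaps)
    # Dividing by each gap equalises the rescaled gaps; the coordinate with
    # the smallest gap receives weight exactly 1, so the infinity norm is 1.
    return [smallest / g for g in gaps]


a, b = (0.0, 1.0), (10.0, 2.0)
w = slope_one_weight(a, b)
# After rescaling, b - a becomes a multiple of the all-ones vector.
image_gap = [wi * (bi - ai) for wi, ai, bi in zip(w, a, b)]
print(w, image_gap)
```

For the pair $a = (0, 1)$, $b = (10, 2)$ from Example 3 this produces a weighting close to $(\tfrac{1}{10}, 1)$, the one considered after Remark 26.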

Since λW is 1-Lipschitz we can compute a lower bound on the interleaving distance between modules from the collection of weighted landscapes. We would be interested to investigate further the relationship between the landscape distance and the interleaving distance to understand when the landscape distance provides a good lower bound for the interleaving distance.

5. Statistics on Multiparameter Landscapes

A principal advantage of working with landscapes as a summary statistic for our data is that we are always able to take the pointwise mean of a collection of landscapes. Whilst the mean of a collection of landscapes is not necessarily the landscape of some module, one can still interpret the mean landscape. The locations of local maxima of the mean landscape correspond to the parameter values above which significant features of the sample landscapes live, which in turn correspond to significant topological features of the data set from which the sample landscapes have been derived.

The space of persistence landscapes endowed with the $p$-landscape distance naturally spans a subspace of Lebesgue space, a Banach space. We would like to perform statistical analysis on a set of landscapes produced from data sets to distinguish significant topological signals from sampling noise. In Appendix B we review relevant results from the theory of Banach space valued random variables. In this section we apply these results to multiparameter persistence landscapes. We attain the same collection of results enjoyed by the single parameter persistence landscape established in Bubenik (2015).

5.1. Convergence Results for Multiparameter Landscapes

We shall take the same probabilistic approach as in Bubenik (2015) in viewing multiparameter landscapes derived from a data set as a Banach space valued random variable. The model for applying statistical analysis to persistence landscapes will trace the following general setup. Suppose $X$ is a Borel measurable random variable on some probability space $(\Omega, \mathcal{F}, P)$ thought of as sampling data from some distribution. Further let $\Lambda = \Lambda(X)$ denote the multiparameter persistence landscape associated to some multifiltration of the data $X$, so that in summary $\Lambda : (\Omega, \mathcal{F}, P) \to L^p(\mathbb{N} \times \mathbb{R}^n)$ for $1 \leq p < \infty$ is a random variable taking values in a real, separable Banach space. Let $\{X_i\}$ be i.i.d. copies of $X$ and $\{\Lambda_i\}$ their associated landscapes. Denoting the pointwise mean of the first $n$ landscapes by $\overline{\Lambda}^n$ and applying the general theory of probability in Banach spaces presented in Appendix B we attain several results. Observe that in practice we may be required to truncate our multiparameter landscapes to a bounded region in order to satisfy the finiteness criteria in the convergence results.

Associated to a well-behaved Banach space valued random variable $\Lambda : (\Omega, \mathcal{F}, P) \to L^p(\mathbb{N} \times \mathbb{R}^n)$ is a set function $I_\Lambda : \mathcal{F} \to B$ called the Pettis integral of $\Lambda$. This can be thought of as the expectation of a Banach space valued random variable. For more details see Appendix B and Ledoux and Talagrand (2011).

Theorem 37 (Strong Law of Large Numbers) With our notation as in the above discussion, $\overline{\Lambda}^n \to I_\Lambda(\Omega)$ almost surely if and only if $E[\|\Lambda\|] < \infty$.

Theorem 38 (Central Limit Theorem) Let us consider the landscapes endowed with the $p$-landscape distance for $p \geq 2$. If $E[\|\Lambda\|] < \infty$ and $E[\|\Lambda\|^2] < \infty$, then $\sqrt{n}(\overline{\Lambda}^n - I_\Lambda(\Omega))$ converges weakly to a Gaussian random variable $G(\Lambda)$ with the same covariance structure as $\Lambda$.

The central limit theorem for the landscapes induces a central limit theorem for associated real valued random variables and facilitates the computation of approximate conﬁdence intervals.

Corollary 39 Let us consider the landscapes endowed with the $p$-landscape distance for $p \geq 2$. Suppose $E[\|\Lambda\|] < \infty$ and $E[\|\Lambda\|^2] < \infty$. If $f \in L^p(\mathbb{N} \times \mathbb{R}^n)^*$, so that $Y = f(\Lambda)$ is a real valued random variable, then $\sqrt{n}(\overline{Y}^n - E[Y])$ converges in distribution to $N(0, \mathrm{Var}(Y))$.

Corollary 40 (Approximate Confidence Intervals) Suppose $Y$ is a real-valued random variable attained from a functional applied to the multiparameter landscape $\Lambda$ satisfying the conditions of Corollary 39. Let $\{Y_i\}_{i=1}^n$ be i.i.d. instances of this random variable and $S_n^2 = \frac{1}{n-1} \sum_{i=1}^n (Y_i - \overline{Y}^n)^2$ the sample variance. An approximate $(1 - \alpha)$ confidence interval for $E[Y]$ is given by:

$$\left[\overline{Y}^n - z_{\alpha/2} \frac{S_n}{\sqrt{n}},\ \overline{Y}^n + z_{\alpha/2} \frac{S_n}{\sqrt{n}}\right],$$

where $z_{\alpha/2}$ is the upper $\alpha/2$ critical value for the normal distribution.
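
The interval of Corollary 40 is straightforward to evaluate. The following is a minimal sketch using only the Python standard library; the function name and the sample values are hypothetical and are not the values computed in Section 6.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def approx_confidence_interval(samples, alpha=0.05):
    """Approximate (1 - alpha) confidence interval for E[Y] (Corollary 40)."""
    n = len(samples)
    y_bar = mean(samples)
    s_n = stdev(samples)  # sample standard deviation, 1 / (n - 1) normalisation
    z = NormalDist().inv_cdf(1 - alpha / 2)  # upper alpha/2 critical value
    half_width = z * s_n / sqrt(n)
    return y_bar - half_width, y_bar + half_width


# Hypothetical values of a functional evaluated on six sample landscapes.
values = [0.41, 0.45, 0.43, 0.47, 0.40, 0.44]
low, high = approx_confidence_interval(values, alpha=0.01)
print(low, high)
```

Since the interval relies on the normal approximation of Corollary 39, it should be read as approximate, particularly for small sample sizes such as the one in this toy input.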

In practice, a functional of choice could be given by integrating the landscapes over a subset $R$ of the parameter domain, $f_R(\Lambda) = \int_R \Lambda \, d\mu$. These functionals can be used to establish the significance of homological features in different regions of the parameter space. We remark that recent work has attained confidence bands for single parameter persistence landscapes (Chazal et al., 2014, 2015). It is shown in Chazal et al. (2014) that the single parameter persistence landscapes satisfy a uniform central limit theorem, and moreover the rate of convergence is computed. It would be interesting to see similar analysis performed in the multiparameter setting.
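
A functional of the form $f_R$ can be approximated by a Riemann sum once a landscape can be evaluated pointwise. The sketch below applies the midpoint rule over a two dimensional box for a fixed $k$; the function names, the grid resolution and the toy landscape (the first landscape of a single rectangle module) are assumptions made for illustration.

```python
def integrate_over_box(func, lower, upper, steps=200):
    """Midpoint-rule approximation of the integral of func over a 2D box."""
    (x0, y0), (x1, y1) = lower, upper
    dx, dy = (x1 - x0) / steps, (y1 - y0) / steps
    total = 0.0
    for i in range(steps):
        for j in range(steps):
            total += func((x0 + (i + 0.5) * dx, y0 + (j + 0.5) * dy))
    return total * dx * dy


def tent(x, lo=(0.0, 0.0), hi=(2.0, 2.0)):
    """First landscape of the rectangle module [lo, hi] (toy input)."""
    return max(0.0, min(min(xi - a, b - xi) for xi, a, b in zip(x, lo, hi)))


# The toy landscape is a pyramid of height 1 over a 2 x 2 base.
value = integrate_over_box(tent, (0.0, 0.0), (2.0, 2.0))
print(value)
```

For this toy input the exact value is the volume of the pyramid, $4/3$, which the Riemann sum approaches as the grid is refined.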

6. Example Computations and Machine Learning Applications

In this section we shall present example computations of multiparameter persistence landscapes and demonstrate a simple application of machine learning to the persistence landscapes. We use the RIVET software (The RIVET Developers, 2018) for computations of 2-parameter persistence modules presented in Lesnick and Wright (2015). RIVET supports the fast computation of multigraded Betti numbers (see Deﬁnition 46) and an interactive visualisation for 2-parameter persistence modules. The software computes a data structure associated to a module which facilitates real time queries of the ﬁbered barcode. As far as we know, RIVET is the only publicly available TDA software package supporting multiparameter persistent homology calculations. The software supports a range of input formats including: point cloud, metric space, algebraic chain complex, and explicit biﬁltered complex. In particular we shall use the software to calculate and query the ﬁbered barcode associated to a module along a selection of one dimensional slices of the parameter space. In order to reduce computational cost, RIVET approximates multiparameter modules with a discretization. These approximations can be taken to arbitrary accuracy with respect to the interleaving distance, see Appendix A.3. Discretization is not the only approach one can take in computations involving continuous multiparameter modules. In contrast Miller (2017) develops a primary decomposition of modules which facilitates a ﬁnite description of a wide class of persistence modules which would require inﬁnitely many generators if discretized. Our only obstruction to using this approach rather than discretization is the lack of available software to cope with these presentations. Computation of the module with RIVET is the most computationally expensive procedure in our calculations. 
Details of the time and space complexity of the algorithm, and of the software, may be found in Lesnick and Wright (2015). Loosely, if $m$ denotes the size of the filtered complex associated to the input data, in the worst case one requires time $O(m^5)$ and space $O(m^5)$ to compute the data structure which admits fast queries of the fibered barcode.

In theory, since our landscape is derived solely from the rank invariant, we need not calculate the full module and fibered barcode. Recall that the value of the multiparameter persistence landscape at each point can be calculated using the single parameter persistence landscape associated to the line of slope 1 passing through that point. Thus we could reduce the computation of the multiparameter landscape in any dimension to repeated single parameter persistent homology calculations. This reduction would be highly parallelizable and likely to provide significant speedup.
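
The reduction to slope 1 lines can be sketched on a toy input. Here a rectangle decomposable module stands in for a persistent homology computation: restricting it to the line $t \mapsto x + t\mathbf{1}$ gives a single parameter barcode, and the single parameter landscape of that barcode at $t = 0$ gives the multiparameter landscape at $x$. All function names are hypothetical, and this is not the RIVET-based pipeline used for the computations in this section.

```python
def slice_barcode(rects, x):
    """Bars, in the coordinate t, of the restriction of a direct sum of
    rectangle modules to the slope 1 line t -> x + t * (1, ..., 1)."""
    bars = []
    for lo, hi in rects:
        birth = max(a - xi for a, xi in zip(lo, x))
        death = min(b - xi for b, xi in zip(hi, x))
        if birth < death:  # the line meets this rectangle
            bars.append((birth, death))
    return bars


def landscape_1d(bars, k, t):
    """Single parameter persistence landscape: k-th largest tent value at t."""
    tents = sorted((max(0.0, min(t - b, d - t)) for b, d in bars),
                   reverse=True)
    return tents[k - 1] if k <= len(tents) else 0.0


def landscape_via_slice(rects, k, x):
    """Multiparameter landscape at x, read off the slice through x at t = 0."""
    return landscape_1d(slice_barcode(rects, x), k, 0.0)


M = [((0, 1), (10, 2)), ((4, 1), (6, 2))]
N = [((0, 1), (6, 2)), ((4, 1), (10, 2))]
print(landscape_via_slice(M, 1, (5, 1.5)), landscape_via_slice(N, 1, (5, 1.5)))
```

Each call handles one point independently of the others, which is the property that makes the reduction amenable to parallel evaluation.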

Proposition 41 Let $M \in \mathrm{vect}^{\mathbb{R}^2}$ be a multiparameter persistence module derived from a simplicial complex with $m$ simplices. Let $\varepsilon > 0$ be some tolerance value and $[0, R] \times [0, R] \subset \mathbb{R}^2$ a subset of our parameter space. We can compute an $\varepsilon$-approximation $\lambda(M)^{(\varepsilon)}$ to the persistence landscape $\lambda(M)$ of $M$ on the region $[0, R] \times [0, R]$ in time $O(m^3 \frac{R}{\varepsilon})$. Our approximation is with respect to the infinity norm, that is $\|\lambda(M)^{(\varepsilon)} - \lambda(M)\|_\infty \leq \varepsilon$.

Proof Divide the region $[0, R] \times [0, R]$ into a grid of spacing $\varepsilon$. It suffices to calculate the values of the landscape on this grid since the landscape functions are 1-Lipschitz and so we can extend the grid values to an $\varepsilon$-approximate function on $[0, R] \times [0, R]$. Thus we reduce our computation to the computation of $\frac{2R}{\varepsilon}$ single parameter landscapes corresponding to the collection of $\frac{2R}{\varepsilon}$ slope 1 lines passing through the points of the grid. Given birth-death pairs, Bubenik and Dłotko (2017) provide an algorithm to compute the persistence landscapes in time $O(m^2)$. It is well known from Edelsbrunner and Harer (2010) that one can produce birth-death pairs from a filtration of size $m$ in time $O(m^3)$. Hence the result follows.

It is possible that the above time estimate for the landscape computation could be improved by using vineyard style updates between the single parameter landscapes (Cohen-Steiner et al., 2006). Moreover it may be that in practical applications, computing the module with RIVET is significantly faster than the worst time bound $O(m^5)$ and thus using the fibered barcode queries will be faster than the computation of a series of single parameter landscapes. Note also that the $\frac{2R}{\varepsilon}$ single parameter landscape calculations are independent and so can be computed in parallel. We postpone comparisons of different computational algorithms, benchmarking, and efficient implementation to follow up work.

One may want to utilise machine learning algorithms with landscape functions as a collection of features for a data set. Recall that if we consider the persistence landscapes associated with the 2-landscape distance then we are naturally in the setting of a Hilbert space. The inner product on this space is positive definite on the space of persistence landscapes. As such we may use this kernel to learn non-linear relationships in our data and then apply convex optimisation techniques to an SVM. Another point to note is that we can discretize the landscape to give an n-dimensional array as a summary of our data to which one could apply a convolutional neural network. This transform from landscape to multidimensional array will satisfy stability with respect to the landscape distance. A similar approach is used by Adams et al. (2017) to produce a persistence image from a persistence diagram.

We provide three computational examples together with the application of a basic statistical test and standard SVM classifier. Our examples demonstrate that the multiparameter landscape is sensitive to both topology and geometry. We do not claim that the multiparameter landscape is the optimal analytic tool to perform the various tasks in our examples, rather we demonstrate a range of potential applications.
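
The discretization and the inner product kernel mentioned above can be sketched as follows. We sample the first two landscapes of the toy modules of Example 3 on a regular grid and approximate the $L^2$ inner product by a Riemann sum. The helper names, the grid and the choice of toy modules are hypothetical, and the resulting Gram matrix is only an approximation of the kernel on the truncated region.

```python
def rectangle_landscape(rects):
    """Return lam(k, x), the landscape of a direct sum of rectangle modules."""
    def lam(k, x):
        values = sorted(
            (max(0.0, min(min(xi - a, b - xi) for xi, a, b in zip(x, lo, hi)))
             for lo, hi in rects),
            reverse=True)
        return values[k - 1] if k <= len(values) else 0.0
    return lam


def discretize(lam, ks, xs, ys):
    """Sample lam on the grid ks x xs x ys and flatten to a feature vector."""
    return [lam(k, (x, y)) for k in ks for x in xs for y in ys]


def l2_kernel(u, v, cell_area):
    """Riemann-sum approximation of the L2 inner product of two landscapes."""
    return cell_area * sum(a * b for a, b in zip(u, v))


M = [((0, 1), (10, 2)), ((4, 1), (6, 2))]
N = [((0, 1), (6, 2)), ((4, 1), (10, 2))]
step = 0.25
ks = (1, 2)
xs = [step * i for i in range(45)]
ys = [step * j for j in range(13)]

vec_m = discretize(rectangle_landscape(M), ks, xs, ys)
vec_n = discretize(rectangle_landscape(N), ks, xs, ys)
gram = [[l2_kernel(u, v, step * step) for v in (vec_m, vec_n)]
        for u in (vec_m, vec_n)]
print(gram)
```

A Gram matrix of this form could be passed to a kernel method, while the flattened vectors themselves could serve as inputs to a linear classifier or, reshaped, to a convolutional network.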

6.1. Concentric Circles

Our first example will look at points sampled from densities concentrated around a pair of concentric circles with radii 1 and 3 respectively. We colour the points from each circle in two distinct ways. Colouring A assigns the large circle colour parameter 0.5 and the small circle colour parameter 1.5. Colouring B assigns the small circle colour parameter 0.5 and the large circle colour parameter 1.5. We examine how the multiparameter landscapes differ depending on the colouring of the circles. For each colouring we perform 30 samples, each sample consisting of 100 points uniformly sampled from each circle (Figure 3a).

We produce a filtration on each point cloud with the Rips filtration in the first parameter and the colour parameter in the second parameter. Thus at parameter value $(r, c) \in \mathbb{R}^2$, we have the space $X_{(r,c)} = VR(P_c, r)$ where $P_c$ denotes the sampled points with colour parameter no more than $c$. We apply the degree one homology functor $H_1$ to produce a multiparameter persistence module which detects the loops in our filtered topological space.

We compute the average landscapes of the $H_1$-modules for the two different colourings (Figure 3). When the large circle has the smaller colour parameter value, the first landscape ($k = 1$) can detect the large circle (Figure 3b). We see the large circle in the first landscape as the large mountain spanning the parameter subspace $[1, 5.4] \times [0.5, 1.5]$. When the large circle has the higher parameter value, the persistence in the Rips filtration parameter is diminished by the presence of the small circle with smaller colour parameter. In both colourings, the second landscape ($k = 2$) exhibits the range of parameter values for which both circles are detected (Figure 3c).

We test the robustness of the landscape by repeating the sampling this time with only 50 points per circle and perturbing both the location and colour of the sampled points with the addition of i.i.d. normals N(0, 0.3), Figure 4a. We illustrate in Figure 4b and Figure 4c the average landscapes taken over 30 noisy samples. The resulting landscapes are similar to those of the larger samples without noise.

Let us perform a statistical test to determine whether the multiparameter landscapes can detect that the noisy samples are drawn from different distributions. Consider the functional $f_R(\lambda) = \int_R \lambda \, d\mu$. Using the results of Section 5 we find approximate confidence intervals for $f_R(\lambda)$ with $R = \{1\} \times ([2, 6] \times [0, 1.5]) \subset \mathbb{N} \times \mathbb{R}^2$. We attain approximate 99% confidence intervals on the noisy samples: for Colouring A $[0.400, 0.474]$, and for Colouring B $[0.00556, 0.00809]$. A two sample t-test on the values of this functional on the two sets of colourings attains a p-value of 0.00629. Thus we reject the null hypothesis that the functional values on the landscapes of the two colourings have the same mean.

We also perform a permutation test for the test statistic $\|\lambda_A - \lambda_B\|_4$ (with norm taken for the first landscape over the whole parameter space $[0, 6] \times [0, 3]$). We apply 10,000 random permutations to the collection of landscapes, randomly assigning 30 landscapes to group A and 30 landscapes to group B and then computing the test statistic. Each of the sampled permutations attained a smaller test statistic than the observed test statistic (indicating an approximate p-value of 0), identifying that the landscapes for Colouring A have been drawn from a different distribution to the landscapes for Colouring B.
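
A permutation test of this kind can be sketched on discretized landscapes, represented here as flat lists of sampled values. The code below is an illustrative outline rather than the code used for the experiment: the function names, the default exponent `p=2`, the number of permutations and the synthetic input vectors are all assumptions, and the discrete norm omits the constant factor coming from the grid cell volume.

```python
import random


def mean_vector(vectors):
    """Pointwise mean of a collection of equally long vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def p_norm_diff(u, v, p):
    """Discrete p-norm of the difference of two vectors."""
    return sum(abs(a - b) ** p for a, b in zip(u, v)) ** (1.0 / p)


def permutation_test(group_a, group_b, p=2, n_perm=1000, seed=0):
    """Return the observed statistic and the fraction of random relabellings
    whose statistic is at least as large as the observed one."""
    rng = random.Random(seed)
    observed = p_norm_diff(mean_vector(group_a), mean_vector(group_b), p)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        stat = p_norm_diff(mean_vector(pooled[:n_a]),
                           mean_vector(pooled[n_a:]), p)
        if stat >= observed:
            exceed += 1
    return observed, exceed / n_perm


# Synthetic stand-ins for discretized landscapes from two populations.
data_rng = random.Random(1)
group_a = [[1.0 + data_rng.gauss(0, 0.1) for _ in range(5)] for _ in range(10)]
group_b = [[data_rng.gauss(0, 0.1) for _ in range(5)] for _ in range(10)]
observed, p_value = permutation_test(group_a, group_b)
print(observed, p_value)
```

When no sampled permutation reaches the observed statistic the estimated p-value is reported as 0, which should be read as smaller than the reciprocal of the number of permutations.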

Figure 3: (a) An example point cloud sample from each colouring. (b) The mean first landscape $\lambda(1, x)$ for each colouring taken over the 30 samples. (c) The mean second landscape $\lambda(2, x)$ for each colouring taken over the 30 samples. The landscape panels are plotted against the Rips filtration radius and the colour parameter. The first column shows the plots for Colouring A and the second column Colouring B.

Figure 4: (a) An example point cloud sample from each colouring with noise added. (b) The mean first landscape $\lambda(1, x)$ taken over the 30 noisy samples. (c) The mean second landscape $\lambda(2, x)$ taken over the 30 noisy samples. The landscape panels are plotted against the Rips filtration radius and the colour parameter. The first column shows the plots for Colouring A and the second column Colouring B.

6.2. Modal Estimation

For this example we work on meteorite data which we have lifted from Good and Gaskins (1980). The data set consists of values of the proportion of silica measured in 22 samples. Our task is to infer how many modes there are in the distribution from which this data has been sampled. A standard approach to this task is kernel density estimation (KDE). With data $\{x_i\}_{i=1}^{N} \subset \mathbb{R}^n$ one estimates the probability density function (pdf) of the distribution using a sum of normalized kernels:

$$\hat{f}_\sigma(x) = \frac{1}{N} \sum_{i=1}^{N} K_\sigma(x - x_i).$$

Here $K_\sigma$ is a density function with mass concentrated about the origin, for example a Gaussian centred at the origin. There are two natural parameters in this KDE setup: the log bandwidth parameter $\sigma$ of the kernel function $K_\sigma$, and a threshold parameter which dictates how large a peak in the estimated distribution must be to be considered a mode. The choice of these parameters will dramatically alter our inferred number of modes (see Figure 5).

Figure 6 is a surface plot of the KDEs ranging over various bandwidth parameters, demonstrating the change in the number of modes as we change the bandwidth. The surface has been triangulated using a triangulation subordinate to a regular grid on our parameter space. To each 2-simplex $\tau$ in the triangulation we attach two parameters: the mean bandwidth $\sigma(\tau)$ and the mean probability density value $p(\tau)$ (averages taken over the vertices of the simplex). We produce a bifiltration by taking the simplicial closure of the 2-simplices with appropriate parameter values, $X_{(\sigma_0, p_0)} = SC(\{\tau : \sigma(\tau) \leq \sigma_0,\ p(\tau) \geq 1 - p_0\})$.

The multiparameter landscape detects that three modes appear in the KDEs for a range of parameter values. Looking at the landscapes associated to the $H_0$-module we see that the infinity norm of the first three landscapes is constant but decreases significantly between the third and fourth landscapes (Figure 7). This indicates that within this setup, three modes are seen across a significantly wider range of parameter values than four modes, suggesting the data is drawn from a tri-modal distribution, which coincides with our expected result. Whilst in this simple example one could suggest there are three modes from inspection, the landscape analysis can equally be applied to higher dimensional data sets for which visualisation is not possible.

Figure 5: We plot kernel density estimates on the meteorite data (red) for a range of bandwidth parameters (0.01, 0.15, 0.29, 0.43 and 0.57), with the silica percentage on the horizontal axis. As we increase the bandwidth parameter we obtain fewer modes in our kernel density estimate.

Figure 6: A triangulated surface plot of the KDE for a range of bandwidth parameters. We observe three modes in the KDE estimate for a large range of bandwidth values.
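
The dependence of the inferred number of modes on the bandwidth and threshold parameters, discussed earlier in this subsection, can be sketched with a one dimensional Gaussian KDE evaluated on a grid. The data below are hypothetical and are not the meteorite data of Good and Gaskins (1980); the function names, the Gaussian kernel, and the use of $\sigma$ directly as the kernel standard deviation are our own illustrative choices.

```python
from math import exp, pi, sqrt


def gaussian_kde(data, sigma):
    """Return the density x -> (1/N) * sum_i K_sigma(x - x_i), Gaussian K."""
    norm = 1.0 / (len(data) * sigma * sqrt(2.0 * pi))

    def density(x):
        return norm * sum(exp(-0.5 * ((x - xi) / sigma) ** 2) for xi in data)

    return density


def count_modes(data, sigma, threshold, lo, hi, steps=2000):
    """Count grid local maxima of the KDE whose height is at least threshold."""
    f = gaussian_kde(data, sigma)
    ys = [f(lo + (hi - lo) * i / steps) for i in range(steps + 1)]
    return sum(1 for i in range(1, steps)
               if ys[i] > ys[i - 1] and ys[i] >= ys[i + 1]
               and ys[i] >= threshold)


# Hypothetical one dimensional sample with three well separated clusters.
toy = [20.8, 21.0, 21.3, 27.0, 27.4, 27.6, 33.0, 33.2, 33.5]
for sigma in (0.5, 5.0):
    print(sigma, count_modes(toy, sigma, 0.0, 18.0, 36.0))
```

Small bandwidths resolve each cluster as a separate mode, while a large bandwidth merges them, which is the sensitivity that motivates examining all parameter values at once.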

This basic example can be generalized to detect other properties of KDEs robust to changes in parameter values. For example one could detect signiﬁcant i-dimensional holes in the distribution by considering the Hi module in a similar setup. For related work see Persistence Terraces (Moon et al., 2018).

Figure 7: The first to fifth multiparameter persistence landscapes associated to the $H_0$ module for the KDE surface. The infinity norm of the first three landscapes is constant, $\|\lambda(1, x)\|_\infty = \|\lambda(2, x)\|_\infty = \|\lambda(3, x)\|_\infty = 0.283$, but drops significantly between the third and fourth landscape, $\|\lambda(4, x)\|_\infty = 0.120$, $\|\lambda(5, x)\|_\infty = 0.106$. This indicates that there are three modes in the KDE for a wide range of bandwidth parameters.

6.3. Curvature

In this subsection we shall work with a synthetic data set sampled from spaces of different curvature. Single parameter persistence landscapes have been used to detect curvature (Bubenik et al., 2019). This example is used to emphasise the ability of the multiparameter landscapes to detect geometric differences between point samples. The samples consist of 100 points chosen uniformly with respect to the volume measure from discs of radius 1 in the hyperbolic plane, the surface of the unit sphere and Euclidean space, so that the spaces have constant curvature of $-1$, $1$, $0$ respectively. Topologically these discs are all trivial, so our landscapes are detecting geometric differences induced by the distribution of points. We would like to show that the multiparameter landscape is able to detect the curvature of the space from which a sample is drawn given only the pairwise distances between points.

A multifiltered complex is built on the sampled points by filtering the Rips complex with the third nearest neighbour density function $\rho$ on the points. Explicitly, if $P$ denotes our sampled points and $(r, \rho_0) \in \mathbb{R}^2$ then $X_{(r, \rho_0)} = VR(P_{\rho_0}, r)$ where $P_{\rho_0} = \{p \in P : \rho(p) \leq \rho_0\}$ for the third nearest neighbour density function $\rho$. We take 100 samples of 100 points in each space and investigate the resulting multiparameter landscapes for dimension 1 homology. We plot the average first multiparameter landscapes in Figure 8a and the differences between the average landscapes in Figure 8b. We observe that the persistence of cycles is affected by the curvature of the space. The more negative the curvature, the longer the one dimensional cycles persist.

Let us now apply a simple machine learning algorithm to the multiparameter landscapes to see if we can reliably distinguish the curvature of the space from which our small samples have been drawn. Using the Python package Linear SVC, we train a Support Vector Machine (SVM) with linear kernel on discretizations of the first 10 landscapes for the samples of the hyperbolic discs and elliptic discs, using $\ell_2$ penalty and squared hinge loss function. We randomly partition our samples into 160 training samples and 40 test samples and evaluate the accuracy by the proportion of test samples correctly classified. Repeating this process 100 times we attain an average classification score of 85.78%. Thus we see that the multiparameter landscapes are able to reliably detect curvature given a relatively small local sample. It is possible that alternative choices of filtration parameters may be better suited to detecting curvature.

7. Discussion

Multiparameter persistence landscapes provide a stable representation of the rank invariant of a persistence module whilst retaining the discriminating power of the rank invariant. Moreover, the landscape distance provides a computable lower bound for the optimal stable distance on persistence modules, the interleaving distance. The multiparameter landscape also offers a bridge from topological data analysis to machine learning and statistical analysis of multiparameter modules. The multiparameter landscapes, although hard to visualize in dimensions higher than 2, are interpretable in any dimension, with large landscape values indicating features robust to changes in the filtration parameters, and non-zero landscapes for large k indicating a large number of homological features.


[Figure 8 panels. Axes in every panel: Rips Filtration and Nearest Neighbour Density, with tick labels from 0.00 to 0.30. The two plots of panel (b) are titled "Hyperbolic minus Euclidean" and "Euclidean minus Spherical".]

(a) The mean first landscape λ(1, x) of the H1 module for the hyperbolic, Euclidean and elliptic discs taken over 100 samples.

(b) The pointwise difference between the mean landscapes λ(1, x).

Figure 8: Multiparameter persistence landscapes for uniform samples of discs of unit radius from spaces of different curvature.

An important question to address in practical applications of multiparameter persistence landscapes is how to determine the optimal weighting w of the parameter space to be used in any given application. The optimal choice of weighting is likely to be highly dependent on the application in mind. The need to choose a hyperparameter is not unique to multiparameter persistence, and similar challenges arise in choosing hyperparameters for various vectorizations of single parameter persistence (Bubenik, 2018; Adams et al., 2017; Chazal et al., 2014). We note that multiparameter persistence landscapes depend continuously on the choice of weighting vector w, and so given sufficient training data one could tune this parameter to maximize discriminatory power.

The multiparameter landscapes highlight several open questions and challenges in the development of the theory and applications of multiparameter persistent homology that we would be interested to see addressed.

1. We would like to understand the relationship between the interleaving distance and landscape distance associated to modules to understand when the landscape distance provides a good lower bound estimate.

2. We have restricted our invariant to the discriminating power of the rank invariant. We would be interested to see if we could combine our landscapes with invariants that capture the more subtle relationships between features born at incomparable parameter values.

3. With respect to applications there is a question as to whether the increased complexity of multiparameter persistence is necessary, or whether single parameter persistence may be sufficient. One could possibly substitute a multiparameter persistence landscape for a vectorization of a well-chosen single parameter persistence module without significant loss of information.

4. The Bootstrap Method has been used to compute confidence bands for single parameter persistence landscapes (Chazal et al., 2014). We would be interested in applying similar analysis for multiparameter landscapes.

Finally it is worth remarking that the construction of multiparameter persistence landscapes from multiparameter persistence modules can be generalized to produce stable invariants of generalized persistence modules indexed over other posets. Provided the indexing poset $P$ is equipped with a superlinear family of translations $\Omega$, one can derive a landscape function $\lambda : \mathbb{N} \times P \to \mathbb{R}$ from the rank function $\mathrm{rk} : P \times P \to \mathbb{N}$. This landscape equipped with the supremum norm is stable with respect to the interleaving distance induced by the superlinear family, and provides an interpretable, stable representation of the rank function. This vectorization may prove a useful invariant should the computation of generalized persistence modules be developed in future work.

Acknowledgements

The author would like to give recognition to the Theory and Foundations of TGDA workshop hosted at Ohio State University which facilitated useful conversations with experts in TGDA. The author wishes to thank his supervisors Ulrike Tillmann and Vidit Nanda for their guidance and support with this project, and Peter Bubenik for helpful suggestions. The author would like to thank the anonymous reviewers for their detailed feedback on an earlier draft of this paper. The author gratefully acknowledges support from EPSRC studentship EP/N509711/1 and EPSRC grant EP/R018472/1.


Appendix A. Multiparameter Persistence Theory

For completeness we present the multiparameter persistence theory, defining presentations and interleavings of multiparameter persistence modules.

A.1. Presentations

Definition 42 (Translation Endofunctors) (Bubenik et al., 2015) Let $P$ be the category associated to a preordered set (proset) and let $\Gamma : P \to P$ be an endofunctor. We say that $\Gamma$ is a translation. Since $\Gamma$ is a functor, $\Gamma$ is monotone: $x \le y$ implies that $\Gamma(x) \le \Gamma(y)$. We say that $\Gamma$ is increasing if $x \le \Gamma(x)$ for all $x \in P$. Let $\mathrm{Trans}_P$ denote the set of increasing translations of $P$ and observe that $\mathrm{Trans}_P$ is a monoid with respect to composition.

It is straightforward to see that $\mathrm{Trans}_P$ also has a natural proset structure with preorder $\Gamma \le K \iff \Gamma(x) \le K(x)$ for all $x \in P$. This preorder is compatible with the monoid structure of $\mathrm{Trans}_P$, and $\Gamma \le K$ implies there is a unique natural transformation $\eta^{\Gamma}_{K} : \Gamma \Rightarrow K$. If $P$ is a poset then so is $\mathrm{Trans}_P$.

Definition 43 (Shift Functor) Let $C$ be a category, $F$ an element of the functor category $C^P$, and $\Gamma$ a translation endofunctor. Let $F(\Gamma)$ denote $F \circ \Gamma \in C^P$; we call this functor the $\Gamma$-shift of $F$.

For a multiparameter persistence module $M \in \mathbf{Vect}^{\mathbb{R}^n}$ we shall write $M(a)$ to denote the shift by the translation in $\mathbb{R}^n$, $\Gamma_a(x) = x + a$. We define an $\mathbb{R}^n$-graded set to be some set $X$ together with a map $\mathrm{gr} : X \to \mathbb{R}^n$. For an element $j \in X$, we shall refer to $\mathrm{gr}(j) = a \in \mathbb{R}^n$ as the grade of $j$. Let $P_n$ denote the $\mathbb{R}^n$-graded monoid ring of the monoid $([0, \infty)^n, +)$ over a field $\mathbb{F}$.

Definition 44 (Free Module) Let $X$ be an $\mathbb{R}^n$-graded set. The free module on $X$ is denoted $\mathrm{Free}[X]$ and defined to be:

$$\mathrm{Free}[X] = \bigoplus_{j \in X} P_n(-\mathrm{gr}(j)).$$

Note that each element of the graded set gives rise to an independent copy of the $\mathbb{R}^n$-graded ring $P_n$, and the negative shift determines that the elements of grade $a$ in $P_n$ are shifted to be elements of grade $a + \mathrm{gr}(j)$ in the shifted copy $P_n(-\mathrm{gr}(j))$. The notion of a free module on a graded set can equivalently be defined using a universal property characterisation. We say a subset $R \subset M$ of a persistence module is homogeneous if $R \subset \bigcup_{a \in \mathbb{R}^n} M_a$, that is to say each element has a well defined grade.

Definition 45 (Presentations) Let $X$ be a graded set and $R$ a homogeneous subset of the free module on $X$ generating the submodule $\langle R \rangle$. We say that a persistence module $M$ has presentation $\langle X | R \rangle$ if:

$$M \cong \mathrm{Free}[X] / \langle R \rangle.$$

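We say that a presentation is finite if both $X$ and $R$ are finite. Let $I$ denote the ideal of $P_n$ generated by the elements $\{x^a : a > 0\}$ and let $\Phi_{\langle X|R \rangle} : \mathrm{Free}[R] \to \mathrm{Free}[X]$ be the map induced by the inclusion $R \hookrightarrow \mathrm{Free}[X]$. We say that a presentation of $M$ is minimal if $R \subseteq I \cdot \mathrm{Free}[X]$ and $\ker \Phi_{\langle X|R \rangle} \subseteq I \cdot \mathrm{Free}[R]$.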

Definition 46 (Multigraded Betti Numbers) Let $M$ be a persistence module. The associated multigraded Betti numbers are maps $\xi_i(M) : \mathbb{R}^n \to \mathbb{N}$ defined by:

$$\xi_i(M)(a) = \dim_{\mathbb{F}}\left(\mathrm{Tor}_i^{P_n}(M, P_n / I P_n)_a\right).$$

Standard homological algebra arguments establish that the multigraded Betti numbers are well defined (see Lesnick and Wright (2015) for details). If $\langle X | R \rangle$ is a minimal presentation for $M$ then $\xi_0(M)(a) = |\mathrm{gr}_X^{-1}(a)|$ and $\xi_1(M)(a) = |\mathrm{gr}_R^{-1}(a)|$, where $\mathrm{gr}_X : X \to \mathbb{R}^n$, so that $|\mathrm{gr}_X^{-1}(a)|$ gives the cardinality of the collection of elements of $X$ with grading $a$.

The multigraded Betti numbers are related to the initial topological space, with $\xi_0(M)$ marking the filtration values for the birth of homological features, and $\xi_1(M)$ marking the filtration values for relations between features.

A.2. Generalized Interleavings

We adopt the notion of a generalized interleaving from Bubenik et al. (2015) to define interleaving distances on multiparameter persistence modules.

Definition 47 (Interleaving) (Bubenik et al., 2015) Let $P$ be a proset and $C$ be a category. Let $F, G \in C^P$ be modules and $\Gamma, K \in \mathrm{Trans}_P$. We say that $F, G$ are $(\Gamma, K)$-interleaved if there exist natural transformations $\varphi : F \Rightarrow G\Gamma$, $\psi : G \Rightarrow FK$ satisfying the coherence criteria that $(\psi\Gamma)\varphi = F\eta^{\mathrm{id}}_{K\Gamma}$ and $(\varphi K)\psi = G\eta^{\mathrm{id}}_{\Gamma K}$, where $\eta^{\mathrm{id}}_{\alpha}$ denotes the unique natural transformation between the translations $\mathrm{id} \le \alpha$.

An interleaving may be thought of as an approximate isomorphism. Indeed if we take $\Gamma = K = \mathrm{id}$ then $F, G$ are $(\Gamma, K)$-interleaved if and only if $F, G$ are isomorphic. By warping the proset with translations $\Gamma, K$ we admit flexibility to the rigid notion of isomorphism. In order to introduce an associated distance we must assign a weight to the translations to quantify how close the interleaving is to an isomorphism.

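Definition 48 (Superlinear Family) (Bubenik et al., 2015) Let $\Omega : [0, \infty) \to \mathrm{Trans}_P$ be a superlinear function: $\Omega_{\varepsilon_1 + \varepsilon_2} \ge \Omega_{\varepsilon_1}\Omega_{\varepsilon_2}$. We say that $\Omega$ is a superlinear family.

Definition 49 (ε-Interleaving) (Bubenik et al., 2015) Let $F, G \in C^P$ be modules and $\Omega$ a superlinear family. We say $F$ and $G$ are $\varepsilon$-interleaved with respect to $\Omega$ if they are $(\Omega_\varepsilon, \Omega_\varepsilon)$-interleaved.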

Proposition 50 (Induced Interleaving Distance) (Bubenik et al., 2015) Given a superlinear family $\Omega$ and modules $F, G \in C^P$, we have an induced interleaving distance given by:

$$d_\Omega(F, G) = \inf\{\varepsilon \ge 0 : F, G \text{ are } \varepsilon\text{-interleaved with respect to } \Omega\}.$$
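One way to see why superlinearity is the natural hypothesis is that it yields the triangle inequality for the induced distance. The sketch below assumes two standard facts from Bubenik et al. (2015): interleavings compose, and a $(\Gamma, K)$-interleaving gives a $(\Gamma', K')$-interleaving whenever $\Gamma \le \Gamma'$ and $K \le K'$. The third module $H$ and the values $\varepsilon_1, \varepsilon_2$ are introduced only for illustration.

```latex
\begin{align*}
&F, G \text{ are } (\Omega_{\varepsilon_1}, \Omega_{\varepsilon_1})\text{-interleaved and }
 G, H \text{ are } (\Omega_{\varepsilon_2}, \Omega_{\varepsilon_2})\text{-interleaved} \\
&\quad\implies F, H \text{ are } (\Omega_{\varepsilon_2}\Omega_{\varepsilon_1},\ \Omega_{\varepsilon_1}\Omega_{\varepsilon_2})\text{-interleaved}
 && \text{(composition)} \\
&\quad\implies F, H \text{ are } (\Omega_{\varepsilon_1 + \varepsilon_2},\ \Omega_{\varepsilon_1 + \varepsilon_2})\text{-interleaved}
 && \text{(superlinearity and monotonicity)} \\
&\quad\implies d_\Omega(F, H) \le d_\Omega(F, G) + d_\Omega(G, H)
 && \text{(taking infima)}.
\end{align*}
```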


[Figure 9 labels: $M = [(1, 1), (4, 4)] \oplus [(2, 0), (4, 2)]$ and $N = [(2, 2), (3, 3)]$.]

Figure 9: A pair of interval decomposable modules $M, N$ with interleaving distance $d_I(M, N) = 1$.

The most common superlinear family to consider for persistence modules indexed by $\mathbb{R}^n$ is given by the translation in the diagonal direction, $\Omega_\varepsilon(x) = x + \varepsilon\mathbf{1}$. We refer to the interleaving distance induced by this superlinear family as simply the interleaving distance and denote this distance by $d_I$.

Example 4 Let $M, N \in \mathbf{vect}^{\mathbb{R}^2}$ be interval decomposable modules with $M = [(1, 1), (4, 4)] \oplus [(2, 0), (4, 2)]$ and $N = [(2, 2), (3, 3)]$. There is a $(1 + \delta)$-interleaving between $M$ and $N$ for all $\delta > 0$ obtained by matching the first summand of $M$ with the summand of $N$. Explicitly, the natural transformation $\varphi : M \Rightarrow N\Omega_{1+\delta}$ is zero everywhere except at grades $a \in \mathbb{R}^2$ such that $a$ lies in the support of the first summand of $M$ and $\Omega_{1+\delta}(a)$ lies in the support of $N$, for which it is an isomorphism. The natural transformation $\psi : N \Rightarrow M\Omega_{1+\delta}$ is defined similarly. There is no 1-interleaving between this pair of modules. For example, the morphism $M_{(2,0)} \to M_{(4,2)}$ is non-zero and so cannot factor through $N\Omega_1(2, 0) = N_{(3,1)} = 0$. Hence the interleaving distance between these modules is $d_I(M, N) = 1$.
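The pair of modules in this example also gives a quick check that the landscape distance is a lower bound for the interleaving distance. The sketch below rests on an assumption about the main-text definition: it takes the landscape to be $\lambda(k, x) = \sup\{\varepsilon \ge 0 : \mathrm{rk}(x - \varepsilon\mathbf{1}, x + \varepsilon\mathbf{1}) \ge k\}$, which for a direct sum of rectangles is the k-th largest "depth" of $x$ inside the rectangles. The function names and the sampling grid are hypothetical.

```python
def interior_depth(rect, x):
    """Largest eps with x - eps*1 and x + eps*1 both inside the rectangle (0 if x is outside)."""
    lower, upper = rect
    depth = min(min(xi - lo, hi - xi) for xi, lo, hi in zip(x, lower, upper))
    return max(depth, 0.0)

def landscape(rects, k, x):
    """k-th landscape value at x for a direct sum of rectangle modules (assumed formula)."""
    depths = sorted((interior_depth(rect, x) for rect in rects), reverse=True)
    return depths[k - 1] if k <= len(depths) else 0.0

# The modules of Example 4, each summand given by its lower and upper corner.
M = [((1, 1), (4, 4)), ((2, 0), (4, 2))]
N = [((2, 2), (3, 3))]

# Maximum landscape difference over a grid of step 0.25 on [0, 5]^2 and k = 1, 2.
grid = [(0.25 * i, 0.25 * j) for i in range(21) for j in range(21)]
landscape_distance = max(
    abs(landscape(M, k, x) - landscape(N, k, x)) for k in (1, 2) for x in grid
)
print(landscape_distance)  # 1.0
```

The maximum is attained at $x = (2.5, 2.5)$, where the first landscape of $M$ is 1.5 and that of $N$ is 0.5, so on this grid the lower bound agrees with $d_I(M, N) = 1$.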

A.3. Discretization and Continuous Extension

In Section 6 our computations were simplified by restricting a continuous module to a finite grid and then dealing with the continuous extension of this discretization. We will show that restricting to a finite grid gives us a suitable approximation to our module with respect to the interleaving distance between modules.

Definition 51 (Grid Function) Let $G : \mathbb{Z}^n \to \mathbb{R}^n$ be defined by component-wise strictly increasing functions $G_i : \mathbb{Z} \to \mathbb{R}$ with $\sup G_i = \sup(-G_i) = \infty$. We say $G$ is a grid function. Let us define the size of $G$ to be

$$|G| = \max_{1 \le i \le n} \sup_{z \in \mathbb{Z}} |G_i(z) - G_i(z + 1)|.$$

Definition 52 (Discretization) Let $M \in \mathbf{Vect}^{\mathbb{R}^n}$ be a persistence module and $G$ a grid function. We say that $M \circ G \in \mathbf{Vect}^{\mathbb{Z}^n}$ is the $G$-discretization of $M$.

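Definition 53 (Continuous Extension) (Lesnick and Wright, 2015) Let $Q \in \mathbf{Vect}^{\mathbb{Z}^n}$ be a discrete persistence module and $G$ a grid function. For $x \in \mathbb{R}^n$ let us define floor and ceiling functions:

$$\lfloor x \rfloor_G = \max\{z \in \operatorname{Im} G : z \le x\} \quad \text{and} \quad \lceil x \rceil_G = \min\{z \in \operatorname{Im} G : z \ge x\}.$$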

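We define the continuous extension $E_G(Q) \in \mathbf{Vect}^{\mathbb{R}^n}$ to be the persistence module with:

$$E_G(Q)_a = Q_{G^{-1}(\lfloor a \rfloor_G)} \quad \text{and} \quad E_G(Q)(a \le b) = Q\left(G^{-1}(\lfloor a \rfloor_G) \le G^{-1}(\lfloor b \rfloor_G)\right)$$

with the obvious action on the morphisms of $\mathbf{Vect}^{\mathbb{Z}^n}$. We have defined a functor $E_G : \mathbf{Vect}^{\mathbb{Z}^n} \to \mathbf{Vect}^{\mathbb{R}^n}$.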
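The grid size, the floor and ceiling maps and the lookup performed by the continuous extension can be sketched as below. This is an illustration under assumptions: a genuine grid function is defined on all of $\mathbb{Z}^n$ and is unbounded, whereas here each coordinate is truncated to a finite sorted list, and the discrete module is represented only by a dictionary of values indexed by grid position. All function names are hypothetical.

```python
from bisect import bisect_left, bisect_right

def grid_size(axes):
    """Largest gap between consecutive grid values over all coordinate axes."""
    return max(b - a for axis in axes for a, b in zip(axis, axis[1:]))

def grid_floor(axes, x):
    """Index and value of the largest grid point that is <= x in every coordinate."""
    index = tuple(bisect_right(axis, xi) - 1 for axis, xi in zip(axes, x))
    if min(index) < 0:
        raise ValueError("x lies below the finite grid window")
    return index, tuple(axis[i] for axis, i in zip(axes, index))

def grid_ceiling(axes, x):
    """Index and value of the smallest grid point that is >= x in every coordinate."""
    index = tuple(bisect_left(axis, xi) for axis, xi in zip(axes, x))
    if any(i == len(axis) for axis, i in zip(axes, index)):
        raise ValueError("x lies above the finite grid window")
    return index, tuple(axis[i] for axis, i in zip(axes, index))

def continuous_extension(Q, axes, x):
    """Value of E_G(Q) at x: the value of Q at the grid index of floor_G(x)."""
    index, _ = grid_floor(axes, x)
    return Q[index]

# Toy grid on a window of R^2 and a toy table of values indexed by grid position.
axes = [[0.0, 0.5, 1.5, 2.0], [0.0, 1.0, 2.0]]
Q = {(i, j): min(i, j) for i in range(4) for j in range(3)}
x = (0.7, 1.0)
```

For this toy grid, `grid_floor(axes, x)` returns the grid point $(0.5, 1.0)$ and `grid_ceiling(axes, x)` returns $(1.5, 1.0)$, each within `grid_size(axes)` of `x` in every coordinate.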

The following proposition shows that discretization is stable with respect to the interleaving distance, and so we may produce an arbitrarily close approximation to a persistence module by restricting the module to a grid of sufficiently small size.

Proposition 54 Let $M \in \mathbf{Vect}^{\mathbb{R}^n}$ be a persistence module and $G$ a grid function. We have that $d_I(M, E_G(M \circ G)) \le |G|$.

Proof The modules $M$ and $E_G(M \circ G)$ are $(\lceil \cdot \rceil_G, \mathrm{id})$-interleaved with natural transformations given by the appropriate internal morphisms of $M$.
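The step from this interleaving to the bound by $|G|$ can be spelled out as follows. The sketch reads the translation in the proof as the ceiling map $\lceil \cdot \rceil_G$ of Definition 53 and assumes the monotonicity of interleavings in their translations from Bubenik et al. (2015).

```latex
\begin{align*}
x \le \lceil x \rceil_G \le x + |G|\,\mathbf{1} = \Omega_{|G|}(x) \ \text{ for all } x \in \mathbb{R}^n
 &\implies \lceil \cdot \rceil_G \le \Omega_{|G|} \ \text{ and } \ \mathrm{id} \le \Omega_{|G|} \\
 &\implies M,\ E_G(M \circ G) \text{ are } (\Omega_{|G|}, \Omega_{|G|})\text{-interleaved} \\
 &\implies d_I(M, E_G(M \circ G)) \le |G|.
\end{align*}
```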

Appendix B. Probability in Banach Spaces

We shall present the general theory of probability for Banach space valued random variables from which we derive the convergence results of Section 5. Let us begin by defining some notation. Let $(B, \|\cdot\|)$ denote a real, separable Banach space with topological dual space $B^*$. Let $V : (\Omega, \mathcal{F}, \mathbb{P}) \to B$ denote a Borel measurable random variable. The covariance structure of such a random variable is defined to be the set of expectations

$$\{E[(f(V) - E[f(V)])(g(V) - E[g(V)])] : f, g \in B^*\}.$$

In order to take expected values of Banach valued random variables we require the notion of the Pettis Integral, which is an extension of the Lebesgue integral to functions on measure spaces taking values in normed spaces. We shall briefly introduce the properties of this integral and existence criteria, the essence of which is built upon reducing the problem to integrability of $\mathbb{R}$-valued functions.

Definition 55 (Scalarly Integrable) (Geitz, 1981) A function $V : (\Omega, \mathcal{F}, \mu) \to B$ is scalarly integrable if for all $f \in B^*$ we have that $f(V)$ is measurable and $f(V) \in L^1(\mu)$.

Definition 56 (Pettis Integrable) (Geitz, 1981) A scalarly integrable function $V : (\Omega, \mathcal{F}, \mu) \to B$ is Pettis integrable if for all $E \in \mathcal{F}$ there is an element $I_V(E) \in B$ such that:

$$\int_E f(V)\, d\mu = f(I_V(E)) \quad \text{for all } f \in B^*.$$

The set function $I_V : \mathcal{F} \to B$ is called the Pettis Integral of $V$ with respect to $\mu$. We may also refer to $I_V(\Omega)$ as the Pettis Integral of $V$ and denote this by $I_V$.


Definition 57 (Strong and Weak Measurability) (Musiał, 1991) A function $V : (\Omega, \mathcal{F}, \mu) \to B$ is simple if there exist $v_1, \ldots, v_m \in B$ and $E_1, \ldots, E_m \in \mathcal{F}$ such that $V = \sum_{i=1}^{m} v_i \mathbf{1}_{E_i}$. A function $V : (\Omega, \mathcal{F}, \mu) \to B$ is strongly $\mu$-measurable if there exists a sequence of simple functions $V_n : (\Omega, \mathcal{F}, \mu) \to B$ such that $\lim_{n \to \infty} \|V_n(\omega) - V(\omega)\| = 0$ $\mu$-almost everywhere. A function $V : (\Omega, \mathcal{F}, \mu) \to B$ is weakly $\mu$-measurable if for all functionals in the dual space $v \in B^*$ the real-valued function $v \circ V : (\Omega, \mathcal{F}, \mu) \to \mathbb{R}$ is $\mu$-measurable. We shall suppress the prefix $\mu$ if the measure is clear from context.

Theorem 58 (Pettis Measurability Theorem) (Musiał, 1991, Theorem 3.1) A function $V : (\Omega, \mathcal{F}, \mu) \to B$ is strongly measurable if and only if:

1. V is weakly measurable.

2. There exists a $\mu$-null set $E$ such that $V(\Omega \setminus E)$ is a separable subset of $B$.

Proposition 59 (Musiał, 1991, Proposition 5.1) A strongly measurable $B$-valued function $V : (\Omega, \mathcal{F}, \mu) \to B$ with $E_\mu[\|V\|] < \infty$ is Pettis Integrable.

A consequence of the Pettis Measurability Theorem is that if the codomain is a separable Banach space then the notions of weak and strong measurability coincide. Thus Proposition 59 gives a sufficient criterion for Pettis Integrability in the setting of multiparameter persistence landscapes endowed with the $p$-norm for $p \in [1, \infty)$, for which the underlying Banach space is separable.

Corollary 60 Let $V : (\Omega, \mathcal{F}, \mu) \to B$ be measurable with $B$ real and separable. If $E_\mu[\|V\|] < \infty$, then $V$ is Pettis Integrable and $\|I_V(\Omega)\| \le E_\mu[\|V\|]$.

Proof By assumption $V$ is measurable. Moreover, $E_\mu[\|V\|] < \infty$ implies that for any linear functional $v \in B^*$ we have $E_\mu[|v \circ V|] \le \|v\| E_\mu[\|V\|] < \infty$ and so $V$ is scalarly integrable. Hence by Proposition 59 $V$ is Pettis Integrable. A straightforward application of the Hahn-Banach theorem establishes that $\|I_V(\Omega)\| \le E_\mu[\|V\|]$.

Theorem 61 (Strong Law of Large Numbers) (Ledoux and Talagrand, 2011, Corollary 7.10) Let $V_i$ be i.i.d. copies of $V : (\Omega, \mathcal{F}, \mathbb{P}) \to B$ and let $S_n = \sum_{i=1}^{n} V_i$. We have that $E[\|V\|] < \infty$ if and only if:

$$\frac{S_n}{n} \to I_V(\Omega) \quad \text{almost surely as } n \to \infty.$$
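A small simulation can illustrate the statement in the simplest possible setting. The sketch below is only a finite-dimensional stand-in: it uses $\mathbb{R}^5$ with the supremum norm in place of a space of landscapes, draws vectors with independent uniform coordinates on $[0, 1)$ (so the Pettis integral is the coordinate-wise mean, 0.5 in each coordinate), and reports the norm of the deviation of the sample mean. The function names, the dimension and the distribution are assumptions made for illustration.

```python
import random

def sup_norm(v):
    """Supremum norm of a finite vector."""
    return max(abs(c) for c in v)

def sample_mean_error(n_samples, dim=5, seed=0):
    """Sup-norm distance between the mean of n i.i.d. vectors and the true mean."""
    rng = random.Random(seed)
    total = [0.0] * dim
    for _ in range(n_samples):
        # Each sample has independent uniform [0, 1) coordinates, with mean 0.5.
        total = [t + rng.random() for t in total]
    return sup_norm([t / n_samples - 0.5 for t in total])

# Deviation of S_n / n from the mean for a small and a large sample size.
errors = {n: sample_mean_error(n) for n in (100, 10000)}
```

With discretized landscapes in place of the toy vectors, the same running average would approximate the mean landscape.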

Definition 62 We say a $B$-valued random variable $X$ is Gaussian if for all $f \in B^*$ the real valued random variable $f(X)$ is Gaussian with mean zero. Note that such a Gaussian random variable is determined by its covariance structure (Ledoux and Talagrand, 2011).

The next result only applies for a certain class of Banach spaces. The type and cotype of a Banach space can be thought of loosely as a measure of how close that Banach space is to a Hilbert space; see Ledoux and Talagrand (2011) for more details. For $p \in [1, 2]$ the Lebesgue space $L^p$ has type $p$ and cotype 2, and for $p \in [2, \infty)$ the Lebesgue space $L^p$ has type 2 and cotype $p$.

Theorem 63 (Central Limit Theorem) (Hoffmann-Jørgensen and Pisier, 1976) Let $B$ be a Banach space of type 2 and $V : (\Omega, \mathcal{F}, \mathbb{P}) \to B$. If $I_V = 0$ and $E[\|V\|^2] < \infty$ then $\frac{1}{\sqrt{n}} S_n$ converges weakly to a Gaussian random variable with the same covariance structure as $V$.

References

Henry Adams, Tegan Emerson, Michael Kirby, Rachel Neville, Chris Peterson, Patrick Shipman, Sofya Chepushtanova, Eric Hanson, Francis Motta, and Lori Ziegelmeier. Persistence images: A stable vector representation of persistent homology. Journal of Machine Learning Research, 18(8):1-35, 2017. URL http://jmlr.org/papers/v18/16-337.html.

Aaron Adcock, Erik Carlsson, and Gunnar Carlsson. The ring of algebraic functions on persistence bar codes. Homology, Homotopy and Applications, 18(1):381-402, 2016. doi: 10.4310/hha.2016.v18.n1.a21. URL https://doi.org/10.4310/hha.2016.v18.n1.a21.

Ulrich Bauer and Michael Lesnick. Induced matchings of barcodes and the algebraic stability of persistence. In Proceedings of the Thirtieth Annual Symposium on Computational Geometry (SoCG 2014), pages 355-364. ACM, 2014. ISBN 978-1-4503-2594-3. doi: 10.1145/2582112.2582168. URL http://doi.acm.org/10.1145/2582112.2582168.

Paul Bendich, J. S. Marron, Ezra Miller, Alex Pieloch, and Sean Skwerer. Persistent homology analysis of brain artery trees. Annals of Applied Statistics, 10(1):198-218, 2016. ISSN 1932-6157. doi: 10.1214/15-AOAS886. URL https://uncch.pure.elsevier.com/en/publications/persistent-homology-analysis-of-brain-artery-trees.

Leo Betthauser, Peter Bubenik, and Parker B. Edwards. Graded persistence diagrams and persistence landscapes. arXiv e-prints, art. arXiv:1904.12807, 2019.

Håvard Bakke Bjerkevik and Magnus Bakke Botnan. Computational complexity of the interleaving distance. In 34th International Symposium on Computational Geometry (SoCG 2018), Leibniz International Proceedings in Informatics (LIPIcs), pages 1-15. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2018. doi: 10.4230/LIPIcs.SoCG.2018.13.

Håvard Bakke Bjerkevik. Stability of higher-dimensional interval decomposable persistence modules. arXiv e-prints, abs/1609.02086, 2016. URL http://arxiv.org/abs/1609.02086.

Håvard Bakke Bjerkevik, Magnus Bakke Botnan, and Michael Kerber. Computing the interleaving distance is NP-hard. Foundations of Computational Mathematics, 2019. ISSN 1615-3383. doi: 10.1007/s10208-019-09442-y. URL https://doi.org/10.1007/s10208-019-09442-y.

Peter Bubenik. Statistical topological data analysis using persistence landscapes. Journal of Machine Learning Research, 16(1):77-102, 2015. ISSN 1532-4435. URL http://jmlr.org/papers/v16/bubenik15a.html.

Peter Bubenik. The persistence landscape and some of its properties. arXiv e-prints, art. arXiv:1810.04963, 2018.

Peter Bubenik and Paweł Dłotko. A persistence landscapes toolbox for topological statistics. Journal of Symbolic Computation, 78:91-114, 2017. ISSN 0747-7171. doi: https://doi.org/10.1016/j.jsc.2016.03.009. URL http://www.sciencedirect.com/science/article/pii/S0747717116300104.

Peter Bubenik, Vin de Silva, and Jonathan Scott. Metrics for generalized persistence modules. Foundations of Computational Mathematics, 15(6):1501-1531, 2015. ISSN 1615-3383. doi: 10.1007/s10208-014-9229-5.

Peter Bubenik, Michael Hull, Dhruv Patel, and Benjamin Whittle. Persistent homology detects curvature. Inverse Problems, 2019. URL http://iopscience.iop.org/10.1088/1361-6420/ab4ac0.

Gunnar Carlsson and Afra Zomorodian. The theory of multidimensional persistence. Discrete and Computational Geometry, 42(1):71-93, 2009. ISSN 0179-5376. doi: 10.1007/s00454-009-9176-0.

Andrea Cerri, Barbara Di Fabio, Massimo Ferri, Patrizio Frosini, and Claudia Landi. Betti numbers in multidimensional persistent homology are stable functions. Mathematical Methods in the Applied Sciences, 36(12):1543-1557, 2013. ISSN 1099-1476. doi: 10.1002/mma.2704. URL https://onlinelibrary.wiley.com/doi/abs/10.1002/mma.2704.

Frédéric Chazal, Brittany Terese Fasy, Fabrizio Lecci, Alessandro Rinaldo, and Larry A. Wasserman. Stochastic convergence of persistence landscapes and silhouettes. Journal of Computational Geometry, 6:140-161, 2014.

Frédéric Chazal, Brittany Terese Fasy, Fabrizio Lecci, Alessandro Rinaldo, Aarti Singh, and Larry A. Wasserman. On the bootstrap for persistence diagrams and landscapes. Modeling and Analysis of Information Systems, 20(6):111-120, 2015. ISSN 2313-5417, 1818-1015. doi: 10.18255/1818-1015-2013-6-111-120. URL https://www.mais-journal.ru/jour/article/view/162.

David Cohen-Steiner, Herbert Edelsbrunner, and Dmitriy Morozov. Vines and vineyards by updating persistence in linear time. In Proceedings of the Twenty-second Annual Symposium on Computational Geometry (SoCG 2006), pages 119-126. ACM, 2006. ISBN 1-59593-340-9. doi: 10.1145/1137856.1137877. URL http://doi.acm.org/10.1145/1137856.1137877.

William Crawley-Boevey. Decomposition of pointwise finite-dimensional persistence modules. Journal of Algebra and its Applications, 14(05):1550066, 2015. doi: 10.1142/S0219498815500668. URL https://www.worldscientific.com/doi/abs/10.1142/S0219498815500668.

Herbert Edelsbrunner and John Harer. Computational Topology - an Introduction. American Mathematical Society, 2010. ISBN 978-0-8218-4925-5.

Patrizio Frosini and Claudia Landi. Persistent Betti numbers for a noise tolerant shape-based approach to image retrieval. In Computer Analysis of Images and Patterns, pages 294-301. Springer-Verlag Berlin Heidelberg, August 2011. ISBN 978-3-642-23671-6, 978-3-642-23672-3. doi: 10.1007/978-3-642-23672-3_36. URL https://link.springer.com/chapter/10.1007/978-3-642-23672-3_36.

Marcio Gameiro, Yasuaki Hiraoka, Shunsuke Izumi, Miroslav Kramar, Konstantin Mischaikow, and Vidit Nanda. A topological measurement of protein compressibility. Japan Journal of Industrial and Applied Mathematics, 32(1):1-17, 2015. ISSN 0916-7005, 1868-937X. doi: 10.1007/s13160-014-0153-5. URL https://link.springer.com/article/10.1007/s13160-014-0153-5.

Robert F. Geitz. Pettis Integration. Proceedings of the American Mathematical Society, 82(1):81, 1981. ISSN 0002-9939. doi: 10.2307/2044321. URL http://www.jstor.org/stable/2044321?origin=crossref.

Irving J. Good and Ray A. Gaskins. Density estimation and bump-hunting by the penalized likelihood method exemplified by scattering and meteorite data. Journal of the American Statistical Association, 75(369):42-56, 1980. ISSN 0162-1459. URL http://www.jstor.org/stable/2287377.

Heather A. Harrington, Nina Otter, Hal Schenck, and Ulrike Tillmann. Stratifying multiparameter persistent homology. SIAM Journal on Applied Algebra and Geometry, 3(3):439-471, 2019. doi: 10.1137/18M1224350. URL https://doi.org/10.1137/18M1224350.

Jørgen Hoffmann-Jørgensen and Gilles Pisier. The Law of Large Numbers and the Central Limit Theorem in Banach Spaces. The Annals of Probability, 4(4):587-599, 1976.

Sara Kališnik. Tropical coordinates on the space of persistence barcodes. Foundations of Computational Mathematics, 19(1):101-129, 2019. ISSN 1615-3383. doi: 10.1007/s10208-018-9379-y. URL https://doi.org/10.1007/s10208-018-9379-y.

Xia Kelin and Wei Guo-Wei. Multidimensional persistence in biomolecular data. Journal of Computational Chemistry, 36(20):1502-1520, July 2015. ISSN 1096-987X. doi: 10.1002/jcc.23953. URL https://onlinelibrary.wiley.com/doi/abs/10.1002/jcc.23953.

Bryn Keller, Michael Lesnick, and Theodore L. Willke. PHoS: Persistent homology for virtual screening. 2018. doi: 10.26434/chemrxiv.6969260.v1. URL https://chemrxiv.org/articles/PHoS_Persistent_Homology_for_Virtual_Screening/6969260.

Michel Ledoux and Michel Talagrand. Probability in Banach Spaces, Isoperimetry and Processes. Springer-Verlag Berlin Heidelberg, 2011. ISBN 978-3-642-20212-4.

Michael Lesnick. The theory of the interleaving distance on multidimensional persistence modules. Foundations of Computational Mathematics, 15(3):613-650, Jun 2015. ISSN 1615-3383. doi: 10.1007/s10208-015-9255-y. URL https://doi.org/10.1007/s10208-015-9255-y.

Michael Lesnick and Matthew Wright. Interactive visualization of 2-D persistence modules. arXiv e-prints, art. arXiv:1512.00180, December 2015.

Ezra Miller. Data structures for real multiparameter persistence modules. arXiv e-prints, art. arXiv:1709.08155, September 2017.

Chul Moon, Noah Giansiracusa, and Nicole A. Lazar. Persistence terrace for topological inference of point cloud data. Journal of Computational and Graphical Statistics, 27(3):576-586, 2018. doi: 10.1080/10618600.2017.1422432. URL https://doi.org/10.1080/10618600.2017.1422432.

Kazimierz Musiał. Topics in the theory of Pettis integration. 1991. ISSN 0049-4704.

Monica Nicolau, Arnold J. Levine, and Gunnar Carlsson. Topology based data analysis identifies a subgroup of breast cancers with a unique mutational profile and excellent survival. Proceedings of the National Academy of Sciences of the United States of America, 108(17):7265-7270, April 2011. ISSN 0027-8424. doi: 10.1073/pnas.1102826108. URL https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3084136/.

Steve Y. Oudot. Persistence Theory: From Quiver Representations to Data Analysis. Number 209 in Mathematical Surveys and Monographs. American Mathematical Society, 2015. URL https://hal.inria.fr/hal-01247501.

Jacek Skryzalin and Gunnar Carlsson. Numeric invariants from multidimensional persistence. Journal of Applied and Computational Topology, 1(1):89-119, 2017. ISSN 2367-1726. doi: 10.1007/s41468-017-0003-z. URL http://link.springer.com/10.1007/s41468-017-0003-z.

The RIVET Developers. Rivet, 2018. URL http://rivet.online.