# dynamic_angular_synchronization_under_smoothness_constraints__8fefa555.pdf

Journal of Machine Learning Research 26 (2025) 1-45 Submitted 6/24; Revised 2/25; Published 4/25

Dynamic angular synchronization under smoothness constraints

Ernesto Araya araya@math.lmu.de Department of Mathematics Ludwig-Maximilians-Universit at M unchen Geschwister-Scholl-Platz 1, Munich, 80539, Germany

Mihai Cucuringu mihai@math.ucla.edu Department of Mathematics University of California Los Angeles 520 Portola Plaza, Los Angeles, CA 90095, USA Department of Statistics & Oxford-Man Institute of Quantitative Finance University of Oxford 24-29 St Giles , Oxford OX1 3LB, United Kingdom

Hemant Tyagi hemant.tyagi@ntu.edu.sg Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, 21 Nanyang Link, 637371, Singapore

Editor: Jie Peng

Given an undirected measurement graph H = ([n], E), the classical angular synchronization problem consists of recovering unknown angles θ 1, . . . , θ n from a collection of noisy pairwise measurements of the form (θ i θ j ) mod 2π, for all {i, j} E. This problem arises in a variety of applications, including computer vision, time synchronization of distributed networks, and ranking from pairwise comparisons. In this paper, we consider a dynamic version of this problem where the angles, and also the measurement graphs evolve over T time points. Assuming a smoothness condition on the evolution of the latent angles, we derive three algorithms for joint estimation of the angles over all time points. Moreover, for one of the algorithms, we establish non-asymptotic recovery guarantees for the meansquared error (MSE) under diﬀerent statistical models. In particular, we show that the MSE converges to zero as T increases under milder conditions than in the static setting. This includes the setting where the measurement graphs are highly sparse and disconnected, and also when the measurement noise is large and can potentially increase with T. We complement our theoretical results with experiments on synthetic data.

Keywords: angular synchronization, dynamical networks, smoothness constrained estimators

c 2025 Ernesto Araya, Mihai Cucuringu, Hemant Tyagi.

License: CC-BY 4.0, see https://creativecommons.org/licenses/by/4.0/. Attribution requirements are provided at http://jmlr.org/papers/v26/24-0925.html.

Araya, Cucuringu and Tyagi

1. Introduction

The group synchronization problem, which consists of recovering group elements g 1, . . . , g n from a potentially sparse set of noisy pairwise ratios g i g 1 j , has attracted considerable attention in the last decade, starting with the seminal paper of Singer (Singer, 2011). The motivation for this problem originated from the cryo-EM (Shkolnisky and Singer, 2012) and sensor network localization (Cucuringu et al., 2012a) literatures, but instantiations of this problem occur in a variety of application domains, including computer vision (Arie Nachimson et al., 2012) in the context of the structure-from-motion problem ( Ozye sil et al., 2017), ranking from pairwise comparisons (Cucuringu, 2016), community detection in social networks (Cucuringu, 2015), NMR spectroscopy (Cucuringu et al., 2012b), temporal ordering and registration of images in image processing applications (Dsilva et al., 2015), and time synchronization of distributed networks (Giridhar and Kumar, 2006). In the most typical setting, the group elements g 1, . . . , g n are represented as the nodes of a simple undirected graph H = ([n], E), where an edge {i, j} is present if and only if the observer has access to a noisy version of g i g j 1. In the particular case where H is connected and the exact value g i g j 1 is observed, the recovery problem has a straightforward solution. Indeed, it suﬃces to consider a spanning tree of H, assign any value to an arbitrarily chosen root, and the remaining n 1 values are found by traversing the edges of tree, solving each time a simple equation. Consequently, these n 1 values are uniquely determined up to an global shift deﬁned by the chosen value of the root node. On the other hand, if H is disconnected, there is insuﬃcient information to determine a global solution, since there are no comparisons between the elements of diﬀerent connected components.

1.1 Angular synchronization

The simplest groups for which the synchronization problem has been studied are Z2, in the context of community detection and signed clustering with two clusters (Cucuringu, 2015; Liu et al., 2023), and the non-compact group R arising in the synchronization of clocks in a distributed network (Giridhar and Kumar, 2006). The special orthogonal group SO(d) is the most commonly studied group with a particular emphasis on SO(2) often referred to as angular synchronization which ﬁnds applications in various practical domains, including 2D sensor network localization (Cucuringu et al., 2012a) and ranking from pairwise comparisons (Cucuringu, 2016). In the angular synchronization problem, which is the focus of this paper, the goal is to recover n unknown angles θ 1, . . . , θ n [0, 2π) from a set of m noisy measurements of their pairwise oﬀsets Θij = (θ i θ j) mod 2π. Equivalently, we can represent each angle θ i by a unit modulus complex number g i = eιθ i and the pairwise oﬀsets by g i g j H, where for any z C , z H represent its complex conjugate. As explained above, the available pairwise measurements can be encoded into the edges of a measurement graph H, which in most typical applications can be modeled as a sparse random geometric graph (e.g., disc graph), a k-nearest-neighbor graph, or an Erd os-R enyi graph. The diﬃculty of the problem is ampliﬁed, on one hand, by the amount of noise in the available pairwise oﬀset measurements, and on the other hand by the sparsity of the measurements, since typically, |E| n 2 , i.e., only a small subset of all possible pairwise oﬀsets are available to the user.

Dynamic angular synchronization under smoothness constraints

Aiming to reconstruct the angle oﬀsets as best as possible, the initial approach of Singer (Singer, 2011) considered a quadratic maximization problem over the angular domain, that aims to maximize the number of preserved angle oﬀsets. Denote by A the measurement matrix of size n n (which can be seen as a weighted adjacency matrix of H = ([n], E)), where for each {i, j} [n] 2 , one considers the angular embedding

( eıΘij if {i, j} E 0 if {i, j} / E ,

with diagonal entries Aii = 0. Note that A is Hermitian since Θji = ( Θij) mod 2π. One then considers the following maximization problem

max θ1,...,θn [0,2π)

i,j=1 e ιθi Aijeιθj max g1,...,gn C; |gi|=1

i,j=1 g i Aijgj, (1)

which is in general NP hard to solve (Zhang and Huan, 2006). This intractability is usually circumvented by considering continuous relaxations of (1), leading typically to a spectral algorithm (Singer, 2011), or a semi-deﬁnite program (SDP) (Singer, 2011; Bandeira et al., 2017). Other algorithmic approaches include message passing algorithms (Perry et al., 2018; Bandeira et al., 2018), those based on graph neural networks (He et al., 2024), and also the generalized power method (Liu et al., 2017; Boumal, 2016). The statistical performance of algorithms is typically studied under the following two noise models.

Additive Gaussian noise (AGN) model (Bandeira et al., 2017). Here, H is the simple complete graph and the observations are of the form g i g j H + σWij, where (Wij)i<j are independent complex standard Gaussian s, Wii = 0 (for all i), and σ > 0 represents the noise level. Note that (1) is the maximum likelihood estimator (MLE) for the AGN model.

Outliers model (Singer, 2011). Here, H is modeled as an Erd os-R enyi graph with connection probability p. When p is small, the resulting measurement graph will be sparse and potentially disconnected (e.g., when p = o(log n

n )). Under this model, whenever {i, j} is an edge in H, we observe either the true value g i g j H with probability η, or a value uniformly distributed on the complex unit circle with probability 1 η. The parameter η quantiﬁes the level of noise.

Several results now exist pertaining to the statistical error analysis of the aforementioned algorithms for the above models. For instance, the SDP is known to be tight1 for the AGN model if σ is not too large w.r.t n (Bandeira et al., 2017; Zhong and Boumal, 2018). In particular, this was ﬁrst shown in (Bandeira and van Handel, 2016) for σ = O(n1/4), and was later improved to σ = O( p

n/ log n) in (Zhong and Boumal, 2018). Recovery guarantees for the spectral method exist for both the Outliers model (Singer, 2011), and the AGN model (e.g., (Boumal, 2016; Zhong and Boumal, 2018)). It was shown in (Zhong and Boumal, 2018) for the AGN model that the spectral method recovers g with an ℓ2 error bound of O(σ), and an ℓ error bound of O(σ p

log n/n), provided σ = O( p

1. This means that its global solution recovers that of (1).

Araya, Cucuringu and Tyagi

For this regime of σ, it was also shown in (Zhong and Boumal, 2018) that (with spectral initialization) the iterates of the generalized power method (GPM) converge linearly to the global solution of (1). This improved upon previous convergence results for the GPM (Boumal, 2016; Liu et al., 2017) which required stronger conditions on σ.

Towards dynamic group synchronization. In many applications of group synchronization, the underlying measurement graph, along with the associated pairwise data evolve with time. This is the case, e.g., for the ranking problem where the strengths of players, and hence the pairwise comparison outcomes evolve with time (e.g., sports tournaments (Cattelan et al., 2013; Lopez et al., 2018), recommender systems), and has been the subject of recent work (Bong et al., 2020; Araya et al., 2023; Karl e and Tyagi, 2023). In (Cucuringu, 2016), this problem was formulated as an instance of the angular synchronization problem, along with an algorithm Sync Rank which leads to state-of-art performance empirically. It is also the case for the signed community detection problem (with two communities), where the network and the underlying communities can evolve with time (e.g., (Chen et al., 2020)). This motivates us to initiate a principled study of group synchronization in a dynamic setting where the latent group elements, and the measurement graphs can both evolve with time.

1.2 Dynamic angular synchronization

In this paper, we introduce a dynamic version of the angular synchronization problem. Here, we are given a sequence of measurement graphs of the form Hk = ([n], Ek), over T time points k = 1, . . . , T. At each k, we are given noisy pairwise information about the latent signal g (k) Cn as per the AGN model, or the Outliers model; see Section 2.2. While the measurements are assumed to be independently generated over time, we assume that g (k) s evolve smoothly in the sense that

k=1 g (k) g (k + 1) 2 2 ST , (2)

for a suitably small smoothness parameter ST . The above condition essentially stipulates that the quadratic variation of g (k) w.r.t the path graph on T vertices is small. A similar assumption was recently used in (Araya et al., 2023) in the context of dynamic translation synchronization, which is also the main motivation for the present paper. Our goal then is to recover the vector g Cn T , obtained by column-stacking g (k) s. Speciﬁcally, we aim to output an estimate bg of g that is weakly-consistent w.r.t T in terms of the mean squared error (MSE), i.e.,

1 T bg g 2 2 = 1

k=1 bg(k) g (k) 2 2 T 0. (3)

The following points are useful to keep in mind.

1. When all the graphs are connected, one could estimate each g (k) by only using the pairwise information at time k. However this will typically lead to 1

T bg g 2 2 = Ω(1) as T . Exploiting the smoothness information over k is then, in a sense, necessary to ensure (3). This is particularly relevant in the setting where the noise level is large.

Dynamic angular synchronization under smoothness constraints

2. If each measurement graph is disconnected, then estimating g (k) s individually at each k will be meaningless. However, one might still expect, at least intuitively, that a weakly consistent estimate of g can be obtained by leveraging smoothness. For instance, if ST = 0, then this would be possible provided the union graph ([n], k Ek) is connected, where we would simply take the average of all the pairwise information.

Main contributions. We summarize our main contributions as follows.

We propose three algorithms for this problem, based on diﬀerent relaxation-based approaches.

1. Global sphere relaxation (GTRS-Dyn Sync; Algorithm 1): Optimizes a smoothnesspenalized data ﬁdelity term with solutions restricted to a sphere in Cn T . This leads to a trust-region subproblem (TRS) which is eﬃciently solvable.

2. Local sphere relaxation with global smoothing (LTRS-GS-Dyn Sync; Algorithm 2): Solves sphere-relaxed synchronization problems at each time point k (via a TRS formulation), stacks the local solutions, and projects the stacked vector onto the low-frequency eigenspace of a smoothness operator (motivated by the Laplacian eigenmaps estimator (Sadhanala et al., 2016)).

3. Global matrix denoising and local synchronization (GMD-LTRS-Dyn Sync; Algorithm 3): Projects the measurement matrix onto the low-frequency eigenspace of a smoothness operator (matrix denoising), followed by locally solving (at each time k) the sphere-relaxation of the angular synchronization problem via a TRS formulation.

As the TRS estimator is hard to analyze theoretically, we analyze a slight modiﬁcation of Algorithm 3, namely Algorithm 4, where the TRS estimator for the local synchronization step is replaced by an alternative spectral approach (see also Remark 3). We provide theoretical guarantees for Algorithm 4, in the form of non-asymptotic high probability bounds for the global ℓ2 estimation error, under the dynamic versions of the AGN and Outliers model (see Corollaries 17 and 19 respectively). In particular, under mild smoothness assumptions, we prove that (3) holds for the output of Algorithm 4, and give precise non-asymptotic bounds from which the convergence rates can be deduced.

1. In the case of the Outliers model, our results apply even when the measurement graphs are very sparse each being potentially disconnected (see Remarks 15 and 20).

2. For the AGN model, our results show that Algorithm 4 succeeds even when the noise level σ is large w.r.t n, and is allowed to grow with T as O(T α) for α [0, 1/4) (see Remarks 10 and 18).

Finally, we support our theoretical ﬁndings with extensive experiments on synthetic data, for the aforementioned AGN and Outliers models. Notably, our experiments demonstrate that Algorithm 3 usually outperforms other competing methods in terms of accuracy.

Araya, Cucuringu and Tyagi

While we have focused on the SO(2) group for concreteness, the framework in the present paper can be extended to other compact groups such as Z2 and SO(d).

1.3 Related work

Ranking and angular synchronization. The work of (Cucuringu, 2016) considered the problem of ranking n items given a set of pairwise comparisons on their rank oﬀsets, over a single measurement graph. The Sync Rank algorithm, introduced in (Cucuringu, 2016), formulates the above problem as an instance of the angular synchronization problem, following a group compactiﬁcation procedure of the real line. Concretely, the approach in (Cucuringu, 2016) starts by mapping the noisy rank oﬀsets Cij { (n 1), . . . , n 1} to angles Θij [0, 2πδ) via the transformation Θij := 2πδ Cij

n 1, where δ [0, 1). The resulting problem is then treated as an instance of the angular synchronization problem, and solved via the tools therein. A post-processing step is applied to mod out the best circular permutation, and recover the ﬁnal rankings. While we do not integrate our algorithmic framework with Sync Rank in this paper, it would be interesting to do so as future work, in order to evaluate its performance on dynamic ranking problems. More recently, (Huang et al., 2017; d Aspremont et al., 2021) investigated algorithms for recovering the latent strengths r i , i = 1, . . . , n from noisy versions of r i r j. This is an instance of the group synchronization over the real line, also dubbed as translation synchronization. The work (Huang et al., 2017) proposed an iterative procedure which involves solving a truncated least-squares problem at each iteration. Conditions were derived under which the iterates were shown to converge to the ground truth. In (d Aspremont et al., 2021), an algorithm based on the singular value decomposition was introduced, along with a detailed theoretical analysis.

Ranking from dynamic pairwise information. A recent line of work initiated a systematic study of ranking from dynamic pairwise information, along with statistical guarantees (Bong et al., 2020; Karl e and Tyagi, 2023; Araya et al., 2023). Here, the latent strengths of the items are assumed to evolve smoothly over T time points, with the pairwise data generated independently over time. These works are the main motivation behind the present paper.

The setup in (Bong et al., 2020; Karl e and Tyagi, 2023) assumed a dynamic Bradley Terry-Luce (BTL) model where the strength vector satisﬁes a Lipschitz condition in the (continuous) time domain [0, 1]. Pointwise estimation error bounds were derived for a given t [0, 1] via a two stage procedure: the ﬁrst stage involved suitably smoothing the pairwise data while the second stage involved applying an existing ranking method from the static setup. In particular, (Karl e and Tyagi, 2023) proposed a nearest-neighbor based averaging procedure in conjunction with Rank Centrality (Negahban et al., 2017), while (Bong et al., 2020) proposed a kernel based smoothing procedure in conjunction with the MLE for the static BTL model. The results in both papers imply pointwise consistency w.r.t T, with (Karl e and Tyagi, 2023) showing this even when the individual graphs are very sparse (Erd os-R enyi graphs) and potentially disconnected.

Dynamic angular synchronization under smoothness constraints

The work (Araya et al., 2023) considered the dynamic translation synchronization problem which is related to the BTL model for ranking. Here, the latent strength vector r (k) Rn (k = 1, 2, . . . , T) satisﬁes a similar smoothness assumption as (2). Given a measurement graph Hk = ([n], Ek) at time k, we are given noisy information r i (k) r j(k)+ϵij(k) for each {i, j} Ek, where ϵij(k) is zero-mean subgaussian noise. Two estimators were proposed, the ﬁrst of which involves solving an unconstrained smoothness-penalized least squares problem. The second estimator involves ﬁrst estimating r (k) via least-squares minimization, at each k, and then projecting the vector of estimated solutions on to the low-frequency space of a smoothness operator. Highprobability bounds were obtained for the MSE, establishing their consistency when T is large and the rest of parameters are ﬁxed. In particular, these results imply that the proposed estimators can tolerate larger levels of noise compared to the static case. Our Algorithms 1 and 2 are motivated from these two methods, the key diﬀerence being that we now need to incorporate the (relaxations of) the constraints, imposed by the group SO(2). This in particular leads to solving a TRS-based sphere-relaxation which is challenging to analyze theoretically. We also remark that the analysis in (Araya et al., 2023) requires each measurement graph to be connected due to technical reasons. Our analysis for Algorithm 4, however, does not suﬀer from these issues, and applies even when the graphs are very sparse and disconnected. We believe that a similar procedure can be applied for the dynamic translation synchronization problem, leading to similar guarantees as in the present paper (Theorem 13 and Corollary 19).

The work (Li and Wakin, 2021) considered a generalization of the translation synchronization model where the time-dependent model parameters evolve in a Markovian manner over time. Theoretical results were obtained for estimating a latent pairwise comparison matrix at time T. In general, however, existing results for the dynamic ranking problem are essentially empirical works (Fahrmeir and Tutz, 1994; Glickman and Stern, 1998; Lopez et al., 2018; Maystre et al., 2019; Cattelan et al., 2013; Jabin and Junca, 2015) with limited theoretical guarantees.

1.4 Paper structure

The remainder of this paper is structured as follows. In Section 2 we present the setup for dynamic synchronization. In Section 2.2 we introduce the dynamic versions of the AGN and Outliers model, togheter with our global smoothness assumptions. Our proposed estimators and the algorithms to compute them are described in Section 2.3. Section 3, which consists of the theoretical analysis of Algorithm 4, is divided into three parts. In Sections 3.1 and 3.2 we analyse the matrix denoising stage for the AGN and Outliers model respectively. In Section 3.3 we provide an analysis for the local synchronization stage of Algorithm 4, which is valid for both noise models. Section 4 contains numerical experiments on synthetic data and Section 5 contains concluding remarks.

Araya, Cucuringu and Tyagi

2. Problem setup and algorithms

In Section 2.1 we describe general notation used throughout the paper. Section 2.2 contains a formal description of the problem, while Section 2.3 formally outlines our proposed algorithms. In particular, Section 2.3 details additional notation which is needed for describing the algorithms and is also used in the analyses later in Section 3.

2.1 Notation

For any n N, deﬁne [n] = {1, , n}. For A Cn p, AH denotes its Hermitian conjugate transpose. We deﬁne the operators Re and Im, from Cn p to Rn p, which for any matrix give their real and imaginary parts, respectively. Denote e1, . . . , en Rn to be the canonical vectors. We represent the vector of all ones as 1, with the dimension being clear from the context. Let z 2 = p

|z1|2 + + |zn|2 denote the Euclidean norm for any z Cn. If Z, Z Cn m, their Frobenius inner product is deﬁned by Z, Z F = Tr(ZHZ ), where Tr( ) denotes the trace of a matrix. The Frobenius norm 2 F is the norm induced by this inner product. We denote the Kronecker product of Z and Z by Z Z , and their Hadamard product by Z Z . For a matrix Z Cn m, Z op is the operator norm of Z, which is deﬁned as its largest singular value. Deﬁne, for n N, the set Cn := {z Cn : |zi| = 1, for all i [n]}. Recall that the special orthogonal group SO(2) can be represented by the group of complex numbers z, such that |z| = 1, with the usual complex multiplication. The concatenation operator is deﬁned

for matrices X in Cn m and Y Cn m, denoted as concat(X, Y ) = X Y

This deﬁnition extends to any sequence (Xk)K k=1, with K N, and Xk Cnk m, through the recurrence concat(X1, . . . , Xk) = concat(concat(X1, . . . , Xk 1), Xk). Similarly, for any natural number m, blkdiag(X1, . . . , Xm) denotes the block-diagonal matrix with square matrices X1, . . . , Xm on the main diagonal. Recall that the subgaussian norm of a random variable X is given by

X ψ2 := sup p 1 p 1/2(E|X|p)1/p,

see e.g., (Vershynin, 2012) for other equivalent deﬁnitions. Note that if |X| M a.s, then X ψ2 M. When X follows a normal distribution with mean µ and variance σ2, we write X N(µ, σ2). We say X CN(0, 1) if X is complex-valued and

Re(X), Im(X) ind. N(0, 1/2).

We will use the classic asymptotic notation, deﬁned as follows. We say that f(t) = O(g(t)) if there exists positive real numbers t0, M, such that for all t t0 we have f(t) Mg(t). Similarly, we write f(t) = o(g(t)) is for every constant c > 0, there exists a constant t0, such that, f(t) cg(t), for all t t0. Additional, we say f(t) = Ω(g(t)), if f(t) = O(g(t)), and f(t) = ω(g(t)), if g(t) = o(f(t)). Symbols used for denoting constants (e.g., c, C, c1 etc.) may change from line to line.

Dynamic angular synchronization under smoothness constraints

2.2 Problem setup

Consider a set of items [n] and a sequence of time-indexed undirected comparison graphs Hk = ([n], Ek) for k = 1, . . . , T. The set of edges Ek determines which items are being compared at time k. For each i [n], we associate time-dependent latent phases g i (k) SO(2) which will constitute the ground truth signal that we aim to recover from noisy pairwise measurements. More formally, at time k [T], we observe for every pair {i, j} Ek a noisy version of g i (k)g j (k)H. Notice that even without the addition of noise, the recovery of the ground truth phases g i (k), from the observation of g i (k)g j (k)H, can only be accomplished up to a global phase-shift. That is, given angles θ1, , θT [0, 2π), the quantities (g i (k) eιθk)T k=1 deﬁne the same relative phases as (g i (k))T k=1. To avoid identiﬁability issues, we will make the following assumption on the phase of the ﬁrst element of each block.

Assumption 1 (Anchoring) For all k [T], we assume that g 1(k) = 1.

For each k [T], denote g (k) = concat(g 1(k), , g n(k)) Cn, and g to be formed by column-wise stacking as

g = concat(g (1), , g (T)) =

g (1) ... g (T)

We now describe two statistical models for generating the noisy pairwise measurements at each k.

Additive Gaussian noise (AGN) model. Denoting for each k [T], G (k) := g (k)g (k)H, we are given the measurements A(k) Cn n where

A(k) = (G (k) In) + σW(k) for k [T], (5)

where W(k) is a random complex Hermitian (noise) matrix with independent entries above the diagonal and Wii(k) = 0 (for all i). In particular, for each i = j, Wij(k) CN(0, 1). The parameter σ 0 denotes the noise level. Implicit in (5) is the assumption that the comparison graphs Hk are all complete. We remark that in (5), Aii(k) = 0 for each i, in contrast to the Spiked Gaussian Wigner model discussed in Section 1. Furthermore, we assume that W(1), . . . , W(T) are independent.

Outliers model. In this model, we assume that the observation graphs are generated independently from Erd os-R enyi distributions, i.e., Hk ind. ER(n, p(k)), where p(k) represents the connection probability at time k. For each k [T] and edge {i, j} Ek, we observe the exact relative phase g i (k)g j (k)H, with probability 1 η, and pure noise (chosen uniformly at random in SO(2)), with probability η. The matrix of observations A(k) can then be written as

g i (k)g j (k)H with probability (1 η) p(k), eιϕij(k) with probability ηp(k), 0 with probability (1 p(k)),

Araya, Cucuringu and Tyagi

where φij(k) ind. Unif[0, 2π). Since there are no observations when i = j, we choose to set (A(k))ii = 0, for all i [n]. Note that (Aij(k))i j are independent, and A(k) can be expressed as A(k) = (1 η) p(k)(G (k) In) | {z } =E[A(k)]

where R(k) := A(k) E[A(k)]. It is easy to see that (Rij(k))i j are independent centered sub-Gaussian random variables.

Smoothness assumption. As discussed in the introduction, we will assume that the latent phases g i (k) do not change too quickly with time as outlined below.

Assumption 2 (Global ℓ2 smoothness) For a given parameter ST > 0,

k=1 g (k) g (k + 1) 2 2 ST . (7)

Denoting M {0, 1, 1}T (T 1) to be the incidence matrix of any directed path graph2 on T vertices, this can be written equivalently as

(M In)g 2 2 = g H((MM ) In)g ST ,

where MM corresponds to the Laplacian matrix of the undirected path graph.

The above assumption states that g (k) does not change too quickly with k on average , and resembles the notion of quadratic variation of a function. It can also be viewed as stating that g lies close (depending on ST ) to the subspace spanned by the smallest few eigenvectors of (MM ) In. The closed-form expressions of the eigenvectors (and also eigenvalues) of MM are well-known (e.g., (Brouwer and Haemers, 2012)), and can be viewed as the Fourier basis for signals residing on that graph (with the smallest eigenvector corresponding to the lowest frequency signal, and so on). Since Assumption 1 is in place, we do not need to worry about the invariance of (7) with respect to constant shifts.

2.3 Smoothness constrained estimators

We now describe algorithms to estimate g , each of which incorporates Assumption 2 in diﬀerent ways. We ﬁrst recall that in the static setting (for some ﬁxed time k [T]), the SO(2)-synchronization problem can be formulated as solving

max g(k) Cn g(k)HA(k)g(k), (8)

which, under the AGN model, corresponds to ﬁnding the MLE. As a sanity check, one could solve (8) for each k, and stack the solutions to produce an estimate of g . However, this suﬀers from the drawback of treating each time block independently. In light of Assumption 2, there is a temporal smoothness in the observations, which can be exploited to improve the estimation. Before proceeding, it will be useful to introduce some notation.

2. The rows of M correspond to the T vertices, and the columns correspond to the T 1 directed edges. For the column corresponding to edge i j, we set its ith entry to +1, the jth entry to 1, and the rest of the entries in that column to 0.

Dynamic angular synchronization under smoothness constraints

We will use the diacritical mark to denote the removal of information relative to the ﬁrst item of each time block. Thus g (k) and g(k), are deﬁned as

g (k) = concat(1, g (k)) = 1 g (k)

, where g (k) Cn 1,

g(k) = concat(1, g(k)) = 1 g(k)

, where g(k) Cn 1.

Similarly, A(k) C(n 1) (n 1) and b(k) Cn 1 are deﬁned as

A(k) = A11(k) b H(k) b(k) e A(k)

Let us deﬁne the set

1 g H : g Cn 1 o .

Note that G (k) G for each k.

Whenever we do not specify the time index, it will mean that we stack all time blocks, as

A = concat A(1), A(T) =

A(1) A(2) ... A(T)

, G = concat G (1), , G (T) =

G (1) G (2) ... G (T)

G = concat G(1), , G(T) =

G(1) G(2) ... G(T)

For any z = (z1, , zm) Cm we deﬁne its projection onto Cm by

PCm(z)i := zi |zi| if zi = 0 1 otherwise. (10)

Finally, for any τ [T], deﬁne

Lτ,n := span {v T k ej}τ 1,n k=0,j=1 ,

where {vk}T k=1 are the eigenvectors of MM , the Laplacian of the path graph on T vertices, with corresponding eigenvalues λ1 λ2 > λT (= 0). Denote Pτ,n to be the projection matrix onto the space Lτ,n, i.e.,

k=0 v T kv T k

Araya, Cucuringu and Tyagi

The following simple observation shows that Assumption 2 can be equivalently stated (up to factors depending only on n) as a condition on the matrices G (k). Note that, for all k [T], g (k) g (k + 1) 2 2 = g (k) g (k + 1) 2 2,

G (k) G (k + 1) 2 F = 2 g (k) g (k + 1) 2 2 + g (k) g (k)H g (k + 1) g (k + 1)H 2 F ,

2 g (k) g (k + 1) 2 2 G (k) G (k + 1) 2 F (2 + 4n) g (k) g (k + 1) 2 2. (12)

This implies, in particular, that the matrices G (k) do not evolve too quickly, as given by

k=1 G (k) G (k + 1) 2 F (2 + 4n)ST . (13)

2.3.1 Global sphere relaxation

Denoising optimization problem. Since (13) holds, under Assumption 2, it is natural to consider the following denoising optimization problem, which promotes smoothness via penalization.

min G=concat(G(1), ,G(T)) s.t k:G(k) G

k=1 A(k) G(k) 2 F | {z } data ﬁdelity term

k=1 G(k) G(k + 1) 2 F | {z } smoothness term

where λ 0 is the parameter that controls the amount of regularization in (14). On the other hand, for G(k) G, we have

A(k) G(k) 2 F = A(k) 2 F + G(k) 2 F | {z } =n2 g(k)H A(k) g 2Re b H(k) g(k) , (15)

which will help us write (14) more compactly (with variables g(k)). Speciﬁcally, using (15) and (12), the following problem is equivalent to (14)

max g=concat( g(1), , g(T)) s.t k: g(k) Cn 1

g H Ablock g + 2Re(b H g) | {z } data ﬁdelity term

λ g H(MM In 1) g | {z } smoothness term

where Ablock := blkdiag( A(1), , A(T)).

Remark 1 (Static anchored synchronization) In the case T = 1, the smoothness penalty term will be absent from (14), and we recover the usual anchored synchronization MLE for the AGN model min G G A G 2 F max g Cn 1 g H A g + 2Re(b H g). (17)

Dynamic angular synchronization under smoothness constraints

Remark 2 (Complexity of (14)) As mentioned in (Bandeira et al., 2017, Sec.2), the problem (17) is an instance of complex quadratic programming, which is known to be NPHard (Zhang and Huan, 2006, Prop.3.5). Without the penalty term, problem (14) amounts to solve T independent instances of (17), clearly remaining NP-Hard. However, it is unclear whether the hardness holds for all λ > 0. In particular, if one allows the value λ = , the problem becomes trivial as any perfectly smooth G (that is G(k) = G(1) for all k [T]) will be a solution. Nevertheless, the problem remains highly non-convex even for large values of λ, which makes the existence of an eﬃcient algorithm for solving (14) (for an arbitrary A) unlikely.

In light of Remark 2, we consider the following relaxation of (16)

max g=concat( g(1), , g(T)) s.t g 2 2=(n 1)T

g H Ablock λ(MM In 1) g + 2Re(b H g). (18)

Problem (18) is an instance of the trust region subproblem (TRS) with equality constraints, which consists of minimizing a general quadratic function (non-necessarily convex), under a sphere constraint. Although it will not be used in the sequel, it is worth mentioning that a version of TRS with ℓ2 ball constraints exists (and it is, in some respect, more standard). There is a vast literature on the theory of TRS, including the characterization of its solutions (Hager, 2001), and several eﬃcient algorithms that solve it, in both the spherical and ball constraints (Adachi et al., 2017). The complete procedure is outlined in Algorithm 1.

Algorithm 1 Global TRS for Dynamic Synchronization (GTRS-Dyn Sync)

1: Input: Matrix A Cn T n of observations, smoothness parameter λ > 0.

2: Form the matrix Ablock = blkdiag A(1), , A(T) .

3: Obtain b g = concat(b g(1), ,b g(T)) C(n 1)T a solution of (18)

4: for k = 1 . . . , T do

5: Deﬁne bg(k) = 1 PCn 1b g(k)

7: Output: Estimator bg Cn T .

2.3.2 Local sphere relaxation

Instead of the global relaxation in (18), we now consider a diﬀerent methodology which entails, as a ﬁrst step, solving the following sphere relaxation of (17)

b g(k) := arg max g(k) Cn 1

g(k) 2 2=n 1

g(k)H A(k) g(k) + 2Re b(k)H g(k) . (19)

This leads to a local estimate b g(k) for each k [T]. Stacking them, we obtain

b g := concat b g(1), ,b g(T) ,

Araya, Cucuringu and Tyagi

which satisﬁes the sphere constraint for each time block. The smoothness constraint has not been used yet, so as a second step, we consider an additional projection step onto the space of low frequency eigenvectors of (MM ) In 1. To achieve this, our proposed algorithm takes a regularization parameter τ [T] as input, and forms the projection matrix Pτ,n 1, which projects onto Lτ,n 1. It is worth noting that when τ is small, this projection step has a smoothing eﬀect. For instance, for τ = 1, the resulting projected-vector has equal block components. After the projection step onto Lτ,n 1, we ﬁnally need to use the projection PC(n 1)T to ensure that the estimator lies in C(n 1)T . The complete procedure is outlined in Algorithm 2.

Algorithm 2 Local TRS + Global Smoothing for Dynamic Synchronization (LTRS-GS-Dyn Sync)

1: Input: Matrix A Cn T n of observations, smoothness parameter τ N.

2: for k = 1, . . . , T do

3: Obtain b g(k) solution of (19).

5: Form b g = concat b g(1), ,b g(T) C(n 1)T 1

6: Deﬁne h = Pτ,n 1b g where Pτ,n 1 C(n 1)T (n 1)T is the projection onto Lτ,n 1 as in (11).

7: for k = 1, . . . , T do

8: Deﬁne bg(k) = 1 PCn 1 h(k)

, where h(k) is such that h = concat( h(1), , h(T)).

10: Output: Estimator bg Cn T .

2.3.3 Denoising measurement matrices

An alternative procedure involves denoising the measurement matrices A (recall (9)) by projecting its columns onto the low-frequency eigenspace of (MM ) In, in the spirit of the Laplacian eigenmaps estimator (see e.g., (Sadhanala et al., 2016)). Our proposed method consists of the following two steps.

Matrix denoising stage. We project the columns of A onto the low frequency space Lτ,n, leading to the intermediate estimator b G = Pτ,n A.

Local synchronization stage. For each k [T] we ﬁrst Hermitianize the block b G(k) to obtain b G (k), and then solve a local synchronization problem, as in (17), but with input data b G (k). In Algorithm 3, we solve its TRS relaxation as in (19).

The matrix denoising stage can be regarded as a global denoising step, where the temporal smoothness captured by (MM ) In is used to produce an estimation of G Cn T n T . Notice that after this step, the obtained b G does not necessarily lie in G. This constraint is enforced in the local anchored synchronization step, where we ﬁrst solve a relaxation such as (19) and then project the solution onto Cn T . The complete procedure is outlined in Algorithm 3.

Dynamic angular synchronization under smoothness constraints

Remark 3 (Spectral local synchronization) It is possible to replace the TRS-based approach with a spectral method to solve the local synchronization problem in Algorithm 3. More speciﬁcally, lines 5 and 6 in Algorithm 3 are replaced by the following lines

5: Find bg (k) the eigenvector associated to the largest eigenvalue of b G (k).

6: Deﬁne bg(k) = PCn(bg (k))e ιˆθ1(k), where ˆθ1(k) is deﬁned by bg 1(k) = |bg 1(k)|eιˆθ1(k) (with ˆθ1(k) = 0 if bg 1(k) = 0).

While the spectral method may sacriﬁce some accuracy compared to the TRS-based approach, it oﬀers simplicity in analysis. Our theoretical analysis in Section 3 is based on the spectral approach (outlined separately as Algorithm 4 for clarity), whereas in our experiments, we opt for the TRS-based method.

Remark 4 (Denoising smooth signals on a graph) As alluded to earlier, the matrix denoising step in Algorithm 3 is motivated by the Laplacian eigenmaps estimator (see e.g., (Sadhanala et al., 2016)) for denoising smooth signals on a graph. To make the connection explicit using our notation, we recall that in this latter problem, we are given an undirected graph H = ([T], E), along with noisy measurements y = x + η of a signal x RT with η denoting the zero-mean noise. Denoting L to be the Laplacian matrix of H, here x is assumed to be smooth w.r.t H in the sense that

(x ) Lx = X

{i,j} E (x i x j)2 ST . (20)

The Laplacian eigenmaps estimator then estimates x by projecting y on to the space spanned by the τ smallest eigenvectors of L.

In our setting, the aforementioned graph H is a path graph3 and each vertex k [T] has a latent signal g (k) Cn. We observe at each k a noisy matrix A(k) which is also in the form of a signal (scalar times G (k) In) plus zero-mean noise term; recall (5) and (6). Moreover, g (k) s satisfy a smoothness condition (recall (7)) which, as discussed earlier, translates to a smoothness condition on G (k) s (recall (13)) that is analogous to (20).

As we will see in Section 3, the outline of the analysis is at a top-level similar to that in (Sadhanala et al., 2016) in the sense of bounding the bias and variance error terms, and ﬁnding the optimal τ. The diﬀerence lies in the technical details: we observe matrixvalued information (with complex-valued entries) at each node of H which leads to more cumbersome calculations especially for the Outliers model in deriving high probability error bounds.

Remark 5 (Comparison with (Araya et al., 2023)) As discussed in Section 1.3, the work of (Araya et al., 2023) considered an estimator for the dynamic translation synchronization problem, which also relied on the idea of projecting on to the low-frequency subspace. In that work, the estimator ﬁrst solved the least-squares problem locally at each time-point, and then projected the stacked vector of individual-solutions onto the low-frequency space. This required each individual graph to be connected for the analysis to work. In Algorithm

3. since we consider its vertices to denote time instants, although the analysis can be extended to general connected graphs as well.

Araya, Cucuringu and Tyagi

3, however, we ﬁrst project the (stacked) input data matrices onto the low-frequency space, and then solve the local synchronization problem. This allows the individual graphs to be disconnected in the Outliers model something we will prove in Section 3 for a modiﬁed version of Algorithm 3, namely Algorithm 4.

Algorithm 3 Global Matrix Denoising + Local TRS for Dynamic Synchronization (GMD-LTRS-Dyn Sync)

1: Input: Matrix A Cn T n of observations, parameter τ N.

2: Deﬁne b G = Pτ,n A, where Pτ,n Cn T n T is the projection onto Lτ,n.

3: for k = 1 . . . , T do

4: Hermitianize b G (k) :=( b G(k) + b G(k) H)/2.

5: Find b g(k) solution of (19) with data b G (k).

6: Form bg(k) = 1 PCn 1b g(k)

8: Output: Estimator bg = concat bg(1), , bg(T) Cn T .

3. Analysis

In this section, we present theoretical guarantees for recovering g (1), . . . , g (T) under the AGN and Outliers models. Sections 3.1 and 3.2 contain results speciﬁcally addressing the matrix denoising stage of Algorithm 3 (as elaborated in Section 2.3.3). The main results therein are Theorem 8 for the AGN model, and Theorem 13 for the Outliers model. Then, Section 3.3 uses these estimates to provide guarantees for recovering g (1), . . . , g (T) it provides an analysis of the local synchronization stage by employing the spectral approach outlined in Remark 3. The main results therein are Corollary 17 for the AGN model, and Corollary 19 for the Outliers model.

3.1 Matrix denoising under AGN model

Given (5), we can write (c.f. the notation introduced in Section 2.3)

A = (G 1T In) | {z }

where A, G are as in (9), and W = concat(W(1), . . . , W(T)) stacks the (Hermitian) noise matrices. We will focus on bounding the distance between the intermediate estimator b G = Pτ,n A and the (shifted) ground truth G . To this end, note that

b G = Pτ,n A = Pτ,n G + σPτ,n W,

from which we deduce

b G G 2 F = P τ,n G 2 F + σ2 Pτ,n W 2 F , (21)

Dynamic angular synchronization under smoothness constraints

where P τ,n := In T Pτ,n. The ﬁrst term on the right-hand side (RHS) of (21) represents the bias which depends on the smoothness level. Clearly G has the same quadratic variation as G and hence satisﬁes (13) as well. Thus, we expect this term to decrease as τ increases. The second term therein corresponds to the variance introduced by noise, which we expect to increase with τ. Hence, selecting the optimal value for τ will achieve the right bias-variance trade-oﬀ. The following lemma gives a control over the bias term.

Lemma 6 (Bias bound) Let G Cn T n be a matrix satisfying (13), for some ST 0. Then, it holds for all 1 τ T

P τ,n G 2 F 20n

τ 2 1τ<T . (22)

Proof When τ = T, we have P τ,n = 0, from which the ﬁnal bound follows. So assume now that 1 τ T 1. Denote the j-th column of G by G :,j. Since

P τ,n G 2 F =

j=1 P τ,n G :,j 2 2,

we have for each j [n] that

P τ,n G :,j 2 F =

k=τ v T k ei, G :,j 2.

On the other hand, it is easy to see that (13) can be rewritten as

k=0 λT k v T k ei, G :,j 2 (2 + 4n)ST ,

from which we deduce (since n 2)

k=τ v T k ei, G :,j 2 5n ST .

This implies (notice that λT τ = 0 since 1 τ T 1)

j=1 P τ,n G :,j 2 2 5n ST

By the known expression for the eigenvalues of the Laplacian of the path graph (Brouwer and Haemers, 2012), we have λT τ = 4 sin2 ( τπ

2T ), and using that sin(x) x

2, for x [0, π/2], we obtain n X

j=1 P τ,n G :,j 2 2 20n

We now control the variance term.

Araya, Cucuringu and Tyagi

Lemma 7 (Variance bound) Let W(1), . . . , W(T) be independent, random Hermitian matrices as in the AGN model in Section 2.2. Then there exists a constant C > 0, such that for all τ [T] and δ (0, 1), it holds with probability larger than 1 δ that

Pτ,n W 2 F C n2τ + log 4

Proof To simplify the analysis, we split W as follows

2(W1 + W2),

where W1, W2 Cn T n satisfy W1(k) = W2(k)H, for all k [T], and the entries of W1(k) are i.i.d random variables from CN(0, 1), for all k [T]. Furthermore, we consider the decomposition Ws = Re(Ws) | {z } =:Ws,R

+ι Im(Ws) | {z } =:Ws,I

for s = 1, 2. It will be enough to bound Pτ,n W1 F , since the same bound will hold for Pτ,n W2 F , analogously. Denoting (W1,R):,j (resp. (W1,I):,i) to be the j-th column of W1,R(resp. W1,I), we obtain

Pτ,n W1 2 F = Pτ,n W1,R 2 F + Pτ,n W1,I 2 F

j=1 Pτ,n(W1,R):,j 2 2 +

j=1 Pτ,n(W1,I):,j 2 2

= W 1 H(I2n Pτ,n)W 1 | {z } =:V1

where W 1 R2n2T 1 is deﬁned as

(W1,R):,1 ... (W1,R):,n (W1,I):,1 ... (W1,I):,n

Observe that W 1 has independent entries, with each entry either 0 or drawn from N(0, 1/2). To bound V1, we will use the Hanson-Wright inequality (Rudelson and Vershynin, 2013, Thm. 1.1). It is easy to see that

I2n Pτ,n op = 1, I2n Pτ,n 2 F = 2n2τ.

Thus, from Hanson-Wright inequality, we obtain, for all t 0

P(|V1 E[V1]| > t) 2 exp c1

Dynamic angular synchronization under smoothness constraints

On the other hand, notice that E[V1] c2n2τ. We deduce that, for any δ > 0, with probability larger than 1 δ,

2C n2τ + log 2

for a constant C > 0. To pass from the ﬁrst to the second inequality, we used the fact a2 + ab + b2 2(a2 + b2) for any a, b R. Arguing in an analogous way, we obtain that

Pτ,n W2 2 F 2C n2τ + log 2

with probability greater than 1 δ. Since Pτ,n W 2 F Pτ,n W1 2 F + Pτ,n W2 2 F , the result follows via the union bound.

Optimal choice of τ. Combining (21) with Lemmas 6 and 7 we obtain the following bound (with probability larger than 1 δ)

b G G 2 F 20n

τ 2 1τ<T + Cσ2 n2τ + log 4

It remains to choose the optimal value of τ [T] which minimizes the RHS of (23), up to constant factors. To this end, using 1τ<T 1, note that the global minimizer of the ensuing bound is given by (T 2ST

nσ2 )1/3. But this quantity does not necessarily lie in [T]. Hence we deﬁne

τ := min 1 + j T 2ST

1/3k , T [T]. (24)

Clearly τ is not the global minimizer of the bound in (23) but it will lead to an error bound which achieves the right scaling in terms of T and ST . This is shown in the following theorem which is the ﬁrst main result of this section. Its proof is deferred to Appendix A.

Theorem 8 (Main result: G recovery under AGN model) Let A be the stacked measurement matrix for the AGN model in (5), with a ground truth g satisfying Assumption 2. Then choosing τ = τ with τ as in (24), the following holds for the estimator b G = Pτ,n A. There exists a constant C > 0 such that for any δ (0, 1), it holds with probability at least 1 δ that

b G G 2 F C max

n T 2ST 1 + ( T 2ST

nσ2 2/3 , n ST

1{τ <T} + σ2n2 min 1 + T 2ST

Furthermore, there exist constants C1, C2 > 0 such that if

T C1 max nσ2

, σ4ST n5 (26)

Araya, Cucuringu and Tyagi

then (25) implies

b G G 2 F C2 σ4/3T 2/3S1/3 T n5/3 + σ2 log 4

The following remarks are in order for interpreting Theorem 8.

Remark 9 When σ = 0 then we obtain τ = T and b G G F = 0. This is also seen directly from (23). If ST = 0, we obtain τ = 1 and b G G 2 F σ2n2 + σ2 log(4/δ). This can also be obtained directly from (23). Note that 1

T b G G F = O(1/

T) which is the parametric rate.

Remark 10 The ﬁrst part of Theorem 8 holds for all σ, ST 0. The interesting situation, however, is when σ, ST > 0 especially when both these quantities are suﬃciently bounded away from zero. In this case, the second part of Theorem 8 is meaningful. For instance, suppose σ T α for some α [0, 1/4). Then provided

ST n T 2(1 α) and ST = o(T 1 4α) (28)

it is easy to see that (26) will be satisﬁed for large enough T. Moreover, (27) then implies

1 T b G G 2 F T 0.

Finally, we remark that the O(T 2/3S1/3 T ) term in (27) matches that obtained in (Araya et al., 2023, Theorem B) and also matches the rate for denoising smooth signals on a path graph (Sadhanala et al., 2016, Theorem 6).

3.2 Matrix denoising under the Outliers model

We now focus on the outliers model, deﬁned in (6), which can be written in stacked matrix notation as A = D(G In) + R = DG + R, (29)

where D = 1 η blkdiag (p(1)In, . . . , p(T)In) and R = concat R(1), . . . , R(T) . Note that, for each k [T], R(k) is a Hermitian matrix with (Rij(k))i j being independent centered sub-Gaussian random variables, and in particular, Rii(k) = 0 for each i [n]. Our goal is to bound the error b G DG 2 F . Since b G = Pτ,n A = Pτ,n(DG)+ Pτ,n R, we obtain the bound

b G DG 2 F 2 P τ,n(DG ) 2 F | {z } Bias

+2 Pτ,n(R) 2 F | {z } Variance

where we recall P τ,n = In T Pτ,n. The error bound decomposition in (30) is similar to the bias-variance decomposition in (21). In order to control the bias term P τ,n(DG ) 2 F , we need to place an assumption on the level of ﬂuctuation of p(1), . . . , p(T). Indeed, consider the ideal scenario where p(k) = p for all k [T]. Then, P τ,n(DG ) 2 F = (1 η)2p2 P τ,n G 2 F which can then be bounded using Lemma 6. This motivates the following assumption on the smoothness of the sequence (p(k))T k=1, or equivalently, on the diagonal entries of matrix D.

Dynamic angular synchronization under smoothness constraints

Assumption 3 The matrix D is µ-smooth, i.e., for some µ 1 it holds for all τ [T] that Pτ,n(DG ) 2 F µ d2 Pτ,n G 2 F ,

where d := PT k=1 d(k)

T , and d(k) := (1 η)p(k).

If p(k) = p for all k, then note that Assumption 3 holds with µ = 1. Now, if D satisﬁes Assumption 3, we can readily bound the bias term using (22) as

Pτ,n(DG ) 2 F C1 µ d2n T 2ST

τ 2 1τ<T (31)

for some constant C1 > 0. For the variance term we have the following analog to Lemma 7. The proof mirrors that of Lemma 7, with a technical diﬀerence arising due to the diﬀerent distribution of the entries of R. Its proof is deferred to Appendix B.

Lemma 11 (Variance bound) Let R(1), . . . , R(T) be independent random Hermitian matrices arising from the Outliers model in Section 2.2. Denote Q(p) to be the ψ2 norm of a centered Bernoulli random variable with parameter p, and deﬁne

pmax := max k [T] p(k), Q := η + Q(η) + max k [T] Q(p(k))

V := max k [T] {p(k)(1 p(k))} , f(η, pmax, V ) := pmaxη + (1 η)2V + pmax V (1 η).

Then, there exist constants C1 (0, 1) and C2 > 0 such that for any δ (0, C1), it holds with probability at least 1 δ that

Pτ,n R 2 F C2

n2τf(η, pmax, V ) + Q2n τ log 1

Using (31) and Lemma 11, we arrive at the following theorem which is the second main result of this section. The proof uses similar calculations as for that of Theorem 8 and is deferred to Appendix C.

Remark 12 The exact expression for Q(p) is known to be (Ostrovsky and Sirota, 2014)

1 2p 4 log(1 p

Moreover, Q( ) is non-negative continuous on [0, 1], with Q(0 + 0) = Q(1 0) = 0, and Q2(1/2) = 1/8 (see (Ostrovsky and Sirota, 2014)). It is easy to also verify that Q(p) 1

2 log(1/p) if p 1/4.

Theorem 13 (Main result: G recovery under Outliers model) Let A be the stacked measurement matrix under the outliers model (6), with a ground truth g satisfying Assumption 2, and the matrix D (c.f. (29)) satisﬁes Assumption 3 with parameter µ 1. Then there exist constants C1 (0, 1) and C2, C3, C4, C5 > 0 such that for any δ (0, C1), the following is true (recall the notation of Lemma 11).

Araya, Cucuringu and Tyagi

1. Denote f(η, pmax, V, Q, δ) := f(η, pmax, V ) + Q2 log(1/δ), and choose

τ = τ := min 1 + j µ d2T 2ST n f(η, pmax, V, Q, δ)

1/3k , T . (32)

Then with probability at least 1 δ,

b G DG 2 F C2 max

µ d2n T 2ST

1 + µ d2T 2ST n f(η,pmax,V,Q,δ)

2/3 , µ d2n ST

+ C3 f(η, pmax, V, Q, δ)n2 min 1 + µ d2T 2ST n f(η, pmax, V, Q, δ)

2. Furthermore, if T satisﬁes

( n f(η, pmax, V, Q, δ)

1/2 , µ d2ST n f(η, pmax, V, Q, δ)

then (33) implies

b G DG 2 F C5 f2/3(η, pmax, V, Q, δ)(µ d2)1/3n5/3T 2/3S1/3 T . (35)

Remark 14 If ST = 0, then τ = 1 and (33) simpliﬁes to b G DG 2 F f(η, pmax, V, Q, δ)n2. Hence 1

T b G DG F = O(1/

T) which is the parametric rate. On the other hand, if

η = 0 and p(k) = 1 for all k, then note that Q = V = f(η, pmax, V, Q, δ) = f(η, pmax, V ) = 0. This implies τ = T and b G DG F = 0.

Remark 15 Suppose η is constant and let pmax 1/n. This means that a given measurement graph Hk is disconnected with high probability. Moreover, suppose d 1/n, and µ 1 (e.g., this will happen if p(k) = p 1/n for all k). Then we have Q 1, V 1/n which implies f(η, pmax, V ) 1/n and f(η, pmax, V, Q, δ) log(1/δ) (assuming δ is a constant for simplicity). Consequently, the condition in (34) simpliﬁes to

T max n3 log(1/δ)

1/2 , ST n3 log(1/δ)

and (35) translates to b G DG 2 F n T 2/3S1/3 T log2/3(1/δ). Thus for the smoothness regime ST = ω(1/T 2) and ST = o(T), we see that (36) will be satisﬁed for T suitably large w.r.t n.

3.3 Spectral local synchronization guarantee

We now proceed to establish guarantees for the local synchronization phase within Algorithm 3. While Algorithm 3 employs a TRS-based methodology at this stage, it is easier to analyze the spectral method, as delineated in Remark 3. This is outlined as a separate

Dynamic angular synchronization under smoothness constraints

Algorithm 4 Global Matrix Denoising + Local Spectral method for Dynamic Synchronization (GMD-LSPEC-Dyn Sync)

1: Input: Matrix A Cn T n of observations, parameter τ N.

2: Deﬁne b G = Pτ,n A, where Pτ,n Cn T n T is the projection onto Lτ,n.

3: for k = 1 . . . , T do

4: Hermitianize b G (k) := ( b G(k) + b G(k) H)/2.

5: Find bg (k) the eigenvector associated to the largest eigenvalue of b G (k).

6: Find bg(k) = PCn(bg (k))e ιˆθ1(k), where ˆθ1(k) is deﬁned by bg 1(k) = |bg 1(k)|eιˆθ1(k) (with ˆθ1(k) = 0 if bg 1(k) = 0).

8: Output: Estimator bg = concat bg(1), , bg(T) Cn T .

method in Algorithm 4 for clarity. Our main results for recovering g are Corollaries 17 and 19 for the AGN and Outliers models respectively.

Our strategy involves utilizing the Davis-Kahan inequality locally , focusing on speciﬁc time instances where the error converges to zero. To begin with, denote for each k a scaling s(k) 0 where s(k) = 1 for the AGN model, and s(k) = d(k) = (1 η)p(k) for the outliers model (recall Assumption 3). Then note that since G (k) is symmetric, we have for the Hermitian matrix b G (k) deﬁned in line 4 of Algorithm 4 that

b G (k) s(k)G (k) F b G(k) s(k)G (k) F . (37)

The next step is to use the Davis-Kahan theorem (Davis and Kahan, 1970) to relate the largest eigenvector of b G (k) to g (k)/ n, for each k. Since s(k)G (k) = s(k)(g (k)g (k)H In), the spectral gap between the largest and second largest eigenvalues of s(k)G (k) is s(k)n. Recall bg (k) as deﬁned in line 5 of Algorithm 4. Then, by the Davis-Kahan theorem (Davis and Kahan, 1970), we have for each k that

I g (k)g (k)H

bg (k) 2 4 b G (k) s(k)G (k) op s(k)n . (38)

A standard calculation reveals (see e.g., (d Aspremont et al., 2021, Appendix D.1)) that there exists φ(k) [0, 2π) (depending on g (k), bg (k)) such that

bg (k) eιφ(k) g (k) n

2 2 I g (k)(g (k))H

which together with (38) implies

bg (k) eιφ(k) g (k) n

2 8 b G (k) s(k)G (k) op s(k)n 8 b G (k) s(k)G (k) F s(k)n . (39)

Using (39), (37), we arrive at the following result.

Araya, Cucuringu and Tyagi

Proposition 16 Let g be a ground truth vector satisfying Assumption 1 as deﬁned in (4), and let bg = concat bg(1), , bg(T) Cn T be obtained from Algorithm 4. Then there exists a constant C > 0 such that

bg(k) g (k) 2 C b G(k) s(k)G (k) F s(k) .

Denoting smin := mink s(k), it then follows that

k=1 bg(k) g (k) 2 2 C2 1 s2 min

k=1 b G(k) s(k)G (k) 2 F .

The proof is deferred to Appendix D. Combining Proposition 16 with Theorems 8 and 13 directly leads to error guarantees for Algorithm 3, with spectral local synchronization, for the AGN and Outliers models. Recall that for the AGN model, smin = 1, while in the outliers model, smin = (1 η) mink p(k).

Corollary 17 (Main result: g recovery under AGN model) Let A be the stacked measurement matrix for the AGN model in (5), with a ground truth g satisfying Assumption 2. Choosing τ = τ with τ as in (24), consider the estimator bg = concat (bg(1), . . . , bg(T)) obtained in Algorithm 4. There exist constants C1, C2, C3, C4 > 0 such that for any δ (0, 1) the following is true.

bg g 2 2 C max

n T 2ST 1 + ( T 2ST

nσ2 2/3 , n ST

1{τ <T} + σ2n2 min 1 + T 2ST

Furthermore, there exist constants C1, C2 > 0 such that if

T C1 max nσ2

then (40) implies

bg g 2 2 C2 σ4/3T 2/3S1/3 T n5/3 + σ2 log 4

Remark 18 The following points are useful to note regarding Corollary 17.

1. Following Remark 9, when σ = 0 then we obtain τ = T and bg g 2 = 0. If ST = 0, we obtain τ = 1 and

bg g 2 2 σ2n2 + σ2 log(4/δ) .

2. Following Remark 10, suppose σ T α for some α [0, 1/4). Then provided ST satisﬁes (28), part 2 of Corollary 17 implies that

1 T bg g 2 2 T 0.

Dynamic angular synchronization under smoothness constraints

In particular, when σ > 0 is a constant, then bg g 2 2 = O(T 2/3S1/3 T ) which matches the (optimal) rate for denoising smooth signals on a path graph on T vertices (Sadhanala et al., 2016, Theorems 5,6); recall the discussion in Remark 4.

Corollary 19 (Main result: g recovery under Outliers model) Let A be the stacked measurement matrix under the outliers model (6), with a ground truth g satisfying Assumption 2, and the matrix D (c.f. (29)) satisﬁes Assumption 3 with parameter µ 1. For τ = τ as in (32), consider the estimator bg = concat (bg(1), . . . , bg(T)) obtained in Algorithm 4 and recall the notation of Theorem 13. There exist constants C1 (0, 1) and C2, C3, C4 > 0 such that for any δ (0, C1), the following is true.

1. Denoting pmin := mink p(k) it holds with probability at least 1 δ,

bg g 2 2 C2 (1 η)2p2 min

µ d2n T 2ST

1 + µ d2T 2ST n f(η,pmax,V,Q,δ)

2/3 , µ d2n ST

+ f(η, pmax, V, Q, δ)n2 min 1 + µ d2T 2ST n f(η, pmax, V, Q, δ)

2. Furthermore, if T satisﬁes (34) (with a constant C3 > 0), then (41) implies that

bg g 2 2 C4

f2/3(η, pmax, V, Q, δ)

(1 η)2p2 min

(µ d2)1/3n5/3T 2/3S1/3 T .

Remark 20 Let us note the following points regarding Corollary 19.

1. Following Remark 14, if ST = 0, then τ = 1 and we obtain

bg g 2 2 f(η, pmax, V, Q, δ)

(1 η)2p2 min n2.

On the other hand, if η = 0 and p(k) = 1 for all k, then τ = T and bg g 2 = 0.

2. Consider the setup of Remark 15 so that pmin 1/n. Then if T satisﬁes (36) (which is the case for T large enough when ST = ω(1/T 2) and ST = o(T) holds), we obtain the bound bg g 2 2 n3T 2/3S1/3 T log2/3(1/δ).

The bound exhibits the same scaling4 w.r.t T, ST as in the AGN model; recall Remark 18.

4. Experiments

We now provide some empirical results5 on synthetic data generated by the AGN and Outliers models. Apart from Algorithms GTRS-Dyn Sync, LTRS-GS-Dyn Sync and GMD-LTRS-Dyn Sync, we also evaluate the block-wise spectral approach described in Remark 3.

4. Note that the rates in (Sadhanala et al., 2016, Theorems 5,6) are obtained for additive Gaussian noise. We believe the dependency on T should be unchanged for more general sub-Gaussian noise models, however this needs to be established formally. 5. The code is available at https://github.com/Ernesto Araya V/dynamic_synch.

Araya, Cucuringu and Tyagi

Experiment setup. The following remarks are in order.

1. For speciﬁed values of n, T and smoothness ST , we generate a smooth signal g Cn T , with g 1(k) = 1 for each block k, as follows. Deﬁne τ := min 1 + TST , T , deﬁne Pτ ,n 1 as in (11). Generate a Gaussian u N(0, I(n 1)T ) and compute

T Pτ ,n 1u Pτ ,n 1u 2 .

Then denoting g := exp(ι(0.5π)u ) which is computed entry-wise, we form g by prepending each block of g with 1. It is a simple exercise to verify that g (almost surely) satisﬁes the smoothness condition in Assumption 2 for the speciﬁed ST , up to a constant.

2. For an estimate bg Cn T of the ground truth truth g Cn T , we will measure the Root Mean Square Error (RMSE) deﬁned as 1

n T bg g 2. We remark that an additional n factor appears here as compared to the quantity being bounded in the analyses earlier.

3. In order to choose the appropriate value of the regularization parameters for the algorithms (λ for GTRS-Dyn Syncand τ for LTRS-GS-Dyn Sync, GMD-LTRS-Dyn Sync), we consider a data-driven grid-search based heuristic. In particular, deﬁne a discretization of [0, T], where (i) the ﬁrst

T +1 elements of the grid are the numbers 0, 1, . . . ,

T ; (ii) the next ﬁve elements are equi-spaced numbers in the interval [

T , T 2/3 ]; (iii) the last ﬁve elements are equi-spaced numbers in the interval [ T 2/3 , T]. For each value on the grid, call it β, we set τ = min {β + 1, T} and λ = β λscale for a ﬁxed choice of the scaling parameter λscale > 0. For the corresponding output bg of each algorithm, we compute a data-ﬁdelity value using the input pairwise data, as for the data-ﬁdelity term in (16) (with g therein replaced by bg). Then we select the optimal β (thus optimal τ and λ) for each algorithm as follows.

(a) For GTRS-Dyn Sync, the optimal β is the value that leads to the largest dataﬁdelity.

(b) For LTRS-GS-Dyn Syncand GMD-LTRS-Dyn Sync, the optimal β is chosen to be the value that leads to the maximum change in the slope of the data-ﬁdelity (w.r.t β).

The above data-driven rule for selecting the optimal β is motivated by empirical evidence as shown in Figs. 1 and 2. Notice that the data-ﬁdelity for LTRS-GS-Dyn Sync and GMD-LTRS-Dyn Sync exhibits a sharp change in the slope at β = 1 (which is where the RMSE is minimized for these methods). But it is not clear why the data-ﬁdelity keeps increasing beyond β = 1, this needs further investigation. For GTRS-Dyn Sync, notice that the data ﬁdelity peaks at the RMSE-minimizer in the outliers model (see Fig. 1), while for the Wigner model, it peaks at a slightly sub-optimal β (see Fig. 2).

Dynamic angular synchronization under smoothness constraints

Figure 1: RMSE (top) and Data Fidelity (bottom) for the Outliers model for a single MC run (n = 30, T = 20, ST = 1/T, η = p = 0.2). Notice a sharp change in slope of the data-ﬁdelity for LTRS-GS-Dyn Sync and GMD-LTRS-Dyn Sync at β = 1, which is also where the RMSE is minimized for these two methods. For GTRS-Dyn Sync (with λscale = 10) the data-ﬁdelity peaks at β = 1 which is where the RMSE is minimized.

Experiment 1: RMSE versus T. In this experiment, we evaluate the RMSE versus T for all the above methods. Throughout, we ﬁx n = 30, and vary T from 10 to 100. We perform 20 Monte Carlo (MC) runs for each T. In a single MC run, we randomly generate the ground-truth for a speciﬁed smoothness ST (as described earlier), then generate the noisy pairwise data (as per speciﬁed noise model), and then compute the RMSE for each algorithm for the regularization parameters selected by the aforementioned data-ﬁdelity rule. The RMSE values are then averaged over the 20 runs. The results are shown in Figs. 3 for the Wigner model, and Fig. 4 for the Outliers model, for diﬀerent smoothness conditions ST . As a sanity check, note that the RMSE of the spectral method remains high for all T. We can make some additional observations.

1. For the AGN model (Fig. 3), we see that GMD-LTRS-Dyn Sync performs the best out of the three methods, while the RMSE for GTRS-Dyn Sync does not shrink to zero with T. The performance of LTRS-GS-Dyn Sync is seen to improve with T, albeit with a larger error than GMD-LTRS-Dyn Sync.

2. For the Outliers model (Fig. 4), the RMSE for all three methods decreases with T. Interestingly, GTRS-Dyn Sync typically performs the best out of the three methods (followed by GMD-LTRS-Dyn Sync and then LTRS-GS-Dyn Sync).

Experiment 2: RMSE versus noise-level. In this experiment, we evaluate the RMSE versus the noise-level for all the above methods. The noise-level is denoted by γ for ease of

Araya, Cucuringu and Tyagi

Figure 2: RMSE (top) and Data Fidelity (bottom) for the AGN model for a single MC run (n = 30, T = 20, ST = 1/T, σ = 3). Notice a sharp change in slope of the data-ﬁdelity for LTRS-GS-Dyn Sync and GMD-LTRS-Dyn Sync at β = 1, which is also where the RMSE is minimized for these two methods. For GTRS-Dyn Sync (with λscale = 10) the data-ﬁdelity peaks at β = 1, but the RMSE is minimized for β = 20.

exposition, with γ = σ > 0 for the AGN model (varied from 0 to 10), and γ = η [0, 1] for the Outliers model (varied from 0 to 0.8). Throughout, we ﬁx n = 30, T = 20, and perform 20 Monte Carlo (MC) runs for each value of γ. In a single MC run, we randomly generate the ground-truth for a speciﬁed smoothness ST (as described earlier), then generate the noisy pairwise data (as per speciﬁed noise model), and then compute the RMSE for each algorithm for the regularization parameters selected by the aforementioned data-ﬁdelity rule. The RMSE values are then averaged over the 20 runs. The results are shown in Fig. 5 for the Wigner model, and Fig. 6 for the Outliers model (with p = 0.2). The following observations can be made from these plots.

1. For the AGN model (Fig. 5), note that the performance of the spectral method and GTRS-Dyn Sync are similar for low noise levels (σ less than 1), both being worse than the other two methods in this noise regime. The performance of the spectral method degrades for σ > 1, while that of the other three methods is similar at moderate noise levels (1 < σ < 5). For larger values of σ, we see that GMD-LTRS-Dyn Sync performs best, while LTRS-GS-Dyn Sync and GTRS-Dyn Sync have similar error values.

2. For the Outliers model (Fig. 6), we observe that the spectral method performs much worse than the other three methods for almost the whole noise regime. We further observe that GTRS-Dyn Sync has a similar performance to GMD-LTRS-Dyn Sync and LTRS-GS-Dyn Sync up to moderate noise levels (η < 0.35), but its performance degrades for η > 0.35. On the other hand, GMD-LTRS-Dyn Sync appears to be more robust to high noise levels (η > 0.35) as compared to the other two methods.

Dynamic angular synchronization under smoothness constraints

(a) ST = 1/T

(b) ST = 1/T

(e) ST = T 1/4

(f) ST = T 1/4

Figure 3: RMSE versus T for AGN model (n = 30, σ = 3) with ST 1/T, 1, T 1/4 , results averaged over 20 MC runs. We choose λscale = 10 for GTRS-Dyn Sync. Results on left show average RMSE, the right panel shows average RMSE standard deviation.

Araya, Cucuringu and Tyagi

(a) ST = 1/T

(b) ST = 1/T

(e) ST = T 1/4

(f) ST = T 1/4

Figure 4: RMSE versus T for Outliers model (n = 30, η = p = 0.2) with ST 1/T, 1, T 1/4 , results averaged over 20 MC runs. We choose λscale = 10 for GTRS-Dyn Sync. Results on left show average RMSE, the right panel shows average RMSE standard deviation. 30

Dynamic angular synchronization under smoothness constraints

5. Concluding remarks

In this work we introduced the problem of dynamic angular synchronization, where the underlying latent angles, and the measurement graphs, both evolve with time. Assuming a smoothness condition on the evolution of the angles, with the smoothness measured by the quadratic variation of the angles w.r.t the path graph on T vertices, we proposed three algorithms (namely, GTRS-Dyn Sync, LTRS-GS-Dyn Sync and GMD-LTRS-Dyn Sync) for jointly estimating the angles at all time points. For Algorithm GMD-LTRS-Dyn Sync, we derived non-asymptotic error bounds for the AGN and Outliers noise models, and established conditions under which the MSE goes to zero as T increases. In particular, this was shown to occur under much milder conditions than the static setup (T = 1). It includes the setting where the measurement graphs are highly sparse and disconnected, and also when the measurement noise is large and can potentially increase with T. We validated our theoretical results with experiments on synthetic data which show that GMD-LTRS-Dyn Sync typically outperforms the other two methods. An interesting direction for future work would be to extend our framework to other groups such as SO(d). Another direction would be to consider evolution models where the measurements are statistically dependent over time our analysis requires the measurements to be independent across time points.

Acknowledgments

Part of the work was done while E.A was at Inria Lille. The ﬁrst version of this paper was prepared while H.T was aﬃliated to Inria Lille. Authors are listed in alphabetical order. Corresponding author: H.T.

Araya, Cucuringu and Tyagi

(a) ST = 1/T

(b) ST = 1/T

(e) ST = T 1/4

(f) ST = T 1/4

Figure 5: RMSE versus γ(= σ) for AGN model (n = 30, T = 20) with ST 1/T, 1, T 1/4 , results averaged over 20 MC runs. We choose λscale = 10 for GTRS-Dyn Sync. Results on left show average RMSE, the right panel shows average RMSE standard deviation.

Dynamic angular synchronization under smoothness constraints

(a) ST = 1/T

(b) ST = 1/T

(e) ST = T 1/4

(f) ST = T 1/4

Figure 6: RMSE versus γ(= η) for Outliers model (n = 30, p = 0.2, T = 20) with ST 1/T, 1, T 1/4 , results averaged over 20 MC runs. We choose λscale = 10 for GTRS-Dyn Sync. Results on left show average RMSE, the right panel shows average RMSE standard deviation. 33

Araya, Cucuringu and Tyagi

Appendix A. Proof of Theorem 8

The bound in (25) follows by plugging τ in (23), and using the bound τ min 1 + T 2ST

nσ2 1/3 , T

along with the fact (a + b)2 a2 + b2 for a, b 0.

To show the second part, observe that if T 2ST

nσ2 1/3 1 then (25) simpliﬁes to

b G G 2 F max n n5/3σ4/3T 2/3S1/3 T , n ST o 1{τ <T} + min n σ4/3T 2/3S1/3 T n5/3, T o + σ2 log 4

and also τ min T 2ST

nσ2 1/3 , T . Now the bound in (27) follows readily after minor sim-

pliﬁcations provided T satisﬁes (26). We just remark that T ST

nσ2 implies τ T 2ST

nσ2 1/3 , and in particular τ < T.

Appendix B. Proof of Lemma 11

Let H(k) {0, 1}n n denote the (symmetric) adjacency matrix of Hk. Hence (Hij(k))i<j are independent Bernoulli random variables with parameter p(k), and Hii(k) = 0 for all i. Also denote Z(k) Cn n to be a Hermitian matrix with Zii(k) = 0 for each i [n], and for each i < j

( g i (k)g j (k)H with probability 1 η, eιϕij(k) with probability η

where (ϕij(k))i<j ind. Unif[0, 2π). Then we can write A(k) = Z(k) H(k), and A = Z H where Z Cn T n (resp. H {0, 1}n T n) is formed by stacking Z(1), . . . Z(T) (resp. H(1), . . . , H(T)). Then we can write R = Z H DG . Now decompose Z = Z1 + Z2 where Z1(k) (resp. Z2(k)) contains the upper triangular (resp. lower triangular) part of Z(k). Write H = H1 +H2 and G = G 1 +G 2 in an analogous manner. Then R decomposes as R = (Z1 H1 DG 1) | {z } =:R1

+ (Z2 H2 DG 2) | {z } =:R2

with the entries of R1 (resp. R2) being independent, centered subgaussian random variables. Now, Pτ,n R 2 F 2( Pτ,n R1 2 F + Pτ,n R2 2 F ) and it is suﬃcient to bound Pτ,n R1 2 F , since the same bound will follow for Pτ,n R2 2 F . Decomposing R1 into its real and complex parts R1 = Re(R1) | {z } =:R1,R

+ι Im(R1) | {z } =:R1,I

we have Pτ,n R1 2 F = Pτ,n R1,R 2 F + Pτ,n R1,I 2 F . Again, we focus only on Pτ,n R1,R 2 F as the same bound follows for Pτ,n R1,I 2 F in an analogous manner. Denote u Rn2T to be the vector formed by stacking the columns of R1,R Rn T n. Then we obtain Pτ,n R1,R 2 F = u (In Pτ,n)u

Dynamic angular synchronization under smoothness constraints

which we will bound via the Hanson-Wright inequality (Rudelson and Vershynin, 2013, Thm. 1.1). To this end, we ﬁrst derive an upper bound on E[u (In Pτ,n)u] as follows. Denoting Z1,R(k) (resp. G 1,R(k)) to be the real part of Z1(k) (resp. G 1(k)), we obtain

(R1,R(k))ij = (Z1,R(k))ij(H1(k))ij (1 η)p(k)(G 1,R(k))ij.

Hence we can bound E[(R1,R(k))2 ij] for all i = j [n] as follows.

E[(R1,R(k))2 ij] = p(k)E[(Z1,R(k))ij (1 η)p(k)(G 1,R(k))ij]2 + (1 p(k))p2(k)(1 η)2(G 1,R(k))2 ij p(k)(Var((Z1,R(k))ij) + (1 η)2(1 p(k))2) + (1 p(k))p2(k)(1 η)2.

It is not diﬃcult to verify that Var((Z1,R(k))ij) η, which upon plugging in the above bound readily implies

E[(R1,R(k))2 ij] pmaxη + (1 η)2V + pmax V (1 η) | {z } =:f(η,pmax,V )

Thus we have shown that

E[u (In Pτ,n)u] Cn2τf(η, pmax, V )

for some constant C > 0. We next bound (R1,R(k))ij ψ2 for all i = j as follows. To begin with, we have

(R1,R(k))ij ψ2 = (Z1,R(k))ij(H1(k))ij (1 η)p(k)(G 1,R(k))ij ψ2 (Z1,R(k))ij (1 η)(G 1,R(k))ij ψ2 + (H1(k))ij p(k) ψ2 (43)

where we used triangle inequality and the fact (H1(k))ij, |(G 1,R(k))ij| 1 (uniformly over i, j, k). Now the second term in (43) is bounded as

(H1(k))ij p(k) ψ2 Q(p(k)) max k [T] Q(p(k)).

To bound the ﬁrst term in (43), we begin by writing

(Z1,R(k))ij (1 η)(G 1,R(k))ij = Bij(k) cos(ϕij(k)) (Bij(k) η) cos(θi(k) θj(k))

where Bij(k) is a Bernoulli random variable with parameter η. Then it is not diﬃcult to verify Bij(k) cos(ϕij(k)) (Bij(k) η) cos(θi(k) θj(k)) ψ2 Q(η) + η.

Thus, we have shown that

(R1,R(k))ij ψ2 Q(η) + η + max k [T] Q(p(k)) =: Q.

Upon applying the above bounds in conjunction with the Hanson Wright inequality, we readily obtain that with probability at least 1 δ ,

Pτ,n R1,R 2 F C n2τf(η, pmax, V ) + Q2n p

τ log(1/δ) + Q2 log(1/δ)

C n2τf(η, pmax, V ) + Q2n τ log(1/δ)

where we used p

log(1/δ) log(1/δ) if δ (0, c) for a suitably small constant c. The proof is completed by using the same error bound on the other terms discussed earlier, applying a union bound, and adjusting the constants.

Araya, Cucuringu and Tyagi

Appendix C. Proof of Theorem 13

Using (31) and Lemma 11 leads to the bound

b G DG 2 F C1 µ d2n T 2ST

τ 2 1τ<T + C2

n2τf(η, pmax, V ) + Q2n τ log 1

Following the same argument as in Section 3.1, we arrive at the choice τ = τ which upon plugging in the RHS of the above bound leads to the stated error bound in part (i) (after minor simpliﬁcation).

For part (ii), we simply note that if 1 µ d2T 2ST n f(η,pmax,V,Q,δ)

τ µ d2T 2ST n f(η, pmax, V, Q, δ)

and consequently, (33) implies

b G DG 2 F n5/3 f2/3(η, pmax, V, Q, δ)(µ d2)1/3T 2/3S1/3 T

+ max n f2/3(η, pmax, V, Q, δ)n5/3(µ d2)1/3T 2/3S1/3 T , µ d2n ST o .

We then arrive at the simpliﬁed bound in (35) provided T µ d2ST n f2/3(η,pmax,V,Q,δ).

Appendix D. Proof of Proposition 16

We will use the following useful inequality from (Liu et al., 2017).

Proposition 21 ((Liu et al., 2017)) For any q [1, ], w Cn and g Cn, we have

PCn(w) g q 2 w g q.

Proof [Proof of Proposition 16] For any k [T] we obtain

bg(k) g (k) 2 = PCn bg (k) e ιˆθ1(k) g (k) 2

= PCn nbg (k)e ιˆθ1(k) g (k) 2

2 nbg (k)e ιˆθ1(k) g (k) 2 (using Proposition 21)

2 nbg (k)e ιϕ(k) g (k) 2 + 2 n bg (k) e ιˆθ1(k) e ιφ(k) 2 , (44)

where (44) follows from the triangle inequality. Using (39), (37), we bound the ﬁrst term as

nbg (k)e ιφ(k) g (k) 2 C b G (k) s(k)G (k) F s(k) n C b G(k) s(k)G (k) F s(k) n . (45)

Dynamic angular synchronization under smoothness constraints

For the second term, we note that

n bg (k) e ιˆθ1(k) e ιφ(k) 2 = n eιˆθ1(k) eιφ(k) (since bg (k) 2 = 1)

= n|PC1 eιˆθ1(k) eιφ(k)|

= n|PC1 n |bg 1(k)|eιˆθ1(k) | {z } =bg 1(k)

2 n| nbg 1(k) eιφ(k)| (using Proposition 21)

2n bg (k) eιφ(k) g (k) n

2 (since g 1(k) = 1)

C b G(k) s(k)G (k) F s(k) (using (39), (37)). (46)

Plugging (45), (46) in (44) leads to the stated bound.

Appendix E. Additional simulations

E.1 RMSE versus noise-level

We show additional plots for RMSE versus noise-level (γ) for both the AGN (Fig. 7) and Outliers (Fig. 8) noise models but for T = 100 (recall we considered T = 20 in Figs. 5 and 6 in Section 4). We see for the AGN model that out of the three methods, GMD-LTRS-Dyn Sync is most robust for moderate to high levels of noise. For the Outliers model, we ﬁnd that GTRS-Dyn Sync performs the best for low level of noise, while GMD-LTRS-Dyn Sync is again the best method for moderate to high levels of noise.

E.2 Simulations for a Projected Power Method (PPM)

We now discuss an iterative projected power method for solving (16) that needs to be initialized with a seed g(0) C(n 1)T that is suﬃciently close to the ground truth g . This method, also referred to as the generalized power method in the literature, has had success in tackling several non-convex problems such as orthogonal procrustes (Ling, 2024), multireference alignment (Chen and Cand es, 2016), graph clustering (Wang et al., 2022), graph matching (Araya et al., 2024) and group synchronization (Boumal, 2016; Gao and Zhang, 2023; Liu et al., 2017), to name a few. The version that we consider for dynamic synchronization is outlined in Algorithm 5, dubbed PPM-Dyn Sync. At its core, it involves iteratively reﬁning the seed by sequentially performing the following operations.

Power step. Pre-multiply the current iterate by the gradient of the objective function in (16).

Projection step. The result is projected onto the set C(n 1)T using the operator PC(n 1)T .

Araya, Cucuringu and Tyagi

For initialization, we will employ GTRS-Dyn Sync, GMD-LTRS-Dyn Sync, LTRS-GS-Dyn Sync and the naive spectral algorithm considered in the experiments (Section 4). For PPM-Dyn Sync, the parameter λ is chosen via the same grid search procedure as for GTRS-Dyn Sync (see Section 4); we chose λscale = 10 for both PPM-Dyn Sync and GTRS-Dyn Sync. Surprisingly, we found that PPM-Dyn Sync does not improve the performance of the seed, and in general worsens the estimation error. While a more detailed investigation of PPM-Dyn Sync is left for future work, we illustrate its performance in Fig. 9 for the AGN model. Similar behaviour was observed for the Outliers model, and this was the case for diﬀerent choices of λscale for PPM-Dyn Sync.

Algorithm 5 Projected Power Method for Dynamic Synchronization (PPM-Dyn Sync)

1: Input: Matrix A Cn T n of observations, parameter λ > 0, number of iterations S, initial point g(0).

2: Form Ablock = blkdiag A(1), , A(T) C(n 1)T (n 1)T .

3: for s = 1, 2 . . . , S do

4: g(s) = PC(n 1)T ( Ablock λMM In 1) g(s 1) + b .

6: for k = 1, . . . , T do

7: Deﬁne bg(k) = 1 g(S)(k)

9: Output: Estimator bg = concat bg(1), , bg(T) Cn T .

Dynamic angular synchronization under smoothness constraints

(a) ST = 1/T

(b) ST = 1/T

(e) ST = T 1/4

(f) ST = T 1/4

Figure 7: RMSE versus γ(= σ) for the AGN model (n = 30, T = 100) with ST 1/T, 1, T 1/4 , results averaged over 20 MC runs. We choose λscale = 10 for GTRS-Dyn Sync. Results on left show average RMSE, the right panel shows average RMSE standard deviation. 39

Araya, Cucuringu and Tyagi

(a) ST = 1/T

(b) ST = 1/T

(e) ST = T 1/4

(f) ST = T 1/4

Figure 8: RMSE versus γ(= η) for the Outliers model (n = 30, p = 0.2, T = 100) with ST 1/T, 1, T 1/4 , results averaged over 20 MC runs. We choose λscale = 10 for GTRS-Dyn Sync. Results on left show average RMSE, the right panel shows average RMSE standard deviation. 40

Dynamic angular synchronization under smoothness constraints

(a) GTRS-Dyn Sync

(b) GMD-LTRS-Dyn Sync

(c) LTRS-GS-Dyn Sync

(d) Spectral

Figure 9: RMSE versus γ(= σ) for PPM-Dyn Sync for the AGN model (n = 30, T = 100) with ST = 1/T and diﬀerent initializations. Results averaged over 20 MC runs. We choose λscale = 10 for GTRS-Dyn Sync and PPM-Dyn Sync.

Araya, Cucuringu and Tyagi

Satoru Adachi, Satoru Iwata, Yuji Nakatsukasa, and Akiko Takeda. Solving the trust-region subproblem by a generalized eigenvalue problem. SIAM Journal on Optimization, 27(1): 269 291, 2017.

Ernesto Araya, Eglantine Karl e, and Hemant Tyagi. Dynamic ranking and translation synchronization. Information and Inference: A Journal of the IMA, 12(3):2224 2266, 2023.

Ernesto Araya, Guillaume Braun, and Hemant Tyagi. Seeded Graph Matching for the Correlated Gaussian Wigner Model via the Projected Power Method. Journal of Machine Learning Research, 25(5):1 43, 2024.

M. Arie-Nachimson, S. Z. Kovalsky, I. Kemelmacher-Shlizerman, A. Singer, and R. Basri. Global motion estimation from point matches. In 2012 Second International Conference on 3D Imaging, Modeling, Processing, Visualization Transmission, pages 81 88, 2012.

Afonso S. Bandeira and Ramon van Handel. Sharp nonasymptotic bounds on the norm of random matrices with independent entries. Ann. Probab., 44(4):2479 2506, 07 2016.

Afonso S. Bandeira, Nicolas Boumal, and Amit Singer. Tightness of the maximum likelihood semideﬁnite relaxation for angular synchronization. Mathematical Programming, 163(12):145 167, 2017.

Afonso S. Bandeira, Amelia Perry, and Alexander S. Wein. Notes on computational-tostatistical gaps: predictions using statistical physics. ar Xiv:1803.11132, 2018.

Heejong Bong, Wanshan Li, Shamindra Shrotriya, and Alessandro Rinaldo. Nonparametric Estimation in the Dynamic Bradley-Terry Model. In Proceedings of the Twenty Third International Conference on Artiﬁcial Intelligence and Statistics, pages 3317 3326, 2020.

Nicolas Boumal. Nonconvex phase synchronization. SIAM Journal on Optimization, 26(4): 2355 2377, 2016.

A.E. Brouwer and W.H. Haemers. Spectra of Graphs. New York, NY, 2012.

Manuela Cattelan, Cristiano Varin, and David Firth. Dynamic Bradley Terry modelling of sports tournaments. Journal of the Royal Statistical Society: Series C (Applied Statistics), 62(1):135 150, 2013.

Jianrui Chen, Danwei Liu, Fei Hao, and Hua Wang. Community detection in dynamic signed network: an intimacy evolutionary clustering algorithm. Journal of Ambient Intelligence and Humanized Computing, 11:891 900, 2020.

Yuxin Chen and Emmanuel J. Cand es. The projected power method: An eﬃcient algorithm for joint alignment from pairwise diﬀerences. Communications on Pure and Applied Mathematics, 71(8):1648 1714, 2016.

Mihai Cucuringu. Synchronization over Z2 and community detection in signed multiplex networks with constraints. Journal of Complex Networks, 3(3):469 506, 2015.

Dynamic angular synchronization under smoothness constraints

Mihai Cucuringu. Sync-Rank: Robust Ranking, Constrained Ranking and Rank Aggregation via Eigenvector and Semideﬁnite Programming Synchronization. IEEE Transactions on Network Science and Engineering, 3(1):58 79, 2016.

Mihai Cucuringu, Yaron Lipman, and Amit Singer. Sensor network localization by eigenvector synchronization over the Euclidean group. ACM Trans. Sen. Netw., 8(3):19:1 19:42, 2012a.

Mihai Cucuringu, Amit Singer, and David Cowburn. Eigenvector synchronization, graph rigidity and the molecule problem. Information and Inference, 1(1):21 67, 2012b.

Alexandre d Aspremont, Mihai Cucuringu, and Hemant Tyagi. Ranking and synchronization from pairwise measurements via SVD. Journal of Machine Learning Research (JMLR), 22(19):1 63, 2021.

Chandler Davis and W. M. Kahan. The rotation of eigenvectors by a perturbation. iii. SIAM Journal on Numerical Analysis, 7(1):1 46, 1970.

Carmeline J. Dsilva, Bomyi Lim, Hang Lu, Amit Singer, Ioannis G. Kevrekidis, and Stanislav Y. Shvartsman. Temporal ordering and registration of images in studies of developmental dynamics. Development, 142(9):1717 1724, 2015.

Ludwig Fahrmeir and Gerhard Tutz. Dynamic stochastic models for time-dependent ordered paired comparison systems. Journal of the American Statistical Association, 89(428): 1438 1449, 1994.

Chao Gao and Anderson Y Zhang. Optimal orthogonal group synchronization and rotation group synchronization. Information and Inference: A Journal of the IMA, 12:591 632, 2023.

A. Giridhar and P. R. Kumar. Distributed clock synchronization over wireless networks: Algorithms and analysis. In 45th IEEE Conference on Decision and Control, pages 4915 4920, 2006.

Mark E. Glickman and Hal S. Stern. A state-space model for national football league scores. Journal of the American Statistical Association, 93(441):25 35, 1998.

William W. Hager. Minimizing a quadratic over a sphere. SIAM Journal on Optimization, 12(1):188 208, 2001.

Yixuan He, Gesine Reinert, David Wipf, and Mihai Cucuringu. Robust Angular Synchronization Using Directed Graph Neural Networks. to appear at ICLR 2024; ar Xiv:2310.05842, 2024.

Xiangru Huang, Zhenxiao Liang, Chandrajit Bajaj, and Qixing Huang. Translation synchronization via truncated least squares. Advances in Neural Information Processing Systems, 30, 2017.

Pierre-Emmanuel Jabin and St ephane Junca. A continuous model for ratings. SIAM Journal on Applied Mathematics, 75(2):420 442, 2015.

Araya, Cucuringu and Tyagi

Eglantine Karl e and Hemant Tyagi. Dynamic Ranking with the BTL Model: A Nearest Neighbor based Rank Centrality Method. Journal of Machine Learning Research, 24 (269):1 57, 2023.

Shuang Li and Michael B Wakin. Recovery guarantees for time-varying pairwise comparison matrices with non-transitivity. ar Xiv:2106.09151, 2021.

Shuyang Ling. Generalized Orthogonal Procrustes Problem under Arbitrary Adversaries. ar Xiv:2106.15493, 2024.

H. Liu, Man-Chung Yue, and A. Man-Cho So. On the estimation performance and convergence rate of the generalized power method for phase synchronization. SIAM Journal on Optimization, 27(4):2426 2446, 2017.

Huikang Liu, Man-Chung Yue, and Anthony Man-Cho So. A uniﬁed approach to synchronization problems over subgroups of the orthogonal group. Applied and Computational Harmonic Analysis, 66:320 372, 2023.

Michael J. Lopez, Gregory J. Matthews, and Benjamin S. Baumer. How often does the best team win? A uniﬁed approach to understanding randomness in North American sport. The Annals of Applied Statistics, 12(4):2483 2516, 2018.

Lucas Maystre, Victor Kristof, and Matthias Grossglauser. Pairwise comparisons with ﬂexible time-dynamics. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, pages 1236 1246, 2019.

Sahand Negahban, Sewoong Oh, and Devavrat Shah. Rank centrality: Ranking from pairwise comparisons. Operations Research, 65(1):266 287, 2017.

E. Ostrovsky and L. Sirota. Exact value for subgaussian norm of centered indicator random variable. ar Xiv:1405.6749, 2014.

Amelia Perry, Alexander S. Wein, Afonso S. Bandeira, and Ankur Moitra. Message-Passing Algorithms for Synchronization Problems over Compact Groups. Communications on Pure and Applied Mathematics, 71(11):2275 2322, 2018.

Mark Rudelson and Roman Vershynin. Hanson-Wright inequality and sub-gaussian concentration. Electronic Communications in Probability, 18:1 9, 2013.

Veeranjaneyulu Sadhanala, Yu-Xiang Wang, and Ryan J. Tibshirani. Total variation classes beyond 1d: Minimax rates, and the limitations of linear smoothers. NIPS 16, page 3521 3529, 2016.

Yoel Shkolnisky and Amit Singer. Viewing direction estimation in cryo-EM using synchronization. SIAM journal on imaging sciences, 5(3):1088 1110, 2012.

Amit Singer. Angular synchronization by eigenvectors and semideﬁnite programming. Appl. Comput. Harmon. Anal., 30(1):20 36, 2011.

Roman Vershynin. Introduction to the non-asymptotic analysis of random matrices, page 210 268. Cambridge University Press, 2012.

Dynamic angular synchronization under smoothness constraints

Peng Wang, Zirui Zhou, and Anthony Man-Cho So. Non-convex exact community recovery in stochastic block model. Mathematical Programming, 195(1):1 37, 2022.

Shuzhong Zhang and Yongwei Huan. Complex quadratic optimization and semideﬁnite programming. SIAM Journal on Optimization, 16(3):871 890, 2006.

Yiqiao Zhong and Nicolas Boumal. Near-optimal bounds for phase synchronization. SIAM Journal on Optimization, 28(2):989 1016, 2018.

Onur Ozye sil, Vladislav Voroninski, Ronen Basri, and Amit Singer. A survey of structure from motion. Acta Numerica, 26:305 364, 2017.