Homomorphic Sensing: Sparsity and Noise

Liangzu Peng 1   Boshi Wang 1   Manolis C. Tsakiris 1

1 School of Information Science and Technology, ShanghaiTech University, Shanghai, China. Correspondence to: Liangzu Peng, Boshi Wang, Manolis C. Tsakiris. Proceedings of the 38th International Conference on Machine Learning, PMLR 139, 2021. Copyright 2021 by the author(s).

Abstract. Unlabeled sensing is a recent problem encompassing many data science and engineering applications and typically formulated as solving linear equations whose right-hand side vector has undergone an unknown permutation. It was generalized to the homomorphic sensing problem by replacing the unknown permutation with an unknown linear map from a given finite set of linear maps. In this paper we present tighter and simpler conditions for the homomorphic sensing problem to admit a unique solution. We show that this solution is locally stable under noise, while under a sparsity assumption it remains unique under less demanding conditions. Sparsity in the context of unlabeled sensing leads to the problem of unlabeled compressed sensing, and a consequence of our general theory is the existence, under mild conditions, of a unique sparsest solution. On the algorithmic level, we solve unlabeled compressed sensing by an iterative algorithm validated by synthetic-data experiments. Finally, under the unifying homomorphic sensing framework we connect unlabeled sensing to other important practical problems.

1. Introduction

1.1. Compressed Sensing

The beginning of the 21st century has witnessed the birth of compressed sensing, a subject, as written by Theodoridis (2020), whose starting point is to develop conditions for the solution of an underdetermined linear system of equations. In an attempt at finding a sparsest solution of the linear equations $y = Ax$ with $A \in \mathbb{R}^{m \times n}$, researchers have focused on the optimization problem

$$\min_{x \in \mathbb{R}^n} \|x\|_0 \quad \text{s.t.} \quad y = Ax. \qquad (1)$$

Assuming the existence of a $k$-sparse solution $x^*$ to (1), the first question is whether $x^*$ is unique. The answer is typically characterized via two frequently used notions, the spark (Donoho & Elad, 2003) or the Kruskal rank (Kruskal, 1977), and has played a major role in the theoretical foundations of compressed sensing, where $x^*$ is typically viewed as a sparse signal and $A$ as a measurement matrix. Specifically, for a generic $A \in \mathbb{R}^{m \times n}$ (for now it is safe to think of "generic" as "random", and of "for a generic $A \in \mathbb{R}^{m \times n}$" as "for almost all $A \in \mathbb{R}^{m \times n}$"; see §2.1), $x^*$ is the unique sparsest solution to (1) if $m \ge 2k$. Conversely, for any $A \in \mathbb{R}^{m \times n}$, there are multiple sparsest solutions to (1) whenever $m < 2k < n$. Later efforts considered the equivalence between the $\ell_0$ semi-norm minimization (1) and its $\ell_1$ relaxation known as basis pursuit, typically studied via the nullspace property (Cohen et al., 2009) or the restricted isometry property (Candes & Tao, 2005). Built upon those theoretical grounds are efficient convex optimization algorithms that solve the $\ell_1$ basis pursuit problem. For a modern account of compressed sensing the reader is referred to the relevant chapters in Theodoridis (2020); Wright & Ma (2020).

1.2. Unlabeled Sensing

Recently, increasing research efforts have concentrated on the unlabeled sensing problem, proposed by Unnikrishnan et al. (2015; 2018) in signal processing contexts. With $\mathcal{S}_m$ the set of $m \times m$ permutation matrices, $\Pi^* \in \mathcal{S}_m$ an unknown permutation, and $y = \Pi^* A x^*$, unlabeled sensing is concerned with solving the equations

$$y = \Pi A x, \quad \Pi \in \mathcal{S}_m, \; x \in \mathbb{R}^n \qquad (2)$$

in the unknown $x$ from the given data $y, A$. (In fact, Unnikrishnan et al. (2018) considered the even more challenging problem where only a subset of the entries of $y = \Pi^* A x^*$ is given; see also §4.)
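To fix ideas, the following minimal Python sketch (our own illustration, not part of the paper's experiments) generates a generic measurement matrix $A$, a $k$-sparse signal $x^*$, and shuffled measurements $y = \Pi^* A x^*$ as in (2); recovering $x^*$ from $(y, A)$ alone, without knowledge of the permutation, is the unlabeled sensing problem. The sizes and the random seed are arbitrary choices for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k = 12, 6, 2                      # toy sizes; here m >= 2n, matching the classical condition

# Generic measurement matrix and k-sparse ground-truth signal.
A = rng.standard_normal((m, n))
x_star = np.zeros(n)
support = rng.choice(n, size=k, replace=False)
x_star[support] = rng.standard_normal(k)

# Unknown permutation Pi* applied to the measurements.
perm = rng.permutation(m)
y = (A @ x_star)[perm]                  # y = Pi* A x* for the permutation matrix encoded by `perm`

# Problem (2): recover x* from (y, A) only.  Naive least squares that
# ignores the shuffling generally fails:
x_naive, *_ = np.linalg.lstsq(A, y, rcond=None)
print("naive LS error:", np.linalg.norm(x_naive - x_star))
```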
The connection of unlabeled sensing with compressed sensing is that both problems can be cast as subspace classification problems. In both cases the given data can be thought of as a union of subspaces $\bigcup_{i \in [\ell]} \mathcal{V}_i \subset \mathbb{R}^m$ together with a point $y$ in that union, and the problem is to determine which subspace $\mathcal{V}_i$ the point $y$ belongs to. In compressed sensing, the $\mathcal{V}_i$'s arise as the images of the $\ell = \binom{n}{k}$ $k$-dimensional coordinate subspaces of $\mathbb{R}^n$ under the measurement matrix $A$; in unlabeled sensing there are $\ell = m!$ subspaces $\mathcal{R}(\Pi A)$ obtained as $\Pi$ ranges over $\mathcal{S}_m$, where $\mathcal{R}(\Pi A)$ is the column space of $\Pi A$.

Applications. A prominent data analysis application of unlabeled sensing is record linkage (Fellegi & Sunter, 1969; Domingo-Ferrer & Muralidhar, 2016; Muralidhar, 2017). This relates to linking records collected from different sources, a routine operation of government agencies (e.g., see Antoni & Schnell (2019)), for the purpose of subsequent data analysis. Due to privacy concerns, it is customary for each entry of the records corresponding to some individual not to be associated with a unique identifier of this individual (e.g., the social security number). As a result, domain-specific algorithms for linking the respective entries in two (or more) records corresponding to the same individual can be error-prone, yielding imperfect data for later analysis. An alternative is to ask whether one can fit a linear regression model between two unlinked records, say $y \in \mathbb{R}^m$ and $A \in \mathbb{R}^{m \times n}$, where $y$ records only one feature while $A$ records $n$ features of the same $m$ individuals. Without linkage, the correspondences between entries of $y$ and rows of $A$ are unknown. Such data imperfection is naturally modeled by an unknown permutation $\Pi^* \in \mathcal{S}_m$, and this gives $y = \Pi^* A x^*$ with $x^* \in \mathbb{R}^n$ the unknown linear regression parameters. The aim is to recover $x^*$ from $y, A$, and this is exactly problem (2). Besides record linkage, other applications abound: signal estimation using distributed sensors (Zhu et al., 2017; Song et al., 2018; Peng et al., 2019), target localization in signal processing (Wang et al., 2020a), neuron matching in computational neuroscience (Nejatbakhsh & Varol, 2021), automated translation of medical codes (Shi et al., 2020) and flow cytometry (Abid et al., 2017; Abid & Zou, 2018) in biology, multi-target tracking (Ji et al., 2019; Xie et al., 2021) and point set registration (Pananjady et al., 2017; Lian et al., 2017) in computer vision; see Pananjady et al. (2018); Xie et al. (2021) for more applications.

Theory. Unnikrishnan et al. (2015) proved the fundamental fact that, if $A$ is generic, then $x^*$ is the unique solution to (2) if and only if $m \ge 2n$, a result more recently and independently also obtained by Han et al. (2018) using different techniques. While this result holds for any $x^* \in \mathbb{R}^n$, Tsakiris et al. (2020) showed that $m \ge n + 1$ samples are sufficient for uniqueness if we additionally assume that $x^*$ is generic. Once the uniqueness for (2) was settled, noisy measurements were considered. With $\tilde{y} := y + \epsilon = \Pi^* A x^* + \epsilon$ for some noise $\epsilon$, Pananjady et al. (2018) showed that the estimator

$$(\hat{x}, \hat{\Pi}) \in \operatorname*{argmin}_{x \in \mathbb{R}^n, \, \Pi \in \mathcal{S}_m} \|\tilde{y} - \Pi A x\|_2 \qquad (3)$$

is NP-hard to compute if $n > 1$.
Moreover, assuming that $A$ has i.i.d. standard Gaussian entries and that $\epsilon$ has Gaussian distribution $\mathcal{N}(0, \sigma^2 I_m)$, they showed that $\Pi^* = \hat{\Pi}$ with high probability as long as the SNR $:= \|x^*\|_2^2 / \sigma^2$ is exponentially high (e.g., SNR $\ge m^c$ for some constant $c > 0$). Of similar flavor is a result of Hsu et al. (2017), where it was shown, under the same setting as Pananjady et al. (2018), that $x^*$ cannot be approximately recovered unless the SNR is larger than $c \min\{1, n / \log\log m\}$ for some constant $c > 0$. Later on, under the above assumptions on $A$ and $\epsilon$, Slawski & Ben-David (2019) showed that, if $\Pi^*$ is $p$-sparse in the sense that $\Pi^*$ permutes at most $p$ rows of $A$, then the estimator

$$\hat{x} \in \operatorname*{argmin}_{x \in \mathbb{R}^n} \|\tilde{y} - A x\|_1 \qquad (4)$$

behaves well with high probability, in the sense that its distance to $x^*$ is an increasing linear function of $\sigma$.

Algorithms. Under the above sparsity assumption on $\Pi^*$, the relaxation (4) of Slawski & Ben-David (2019) can be solved via convex optimization. Empirically, this yields a good estimate of $x^*$ as long as no more than half of the data are shuffled, i.e., $p/m < 0.5$. This was improved by Slawski et al. (2019; 2021), who synthesized hypothesis testing, expectation maximization, and recursively reweighted least-squares into an efficient algorithm which can handle up to $p/m = 0.7$ shuffled data, with the drawback of being sensitive to the distribution of $A$. Tsakiris & Peng (2019), Tsakiris et al. (2020), and Peng & Tsakiris (2020) followed a very different route towards solving (3), with the aim of tackling the fully shuffled case $p/m = 1$. The two algorithms of Tsakiris & Peng (2019) are based on branch-and-bound and RANSAC respectively, and have good performance for $n \le 4$, while being intractable for $n \ge 5$. The algebraic approach of Tsakiris et al. (2020) is based on solving a system of $n$ polynomial equations in $n$ variables, and it selects the most suitable among the finite set of roots to initialize an alternating minimization scheme for (3). This gives an algorithm of linear complexity in $m$, efficient for $n \le 5$ and intractable otherwise. The algorithm of Peng & Tsakiris (2020) is based on a concave minimization reformulation of (3), solved via branch-and-bound. It can handle dimensions $n \le 8$, which to the best of our knowledge is the largest value to date for the successful operation of an unlabeled sensing algorithm on fully shuffled data. Let us also note that, when $m < n$, both (3) and (4) are bound to have infinitely many solutions, and thus all of the above algorithms in principle break down. Finally, recent algorithms that handle other types of unlabeled data include Slawski et al. (2019); Wang et al. (2020b); Zhang et al. (2019); Slawski et al. (2020); Zhang & Li (2020); Abbasi et al. (2020); Marano & Willett (2020); Jeong et al. (2020); Jeong et al. (2021); Abbasi et al. (2021); Yao et al. (2021).

1.3. Homomorphic Sensing

Inspired by Unnikrishnan et al. (2018), a generalization of unlabeled sensing was posed by Tsakiris (2018; 2020); Tsakiris & Peng (2019) under the name homomorphic sensing. This generalization replaces the set $\mathcal{S}_m$ of $m \times m$ permutations with an arbitrary finite set $\mathcal{T}$ of $r \times m$ matrices, $r \le m$. That is,

$$y = T A x, \quad T \in \mathcal{T}, \; x \in \mathbb{R}^n, \qquad (5)$$

where we are now given the measurements $y = T^* A x^*$ for some unknown $T^* \in \mathcal{T}$, and the goal is to solve (5) for $x$. Tsakiris (2018; 2020) proved that if $A$ is generic, then $x^*$ is the unique solution to (5), provided that the dimension of a certain algebraic variety that depends on $\mathcal{T}$ is at most $m - n$.
Tsakiris (2018; 2020); Tsakiris & Peng (2019) applied their results to unlabeled sensing (e.g., by setting $\mathcal{T}$ to be $\mathcal{S}_m$) and obtained the same conditions as Unnikrishnan et al. (2018) which guarantee the uniqueness of the solution to (2).

1.4. Contributions of this paper

We improve and generalize prior works in several ways; complete proofs of our statements are in Peng & Tsakiris (2021).

Sparse homomorphic sensing. We bring homomorphic sensing and compressed sensing together, and arrive at the problem of sparse homomorphic sensing. Recalling $y = T^* A x^*$, we consider the following $\ell_0$ optimization problem:

$$\min_{x \in \mathbb{R}^n} \|x\|_0 \quad \text{s.t.} \quad y = T A x, \; T \in \mathcal{T}. \qquad (6)$$

Assuming that $x^*$ is a $k$-sparse solution to (6), we provide conditions under which $x^*$ is the unique such solution (Theorem 1). When $k = n$, our conditions in particular guarantee the uniqueness of the solution to homomorphic sensing (5), and they are tighter and simpler than those of Tsakiris (2018; 2020); Tsakiris & Peng (2019) (Corollary 1).

Noisy homomorphic sensing. We also extend homomorphic sensing (5) to noisy homomorphic sensing, where we are given the noisy measurements $\tilde{y} = y + \epsilon = T^* A x^* + \epsilon$. We show in Theorem 2 that, as long as $\|\epsilon\|_2$ is sufficiently small and (5) admits a unique solution, the following problem (7) produces an estimate $\hat{x}$ close to $x^*$:

$$(\hat{x}, \hat{T}) \in \operatorname*{argmin}_{x \in \mathbb{R}^n, \, T \in \mathcal{T}} \|\tilde{y} - T A x\|_2. \qquad (7)$$

When setting $\mathcal{T}$ to $\mathcal{S}_m$ as in (3), we obtain an improved result for unlabeled sensing over that of Unnikrishnan et al. (2018).

Unlabeled compressed sensing. We propose unlabeled compressed sensing, where we let $y = \Pi^* A x^*$ with $x^*$ a $k$-sparse solution to the following optimization problem:

$$\min_{x \in \mathbb{R}^n} \|x\|_0 \quad \text{s.t.} \quad y = \Pi A x, \; \Pi \in \mathcal{S}_m. \qquad (8)$$

Clearly, (8) is a special case of (6), where $\mathcal{T}$ is set to $\mathcal{S}_m$. So our theorem for sparse homomorphic sensing can be applied to unlabeled compressed sensing, and in so doing we get:

Proposition 1. For a generic $A \in \mathbb{R}^{m \times n}$, $x^*$ is the unique sparsest solution to (8) as long as $m \ge 2k$.

Proposition 1 seems interesting. Indeed, the number $2k$ is the threshold for unique recovery of $x^*$ in compressed sensing (recall §1.1), but this number remains the same in unlabeled compressed sensing, even though there could be $m!$ choices for the potential permutations. In particular, Proposition 1 holds true even when $m \le n$.

Computationally, we consider a relaxation of (8), which we solve via an iterative algorithm based on subgradient descent and $\ell_1$ minimization (§3.2). This is the first algorithm for unlabeled sensing which works even when $m < n$, a regime unexplored in prior works. By experiments (§3.3), we empirically show that i) the algorithm returns a correct estimate as long as $x^*$ and $\Pi^*$ are both sufficiently sparse (i.e., $p$ and $k$ are small), ii) it is efficient, and iii) it is robust to noise.

A broader picture. As it turns out, we find that, besides unlabeled sensing, homomorphic sensing contains as special cases other important inverse problems, such as real phase retrieval, mixed linear regression, and missing data recovery, to name a few. This allows our theory to be further applied to those special cases, and also allows potential connections between these problems themselves via the unifying framework of homomorphic sensing. We discuss this in §4.

2. Homomorphic sensing theory

In §2.1 we give some preliminaries on algebraic geometry. In §2.2 we formally state the problem and discuss immediate observations. In §2.3 we describe conditions for the uniqueness of the solution to (6). In §2.4 and §2.5, we introduce Theorems 1 and 2 for sparse and noisy homomorphic sensing, respectively.
Along the way we give intuition on the proofs when space allows, and we refer the reader to Peng & Tsakiris (2021) for the complete arguments.

2.1. Preliminaries on algebraic geometry

Here we review some background from algebraic geometry. A very accessible introduction to this subject is Cox et al. (2013) and a friendly one is Eisenbud (2013). The basic object in algebraic geometry is a complex (resp. real) algebraic variety, say $\mathcal{Q}$, which is typically defined as the set of complex (resp. real) roots of finitely many polynomials $p_1, \dots, p_s$ in $m$ variables with complex (resp. real) coefficients; in other words, $\mathcal{Q} := \{z : p_i(z) = 0, \; i = 1, \dots, s\}$. A subvariety of some algebraic variety $\mathcal{Q}$ is a subset of $\mathcal{Q}$ that is itself an algebraic variety. For example, a linear subspace of $\mathbb{R}^m$ is an algebraic variety defined by linear forms, and so is the union of finitely many linear subspaces, $\bigcup_{i=1}^{s} \mathcal{V}_i$, which is defined by the products of the defining equations of its constituent subspaces $\mathcal{V}_i$.

Declaring the subvarieties of $\mathbb{R}^{m \times n}$ to be the closed sets, we obtain the so-called Zariski topology, and, as usual, the complement of a closed set is called (Zariski) open. By a generic matrix of $\mathbb{R}^{m \times n}$ having some property, we mean that there is some non-empty Zariski open subset $\mathcal{O}$ of $\mathbb{R}^{m \times n}$ such that every matrix of $\mathcal{O}$ satisfies this property. A non-empty Zariski open subset is dense, in view of the fact that $\mathbb{R}^{m \times n}$ is irreducible (an irreducible algebraic variety is one which cannot be written as the union of two proper subvarieties of it). The consequence of such a non-empty Zariski open set $\mathcal{O}$ being dense is that a matrix randomly chosen from $\mathbb{R}^{m \times n}$ according to some continuous probability distribution will land in $\mathcal{O}$ with probability one.

The final algebraic-geometric notion which we need is that of dimension, and for technical reasons we consider this notion over the complex field $\mathbb{C}$. Intuitively, the dimension of a given subset of $\mathbb{C}^m$ is a non-negative number that measures the size of the set. Formally, the dimension $\dim(\mathcal{Q})$ of an algebraic variety $\mathcal{Q}$ is the maximal length $t$ of the chains $\mathcal{Q}_0 \subsetneq \mathcal{Q}_1 \subsetneq \cdots \subsetneq \mathcal{Q}_t$ of distinct irreducible algebraic varieties contained in $\mathcal{Q}$; and the dimension of any set is the dimension of its closure, i.e., of the smallest algebraic variety which contains it. For example, it is not hard to show from the definition that a linear subspace has its linear-algebraic dimension equal to its algebraic-geometric dimension, and that the dimension of a finite union of linear subspaces, $\dim(\bigcup_{i=1}^{s} \mathcal{V}_i)$, is equal to the maximum dimension of its constituent subspaces, $\max_{i=1,\dots,s} \dim(\mathcal{V}_i)$.

2.2. The uniqueness at first glance

The uniqueness for (6) involves the measurements $y = T^* A x^*$, where $x^*$ is $k$-sparse and can otherwise be arbitrary, and we seek conditions that guarantee that it is the only $k$-sparse solution to (6). This motivates Definition 1.

Definition 1 (hsp). Let $A \in \mathbb{R}^{m \times n}$ and $\mathcal{T} \subset \mathbb{R}^{r \times m}$. If for any $T_1, T_2 \in \mathcal{T}$ and any $k$-sparse $x_1, x_2 \in \mathbb{R}^n$ we have $T_1 A x_1 = T_2 A x_2 \Rightarrow x_1 = x_2$, then we say that $\mathcal{T}$ and $A$ satisfy the homomorphic sensing property for $k$-sparse vectors, denoted as $\mathrm{hsp}(\mathcal{T}, A, k)$.

While $\mathrm{hsp}(\mathcal{T}, A) := \mathrm{hsp}(\mathcal{T}, A, n)$ is clearly a special case of $\mathrm{hsp}(\mathcal{T}, A, k)$ (with $k$ set to $n$), Definition 1 shows that $\mathrm{hsp}(\mathcal{T}, A)$ implies $\mathrm{hsp}(\mathcal{T}, A, k)$ for any $k \le n$. Moreover, $\mathrm{hsp}(\mathcal{T}, A)$ is the same as the uniqueness of the solution to the homomorphic sensing problem (5), which, in turn, is equivalent to saying that $x^*$ is the only feasible point of (6) and thus necessarily the unique solution to (6).
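For small instances, Definition 1 can be checked numerically by brute force. The sketch below is our own illustration (not an algorithm from the paper): for a fixed ordered pair $(T_1, T_2)$ and every pair of $k$-element supports, it tests whether $T_1 A x_1 = T_2 A x_2$ forces $x_1 = x_2$, by inspecting the null space of $[T_1 A_I \; | \; -T_2 A_J]$.

```python
import numpy as np
from itertools import combinations

def embed(coords, support, n):
    """Place `coords` on the index set `support` inside a length-n vector."""
    v = np.zeros(n)
    v[list(support)] = coords
    return v

def hsp_pair(T1, T2, A, k, tol=1e-9):
    """Brute-force check of hsp for the ordered pair (T1, T2), as in Definition 1.

    Any solution of T1*A_I*u = T2*A_J*v lies in the null space of
    [T1*A_I, -T2*A_J]; hsp holds iff every such solution corresponds to
    equal vectors x1 = x2 in R^n (a linear condition, so checking a
    basis of the null space suffices)."""
    n = A.shape[1]
    for I in combinations(range(n), k):
        for J in combinations(range(n), k):
            M = np.hstack([T1 @ A[:, list(I)], -(T2 @ A[:, list(J)])])
            _, s, Vt = np.linalg.svd(M)             # null-space basis via SVD
            null_basis = Vt[np.sum(s > tol):]
            for w in null_basis:
                x1, x2 = embed(w[:k], I, n), embed(w[k:], J, n)
                if np.linalg.norm(x1 - x2) > tol:
                    return False
    return True

rng = np.random.default_rng(1)
m, n, k = 6, 4, 2                            # m >= 2k, cf. Proposition 1 / Corollary 2
A = rng.standard_normal((m, n))              # a "generic" A
P1 = np.eye(m)[rng.permutation(m)]           # two permutation matrices
P2 = np.eye(m)[rng.permutation(m)]
# Per Proposition 3, checking all ordered pairs from {P1, P2} suffices.
print(all(hsp_pair(Ta, Tb, A, k) for Ta in (P1, P2) for Tb in (P1, P2)))
```

The expected output is True for a generic $A$ when $m \ge 2k$; this is exactly what Corollary 2 below predicts for permutations.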
More generally, Definition 1 immediately implies the following equivalence.

Proposition 2. We have $\mathrm{hsp}(\mathcal{T}, A, k)$ if and only if problem (6) always has a unique $k$-sparse solution.

Next, our main focus will be on $\mathrm{hsp}(\mathcal{T}, A, k)$. Specifically, we will discuss the conditions to put on $\mathcal{T}$ under which $\mathrm{hsp}(\mathcal{T}, A, k)$ holds for a generic $A \in \mathbb{R}^{m \times n}$.

2.3. Conditions for hsp(T, A, k)

Recall that $\mathcal{T}$ is a finite set of matrices. We first handle the finiteness of $\mathcal{T}$ by the following fact.

Proposition 3. Suppose that $\mathrm{hsp}(\{T_1, T_2\}, A, k)$ holds for every $T_1, T_2 \in \mathcal{T}$ and for a generic $A \in \mathbb{R}^{m \times n}$; then $\mathrm{hsp}(\mathcal{T}, A, k)$ holds for a generic $A \in \mathbb{R}^{m \times n}$.

Proof. This follows directly from the fact that the intersection of finitely many non-empty Zariski open subsets of $\mathbb{R}^{m \times n}$ is again non-empty and Zariski open.

Proposition 3 is intuitive, and it suggests that we focus on $\mathrm{hsp}(\{T_1, T_2\}, A, k)$ with two matrices $T_1, T_2 \in \mathcal{T}$ fixed. In what follows we will consider when $\mathrm{hsp}(\{T_1, T_2\}, A, k)$ would be violated, so as to derive conditions for it to hold. One condition will be the rank constraint (10) (§2.3.1) and the other will be the quasi-variety constraint (12) (§2.3.2).

2.3.1. THE RANK CONSTRAINT

By Definition 1, $\mathrm{hsp}(\{T_1, T_2\}, A, k)$ is tightly related to the column spaces of $T_1$ and $T_2$. This motivates us to consider

$$\mathcal{Z}_{T_1, T_2} := \{u \in \mathbb{C}^m : T_1 u = T_2 u\}. \qquad (9)$$

Note that $\mathcal{Z}_{T_1, T_2}$ is a complex linear subspace of $\mathbb{C}^m$, and therefore a complex algebraic variety. (While $T_1$ and $T_2$ are real matrices, we define $\mathcal{Z}_{T_1, T_2}$ as a complex object for technical reasons.) The role of $\mathcal{Z}_{T_1, T_2}$ in hsp can be seen as follows. Let $x_1, x_2$ be $k$-sparse and suppose their images $A x_1, A x_2$ under $A$ are vectors of $\mathcal{Z}_{T_1, T_2}$. Then $T_1 A x_1 = T_2 A x_2$ is the same as $T_1 A (x_1 - x_2) = 0$. Hence, whether $T_1 A x_1 = T_2 A x_2$ implies $x_1 = x_2$ reduces to a problem of compressed sensing, and the answer is immediate from inspecting the spark or Kruskal rank of $T_1 A$ (recall §1.1). As a consequence, to fully understand $\mathrm{hsp}(\{T_1, T_2\}, A, k)$ it suffices to focus only on $k$-sparse vectors whose images under $A$ are not contained in $\mathcal{Z}_{T_1, T_2}$. We now therefore assume, without loss of generality (see Peng & Tsakiris (2021) for the full arguments), that $\mathcal{Z}_{T_1, T_2}$ does not intersect the images of the $k$-sparse vectors under a generic $A$. Since the latter form a union of subspaces of $\mathbb{R}^m$ of dimension $k$, we can simply assume that $\mathcal{Z}_{T_1, T_2}$ has dimension at most $m - k$. Under this assumption we obtain the following characterization.

Proposition 4. Let $\dim(\mathcal{Z}_{T_1, T_2}) \le m - k$ and $\mathrm{rank}[T_1 \; T_2] < 2k$. Then $\mathrm{hsp}(\{T_1, T_2\}, A, k)$ is false for a generic $A \in \mathbb{R}^{m \times n}$.

Proof. It suffices to show that $\mathrm{hsp}(\{T_1, T_2\}, A_i, k)$ is false for a generic $A_i \in \mathbb{R}^{m \times k}$ (here one may regard $A_i$ as consisting of $k$ distinct columns of $A$). If $\mathrm{rank}(T_1 A_i) < k$ then $\mathrm{hsp}(\{T_1\}, A_i, k)$ is false. Hence let us assume $m \ge k$, $r \ge k$, $\mathrm{rank}(T_1 A_i) = k$, and similarly $\mathrm{rank}(T_2 A_i) = k$. But $[T_1 \; T_2] \in \mathbb{R}^{r \times 2m}$ has rank smaller than $2k$, and so does the $r \times 2k$ matrix $[T_1 A_i \; T_2 A_i]$. As a result, there are non-zero vectors $x_1, x_2 \in \mathbb{R}^k$ such that $T_1 A_i x_1 = T_2 A_i x_2$. Assume $x_1 = x_2$. Then $A_i x_1$ is an element of $\mathcal{Z}_{T_1, T_2}$. But $\mathcal{Z}_{T_1, T_2}$ has dimension at most $m - k$, so, for a generic $A_i \in \mathbb{R}^{m \times k}$, the column space of $A_i$ intersects $\mathcal{Z}_{T_1, T_2}$ only at zero. This gives $A_i x_1 = 0$ and $x_1 = 0$, a contradiction. Hence $x_1 \ne x_2$, and $\mathrm{hsp}(\{T_1, T_2\}, A_i, k)$ is false.

To prevent the situation of Proposition 4 from happening, we consider:

The Rank Constraint: $\;\mathrm{rank}(T) \ge 2k$ for every $T \in \mathcal{T}$. $\qquad (10)$

The rank constraint (10) ensures that $\mathrm{rank}[T_1 \; T_2] \ge 2k$ for any $T_1, T_2 \in \mathcal{T}$, so that the bad situation in Proposition 4 never occurs.
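As a quick numerical aside (our own sketch, not from the paper), both the rank constraint (10) and the subspace $\mathcal{Z}_{T_1,T_2}$ of (9) can be inspected with standard linear algebra; $\mathcal{Z}_{T_1,T_2}$ is simply the null space of $T_1 - T_2$.

```python
import numpy as np
from scipy.linalg import null_space

def rank_constraint_ok(Ts, k):
    """Check the rank constraint (10): rank(T) >= 2k for every T in the set."""
    return all(np.linalg.matrix_rank(T) >= 2 * k for T in Ts)

def Z_basis(T1, T2):
    """Orthonormal basis of Z_{T1,T2} = {u : T1 u = T2 u}, i.e. null(T1 - T2)."""
    return null_space(T1 - T2)

rng = np.random.default_rng(0)
m, k = 6, 2
P1 = np.eye(m)[rng.permutation(m)]
P2 = np.eye(m)[rng.permutation(m)]
print(rank_constraint_ok([P1, P2], k))   # permutations have rank m, so True iff m >= 2k
print(Z_basis(P1, P2).shape[1])          # dim Z: the number of cycles of the permutation P2^{-1} P1
```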
The rank constraint is perhaps the simplest one, because it does not involve any interaction between $T_1$ and $T_2$. However, the rank constraint does not exclude all possible violations of $\mathrm{hsp}(\{T_1, T_2\}, A, k)$, as illustrated next.

Example 1. Suppose $m = 4$, $r = 4$, $k = 2$. For any $A \in \mathbb{R}^{4 \times n}$, let $a_4^\top \in \mathbb{R}^{1 \times n}$ be the last row of $A$. Since the null space of $a_4^\top$ has dimension at least $n - 1$, it must intersect any coordinate subspace of dimension 2 at some non-zero point. Hence there is a non-zero $k$-sparse vector $u \in \mathbb{R}^n$ which satisfies $a_4^\top u = 0$. Let $T_1$ and $T_2$ be defined as

$$T_1 = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 5 & 6 & 7 & 8 \\ 9 & 10 & 11 & 12 \\ 13 & 14 & 15 & 16 \end{bmatrix}, \qquad T_2 = \begin{bmatrix} 3 & 6 & 9 & 20 \\ 15 & 18 & 21 & 40 \\ 27 & 30 & 33 & 60 \\ 39 & 42 & 45 & 80 \end{bmatrix}.$$

Then we have $T_1 A (3u) = T_2 A u$, but $3u \ne u$. The rank constraint is satisfied, yet $\mathrm{hsp}(\{T_1, T_2\}, A, k)$ is violated.

2.3.2. THE QUASI-VARIETY CONSTRAINT

Here we introduce a constraint on $\{T_1, T_2\}$ under which the situation of Example 1 cannot happen. This involves the following algebraic-geometric object. For a column vector $w$ of $m$ variables, consider all $2 \times 2$ determinants of the $r \times 2$ matrix $[T_1 w \; T_2 w]$. Each such determinant is a quadratic polynomial in the entries of $w$. Let $\mathcal{Y}_{T_1, T_2} \subset \mathbb{C}^m$ be the complex algebraic variety defined by those determinants.

Example 2. For $m = 3$, $r = 2$,

$$T_1 = \begin{bmatrix} 0 & 0 & 2 \\ 2 & 4 & 1 \end{bmatrix}, \qquad T_2 = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix},$$

the variety $\mathcal{Y}_{T_1, T_2}$ consists of the complex roots of the following polynomial $p$ in the variables $w_1, w_2, w_3$:

$$p = \det \begin{bmatrix} 2 w_3 & w_1 + 2 w_2 + 3 w_3 \\ 2 w_1 + 4 w_2 + w_3 & 4 w_1 + 5 w_2 + 6 w_3 \end{bmatrix}.$$

Alternatively and equivalently, we may describe $\mathcal{Y}_{T_1, T_2}$ as the set of vectors $u \in \mathbb{C}^m$ for which $T_1 u$ and $T_2 u$ are linearly dependent. Observing that $\mathcal{Z}_{T_1, T_2}$ of (9) is a subvariety of $\mathcal{Y}_{T_1, T_2}$, we define

$$\mathcal{U}_{T_1, T_2} := \mathcal{Y}_{T_1, T_2} \setminus \mathcal{Z}_{T_1, T_2} \qquad (11)$$

to be the set-theoretic difference between the two varieties $\mathcal{Y}_{T_1, T_2}$ and $\mathcal{Z}_{T_1, T_2}$, the latter contained in the former. By definition, $\mathcal{U}_{T_1, T_2}$ is usually called a quasi-variety.

Letting $b := \binom{n}{k}$, $[b] := \{1, \dots, b\}$, and denoting the set of $m \times k$ column-submatrices of $A$ by $\{A_i\}_{i=1}^{b}$, we show that $\mathcal{U}_{T_1, T_2}$ is of potential harm to $\mathrm{hsp}(\{T_1, T_2\}, A, k)$:

Proposition 5. If for some $i \in [b]$ the intersection of $\mathcal{U}_{T_1, T_2}$ with the column space $\mathcal{R}(A_i)$ of $A_i$ is not empty, then $\mathrm{hsp}(\{T_1, T_2\}, A, k)$ is not true.

Proof. Suppose that $\mathcal{R}(A_i) \cap \mathcal{U}_{T_1, T_2}$ is not empty and let $u \in \mathcal{R}(A_i) \cap \mathcal{U}_{T_1, T_2}$. Then there is some $\lambda \in \mathbb{R}$ such that $T_1 u = \lambda T_2 u$ or $\lambda T_1 u = T_2 u$. Since $u \in \mathcal{U}_{T_1, T_2}$, we have $u \ne 0$ and $\lambda \ne 1$. Since $u \in \mathcal{R}(A_i)$, we have, for some $x$ with $A_i x = u$, that $T_1 A_i x = T_2 A_i (\lambda x)$ or $T_1 A_i (\lambda x) = T_2 A_i x$. But $x \ne \lambda x$.

Proposition 5 suggests that the bad event where $\mathcal{U}_{T_1, T_2}$ intersects $\mathcal{R}(A_i)$ must be prevented. We then expect the quasi-variety $\mathcal{U}_{T_1, T_2}$ to be of as small a size as possible. Its size can be measured by dimension, the algebraic-geometric notion that assigns to each subset of $\mathbb{C}^m$ a non-negative integer, with the convention $\dim(\emptyset) := -1$ (recall §2.1). To say that $\dim(\mathcal{U}_{T_1, T_2})$ is small is to say that $\mathcal{U}_{T_1, T_2}$ is small, which in turn suggests that it is unlikely for $\mathcal{U}_{T_1, T_2}$ to intersect $\mathcal{R}(A_i)$. We formalize this intuition below.

Proposition 6. Suppose $\dim(\mathcal{U}_{T_1, T_2}) \le m - k$. Then, for a generic matrix $A \in \mathbb{R}^{m \times n}$, the column space $\mathcal{R}(A_i)$ of $A_i$ does not intersect $\mathcal{U}_{T_1, T_2}$ for any $i \in [b]$.

As per Proposition 6, enforcing $\mathcal{U}_{T_1, T_2}$ to have small dimension is indeed an effective means of excluding the bad event of $\mathcal{U}_{T_1, T_2}$ intersecting $\mathcal{R}(A_i)$; that is, with $\dim(\mathcal{U}_{T_1, T_2}) \le m - k$ the violation of $\mathrm{hsp}(\{T_1, T_2\}, A, k)$ described in Proposition 5 cannot happen. This justifies the following constraint:

The Quasi-variety Constraint: $\;\dim(\mathcal{U}_{T_1, T_2}) \le m - k$ for all $T_1, T_2 \in \mathcal{T}$. $\qquad (12)$
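For concreteness, the following small numerical check (our own illustration, using the matrices of Example 1 as reconstructed above) replays the violation that the quasi-variety constraint is designed to preclude: it builds a 2-sparse $u$ orthogonal to the last row of a random $A$ and verifies that $T_1 A(3u) = T_2 A u$ even though $3u \ne u$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((4, n))

# T2 agrees with 3*T1 on its first three columns; only the fourth column differs.
T1 = np.arange(1, 17, dtype=float).reshape(4, 4)
T2 = 3 * T1.copy()
T2[:, 3] = [20, 40, 60, 80]

# A 2-sparse u with a_4^T u = 0: put weights on coordinates {0, 1} only.
a4 = A[3]
u = np.zeros(n)
u[0], u[1] = a4[1], -a4[0]                 # then a4[0]*u[0] + a4[1]*u[1] = 0

# Since (A u)_4 = 0, T2 (A u) only uses the first three columns of T2, i.e. 3*T1.
lhs, rhs = T1 @ A @ (3 * u), T2 @ A @ u
print(np.allclose(lhs, rhs), np.allclose(3 * u, u))   # True False: two distinct 2-sparse solutions
```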
Remark 1. The dimension $\dim(\mathcal{U}_{T_1, T_2})$ appearing in (12) can be computed as follows. In commutative algebra terms (Cox et al., 2013; Eisenbud, 2013), $\dim(\mathcal{U}_{T_1, T_2})$ is equal to the (Krull) dimension of the vanishing ideal of the closure of $\mathcal{U}_{T_1, T_2}$. This vanishing ideal is, by definition (11), the saturation of the vanishing ideal of $\mathcal{Y}_{T_1, T_2}$ with respect to the vanishing ideal of $\mathcal{Z}_{T_1, T_2}$. The saturation, and thus its Krull dimension, can be computed directly from the definitions of $\mathcal{Y}_{T_1, T_2}$ and $\mathcal{Z}_{T_1, T_2}$ using commutative algebra software, e.g., Macaulay2.

Example 3. In Example 1, we have $\dim(\mathcal{U}_{T_1, T_2}) = 3$, which is larger than $m - k = 2$. As a result, $T_1$ and $T_2$ of Example 1 violate the quasi-variety constraint.

The rank constraint (10) and the quasi-variety constraint (12) are sufficient for $\mathrm{hsp}(\mathcal{T}, A, k)$, as we will see next.

2.4. Unique recovery in sparse homomorphic sensing

Theorem 1. If a finite set $\mathcal{T} \subset \mathbb{R}^{r \times m}$ of matrices satisfies the rank constraint (10) and the quasi-variety constraint (12), then we have $\mathrm{hsp}(\mathcal{T}, A, k)$ for a generic $A \in \mathbb{R}^{m \times n}$.

Constraints (10) and (12) of Theorem 1 guarantee the uniqueness of the solution to sparse homomorphic sensing (6). There are two major hurdles towards proving Theorem 1. One is the non-linearity brought by the set $\mathcal{K}$ of $k$-sparse vectors. The other comes from the matrix set $\mathcal{T}$. Indeed, matrices of $\mathcal{T}$ might be non-diagonalizable and even non-invertible (e.g., see §4); this makes the analysis here harder than in unlabeled sensing. Moreover, we have to consider the interaction of $\mathcal{K}$ and $\mathcal{T}$. Overcoming these hurdles requires several novel technical ideas; see Peng & Tsakiris (2021) for details. We remark that (12) is the tightest possible constraint in the following sense:

Proposition 7. If $\dim(\mathcal{U}_{T_1, T_2}) > m - k$, then a generic $A \in \mathbb{C}^{m \times n}$ violates $\mathrm{hsp}(\mathcal{T}, A, k)$.

2.4.1. UNIQUE RECOVERY IN HOMOMORPHIC SENSING

Theorem 1 implies the following result (with $k$ set to $n$).

Corollary 1. Suppose that a finite set $\mathcal{T} \subset \mathbb{R}^{r \times m}$ satisfies $\mathrm{rank}(T) \ge 2n$ for every $T \in \mathcal{T}$ and

$$\dim(\mathcal{U}_{T_1, T_2}) \le m - n, \quad \forall \, T_1, T_2 \in \mathcal{T}. \qquad (13)$$

Then we have $\mathrm{hsp}(\mathcal{T}, A)$ for a generic $A \in \mathbb{R}^{m \times n}$.

As mentioned, $\mathrm{hsp}(\mathcal{T}, A)$ is equivalent to the uniqueness of the solution to the homomorphic sensing problem (5). To compare, note that Tsakiris (2018; 2020); Tsakiris & Peng (2019) used the same rank constraint $\mathrm{rank}(T) \ge 2n$ and a different quasi-variety constraint for $\mathrm{hsp}(\mathcal{T}, A)$. The quasi-variety constraint of Tsakiris (2018; 2020); Tsakiris & Peng (2019) is $\dim(\mathcal{U}_{P T_1, T_2}) \le m - n$, where $P$ is some unknown projection onto the column space of $T_2$, and $\mathcal{U}_{P T_1, T_2}$ is defined by replacing $T_1$ in $\mathcal{U}_{T_1, T_2}$ with $P T_1$. It is easy to verify that $\mathcal{U}_{T_1, T_2}$ is a subset of $\mathcal{U}_{P T_1, T_2}$, and so $\dim(\mathcal{U}_{T_1, T_2}) \le \dim(\mathcal{U}_{P T_1, T_2})$. Thus, in comparison, constraint (13) is tighter. In fact, it is the tightest in the sense of Proposition 7. Finally, constraint (13) is also simpler because it dispenses with the unknown projection matrix $P$.

2.5. Noisy homomorphic sensing

We consider the homomorphic sensing problem in the presence of noise $\epsilon \in \mathbb{R}^r$. Let $\tilde{y} := y + \epsilon = T^* A x^* + \epsilon$ be our measurements. The questions are i) how we can estimate $x^*$, given $\tilde{y}, \mathcal{T}, A$, and ii) how good the estimate is. For i), we briefly mention that we can in principle solve (7) to obtain an estimate $(\hat{x}, \hat{T})$ via exhaustive search. Indeed, for each $T_0 \in \mathcal{T}$ compute the least-squares solution $x_0 := (T_0 A)^{\dagger} \tilde{y}$, which minimizes $\|\tilde{y} - T_0 A x\|_2$ over $x \in \mathbb{R}^n$, where $(\cdot)^{\dagger}$ denotes the pseudoinverse of a matrix. Among all these least-squares solutions, take as $\hat{x}$ the one with the minimum residual error. Question ii), or more specifically whether $\hat{x}$ is close to $x^*$, is our main focus.
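For illustration, here is a minimal sketch (our own, assuming $\mathcal{T}$ is small enough to enumerate) of the exhaustive least-squares search for (7) just described, together with a check of the Proposition 8-style error bound on a small-noise toy instance.

```python
import numpy as np

def homomorphic_lsq(y_noisy, A, Ts):
    """Solve (7) by exhaustive search: one least-squares problem per candidate T."""
    best = None
    for T in Ts:
        TA = T @ A
        x = np.linalg.pinv(TA) @ y_noisy            # least-squares solution for this T
        res = np.linalg.norm(y_noisy - TA @ x)
        if best is None or res < best[0]:
            best = (res, x, T)
    return best[1], best[2]

# Toy instance: T is a handful of permutations (including the true one).
rng = np.random.default_rng(0)
m, n = 10, 3
A = rng.standard_normal((m, n))
x_star = rng.standard_normal(n)
T_star = np.eye(m)[rng.permutation(m)]
Ts = [T_star] + [np.eye(m)[rng.permutation(m)] for _ in range(4)]
eps = 1e-3 * rng.standard_normal(m)
y_noisy = T_star @ A @ x_star + eps

x_hat, T_hat = homomorphic_lsq(y_noisy, A, Ts)
# Bound in the spirit of Proposition 8: ||x_hat - x*|| <= sigma_max(pinv(T_hat A)) * ||eps||,
# which applies when the noise is small enough for T_hat to be "nice" (Theorem 2).
bound = np.linalg.norm(np.linalg.pinv(T_hat @ A), 2) * np.linalg.norm(eps)
print(np.linalg.norm(x_hat - x_star) <= bound + 1e-12)   # expected True for this small-noise instance
```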
Question ii) is naturally discrete, for the following reason. For arbitrary noise $\epsilon$, the optimal $\hat{T}$ can be any matrix of $\mathcal{T}$. Since $\mathcal{T}$ is an arbitrary discrete set of matrices, the corresponding $\hat{x}$ can be arbitrarily far from $x^*$. We handle this discreteness by identifying the "nice" matrices contained in $\mathcal{T}$; by nice we mean a subset $\mathcal{T}_1$ of $\mathcal{T}$ such that each matrix of $\mathcal{T}_1$ yields a least-squares solution which is close to $x^*$. With $\mathcal{R}(\cdot)$ denoting the column space of a matrix, a concrete definition of $\mathcal{T}_1$ is

$$\mathcal{T}_1 = \{T \in \mathcal{T} : y \in \mathcal{R}(T A)\}.$$

With $\sigma(\cdot)$ denoting the largest singular value of a matrix, the next proposition explains why $\mathcal{T}_1$ is a nice set.

Proposition 8. Assume $T_0 \in \mathcal{T}_1$ and that $\mathrm{hsp}(\mathcal{T}, A)$ holds for some $A \in \mathbb{R}^{m \times n}$. Then $x_0 - x^* = (T_0 A)^{\dagger} \epsilon$, where $x_0 = (T_0 A)^{\dagger} \tilde{y}$. In particular, $\|x_0 - x^*\|_2 \le \sigma((T_0 A)^{\dagger}) \|\epsilon\|_2$.

Under the uniqueness assumption for the homomorphic sensing problem ($\mathrm{hsp}(\mathcal{T}, A)$), Proposition 8 states that any $T_0 \in \mathcal{T}_1$ results in a stable least-squares estimate $x_0$, whose distance to $x^*$ can be upper bounded in terms of the noise and the data. As for the estimate $(\hat{x}, \hat{T})$ of (7), the remaining question is whether $\hat{T}$ is a nice matrix contained in $\mathcal{T}_1$. First note that $\mathcal{T}_1$ is not empty, because $y = T^* A x^*$ and so $T^* \in \mathcal{T}_1$. Also, if $\mathcal{T}_1 = \mathcal{T}$ then $\hat{T}$ is of course an element of $\mathcal{T}_1$. In fact, our next claim is that $\hat{T}$ is always "nice" (i.e., $\hat{T} \in \mathcal{T}_1$) in the presence of sufficiently small noise.

Proposition 9. We have $\hat{T} \in \mathcal{T}_1$ whenever $\mathcal{T}_1 = \mathcal{T}$ or

$$\frac{2 \|\epsilon\|_2}{\|y\|_2} \;\le\; 1 - \max_{T \in \mathcal{T} \setminus \mathcal{T}_1, \; x \in \mathbb{R}^n} \frac{y^\top T A x}{\|y\|_2 \, \|T A x\|_2}. \qquad (14)$$

Since for every $T \in \mathcal{T} \setminus \mathcal{T}_1$ the column space of $T A$ does not contain $y$, the maximization term in (14) is strictly smaller than 1. Hence, the right-hand side of (14) is positive.

From Propositions 8 and 9, we are ready to draw a local stability result for noisy homomorphic sensing.

Theorem 2. Suppose that i) $\mathrm{hsp}(\mathcal{T}, A)$ holds true, and ii) $\mathcal{T}_1 = \mathcal{T}$ or (14) holds. Then $\hat{x} - x^* = (\hat{T} A)^{\dagger} \epsilon$, and in particular $\|\hat{x} - x^*\|_2 \le \sigma((\hat{T} A)^{\dagger}) \|\epsilon\|_2$.

Condition (14) defines a non-asymptotic regime in which the local stability of $\hat{x}$ is guaranteed (Theorem 2). In particular, if $\mathcal{T} = \mathcal{S}_m$, Theorem 2 is an improvement over the asymptotic result of Unnikrishnan et al. (2018).

3. Unlabeled compressed sensing

3.1. Theory

Recall Proposition 1, where the uniqueness for unlabeled compressed sensing is guaranteed. Here we derive Proposition 1 from Theorem 1. When $\mathcal{T}$ is $\mathcal{S}_m$, some algebraic properties of permutations can be used to simplify the rank and quasi-variety constraints. Indeed, every permutation of $\mathcal{S}_m$ has rank $m$, so the rank constraint becomes $m \ge 2k$, a requirement on the number of samples. Moreover, inspired by Tsakiris (2018; 2020); Tsakiris & Peng (2019), an interesting result is that, whenever the rank constraint is fulfilled, the quasi-variety constraint is automatically satisfied (this is not always true for a set of arbitrary matrices $\mathcal{T}$; see Example 1):

Proposition 10. For two permutation matrices $\Pi_1, \Pi_2 \in \mathcal{S}_m$, we have $\dim(\mathcal{U}_{\Pi_1, \Pi_2}) \le m - k$ as long as $m \ge 2k$.

Combining Theorem 1 with Proposition 10 gives:

Corollary 2. The following is true for a generic $A \in \mathbb{R}^{m \times n}$: $\; m \ge 2k \;\Rightarrow\; \mathrm{hsp}(\mathcal{S}_m, A, k)$.

Recalling Proposition 2, we note that, for a generic $A \in \mathbb{R}^{m \times n}$, the condition $m \ge 2k$ of Corollary 2 implies the uniqueness of the solution to unlabeled compressed sensing (8). Thus, Corollary 2 is the same as Proposition 1.

3.2. Algorithm

Besides the $k$-sparsity assumption on $x^*$, we also assume, in light of Slawski & Ben-David (2019), that the ground-truth permutation matrix $\Pi^*$ is $p$-sparse, i.e., $\|y - A x^*\|_0 \le p$ (see §1).
This naturally leads us to the problem

$$\min_{x \in \mathbb{R}^n} \|y - A x\|_0 \quad \text{s.t.} \quad \|x\|_0 \le k. \qquad (15)$$

Problem (15) is in general NP-hard, so we relax it into

$$\min_{x \in \mathbb{R}^n} \|y - A x\|_1 \quad \text{s.t.} \quad \|x\|_0 \le k. \qquad (16)$$

The objective function of (16) relates to an old problem, least absolute deviation, also known as sparse error correction; see, e.g., Kendall (1960); Candes & Tao (2005). The next natural choice would be to further relax the sparsity constraint of (16), so as to arrive at the convex problem of minimizing $\|y - A x\|_1 + \lambda \|x\|_1$ over $x \in \mathbb{R}^n$ with some hyper-parameter $\lambda > 0$. (This problem was considered by Wright & Ma (2009) with $\lambda = 1$ in the context of dense error correction, where they assumed that the ground-truth signal $x^*$ has non-negative entries.) But such a relaxation does not yield satisfactory performance for our purpose.

We solve (16) using the idea of hard thresholding pursuit (Foucart, 2011; Cai et al., 2020). Following Cai et al. (2020), we assume that $k$ is known in advance and use $x^{(0)} := 0$ as initialization. The iterative update is given as

$$x^{(t+1)} \leftarrow \mathrm{Proj}_{\mathcal{K}}\big( x^{(t)} - \mu \, A^\top \mathrm{sgn}(A x^{(t)} - y) \big), \qquad (17)$$

$$J \leftarrow \text{the support } \{i : x^{(t+1)}_i \ne 0\} \text{ of } x^{(t+1)},$$

$$x^{(t+1)}_J \leftarrow \operatorname*{argmin}_{x_J \in \mathbb{R}^{|J|}} \|y - A_J x_J\|_1. \qquad (18)$$

In (17), we note that i) $A^\top \mathrm{sgn}(A x^{(t)} - y)$ is a subgradient of $\|y - A x\|_1$ at the point $x^{(t)}$, where $\mathrm{sgn} : \mathbb{R}^m \to \mathbb{R}^m$ sends $[v_1, \dots, v_m]^\top$ to the vector whose $i$-th entry is $1$ if $v_i \ge 0$ and $-1$ otherwise, ii) $\mu$ is a step size to be determined, and iii) $\mathrm{Proj}_{\mathcal{K}}(\cdot)$ projects a vector onto its closest $k$-sparse counterpart. In (18), we update the non-zero entries $x^{(t+1)}_J$ of $x^{(t+1)}$ by solving the stated convex optimization problem, where $A_J$ is the column-submatrix of $A$ with columns indexed by $J$; $A_J$ is a tall matrix under the tacit assumption $m \ge 2k$. We note two differences of this algorithm from Cai et al. (2020). First, instead of (18) they solved a least-squares problem. Second, they run the algorithm for one iteration. While (17) is straightforward to compute, we solve (18) by invoking an ADMM algorithm implemented in the FOM toolbox of Beck & Guttmann-Beck (2019).

3.3. Experiments

We evaluate the algorithm with $\mu := 10^{-4}$ and with the number of iterations set to $T := 20$, on an Intel(R) i7-8650U, 1.9 GHz, 16 GB machine. (Other choices of $\mu$ did not yield significantly better results. The algorithm usually converges in about 10 iterations.) We are not aware of obvious baselines or other approaches for the task of interest.

[Figure 1. The performance of the algorithm on synthetic data: (a) sample/time complexity versus $k$, (b) error distribution over 100 trials (binned from $[0, 10^{-10}]$ up to $(10^{-2}, +\infty)$), (c) robustness to noise (estimation error versus SNR in dB), (d) phase transition under noise.]

Data generation. We generate data by i) randomly sampling the entries of $A \in \mathbb{R}^{m \times n}$ from the standard normal distribution $\mathcal{N}(0, 1)$, ii) randomly selecting a support for the $k$-sparse $x^* \in \mathbb{R}^n$, whose non-zero entries are also sampled from $\mathcal{N}(0, 1)$, iii) randomly producing a $p$-sparse permutation $\Pi^*$, and iv) computing $y = \Pi^* A x^*$.

Evaluation metrics. One evaluation metric which we use is the estimation error, computed as $\|x^* - x^{(\mathrm{opt})}\|_2 / \|x^*\|_2$, where $x^{(\mathrm{opt})}$ is the iterate among $\{x^{(1)}, \dots, x^{(T)}\}$ which minimizes (16). (Recall that a subgradient might not give a descent direction.) Inspired by Netrapalli et al. (2013; 2015), the other evaluation metric is the (empirical) sample complexity. Similar to Netrapalli et al. (2013), the algorithm is said to succeed if the estimation error is smaller than 0.01. The sample complexity of the algorithm is then the smallest $m$ among $\{2k, 3k, \dots\}$ for which the algorithm always succeeds over 100 trials for a fixed $k$.
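Before turning to the results, the following compact sketch (our own) makes the updates (17)-(18) of §3.2 concrete. It substitutes scipy's linear-programming routine for the ADMM/FOM solver used in the paper, and `np.sign` maps 0 to 0 rather than to 1, a harmless deviation from the convention stated above.

```python
import numpy as np
from scipy.optimize import linprog

def l1_fit(A, y):
    """min_x ||y - A x||_1 via the standard LP reformulation (variables x and slacks t)."""
    m, k = A.shape
    c = np.concatenate([np.zeros(k), np.ones(m)])
    # A x - t <= y  and  -A x - t <= -y  together enforce t_i >= |y_i - (A x)_i|.
    A_ub = np.block([[A, -np.eye(m)], [-A, -np.eye(m)]])
    b_ub = np.concatenate([y, -y])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * (k + m))
    return res.x[:k]

def unlabeled_cs(y, A, k, mu=1e-4, iters=20):
    """Subgradient + hard-thresholding iteration (17)-(18) for problem (16)."""
    m, n = A.shape
    x = np.zeros(n)
    best, best_obj = x.copy(), np.linalg.norm(y - A @ x, 1)
    for _ in range(iters):
        g = A.T @ np.sign(A @ x - y)        # subgradient of ||y - Ax||_1 at x, cf. (17)
        z = x - mu * g
        J = np.argsort(-np.abs(z))[:k]      # Proj_K: keep the k largest entries
        x = np.zeros(n)
        x[J] = l1_fit(A[:, J], y)           # refinement step (18) on the support J
        obj = np.linalg.norm(y - A @ x, 1)
        if obj < best_obj:                  # keep the best iterate, cf. x^(opt)
            best, best_obj = x.copy(), obj
    return best
```

The `l1_fit` step plays the role of the FOM/ADMM least-absolute-deviation solver; any solver for $\min_{x_J} \|y - A_J x_J\|_1$ could be substituted, and for large $m$ a first-order method is preferable to a generic LP solver.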
Results. Figure 1 depicts the performance of the algorithm on synthetic data, with $n = 2000$ fixed. In Figure 1a we set $p := 0.2m$, and we observed that the sample complexity $m$ increased as the sparsity $k$ grew, which in turn entailed an increased running time. For example, when $n = 2000$ and $k = 25$, it took $m = 1400$ samples for the algorithm to succeed and 0.47 seconds to finish the computation. Zooming in on the 100 trials at $k = 25$ of Figure 1a yields Figure 1b, where the estimation errors for the 100 trials are summarized. We see that the estimation error is no more than $10^{-10}$ for 65 trials, and 96% of the 100 errors fall into the intervals $(10^{-6}, 10^{-4}]$ and $[0, 10^{-10}]$. Keeping $m = 1400$, $n = 2000$, $k = 25$, $p = 0.2m$ fixed, we furthermore evaluated the robustness of the algorithm to noise. We added noise to the measurements $y$ according to the SNR, ran the algorithm, and report the result in Figure 1c (100 trials). As the SNR improved, the estimation error declined, from 0.3268 (5 dB) to 0.011 (30 dB) and further to 0.0005 (55 dB). Finally, a holistic picture of the algorithm can be obtained from Figure 1d, where we fixed $m = 1400$, $n = 2000$, and SNR $= 40$ dB and present the estimation errors as the two sparsity levels $k$ and $p$ vary (100 trials). We observed that the algorithm consistently made errors smaller than 0.01 in the presence of 20% shuffled data and $k \le 35$. In the extremely sparse case $k = 5$, the algorithm could tolerate up to 45% shuffled data (with errors no more than 0.1). On the other hand, the algorithm can fail when attempting to operate in the challenging high-$p$, high-$k$ region. To summarize, the algorithm was shown to be time-efficient, robust to noise, and to succeed when the ground-truth $x^*$ and $\Pi^*$ are both sufficiently sparse.

4. A broader picture

The matrix set $\mathcal{T}$ in (sparse) homomorphic sensing ((5), (6)) provides the flexibility to model other important inverse problems beyond unlabeled sensing. We next present several choices of $\mathcal{T}$ other than $\mathcal{S}_m$ that arise from data applications.

Unlabeled sensing with missing entries. In fact, Unnikrishnan et al. (2018) considered a more general version of unlabeled sensing, where some entries of $y$ are missing and the positions of the missing entries are unknown. In other words, i) one is given $y = S^* A x^* \in \mathbb{R}^r$ for some unknown selection matrix $S^*$, i.e., a permutation matrix with $m - r$ rows removed, and ii) one aims to solve

$$y = S A x, \quad S \in \mathcal{S}_{r,m}, \; x \in \mathbb{R}^n, \qquad (19)$$

for $x$, where $\mathcal{S}_{r,m}$ is the set of $r \times m$ selection matrices. (In Unnikrishnan et al. (2018), (19) was called unlabeled sensing and (2) an important special case. Follow-up works referred to (2) as unlabeled sensing, or as linear regression without correspondences, or as shuffled linear regression; see, e.g., Hsu et al. (2017). In this paper we use "unlabeled sensing" for (2) and "unlabeled sensing with missing entries" for (19).) This was known to Tsakiris (2018; 2020); Tsakiris & Peng (2019) as an example of homomorphic sensing.

Missing data recovery. We find that the problem of missing data recovery (Zhang, 2006; Liu et al., 2017; 2019), or of signal recovery with erasures at known locations (Han & Sun, 2014), is also a special case of homomorphic sensing (5). In this problem one aims to recover $x^*$ from $y = O^* A x^*$, where $O^*$ is some $m \times m$ diagonal matrix
with 0 or 1 on its diagonal. Its differences from (19) are that i) in general, the positions at which the entries are missing are understood from the positions of the non-zero entries of $y$, and ii) there is no unknown permutation involved.

Real phase retrieval. We also find that the perhaps more familiar problem of real phase retrieval is another homomorphic sensing example. This problem can be traced back to the 1910s, when research on X-ray crystallography was launched, and it has been receiving increasing attention in recent years; see, e.g., Grohs et al. (2020) for a vivid account. In this problem we are given $y = B^* A x^*$ with $B^* \in \mathcal{B}_m$, and we consider solving

$$y = B A x, \quad B \in \mathcal{B}_m, \; x \in \mathbb{R}^n, \qquad (20)$$

where $\mathcal{B}_m$ is the set of $m \times m$ sign matrices, i.e., diagonal matrices with $\pm 1$ on the diagonal. Since uniquely recovering $x^*$ is impossible (both $(B^*, x^*)$ and $(-B^*, -x^*)$ satisfy (20)), the goal becomes unique recovery of $x^*$ up to sign. The problem of a symmetric mixture of two linear regressions (Balakrishnan et al., 2017) also admits the formulation (20); see, e.g., Chen et al. (2019); Klusowski et al. (2019) for discussions connecting the two problems. The final example is a combination of (19) and (20), explored by Lv & Sun (2018). This involves the matrix set $\mathcal{S}_{r,m} \mathcal{B}_m := \{S B : S \in \mathcal{S}_{r,m}, B \in \mathcal{B}_m\}$ and the relation $y = C A x$, $C \in \mathcal{S}_{r,m} \mathcal{B}_m$, $x \in \mathbb{R}^n$.

To summarize, the above problems are concerned with missing correspondences, missing values, missing signs, or combinations thereof, and they are actually of the same type, where the linear measurements $A x^*$ have further undergone some unknown linear map belonging to a specific set of maps, e.g., $\mathcal{S}_m$, $\mathcal{S}_{r,m}$, $\mathcal{B}_m$, and $\mathcal{S}_{r,m} \mathcal{B}_m$. In Peng & Tsakiris (2021) we present the applications of our theory to those examples, which yield either i) known results from prior works, e.g., Balan et al. (2006); Unnikrishnan et al. (2018); Han et al. (2018); Lv & Sun (2018); Dokmanic (2019); Akçakaya & Tarokh (2014); Wang & Xu (2014), or ii) novel results for those examples. Finally, it is natural to expect our theory to have wider applicability to new examples of homomorphic sensing yet to be discovered.

5. Discussion and future work

On the theoretical side, we presented conditions guaranteeing the uniqueness of the solution to sparse homomorphic sensing (6), from which a uniqueness result for unlabeled compressed sensing follows. A next step for research is to find conditions under which the corresponding $\ell_1$ relaxation (e.g., (16)) has a unique solution, which we leave as future work. Taking noise into consideration, we provided a deterministic condition for local stability in homomorphic sensing, from which a probabilistic condition might be derived. On the algorithmic front, we initiated a computational investigation into unlabeled compressed sensing. Future work might include developing theoretical guarantees for the algorithm of §3.2, tackling the case where more data are shuffled, dispensing with the hyper-parameters, and so on. Finally, we presented a broader picture in §4 using the homomorphic sensing framework. Tools from other fields might be key to advancing research on unlabeled (compressed) sensing.

References

Abbasi, A., Tasissa, A., and Aeron, S. Unlabeled sensing with local permutations. Technical report, arXiv:1911.06382v3 [eess.SP], 2020.

Abbasi, A., Tasissa, A., and Aeron, S.
R-local sensing: A novel graph matching approach for multiview unlabeled sensing under local permutations. IEEE Open Journal of Signal Processing, pp. 1 1, 2021. Abid, A. and Zou, J. A stochastic expectation-maximization approach to shuffled linear regression. In Annual Allerton Conference on Communication, Control, and Computing, pp. 470 477, 2018. Abid, A., Poon, A., and Zou, J. Linear regression with shuffled labels. ar Xiv preprint ar Xiv:1705.01342, 2017. Akçakaya, M. and Tarokh, V. New conditions for sparse phase retrieval. Technical report, ar Xiv:1310.1351v2 [cs.IT], 2014. Antoni, M. and Schnell, R. The past, present and future of the german record linkage center. Jahrbücher für Nationalökonomie und Statistik, 239(2):319 331, 2019. Balakrishnan, S., Wainwright, M. J., and Yu, B. Statistical guarantees for the em algorithm: From population to sample-based analysis. Annals of Statistics, 45(1):77 120, 2017. Balan, R., Casazza, P., and Edidin, D. On signal reconstruction without phase. Applied and Computational Harmonic Analysis, 20(3):345 356, 2006. Beck, A. and Guttmann-Beck, N. Fom a matlab toolbox of first-order methods for solving convex optimization problems. Optimization Methods and Software, 34(1): 172 193, 2019. Cai, J.-F., Li, J., Lu, X., and You, J. Sparse signal recovery from phaseless measurements via hard thresholding pursuit. Technical report, ar Xiv:2005.08777v2 [math.NA], 2020. Homomorphic Sensing: Sparsity and Noise Candes, E. J. and Tao, T. Decoding by linear programming. IEEE Transactions on Information Theory, 51(12):4203 4215, 2005. Chen, Y., Chi, Y., Fan, J., and Ma, C. Gradient descent with random initialization: fast global convergence for nonconvex phase retrieval. Mathematical Programming, 176(1):5 37, 2019. Cohen, A., Dahmen, W., and De Vore, R. Compressed sensing and best k-term approximation. Journal of the American mathematical society, 22(1):211 231, 2009. Cox, D., Little, J., and OShea, D. Ideals, Varieties, and Algorithms: An Introduction to Computational Algebraic geometry and Commutative Algebra. Springer Science & Business Media, 2013. Dokmanic, I. Permutations unlabeled beyond sampling unknown. IEEE Signal Processing Letters, 26(6):823 827, 2019. Domingo-Ferrer, J. and Muralidhar, K. New directions in anonymization: Permutation paradigm, verifiability by subjects and intruders, transparency to users. Information Sciences, 337-338:11 24, 2016. Donoho, D. L. and Elad, M. Optimally sparse representation in general (nonorthogonal) dictionaries via l1 minimization. Proceedings of the National Academy of Sciences, 100(5):2197 2202, 2003. Eisenbud, D. Commutative Algebra: with a view toward algebraic geometry, volume 150. Springer Science & Business Media, 2013. Fellegi, I. P. and Sunter, A. B. A theory for record linkage. Journal of the American Statistical Association, 64(328): 1183 1210, 1969. Foucart, S. Hard thresholding pursuit: An algorithm for compressive sensing. SIAM Journal on Numerical Analysis, 49(6):2543 2563, 2011. Grohs, P., Koppensteiner, S., and Rathmair, M. Phase retrieval: Uniqueness and stability. SIAM Review, 62(2): 301 350, 2020. Han, D. and Sun, W. Reconstruction of signals from frame coefficients with erasures at unknown locations. IEEE Transactions on Information Theory, 60(7):4013 4025, 2014. Han, D., Lv, F., and Sun, W. Recovery of signals from unordered partial frame coefficients. Applied and Computational Harmonic Analysis, 44(1):38 58, 2018. Hsu, D., Shi, K., and Sun, X. Linear regression without correspondence. 
In Advances in Neural Information Processing Systems, 2017. Jeong, M., Dytso, A., Cardone, M., and Poor, H. V. Recovering data permutations from noisy observations: The linear regime. IEEE Journal on Selected Areas in Information Theory, 1(3):854 869, 2020. Jeong, M., Dytso, A., and Cardone, M. Retrieving data permutations from noisy observations: High and low noise asymptotics. Technical report, ar Xiv:2105.03015v1 [cs.IT], 2021. Ji, R., Liang, Y., Xu, L., and Zhang, W. A concave optimization-based approach for joint multi-target track initialization. IEEE Access, 7:108551 108560, 2019. Kendall, M. G. Studies in the history of probability and statistics. where shall the history of statistics begin? Biometrika, 47(3/4):447 449, 1960. Klusowski, J. M., Yang, D., and Brinda, W. D. Estimating the coefficients of a mixture of two linear regressions by expectation maximization. IEEE Transactions on Information Theory, 65(6):3515 3524, 2019. Kruskal, J. B. Three-way arrays: rank and uniqueness of trilinear decompositions, with application to arithmetic complexity and statistics. Linear Algebra and its Applications, 18(2):95 138, 1977. Lian, W., Zhang, L., and Yang, M. An efficient globally optimal algorithm for asymmetric point matching. IEEE Trans. Pattern Anal. Mach. Intell., 39(7):1281 1293, 2017. Liu, G., Liu, Q., and Yuan, X. A new theory for matrix completion. In Advances in Neural Information Processing Systems, pp. 785 794, 2017. Liu, G., Liu, Q., Yuan, X. T., and Wang, M. Matrix completion with deterministic sampling: Theories and methods. IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 1 1, 2019. Lv, F. and Sun, W. Real phase retrieval from unordered partial frame coefficients. Advances in Computational Mathematics, 44(3):879 896, 2018. Marano, S. and Willett, P. Making decisions by unlabeled bits. IEEE Transactions on Signal Processing, 68:2935 2947, 2020. Muralidhar, K. Record re-identification of swapped numerical microdata. Journal of Information Privacy and Security, 13(1):34 45, 2017. Homomorphic Sensing: Sparsity and Noise Nejatbakhsh, A. and Varol, E. Neuron matching in c. elegans with robust approximate linear regression without correspondence. In IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 2837 2846, 2021. Netrapalli, P., Jain, P., and Sanghavi, S. Phase retrieval using alternating minimization. In Advances in Neural Information Processing Systems, pp. 2796 2804, 2013. Netrapalli, P., Jain, P., and Sanghavi, S. Phase retrieval using alternating minimization. IEEE Transactions on Signal Processing, 63(18):4814 4826, 2015. Pananjady, A., Wainwright, M. J., and Courtade, T. A. Denoising linear models with permuted data. In IEEE International Symposium on Information Theory, pp. 446 450, June 2017. Pananjady, A., Wainwright, M. J., and Courtade, T. A. Linear regression with shuffled data: Statistical and computational limits of permutation recovery. IEEE Transactions on Information Theory, 64(5):3286 3300, 2018. Peng, L. and Tsakiris, M. C. Linear regression without correspondences via concave minimization. IEEE Signal Processing Letters, 27:1580 1584, 2020. Peng, L. and Tsakiris, M. C. Homomorphic sensing of subspace arrangements. Technical report, ar Xiv:2006.05158v3 [cs.LG], 2021. Peng, L., Song, X., Tsakiris, M. C., Choi, H., Kneip, L., and Shi, Y. Algebraically-initialized expectation maximization for header-free communication. In IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 5182 5186, 2019. 
Shi, X., Li, X., and Cai, T. Spherical regression under mismatch corruption with application to automated knowledge translation. Journal of the American Statistical Association, 0(0):1 12, 2020. Slawski, M. and Ben-David, E. Linear regression with sparsely permuted data. Electronic Journal of Statistics, 13(1):1 36, 2019. Slawski, M., Diao, G., and Ben-David, E. A pseudolikelihood approach to linear regression with partially shuffled data. Technical report, ar Xiv:1910.01623 [stat.ME], 2019. Slawski, M., Rahmani, M., and Li, P. A sparse representation-based approach to linear regression with partially shuffled labels. In Conference on Uncertainty in Artificial Intelligence, pp. 38 48, 2019. Slawski, M., Ben-David, E., and Li, P. Two-stage approach to multivariate linear regression with sparsely mismatched data. Journal of Machine Learning Research, 21 (204):1 42, 2020. Slawski, M., Diao, G., and Ben-David, E. A pseudolikelihood approach to linear regression with partially shuffled data. Journal of Computational and Graphical Statistics, 0(0):1 31, 2021. Song, X., Choi, H., and Shi, Y. Permuted linear model for header-free communication via symmetric polynomials. In IEEE International Symposium on Information Theory, pp. 661 665, 2018. Theodoridis, S. Machine Learning: A Bayesian and Optimization Perspective. Academic Press, 2020. Tsakiris, M. C. Eigenspace conditions for homomorphic sensing. Technical report, ar Xiv:1812.07966v1 [math.CO], 2018. Tsakiris, M. C. Determinantal conditions for homomorphic sensing. Technical report, ar Xiv:1812.07966v6 [math.CO], 2020. Tsakiris, M. C. and Peng, L. Homomorphic sensing. In International Conference on Machine Learning, 2019. Tsakiris, M. C., Peng, L., Conca, A., Kneip, L., Shi, Y., and Choi, H. An algebraic-geometric approach for linear regression without correspondences. IEEE Transactions on Information Theory, 66(8):5130 5144, 2020. Unnikrishnan, J., Haghighatshoar, S., and Vetterli, M. Unlabeled sensing: Solving a linear system with unordered measurements. In Annual Allerton Conference on Communication, Control, and Computing, pp. 786 793, 2015. Unnikrishnan, J., Haghighatshoar, S., and Vetterli, M. Unlabeled sensing with random linear measurements. IEEE Transactions on Information Theory, 64(5):3237 3253, May 2018. Wang, G., Marano, S., Zhu, J., and Xu, Z. Target localization by unlabeled range measurements. IEEE Transactions on Signal Processing, 68:6607 6620, 2020a. Wang, Y. and Xu, Z. Phase retrieval for sparse signals. Applied and Computational Harmonic Analysis, 37(3): 531 544, 2014. Wang, Z., Ben-David, E., and Slawski, M. Estimation in exponential family Regression based on linked data contaminated by mismatch error. Technical report, ar Xiv:2010.00181v2 [stat.ME], 2020b. Wright, J. and Ma, Y. Dense error correction via l1minimization. In IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 3033 3036, 2009. Homomorphic Sensing: Sparsity and Noise Wright, J. and Ma, Y. High-Dimensional Data Analysis with Low-Dimensional Models: Principles, Computation, and Applications. Cambridge University Press, 2020. Xie, Y., Mao, Y., Zuo, S., Xu, H., Ye, X., Zhao, T., and Zha, H. A hypergradient approach to robust regression without correspondence. In International Conference on Learning Representations, 2021. Yao, Y., Peng, L., and Tsakiris, M. C. Unlabeled Principal Component Analysis. Technical report, ar Xiv:2101.09446v1 [cs.LG], 2021. Zhang, H. and Li, P. Optimal estimator for unlabeled linear regression. 
In International Conference on Machine Learning, pp. 11153 11162, 2020. Zhang, H., Slawski, M., and Li, P. Permutation recovery from multiple measurement vectors in unlabeled sensing. In IEEE International Symposium on Information Theory, pp. 1857 1861, 2019. Zhang, Y. When is missing data recoverable? Technical report, 2006. Zhu, J., Cao, H., Song, C., and Xu, Z. Parameter estimation via unlabeled sensing using distributed sensors. IEEE Communications Letters, 21(10):2130 2133, 2017.