# adversarial_crowdsourcing_through_robust_rankone_matrix_completion__351fcafe.pdf

Adversarial Crowdsourcing Through Robust Rank-One Matrix Completion

Qianqian Ma Boston University maqq@bu.edu

Alex Olshevsky Boston University alexols@bu.edu

We consider the problem of reconstructing a rank-one matrix from a revealed subset of its entries when some of the revealed entries are corrupted with perturbations that are unknown and can be arbitrarily large. It is not known which revealed entries are corrupted. We propose a new algorithm combining alternating minimization with extreme-value ﬁltering and provide sufﬁcient and necessary conditions to recover the original rank-one matrix. In particular, we show that our proposed algorithm is optimal when the set of revealed entries is given by an Erd os-Rényi random graph.

These results are then applied to the problem of classiﬁcation from crowdsourced data under the assumption that while the majority of the workers are governed by the standard single-coin David-Skene model (i.e., they output the correct answer with a certain probability), some of the workers can deviate arbitrarily from this model. In particular, the adversarial workers could even make decisions designed to make the algorithm output an incorrect answer. Extensive experimental results show our algorithm for this problem, based on rank-one matrix completion with perturbations, outperforms all other state-of-the-art methods in such an adversarial scenario.1

1 Introduction

Matrix completion [10] [9] [13] refers to the problem of recovering a low-rank matrix from a subset of its entries. A fundamental challenge in the study of matrix completion is that, in some applications, the revealed entries will be inaccurate or corrupted. When these perturbations can be arbitrarily large, we will refer to the problem as robust matrix completion. In particular, the motivating application for this paper is estimation of worker reliability in crowdsourcing [30] [41] [21][19][45], where this issue appears if some workers deviate from their instructions.

The simplest and one of the most widely used crowdsourcing models is Dawid & Skene s (D&S) single coin model [6]. In the D&S model, the workers are assumed to make mistakes independently of other workers and with the same error probability for each task. It has previously been observed that optimal estimation in the D&S model requires estimation of worker reliabilities [1], which in turn can be framed as a rank-one matrix completion problem [30].

In the case when some of the workers are poorly described by the D&S model or even consciously acting to subvert the underlying algorithm, the problem is naturally framed as a robust rank-one matrix completion problem. The motivating scenario is either a platform such as Amazon Mechanical Turk where workers without a long record of reliability typically have cheaper rates, or estimation from online ratings on a platform such as Yelp, where there is strong incentive for business owners to write fake reviews.

1The code is available on https://github.com/maqqbu/MMSR

34th Conference on Neural Information Processing Systems (Neur IPS 2020), Vancouver, Canada.

We propose a new algorithm for robust rank-one matrix completion which, in at least one regime, is provably optimal. We then perform a computational study which compares our method to twelve state-of-the-art methods from the crowdsourcing literature on both synthetic and real world datasets (in the latter case, we introduce the corrupted workers ourselves), and show that our method strongly outperforms all of them in this adversarial setting.

Notation and conventions: [n] = {1, , n}; |S| is the size of set P; x is the smallest integer greater than x; x is the largest integer smaller than x; X is the nuclear norm of matrix L, i.e., the sum of the singular values of matrix X; Z+ is the set of positive integers; Z i is the set of integers which are greater than i; Given S1, S2, the reduction of S1 by S2 is denoted as S1\S2 = {i S1 : i / S2}; ﬁnally, A(n) B(n) means A(n)/B(n) 1 as n .

2 Related Work

Matrix Completion: the standard approach to low-rank matrix completion [2][3] usually proceeds by nuclear norm minimization:

min L L s.t. [L]ij = [L0]ij, (i, j) Ω, (1)

where L is the nuclear norm of matrix L, Ωis the set of locations of the observed entries, L0 is the matrix to be recovered. Candès and Recht [2] proved that L0 can be recovered with high probability via solving (1) if L0 is incoherent and Ωis sampled uniformly at random. These are strong assumptions and many papers, including this one, have sought to relax them. A popular approach has been to focus on non-uniform sampling. In particular, Negahban et al. [31] relaxed the condition of uniform sampling to weighted entrywise sampling. Király et al. [20] considered deterministic sampling. Liu et al. proposed a new hypothesis called "isomeric condition" in [27], which is weaker than uniform sampling, and proved that the matrix L0 can be recovered by a nonconvex approach under this condition.

Unlike these general methods, we consider rank-one matrix completion problem with arbitrary Ωas well as adversarial corruptions. We do not assume any kind of incoherence of the underlying matrix. Of course, the rank-one matrix completion problem for uncorrupted cases is trivial and can be solved by going through the revealed answers recursively. Nevertheless, of particular note to us is Gamarnik et al. [10] which studied how well an alternating minimization methods in the uncorrupted case; we will build on those results in this work. Very closely related to our work is the recent paper Fatahi et al. [9], which considered solving the robust rank-one matrix completion with perturbations by solving the optimization problem

minu Rm + ,v Rn + PΩ(X uv ) 1 + Rβ(u, v),

where Ωrepresents the set of observed entries and Rβ represents a regularization term. The approach in [9] can deal with asymmetric matrices as well as sparse unknown perturbations. However, their approach has strong assumption for the structure of the graph G(Ω) and can only deal with sparse perturbations.

Crowdsourcing: For crowdsourcing problems [46] [39] [35], the D&S model has become a standard theoretical framework and has led to a ﬂurry of research in recent years, e.g., [41][11][15] among many others. In the D&S model, the workers are assumed to make errors independently with each other and the error probabilities of the workers are task-independent. In the single-coin D&S model, each worker is further assumed to have the same accuracy on each task.

Ghosh et al. [11] considered a binary classiﬁcation problem based on single coin D&S model, and a SVD-based algorithm was proposed. The underlying assumption was that the observation matrix (representing which workers give answers for which tasks) is dense. To relax this constraint, Dalvi et al. [5] proposed another SVD-based algorithm which allows the observation matrix to be sparser. Karger et al. [18] extended the single coin D&S model to multi-class labeling issues, and they proposed an iterative algorithm to solve it. Zhang et al. [41] developed a two-stage spectral method for multi-class crowdsourcing labeling problems based on general D&S model. In [30], the skill estimate of workers in single one-coin D&S model were formulated as a rank-one matrix completion problem, which is also the approach we will adopt in this paper. We mention that in Section 6, we will compare our proposed algorithm with these existing approaches from [11][5][18][41][30].

The adversarial version of the D&S model, where adversaries represent workers who deviate from the model, has previously been investigated in a number of papers. Raykar et al. [34] assumed the adversaries are workers who give answers randomly, and they proposed approach to eliminate such adversaries in general D&S model. Jagabathula [17] also considered the speciﬁc types of adversaries (e.g all the adversaries provide a label +1) and tried to detect these adversaries from normal workers and eliminate their impact to the ﬁnal predictions. Kleindessner et al. [21] proposed an approach to deal with arbitrary and colluding adversaries in the David&Skene model. Overall, the algorithm in [21] can deal with cases when nearly half of the workers are adversaries. However, this theoretical guarantee requires the task assignment matrix to be full matrix or dense matrix. Our work considers the same problem as [21] but without assuming the task assignment matrix is dense.

3 Problem Setup and Formulation

Let X = M + P Rm n, where M = ab is a positive rank-1 matrix with a = [a1 a2 am] , b = [b1 b2 bn] . Let XΩbe a subset of the entries of X, i.e.,

XΩ= {Xij = Mij + Pij| (i, j) Ω}, (2)

where Ω [m] [n]. Our goal is to efﬁciently reconstruct M given XΩ. The elements of the matrix P can take on any value; if Pij = 0 we will say that the (i, j) th entry of X is corrupted. An entry that is not corrupted will be referred to as normal. We will be considering the situation where a number of rows and columns of X are completely corrupted, and our goal is to correctly recover the remaining rows and columns.

We begin with a sequence of deﬁnitions. The structure of the set of pairs Ωcan be conveniently represented by an undirected bipartite graph G(Ω) as follows:

Deﬁnition 1. G(Ω) is deﬁned to be a bipartite graph with vertex partitions Vu := {1, 2, , m} and Vv := {1, , n} and includes the edge (i, j), with i Vu, j Vv, if and only if (i, j) Ω.

For i Vu, we let Ωi denotes the set of neighbors of node i, and similarly for j Vv, let Ω j denote the set of neighbors of node j (the apostrophe will be useful to be able to tell at a glance whether a node belongs to Vu or Vv). Our next step is to formalize the fault model, i.e., what we will be assuming about the corruption matrix P.

Deﬁnition 2. A set S V (G) is F-local if its intersection with each Ωi and each Ω j has at most F nodes.

Deﬁnition 3. A node i Vu is corrupted if Pij = 0 for some (i, j) Ω. Likewise, a node j Vv is corrupted if Pij = 0, for some (i, j) Ω. The graph G(Ω) is said to be F-local corrupted if the set of corrupted nodes is F-local.

Next, we will introduce some concepts from [25] dealing with the redundancy of edges between subsets of nodes; later, these will turn out to be closely related to the robustness of the graph G(Ω) to corruptions.

Deﬁnition 4. A set S V (G) is an r-reachable set if there exists a node in S with at least r neighbors outside S. G(Ω) is r-robust if for every pair of nonempty, disjoint subsets of V (G), at least one of the subsets is r-reachable.

Crowdsourcing: we now explain the connection between robust rank-one matrix completion and crowdsourcing. We consider the single-coin D&S model where W workers are asked to provide labels for a series of M-class classiﬁcation tasks. The ground truths gt (t [T]) for these tasks are unknown (here T is the number of tasks). The set A [W] [T] is a worker-task assignment set. The observations (Yw,t)(i,t) A are a collection of independent random variables. The single-coin D&S model supposes the accuracy of the worker i is pi which means that the answer it returns is:

P(Yi,t = ℓ|gt) = pi1{ℓ=gt} + 1 pi

M 11{ℓ =gt}.

In words, worker i returns the correct answer with probability pi and a random incorrect answer with probability 1 pi. Under such assumptions, it was observed [30] that the probability of each worker can be estimated via solving a rank-one matrix completion problem as follows: letting

si = M M 1pi 1 M 1 we have that [30]

E M M 1 e C 1 M 111 = ss , (3)

where e C is the covariance matrix between agents i and j, 11 is the all-ones matrix which has the same size as e C. Since the RHS of (3) is a rank-one matrix, the skill level vector s can be estimated by computing the empirical covariance matrix and applying a rank-one matrix completion method.

In the adversarial setting, we need to further consider the case when worker i may deviate from the D&S model. In that case, all the entries in row i and column i should be viewed as corruptions as the derivation of Eq. (3) is no longer valid for that row; this is why we adopt a model in this paper where entire rows/columns are corrupted. Naturally, we do not know which rows/columns are corrupted. Extending our notation from above, we will refer to uncorrupted rows and columns as normal.

The hope is that identiﬁcation of skill levels for the uncorrupted agents is still possible if the number of corrupted agents is not too large, or the corrupted agents are not placed in central location in the graph G(Ω); making this intuition into a precise theorem is one of the goals of this paper.

With the above background in place, we can now state the main concerns of our paper formally: (i) Given XΩ, how can we reconstruct the normal rows and columns of the rank-one matrix M under F-local fault-models? (ii) How can we estimate the workers skill level s and consequently give accurate predictions for tasks in the single-coin D&S model with adversaries?

4 The M-MSR method

In this section, we present the details of our approach. We will start by explaining how our algorithm was constructed. For the uncorrupted rank-one matrix completion problem, we begin by observing that if (a, b) is an optimal solution, then we actually have the group of optimal solutions (u, v) = (ka, k 1b). On the other hand, given arbitrary positive vectors u, v, we can represent them as

u = [a1k1 a2k2 amkm] , v = h b1 k 1 b2 k 2 bn k n

where ai, bj represent the values of optimal solution (a, b). We will refer to ki as the value of vertex i and likewise for k j. Observe that to ﬁnd the optimal solution, we need some algorithms to update u and v so that all the vertices can have the same value. To accomplish this, our starting point is the update

ui(t + 1) = X

j Ωi wij Xij vj(t), i [m],

vj(t + 1) = X

i Ω j w ij Xij ui(t + 1), j [n], (4)

where the coefﬁcients wij, w ij form a convex combination. This update rule is motivated by [10], to which it is closely related, and can be interpreted in terms of an minimization method which alternates between ﬁnding the best u and the best v. In the unperturbed case, via elementary algebra this can be rewritten in terms of the variables ki, k j introduced above as

ki(t + 1) = X

j Ωi wijkj(t), i [m],

1 k j(t + 1) = X

i Ω j w ij 1 ki(t + 1), j [n]. (5)

We will show with update rule (5), all the vertices can converge to the same value.

The main difﬁculty is what to do to account for corruptions: since the corrupted elements can be arbitrary, even a single corrupted element can completely destabilize this iteration. A natural approach is to ﬁlter the extreme values in each update of Eq. (4). To that end, let us deﬁne

Ji(t) = Xij

We will then set Ri(t) to be the set of nodes with the F largest and smallest values in Ji(t) (if there are fewer than F values strictly smaller/larger than ui(t), then Ri(t) contains the ones that are strictly smaller/larger than ui(t)); the quantities J j(t), R j(t) are deﬁned similarly for nodes j Vv. Our algorithm is presented next; we will call it the Matrix-Mean-Subsequence-Reduced (M-MSR) algorithm.

Algorithm 1 M-MSR

Input: Positive matrix X, set Ω, F and v(0) > 0 Output: ˆ X = u(T)v(T)

1: for t = 1, 2, . . . , T do 2: For each i = 1, , m, let

ui(t + 1) = X

j Ωi\Ri(t) wij Xij

where the coefﬁcients wij form a convex combination. 3: For each j = 1, , n, let

vj(t + 1) = X

i Ω j\R j(t) w ij Xij ui(t + 1), (7)

where the coefﬁcients w ij form a convex combination. 4: end for 5: return u(t) = [u1(t), u2(t), , um(t)] , v(t) = [v1(t), v2(t), , vn(t)]

For convenience, we will not consider the case m = n = 1. We will be assuming that G(Ω) is connected. Finally, introducing the notation α for the smallest of wij, w ij, we will be assuming that α 1/2. This can easily satisﬁed by e.g., choosing wij = 1/degree(i) and likewise for w ij.

5 Convergence Analysis

We begin by considering the case where the revealed entry set Ωis randomly chosen, which corresponds to the case of G(Ω) being an Erd os-Rényi bipartite graphs. For simplicity, we assume m = n.

Theorem 1(a). Suppose G(Ω) is a random bipartite graph Gn,n,p where each edge is generated with probability p. Then a sufﬁcient condition for M-MSR algorithm (with parameter F) to successfully recover the normal rows and columns of X under the assumption the the corruptions are F-local is

p log n + 2F log log n + x

where x = o(log log n) , when n .

Thus, on a random graph, the M-MSR method can successfully recover the true matrix provided the graph is not too sparse. We will later discuss the guarantees this theorem implies on the total fraction of corruptions (note that F is an upper bound on the number of corruptions in each neighborhood). For now we observe that Theorem 1(a) is tight in a sense described next.

As we will argue in the supplementary information, M-MSR has the property that it is skew nonamplifying in the following sense: at every step of the method, it maintains estimates u(t), v(t) such that ui(t)vj(t) converges to the correct answer aibj when i, j are normal and which satisfy

min normal i

ai , bi vi(t)

min normal i

ai , bi vi(0)

max normal i

ai , bi vi(t)

max normal i

ai , bi vi(0)

Intuitively, the quantities ui(t)/ai and bi/vi(t) measure the skew between the vectors u(t), v(t) and the optimal solution. Let us call any algorithm that satisﬁes Eq. (9) and Eq. (10) skew-nonamplifying.

Skew nonampliﬁcation is clearly a desirable property. In principle, any algorithm for robust rank-1 matrix factorization can converge to (αa)(α 1b T ) where ab T is the true rank-1 matrix and α can

be any real number. It is natural to bound how large the constants α and α 1 can be. The skewnonamplifying property does that by ensuring the ﬁnal skew is not worse than the skew on the initial conditions.

Our next result shows that M-MSR is essentially optimal on random graphs among all skewnonamplifying methods.

Theorem 1(b). Suppose G(Ω) is a random bipartite graph Gn,n,p where each edge is generated with probability p. Suppose

p = log n + 2F log log n x

where x = o(log log n) , when n . Then, with probability approaching 1 as n , the normal rows and columns of X cannot be recovered by any skew-nonamplifying algorithm in the presence of F-local corruptions.

In other words, if Eq. (8) just barely fails due to the replacement of x by x, then Theorem 1(b) tells us that Gn,n,p will, with high probability, be a graph for which there exists a set of F-local corruptions which prevent any skew-nonamplifying algorithm from recovering the true matrix X.

Theorem 1 in parts (a) and (b) provides a justiﬁcation for the M-MSR algorithm: it is essentially an optimal algorithm to use on bipartite random graphs. This theorem is actually derived from the following somewhat more general theorem, which gives the exact conditions for the M-MSR algorithm to work on an arbitrary graph.

Theorem 2. Suppose G(Ω) is a connected graph where the nodes are updated according to M-MSR algorithm with parameter F. Under F-local nodes-corrupted model, the rows and columns of X without corruptions can be correctly recovered by the M-MSR method if and only if G(Ω) is 2F + 1-robust.

Theorem 2 gives substance to the intuition that recovery is possible if the number of adversarial agents is not too large, and if their placement in the graph is not central. It quantiﬁes this intuition through the concept of 2F + 1 robustness. The proofs of these theorems can be found in the supplementary information. Our results are connected to earlier work in resilient consensus [25] which introduced the concept of robustness, as well as the work [40] which analyzed the threshold of robustness in general random graphs.

5.1 Applying M-MSR to crowdsourcing

We have already spelled out how skill determination in crowdsourcing with adversaries can be reduced to rank-one matrix completion with perturbations. Here we discuss additional details that are needed to apply the M-MSR method.

Prediction. When the skills are known, according to [26], the optimal prediction method under D&S model is weighted majority voting, i.e.

γ s,A(Y ) = arg max ℓ [M]

i:(i,t) A v i 1{Yi,t = ℓ}, (11)

where v i = log (M 1)pi

1 pi , i [W]. As discussed above, we will use perturbed rank-one matrix completion to estimate the skills.

Implications of Theorem 1. Consider the case where a total of βn adversaries exist among n workers, where β < 1. Of course, it is unknown who is an adversary. As explained earlier, a certain correlation matrix between the normal workers is rank-1 in expectation. Of course, we may not know all the entries of this correlation matrix, since only correlations among workers with tasks in common are revealed. A natural approach is to create a random bipartite graph of revealed entries by assigning tasks randomly.

This can be done in a number of ways. We may, for example, generate a random bipartite graph G from Gn,n,p ﬁrst. Then, assuming there is a sufﬁciently large incoming stream of tasks, we assign each task to a random pair of workers i and j such (i, j) is an edge in G. After each pair of agents has been assigned enough tasks, the empirical correlation matrix is approximately rank-1, and we reveal the entries of this matrix corresponding to G. Note that, even though the correlation matrix is symmetric, this method will reveal an asymmetric subset of entries. Other methods to generate the

graph randomly via random task assignment are also possible, for example by assigning tasks to more than two workers. The key point, however, is that the fraction of adversaries in every neighborhood will then concentrate around β: for each node i, each of its randomly chosen neighbors is adversarial with probability β.

How many adversaries can we have and still correctly recover the skills of all the normal workers? Unfortunately, using the strategy of the previous paragraph, any constant fraction β of adversaries will result in a failure. Indeed, glancing at Eq. (8), if F scales as βpn (pn is the expected degree, and an expected fraction β of these nodes will be adversarial), then Eq. (8) can never be satisﬁed.

Although this sounds discouraging, the guarantees of Theorem 1 are still useful, as we explain now. Glancing at Eq. (8), it is easy to see that the choice of β = 1/[(2+ϵ) log(log(n))] (and corresponding F = βpn) leads to that equation being satisﬁed with the choice of p = Ω((1 + ϵ 1) log n)/n. In other words, we can tolerate a fraction of 1/[(2 + ϵ) log(log(n))] of adversaries.

Fortunately, 1/ log(log(n)) decays to zero quite slowly. Recall that in the crowdsourcing scenario, n will be the number of users; using an upper bound of 10 billion people for the population of planet Earth, we see that on any real-world data set, we have 1/[2 log(log(1010))] 16%, which is a healthy proportion of adversaries to tolerate.

Sign Determination. In the M-MSR algorithm, we assume the rank-one matrix to recover is positive. However the rank-one matrix which we aim to recover in crowdsourcing problem is not necessarily positive. In fact, a worker s skill level si [ 1 M 1, 1] as si = M M 1pi 1 M 1, pi [0, 1]. To solve this issue, we can compute the entry-wise absolute value of the rank-one matrix, then apply M-MSR to get |s|, ﬁnally, we apply a post-processing step to identify the sign pattern of s. Details are available in Supplementary Sec. I.

6 Experiments

In several crowdsourcing experiments, we will compare the average prediction error ( 1

T P t=1, ,T { ˆYt = gt}) of M-MSR algorithm with the straightforward majority voting (referred to as MV) and the following methods from the literature: [18](KOS), [11](Ghost-SVD), [5]( Eo R), [41] ( MV-D&S and OPT-D&S), [30](PGD), [28](BP-twocoin, EM-twocoin, and MFA-twocoin), [42](Entropy(O)), [44] (Minmax) . A detailed description of all of these methods can be found in Supplementary Sec. A. In all cases we choose the corrupted workers at random, and the reported results are from an average of 50 runs, with the shaded region denoting the standard deviation.

Synthetic Experiments. Figure 1 shows the results of a number of experiments we have generated. We want to study the impact of graph types (Figure 1(b), 1(d)), task assignments (Figure 1(c), 1(f)) and skill level of normal workers (Figure 1(e)) to the performance of the crowdsourcing methods. Additionally, we want to study the impact of different adversarial strategies (Figure 1(g), 1(h), 1(i), 1(k), 1(j)). To do this, we applied an adversarial model, by varying the parameters of this model, we can generate a number of different adversarial strategies (e.g., always return the wrong answer, return a random answer, return a certain fraction of the correct answer, some of the adversaries return exactly the same answers, some adversaries return the perfectly colluding answers, assign each adversary with every task, assign each adversary with every task with a ﬁxed probability). The details about the adversarial model as well as a deﬁnition of these parameters appears in Supplementary Sec. C.

Each graph in Figure 1 shows what happens when we vary one parameter. It can be seen that the M-MSR algorithm strongly outperforms the baseline methods on various datasets and under almost all of the adversarial strategies. Besides, the most damaging strategy in terms of reducing prediction error across the methods we tried seems to return a correct answer a fraction q < 1/2 of the time and an incorrect answer 1 q of the time, and the adversaries should locate at the central places and be highly dependent with each other.

Experiments on real data. We implemented similar experiments on 17 publicly available data sets that are commonly used to evaluate the crowdsourcing algorithms. A detailed discussion of all the datasets can be found in Supplementary Sec. E, and the details of the how the experiments were conducted can be found in supplementary Sec. D. As shown in Figure 2 and Figure 6 (Supplementary Sec. D), the M-MSR algorithm consistently outperforms all the baseline methods. In particular, when the number of the corrupted workers increases, the prediction error of M-MSR algorithm maintains the smallest on almost every dataset.

MV KOS Ghost SVD Eo R MV-D&S PGD OPT-D&S BP-twocoin EM-twocoin MFA-twocoin Entropy(O) Minmax M-MSR-twocoin M-MSR

0.02 0.04 0.06 0.08 0.1 Mean of observation sparsity

Prediction error

(b) Graph sparsity

500 1000 1500 2000 2500

Prediction error

(c) Number of tasks

50 100 150 200 # workers

Prediction error

(d) Graph size

0.2 0.4 0.6 0.8 Skill interval upper bound

Prediction error

(e) Skill level

10 20 30 40 # workers in group B

Prediction error

(f) Task-assignment variance

0 0.5 1 Accuracy of adversaries

Prediction error

(g) Adversary accuracy

0.2 0.4 0.6 0.8 Adversary observation sparsity

Prediction error

(h) Adversary ob-sparsity

0 10 20 30 # corrupted workers

Prediction error

(i) Number of adversaries

0.1 0.2 0.3 0.4 Fraction of colluding adversaries

Prediction error

(j) Adv. dependence level

0.1 0.2 0.3 0.4 Fraction of colluding adversaries

Prediction error

(k) Adv. dependence level

Figure 1: Experiments on synthetic data (see Supplementary Section C for a full explanation).

0 20 40 60 80 # corrupted workers

Prediction error

0 10 20 30 40 # corrupted workers

Prediction error

0 20 40 60 80 # corrupted workers

Prediction error

0 20 40 # corrupted workers

Prediction error

0 100 200 # corrupted workers

Prediction error

0 5 10 15 20 # corrupted workers

Prediction error

0 15 30 45 60 # corrupted workers

Prediction error

0 15 30 45 60 # corrupted workers

Prediction error

0 5 10 15 20 # corrupted workers

Prediction error

0 5 10 15 20 # corrupted workers

Prediction error

MV KOS Ghost SVD Eo R MV-D&S PGD OPT-D&S BP-twocoin EM-twocoin MFA-twocoin Entropy(O) Minmax M-MSR-twocoin M-MSR

Figure 2: Experimental results on real datasets (see Supplementary Section D for a full explanation).

The M-MSR algorithm performs especially well on large datasets like RTE, Temp, TREC, Fashion1 and Fashion2. It is the only method which can handle around n

2 (n is the number of the total workers) corrupted workers on these datasets. Out of 17 real datasets, our algorithm is the best on 16 of them. The only exception is dataset Surprise ( Figure 6 in Supplementary Sec. D) the reason is that the normal workers on this dataset do not appear to be reliable. Besides, on some of the original real datasets, i.e., no adversaries are introduced, M-MSR algorithm is not the best one among all the baseline methods. This shows that the superiority of the M-MSR algorithm is mainly on large datasets in adversarial setting.

6.1 Exact Recovery

We compare the M-MSR algorithm with PCA and RPCA algorithms in [9]. We study the recovery rate of the rank-one matrices with different level of noises. The experiment results are shown in Figure 3 (the details of this experiment can be found in Supplementary Sec. F). It can be seen that our algorithm strongly outperforms these methods for robust rank-one matrix completion. We also note that the M-MSR algorithm is very efﬁcient: when the dimension of the matrices increases from 10 to 1000, the running time of M-MSR increases from 0.01 seconds to 0.42 seconds while RPCA increases

Noise probability

Noise probability

Noise probability

200 400 600 800 1000 Dimension

Running time

(d) Running Time

Figure 3: Experimental results of exact recovery experiment. (a) Recovery rate heatmap of subgradient method for PCA (The intensity of the color is proportional to the recovery rate). (b) Recovery rate heatmap of subgradient method for RPCA (c) The recovery rate heatmap for M-MSR (d) Running time of the subgradient method for RPCA and running time for M-MSR (for each dimension, the average running time and the standard deviation conﬁdence interval over 100 independent trials are shown). Details are provided in Supplementary Section D.

from 0.46 seconds to 80.62 seconds. This is an additional advantage of the M-MSR algorithm when dealing with large datasets.

7 Discussion and Conclusions

We studied a crowdsourcing model with (i) The presence of users who might choose adversarial responses (ii) General worker-task assignment sets resulting in arbitrary interaction graphs G(Ω) among workers. Because approaches based on sparse recovery are not able to handle arbitrary A, we proposed a new algorithm, M-MSR, for skill determination (and consequently prediction) in this context. Our algorithm is based on a connection to the robust rank-1 matrix completion.

Our main results are: (i) A necessary and sufﬁcient condition for our algorithm to work on any graph, and a proof that our algorithm is optimal on random graphs (ii) An empirical evaluation which shows that our algorithm outperforms existing methods on both synthetic and real data sets.

Future work will analyze M-MSR when the graph is partially random. While some scenarios, like Amazon s Mechanical Turk, allow any set A to be speciﬁed, other practical scenarios do not; consider, for example, estimation of item quality from online ratings (e.g., Yelp), where the assignment of users to items is not random. Adversarial interactions are particularly important in this context, as business owners might be tempted to skew the ratings by leaving reviews from fake accounts.

Theorem 2 provides a bound to how many adversaries can be tolerated in this setting, and this bound will be quite good as long as the underlying graph is dense. For a sparse graph, however, this is not the case: in that case, Theorem 2 could fail to guarantee that even a small number of adversaries can t skew the result. One possibility is to strategically add more random edges (by giving users suggestions of items to rate) to make the resulting graph 2F + 1-robust for a large F. Subsequent work could consider how well such schemes perform both in theory and in practice.

Broader Impact

In this work, we provide a new robust matrix completion methods which can make recommendation systems more accurate in the presence of spam. This can beneﬁt users of platforms like Amazon Mechanical Turk and Yelp.

In terms of negative impact, our work could allow the same platforms to learn more about the preferences of their users. It is possible that this data could be leaked, resulting in privacy loss.

Acknowledgments and Disclosure of Funding

This work is supported by NSF awards 1914792 and 1933027.

[1] D. Berend and A. Kontorovich. Consistency of weighted majority votes. In Proceedings of Advances in Neural Information Processing Systems, pages 3446 3454, 2014.

[2] E. J. Candès and B. Recht. Exact matrix completion via convex optimization. Foundations of Computational mathematics, 9(6):717 772, 2009.

[3] E. J. Candès and T. Tao. The power of convex relaxation: Near-optimal matrix completion. IEEE Transactions on Information Theory, 56(5):2053 2080, 2010.

[4] I. Dagan, O. Glickman, and B. Magnini. The pascal recognising textual entailment challenge. In Proceedings of Machine Learning Challenges Workshop, pages 177 190. Springer, 2005.

[5] N. Dalvi, A. Dasgupta, R. Kumar, and V. Rastogi. Aggregating crowdsourced binary ratings. In Proceedings of the 22nd International Conference on World Wide Web, pages 285 294, 2013.

[6] A. P. Dawid and A. M. Skene. Maximum likelihood estimation of observer error-rates using the em algorithm. Journal of the Royal Statistical Society: Series C (Applied Statistics), 28(1): 20 28, 1979.

[7] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. Image Net: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition, pages 248 255. Ieee, 2009.

[8] D. Dolev, C. Dwork, O. Waarts, and M. Yung. Perfectly secure message transmission. Journal of the ACM, 40(1):17 47, 1993.

[9] S. Fattahi and S. Sojoudi. Exact guarantees on the absence of spurious local minima for nonnegative robust principal component analysis. Journal of Machine Learning Research, 21:1 51, 2020.

[10] D. Gamarnik and S. Misra. A note on alternating minimization algorithm for the matrix completion problem. IEEE Signal Processing Letters, 23(10):1340 1343, 2016.

[11] A. Ghosh, S. Kale, and P. Mc Afee. Who moderates the moderators? crowdsourcing abuse detection in user-generated content. In Proceedings of the 12th ACM Conference on Electronic Commerce, pages 167 176, 2011.

[12] N. Hartsﬁeld and G. Ringel. Pearls in graph theory: a comprehensive introduction. Courier Corporation, 2013.

[13] J. M. Hendrickx, A. Olshevsky, and V. Saligrama. Minimax rank-1 factorization. In Proceedings of 23rd International Conference on Artiﬁcial Intelligence and Statistics, 2020.

[14] J. Hromkoviˇc, R. Klasing, A. Pelc, P. Ruzicka, and W. Unger. Dissemination of Information in Communication Networks: Broadcasting, Gossiping, Leader Election, and Fault-tolerance. Springer Science & Business Media, 2005.

[15] S. Ibrahim, X. Fu, N. Kargas, and K. Huang. Crowdsourcing via pairwise co-occurrences: Identiﬁability and algorithms. In Proceedings of Advances in Neural Information Processing Systems, pages 7845 7855, 2019.

[16] P. G. Ipeirotis, F. Provost, and J. Wang. Quality management on amazon mechanical turk. In Proceedings of the ACM SIGKDD Workshop on Human Computation, pages 64 67, 2010.

[17] S. Jagabathula, L. Subramanian, and A. Venkataraman. Identifying unreliable and adversarial workers in crowdsourced labeling tasks. The Journal of Machine Learning Research, 18(1): 3233 3299, 2017.

[18] D. R. Karger, S. Oh, and D. Shah. Efﬁcient crowdsourcing for multi-class labeling. In Proceedings of the ACM SIGMETRICS/international conference on Measurement and modeling of computer systems, pages 81 92, 2013.

[19] A. Khetan and S. Oh. Achieving budget-optimality with adaptive schemes in crowdsourcing. In Advances in Neural Information Processing Systems 29, pages 4844 4852. 2016.

[20] F. J. Király, L. Theran, and R. Tomioka. The algebraic combinatorial approach for low-rank matrix completion. Journal of Machine Learning Research, pages 1391 1436, 2015.

[21] M. Kleindessner and P. Awasthi. Crowdsourcing with arbitrary adversaries. In Proceedings of International Conference on Machine Learning, pages 2708 2717, 2018.

[22] W. Kordecki. Poisson convergence of numbers of vertices of a given degree in random graphs. Discussiones Mathematicae Graph Theory, 16(2):157 172, 1996.

[23] H. Landau and A. Odlyzko. Bounds for eigenvalues of certain stochastic matrices. Linear Algebra and its Applications, 38:5 15, 1981.

[24] M. Lease and G. Kazai. Overview of the trec 2011 crowdsourcing track. In Proceedings of the Text Retrieval Conference, 2011.

[25] H. J. Le Blanc, H. Zhang, X. Koutsoukos, and S. Sundaram. Resilient asymptotic consensus in robust networks. IEEE Journal on Selected Areas in Communications, 31(4):766 781, 2013.

[26] H. Li and B. Yu. Error rate bounds and iterative weighted majority voting for crowdsourcing. ar Xiv preprint ar Xiv:1411.4086, 2014.

[27] G. Liu, Q. Liu, and X. Yuan. A new theory for matrix completion. In Proceedings of Advances in Neural Information Processing Systems, pages 785 794, 2017.

[28] Q. Liu, J. Peng, and A. T. Ihler. Variational inference for crowdsourcing. In Advances in Neural Information Processing Systems 25, pages 692 700. 2012.

[29] B. Loni, M. Menendez, M. Georgescu, L. Galli, C. Massari, I. S. Altingovde, D. Martinenghi, M. Melenhorst, R. Vliegendhart, and M. Larson. Fashion-focused creative commons social dataset. In Proceedings of the 4th ACM Multimedia Systems Conference, pages 72 77, 2013.

[30] Y. Ma, A. Olshevsky, C. Szepesvari, and V. Saligrama. Gradient descent for sparse rankone matrix completion for crowd-sourced aggregation of sparsely interacting workers. In Proceedings of International Conference on Machine Learning, pages 3335 3344, 2018.

[31] S. Negahban and M. J. Wainwright. Restricted strong convexity and weighted matrix completion: Optimal bounds with noise. Journal of Machine Learning Research, 13(May):1665 1697, 2012.

[32] S. Pradhan, E. Loper, D. Dligach, and M. Palmer. Semeval-2007 task-17: English lexical sample, srl and all words. In Proceedings of the fourth international workshop on semantic evaluations (Sem Eval-2007), pages 87 92, 2007.

[33] J. Pustejovsky, P. Hanks, R. Sauri, A. See, R. Gaizauskas, A. Setzer, D. Radev, B. Sundheim, D. Day, L. Ferro, et al. The TIMEBANK Corpus. In Proceedings of Corpus Linguistics, pages 647 656. Lancaster, UK., 2003.

[34] V. C. Raykar and S. Yu. Eliminating spammers and ranking annotators for crowdsourced labeling tasks. Journal of Machine Learning Research, 13(Feb):491 518, 2012.

[35] N. B. Shah, S. Balakrishnan, and M. J. Wainwright. A permutation-based model for crowd labeling: Optimal estimation and robustness. ar Xiv preprint ar Xiv:1606.09632, 2016.

[36] R. Snow, B. O connor, D. Jurafsky, and A. Y. Ng. Cheap and fast but is it good? evaluating non-expert annotations for natural language tasks. In Proceedings of Conference on Empirical Methods in Natural Language Processing, pages 254 263, 2008.

[37] C. Strapparava and R. Mihalcea. Semeval-2007 task 14: Affective text. In Proceedings of the Fourth International Workshop on Semantic Evaluations, pages 70 74, 2007.

[38] P. Welinder, S. Branson, P. Perona, and S. J. Belongie. The multidimensional wisdom of crowds. In Proceedings of Advances in Neural Information Processing Systems, pages 2424 2432, 2010.

[39] H. Xiao, J. Gao, Q. Li, F. Ma, L. Su, Y. Feng, and A. Zhang. Towards conﬁdence interval estimation in truth discovery. IEEE Transactions on Knowledge and Data Engineering, 31(3): 575 588, 2018.

[40] H. Zhang, E. Fata, and S. Sundaram. A notion of robustness in complex networks. IEEE Transactions on Control of Network Systems, 2(3):310 320, 2015.

[41] Y. Zhang, X. Chen, D. Zhou, and M. I. Jordan. Spectral methods meet em: A provably optimal algorithm for crowdsourcing. In Proceedings of Advances in Neural Information Processing Systems, pages 1260 1268, 2014.

[42] D. Zhou, S. Basu, Y. Mao, and J. C. Platt. Learning from the wisdom of crowds by minimax entropy. In F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger, editors, Advances in Neural Information Processing Systems 25, pages 2195 2203. 2012.

[43] D. Zhou, S. Basu, Y. Mao, and J. C. Platt. Learning from the wisdom of crowds by minimax entropy. In Proceedings of Advances in Neural Information Processing Systems, pages 2195 2203, 2012.

[44] D. Zhou, Q. Liu, J. Platt, and C. Meek. Aggregating ordinal labels from crowds by minimax conditional entropy. Proceedings of Machine Learning Research, 32(2):262 270, 2014.

[45] Y. Zhou and J. He. Crowdsourcing via tensor augmentation and completion. In IJCAI, pages 2435 2441, 2016.

[46] Y. Zhou, L. Ying, and J. He. Multic2: an optimization framework for learning from task and worker dual heterogeneity. In Proceedings of the 2017 SIAM International Conference on Data Mining, pages 579 587. SIAM, 2017.