Optimal Complex Task Assignment in Service Crowdsourcing

Feilong Tang
Department of Computer Science and Engineering, Shanghai Jiao Tong University, China
tang feilong@sjtu.edu.cn

Abstract

Existing schemes cannot assign complex tasks to the most suitable workers because they either cannot measure skills quantitatively or do not consider assigning tasks to workers who are the most suitable but temporarily unavailable. In this paper, we investigate how to realize optimal complex task assignment. Firstly, we formulate the multiple-skill-based task assignment problem in service crowdsourcing. We then propose a weighted multi-skill tree (WMST) to model multiple skills and their correlations. Next, we propose the acceptance expectation to uniformly measure the probabilities that different categories of workers will accept and complete specified tasks. Finally, we propose an acceptance-expectation-based task assignment (AE-TA) algorithm, which reserves tasks for the most suitable workers even if they are temporarily unavailable. Comprehensive experimental results demonstrate that our WMST model and AE-TA algorithm significantly outperform related proposals.

1 Introduction

Service crowdsourcing provides a convenient platform to collaboratively perform complex tasks (i.e., macrotasks) [Haas et al., 2015], such as literature translation, document editing and product design, besides traditional simple tasks like labeling [Liu et al., 2018]. These macrotasks usually require multiple skills [Alon et al., 2015] and have specified deadlines. Specialized translation, for example, requires not only ability in at least two languages but also domain knowledge. The most desirable goal of macrotask assignment is to assign tasks to the most suitable workers [Zheng et al., 2015; Qiu et al., 2016]. However, this is very challenging in multi-worker and multi-task service crowdsourcing. Essentially, there are two key challenges.
The first is how to precisely measure the multiple skills of workers, because different workers usually possess different skills and diverse degrees of proficiency in the same skill [Xu et al., 2018]. The second is how to assign tasks to the most suitable workers when the states of workers are uncertain, i.e., some of the more suitable workers may be temporarily unavailable at the time of task assignment. In realistic scenarios, although the most suitable workers are often not available during task assignment, they can potentially complete tasks before the task deadlines. Some researchers have proposed skill models [Roy et al., 2015; Salek et al., 2013]. However, these models have not solved the first challenge above because they cannot reflect the correlations among multiple skills. The existing hierarchical skill models also cannot quantitatively capture multiple skills. On the other hand, few existing proposals have considered workers who are more suitable for specified tasks but currently unavailable. In [Boutsis and Kalogeraki, 2013; Boutsis and Kalogeraki, 2014], for example, the authors only estimated whether the currently available workers can complete tasks. In summary, from the point of view of either precise skill estimation or optimal task assignment, existing work cannot assign tasks to the most suitable workers. In this paper, we propose a multi-skill-based assignment approach for complex macrotasks in service crowdsourcing, which addresses both of the above challenges simultaneously. Firstly, we propose a Weighted Multi-Skill Tree (WMST) to model the multiple skills of workers and the correlations among these skills, and define the match quality to quantitatively capture how well workers fit tasks. To assign tasks to the most suitable workers, we divide workers into three categories: available, busy and offline workers, and then quantitatively estimate their acceptance-and-completion expectation.
Based on this expectation, we propose an adaptive algorithm that dynamically decides how many tasks should be reserved for workers who are more suitable but temporarily unavailable. We focus on real macrotask-oriented service crowdsourcing systems with the following features. Firstly, these macrotasks, e.g., translating a book on Java programming, have professional skill requirements and should be finished before given deadlines. Secondly, workers have professional skills, and their workloads are relatively stable and even predictable. Obviously, senior software engineers can provide higher translation quality than taxi drivers. So, if temporarily unavailable workers with more relevant skills have enough time to complete tasks, it is more desirable to assign such tasks to them. Our main contributions can be summarized as follows.

We formally define the MS-TA (Multi-Skill-based Task Assignment) problem in service crowdsourcing systems, with the objective of maximizing global task assignment quality. We then prove that this problem is NP-hard.

We propose the multi-skill model WMST (Weighted Multi-Skill Tree) that can quantitatively capture multiple skills and the correlations among them.

We propose a new acceptance expectation model and design an Acceptance-Expectation-based Task Assignment (AE-TA) algorithm. Experimental results demonstrate the high performance of our WMST and AE-TA.

Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence (IJCAI-20)

2 Related Work

2.1 Online Task Assignment

Push-based task assignment is currently the predominant method. In [Ho and Vaughan, 2012], the authors proposed a two-phase exploration-exploitation assignment algorithm. [Ho et al., 2013] proposed a near-optimal algorithm. In [Goel, Nikzad, and Singla, 2014], the authors modeled the task-to-worker assignment as a bipartite graph matching problem and proposed a mechanism for maximizing the utility.
In [Assadi, Hsu, and Jabbari, 2015], the authors extended the problem by considering the online aspect. To increase the efficiency of task completion, [Yuen, King, and Leung, 2011] proposed a matching mechanism that keeps workers working over the long run. [Chandra et al., 2015] proposed an interaction model in which workers are strategic about service cost.

2.2 Knowledge-Intensive Skill Models

Most expert crowdsourcing platforms are both time-sensitive and skill-related [Tang, 2019]. [Desmarais and Baker, 2012] presented a review of techniques for learner skill assessment. Little work on crowdsourcing has discussed worker skills in detail. Most of it used keyword vectors as a simple representation of skills [Fan et al., 2015; Cheng et al., 2016]. As an alternative, [El Maarry et al., 2014] proposed an ontology-based skill model. [Mavridis et al., 2016] proposed a skill model with a hierarchical structure, but this model can only cope with tasks related to a single skill and cannot reflect proficiency degrees.

3 System Model and Problem Statement

3.1 System Model

Let the service crowdsourcing system run in a set of equal timeslots Q = {q_k | 1 ≤ k ≤ N_Q}. There are a set of workers W = {w_i | 1 ≤ i ≤ N_W} and a set of tasks T = {t_j | 1 ≤ j ≤ N_T}, where N_W and N_T are the numbers of workers and tasks entering the system during the N_Q timeslots, respectively. Each worker w_i ∈ W possesses a set of attributes {S_wi, V_wi}, where S_wi is the set of skills of w_i and V_wi = <v^(q1)_wi, ..., v^(qNQ)_wi> is a vector of states. v^(qk)_wi is the state of w_i in q_k such that v^(qk)_wi = 1 if w_i is online in q_k; otherwise, v^(qk)_wi = 0. A worker is allowed to accept multiple tasks simultaneously. We use L^(qk)_wi to represent the list of tasks that w_i has accepted but not yet completed in q_k. Each task t_j ∈ T is associated with an attribute set {b_tj, d_tj, g_tj, R_tj}, where b_tj and d_tj are the starting time and deadline of t_j, respectively, with q_1 ≤ b_tj ≤ d_tj ≤ q_NQ; g_tj is the number of workers required by t_j; and R_tj is the set of skill requirements of t_j.

4.1 Weighted Multi-Skill Tree

Each node s in a WMST is a tuple <key(s), d(s)>, where key(s) is a skill and d(s) is the weight of that skill.
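The system model above can be sketched as two small data structures. This is our own illustration, not code from the paper; the class and field names (Worker, Task, quota, etc.) are assumptions we introduce for clarity.

```python
from dataclasses import dataclass, field

@dataclass
class Worker:
    """A worker w_i with skills S_wi and a per-timeslot state vector V_wi."""
    skills: dict                 # skill name -> proficiency weight (WMST leaves)
    states: list                 # states[k] = 1 if online in timeslot q_k, else 0
    pending: list = field(default_factory=list)  # L_wi: accepted, uncompleted tasks

@dataclass
class Task:
    """A task t_j with start time b_tj, deadline d_tj, worker quota g_tj,
    and skill requirements R_tj."""
    start: int                   # b_tj (timeslot index)
    deadline: int                # d_tj (timeslot index)
    quota: int                   # g_tj: number of workers still required
    requirements: dict           # R_tj: skill -> required level

# Example: a translation task needing two skills, open during slots 3..10.
t = Task(start=3, deadline=10, quota=2,
         requirements={"Java": 0.6, "English": 0.8})
w = Worker(skills={"Java": 0.7, "English": 0.9}, states=[1, 1, 0, 1])
```

A scheduler built on this model would scan `states` per timeslot to classify each worker as available, busy, or offline, as Section 5 does.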
Let dep(s) be the depth of a node s in the WMST. We use fth^(n)(s) (0 ≤ n ≤ dep(s)) to denote the ancestor node that is n levels higher than s, and set fth^(0)(s) = s. Finally, chd(s) denotes the set of child nodes of s. Our WMST can model multiple skills for complex tasks. Here, a leaf node denotes an atomic skill (e.g., Java language skill), and a middle node captures a composite skill (e.g., coding skill). Each worker w_i ∈ W has an associated WMST S_wi. Weights represent the worker's proficiency in the corresponding skills. The weight of any node except a leaf node is the sum of the weights of its child nodes. Let d_wi(s) denote the weight of s for w_i. We have

d_wi(s) = Σ_{s' ∈ chd(s)} d_wi(s')    (7)

4.2 Match Quality

Workers with the required skills are more suitable for a given task [Roy et al., 2015]. However, other related skills also contribute to task execution [Choi, Yoo, and Lee, 2018]. The complexities of tasks and the skill proficiencies of workers are unobserved random variables; the observed variables are the responses from workers. We employ the Rasch model [Rasch, 1993] to estimate the initial skill weights in the leaf nodes of a WMST, using a questionnaire as a pre-test. The weights of middle nodes are calculated using (7). We update the weights in a worker's WMST whenever he finishes a new task, so the WMST captures the proficiency degrees more and more accurately. Each task t_j ∈ T is associated with a set of skill requirements R_tj = {r_tj(s_1), ..., r_tj(s_n)}, where s_i represents a non-root skill node in the WMST, with the constraints that fth^(x)(s_j) ≠ s_i for any x > 0 and dep(s_i) ≥ 1. We model the execution time X of a worker as a random variable following a power-law distribution p(x) = c·x^(-α), where c > 0 is a constant and x_min is a lower time bound, and

P_wi,tj(x) = Pr(X > x) = ∫_x^∞ p(X) dX = (x / x_min)^(1-α),

where α = 1 + n (Σ_{i=1}^n ln(x_i / x_min))^(-1). For each worker, we initially set x_min to his lowest previous execution time.
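The bottom-up weight propagation of Eq. (7) can be sketched as follows. This is a minimal illustration, assuming the WMST is given as a nested dict and that leaf weights have already been estimated (e.g., via the Rasch pre-test); the function and variable names are ours.

```python
def wmst_weights(tree, leaf_weights):
    """Compute WMST node weights per Eq. (7): the weight of a non-leaf
    node is the sum of its children's weights.

    `tree` is a nested dict {skill_name: children_dict}, with {} marking
    a leaf (atomic skill); `leaf_weights` maps leaf skill names to their
    estimated weights. Returns a flat dict of weights for every node."""
    weights = {}

    def visit(name, children):
        if not children:                       # leaf node: atomic skill
            weights[name] = leaf_weights.get(name, 0.0)
        else:                                  # middle node: composite skill
            weights[name] = sum(visit(c, g) for c, g in children.items())
        return weights[name]

    for name, children in tree.items():
        visit(name, children)
    return weights

# Toy WMST: "coding" is a composite of two atomic language skills.
tree = {"coding": {"Java": {}, "Python": {}}}
w = wmst_weights(tree, {"Java": 0.7, "Python": 0.5})
```

Updating a single leaf after a finished task only requires re-running the propagation along that leaf's ancestor path, which keeps per-task updates cheap.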
We divide the workers U^(qk) into three pairwise disjoint subsets: available workers U^(qk)_A, who are online and free, i.e., v^(qk)_wi = 1 and ||L^(qk)_wi|| = 0; busy workers U^(qk)_B, who are online but working, i.e., v^(qk)_wi = 1 and ||L^(qk)_wi|| ≥ 1; and offline workers U^(qk)_O, who are offline, i.e., v^(qk)_wi = 0. Note that in any timeslot q_k, we have U^(qk) = U^(qk)_A ∪ U^(qk)_B ∪ U^(qk)_O. We use P̃_U,wi,tj to denote the probability that w_i ∈ U can complete t_j before its deadline, where U is a set of workers, and ρ_tj denotes the probability that t_j will be accepted. The acceptance expectation E^(qk)_U,tj of the workers in U for t_j in q_k is

E^(qk)_U,tj = Σ_{wi ∈ U} ρ_tj · P̃_U,wi,tj    (9)

Available Workers. The probability that t_j will be completed by an available worker before its deadline is calculated by

P̃_UA,wi,tj = 1 − P_wi,tj(d_tj − q_wi,tj),    (10)

where q_wi,tj is the time when t_j is assigned to w_i.

Busy Workers. Let w_i be busy on t_b ∈ L^(qk)_wi. We call the list excluding its first element the pending task list L'^(qk)_wi. We estimate the probability P̃^(qa,qb)_UB,wi,tj that w_i completes t_j during [q_a, q_b] such that b_tj ≤ q_wi,tj ≤ q_a ≤ q_b ≤ d_tj.

Offline Workers. We use a binary state vector to represent a worker's state, where 0 and 1 represent the offline and online states, respectively. We estimate the probability of an offline worker coming online using a Markov model. Let M^(qk) denote the transition matrix from q_k to q_k+1, where 1 ≤ k < N_Q.

Algorithm 1: AE-TA
1: for each q_k ∈ Q:
2:   ...
3:   T^(qk) ← {t_j | g^(qk)_tj > 0};
4:   sort T^(qk) by u^(qk)_tj;
5:   for t_j ∈ T^(qk):
6:     W^(qk)_tj ← {w | a^(qs)_w,tj ∉ A, q_s < q_k};
7:     sort W^(qk)_tj by m(a^(qk)_w,tj);
8:     U_tj ← arg min_U {||U|| : E^(qk)_U,tj ≥ g^(qk)_tj, U_tj ⊆ W^(qk)_tj};
9:     for w_i ∈ U_tj:    /* U_tj = U_A,tj ∪ U_B,tj ∪ U_O,tj */
10:      A ← A ∪ {a^(qk)_wi,tj} if w_i ∈ U_A,tj;
11:      L^(qk)_wi ← L^(qk)_wi ∪ {t_j};
12:    end for
13:    for all w_i that reject t_j:    /* after feedback */
14:      A ← A \ {a^(qk)_wi,tj}; L^(qk)_wi ← L^(qk)_wi \ {t_j};
15:    end for
16:    g^(qk+1)_tj ← max(g^(qk)_tj − ||U_A,tj|| + ||A⁻_tj||, 0);
17:    ρ^(qk+1)_tj ← (Σ_{q=btj}^{qk} ||U^(q)_A,tj|| − ||A⁻_tj||) / Σ_{q=btj}^{qk} ||U^(q)_A,tj||;
18:  end for
19: end for

The urgency degree u^(qk)_tj of a task is proportional to the number of required workers and inversely proportional to the remaining time.
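Eqs. (9) and (10) can be sketched for the available-worker case, combining the power-law execution-time model of Section 4 with the acceptance probability ρ. This is a minimal sketch under our own naming; the two-worker example parameters at the end are hypothetical, and the busy/offline cases (which need the pending list and the Markov model) are omitted.

```python
import math

def pareto_alpha(samples, x_min):
    """Maximum-likelihood exponent for the power-law execution-time model:
    alpha = 1 + n / sum(ln(x_i / x_min))."""
    return 1.0 + len(samples) / sum(math.log(x / x_min) for x in samples)

def survival(x, x_min, alpha):
    """P_{wi,tj}(x) = Pr(X > x) = (x / x_min)^(1 - alpha) for x >= x_min."""
    return (x / x_min) ** (1.0 - alpha) if x >= x_min else 1.0

def acceptance_expectation(workers, deadline, q_assign, rho):
    """Eq. (9) restricted to available workers: each term is
    rho * (1 - P_{wi,tj}(d_tj - q_{wi,tj})), per Eq. (10).
    `workers` is a list of (x_min, alpha) execution-time parameters."""
    return sum(rho * (1.0 - survival(deadline - q_assign, x_min, alpha))
               for x_min, alpha in workers)

# Two available workers; the task is assigned at slot 2 with deadline 10.
ws = [(1.0, 2.5), (2.0, 3.0)]
e = acceptance_expectation(ws, deadline=10, q_assign=2, rho=0.8)
```

Line 8 of Algorithm 1 then amounts to adding workers (in match-quality order) until this expectation reaches the remaining quota g^(qk)_tj.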
We propose two types of assignments. The first is actual assignment, for available workers. In this case, we directly assign each of them a suitable task; if a worker accepts a task, his state changes to busy. The second is dummy assignment, for busy and offline workers. Here, a task is added to the pending task list. A busy worker may complete his current task more slowly than expected, and an offline worker may log in later than expected, so the pending task lists are updated at the beginning of every timeslot.

Our AE-TA is presented in Algorithm 1. In every timeslot, tasks are sorted by their urgency degrees in lines 3-4 before assignment. For each task, AE-TA finds the group of workers who are the most suitable for that task (lines 6-8) and assigns the task to them (lines 9-12). This process includes both actual assignment (line 10) and dummy assignment (line 11). The key point is to find the minimum set U_tj of workers who are the most suitable for t_j; AE-TA then directly assigns the task to the workers in U_A,tj. In this way, AE-TA reserves positions for the workers who are the most suitable but temporarily unavailable. After feedback, AE-TA records rejected matches in lines 13-15 and updates the parameters in lines 16-17.

6 Performance Evaluations

6.1 Compared Algorithms and Skill Models

We compare our AE-TA with four task assignment algorithms: (1) Random, which randomly assigns tasks to available workers only; (2) Match Participant First [Mavridis et al., 2016], in which tasks are assigned first to the workers with the fewest skills; (3) TAS-Online [Lykourentzou and Schmitz, 2015], which finds the maximum-weight bipartite match among tasks and available workers; and (4) g-D&C [Cheng et al., 2016], which decomposes tasks and workers into g groups and greedily chooses the maximum task-to-worker pairs. Moreover, we compare our WMST with (1) the Taxonomy-based skill model [Mavridis et al., 2016], based on a non-weighted tree structure, where each task is associated with only one main skill, and (2) the Keyword-based skill model [Fan et al., 2015], where skills are modeled as a set of keywords.

Figure 1: Ψ(A) vs. time
Figure 2: Ψ(A) with the number of workers
Figure 3: Ψ(A) with task generating speed

6.2 Performance on Synthetic Data

The timeslot was set to one hour. We simulated 1500 workers and built one WMST for each worker. In each WMST, the weights of skills in leaf nodes were randomly set in [0, 1], and the weights of non-leaf nodes were computed using formula (7). All WMSTs had the same structure, with a maximum depth of 5, and each node except the leaf nodes had five children. Tasks were generated at an average speed of µ=10 tasks/hour. The required number g_t of workers for each task t was randomly initialized such that g_t ∈ [5, 50]. The deadline d_t was set by adding to b_t a time drawn from a normal distribution with a mean of 50 and a variance of 20. Each task randomly requires 1 to 5 skills in the leaf nodes of the WMSTs of the workers involved in the task. Task execution times follow the power-law distribution p(x) = c·x^(-α), where c and α were randomly selected from c ∈ [10, 30] and α ∈ [1.5, 2.5]. To obtain the distribution of active workers, we extracted the active times of 300 workers from Weibo within one week and generated the worker state vectors following this distribution. We set ρ_tj = 0.8 for all tasks. Four thousand tasks were randomly generated over 15 days, and the experiment ended when all tasks had expired, after 18 days in total. We tested how the global assignment quality Ψ(A) changes with time.
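A workload generator following the synthetic setup above can be sketched as below. The function name and dict keys are ours, and we read the stated "variance of 20" literally (standard deviation √20); this is an illustration of the experimental parameters, not the authors' simulator.

```python
import random

def generate_tasks(n, horizon_hours, seed=0):
    """Generate a synthetic workload following the paper's setup:
    arrivals spread over the horizon, quota g_t ~ U[5, 50], deadline
    d_t = b_t + N(mean=50, variance=20), and a per-task power-law
    execution-time exponent alpha ~ U[1.5, 2.5]."""
    rng = random.Random(seed)
    tasks = []
    for _ in range(n):
        b = rng.uniform(0, horizon_hours)
        d = b + max(1.0, rng.gauss(50, 20 ** 0.5))  # deadline strictly after start
        tasks.append({
            "start": b,
            "deadline": d,
            "quota": rng.randint(5, 50),
            "alpha": rng.uniform(1.5, 2.5),
        })
    return tasks

# 4000 tasks over the 15-day (360-hour) generation window.
tasks = generate_tasks(4000, horizon_hours=15 * 24)
```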
Figure 1 shows that after 280 hours, our AE-TA algorithm achieves the highest Ψ(A) among all algorithms, although its Ψ(A) is lower in the early stage. The reason is that AE-TA reserves tasks for workers who are more suitable but currently unavailable, whereas the other algorithms always assign tasks to available workers only. To evaluate how Ψ(A) is affected by the number of workers, we varied the worker pool size from 100 to 6000 with a fixed task generating speed (µ=10 tasks/hour) and execution time (18 days). As shown in Figure 2, Ψ(A) increases with the number of workers while workers are scarce; once the number of workers reaches saturation, however, Ψ(A) tends to be stable. The results demonstrate that our AE-TA algorithm always outperforms the other algorithms in terms of global assignment quality. Similarly, Figure 3 illustrates how Ψ(A) changes with the number of tasks. Here, we fixed the number of workers at 1500 and generated tasks with speeds µ ∈ [0.5, 75] tasks/hour. As illustrated in Figure 3, saturation is eventually reached as the number of tasks increases.

6.3 Performance on Real Data

To evaluate the real-world performance of our task assignment algorithm AE-TA and skill model WMST, we recruited workers with different skills and conducted one-shot and long-run evaluations. In the one-shot evaluation, the workers were asked to complete a questionnaire consisting of single-option questions. For the long-run evaluation, we designed specialized translation tasks and assigned them over 7 days. In both evaluations, the WMSTs of all workers have the same structure, with different weights and a maximum depth of 5.

One-Shot Performance. For this evaluation, we designed a questionnaire consisting of 70 single-option questions. Each question has 4 alternative answers and requires 1 to 5 atomic skills described by leaf nodes of the WMST.
To build the WMSTs of the 80 recruited workers, we carefully selected and assigned 20 questions for the initial estimation of the skill weights of these workers. If a worker did not finish all 20 questions, the corresponding weights in his WMST were set to 0. Based on these WMSTs, we then evaluated the relationship between the proposed match quality and the correctness of the answers to demonstrate the effectiveness of our approach. Here, we asked 20 workers with the individual WMSTs built above to complete the remaining 50 questions, which generated 1000 answers. Figure 4 presents histograms showing the numbers of correct and incorrect answers. Among these answers, the average match quality scores for correct and incorrect answers are 3.46 and 2.12, respectively. The results also reveal that these workers have different fields of expertise, and that workers with higher match quality achieve higher ratios of correct answers.

Next, we arranged for the other 60 workers to evaluate the different assignment algorithms and skill models, using the 50 questions. We set g_t = 5 for all tasks. Since the tasks were assigned in a batch in each timeslot, every worker was regarded as available and could only be assigned one task. Therefore, not all tasks could be fully completed, but we counted the number of correct answers and computed the ratio of correct answers with respect to the number of questions answered.

Figure 4: Match quality with answers
Figure 5: One-shot performance
Figure 6: Long-run performance of models

According to Figure 5, our WMST always significantly outperforms the taxonomy- and keyword-based skill models. The taxonomy-based model exhibits the worst performance, indicating that it is not suitable for the multi-skill scenario.
The keyword-based model cannot capture the correlations among skills, so it cannot ensure that workers are assigned the most suitable tasks. By contrast, our WMST is designed to consider multiple skills and their correlations. From Figure 5, we also find that our AE-TA algorithm achieves higher correct-answer ratios than all algorithms except TAS-Online. TAS-Online strives to achieve the maximum-weight bipartite match, which corresponds to the theoretical maximum performance; however, it is only suitable for application in a single timeslot, whereas real-world scenarios are more complex and worker states can change over time. Meanwhile, both the one-shot version of AE-TA and g-D&C perform greedy matching, so they achieve similar performance.

Long-Run Performance. We also conducted long-run experiments over 7 days. 84 workers were recruited for the long-run performance tests. They were evenly divided into four groups, one for each of Match Participant First, TAS-Online, g-D&C and our AE-TA. Their WMSTs were built using the 20 single-option questions above. We chose 70 articles, which varied in length from 500 to 5000 words. Each article translation was treated as a task, and we equally partitioned each article into parts of 500 words. These tasks were released randomly at an average speed of µ=10 tasks/day and were set to expire in 1 to 3 days. In Figure 6, we used AE-TA as the common assignment algorithm to evaluate the different skill models. Based on the common skill model WMST, we then tested the four task assignment algorithms, as shown in Figure 7. We asked 20 independent experts to evaluate every translation with a score from 0 to 100. Since all tasks have almost the same length, the payment depends only on the quality of the translations. As shown in Figure 6, where we combined our AE-TA with the three skill models respectively, our WMST model always outperforms the Taxonomy-based and Keyword-based skill models in multi-skill scenarios.
Further, Figure 7 shows the average translation scores over 7 days.

Figure 7: Long-run performance of algorithms

The results are similar to those in the synthetic experiments: our AE-TA exhibits the overall best performance among all four task assignment algorithms. Although TAS-Online performs better in the first two days, its average score declines in subsequent days. The reason is that this algorithm assigns tasks to all available workers; during the first few days there are more available workers, whereas in later timeslots not enough suitable workers are available. Consequently, AE-TA outperforms TAS-Online. Thus, the long-run evaluation indicates that the AE-TA algorithm assigns tasks to more skill-qualified workers, thereby ensuring that more tasks are completed by workers with more suitable skills.

7 Conclusion and Future Work

We investigated the problem of multi-skill-based optimal complex task assignment in service crowdsourcing. We proposed the WMST model to capture correlations among skills, and a match quality model to provide a comprehensive and quantitative description of how well a worker suits a task. We designed the AE-TA algorithm based on the proposed acceptance expectation. Among the most related task assignment algorithms and skill models, our AE-TA algorithm and WMST model exhibit the best performance on both synthetic and real data sets. In future work, we will investigate how to solve the task assignment problem with theoretical guarantees.

Acknowledgments

This work was supported by the National Natural Science Foundation of China projects (Nos. 61832013 and 61672351), the STCSM AI project (No. 19511120300), the National Key R&D Program of China (No. 2019YFB2102204), the Aerospace Fund (No. 20GFC-JJ02-012), the Alibaba AIR project (No. SCCT622019010803), and Huawei Technologies Co., Ltd. projects (Nos. YBN2019105155 and YBN2019075061). Feilong Tang is the corresponding author of this paper.
References

[Alon et al., 2015] Noga Alon, Michal Feldman, Omer Lev, and Moshe Tennenholtz. How robust is the wisdom of the crowds? In IJCAI, pages 2055–2061, 2015.
[Assadi, Hsu, and Jabbari, 2015] Sepehr Assadi, Justin Hsu, and Shahin Jabbari. Online assignment of heterogeneous tasks in crowdsourcing markets. In HCOMP, pages 12–21, 2015.
[Boutsis and Kalogeraki, 2013] Ioannis Boutsis and Vana Kalogeraki. Crowdsourcing under real-time constraints. In IPDPS, pages 753–764, 2013.
[Boutsis and Kalogeraki, 2014] Ioannis Boutsis and Vana Kalogeraki. On task assignment for real-time reliable crowdsourcing. In ICDCS, pages 1–10, 2014.
[Chandra et al., 2015] Praphul Chandra, Yadati Narahari, Debmalya Mandal, and Prasenjit Dey. Novel mechanisms for online crowdsourcing with unreliable, strategic agents. In AAAI, pages 1256–1262, 2015.
[Cheng et al., 2016] Peng Cheng, Xiang Lian, Lei Chen, and Jizhong Zhao. Task assignment on multi-skill oriented spatial crowdsourcing. IEEE Transactions on Knowledge and Data Engineering, 28(8):2201–2215, 2016.
[Choi, Yoo, and Lee, 2018] Jihun Choi, Kang Min Yoo, and Sang-goo Lee. Learning to compose task-specific tree structures. In AAAI, pages 5094–5101, 2018.
[Desmarais and Baker, 2012] Michel Desmarais and Ryan Baker. A review of recent advances in learner and skill modeling in intelligent learning environments. User Modeling and User-Adapted Interaction, 22(1-2):9–38, 2012.
[El Maarry et al., 2014] Kinda El Maarry, Wolf-Tilo Balke, Hyunsouk Cho, Seung-won Hwang, and Yukino Baba. Skill ontology-based model for quality assurance in crowdsourcing. In DASFAA, pages 376–387, 2014.
[Fan et al., 2015] Ju Fan, Guoliang Li, Beng Chin Ooi, Kian-Lee Tan, and Jianhua Feng. iCrowd: An adaptive crowdsourcing framework. In SIGMOD, pages 1015–1030, 2015.
[Goel, Nikzad, and Singla, 2014] Gagan Goel, Afshin Nikzad, and Adish Singla.
Allocating tasks to workers with matching constraints: truthful mechanisms for crowdsourcing markets. In WWW, pages 279–280, 2014.
[Haas et al., 2015] Daniel Haas, Jason Ansel, Lydia Gu, and Adam Marcus. Argonaut: macrotask crowdsourcing for complex data processing. Proceedings of the VLDB Endowment, 8(12):1642–1653, 2015.
[Ho and Vaughan, 2012] Chien-Ju Ho and Jennifer Wortman Vaughan. Online task assignment in crowdsourcing markets. In AAAI, pages 45–51, 2012.
[Ho et al., 2013] Chien-Ju Ho, Shahin Jabbari, and Jennifer Wortman Vaughan. Adaptive task assignment for crowdsourced classification. In ICML, pages 534–542, 2013.
[Ipeirotis, 2010] Panagiotis G. Ipeirotis. Analyzing the Amazon Mechanical Turk marketplace. XRDS: Crossroads, The ACM Magazine for Students, 17(2):16–21, 2010.
[Kobren et al., 2015] Ari Kobren, Chun How Tan, Panagiotis Ipeirotis, and Evgeniy Gabrilovich. Getting more for less: Optimized crowdsourcing with dynamic tasks and goals. In WWW, pages 592–602, 2015.
[Liu et al., 2018] Liyuan Liu, Jingbo Shang, Xiang Ren, Frank Fangzheng Xu, Huan Gui, Jian Peng, and Jiawei Han. Empower sequence labeling with task-aware neural language model. In AAAI, pages 5253–5260, 2018.
[Lykourentzou and Schmitz, 2015] Ioanna Lykourentzou and Heinz Schmitz. An online approach to task assignment and sequencing in expert crowdsourcing. In HCOMP, pages 1–2, 2015.
[Magazine and Chern, 1984] Michael J. Magazine and Maw Sheng Chern. A note on approximation schemes for multidimensional knapsack problems. Mathematics of Operations Research, 9(2):244–247, 1984.
[Mavridis et al., 2016] Panagiotis Mavridis, David Gross-Amblard, and Zoltán Miklós. Using hierarchical skills for optimized task assignment in knowledge-intensive crowdsourcing. In WWW, pages 843–853, 2016.
[Qiu et al., 2016] Chenxi Qiu, Anna Cinzia Squicciarini, Barbara Carminati, James Caverlee, and Dev Rishi Khare. CrowdSelect: Increasing accuracy of crowdsourcing tasks through behavior prediction and user selection.
In CIKM, pages 539–548, 2016.
[Rasch, 1993] Georg Rasch. Probabilistic models for some intelligence and attainment tests. ERIC, 1993.
[Roy et al., 2015] Senjuti Basu Roy, Ioanna Lykourentzou, Saravanan Thirumuruganathan, Sihem Amer-Yahia, and Gautam Das. Task assignment optimization in knowledge-intensive crowdsourcing. The VLDB Journal, 24(4):467–491, 2015.
[Salek et al., 2013] Mahyar Salek, Yoram Bachrach, and Peter Key. Hotspotting: a probabilistic graphical model for image object localization through crowdsourcing. In AAAI, pages 1156–1162, 2013.
[Tang, 2019] Feilong Tang. Bidirectional active learning with gold-instance-based human training. In IJCAI, pages 5989–5996, 2019.
[Tang et al., 2019] Feilong Tang, Heteng Zhang, and Laurence T. Yang. Multipath cooperative routing with efficient acknowledgement for LEO satellite networks. IEEE Transactions on Mobile Computing, 18(1):179–192, 2019.
[Xu et al., 2018] Tong Xu, Hengshu Zhu, Chen Zhu, Pan Li, and Hui Xiong. Measuring the popularity of job skills in recruitment market: A multi-criteria approach. In AAAI, pages 2572–2579, 2018.
[Yuen, King, and Leung, 2011] Man-Ching Yuen, Irwin King, and Kwong-Sak Leung. Task matching in crowdsourcing. In iThings/CPSCom, pages 409–412, 2011.
[Zheng et al., 2015] Yudian Zheng, Jiannan Wang, Guoliang Li, Reynold Cheng, and Jianhua Feng. QASCA: a quality-aware task assignment system for crowdsourcing applications. In SIGMOD, pages 1031–1046, 2015.