Hiring Under Uncertainty

Manish Raghavan (Department of Computer Science, Cornell University), Manish Purohit (Google Research), Sreenivas Gollapudi (Google Research). Correspondence to: Manish Raghavan.

Abstract

In this paper we introduce the hiring under uncertainty problem to model the questions faced by hiring committees in large enterprises and universities alike. Given a set of n eligible candidates, the decision maker needs to choose the sequence of candidates to whom to make offers so as to hire the k best candidates. However, candidates may choose to reject an offer (for instance, due to a competing offer), and the decision maker has a time limit by which all positions must be filled. Given an estimate of the probability of acceptance for each candidate, the hiring under uncertainty problem is to design a strategy for making offers so that the total expected value of all candidates hired by the time limit is maximized. We provide a 2-approximation algorithm for the setting where offers must be made in sequence, an 8-approximation when offers may be made in parallel, and a 10-approximation for the more general stochastic knapsack setting with finite probes.

1. Introduction

Hiring is a core activity of any enterprise, where the timely fulfillment of staffing needs is critical to its functioning. In addition to estimating the quality and suitability of a candidate, the enterprise also needs to deal with the uncertainty that arises when good candidates reject the job offer. Balancing this trade-off between hiring good candidates and ensuring that all staffing needs are met by a deadline is one of the most challenging aspects of hiring in practice. A number of algorithmic questions inspired by hiring settings have been well studied in the literature (see Section 1.2), including the popular secretary problem and its many variants. This line of work focuses on the online nature of
(Proceedings of the 36th International Conference on Machine Learning, Long Beach, California, PMLR 97, 2019. Copyright 2019 by the author(s).)

the problem, and the key question tackled is how to find a good set of candidates when the pool of future candidates is unknown. However, this line of research does not model the other source of uncertainty: the candidate may choose to reject the job offer (for instance, due to a better competing offer), which in turn raises the question of hiring enough candidates by the deadline. During the hiring process, the quality (or value) of a candidate is often estimated by traditional means such as resume screening and formal interviews, and even via algorithmic techniques (see Section 1.2). On the other hand, machine learning models can estimate the probability that a given candidate will accept a job offer based on features such as the candidate's educational background, salary expectations, location preferences, and so on. Considering both the value and the offer acceptance probability of each candidate leads to a rich collection of optimization problems. In this paper, we initiate the study of the hiring under uncertainty problem, which aims to tackle the inherent trade-off at the heart of the hiring process: how should we make job offers under uncertainty so that all staffing needs are met by a deadline, and yet the best available candidates are hired?

Formally, we consider the following model as the basis for all the variants we present in this paper. There is a set of n candidates, and we need to hire k of them. We do this by making offers to candidates, which we'll also refer to more abstractly as probing a candidate. Each candidate i has a known value v_i and probability p_i of accepting an offer, independent of all other candidates. We have a deadline of t time steps, after which we can't make any further offers. It takes one time step to make an offer and receive an answer from a candidate.
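As a concrete illustration of this model (our sketch, not part of the paper), the following minimal Monte Carlo simulation plays out one realization of the sequential offer process under an assumed fixed offer order:

```python
import random

def simulate_offers(order, cands, k, t, rng):
    """One realization of the sequential model: probe candidates in the
    given order; each acceptance irrevocably fills a slot; stop after
    k hires or after t offers (one offer per time step)."""
    hired_value, hires, steps = 0.0, 0, 0
    for i in order:
        if hires == k or steps == t:
            break
        steps += 1
        p, v = cands[i]
        if rng.random() < p:  # candidate i accepts with probability p_i
            hired_value += v
            hires += 1
    return hired_value

# With all p_i = 1 the outcome is deterministic: hire the first k in order.
cands = [(1.0, 2.0), (1.0, 3.0), (1.0, 5.0)]  # (p_i, v_i) pairs
print(simulate_offers([2, 1, 0], cands, k=2, t=3, rng=random.Random(0)))  # → 8.0
```

Averaging this quantity over many random seeds estimates the expected value of a fixed (non-adaptive) offer order; the algorithms in this paper aim to maximize that expectation.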
Job offers are irrevocable, i.e., once a candidate accepts an offer, that position is filled and we cannot replace that candidate with a better one in the future. The total value of a chosen set of candidates is simply the sum of the individual candidate values. Our goal is to maximize the total expected value of the hired candidates. We also consider two natural generalizations of this model. First, we allow making parallel offers to multiple candidates in a given time step. Second, we consider the knapsack hiring problem, where each candidate i has a size s_i and we have a budget B on the total size of hired candidates. The knapsack hiring problem models the scenario in which the enterprise has a fixed budget and different candidates need to be offered different salaries. We note that in all settings, we do not require v_i to be known precisely; all of our results hold if v_i is only known in expectation. However, our results are sensitive to errors in p_i and s_i. Making them robust to such errors is an interesting subject for future work.

1.1. Our Contributions

We summarize our contributions in this study. In Section 2, we offer a 2-approximation algorithm for hiring k candidates under a constraint of making at most t sequential offers. In Section 3, we consider the parallel offers model, where we are allowed to make as many parallel offers in each time step as the number of unfilled positions remaining, and design an 8-approximation algorithm. In Section 4, we present a 10-approximation for the knapsack hiring problem, where each candidate has a different size and the decision maker is constrained by a total budget (as opposed to hiring k candidates). We draw a connection to other stochastic optimization problems such as stochastic matching and present a lower bound for the stochastic matching problem. Finally, we show the efficacy of our algorithms using simulations on data drawn from different distributions.

1.2.
Related Work

Theoretical questions inspired by hiring scenarios have long been studied in the online setting under the names of optimal stopping or secretary problems (Dynkin, 1963; Chow et al., 1964). A few extensions of this setting incorporate elements of our model. Kleinberg (2005) considers the case of hiring multiple candidates instead of the traditional single hire. An older line of work considers a version of the secretary problem in which candidates may stochastically reject offers, although this is typically modeled with a fixed rejection probability (Smith, 1975; Tamaki, 1991; 2000; Ano & Ando, 2000). In addition, more recent work on stochastic optimization considers a variety of related problems in the offline setting, including stochastic versions of submodular optimization (Asadpour et al., 2008; Gupta et al., 2017), knapsack (Dean et al., 2004; 2005; Bhalgat et al., 2011), bandits (Gupta et al., 2011; Ma, 2014), and matching (Bansal et al., 2010; Adamczyk et al., 2015; Baveja et al., 2018). Some special cases of our model (specifically, when one candidate is being hired) can be viewed as special cases of matching, and in fact the results we derive here provide lower bounds for stochastic matching. However, our model cannot in general be captured by any of these prior works. Algorithmic and data-driven approaches to hiring have become increasingly common with the rise of machine learning (Miller, 2015; Carmichael, 2015). In particular, there is a long line of work focused on predicting teacher quality from data (Kane & Staiger, 2008; Dobbie, 2011; Chalfin et al., 2016; Jacob et al., 2018). More broadly, Mullainathan & Spiess (2017) describe the integration of machine learning with traditional econometric techniques to better estimate quantities like employee performance. Furthermore, studying the gig economy, Kokkodis et al. (2015) use machine learning to estimate the likelihood that freelancers get hired.

2.
Hiring Problem: How to fill k positions sequentially?

In this section, we consider the basic hiring problem, where we want to hire k employees out of n potential candidates under a constraint of making at most t sequential offers.

2.1. Special case: Hiring a single employee (k = 1)

To develop some intuition about the problem, as well as to illustrate some of the challenges it poses, we begin with the case where k = 1, i.e., we only want to hire one candidate. One might hope that a simple greedy algorithm is optimal in this special case. Unfortunately, as we will show, a number of seemingly natural greedy algorithms¹ do not yield optimal solutions. However, we can still take advantage of structural properties of the solution. In particular, given a set of t candidates, the optimal order in which to make offers to them is in decreasing order of v_i. To see why, for any two candidates i and j, consider the four possible outcomes of making offers to them: both i and j accept, both reject, i accepts and j rejects, and vice versa. The only outcome in which the order of offers matters is when both accept, since the position goes to the candidate receiving the first offer, and the second offer is never made. In this case, it is clearly better to make the first offer to the candidate with higher value. Since the optimal algorithm must always make offers to candidates in decreasing order of value, we can write a dynamic program to compute the optimal subset of t candidates to potentially make offers to. Assume the candidates are sorted in non-increasing order of value, i.e., v_1 ≥ v_2 ≥ . . . ≥ v_n. Let S(i, s) be the optimal expected value that can be achieved with s time steps remaining by only considering candidates i through n. Then we have the recurrence

S(i, s) = max{ p_i v_i + (1 − p_i) S(i + 1, s − 1), S(i + 1, s) },

where the two terms correspond to either making an offer to candidate i or not.
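The recurrence above translates into a short bottom-up dynamic program. The sketch below is our illustration (not the authors' code); it assumes candidates are given as (p_i, v_i) pairs already sorted in non-increasing order of value:

```python
def best_single_hire(cands, t):
    """Optimal expected value for hiring one candidate (k = 1) with at most
    t sequential offers; cands = [(p_i, v_i), ...] with v_1 >= ... >= v_n."""
    n = len(cands)
    # S[i][s] = optimal expected value using candidates i..n-1 with s offers left.
    S = [[0.0] * (t + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        p, v = cands[i]
        for s in range(1, t + 1):
            offer = p * v + (1 - p) * S[i + 1][s - 1]  # make an offer to candidate i
            skip = S[i + 1][s]                          # pass over candidate i
            S[i][s] = max(offer, skip)
    return S[0][t]

# Two identical coin-flip candidates, two offers allowed:
print(best_single_hire([(0.5, 1.0), (0.5, 1.0)], t=2))  # → 0.75
```

The offer strategy itself can be recovered by retracing which of the two terms achieved the maximum at each (i, s).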
Note that S(1, t) then gives the value of the optimal solution, and the offer strategy can be found by retracing the choices of the dynamic program.

¹ For instance, sorting the candidates by decreasing p_i, v_i, or p_i v_i and then making at most t offers until one accepts.

2.2. General Problem: Hiring k employees (k > 1)

While the k = 1 case admits a clean solution, the general case where k > 1 is more complex. First, note that a simple k-approximation exists: with the dynamic program from Section 2.1, we can optimally fill a single slot. Doing so yields a candidate who is, in expectation, at least as good as any of the k candidates hired by the optimal strategy. In general, however, the optimal solution may display several non-monotonicities that make it difficult to extend the k = 1 solution.

Example 1. Consider the following instance with n = 4, t = 3, and k = 2:

(p_1, v_1) = (1, 1), (p_2, v_2) = (0.5, 1), (p_3, v_3) = (0.5, 1), (p_4, v_4) = (0.1, 2).

We will show that in the optimal strategy, the offers made are not necessarily monotone in acceptance probability, value, or expected value. First, note that any deterministic strategy can be represented as a binary decision tree, where each node in the tree corresponds to a candidate to whom an offer is made. The two branches are the resulting strategies if the offer is accepted or rejected. Taking the convention that the right branch corresponds to acceptance, the optimal solution for the above instance is as shown in Figure 1.

[Figure 1. An optimal solution to Example 1.]

Note that there are several counter-intuitive effects at play here. First, despite having the lowest acceptance probability and expected value, candidate 4 still receives an offer with probability 1/2. Second, the candidate with the highest expected value (candidate 1) receives an offer either last or not at all.
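Example 1 is small enough to check exhaustively. The sketch below (our illustration, not the paper's) computes the optimal adaptive value via dynamic programming over states (remaining candidates, slots left, time left); it confirms that the best first offer goes to candidate 2 (or 3), achieving expected value 1.8, versus 1.75 for offering candidate 1 first:

```python
from functools import lru_cache

# Example 1 candidates as (p_i, v_i), 0-indexed.
CANDS = [(1.0, 1.0), (0.5, 1.0), (0.5, 1.0), (0.1, 2.0)]

def optimal(cands, k, t):
    """Exact optimal adaptive value by exhaustive DP over subsets;
    exponential in n, fine for tiny instances."""
    @lru_cache(maxsize=None)
    def V(avail, k, t):
        if k == 0 or t == 0 or not avail:
            return 0.0
        best = 0.0
        for i in avail:
            p, v = cands[i]
            rest = avail - {i}
            # Offer to i: accepted w.p. p (fills a slot), rejected otherwise.
            best = max(best, p * (v + V(rest, k - 1, t - 1))
                             + (1 - p) * V(rest, k, t - 1))
        return best

    full = frozenset(range(len(cands)))

    def first_offer_value(i):
        p, v = cands[i]
        rest = full - {i}
        return p * (v + V(rest, k - 1, t - 1)) + (1 - p) * V(rest, k, t - 1)

    vals = [first_offer_value(i) for i in range(len(cands))]
    return max(vals), max(range(len(cands)), key=vals.__getitem__)

val, first = optimal(CANDS, k=2, t=3)
print(round(val, 6), first + 1)  # → 1.8 2 (first offer to candidate 2, not 1)
```

The same brute force also shows candidate 4 is probed exactly when candidate 2's offer is accepted, i.e., with probability 1/2, matching the tree in Figure 1.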
Finally, despite the fact that candidates 1, 2, and 3 all have the same value, it is strictly optimal to make an offer to candidate 2 (or 3) before candidate 1, even though candidate 1 accepts with higher probability. Thus, unlike in the k = 1 scenario, the optimal solution may not be value-ordered, so the dynamic programming approach discussed above cannot be optimal here. We conjecture that this problem is NP-hard for general k. In the remainder of this section, we present an approximation algorithm that runs in polynomial time and yields at least half of the optimal expected value. We first show that there exists a non-adaptive algorithm yielding a 2-approximation. Then, we show that a dynamic program similar to that in Section 2.1 gives an adaptive algorithm that is at least as good as any non-adaptive algorithm, and hence is also a 2-approximation.

2.2.1. ESTABLISHING AN ADAPTIVITY GAP OF 2

Gupta et al. (2017) study adaptivity gaps for stochastic probing problems in which the goal is to maximize a given submodular (or XOS) function over the set of active, probed elements. In this setting, each element e is active independently with probability p_e, and the set of elements that are probed must satisfy given prefix-closed constraints. The HIRING UNDER UNCERTAINTY problem does not quite fit into their framework, since their framework allows one to choose the best set of active, probed elements, while in our setting we are forced to hire the first k candidates that are active. Nevertheless, we can leverage some insights from Gupta et al. (2017) to show an adaptivity gap of 2 (as opposed to the factor of 3 they obtain for stochastic probing). Similar to the one shown in Figure 1, the optimal solution to any instance can be represented by a binary tree T. Each node u of T corresponds to a candidate i (denoted by cand(u)) and has two outgoing edges leading to subtrees for the cases in which candidate i is active (which happens with probability p_i) or inactive (which happens with probability 1 − p_i).
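To make the tree representation concrete, here is a minimal sketch (ours; the node layout and names are assumptions, not the paper's notation) of a strategy tree and of sampling a root-to-leaf walk from it according to the node probabilities:

```python
import random

class Node:
    """One node of a strategy tree T: an offer to candidate `cand`, with
    'yes'/'no' subtrees (None where the strategy stops)."""
    def __init__(self, cand, yes=None, no=None):
        self.cand, self.yes, self.no = cand, yes, no

def sample_path(root, cands, rng):
    """Random walk from the root: at each node, follow the yes edge with
    probability p_i (i = cand(u)) and the no edge otherwise; return the
    visited candidates and the edge labels taken."""
    path, edges = [], []
    node = root
    while node is not None:
        path.append(node.cand)
        p = cands[node.cand][0]
        if rng.random() < p:
            edges.append("yes")
            node = node.yes
        else:
            edges.append("no")
            node = node.no
    return path, edges

# With p = 1 everywhere, the walk deterministically follows yes edges.
cands = [(1.0, 1.0), (1.0, 2.0)]
tree = Node(0, yes=Node(1))
print(sample_path(tree, cands, random.Random(0)))  # → ([0, 1], ['yes', 'yes'])
```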
Any root-to-leaf path in this tree represents the sequence of offers made by the optimal algorithm in a particular realization. The tree T naturally defines a probability distribution π_T over root-to-leaf paths: start at the root and, at each node u, follow the yes edge with probability p_i, where i = cand(u), and the no edge otherwise. Since the optimal strategy can make offers to at most t candidates, any such path must have at most t nodes. Further, since any strategy can hire at most k candidates, any root-to-leaf path in T must have at most k − 1 yes edges. Thus any root-to-leaf path P can be decomposed into at most k segments, where a segment is a maximal sub-path composed of only no edges, as shown in Figure 2.

[Figure 2. A path with 3 segments: ABC, DE, and FGHI.]

Let segments(P) = {S_1, S_2, . . . , S_ℓ} denote the set of segments in P. For each segment S ∈ segments(P), let last(S) denote the last node on segment S. Given the optimal tree T, Procedure 1 samples a single path P according to the distribution π_T and then probes the candidates on each segment of P in descending order of value, hiring at most one candidate from each segment.

Procedure 1 Approx_Given_Tree(T)
1: P ← a random path sampled from π_T
2: S_1, . . . , S_ℓ ← P divided into at most k segments {each S_j is a list of candidates}
3: for j = 1, . . . , ℓ do
4:   S′_j ← S_j sorted in decreasing order of value
5:   for each candidate i in S′_j do
6:     make an offer to candidate i
7:     if i accepts then
8:       break
9:     end if
10:  end for
11: end for

In the rest of this section, we show that Procedure 1 yields, in expectation, at least half of the total expected value of T. Let val(T) = E_{P∼π_T}[ Σ_{j=1}^{ℓ} v_{last(S_j)} ] be the total expected value of the tree T (note that ℓ is a random variable). Similarly, let proc1(T) be the expected value obtained by Procedure 1 on tree T. For any segment S_j, we define segval(S_j) to be the expected value of the active candidate from S_j with the largest value.
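The segment decomposition and the probing loop of Procedure 1 can be transcribed directly; the sketch below is our illustration (a path is assumed to be given as candidate indices plus yes/no edge labels):

```python
import random

def split_segments(path, edges):
    """Decompose a root-to-leaf path into maximal runs of 'no' edges; each
    'yes' edge (an acceptance in the tree) closes the current segment."""
    segs, cur = [], []
    for cand, e in zip(path, edges):
        cur.append(cand)
        if e == "yes":
            segs.append(cur)
            cur = []
    if cur:
        segs.append(cur)
    return segs

def probe_segments(segments, cands, rng):
    """Procedure 1 after sampling: within each segment, probe candidates in
    decreasing order of value and hire at most one (the first to accept)."""
    total = 0.0
    for seg in segments:
        for i in sorted(seg, key=lambda j: -cands[j][1]):
            p, v = cands[i]
            if rng.random() < p:
                total += v
                break  # at most one hire per segment
    return total

# A path whose third offer was accepted splits into two segments:
print(split_segments([0, 1, 2, 3, 4], ["no", "no", "yes", "no", "no"]))
# → [[0, 1, 2], [3, 4]]
```

Since the sampled path has at most t nodes and the segments partition it, the procedure makes at most t offers in every realization, so it respects the deadline.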
Formally, if S_j consists of candidates {1, 2, . . . , |S_j|} sorted in non-increasing order of their values, then