Risk-Aware Proactive Scheduling via Conditional Value-at-Risk

Wen Song, Donghun Kang, Jie Zhang^a, Hui Xi^b
Rolls-Royce@NTU Corp Lab, Nanyang Technological University, Singapore
^a School of Computer Science and Engineering, Nanyang Technological University, Singapore
^b Rolls-Royce Singapore Pte Ltd, Singapore
{songwen, donghun.kang, zhangj}@ntu.edu.sg, hui.xi@rolls-royce.com

In this paper, we consider the challenging problem of risk-aware proactive scheduling with the objective of minimizing the robust makespan. State-of-the-art approaches based on probabilistic constrained optimization lead to Mixed Integer Linear Programs that must be heuristically approximated. We optimize the robust makespan via a coherent risk measure, Conditional Value-at-Risk (CVaR). Since traditional CVaR optimization approaches assume linear decision spaces and do not suit our problem, we propose a general branch-and-bound framework for combinatorial CVaR minimization. We then design a complete algorithm for the approximate problem, and employ resource reasoning to enable constraint propagation over multiple samples. Empirical results show that our algorithm outperforms state-of-the-art approaches with higher solution quality.

1 Introduction

Real-world scheduling applications often face dynamic situations due to considerable amounts of uncertainty, such as equipment breakdowns, delays of certain tasks, and unforeseen weather conditions. Practical scheduling approaches should therefore take uncertainty into account. Specifically, the scheduling problem we consider is the challenging Resource-Constrained Project Scheduling Problem (RCPSP) with uncertain activity durations. In line with (Beck and Wilson 2007; Fu et al. 2012; Varakantham, Fu, and Lau 2016; Fu, Varakantham, and Lau 2016), we investigate proactive scheduling approaches for optimizing a risk-aware objective, namely minimization of the α-robust makespan, which controls the probability that the actual makespan exceeds the α-robust makespan within a predefined risk parameter $\alpha \in (0, 1)$. Compared to the expected makespan adopted in (Lombardi, Milano, and Benini 2013; Creemers 2015; Song et al. 2017), the α-robust makespan is more practical, since the actual makespan may, with high probability, be much worse than the expected value.

The incorporation of uncertainty brings additional challenges to the already intractable RCPSP, since it is hard even to evaluate a solution (Beck and Wilson 2007). Existing approaches often resort to sampling-based techniques to mitigate the complexity. For example, Beck and Wilson (2007) and Fu et al. (2012) use sampling and simulation to evaluate solutions. State-of-the-art approaches (Varakantham, Fu, and Lau 2016; Fu, Varakantham, and Lau 2016) treat α-robust makespan minimization as a probabilistic constrained problem, which can be approximated by Sample Average Approximation (SAA) with the guarantee of converging to the optimal solution (Luedtke and Ahmed 2008). However, these approaches result in Mixed Integer Linear Programs (MILPs) which are computationally prohibitive even with sophisticated solvers. To scale up the MILPs, a summarization heuristic is applied to aggregate the samples into a single representative one. However, this heuristic forfeits the convergence guarantee and decreases solution quality.
In this paper, we propose to optimize the α-robust makespan via Conditional Value-at-Risk (CVaR), a popular measure in risk-sensitive applications (Rockafellar and Uryasev 2002). Our approach scales to hundreds of samples without the need for sample summarization, and hence provides better robust makespans and better control of the risk parameter α. Based on the expectation form of CVaR minimization, we approximate the proactive problem using SAA. However, we show that the resulting problem is NP-hard due to the combinatorial nature of RCPSP. This also rules out the traditional CVaR minimization approaches that assume the decision space is linear (Hong, Hu, and Liu 2014). Thus, we design a general branch-and-bound framework for CVaR minimization in combinatorial spaces. We then instantiate this framework into a branch-and-bound algorithm based on constraint propagation and network flow theory, extending the related components in (Leus and Herroelen 2004) designed for single-resource problems. Our algorithm shares similarities with the Precedence Constraint Posting (PCP) approaches in (Laborie 2005; Lombardi, Milano, and Benini 2013). However, existing PCP approaches are designed for deterministic RCPSP, where temporal reasoning can be applied for branching and constraint propagation. In contrast, our problem is built on multiple duration samples, which makes temporal reasoning very difficult. Our algorithm therefore reasons purely with resource constraints, except for the lower bound computation. Finally, we conduct extensive experiments on benchmark instances and commonly used uncertainty models. The results show that our approach scales well to a large number of samples, and can produce solutions with lower α-robust makespan than state-of-the-art approaches, especially when the uncertainty level is high.

2 Preliminaries

This section introduces the basic concepts and notations.

2.1 Minimization of VaR and CVaR

We first describe two widely used measures for risk management, Value-at-Risk (VaR) and Conditional Value-at-Risk (CVaR). Let $x$ be a decision vector with domain $X$, and $g = g(x, y)$ be the loss function of $x$ on a random vector $y$. Given a confidence level $\beta \in (0, 1)$, the β-VaR of $x$ is defined as $\zeta_\beta(x) = \min\{\zeta \mid \Pr(g \le \zeta) \ge \beta\}$, which is the $\beta$-quantile of the random loss $g$. The β-CVaR of $x$ is defined as $\phi_\beta(x) = \mathbb{E}[g \mid g \ge \zeta_\beta(x)]$, which is the expected loss beyond the β-VaR. In risk-aware settings, an $x$ with smaller VaR or CVaR is preferable. Hence, the best decision $x^*$ can be found by minimizing VaR or CVaR over $X$.

In the theory of risk management, CVaR is believed to be a more realistic and desirable objective than VaR, mainly for two reasons. Firstly, CVaR is computationally more tractable than VaR since it is mathematically coherent. Also, CVaR is an upper bound for VaR (i.e. $\phi_\beta(x) \ge \zeta_\beta(x)$), and a decision with a smaller CVaR tends to have a smaller VaR too. Secondly, VaR only provides a bound for the loss $g$ but does not quantify the loss beyond that bound. In contrast, by definition, CVaR explicitly captures this using the conditional expectation. A detailed discussion of the superiority of CVaR over VaR can be found in (Rockafellar and Uryasev 2002).

The minimization of CVaR is often done by minimizing a function $F_\beta = F_\beta(x, \omega)$ defined as follows:

$$F_\beta(x, \omega) = \omega + \frac{1}{1-\beta}\,\mathbb{E}\big[[g(x, y) - \omega]^+\big], \quad (1)$$

where $\omega$ is an additional real variable and $[\cdot]^+ = \max\{\cdot, 0\}$.
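As a concrete illustration of these definitions (not part of the paper's contribution), the following self-contained Python sketch estimates the β-VaR and β-CVaR of a sampled loss distribution and evaluates a sample-average version of $F_\beta$; the function names and the quantile-indexing convention are our own assumptions.

```python
import numpy as np

def var_cvar(losses, beta):
    """Empirical beta-VaR and beta-CVaR of a sample of losses."""
    losses = np.sort(np.asarray(losses, dtype=float))
    q = len(losses)
    var = losses[int(np.ceil(beta * q)) - 1]   # empirical beta-quantile
    tail = losses[losses >= var]               # losses at or beyond beta-VaR
    return var, tail.mean()

def F_beta(omega, losses, beta):
    """Sample-average version of the Rockafellar-Uryasev function (1)."""
    losses = np.asarray(losses, dtype=float)
    return omega + np.mean(np.maximum(losses - omega, 0.0)) / (1.0 - beta)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    g = rng.exponential(10.0, size=1000)       # toy random losses g(x, y^q)
    beta = 0.8
    var, cvar = var_cvar(g, beta)
    # Minimizing F_beta over omega recovers (approximately) the CVaR,
    # and a minimizer omega* approximates the VaR.
    omegas = np.linspace(g.min(), g.max(), 2001)
    print(var, cvar, min(F_beta(w, g, beta) for w in omegas))
```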
It has been shown in (Rockafellar and Uryasev 2002) that CVaR minimization over $X$ has an equivalent form:

$$(x^*, \omega^*) = \operatorname*{argmin}_{(x,\omega) \in X \times \mathbb{R}} F_\beta(x, \omega), \quad (2)$$

where $x^*$ minimizes CVaR. Since $F_\beta$ has an expectation form, Sample Average Approximation (SAA) (Kleywegt, Shapiro, and Homem-de-Mello 2002) is immediately applicable to approximate Problem (2), by optimizing $\hat F_\beta = \hat F_\beta(x, \omega)$ defined below over the joint space $X \times \mathbb{R}$:

$$\hat F_\beta(x, \omega) = \omega + \frac{1}{(1-\beta)Q} \sum_{q=1}^{Q} [g(x, y^q) - \omega]^+, \quad (3)$$

where $(y^1, \ldots, y^Q)$ are $Q$ samples independently drawn from $y$. Guaranteed by the properties of SAA, the optimal solution $(\hat x^*, \hat\omega^*)$ is proven to converge to $(x^*, \omega^*)$ in Equation (2) exponentially fast as the sample size $Q$ increases.

2.2 RCPSP with Uncertain Durations

The deterministic RCPSP involves $N$ non-preemptive activities $A = \{a_1, \ldots, a_N\}$ and $K$ renewable resources $R = \{r_1, \ldots, r_K\}$. Throughout its fixed duration $d^0_i$, each $a_i \in A$ requires $b_{ik} \in \mathbb{N}$ units of each $r_k \in R$, which has limited capacity $C_k \in \mathbb{N}$. Two dummy activities $a_0$ and $a_{N+1}$ with zero durations are often added to represent project start and finish. Precedence constraints $(i, j)$ can be imposed on any two activities $a_i, a_j \in A_P = A \cup \{a_0, a_{N+1}\}$, indicating that $a_j$ must start after the completion of $a_i$, i.e. $s_j \ge s_i + d^0_i$, where $s_i$ is the start time of $a_i$. Let $E_P$ be the set of all precedence constraints; then the temporal relations of the project can be represented using the Activity-On-Node (AON) network $G_P = (A_P, E_P)$, which is a Directed Acyclic Graph (DAG) with vertex set $A_P$ and edge set $E_P$. Below we use $V(G)$, $E(G)$, and $Tr(G)$ to denote the vertex set, edge set, and transitive closure of a graph $G$.

A (start-time) schedule is a vector $S = (s_0, \ldots, s_{N+1})$. Without loss of generality, we assume $a_0$ always starts at 0; the makespan is then $MS(S) = s_{N+1}$. A schedule is feasible if no precedence or resource constraint is violated. A feasible schedule is optimal if it has the minimal makespan.

Under duration uncertainty, each $a_i$ has a random duration $\tilde d_i$ instead of a fixed one. Let $\tilde d = (\tilde d_1, \ldots, \tilde d_N)$ be the vector of random durations, and $d = (d_1, \ldots, d_N)$ a realization of $\tilde d$. Since $d_i$ may not equal $d^0_i$, a feasible start-time schedule may become infeasible in execution. Alternatively, a project can be executed following a partial-order schedule (POS), a flexible policy that determines activity start times during execution. A POS $G_H = (A_P, E_P \cup E_H)$ is an augmented DAG of $G_P$ with additional precedence constraints $E_H$, such that any temporally feasible schedule is also resource feasible (Policella et al. 2004). In other words, all possible resource conflicts are removed by the edges in $E_H$. Hence the start time of each $a_i$ can be determined very easily at execution time:

$$s_i = \max\{s_j + d_j \mid (j, i) \in E(G_H)\}, \quad (4)$$

i.e. $a_i$ starts right after the completion of all its predecessors specified by $G_H$. Along with execution, a feasible schedule with respect to the actual durations $d$ is obtained. Let $S(G_H, d)$ and $MS(G_H, d)$ be this schedule and its makespan. Since $\tilde d$ is random, the makespan of $G_H$ is also a random variable $MS(G_H, \tilde d)$. Let $\mathcal{G}_H$ be the set of all POSs.
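To make the execution semantics of a POS concrete, here is a minimal Python sketch (ours; names such as `pos_makespan` are hypothetical) that applies Equation (4) in topological order to one duration sample and returns the realized schedule and makespan.

```python
from collections import defaultdict

def pos_makespan(n, edges, durations):
    """Execute a POS G_H on one duration sample via Equation (4).

    Activities are 0..n+1 (0 and n+1 are the dummy start/finish);
    `edges` are the precedence pairs E_P + E_H of the POS (a DAG);
    `durations` maps activity -> realized duration (dummies have 0).
    Returns the realized schedule and its makespan s_{n+1}."""
    preds, succs = defaultdict(list), defaultdict(list)
    indeg = {i: 0 for i in range(n + 2)}
    for i, j in edges:
        preds[j].append(i)
        succs[i].append(j)
        indeg[j] += 1
    # Kahn's algorithm: when an activity is popped, all of its
    # predecessors already have their start times fixed.
    s = {i: 0 for i in range(n + 2)}
    ready = [i for i in range(n + 2) if indeg[i] == 0]
    while ready:
        i = ready.pop()
        s[i] = max((s[j] + durations[j] for j in preds[i]), default=0)
        for k in succs[i]:
            indeg[k] -= 1
            if indeg[k] == 0:
                ready.append(k)
    return s, s[n + 1]

# Example: two real activities in series, one duration sample.
schedule, ms = pos_makespan(2, [(0, 1), (1, 2), (2, 3)], {0: 0, 1: 4, 2: 6, 3: 0})
print(schedule, ms)   # makespan 10
```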
Another concept used here is the AON-flow network (Artigues, Michelon, and Reusser 2003), which is also an augmented DAG $G_F = (A_P, E_P \cup E_F)$ of $G_P$, but where $E_P \cap E_F$ is not necessarily empty. Each $(i, j) \in E_F$ is associated with a flow vector $f_{ij} = (f_{ij1}, \ldots, f_{ijK})$ denoting the amount of each $r_k$ transferred from $a_i$ to $a_j$. Let the requirement for each $r_k$ of the two dummy activities be $b_{0k} = b_{N+1,k} = C_k$. Then $G_F$ should satisfy the following conditions:

1) Positive flow: $\sum_{r_k \in R} f_{ijk} > 0$ for all $(i, j) \in E_F$;
2) Inflow balance: $\sum_{(j,i) \in E_F} f_{jik} = b_{ik}$ for all $r_k \in R$ and $a_i \in A_P \setminus \{a_0\}$;
3) Outflow balance: $\sum_{(i,j) \in E_F} f_{ijk} = b_{ik}$ for all $r_k \in R$ and $a_i \in A_P \setminus \{a_{N+1}\}$.

Let $\mathcal{G}_F$ be the set of AON-flow networks. Apparently, for any $G_F \in \mathcal{G}_F$, $G_H = (A_P, E(G_F))$ is a POS, since the resource conflicts are resolved by the flows. In Section 5.1 we will show that we can determine whether a DAG $G_V$ is a POS by checking whether it can accommodate a feasible flow.

3 CVaR-based Proactive Scheduling

As mentioned, we aim to optimize the α-robust makespan. Here we give its definition following the one in (Beck and Wilson 2007) for job-shop problems. Given a risk parameter $\alpha \in (0, 1)$, a real value $D$ is α-achievable for a POS $G_H$ if $\Pr(MS(G_H, \tilde d) \le D) \ge 1 - \alpha$, i.e. the probability that the random makespan exceeds $D$ is bounded by $\alpha$. The α-robust makespan of $G_H$ is then the minimum over all α-achievable $D$, i.e. $D_\alpha(G_H) = \min\{D \mid \Pr(MS(G_H, \tilde d) \le D) \ge 1 - \alpha\}$. The proactive problem can be formulated as:

$$G_H^* = \operatorname*{argmin}_{G_H \in \mathcal{G}_H} D_\alpha(G_H). \quad (5)$$

It is easy to verify that $D_\alpha(G_H)$ is the β-VaR of $G_H$ on $\tilde d$, with $\beta = 1 - \alpha$ and the loss being $MS(G_H, \tilde d)$. Hence Problem (5) is equivalent to minimizing the β-VaR. The typical way to minimize VaR is to transform it into a chance-constrained optimization problem (Hong, Hu, and Liu 2014). However, for Problem (5), this results in MILPs that are computationally prohibitive, as shown in (Varakantham, Fu, and Lau 2016; Fu, Varakantham, and Lau 2016). Therefore, we take a different approach here and optimize CVaR instead of VaR. More specifically, we optimize the approximate function $\hat F_\beta$ defined in Equation (3) on independent samples $\{d^1, \ldots, d^Q\}$ drawn from $\tilde d$:

$$(\hat G_H^*, \hat\omega^*) = \operatorname*{argmin}_{(G_H, \omega) \in \mathcal{G}_H \times \mathbb{R}} \hat F_\beta(G_H, \omega). \quad (6)$$

The optimal solution $(\hat G_H^*, \hat\omega^*)$ of the above problem converges to the actual CVaR-minimizing solution $(G_H^*, \omega^*)$ exponentially fast as $Q$ increases.

As mentioned in (Hong, Hu, and Liu 2014), most CVaR optimization approaches assume that the decision space $X$ is linear. In that case, the optimization of $\hat F_\beta$ can be transformed into a linear program which can be solved efficiently. However, our problem does not comply with this assumption, since the POS space $\mathcal{G}_H$ is combinatorial.
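For intuition on the VaR view of $D_\alpha$, the following small sketch (ours; naming is assumed) estimates the α-robust makespan from sampled makespans as the empirical $(1-\alpha)$-quantile, and evaluates the sampled objective $\hat F_\beta$ of Problem (6) with $\beta = 1 - \alpha$.

```python
import math

def alpha_robust_makespan(makespans, alpha):
    """Empirical alpha-robust makespan: smallest D with
    Pr(MS <= D) >= 1 - alpha on the samples, i.e. the (1-alpha)-quantile."""
    ms = sorted(makespans)
    k = math.ceil((1.0 - alpha) * len(ms))   # rank of the (1-alpha)-quantile
    return ms[k - 1]

def saa_objective(omega, makespans, alpha):
    """Sampled objective F_hat_beta of Problem (6), with beta = 1 - alpha,
    so that 1 - beta = alpha."""
    q = len(makespans)
    return omega + sum(max(m - omega, 0.0) for m in makespans) / (alpha * q)

samples = [42, 45, 47, 50, 58, 61, 64, 70, 77, 95]   # toy MS(G_H, d^q) values
print(alpha_robust_makespan(samples, alpha=0.2))      # -> 70 (8th of 10)
print(saa_objective(70.0, samples, alpha=0.2))        # -> 86.0
```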
In fact, we can show that it is NP-hard to optimize $\hat F_\beta(G_H, \omega)$:

Proposition 1. The optimization problem (6) is NP-hard.

Proof. Our proof follows the one for RCPSP in (Blazewicz, Lenstra, and Kan 1983), which reduces the NP-complete problem Partition Into Triangles (PIT) to RCPSP. The PIT problem is: given a graph $G = (V, E)$ with $|V| = 3t$, can we partition $V$ into $t$ disjoint subsets, each containing three pairwise adjacent vertices? For any PIT instance, we construct an instance of Problem (6) as follows. First, we create an RCPSP as in (Blazewicz, Lenstra, and Kan 1983): for each $i \in V$, create an activity $a_i$; for each $(i, j) \notin E$, create a resource $r_{ij}$ with $C_{ij} = 1$, which is consumed by $a_i$ and $a_j$ only, each with a requirement of 1. Next, we add one sample $d$ with $d_i = 1$ for all $a_i$, and obtain an instance of Problem (6) with the objective $\hat F_\beta(G_H, \omega) = \omega + \frac{1}{1-\beta}[MS(G_H, d) - \omega]^+$. We claim that this problem has a solution $(G_H, \omega)$ with $\hat F_\beta(G_H, \omega) \le t$ if and only if the corresponding PIT instance has a solution.

Firstly, if the PIT instance has a solution, then a feasible schedule $S$ with $MS(S) \le t$ can be obtained immediately. Then, on each resource unit, we sort the consuming activities in ascending order of their start times in $S$, and create a POS $G_H$ by adding a precedence constraint between each activity and its immediate successor on each resource it consumes. Clearly $MS(G_H, d) \le t$; hence we have a solution $(G_H, \omega)$ with $\omega = MS(G_H, d)$ and $\hat F_\beta(G_H, \omega) = MS(G_H, d) \le t$. Secondly, if Problem (6) has a solution $(G_H, \omega)$ with $\hat F_\beta(G_H, \omega) \le t$, then we must have $MS(G_H, d) \le t$. This is because if $MS(G_H, d) = t' > t$, then the function $\hat F_\beta(\omega) = \omega + \frac{1}{1-\beta}[t' - \omega]^+$ has infimum $t'$, implying $\hat F_\beta(G_H, \omega) > t$ for any $\omega \in \mathbb{R}$. Hence the schedule $S = S(G_H, d)$ has makespan $MS(S) \le t$, indicating that the PIT instance has a solution. ∎

Therefore, linear approaches cannot be applied to Problem (6). In the next section, we design a general branch-and-bound framework for combinatorial CVaR minimization.

4 A Branch-and-Bound Framework for Combinatorial CVaR Minimization

In general, a branch-and-bound algorithm iteratively partitions the solution space into smaller pieces, and uses a bounding function to fathom the search in certain pieces. Here we aim at designing a branch-and-bound framework for minimizing $\hat F_\beta$ defined in Equation (3) over the solution space $X \times \mathbb{R}$, where $X$ is combinatorial. Our framework only partitions $X$, since $\omega$ is an unbounded real variable which is relatively easy to optimize.

The core component of a branch-and-bound algorithm is the bounding function. Below we design a lower bounding function for minimizing $\hat F_\beta$. Given a subset of decisions $X' \subseteq X$, suppose we can lower bound the loss function $g$ on each sample $y^q$ for $X'$ by calling a function $g_{LB}(X', y^q)$. We define an auxiliary function $L_\beta$ as follows:

$$L_\beta(X', \omega) = \omega + \frac{1}{(1-\beta)Q} \sum_{q=1}^{Q} [g_{LB}(X', y^q) - \omega]^+. \quad (7)$$

Then we have the following conclusion:

Proposition 2. Define $LB(X')$ as follows:

$$LB(X') = \min_{\omega \in \mathbb{R}} L_\beta(X', \omega); \quad (8)$$

then $LB(X')$ is a lower bound for $X'$.

Proof. For any decision $x \in X'$ and sample $y^q$, we have $g_{LB}(X', y^q) \le g(x, y^q)$ since $g_{LB}$ is a lower bound. Then for any $\omega \in \mathbb{R}$, we have $L_\beta(X', \omega) \le \hat F_\beta(x, \omega)$. In other words, for any $x \in X'$, the function $L_\beta$ is pointwise smaller than or equal to $\hat F_\beta$ with respect to $\omega$. Therefore the minimum value of $L_\beta$ with respect to $\omega$, i.e. $LB(X')$, satisfies $LB(X') \le \hat F_\beta(x, \omega)$ for any $(x, \omega) \in X' \times \mathbb{R}$. ∎

Remark. We can also conclude that if $g_{LB}$ is stronger, then $LB(X')$ is also stronger, which leads to more effective pruning. For any $X'$ and $y^q$, if $g^1_{LB}(X', y^q) \ge g^2_{LB}(X', y^q)$, then the corresponding functions $L^1_\beta$ and $L^2_\beta$ satisfy $L^1_\beta(X', \omega) \ge L^2_\beta(X', \omega)$ for any $\omega \in \mathbb{R}$. Therefore, $\min_{\omega \in \mathbb{R}} L^1_\beta(X', \omega) \ge \min_{\omega \in \mathbb{R}} L^2_\beta(X', \omega)$.

According to Proposition 2, the lower bound value can be computed in two steps: 1) compute the lower bound on each sample; 2) solve the optimization problem (8). It is easy to verify that $L_\beta$ is convex (though not differentiable) with respect to $\omega$; therefore (8) is a univariate convex optimization problem, which can be solved by standard techniques such as the subgradient method. Nevertheless, we can show that Problem (8) can be solved more efficiently by simply ranking the sample lower bounds.

Proposition 3. If the $Q$ sample lower bound values are ranked ascendingly as $g^1_{LB} \le \ldots \le g^Q_{LB}$, then $\omega^* = g^{k^*}_{LB}$ with $k^* = \lceil \beta Q \rceil$ solves Problem (8) optimally.

Proof. The ranked sample lower bounds split $\mathbb{R}$ into a set of intervals $(-\infty, g^1_{LB}], \ldots, [g^k_{LB}, g^{k+1}_{LB}], \ldots, [g^Q_{LB}, \infty)$. It is easy to verify that $L_\beta$ is linear in each interval and continuous on $\mathbb{R}$. We can then write the derivative of $L_\beta$ with respect to $\omega$: when $\omega \le g^1_{LB}$, $L'_\beta = -\beta/(1-\beta) < 0$; for any integer $k \in [1, Q-1]$, when $\omega \in [g^k_{LB}, g^{k+1}_{LB}]$, $L'_\beta = (k - \beta Q)/(Q - \beta Q)$; when $\omega \ge g^Q_{LB}$, $L'_\beta = 1 > 0$. Hence, as $\omega$ increases over $\mathbb{R}$, $L'_\beta$ increases from negative to positive. The smallest $k$ that makes $L'_\beta \ge 0$ is $k^* = \lceil \beta Q \rceil$, meaning that $L_\beta$ stops decreasing in $[g^{k^*}_{LB}, g^{k^*+1}_{LB}]$, and $\omega^* = g^{k^*}_{LB}$ is an optimal solution to Problem (8). ∎
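Proposition 3 translates directly into code: sort the per-sample lower bounds, pick the $\lceil \beta Q \rceil$-th smallest as $\omega^*$, and evaluate $L_\beta$ there. A minimal sketch (ours; names assumed):

```python
import math

def lower_bound(sample_lbs, beta):
    """LB(X') of Propositions 2-3: minimize L_beta over omega by ranking
    the per-sample lower bounds g_LB(X', y^q)."""
    g = sorted(sample_lbs)
    q = len(g)
    k = math.ceil(beta * q)            # smallest k with nonnegative slope
    omega = g[k - 1]                   # omega* = k-th smallest bound
    lb = omega + sum(max(v - omega, 0.0) for v in g) / ((1.0 - beta) * q)
    return omega, lb

# Ten per-sample lower bounds, beta = 0.8 -> omega* is the 8th smallest (20),
# and LB = 20 + (5 + 10) / (0.2 * 10) = 27.5.
print(lower_bound([12, 15, 9, 20, 14, 30, 11, 18, 25, 16], beta=0.8))
```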
Therefore, $LB(X')$ can be obtained very easily after computing the sample lower bounds. With proper branching functions to partition the solution space $X$, the branch-and-bound process can then be executed to find the optimal solution $(\hat x^*, \hat\omega^*)$. Note that when a feasible decision $x' \in X$ is reached, a candidate $(x', \omega')$ for the optimal solution can be obtained by fixing the loss $g(x', y^q)$ in Equation (3) for each sample, and minimizing $\hat F_\beta$ with respect to $\omega$ following a ranking procedure similar to that for minimizing $L_\beta$. The sample losses can also be used to retrieve the (approximate) β-VaR of $x'$ on the samples $(y^1, \ldots, y^Q)$, i.e. $\hat\zeta_\beta(x') = \max\{g(x', y^q) \mid g(x', y^q) \le \omega'\}$ (Rockafellar and Uryasev 2002). This framework is general and applicable to any combinatorial CVaR minimization problem, as long as the sample bounding function $g_{LB}$ is available.

5 The Proactive Scheduling Algorithm

In this section, we instantiate our CVaR minimization framework to solve Problem (6). We first describe how to check whether a given augmented DAG is a partial-order schedule (POS).

5.1 POS Checking

When there is only one resource $r$ with capacity $C$, it has been shown in (Leus and Herroelen 2004) that, for a given augmented DAG $G_V = (A_P, E_P \cup E_V)$, the existence of a feasible flow can be checked by computing a maximum flow in a transformed network $G'_V$ constructed as follows: 1) create two vertices $i_s$ and $i_t$ for each $a_i \in A$, and one vertex each for $a_0$ and $a_{N+1}$, named $0_s$ and $(N+1)_t$ respectively; 2) create two vertices $s$ and $t$ with an edge $(t, s)$ as the virtual source and sink, and add edges $(s, i_s)$ and $(i_t, t)$ for all $a_i \in A_P$; 3) for each $(i, j) \in E(G_V)$, add an edge $(i_s, j_t)$. Each $(s, i_s)$ and $(i_t, t)$ has a capacity $b_i$ equal to the resource requirement of $a_i$, while the capacities of all other edges are $+\infty$. An example of this transformation is shown in Figure 1.

[Figure 1: An example of network transformation (left: original DAG $G_V$; right: transformed network $G'_V$, where integers beside edges represent capacities)]

Let $f(G'_V)$ be the maximum $(s, t)$-flow value in $G'_V$; then there exists an AON-flow network $G_F$ with $E(G_F) \subseteq E(G_V)$ if and only if $f(G'_V) = f_{max}$, where $f_{max} = C + \sum_{a_i \in A} b_i$. Moreover, a feasible flow in $G_V$ can be obtained by setting $f_{ij}$ to the flow value on the edge $(i_s, j_t)$ in $G'_V$. Due to the integrality property of the maximum flow problem, all $f_{ij}$ are integers. The above procedure can be carried out efficiently using standard maximum flow algorithms (e.g. the Edmonds-Karp algorithm).

Here we extend the above procedure to support multiple resources. For each $r_k$, we maintain a transformed network $G'^k_V$ for a given DAG $G_V$. These networks share the same edge set, while the edge capacities of $G'^k_V$ are set according to $b_{ik}$. Let $f^k_{max} = C_k + \sum_{a_i \in A} b_{ik}$ for each $r_k$; then there exists an AON-flow network $G_F$ with $E(G_F) \subseteq E(G_V)$ if and only if $f(G'^k_V) = f^k_{max}$ holds for all $r_k \in R$. Furthermore, we can show that whether a DAG is a POS can be checked in polynomial time, by checking the existence of an AON-flow network.
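The single-resource check of Section 5.1 is easy to prototype. The sketch below is our own illustration, using a textbook Edmonds-Karp maximum flow; the paper's virtual edge $(t, s)$ is implicit in computing an ordinary $(s, t)$-maximum flow, and all names are assumptions.

```python
from collections import deque, defaultdict

INF = float("inf")

def max_flow(cap, s, t):
    """Edmonds-Karp on a nested capacity map cap[u][v] (mutated in place)."""
    flow = 0
    while True:
        parent = {s: None}                     # BFS for a shortest augmenting path
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        b, v = INF, t                          # bottleneck along the path
        while parent[v] is not None:
            b, v = min(b, cap[parent[v]][v]), parent[v]
        v = t
        while parent[v] is not None:           # update residual capacities
            u = parent[v]
            cap[u][v] -= b
            cap[v][u] += b
            v = u
        flow += b

def pos_feasible(n, edges, b, C):
    """Single-resource check of Section 5.1: does the DAG G_V admit a
    feasible resource flow? b[i] is the demand of activity i, with the
    dummies' demand b[0] = b[n+1] = C."""
    cap = defaultdict(lambda: defaultdict(int))
    for i in range(n + 2):
        if i != n + 1:
            cap["S"][("s", i)] = b[i]          # (s, i_s): a_i passes on b_i units
        if i != 0:
            cap[("t", i)]["T"] = b[i]          # (i_t, t): a_i receives b_i units
    for i, j in edges:
        cap[("s", i)][("t", j)] = INF          # (i_s, j_t) for each (i,j) in E(G_V)
    f_max = C + sum(b[1:n + 1])
    return max_flow(cap, "S", "T") == f_max

# Two unit-demand activities on one unit of resource: running them in
# parallel admits no feasible flow; ordering them does.
print(pos_feasible(2, [(0, 1), (0, 2), (1, 3), (2, 3)], [1, 1, 1, 1], 1))  # False
print(pos_feasible(2, [(0, 1), (1, 2), (2, 3)], [1, 1, 1, 1], 1))          # True
```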
Proposition 4. For any POS $G_H \in \mathcal{G}_H$, there must be an AON-flow network $G_F \in \mathcal{G}_F$ such that $E(G_F) \subseteq E(G_H)$.

Proof. If no such AON-flow network exists, then there must be a resource $r_k \in R$ with $f(G'^k_H) < f^k_{max}$. This means some activity $a_i$ cannot secure a sufficient amount of $r_k$ through the edges in $E(G_H)$, since the flow in $G'^k_H$ is already maximized. Hence, in actual execution, it is possible that $r_k$ is insufficient for $a_i$ to start at the time determined by $G_H$, which implies that additional precedence constraints would be needed to resolve resource conflicts, contradicting the assumption that $G_H$ is a POS. ∎

5.2 Branching Scheme

Starting from $G_P$, our algorithm employs a depth-first branch-and-bound process to add edges to $G_P$ until a POS is obtained.¹ Since a POS must be acyclic, the set of feasible edges that can be added is $FS = \{(i, j) \notin E(G_P) \mid (j, i) \notin Tr(G_P)\}$. For each $(i, j) \in FS$, we maintain a lower bound $f^L_{ijk}$ and an upper bound $f^U_{ijk}$ on the (integer) flow $f_{ijk}$ that can be imposed on it for $r_k$, with $0 \le f^L_{ijk} \le f^U_{ijk}$. Initially, $f^L_{ijk} = 0$ and $f^U_{ijk} = \min\{b_{ik}, b_{jk}\}$. During the search, these bounds are tightened by constraint propagation, which is detailed in Section 5.3. The flow bounds are also imposed on the transformed networks: for $G'^k_V$ transformed from $G_V$, for any $(i, j) \in E(G_V)$, the flow of $r_k$ carried by edge $(i_s, j_t)$ must lie within $[f^L_{ijk}, f^U_{ijk}]$.

¹ Note that when a POS $G_H$ is reached, the algorithm can backtrack safely, because for any $G'_H$ with $E(G_H) \subseteq E(G'_H)$, $MS(G_H, d) \le MS(G'_H, d)$ holds for any sample $d$; therefore $\min_{\omega \in \mathbb{R}} \hat F_\beta(G_H, \omega) \le \hat F_\beta(G'_H, \omega)$ holds for any $\omega$.

Let $sum^L_{ij} = \sum_{r_k \in R} f^L_{ijk}$ and $sum^U_{ij} = \sum_{r_k \in R} f^U_{ijk}$. Then $sum^L_{ij} > 0$ means there must be a flow on $(i, j)$, while $sum^U_{ij} = 0$ indicates that $(i, j)$ cannot carry flow for any $r_k$. Based on the bound values and branching decisions, an edge $(i, j) \in FS$ has one of four statuses: 1) included, if $sum^L_{ij} > 0$; 2) banned, if $sum^U_{ij} = 0$; 3) undecided, if $sum^L_{ij} = 0$ and $sum^U_{ij} > 0$; 4) conflicted, if $(j, i) \in Tr(G_V)$, where $G_V$ is the current partial solution (see the short sketch after Algorithm 1 below).

The skeleton of our algorithm is described by the pseudocode in Algorithm 1. The first step is to check whether the current partial solution $G_V$ is a POS, using the procedure in Section 5.1. If it is, meaning a feasible solution has been reached, we compute an optimal candidate $(G_V, \omega)$ using the ranking procedure in Section 4, and compare it with the incumbent $(G^*_H, \omega^*)$ to check whether a better solution has been found. Otherwise, $G_V$ is not a POS and more edges need to be added, which leads the algorithm to a two-level branching scheme. In the first level, an eligible edge $(i, j) \in FS$ is chosen for branching (Line 6) if it is undecided and does not conflict with $G_V$; the heuristic for choosing the edge is discussed in Section 5.4. Then the lower bound of adding $(i, j)$ to $G_V$ is computed to determine whether this search path should be pruned (Line 7). If not, Algorithm 1 enters the second level, where $(i, j)$ is first included in $G_V$ by imposing $f^L_{ijk} = 1$ for a chosen resource $r_k$, until all $r_k \in R$ have been tried, i.e. chooseResource returns null (Lines 9-13). Then, $(i, j)$ is banned by removing it from $G_V$ and imposing $f^U_{ijk} = 0$ (which automatically imposes $f^L_{ijk} = 0$) for all $r_k \in R$ (Lines 14-16). When a lower (upper) bound needs to be tightened, the function propagateLB (propagateUB) is called in Line 11 (15) to maintain the consistency of the flow bounds. The search path is expanded by calling BnB if the propagation succeeds; otherwise it is pruned. The search starts by calling BnB($G_P$, null, null, $+\infty$), and upon termination, the β-VaR of the best solution is returned as the α-robust makespan.

Algorithm 1: BnB($G_V$, $G^*_H$, $\omega^*$, $\hat F^*_\beta$)
Input: $G_V$: current partial solution; $G^*_H$: current best POS; $\omega^*$: the best $\omega$ with $G^*_H$; $\hat F^*_\beta$: current best objective
 1  if $G_V$ is a POS then
 2      $(G_V, \omega)$ ← minimizeF($G_V$);
 3      if $\hat F_\beta(G_V, \omega) < \hat F^*_\beta$ then
 4          update $G^*_H$, $\omega^*$, and $\hat F^*_\beta$;
 5  else
 6      $(i, j)$ ← chooseEdge($G_V$);
 7      if computeLB($G_V$, $i$, $j$) < $\hat F^*_\beta$ then
 8          $G_V$ ← $(A_P, E(G_V) \cup \{(i, j)\})$;
 9          $k$ ← chooseResource($G_V$);
10          while $k \ne$ null do
11              if propagateLB($i$, $j$, $k$) = true then
12                  BnB($G_V$, $G^*_H$, $\omega^*$, $\hat F^*_\beta$); restore();
13              $k$ ← chooseResource($G_V$);
14          $G_V$ ← $(A_P, E(G_V) \setminus \{(i, j)\})$;
15          if propagateUB($i$, $j$) = true then
16              BnB($G_V$, $G^*_H$, $\omega^*$, $\hat F^*_\beta$); restore();
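To illustrate the bookkeeping that Algorithm 1 manipulates, the following sketch (ours; the representation choices are assumptions, not the paper's data structures) stores the per-resource flow bounds of an eligible edge and classifies it into the four statuses of Section 5.2.

```python
def edge_status(fL, fU, edge, transitive_closure):
    """Classify an eligible edge (i, j) from its flow bounds.

    fL[edge][k] / fU[edge][k] are the lower/upper flow bounds of resource k
    on this edge; `transitive_closure` is Tr(G_V) of the current partial
    solution, given as a set of ordered pairs."""
    i, j = edge
    if (j, i) in transitive_closure:
        return "conflicted"        # adding (i, j) would create a cycle
    sum_L = sum(fL[edge].values())
    sum_U = sum(fU[edge].values())
    if sum_L > 0:
        return "included"          # some resource must flow on (i, j)
    if sum_U == 0:
        return "banned"            # no resource may flow on (i, j)
    return "undecided"

# Initial bounds for one edge and two resources: f^L = 0, f^U = min{b_ik, b_jk}.
b = {1: {"r1": 2, "r2": 1}, 2: {"r1": 1, "r2": 3}}       # demands b_ik
fL = {(1, 2): {"r1": 0, "r2": 0}}
fU = {(1, 2): {k: min(b[1][k], b[2][k]) for k in b[1]}}  # {'r1': 1, 'r2': 1}
print(edge_status(fL, fU, (1, 2), transitive_closure=set()))  # 'undecided'
```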
Lower bound. In the function computeLB (Line 7), we only need to compute the lower bound of adding $(i, j)$ to $G_V$ on each sample $d^q$. To this end, we relax all resource constraints and compute $MS(\tilde G_V, d^q)$ for each $d^q$ as the sample lower bounds, where $\tilde G_V = (A_P, E(G_V) \cup \{(i, j)\})$.

5.3 Constraint Propagation

For single-resource problems, Leus and Herroelen (2004) propose to maintain flow bound consistency by conducting constraint propagation on the remainder network $G_R = (A_P, E_P \cup E_R)$, where $E_R = \{(i, j) \in FS \mid f^U_{ij} > 0\}$ is the set of edges not banned by the current branching decisions. For $(i, j) \in E(G_R)$, let $O_{ij} = \{(i, l) \in E(G_R) \mid l \ne j\}$ and $I_{ij} = \{(l, j) \in E(G_R) \mid l \ne i\}$ be the sets of other edges in $E(G_R)$ that start from $a_i$ and end at $a_j$, respectively. Since an AON-flow network must satisfy inflow and outflow balance, the bounds of $f_{ij}$ can be tightened as:

$$f^L_{ij} = \max\Big\{ f^L_{ij},\; b_i - \sum_{(i,l) \in O_{ij}} f^U_{il},\; b_j - \sum_{(l,j) \in I_{ij}} f^U_{lj} \Big\}, \quad (9)$$

$$f^U_{ij} = \min\Big\{ f^U_{ij},\; b_i - \sum_{(i,l) \in O_{ij}} f^L_{il},\; b_j - \sum_{(l,j) \in I_{ij}} f^L_{lj} \Big\}. \quad (10)$$

Consistency is achieved by updating the bounds for all edges in $E(G_R)$ until no bound changes. The network $G'_R$ transformed from $G_R$ using the procedure in Section 5.1 is also used for detecting infeasibility in (Leus and Herroelen 2004): if $f(G'_R) < f_{max}$, then clearly the current branching decisions cannot lead to any AON-flow network.

For our problem with multiple resources, we maintain the flow bounds independently for each $r_k$ based on Equations (9) and (10). The branching decisions in the second level of Algorithm 1 enable independent bound updates: when an edge $(i, j)$ is included, $f^L_{ijk}$ of a chosen $r_k$ changes from 0 to 1, which satisfies the positive flow condition, and the function propagateLB only maintains consistency for $r_k$; when $(i, j)$ is banned, the function propagateUB maintains consistency for all resources by setting $f^U_{ijk}$ to 0 (and hence $f^L_{ijk}$) for all $r_k$ and propagating to the other bounds. If any bound infeasibility (i.e. $f^U_{ijk} < f^L_{ijk}$) is detected during propagation, a false value is returned to signal the algorithm to backtrack. Note that constraint propagation may imply that certain edges $(i, j) \notin E(G_P)$ should be included in $G_V$ (if $sum^L_{ij} > 0$) or banned (if $sum^U_{ij} = 0$).

If the flow bounds are updated successfully, propagateLB and propagateUB then try to detect flow infeasibility. For each $r_k$, we maintain the transformed network $G'^k_R$ for $G_R$, and try to maximize the flows in $G'^k_R$ and $G'^k_V$ for the $r_k$ affected by constraint propagation. If $f(G'^k_R) < f^k_{max}$, or $G'^k_V$ is an infeasible network, then according to Proposition 4 the current branching decisions cannot lead to any POS, and a false value is returned.
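A simplified single-resource version of the tightening rules (9)-(10) can be prototyped as a fixed-point loop. The sketch below (ours; it abstracts away the dummy activities and the per-resource networks) returns false as soon as a bound pair crosses, which is the backtracking signal described above.

```python
def propagate_bounds(edges, b, fL, fU):
    """Fixed-point tightening of single-resource flow bounds via the
    balance rules (9)-(10). `edges` is E(G_R); b[i] is the demand of
    activity i. Updates fL/fU in place; returns False on infeasibility."""
    changed = True
    while changed:
        changed = False
        for (i, j) in edges:
            out_i = [e for e in edges if e[0] == i and e != (i, j)]  # O_ij
            in_j = [e for e in edges if e[1] == j and e != (i, j)]   # I_ij
            # (9): (i, j) must carry whatever the other edges cannot.
            new_L = max(fL[(i, j)],
                        b[i] - sum(fU[e] for e in out_i),
                        b[j] - sum(fU[e] for e in in_j))
            # (10): (i, j) may carry at most what the balances leave over.
            new_U = min(fU[(i, j)],
                        b[i] - sum(fL[e] for e in out_i),
                        b[j] - sum(fL[e] for e in in_j))
            if new_L > fL[(i, j)] or new_U < fU[(i, j)]:
                fL[(i, j)], fU[(i, j)] = new_L, new_U
                changed = True
            if fL[(i, j)] > fU[(i, j)]:
                return False        # bound infeasibility: backtrack
    return True

# Activity 1 (demand 2) feeds activities 2 and 3 (demand 1 each). Banning
# (1, 3) (upper bound 0) forces flow 2 onto (1, 2), exceeding b_2 = 1.
edges = [(1, 2), (1, 3)]
b = {1: 2, 2: 1, 3: 1}
fL = {(1, 2): 0, (1, 3): 0}
fU = {(1, 2): 1, (1, 3): 0}
print(propagate_bounds(edges, b, fL, fU))   # False
```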
5.4 Heuristics

Essentially, by adding edges to a partial solution $G_V$, we wish to increase the maximum flow in each $G'^k_V$ to $f^k_{max}$ so that a POS is obtained. Hence, we prefer the edge that brings the largest increment to each $f(G'^k_V)$, so that a solution is reached as early as possible. We design a heuristic, the Resource Score, to estimate the contribution that an eligible edge $(i, j)$ can make to the flow increment:

$$RS_k(i, j) = \frac{f^R_{ijk}}{f^k_{max} - f(G'^k_V)},$$

where $RS_k$ is a normalized estimate of the contribution of $(i, j)$ to $r_k$: the numerator $f^R_{ijk}$ is the flow for $r_k$ on edge $(i, j)$ in the remainder network $G_R$, and the denominator is the current flow gap of $G'^k_V$ to reach $f^k_{max}$. The function chooseEdge in Line 6 of Algorithm 1 returns the edge $(i, j)$ with the highest RS value as the next branching choice, and chooseResource in Line 9 returns the unexplored $r_k$ with the highest $RS_k$ value for $(i, j)$.

6 Experimental Results

In this section, we empirically evaluate our approach on benchmark instances, and compare it with two state-of-the-art approaches, SORU-H (Varakantham, Fu, and Lau 2016) and BACCHUS (Fu, Varakantham, and Lau 2016). SORU-H computes a start-time schedule, while BACCHUS generates a POS, as our approach does. Our algorithm is implemented in Java 1.8, while SORU-H and BACCHUS are coded using the Java API for CPLEX 12.7.1. All algorithms run on an Intel Xeon E5 workstation (3.5 GHz, 16 GB).

We generate RCPSP instances using RanGen2 (Vanhoucke et al. 2008), which requires five parameters: the numbers of activities $N$ and resources $K$, order strength OS, resource factor RF, and resource-constrainedness RC.² OS specifies the density of $G_P$, RF describes the average number of resources required per activity, and RC indicates the average fraction of resource capacity consumed per activity. $N$ and $K$ are chosen from {10, 20, 30} and {1, 2, 3} respectively, while OS, RF, and RC are chosen from {0.2, 0.7}, denoting the low and high levels. For each parameter configuration we generate 10 instances as a subset, hence 720 instances in total.

² Here we use this test set instead of public benchmarks such as PSPLIB because we intend to evaluate the performance of our algorithm against different problem parameters, especially the resource-related ones ($K$, RF, and RC), since our algorithm mainly reasons on the resource constraints.

Two distributions are used to model the uncertainty: 1) a normal distribution $N(d^0_i, \sigma^2)$, with $d^0_i$ being the deterministic duration of $a_i$ (an integer in [1, 10] in the generated instances) and $\sigma = 0.5$, as used in (Fu et al. 2012; Varakantham, Fu, and Lau 2016); and 2) an exponential distribution $Exp(1/d^0_i)$, used in (Creemers 2015; Leus and Herroelen 2004). The uncertainty level of Exp is higher than that of N, since its squared coefficient of variation (SCV) is much higher (Creemers 2015).³

³ The SCV of a distribution is defined as $\sigma^2/\mu^2$, where $\sigma$ and $\mu$ are the standard deviation and mean, respectively.

Following (Varakantham, Fu, and Lau 2016), we employ two evaluation metrics: 1) the α-robust makespan (α-RM) output by an algorithm, and 2) the Probability of Failure (PoF), the fraction of testing samples in which the realized execution either has an actual makespan larger than the α-RM (for a POS) or violates some constraint (for a start-time schedule). PoF is computed on a large number ($Q_t = 2000$) of testing samples. The time limit for all algorithms is 10 minutes, and the best results returned are used for analysis.

6.1 Analysis of Our Algorithm

We first examine our algorithm under different $Q$ and $\alpha$.
Impact of sample size. Since we (approximately) optimize $F_\beta$, we evaluate the impact of $Q$ on $F_\beta$ based on the gap estimator $\rho$ of SAA (Kleywegt, Shapiro, and Homem-de-Mello 2002). Specifically, we solve Problem (6) $Q_r = 20$ times independently, and let $(G^r_H, \omega^r)$ be the solution with the lowest $\hat F_\beta$. Then $\rho$ is estimated as $\rho = |\bar F^*_\beta - \hat F_\beta(G^r_H, \omega^r)|$, where $\bar F^*_\beta$ is the average objective over the $Q_r$ replications, and $\hat F_\beta(G^r_H, \omega^r)$ is computed on the $Q_t$ testing samples. The variance of $\rho$ is estimated as $var_\rho = S^2_r/Q_r + S^2_t/Q_t$, where $S^2_r$ and $S^2_t$ are the variances of the $Q_r$ objectives and of the $Q_t$ values of the function $\omega^r + \frac{1}{1-\beta}[g - \omega^r]^+$, respectively.

We plot the average gap, variance, and execution time of our algorithm on a representative subset (10 instances) with Exp and $\alpha = 0.2$ in Figure 2, which clearly shows the trade-off between solution quality and computational effort. This is an expected observation, theoretically guaranteed by the properties of SAA. As shown, while the gap is relatively stable, its variance drops as $Q$ increases, indicating that the solution becomes more stable. The execution time does not grow very fast, which shows the good scalability of our approach for large $Q$. Intuitively, this is because the samples are only used for the lower bound computation, which is very efficient in our CVaR minimization framework. We have similar observations on the other instance subsets; we use this instance set for illustration because all of its instances are solved optimally in all replications for all $Q$ values, which is more ideal for computing the gap estimator. In the remaining experiments, we set $Q = 100$.

[Figure 2: Results for the sample size test (average gap and variance, and average execution time in milliseconds, against the sample size $Q$)]

Impact of risk parameter. To examine the impact of $\alpha$, we select 72 instances by randomly picking one from each subset. In Table 1, we present the average α-RM and PoF for the two distributions under different risk levels $\alpha \in \{0.2, 0.15, 0.1\}$. We observe that with a stricter risk requirement (smaller $\alpha$), the α-RM increases, since it needs to tolerate more execution scenarios. The increase in α-RM is larger for Exp than for N, since the uncertainty level of the former is higher. On average, PoF is close to $\alpha$, which shows precise risk control.

Table 1: Results for different risk levels

|      | N     |      |       |         | Exp    |      |       |         |
| α    | α-RM  | PoF  | #Vio¹ | #Vio-ε² | α-RM   | PoF  | #Vio  | #Vio-ε  |
|------|-------|------|-------|---------|--------|------|-------|---------|
| 0.2  | 65.74 | 0.13 | 5     | 0       | 88.36  | 0.19 | 21    | 4       |
| 0.15 | 65.86 | 0.09 | 8     | 0       | 92.76  | 0.14 | 26    | 2       |
| 0.1  | 66.72 | 0.06 | 6     | 0       | 100.79 | 0.09 | 24    | 1       |

¹ The number of instances with PoF > α. ² The number of instances with PoF > α when ε = 0.05.

We plot the PoFs of the 72 instances in Figure 3, which shows that most PoFs are below the required level $\alpha$.

[Figure 3: PoF distributions for different risk levels (Normal and Exp, for α = 0.2, 0.15, 0.1)]

Still, some instances have a PoF higher than $\alpha$, as shown in the #Vio columns of Table 1, and Exp has more violations than N. This is because Problem (6) is built on limited samples, which cannot cover all situations. This is also observed in (Luedtke and Ahmed 2008; Varakantham, Fu, and Lau 2016), where the SAA problems are solved with a stricter risk level $\alpha'$ than required, i.e. $\alpha' = \alpha - \epsilon$. Following this idea, we set $\epsilon = 0.05$ for our algorithm; as shown in the #Vio-ε columns of Table 1, this value effectively reduces the PoF violations.

6.2 Comparison with Other Approaches

We first tune the parameter $\epsilon$ for SORU-H and BACCHUS. $\alpha$ is set to 0.2 in this section.
We conduct experiments on the 72 instances used in Section 6.1, with $\epsilon \in \{0, 0.05, 0.1\}$, and report the number of violations in Table 2. As shown, $\epsilon = 0.05$ is reasonable for BACCHUS, with a violation ratio of at most 12/72 = 16.7% for both N and Exp. For SORU-H, $\epsilon = 0.1$ leads to satisfactory results for N, and is also the value recommended in (Varakantham, Fu, and Lau 2016). But for Exp, which has a higher uncertainty level, all test instances are violated even with $\epsilon = 0.1$. This is because SORU-H generates a start-time schedule as the proactive solution, which is too rigid and has a high chance of being violated when the duration uncertainty level is high. In contrast, BACCHUS and our approach generate a flexible POS, which provides better robustness (Bidot et al. 2009). In the remaining experiments, we therefore only report the results of SORU-H on N.

Table 2: Number of violations for different ε values

|      | N       |        | Exp     |        |
| ε    | BACCHUS | SORU-H | BACCHUS | SORU-H |
|------|---------|--------|---------|--------|
| 0    | 43      | 71     | 24      | 72     |
| 0.05 | 0       | 57     | 12      | 72     |
| 0.1  | 0       | 14     | 2       | 72     |

We then execute the three algorithms on all 720 instances, and summarize the results in Table 3. We say an instance test is successful if its PoF ≤ α.

Table 3: Summary of results

|                | N     |         |        | Exp    |         |
|                | Ours  | BACCHUS | SORU-H | Ours   | BACCHUS |
|----------------|-------|---------|--------|--------|---------|
| PoF ≤ α (%)    | 98.06 | 98.19   | 86.94  | 82.08  | 84.44   |
| α-RM¹          | 64.66 | 65.85   | 68.4   | 102.97 | 133.12  |
| Lowest RM (%)² | 77.81 | 26.99   | 7.78   | 89.71  | 14.12   |

¹ The average α-RM on instances that are successful for all algorithms. ² The ratio of successful instances with the lowest α-RM among all algorithms (the sum may exceed 100% since different algorithms may give the same α-RM).

For distribution N, BACCHUS and our algorithm succeed on over 98% of the instances, more than SORU-H. On the 604 instances that are successful for all three algorithms, the average α-RM values are comparable. However, our algorithm achieves the lowest α-RM on over 77% of the successful instances, significantly more than the other two. For distribution Exp, BACCHUS and our algorithm both succeed on over 80% of the instances. However, on the 517 instances successfully solved by both algorithms, the average α-RM produced by our algorithm is significantly lower than that of BACCHUS, a 25% improvement. In fact, our algorithm achieves a lower α-RM on nearly 90% of the successful instances.

We believe this performance gap is caused by the summarization heuristic used in BACCHUS. To verify this intuition, we plot the 720 PoF values produced by BACCHUS and our algorithm under Exp in Figure 4.

[Figure 4: PoF distributions of our branch-and-bound algorithm and BACCHUS for Exp]

As shown, the PoF values of our algorithm are densely distributed around the required level α = 0.2, with over 93% of the PoFs within [0.1, 0.3] and 0.6% (4 instances) higher than 0.3. In contrast, the PoFs of BACCHUS are distributed rather sparsely, with 28% of the PoFs within [0.1, 0.3], 5% higher than 0.3, and nearly 67% smaller than 0.1. In addition, most of the instances with OS = 0.7 have PoFs smaller than 0.1. These results indicate that the summarization heuristic tends to over-compensate for α, i.e. it produces an α-RM that is higher than required, especially for instances with higher OS. Since our algorithm solves SAA problems with tens to hundreds of samples instead of a single representative one, better estimation and control of the risk level is achieved.

Finally, we briefly report on computational efficiency.
Note that while our algorithm solves Problem (6) with hundreds of samples, SORU-H and BACCHUS essentially solve a much simpler deterministic RCPSP, since only one summarized sample is used. In general, our algorithm finds the optimal solution of Problem (6) for nearly 60% of all instances, while SORU-H and BACCHUS optimally solve over 90% of the corresponding deterministic instances. However, it is common in our experiments that a sub-optimal solution given by our algorithm is much better than the optimal solutions given by SORU-H and BACCHUS. We plot the percentage of time-out instances against the problem parameters in Figure 5.

[Figure 5: Percentage of time-out instances (%) for Ours, BACCHUS, and SORU-H, against N ∈ {10, 20, 30}, K ∈ {1, 2, 3}, and OS, RF, RC ∈ {0.2, 0.7}]

As shown, the three algorithms share the same trend: the hardness increases with N, K, RF, and RC, but decreases with OS. For BACCHUS and our approach, this is perhaps because higher RF and RC imply more alternatives for generating a POS, and hence a larger search space, while higher OS means the AON network $G_P$ is denser and already contains many of the edges that should be present in a POS, and hence a smaller search space. The effect of N and K is straightforward, since they determine the size of an instance.

7 Conclusions and Future Work

In this paper, we propose a novel approach for risk-aware project scheduling that exploits the mathematically coherent risk measure CVaR. We design a general branch-and-bound framework with efficient bound computation for combinatorial CVaR minimization, and instantiate it to solve the proactive scheduling problem. Empirical results show that our approach scales well to a large number of samples, and provides better risk control and robust makespan. For future work, we first plan to improve the computational efficiency using stronger lower bounds and more effective heuristics; second, we will extend our approach to RCPSP/max, which is considered in (Varakantham, Fu, and Lau 2016; Fu, Varakantham, and Lau 2016).

Acknowledgments. This work was supported by the National Research Foundation (NRF) Singapore.

References

Artigues, C.; Michelon, P.; and Reusser, S. 2003. Insertion techniques for static and dynamic resource-constrained project scheduling. European Journal of Operational Research 149(2):249–267.
Beck, J. C., and Wilson, N. 2007. Proactive algorithms for job shop scheduling with probabilistic durations. Journal of Artificial Intelligence Research 28(1):183–232.
Bidot, J.; Vidal, T.; Laborie, P.; and Beck, J. C. 2009. A theoretic and practical framework for scheduling in a stochastic environment. Journal of Scheduling 12(3):315–344.
Blazewicz, J.; Lenstra, J. K.; and Kan, A. R. 1983. Scheduling subject to resource constraints: Classification and complexity. Discrete Applied Mathematics 5(1):11–24.
Creemers, S. 2015. Minimizing the expected makespan of a project with stochastic activity durations under resource constraints. Journal of Scheduling 18(3):263–273.
Fu, N.; Lau, H. C.; Varakantham, P.; and Xiao, F. 2012. Robust local search for solving RCPSP/max with durational uncertainty. Journal of Artificial Intelligence Research 43:43–86.
Fu, N.; Varakantham, P.; and Lau, H. C. 2016. Robust partial order schedules for RCPSP/max with durational uncertainty. In ICAPS, 124–130.
Hong, L. J.; Hu, Z.; and Liu, G. 2014. Monte Carlo methods for value-at-risk and conditional value-at-risk: A review. ACM Transactions on Modeling and Computer Simulation 24(4):22.
Kleywegt, A. J.; Shapiro, A.; and Homem-de-Mello, T. 2002. The sample average approximation method for stochastic discrete optimization. SIAM Journal on Optimization 12(2):479–502.
Laborie, P. 2005. Complete MCS-based search: Application to resource-constrained project scheduling. In IJCAI, 181–186.
Leus, R., and Herroelen, W. 2004. Stability and resource allocation in project planning. IIE Transactions 36(7):667–682.
Lombardi, M.; Milano, M.; and Benini, L. 2013. Robust scheduling of task graphs under execution time uncertainty. IEEE Transactions on Computers 62(1):98–111.
Luedtke, J., and Ahmed, S. 2008. A sample approximation approach for optimization with probabilistic constraints. SIAM Journal on Optimization 19(2):674–699.
Policella, N.; Smith, S. F.; Cesta, A.; and Oddi, A. 2004. Generating robust schedules through temporal flexibility. In ICAPS, 209–218.
Rockafellar, R. T., and Uryasev, S. 2002. Conditional value-at-risk for general loss distributions. Journal of Banking & Finance 26(7):1443–1471.
Song, W.; Kang, D.; Zhang, J.; and Xi, H. 2017. Proactive project scheduling with time-dependent workability uncertainty. In AAMAS, 221–229.
Vanhoucke, M.; Coelho, J.; Debels, D.; Maenhout, B.; and Tavares, L. V. 2008. An evaluation of the adequacy of project network generators with systematically sampled networks. European Journal of Operational Research 187(2):511–524.
Varakantham, P.; Fu, N.; and Lau, H. C. 2016. A proactive sampling approach to project scheduling under uncertainty. In AAAI, 3195–3201.