# stochastic_network_design_in_bidirected_trees__5730f3a8.pdf Stochastic Network Design in Bidirected Trees Xiaojian Wu1 Daniel Sheldon1,2 Shlomo Zilberstein1 1 School of Computer Science, University of Massachusetts Amherst 2 Department of Computer Science, Mount Holyoke College We investigate the problem of stochastic network design in bidirected trees. In this problem, an underlying phenomenon (e.g., a behavior, rumor, or disease) starts at multiple sources in a tree and spreads in both directions along its edges. Actions can be taken to increase the probability of propagation on edges, and the goal is to maximize the total amount of spread away from all sources. Our main result is a rounded dynamic programming approach that leads to a fully polynomial-time approximation scheme (FPTAS), that is, an algorithm that can find (1 ϵ)-optimal solutions for any problem instance in time polynomial in the input size and 1/ϵ. Our algorithm outperforms competing approaches on a motivating problem from computational sustainability to remove barriers in river networks to restore the health of aquatic ecosystems. 1 Introduction Many planning problems from diverse areas such as urban planning, social networks, and transportation can be cast as stochastic network design, where the goal is to take actions to enhance connectivity in a network with some stochastic element [1 8]. In this paper we consider a simple and widely applicable model where a stochastic network G is obtained by flipping an independent coin for each edge of a directed host graph G = (V, E) to determine whether it is included in G . The planner collects reward rst for each pair of vertices s, t V that are connected by a directed path in G . Actions are available to increase the probabilities of individual edges for some cost, and the goal is to maximize the total expected reward subject to a budget constraint. Stochastic network design generalizes several existing problems related to spreading phenomena in networks, including the well known influence maximization problem. Specifically, the coin-flipping process captures the live-edge characterization of the Independent Cascade model [7], in which the presence of edge (u, v) in G allows influence (e.g., behavior, disease, or some other spreading phenomenon) to propagate from u to v. Influence maximization seeks a seed set S of at most k nodes to maximize the expected number of nodes reachable from S, which is easily modeled within our model by assigning appropriate rewards and actions. The framework also captures more complex problems with actions that increase edge probabilities a setup that proved useful in various computational sustainability problems aimed to restore habitat or remove barriers in landscape networks to facilitate the spread and conserve a target species [4 6, 8]. The stochastic network design problem in its general form is intractable. It includes influence maximization as a special case and is thus NP-hard to approximate within a ratio of 1 1/e + ϵ for any ϵ > 0 [7], and it is #P-hard to compute the objective function under fixed probabilities [9, 10]. Unlike the influence maximization problem, which is a monotone submodular maximization problem and thus admits a greedy (1 1/e)-approximation algorithm, the general problem is not submodular [6]. Previous problems in this class were solved by a combination of techniques including the sample average approximation, mixed integer programming, dual decomposition, and primal-dual heuristics [6, 11 13], none of which provide both scalable running-time and optimality guarantees. It is therefore of great interest to design efficient algorithms with provable approximation guarantees for restricted classes of stochastic network design. Wu, Sheldon, and Zilberstein [8] recently showed that the special case in which G is a directed tree where influence flows away from the root (i.e., rewards are non-zero only for paths originating at the root) admits a fully polynomial-time approximation scheme (FPTAS). Their algorithm rounded dynamic programming (RDP) is based on recursion over rooted subtrees. Their work was motivated by the upstream barrier removal problem in river networks [5], in which migratory fish such as salmon swim upstream from the root (ocean) of a river network attempting to access upstream spawning habitat, but are blocked by barriers such as dams along the way. Actions are taken to remove or repair barriers and thus increase the probability fish can pass and therefore utilize a greater amount of their historical spawning habitat. In this paper, we investigate the harder problem of stochastic network design in a bidirected tree, motivated by a novel conservation planning problem we term bidirectional barrier removal. The goal is to remove barriers to facilitate point-to-point movement in river networks. This applies to the much broader class of resident (non-migratory) fish species whose populations and gene-flow are threatened by dams and smaller river barriers (e.g., culverts) [14]. Replacing or retrofitting barriers with passage structures is a key conservation priority [15, 16]. However, stochastic network design in a bidirected tree is apparently much harder than in a directed tree. Since spread originates at all vertices instead of a designated root and edges may have different probabilities in each direction, it is not obvious how computations can be structured in a recursive fashion as in [8]. Our main contribution is a novel RDP algorithm for stochastic network design in bidirected trees and a proof that it is an FPTAS in particular, it computes (1 ϵ)-optimal solutions in time O(n8/ϵ6). To derive the new RDP algorithm, we first show in Section 3 that the computation can be structured recursively despite the lack of a fixed orientation to the tree by choosing an arbitrary orientation and using a more nuanced dynamic programming algorithm. However, this algorithm does not run in polynomial time. In Section 4, we apply a rounding scheme and then prove in Section 5 that this leads to a polynomial-time algorithm with the desired optimality guarantee. However, the running time of O(n8/ϵ6) limits scalability in practice, so in Section 6 we describe an adaptive-rounding version of the algorithm that is much more efficient. Finally, we show that RDP significantly outperforms competing algorithms on the bidirectional barrier removal problem in real river networks. 2 Problem Definition The input to the stochastic network design problem consists of a bidirected tree T = (V, E) with probabilities puv assigned to each directed edge (u, v) E. A finite set of possible repair actions Au,v = Av,u is associated with each bidirected edge {u, v}; action a Au,v has cost cuv,a and, if taken, simultaneously increases the two directed edge probabilities to puv|a and pvu|a. We assume that Au,v contains a default zero-cost noop action a0 such that puv|a0 = puv and pvu|a0 = pvu. A policy π selects an action π(u, v) either a repair action or a noop for each bidirected edge. We write puv|π := puv|π(u,v) for the probability of edge (u, v) under policy π. In addition to the edge probabilities, a non-negative reward rs,t is specified for each pair of vertices s, t V . Given a policy π, the s-t accessibility ps t|π is the product of all edge probabilities on the unique path from s to t, which is the probability that s retains a path to t in the subgraph T where each edge is present independently with probability puv|π. The total expected reward for policy π is z(π) = P s,t V rs,t ps t|π. Our goal is to find a policy that maximizes z(π) subject to a budget b limiting the total cost c(π) of the actions being taken. Hence, the resulting policy satisfies π arg max{π|c(π) b} z(π). In this work, we will assume that the rewards factor as rs,t = hsht, which is useful for our dynamic programming approach and consistent with several widely used metrics. For example, network resilience [17] is defined as the expected number of node-pairs that can communicate after random component failures, which is captured in our framework by setting rs,t = hs = ht = 1. Network resilience is a general model of connectivity that can apply in diverse complex network settings. The ecological measure of probability of connectivity (PC) [18], which was the original motivation of our formulation, can also be expressed using factored rewards. PC is widely used in ecology and conservation planning and is implemented in the Conefor software, which is the basis of many planning applications [19]. A precise definition of PC appears below. Figure 1: Left: sample river network with barriers A, B, C and contiguous regions u, v, w, x. Right: corresponding bidirected tree. Barrier Removal Problem Fig. 1 illustrates the bidirectional barrier removal problem in river networks and its mapping to stochastic network design in a bidirected tree. A river network is a tree with edges that represent stream segments and nodes that represent either stream junctions or barriers that divide segments. Fish begin in each segment and can swim freely between adjacent segments, but can only pass a barrier with a specified passage probability or passability in each direction; in most cases, downstream passability is higher than upstream passability. To map this problem to stochastic network design, we create a bidirected tree T = (V, E) where each node v V represents a contiguous region of the river network i.e., a connected set of stream segments among which fish can move freely without passing any barriers and the value hv is equal to the total amount of habitat in that region (e.g., the total length of all segments). Each barrier then becomes a bidirected edge that connects two regions, with the passage probabilities in the upstream and downstream directions assigned to the corresponding directed edges. It is easy to see that T retains a tree structure. Our objective function z(π) is motivated by PC introduced above. It is defined as follows: PC(π) = z(π) t S rs,tps t|π where R = P s,t hsht is a normalization constant. When hv is the amount of suitable habitat in region v, PC(π) is the probability that a fish placed at a starting point chosen uniformly at random from suitable habitat (so that a point in region s is chosen with probability proportional to hs) can reach a random target point also chosen uniformly at random by passing each barrier in between. In the rest of the paper, we present algorithms for solving this problem and their theoretical analysis that generalize the rounded DP approach introduced in [8]. 3 Dynamic Programming Algorithm Given a bidirected tree T , we present a divide-and-conquer method to evaluate a policy π and a dynamic programming algorithm to optimize the policy. We use the fact that given an arbitrary root, any bidirected tree T can be viewed as a rooted tree in which each vertex u has corresponding children and subtrees. To simplify our algorithm and proofs, we make the following assumption. Assumption 1. Each vertex in the rooted tree has at most two children. Any problem instance can be converted into one that satisfies this assumption by replacing any vertex u with more than two children by a sequence of internal vertices with exactly two children. The original edges are attached to the original children of u and the added edges have probabilities 1. In the modified tree, u has two children and its habitat is split equally among u and the newly added vertices. The resulting binary tree has at most twice as many vertices as the original one. Most importantly, a policy for the modified tree can be trivially mapped to a unique policy for the original tree with the same expected reward. Evaluating A Fixed Policy Using Divide and Conquer To evaluate a fixed policy π, we use a divide and conquer method that recursively computes a tuple of three values per subtree. Let v and w be the children of u. The tuple of the subtree Tu rooted at u can be calculated using the tuples of subtrees Tv and Tw. Once the tuple of Troot = T , is calculated, we can extract the total expected reward from that tuple. Now, given a policy π, we define the tuple of Tu as ψu(π) = (νu(π), µu(π), zu(π)), where t Tu pu t|πht is the sum of the s-t accessibilities of all paths from u to t Tu, each of which is weighted by the habitat ht of its ending vertex t. µu(π) = P s Tu ps u|πhs is the sum of the s-t accessibilities of all paths from s Tu to u, each of which is weighted by the habitat hs of its departing vertex s. zu(π) = P t Tu ps t|πrs,t (rs,t = hsht) represents the total expected reward that a fish obtains by following paths with both starting and ending vertices in Tu. The tuple ψu(π) is calculated recursively using ψv(π) and ψw(π). To calculate νu(π), we note that a path from u to a vertex in Tu\{u} is the concatenation of either the edge (u, v) with a path from v to Tv or the edge (u, w) with a path from w to Tw, that is, νu(π) can be written as X t Tv puv|πpv t|πht + X t Tw puw|πpw t|πht + hu = puv|πνv(π) + puw|πνw(π) + hu (2) Similarly, µu(π) = X s Tv ps v|πpvu|πhs + X s Tw ps w|πpwu|πhs + hu = pvu|πµv(π) + pwu|πµw(π) + hu (3) By dividing the reward from paths that start and end in Tu based on their start and end nodes, we can express zu(π) as follows: zu(π) = zv(π)+zw(π)+µv(π)pv w|πνw(π)+µw(π)pw v|πνv(π)+huνu(π)+huµu(π) h2 u (4) The first two terms describe paths that start and end within a single subtree either Tv or Tw. The third and fourth terms describe paths that start in Tv and end in Tw or vice versa. The last three terms describe paths that start or end at u, with an adjustment to avoid double-counting the trivial path that starts and ends at u. That way, all tuples can be evaluated with one pass from the leaves to the root and each vertex is only visited once. At the root, zroot(π) is the expected reward of policy π. Dynamic Programming Algorithm We introduce a DP algorithm to compute the optimal policy. Let subpolicy πu be the part of the full policy that defines actions for barriers within Tu. In the DP algorithm, each subtree Tu maintains a list of tuples ψ that are reachable by some subpolicies and each tuple is associated with a least-cost subpolicy, that is, π u arg min{πu|ψu(πu)=ψ} c(πu). Let v and w be two children of u. We recursively generate the list of reachable tuples and the associated least-cost subpolicies using the tuples of v and w. To do this, for each ψv, ψw, we first extract the corresponding π v and π w. Then, using these two least-cost subpolicies of the children, for each a Auv and a Auw, a new subpolicy πu is constructed for Tu with cost c(πu) = cuv,a + cuw,a + c(π v) + c(π w). Using Eqs. (2), (3) and (4), the tuple ψu(πu) of πu is calculated. If ψu(πu) already exists in the list (i.e., ψu(πu) was created by some other previously constructed subpolicies), we update the associated subpolicy such that only the minimum cost subpolicy is kept. If not, we add this tuple ψu(πu) and subpolicy πu to the list. To initialize the recurrence, the list of a leaf subtree contains only a single tuple (hu, hu, h2 u) associated with an empty subpolicy. Once the list of Troot is calculated, we scan the list to pick a pair (ψ root, π ) such that (ψ root, π ) arg max{(ψroot,π)|c(π) b} zroot where zroot is the third element of ψroot. Finally, π is the returned optimal policy and z root is the optimal expected reward. 4 Rounded Dynamic Programming The DP algorithm is not a polynomial-time algorithm because the number of reachable tuples increases exponentially as we approach the root. In this section, we modify the DP algorithm into a FPTAS algorithm. The basic idea is to discretize the continuous space of ψu at each vertex such that there only exists a polynomial number of different tuples. To do this, the three dimensions are discretized using granularity factors Kν u, Kµ u and Kz u respectively such that the space is divided into a finite number of cubes with volume Kν u Kµ u Kz u. For any subpolicy πu of u in the discretized space, there is a rounded tuple ˆψu(πu) = (ˆνu(πu), ˆµu(πu), ˆzu(πu)) to underestimate the true tuple ψu(πu) of πu. To evaluate ˆψu(πu), we use the same recurrences as (2), (3) and (4), but rounding each intermediate value into a value in the discretized space. The recurrences are as follow: ˆνsum u (πu) = puv|πu ˆνv(πu)+puw|πu ˆνw(πu)+hu ˆµsum u (πu) = pvu|πu ˆµv(πu)+pwu|πu ˆµw(πu)+hu ˆνu(πu) = Kν u ˆνsum u (πu) Kνu ˆµu(πu) = Kµ u ˆµsum u (πu) Kµ u ˆzu(πu) = Kz u (6) ˆzv(πu)+ˆzw(πu)+ˆµv(πu)pv w|πu ˆνw(πu)+ˆµw(πu)pw v|πu ˆνv(πu)+huˆµsum u (πu)+huˆνsum u (πu) h2 u Kzu The modified algorithm rounded dynamic programming (RDP) is the same as the DP algorithm, except that it works in the discretized space. Specifically, each vertex maintains a list of reachable rounded tuples ˆψu, each one associated with a least costly subpolicy achieving ˆψu, that is, π u arg min{πu| ˆ ψu(πu)= ˆ ψu} c(πu). Similarly to our DP algorithm, we generate the list of reachable tuples for each vertex using its children s lists of tuples. The difference is that to calculate the rounded tuple of a new subpolicy we use recurrences (5) and (6) instead of (2), (3) and (4). 5 Theoretical Analysis We now turn to the main theoretical result: Theorem 1. RDP is a FPTAS. Specifically, let OPT be the value of the optimal policy. Then, RDP can compute a policy with value at least (1 ϵ)OPT in time bounded by O( n8 Approximation Guarantee Let π be the optimal policy and let π be the policy returned by RDP. We bound the value loss z(π ) z(π ) by bounding the distance of the true tuple ψ(π) and the rounded tuple ˆψ(π) for an arbitrary policy π. In Eqs. (5) and (6), starting from leaf vertices, each rounding operation introduces an error at most K u where represents ν, µ and z. For ν, starting from u, each vertex t Tu introduces error Kν t by using the rounding operation. The error is discounted by the accessibility from u to t. For µ, each vertex s Tu introduces error Kµ s , discounted in the same way. The total error is equal to the sum of all discounted errors. Finally, we get the following result by setting 3hu, Kµ u = ϵ 3hu, Kz u = ϵ Lemma 1. If condition (7) holds, then for all u V and an arbitrary policy π: νu(π) ˆνu(π) X t Tu pu t|πKν t = ϵ t Tu pu t|πht = ϵ µu(π) ˆµu(π) X s Tu ps u|πKµ s = ϵ s Tu ps u|πhs = ϵ The difference of z(π) ˆz(π) is bounded by the following lemma. Lemma 2. If condition (7) holds, z(π) ˆz(π) ϵz(π) for an arbitrary policy π. The proof by induction on the tree appears in the supplementary material. Theorem 2. Let π and π be the optimal policy and the policy return by RDP respectively. Then, if condition (7) holds, we have z(π ) z(π ) ϵz(π ). Proof. By Lemma 2, we have z(π ) ˆz(π ) ϵz(π ). Furthermore, z(π ) ˆz(π ) ˆz(π ) where the second inequality holds because π is the optimal policy with respect to the rounded policy value. Therefore, we have z(π ) z(π ) z(π ) ˆz(π ) which proves the theorem. Runtime Analysis Now, we derive the runtime result of Theorem 1, that is, if condition (7) holds, the runtime of RDP is bounded by O( n8 ϵ6 ). First, it is reasonable to make the following assumption: Assumption 2. The value hu is constant with respect to n and ϵ for each u V . Let mu,ˆν, mu,ˆµ and mu,ˆz be the number of different values for ˆνu, ˆµu and ˆzu respectively in the rounded value space of u. Lemma 3. If condition (7) holds, then mu,ˆν = O nu , mu,ˆµ = O nu , mu,ˆz = O n2 u ϵ for all u V where nu is the number of vertices in subtree Tu. Proof. The number mu,ˆν is bounded by Kν u where P t Tu ht is a naive and loose upper bound of νu obtained assuming all passabilities of streams in Tu are 1.0. By Assumption (2), mu,ˆν = O( nu ϵ ). The upper bound of mu,ˆµ can be similarly derived. Assuming all passabilities are 1.0, the upper bound of zu is P t Tu hsht. Therefore, mu,ˆz t Tu hsht Kzu = O( n2 u ϵ ) Recall that RDP works by recursively calculating the list of reachable rounded tuples and associated least costly subpolicy. Using Lemma 3, we get the following main result: Theorem 3. If condition (7) holds, the runtime of RDP is bounded by O( n8 Proof. Let T(n) be the maximum runtime of RDP for any subtree with n vertices. In RDP, for vertex u with children v and w, we compute the list and associated subpolicies by iterating over all combinations of ˆψv and ˆψw. For each combination, we iterate over all available action combinations auv Auv and auw Auw, which takes constant time because the number of available repair actions are constant w.r.t. n and ϵ. Therefore, we can bound T(n) using the following recurrence: T(nu)=O(mv,ˆνmv,ˆµmv,ˆzmw,ˆνmw,ˆµmw,ˆz) + T(nv) + T(nw) cn4 vn4 w ϵ6 + T(nv) + T(nw) max 0 k (nu 1) ck4(nu k 1)4 ϵ6 + T(k) + T(nu k 1) where nu = 1 + nv + nw as Tu consists of u, Tv and Tw. The second inequality is due to Lemma 3. The third inequality is obtained by a change of variable. We prove that T(n) c n8 ϵ6 using induction. For the base case n = 0, we have T(n) = 0 and for the base case n = 1, the subtree only contains one vertex, so T(n) = c. Now assume that T(k) c k8 ϵ6 for all k < n. Then one can show that T(n) max 0 k (n 1) c ϵ6 k4(n k 1)4 + k8 + (n k 1)8 cn8 and thus the theorem holds. A detailed justification of the final inequality appears in the supplementary material. 6 Algorithm Implementation and Experiments The theoretical results suggest that the RDP approach may be impractical for large networks. However, we can accelerate the algorithm and produce high quality solutions by making some changes, motivated by observations from our initial experiments. First, the theoretical runtime upper bound is much worse than the actual runtime of RDP because in practice, because the number of reachable tuples per vertex is much lower than the upper bounds of mu,ˆν mu,ˆµ and mu,ˆz used in the proof. Moreover, some inequalities used in Section 5 are very loose; most of the rounding operations in fact produce much less error than the upper bound K u. Therefore, we can set the values of K u much larger than the theoretical values without compromising the quality of approximation. Consequently, before calculating the list of reachable tuples of u, we first estimate the upper bound and lower bound of the reachable values of ˆνu, ˆµu and ˆzu using the list of tuples of its children. Then, we dynamically assign values to K u by fixing the total number of different discrete values of ˆνu, ˆµu and ˆzu in the space, thereby determining the granularity of discretization. For example, if the upper and the lower bounds of ˆνu are 1000 and 500 respectively, and we want 10 different values, the value of Kν u is set to be 1000 500 10 = 50. By using a finer granularity of discretization, we get a slower algorithm but better solution quality. In our experiments, setting these numbers to be 50, 50 and 150 for ˆνu, ˆµu and ˆzu, the algorithm became very fast and we were able to get very good solution quality. We compared RDP with a greedy algorithm and a state-of-the-art algorithm for conservation planning, which uses sample average approximation and mixed integer programming (SAA+MIP) [4, 6, 11]. We initially considered two different greedy algorithms. One incrementally maximizes the increase of expected reward. The other incrementally maximizes the ratio between increase in expected reward and action cost. We found that the former performs better than the latter, so we only report results for that version. We compare all three algorithms on small river networks. On large networks, we only compare RDP with the greedy algorithm because SAA+MIP fails to solve problems of that size. Figure 2: River networks in Massachusetts Dataset Our experiments use data from the CAPS project [20] for river networks in Massachusetts (Fig. 2). Barrier passabilities are calculated from barrier features using the model defined by the CAPS project. We created actions to model practical repair activities. For road-crossings, most passabilities start close to 1 and are cheap to repair relative to dams. To model this, we set Au,v ={a1}, puv|a1 =pvu|a1 =1.0 and cuv|a1 = 5. In contrast, it is difficult and expensive to remove dams, so multiple strategies must be considered to improve their passability. We created actions Au ={a1, a2, a3} with action a1 having puv|a1 =pvu|a1 =0.2 and cuv|a1 =20; action a2 having puv|a2 =pvu|a2 =0.5 and cuv|a2 =40; and action a3 having puv|a3 =pvu|a3 =1.0 and cuv|a3 =100. Results on Small Networks We compared SAA+MIP, RDP and Greedy on small river networks. SAA+MIP used 20 samples for the sample average approximation and IBM CPLEX on 12 CPU cores to solve the integer program. RDP1 used finer discretization than RDP2, therefore requiring longer runtime. The results in Table 1 show that RDP1 gives the best increase in expected reward (relative to a zero-cost policy) in most cases and RDP2 produces similarly good solutions, but takes less time. Although Greedy is extremely fast, it produces poor solutions on some networks. SAA+MIP gives better results than Greedy, but fails to scale up. For example, on a network with 781 segments and 604 barriers, SAA+MIP needs more than 16G of memory to construct the MIP. Number of ER Increase Runtime Segments Barriers SAA+MIP Greedy RDP1 RDP2 SAA+MIP Greedy RDP1 RDP2 106 36 3.7 4.1 4.1 4.0 3.3 0.0 0.7 0.4 101 71 4.0 3.6 4.3 4.3 19.5 0.0 2.5 1.2 163 91 11.3 11.2 12.3 12.1 42.3 0.0 13.6 6.8 263 289 20.7 11.1 25.3 24.8 1148.7 0.7 263.3 98.7 499 206 48.6 55.6 53.8 53.2 116.0 0.7 11.9 6.4 456 464 124.1 96.8 146.9 144.3 8393.5 0.7 359.9 142.0 639 609 51.8 25.8 53.7 51.6 12720.1 1.3 721.2 242.4 Table 1: Comparison of SAA, RDP and Greedy. Time is in seconds. Each unit of expected reward is 107 (square meters). ER increase means the increase in expected reward after taking the computed policy. Results on Large Networks We compared RDP and Greedy on a large network the Connecticut River watershed, which has 10451 segments, 587 dams and 7545 crossings. We tested both algorithms on three different settings of action passabilities. 0.6 0.8 1 1.2 1.4 1.6 1.8 2 x 10 4 10 ER Increase RDP1 RDP2 Greedy (a) Expected reward increase 0.6 0.8 1 1.2 1.4 1.6 1.8 2 x 10 4 1000 RDP1 RDP2 Greedy (b) Runtime in seconds Figure 3: RDP vs Greedy on symmetric passabilities. Actions w/ symmetric passabilities In this experiment, we used the actions introduced above. The expected reward increase (Fig. 3a) and runtime (Fig. 3b) are plotted for different budgets. For the expected reward, each unit represents 1014 m2. Runtime is in seconds. As before, RDP1 uses finer discretization of tuple space than RDP2. As Fig. 3 shows, the RDP algorithms give much better solution quality than the greedy algorithm. With a budget of 20000, the ER increase of RDP1 is almost twice the increase for Greedy. Incidentally, RDP1 doesn t improve the solution quality by much, but it takes much longer time to finish. Notice that both RDP1 and RDP2 use constant runtime because the number of discrete values in both settings are bounded. In contrast, the runtime of Greedy increases with the budget size and eventually exceeds RDP2 s runtime. 0.6 0.8 1 1.2 1.4 1.6 1.8 2 x 10 4 20 25 30 35 40 45 50 55 ER Increase (a) Expected reward increase 0.6 0.8 1 1.2 1.4 1.6 1.8 2 x 10 4 1000 (b) Runtime in seconds Figure 4: RDP vs Greedy on asymmetric passabilities with all downstream passabilities equal to 1. Actions with asymmetric passabilities The RDP algorithms work with asymmetric passabilities as well. For road-crossings, we set the actions to be the same as before. For dams, we first considered the case in which the downstream passabilities are all 1 which happens for some fish and all upstream passabilities are the same as before. The results are shown in Figures 4a and 4b. In this case RDP still performs better than Greedy and tends to use less time as the budget increases. 0.6 0.8 1 1.2 1.4 1.6 1.8 2 x 10 4 10 15 20 25 30 35 40 45 50 55 ER Increase (a) Expected reward increase 0.6 0.8 1 1.2 1.4 1.6 1.8 2 x 10 4 0.05 0.2 0.5 0.8 1.1 1.4 1.7 2 2.3 2.5x 10 4 (b) Runtime in seconds Figure 5: RDP vs Greedy on asymmetric passabilities with varying downstream passabilities. We also considered a hard case in which the downstream passabilities of a dam are given by pvu|a1 = 0.8, pvu|a2 = 0.9, and pvu|a3 = 1.0. These variations of passabilities produce more tuples in the discretized space. Our RDP algorithm still works well and produces better solutions than Greedy over a range of budgets as shown in Fig. 5a. As expected in such hard cases, RDP needs much more time than Greedy. However, obtaining high quality solutions to such complex conservation planning problems in a matter of hours makes the approach very valuable. 0 2000 4000 6000 0 ER Increase Figure 6: Time/quality tradeoffs Time/Quailty Tradeoff Finally, we tested the time/quality tradeoff offered by RDP. The tradeoff is controlled by varying the level of discretization. We ran these experiments on the Connecticut River watershed using symmetric passabilities. Fig. 6 shows how runtime and expected reward grow as we refine the level of discretization. As we can see, in this case RDP converges quickly on high-quality results and exhibits the desired diminishing returns property of anytime algorithms the quality gain is large initially and it diminishes as we continue to refine the discretization. 7 Conclusion We present an approximate algorithm that extends the rounded dynamic programming paradigm to stochastic network design in bidirected trees. The resulting RDP algorithm is designed to maximize connectivity in a river network by solving the bidirectional barrier removal problem a hard conservation planning problem for which no scalable algorithms exist. We prove that RDP is an FPTAS, returning (1 ϵ)-optimal solutions in polynomial time. However, its time complexity, O(n8/ϵ6), makes it hard to apply it to realistic river networks. We present an adaptive-rounding version of the algorithm that is much more efficient. We apply this adaptive rounding method to segments of river networks in Massachusetts, including the entire Connecticut River watershed. In these experiments, RDP outperforms both a baseline greedy algorithm and an SAA+MIP algorithm, which is a state-of-art technique for stochastic network design. Our new algorithm offers an effective tool to guide ecologists in hard conservation planning tasks that help preserve biodiversity and mitigate the impacts of barriers in river networks. In future work, we will examine additional applications of RDP and ways to relax the assumption that the underlying network is tree-structured. Acknowledgments This work has been partially supported by NSF grant IIS-1116917. [1] Srinivas Peeta, F. Sibel Salman, Dilek Gunnec, and Kannan Viswanath. Pre-disaster investment decisions for strengthening a highway network. Computers and Operations Research, 37(10):1708 1719, 2010. [2] Jean-Christophe Foltˆete, Xavier Girardet, and C eline Clauzel. A methodological framework for the use of landscape graphs in land-use planning. Landscape and Urban Planning, 124:140 150, 2014. [3] Leandro R. Tambosi, Alexandre C. Martensen, Milton C. Ribeiro, and Jean P. Metzger. A framework to optimize biodiversity restoration efforts based on habitat amount and landscape connectivity. Restoration Ecology, 22(2):169 177, 2014. [4] Xiaojian Wu, Daniel Sheldon, and Shlomo Zilberstein. Stochastic network design for river networks. NIPS Workshop on Machine Learning for Sustainability, 2013. [5] Jesse Rush OHanley and David Tomberlin. Optimizing the removal of small fish passage barriers. Environmental Modeling & Assessment, 10(2):85 98, 2005. [6] Daniel Sheldon, Bistra Dilkina, Adam Elmachtoub, Ryan Finseth, Ashish Sabharwal, Jon Conrad, Carla Gomes, David Shmoys, William Allen, Ole Amundsen, and William Vaughan. Maximizing the spread of cascades using network design. In Proc. of the 26th Conference on Uncertainty in Artificial Intelligence (UAI), pages 517 526, 2010. [7] David Kempe, Jon Kleinberg, and Eva Tardos. Maximizing the spread of influence through a social network. In Proc. of the 9th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 137 146, 2003. [8] Xiaojian Wu, Daniel Sheldon, and Shlomo Zilberstein. Rounded dynamic programming for treestructured stochastic network design. Proc. of the 28th Conference on Artificial Intelligence (AAAI), 2014. [9] Wei Chen, Chi Wang, and Yajun Wang. Scalable influence maximization for prevalent viral marketing in large-scale social networks. In Proc. of the 16th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pages 1029 1038, 2010. [10] Leslie G. Valiant. The complexity of enumeration and reliability problems. SIAM Journal on Computing, 8(2):410 421, 1979. [11] Akshat Kumar, Xiaojian Wu, and Shlomo Zilberstein. Lagrangian relaxation techniques for scalable spatial conservation planning. In Proc. of the 26th AAAI Conference on Artificial Intelligence (AAAI), pages 309 315, 2012. [12] Shan Xue, Alan Fern, and Daniel Sheldon. Scheduling conservation designs via network cascade optimization. In Proc. of the 26th Conference on Artificial Intelligence (AAAI), pages 391 397, 2012. [13] Shan Xue, Alan Fern, and Daniel Sheldon. Dynamic resource allocation for optimizing population diffusion. In Proc. of the Conference on Artificial Intelligence and Statistics (AISTATS), 2014. [14] Benjamin H. Letcher, Keith H. Nislow, Jason A. Coombs, Matthew J. O Donnell, and Todd L. Dubreuil. Population response to habitat fragmentation in a stream-dwelling brook trout population. Plo S one, 2 (11):e1139, January 2007. [15] Alison A. Bowden. Towards a comprehensive strategy to recover river herring on the Atlantic seaboard: Lessons from Pacific salmon. ICES Journal of Marine Science, 2013. [16] Erik H. Martin and Colin D. Apse. Northeast aquatic connectivity: An assessment of dams on northeastern rivers. Technical report, The Nature Conservancy, Eastern Freshwater Program, 2011. [17] Charles J. Colbourn. Network resilience. SIAM Journal on Algebraic Discrete Methods, 8(3):404 409, 1987. [18] Santiago Saura and Luc ıa Pascual-Hortal. A new habitat availability index to integrate connectivity in landscape conservation planning: Comparison with existing indices and application to a case study. Landscape and Urban Planning, 83:91 103, 2007. [19] Santiago Saura and Josep Torne. Conefor sensinode 2.2: A software package for quantifying the importance of habitat patches for landscape connectivity. Environmental Modelling & Software, 24(1):135 139, 2009. [20] Kevin Mc Garigal, Bradley W. Compton, Scott D. Jackson, Ethan Plunkett, Kasey Rolih, Theresa Portante, and Eduard Ene. Conservation assessment and prioritization system (CAPS). Technical report, Department of Environmental Conservation, Univ. of Massachusetts Amherst, 2011.