# matchings_under_onesided_preferences_with_soft_quotas__b80d3989.pdf Matchings under One-Sided Preferences with Soft Quotas Santhini K. A.1 , Raghu Raman Ravi2 and Meghana Nasre1 1Indian Institute of Technology Madras 2ETH Zurich {santhini, meghana}@cse.iitm.ac.in, raghu.ravi.raman@gmail.com Assigning applicants to posts in the presence of the preferences of applicants and quotas associated with posts is extensively investigated. For a post, lower quota guarantees, and upper quota limits the number of applicants assigned to it. Typically, quotas are assumed to be fixed, which need not be the case in practice. We address this by introducing a soft quota setting, in which every post is associated with two values lower target and upper target which together denote a range for the intended number of applicants in any assignment. Unlike the fixed quota setting, we allow the number of applicants assigned to a post to fall outside the range. This leads to assignments with deviation. Here, we study the problem of computing an assignment that has two orthogonal optimization objectives minimizing the deviation (maximum or total) w.r.t. soft quotas and ensuring optimality w.r.t. preferences of applicants (rank-maximality or fairness). The order in which these objectives are considered, the different possibilities to optimize deviation combined with the well-studied notions of optimality w.r.t. preferences open up a range of optimization problems of practical importance. We present efficient algorithms based on flow-networks to solve these optimization problems. 1 Introduction We study the many-to-one bipartite matching problem where we have a bipartite graph G = (A P, E) of applicants A and posts P in the presence of preferences of applicants over the posts. Typically, every post p P is associated with an input upper quota q+(p) that limits the maximum number of applicants assigned to p in any allocation. Every applicant ranks all the posts in its neighborhood and this ranking can involve ties between the posts. A matching M is a collection of edges where an applicant has at most one edge incident to it in M and a post p has at most q+(p) many edges incident on it in M. In the literature, this setting is termed as house allocation setting [Abraham et al., 2004] and can be used to model different problems such as assigning applicants to jobs, allocating courses to students [Bir o, 2017], assigning papers to referees [Garg et al., 2010], students to schools [Abraham, 2009], and rental inventories to customers [Abraham et al., 2006]. In certain applications like allocating courses or projects to students [Bir o et al., 2010; Arulselvan et al., 2018; Cseh et al., 2022], the posts may require a minimum number of applicants to be assigned in order to be operational. Lower quotas are extensively investigated in the two-sided preference list setting called the Hospital Residents problem [Goko et al., 2022; Goto et al., 2014; Hamada et al., 2016; Fleiner and Kamiyama, 2016; Monte and Tumennasan, 2013; Nasre and Nimbhorkar, 2018; Nasre et al., 2021; Fragiadakis et al., 2016]. Here, the lower quota of a hospital denotes the minimum number of residents (doctors) needed for the smooth functioning of the hospital. To capture this, for a post p in addition to the upper quota q+(p), a lower-quota q (p) is associated with it. For a matching M and a post p, M(p) denotes the set of applicants assigned to p in M. In this setting, a matching M is feasible if for every post p, q (p) |M(p)| q+(p). We refer to this setting as the fixed quota setting throughout the paper. In the presence of lower quotas, there are simple instances where no feasible matching exists (see Figure 1 for an example). In many practical applications, quotas need not be rigid as is assumed in the fixed quota setting and relaxing the lower quotas and augmenting the upper quotas have been considered in the literature. For instance, instead of hospitals closing down for want of a minimum number of doctors, they may simply reduce their service levels as described in [Goko et al., 2022]. Similarly, in the context of school choice, capacity planning to augment the number of available seats has been considered recently in multiple works [Abe et al., 2022; Bobbio et al., 2021; Bobbio et al., 2022]. Motivated by such applications, we relax the fixed quota assumption by proposing and studying a soft quota setting for the assignment problem in the one-sided preference list model. 2 Soft Quota Setting An instance of the problem in the soft quota setting consists of a bipartite graph G = (A P, E) where applicants have preference over posts and a post p has two values a lower target ℓ(p) and an upper target u(p). For a post p, the two values ℓ(p) and u(p) define a range for the intended number of applicants for p in any matching. There is no restriction on Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence (IJCAI-23) the number of applicants that can be assigned to a post in any matching. However, in order to control skew, we measure the deviation of a post p w.r.t. a matching M as defined below: d M(p) = max{0, ℓ(p) |M(p)|, |M(p)| u(p)}. Our goal in the soft quota setting is to compute an assignment that has two orthogonal optimization objectives: (i) Minimizing deviation. For a matching M, we define the total deviation dtot(M) and max deviation dmax(M) as below, and our goal is to minimize one of them. dtot(M) = X p P d M(p) dmax(M) = max p P d M(p) (ii) Optimality w.r.t. preferences. Two well-studied notions of optimality in the fixed quota setting for one-sided preferences are rank-maximality [Irving et al., 2006; Paluch, 2013; Hosseini et al., 2021; Peters, 2022], and fairness [Huang et al., 2016]. A rank-maximal matching matches maximum number of applicants to rank 1 post, subject to that maximum number of applicants to rank 2 posts and so on. A fair matching minimizes the number of unmatched applicants, subject to that minimizes the number of applicants matched to the last rank post and so on. The signature of a matching allows us to formalize these notions. Let r be the maximum rank used by an applicant to order a post in G. The signature of a matching M (σ(M)) is an r + 1 tuple (x1, x2, . . . , xr+1) where xi denotes the number of applicants matched to their rank i post in M for i [r] and xr+1 denotes the number of unmatched applicants in M. Definition 1 (Rank-maximality). For two matchings M and M with signatures σ(M) = (x1, . . . , xr+1) and σ(M ) = (y1, . . . , yr+1), we say M is more rank-maximal than M (M >R M ) if there exists an ℓfor ℓ [r] such that xℓ> yℓ and for j [ℓ 1], we have xj = yj. A matching that has a maximum signature under the ordering >R is a rankmaximal matching. The notion of fairness can be defined analogously. In the fixed quota setting where the lower quota of every post is zero, both these notions are guaranteed to exist, and optimal matchings are efficiently computable by the algorithms given in [Irving et al., 2006; Paluch, 2013; Huang et al., 2016]. Even in the presence of lower quotas an optimal matching amongst all the feasible matchings can be efficiently computed by extending the algorithm in [Huang et al., 2016]. 2.1 Our Problems Deviation first, preference optimality next. There exist applications where having as low deviation as possible is required and subject to that optimality with respect to preferences needs to be respected. For example, in allocating projects to students [Cooper, 2020] or dividing political representatives to seats [Dhamala and Thapa, 2009; Pukelsheim et al., 2012; Serafini and Simeone, 2012] it is desirable to minimize the difference between input values and the assigned numbers. Let M be the set of all matchings in the instance G of the soft quota setting. Let Mt (Mm) denote the set of matchings that have minimum value of total deviation (respectively the minimum value of maximum deviation). We define our first set of problems: a1 : p1 a2 : p1 a3 : p1, p2, p3 a4 : p1, p6, p7 a5 : p1, p4, p5 a6 : p6, p7 a7 : p1, p6 p1 [1, 1] p2 [1, 1] p3 [1, 1] p4 [1, 1] p5 [1, 1] p6 [0, 1] p7 [0, 1] a1 a2 a3 a4 a5 a6 a7 M1 Figure 1: Example instance G with |A| = |P| = 7. Applicants along with their preferences are given. Each post has two values a lower value and an upper value. In the fixed quota setting these numbers represent rigid lower and upper quotas. In the soft quota setting these numbers are lower and upper target respectively. Matching M1 shown beside is an infeasible matching in the fixed quota setting but is valid assignment in the soft quota setting. OPT-MIN-TOT: The goal is to compute a matching that satisfies the notion OPT w.r.t. preferences in the set Mt. For instance, if the notion OPT is rank-maximality then the problem RMM-MIN-TOT outputs a rank-maximal matching in the set Mt. OPT-MIN-MAX: The goal is to compute a matching that satisfies the notion OPT in the set Mm. Preference optimality first, deviation next. Analogously, there exists applications where preference optimality has to be respected first and subject to that we would like to minimize a deviation parameter of interest. Since in the soft quota setting, there is no bound on the upper quota of a post, it is not immediate how to determine the set of matchings in which we wish to compute a rank-maximal or fair matching. To address this, we first introduce an additional input in our problem a signature requirement ρ for the output matching. We consider all matchings in the soft quota setting that respect the signature requirement with respect to the preference optimality under consideration. This is similar to what is done in [K. A. et al., 2022] where the signature is a part of the input. For a fixed notion of preference optimality, a matching M meets the signature requirement if σ(M) OP T ρ and let MOP T denote all the matchings that meet the signature requirement. Let M t (M m) denote the set of matchings that have minimum value of total deviation (minimum value of maximum deviation respectively) in MOP T . We consider the following two problems here: OPT-SIGN-MIN-TOT: The goal is to output any matching in M t. OPT-SIGN-MIN-MAX: The goal is to output any matching in M m. In the rest of the paper, we present the results when OPT is set to rank-maximality (RMM). Analogous results for fairness can be obtained using minor modifications to our algorithms. These are deferred to the full version of the paper. Example. We illustrate our problems using the example instance G in Figure 1. Table 1 shows four matchings along with parameters of the matching. We first consider the problems of the type deviation first, preference optimality next. Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence (IJCAI-23) Input Optimality notion Parameters of matching Matching Signature Max Total dev dev G RMM-MIN-TOT M1 = {(a1, p1), (a3, p2), (a4, p7), (a5, p4), (a6, p6)} (2, 2, 1, 2) 1 2 G RMM-MIN-MAX M2 = {(a1, p1), (a2, p1), (a3, p2), (a4, p7), (a5, p4), (a6, p6), (a7, p6)} (3, 3, 1, 0) 1 4 G, ρ RMM-SIGN-MIN-TOT M3 = {(a1, p1), (a2, p1), (a3, p2), (a4, p7), (a5, p4), (a6, p6), (a7, p1)} (4, 2, 1, 0) 2 4 G, ρ RMM-SIGN-MIN-MAX M4 = {(a1, p1), (a2, p1), (a3, p2), (a4, p1), (a5, p4), (a6, p6), (a7, p6)} (4, 3, 0, 0) 2 5 Table 1: Optimal matchings in the instance given in Figure 1 corresponding to our problems. For the preference optimality first type problems, we consider the input signature as (4, 0, 0, 3). We consider preference optimality as rank-maximality. Any matching in the instance leaves one of p2 or p3 (similarly, one of p4 or p5) unmatched. Thus, any matching in G must have a maximum deviation of at least one and total deviation of at least two. It can be verified that the matchings M1 and M2 shown in Table 1 are optimal for the problems RMM-MIN-TOT and RMM-MIN-MAX respectively. Now we turn to problems where preference optimality is considered first and subject to that deviation is minimized. For concreteness, we fix the requirement for signature as (4, 0, 0, 3). In order to meet the signature requirement, we must violate the upper target of the post p1 by 2 and hence the max deviation of any matching in MR is at least 2. By the remark in the preceding paragraph, the total lower deviation of any matching in the instance is at least 2 and hence the total deviation of any matching in MR is at least 4. Furthermore, it can be verified that the matchings M3 and M4 shown in Table 1 are optimal for the problems RMM-SIGNMIN-TOT and RMM-SIGN-MIN-MAX respectively. The order in which we consider the two orthogonal parameters to optimize provides us with the trade-off between deviation and preference optimality. As expected, when the deviation is optimized first, the output matching has a weak signature. In contrast, when preference optimality is given precedence, a higher deviation is incurred. Thus, our problems offer a trade-off based on the user requirement. Our problems consider optimizing two objective functions deviation and preference optimality and hence can be regarded as a special case of a multi-objective optimization problem [Esfandiari et al., 2016; Korula et al., 2013; Aggarwal et al., 2014]. These works differ from our work in terms of the problem setup and the objectives considered. 2.2 Our Results We show the following new results in our paper. Theorem 1. OPT-SIGN-MIN-MAX and OPT-SIGN-MINTOT admit polynomial time algorithms, where OPT can be one of rank-maximality (RMM) or fairness (FAIR). Theorem 2. OPT-MIN-MAX and OPT-MIN-TOT admit polynomial time algorithms, where OPT can be one of rankmaximality or fairness. Our Techniques. The algorithms for computing minimum total deviation matchings are based on a flow network that we design. Flow networks are used to compute Pareto-optimal matchings [Cechl arov a et al., 2016; Cechl arov a and Fleiner, 2017; Cseh et al., 2022], rank-maximal matchings [Nasre et al., 2019] and cumulative signature matchings [K. A. et al., 2022] in the literature. For computing matchings with minimum value of maximum deviation we reduce the problem to an appropriate problem in the fixed quota setting which can be solved efficiently. 2.3 Other Related Works Biproportional apportionment. Our soft quota setting is similar to the biproportional apportionment problem [Dhamala and Thapa, 2009; Pukelsheim et al., 2012; Serafini and Simeone, 2012] which deals with fair division of political representatives to states. Political representatives correspond to applicants, and states correspond to posts in our model. In our setting, we have applicant preferences which are absent in the biproportional apportionment problem. Soft bounds. The recent works [Ehlers et al., 2014; Echenique and Yenmez, 2015; Gonczarowski et al., 2019; Aziz et al., 2020; Kurata et al., 2017] in the two-sided preferences model allow soft bounds on the capacity constraints. These works focus on the controlled school choice problem in which students are associated with types, and schools have soft bounds on the quota of each type. For more information, refer to the survey paper by [Aziz et al., 2022]. Other models of flexibility. In the context of school choice problem, capacity planning is studied in [Gajulapalli et al., 2020; Rios et al., 2021; Bobbio et al., 2021; Abe et al., 2022; Chen and Cs aji, 2023]. In the work of [Limaye and Nasre, 2023; K. A. et al., 2022] programs have costs instead of quotas which allows flexibility. In all these works, flexibility of upper quotas is considered, whereas in our work, we allow posts to deviate on both upper and lower targets. To the best of our knowledge, this is the first work that investigates soft quotas in the one-sided preference list model. 3 Preference Optimality First In this section, we present the results for the preference optimality first, deviation next type of problems. As mentioned earlier, apart from the input instance these problems have a signature as a part of the input. 3.1 Algorithm for RMM-SIGN-MIN-TOT An instance of RMM-SIGN-MIN-TOT problem consists of a graph G with applicant preferences, soft quotas of posts and an input signature ρ. Let MR denote the set of matchings where for every M MR, we have σ(M) R ρ. Our goal is to output a matching M that has minimum total deviation in the set MR. We present a flow-network based algorithm to solve the RMM-SIGN-MIN-TOT problem. Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence (IJCAI-23) [ld, ld] [0, 1] [ℓ(p1), u(p1)] [0, ld] Figure 2: Flow network H(ld,ud) for solving RMM-SIGN-MINTOT, where = ld + ud Overall idea. Consider the optimal matching M for the RMM-SIGN-MIN-TOT problem. Let the total deviation of M be . In M , a subset of posts incur deviation for the lower target whereas a subset of posts (disjoint from the previous set) incur deviation for the upper target of the corresponding post. Let ld (ud ) denote the sum of the deviation of posts that deviate w.r.t. the lower target (respectively w.r.t. the upper target). Note that = ld + ud . We show that there exists a flow network H(ld ,ud ) such that the matching M has a feasible flow f in H(ld ,ud ). We use this flow network to obtain the optimal value of total deviation. Construction of the network. Given an instance G of RMM-SIGN-MIN-TOT, we fix value which can be split as the sum of two non-negative integers ld and ud. We construct the flow network H(ld,ud) with costs and demands on the edges. See Figure 2 for an illustration. Vertices of H(ld,ud) are: V (H(ld,ud)) = s, d , d , t A p , p : p P . For every applicant a in G, we have a node a in the network H(ld,ud). For every post p in G, we have two nodes in H(ld,ud), that is p (original copy) and p (extended copy). In addition to the above nodes, we have a source s, sink t and two special nodes d and d which help us in capturing the lower deviation and upper deviation respectively. Every edge e E(H(ld,ud)) in the network has a vector [q (e), q+(e)] that represents the demand and capacity associated with it. The edge set E(H(ld,ud)) is the union of the following sets: E1 ={(s, a) : a A} with vector [0, 1] E2 ={(a, p ), (a, p ) : (a, p) E} with vector [0, 1] E3 ={(p , t) : p P} with vector [ℓ(p), u(p)] E4 ={(p , d ) : p P} with vector [0, ud] E5 ={(d , p ) : p P} with vector [0, ld] E6 ={(s, d ), (d , t)} with vectors [ld, ld] and [ud, ud] resp. All edges in H(ld,ud) except the edges in E2 have a cost of 0. For an edge (a, p ) or (a, p ) in E2, if the corresponding edge (a, p) in G is of rank i, then the cost of that edge in the flow network is nr i, where n = |A|. We remark that, for an (a, p) edge of rank i, we use the same assignment of costs as mentioned in [Irving et al., 2006]. It is worth noting that our reductions apply to any notion of optimality encoded as a maximum weight matching problem. Let f be a feasible flow in H(ld,ud). Recall that for a post p P, there are two nodes p and p in the network. For the node p , we define lower deviation of p w.r.t. the flow f as f d , p and upper deviation is undefined for this node. For the node p , we define upper deviation of p w.r.t. f as f p , d , and lower deviation is undefined for this node. For a post p P, we define the total deviation of p w.r.t. f as df(p) = f d , p + f p , d . Similar to the definition of total deviation of a matching, we define total deviation of the flow f in H(ld,ud) as dtot(f) = P p P f d , p +f p , d . By the construction of the network H(ld,ud), and by the definition of total deviation of a flow f in H(ld,ud), it is easy to verify that dtot(f) = = ld + ud. There is a natural correspondence between the flow f in the flow network H(ld,ud) and matching M as follows. M = {(a, p) | f a, p = 1 or f a, p = 1, a A, p P} It would be ideal to have a correspondence between the total deviation of f and the total deviation of M. However, it is not true as illustrated in the simple example. Consider an input graph G that has a single edge (a, p), and the post p has ℓ(p) = 0 and u(p) = 1. Consider the flow network H(1,1) for = 2. The network H(1,1) admits a feasible flow f with value 2 that routes 1 unit of flow through the path s, d , p , t and 1 unit of flow through the path s, a, p , d , t . The flow f has a total deviation dtot(f) = 2 as mentioned earlier. The matching corresponding to f contains a single edge (a, p) and has a total deviation of 0. Thus there is a mismatch between the total deviation of the flow and the total deviation of the corresponding matching. We remark that the issue arises because both the edges d , p and p , d carry a non-zero flow. See full version for additional scenarios. To address these discrepancies and ensure that the total deviation of the flow corresponds to the total deviation of the matching, a feasible flow in our flow network should satisfy additional properties which are listed below as Property 1. Property 1. Let f be a feasible flow in H(ld,ud). The flow f is good if for every post p P all of the following hold. (i) at most one of the edges (d , p ) and (p , d ) has a nonzero flow in f (ii) if f p , d > 0 then f p , t = u(p) (iii) if f d , p > 0 then f p , t = ℓ(p) We note that for a good flow f in H(ld,ud), the corresponding matching M in G satisfies dtot(f) = dtot(M). It is not immediate how to compute a good flow in a flow network. But given a feasible flow in H(ld,ud), we show how to construct a good flow in a different flow network. We need the following notations. For an edge (a, p) in G, we call the path s, a, p , t in the network as the true path, and the path s, a, p , d , t as the extended path. For a post p in G, we call the path s, d , p , t as dummy path. Note that corresponding to an applicant a and a post p in G, we have a true Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence (IJCAI-23) path and an extended path using either one of this we can route a maximum of 1 unit flow in the network. Corresponding to a post p in G, we have a unique dummy path, and we can route a maximum of min{ld, u(p)} units of flow along this path. Let f be a feasible flow in H(ld,ud). We say that the status of a post p P is good if the corresponding pair of nodes p , p in the network satisfies all the conditions given in Property 1, and otherwise the status of p is bad. Suppose we have a bad post p in G w.r.t. the flow f. We will reroute the flow in a certain way to turn the post p a good post without changing the status of any other posts in G. Now we describe different cases under which p can be a bad post and the rerouting required in order to turn it to a good post. Definition 2 (correcting flow). Let f be a feasible flow in the network H(ld,ud) that has a bad post p w.r.t. f and let f be another flow in H(ld,ud) that differs in only the flow across some of the s-t paths corresponding to the post p. Then f is called a correcting flow if it satisfies the following conditions. (i) There exists a network H(ld ,ud ) such that f is feasible in the network H(ld ,ud ) where ld + ud < ld + ud. (ii) The matchings corresponding to f and f are the same. (iii) Status of a post q P \ {p} remains same in f and f . We describe the types of a correcting flow f . (ld, ud) correcting flow. We assume that the post p is bad due to the violation of Property 1.(i). Thus, f d , p > 0 and f p , d > 0. We reduce the flow along the dummy path corresponding to p by 1. Then we select an arbitrary extended path carrying unit flow corresponding to the post p, say s, a, p , d , t and reroute the flow along the path s, a, p , t . Note that the resulting flow f is feasible in a network H(ld ,ud ) where ld = ld 1 and ud = ud 1. ud correcting flow. We assume that the post p is bad due to the violation of Property 1.(ii). Thus, f p , d > 0 and f p , t < u(p). We select an arbitrary extended path carrying unit flow corresponding to the post p, say s, a, p , d , t and reroute the flow along the path s, a, p , t . Note that the resulting flow f is feasible in a network H(ld ,ud ) where ld = ld and ud = ud 1. ld correcting flow. We assume that p is bad due to the violation of Property 1.(iii). Thus, f d , p > 0 and f p , t > ℓ(p). We reduce the flow along the dummy path corresponding to p by 1. Note that the resulting flow f is feasible in a network H(ld ,ud ) where ld = ld 1 and ud = ud. Note that in each of the above cases, if applicant a is matched to the post p prior to the rerouting, then it continues to remain matched to p after the rerouting. Since we have not changed the flow w.r.t. any post q P \ {p}, the status of the post q remains unchanged. Thus f is a correcting flow. Lemma 1. If f is a feasible flow in H(ld,ud) that does not satisfy Property 1, then there exists a correcting flow f in H(ld,ud) that satisfies Property 1. Proof Sketch. For each bad post p, we identify the cause(s) of it being bad. That is, we identify which of the Property 1.(i), (ii) or (iii) being violated. Then we apply the corresponding correcting flow to change the status of post p to good. To compute RMM-SIGN-MIN-TOT, we guess the value of the total deviation which can be split as ld + ud. We construct the network H(ld,ud) and check if the corresponding matching M has a signature σ(M) R ρ. We do a binary search on in the range [0,|A|] to compute the smallest value of such that we get a matching that respects preference optimality and the corresponding flow satisfies Property 1. 3.2 Algorithm for RMM-SIGN-MIN-MAX Given an instance of the soft quota setting, and an input signature ρ, our goal is to compute a matching M in MR that has a minimum value of max deviation. We start by guessing the max deviation value and solve the rank-maximal matching in a fixed quota instance. This problem is called the RMM-LQ problem (see full version for details). The fixed quota instance has the same set of applicants and posts and applicant preferences. For each post p, we set the lower quota q (p) as ℓ(p) and upper quota q+(p) as u(p) + . Let M be the resulting matching. If σ(M) Rρ, then we get a feasible matching that respects rank-maximality. We do a binary search on in the range [0,|A|] to find the smallest such that we get a feasible matching. We remark that by applying the techniques used in [Huang et al., 2016], RMM-LQ problem can be solved in O(rmn) time. 4 Deviation First In this section, we present the results for the deviation first, preference optimality next type of problems. In the interest of space, we defer some proofs to the full version. 4.1 Algorithm for RMM-MIN-TOT Given an instance of the soft quota setting, our goal is to compute a matching that is rank-maximal amongst all matchings that have a minimum value of total deviation. We first discuss how to compute a minimum total deviation matching in G without applicant preferences being considered. Given an instance G of the soft quota setting, we denote the same instance viewed as an instance of the fixed quota setting by Gx. We note that w.r.t. the problem of minimizing total deviation, there is a useful connection between matchings in G and Gx. If the fixed quota setting instance Gx admits a feasible matching M, then M has a total deviation of 0 in G and hence an optimal matching in G. However, if Gx does not admit a feasible matching, any minimum total deviation matching in G has a positive value for the total deviation. Additionally, we observe that in an optimal matching M in G, no post incurs a violation on the upper target. To see this, if for a post p we have more than u(p) many applicants matched to it in M , then we can remove the excess number of applicants M (p) u(p) matched to p from M and obtain a matching with a smaller value of total deviation. We use the above observation and the connection between G and Gx to compute a matching with minimum total deviation without applicant preferences. Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence (IJCAI-23) Minimizing total deviation. Given the instance G, we construct an instance G of the soft quota setting, where for every post p, ℓ (p) = u (p) = ℓ(p). Consider the fixed quota setting instance G x and let M be an arbitrary maximum matching in G x. By the maximality of M in a fixed quota setting instance G x where the lower and upper quota for every post is the same, it is easy to observe that M is a minimum total deviation matching in G . To obtain a minimum total deviation matching in the original instance G, we use the instance Gx and augment the matching M in Gx. We remark that augmenting a matching never decreases the number of applicants matched to any post. Thus, augmenting M in Gx increases the size of the matching while preserving its total deviation. For an illustration, consider the soft quota instance G given in Figure 1. We construct the soft quota instance G , and let G x be the corresponding fixed quota instance. Consider the maximum matching M = {(a1, p1), (a3, p3), (a5, p5)} in G x. We augment the matching M in Gx and obtain a matching M = {(a1, p1), (a3, p3), (a5, p5), (a6, p7), (a7, p6)}. Note that M is a maximum matching in Gx and it has a total deviation of 2 in G, which is the optimum value of total deviation. Note that since we disregarded preferences, the above procedure does not guarantee a matching that is optimal w.r.t. the preferences. For example, the matching M computed above is not a solution of RMM-MIN-TOT problem in G since the matching M1 (illustrated in Table 1) is more rank-maximal than M. In the rest of the section, we refine our ideas above to compute an RMM-MIN-TOT matching in G. We construct the instance G from G as discussed above. We compute an RMM-MIN-TOT matching M in G using the procedure given in step (A) below. We augment the matching M in Gx to obtain an RMMMIN-TOT matching M in G. The procedure for computing M is given in step (B). Step (A): Computing RMM-MIN-TOT in G . Given the instance G , we reduce the problem to computing a min-cost max-flow in a suitably designed flow network H. We show that a matching M corresponding to a min-cost max-flow in H has the following properties: M is a min total deviation matching in G and also in G. M is a rank-maximal matching amongst all min total deviation matchings in G . Note that M need not be a solution to RMM-MIN-TOT in G. For every post p, we introduce ℓ(p) many dummy applicants. Let Dp denote the set of dummy applicants associated with the post p, and let D = S p P Dp. We construct the flow network H as follows. The vertices of H are: V (H) = {s, t} (A D) P. The edge set E(H) is the union of the following sets: Es A ={(s, a) : a A} Es D ={(s, d) : d Dp, p P} EP t ={(p, t) : p P} EDP ={(d, p) : d Dp, p P} EAP ={(a, p) : a A, p P} Cost of the edges in H. For an edge (a, p) in G having rank i, we set cost of the corresponding edge in EAP as (nr + nr i). For all the other edges in H, we set the cost as 0. Capacity of the edges in H. For a (p, t) edge in EP t, we set capacity as ℓ(p). All other edges in H have a unit capacity. We define the deviation of a max-flow f in H. Definition 3 (Total deviation of a max-flow). The total deviation of a max-flow f in H is defined as dtot(f) = X where df(p) = P d Dp f (d, p) is the incoming flow to the vertex p through the dummy applicants. There is a natural correspondence between a flow f in H and a matching M in G as follows M = {(a, p) | f(a, p) = 1, (a, p) EAP }. Cost of a max-flow in H and total deviation of a matching in G : By the design of our network H, we ensure that the cost of a max-flow in H is directly proportional to the total deviation of the corresponding matching in G . This is proved using Lemma 2 and Lemma 3. Lemma 2. Let f, f be two max-flows in H with dtot(f) < dtot(f ). Then cost(f) < cost(f ). Proof Sketch. Due to the term nr in the cost of each (a, p) EAP , observe that cost(f) < cost(f ). Lemma 3. Let f, f be two max-flows in H with cost(f) < cost(f ). Then dtot(f) dtot(f ). Proof of Lemma 3 follows from Lemma 2. Lemma 4. Any min-cost max-flow in H corresponds to a min total deviation matching in G . Lemma 5. If k is the minimum value of the total deviation of any matching in G , then the minimum value of the total deviation of any matching in G is k. Corollary 1. Any min-cost max-flow in H corresponds to a min total deviation matching in G. We now prove the preference optimality of the matching obtained by step (A). Lemma 6. Any min-cost max-flow in H corresponds to a solution to RMM-MIN-TOT in G . Proof Sketch. By Corollary 1, a matching M corresponding to a min-cost max-flow f is a min total deviation matching in G . Using the definition of rank-maximality and the correspondence between matching in G and a max-flow in H, we show that M is a solution to RMM-MIN-TOT in G . In step (A), we have computed a matching M which is an RMM-MIN-TOT matching in G . In step (B), we show how to compute a matching M which is an RMM-MIN-TOT matching in G by augmenting the matching M in Gx. We augment the matching M, such that in the j-th augmentation we get a rank-maximal matching Mj amongst all matchings in the fixed quota instance Gx with cardinality|M| + j. Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence (IJCAI-23) Step (B): Computing RMM-MIN-TOT in G by augmenting M in Gx. Our algorithm is iterative and we begin with the matching M0 = M. In iteration j, we augment Mj 1 in Gx by computing an s-t path in a directed graph Hj with costs as defined below. Vertices of Hj are defined as follows. V (Hj) = {s, t} A P. Let A denote the set of applicants that are unmatched in Mj 1 and let P denote the set of posts with ℓ(p) Mj 1(p) < u(p). Thus, P is the set of posts with d Mj 1(p) = 0. The edge set E(Hj) is the union of the following sets: Es A ={(s, a) : a A } EAP ={(a, p) : (a, p) E \ M} EP t ={(p, t) : p P } EP A ={(p, a) : (a, p) E M} Cost of the edges in Hj. For an edge (a, p) E of rank i, if (a, p) EAP , cost of the edge is nr i if (p, a) EP A, cost of the edge is nr i All other edges in Hj have a cost of 0. Let T j denote a mincost s-t path in Hj, and let Tj = T j E. We augment the matching Mj 1 using Tj, and the resulting matching is Mj = Mj 1 Tj. For a set S E, let #i(S) denotes the number of rank i edges in S. We require Definition 4 to prove the correctness. Definition 4 (Signature of a path Tj). We define the signature of a path Tj w.r.t. the matching Mj 1 as an r-tuple (x1, x2, . . . , xr) denoted by σ(Tj), where xi = #i(Tj Mj 1) #i(Tj \ Mj 1) Note that σ(Tj) can contain negative values. It is useful to note the connection between the signature of a path σ(Tj) and the cost of the path cost(Tj). i=1 xinr i. We observe the following: Observation 1. Pr i=1 xi = 1. Observation 1 is trivial because Tj is an augmenting path. Let σ(Mj 1) = (y1, y2, . . . , yr). The value xi > 0 in σ(Tj) indicates that Tj has more number of matched rank i edges than the unmatched rank i edges, and the value xi cannot be greater than the number of matched rank i edges yi in Mj 1. This is precisely stated in Observation 2. Observation 2. For any i [r], if xi > 0 then xi yi. Since Mj = Mj 1 Tj, it is easy to observe that σ(Mj) = σ(Mj 1) σ(Tj). Observation 3. Every component in σ(Mj) is non-negative. Observation 3 follows from Observation 2. We state a series of claims to prove the correctness of step (B). We ensure that our augmentations do not change the value of total deviation of the resulting matching, and this is stated in Claim 1. Claim 1. If dtot(M) = k, then dtot(Mj) = k. Next, we show that as the cost of path decreases, signature of the path w.r.t. rank-maximality also decreases. Claim 2. Let T j and Q j be two s-t paths in Hj such that cost(T j) < cost(Q j). Then σ(Tj)