# generalized_stochastic_matching__67fa2d48.pdf

Generalized Stochastic Matching

Alireza Farhadi, Jacob Gilbert, Mohammad Taghi Hajiaghayi

University of Maryland farhadi@cs.umd.edu, jgilber8@umd.edu, hajiagha@cs.umd.edu

In this paper, we generalize the recently studied stochastic matching problem to more accurately model a significant medical process, kidney exchange, and several other applications. Up until now the stochastic matching problem that has been studied was as follows: given a graph G = (V, E), each edge is included in the realized sub-graph G of G mutually independently with probability pe, and the goal is to find a degree-bounded sub-graph Q of G that has an expected maximum matching that approximates the expected maximum matching of G. This model does not account for possibilities of vertex dropouts, which can be found in several applications, e.g. in kidney exchange when donors or patients opt out of the exchange process as well as in online freelancing and online dating when online profiles are found to be faked. Thus, we will study a more generalized model of stochastic matching in which vertices and edges are both realized independently with some probabilities pv, pe, respectively, which more accurately fits important applications than the previously studied model. We will discuss the first algorithms and analysis for this generalization of the stochastic matching model and prove that they achieve good approximation ratios. In particular, we show that the approximation factor of a natural algorithm for this problem is at least 0.6568 in unweighted graphs, and 1/2 + ϵ in weighted graphs for some constant ϵ > 0. We further improve our result for unweighted graphs to 2/3 using edge degree constrained subgraphs (EDCS).

Introduction The stochastic matching problem has been used to model kidney exchange in several research papers in recent years, and in this paper, we generalize this model to better suit the needs of kidney exchange and other applications. Kidney exchange is an important medical procedure that is utilized to increase the amount of possible successful kidney transplants between patients and donors for hundreds of donor-patient pairs in the U.S. each year. This medical process occurs when an incompatible kidney donor-patient pair matches with another incompatible pair such that the donors are swapped to become compatible pairs with the patients. Unfortunately, compatibility medical testing can require patients and donors to be hospitalized and are expensive. Thus,

Copyright 2022, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.

minimizing the amount of compatibility tests while maximizing compatible exchanges is an important problem in the medical world. Moreover, in this paper we also account for the possibility that a patient or donor may decide to drop out of the exchange due to health conditions or at their discretion at any point throughout the months long process. Therefore, while prior papers only considered donor-patient compatibility, we will also consider potential dropouts from the exchange process on top of compatibility. In our proposed stochastic matching model of kidney exchange, each donor-patient pair is represented by a vertex in the graph. Edges in this graph G represent donor-patient pairs that may be compatible for exchange. Only a subset of the edges are found to be compatible through medical records and testing, and this subset forms a realized subgraph G of possible successful exchanges. We say these edges are realized, i.e. appear in G, with some probability pe. Similarly, a vertex is realized with probability pv if the pair does not dropout during the exchange process. In our generalized model, an edge can only be included in G if both of its vertices are realized as well. A maximum matching algoritm seeks to pair vertices connected by an edge of the graph together to create the maximum amount of matches. So, a maximum matching of G represents maximized compatible kidney exchanges. As mentioned, medical tests for compatibility are expensive, and so querying the edges of G to see if they were realized should be kept to a minimum. Without knowing the sub-graph G, the goal of the stochastic matching problem is to find some degree-bounded subgraph Q with an expected maximum matching of realized edges that has a size approximately that of the actual maximum matching of G. We will state and prove the existence of the first bounds for the approximation ratio achieved with this model of kidney exchange in which vertices and edges may be dropped from the original graph.

Additional Applications

With our generalized stochastic matching, in addition to kidney exchange we can model the freelancing industry comprised of freelance workers and their potential employers. In modern freelancing, workers may have online profiles on websites that businesses can look through to find freelancers with compatibility for a job or project. Unfortunately, a large amount of fake profiles and fake job offerings plague these

The Thirty-Sixth AAAI Conference on Artificial Intelligence (AAAI-22)

websites. Finding out profiles and jobs are fake costs time and money from those who hired the fake profiles or the freelancers who took up a fake job. Therefore, in the online freelancing problem the goal is to maximize matchings between jobs and freelancers while minimizing the amount of queries to freelancers and employers needed to match real freelancer profiles to real job offers. To model online freelancing with stochastic matching, profiles and companies will make up the vertices of some graph G, and there is an edge between profiles and companies if a freelancer fits the qualifications for a company s job opening. Edges may be weighted by the amount a company will pay a freelancer for the job, or remain unweighted if all jobs are nearly equally valuable. Each vertex is realized with some probability pv as long as the online profile or company is real. Each edge (u, v) is realized as long as both of the vertices u and v are realized. As in kidney exchange, queries of edges/vertices are expensive since they require profile reviews and lengthy communications, but in this version of the problem only vertices have a realization probability while edges are always realized if both of its vertices have been realized. Online dating is a very similar scenario with potential fake vertices, but edges between dating profiles may also drop out if a match does not lead to a relationship. In our results later in the paper, we will state and prove bounds to the approximation ratio achieved for weighted graphs for the freelancing and dating model. Besides the aforementioned applications, the problem is significant from a computer science theory perspective as a discussion of graph sparsification. We will show that a simple, well-studied algorithm provides a sparse sub-graph with a good approximation of the expected maximum matching of the original graph for our generalized version of stochastic matching. This sub-graph will conform to a tight restriction: any vertex has at most constant O(1) degree.

Generalized Stochastic Matching Model As discussed, the kidney exchange problem may be modeled with our proposed generalized stochastic matching model. In the stochastic setting, we have a random sub-graph, the realized sub-graph, of some given graph, and we want to approximate some property of the realized sub-graph. Definition 1. Given fixed parameters pv, pe (0, 1] and weighted or unweighted graph G = (V, E) with vertex set V and edge set E V 2, let graph G = (V, E) be a subgraph of G such that any vertex v V is in V mutually independently randomly with probability pv and any edge e = (u, v) E is in E mutually independently randomly with probability pe if u, v V. We call G the realized subgraph of G. Definition 2. Given weighted graph G = (V, E, W) where W is a set of edge weights, let we W be the weight of edge e E. Define M(G) to be the maximum weighted matching of G; furthermore, let µ(G) := P e M(G) we be the weight of the the maximum matching of G. In the stochastic matching problem, we want to find a sparse sub-graph of a given graph such that the realized portion of the sparse sub-graph approximates the maximum

weighted matching of the realized sub-graph. More formally, given a graph G with n vertices, we want to find a sub-graph Q = (V, EQ) that satisfies the following two conditions:

1. Let Q = Q G, then the approximation ratio E[µ(Q)]/E[µ(G)] is as large as possible. 2. The degree of Q is O(1). Specifically, the maximum degree of any vertex in Q may be bounded by a constant determined by pv, pe but not n.

So, if we can find such a sub-graph Q, then we may query the O(n) edges of Q instead of doing expensive queries to all O(n2) edges of G to find out which were realized. However, finding such a sparse sub-graph and proving it has a large approximation ratio is non-trivial.

Related Work The less generalized version of stochastic matching in which all vertices are realized with probability 1 was first introduced by (Blum et al. 2015) primarily to model the kidney exchange setting. In this paper, the authors showed positive empirical results on simulated and real data from the United Network for Organ Sharing in which stochastic matching algorithms resulted in a good approximation of the optimal solution. This problem has been extensively studied since then (Assadi, Khanna, and Li 2016; Yamaguchi and Maehara 2018; Behnezhad and Reyhani 2018; Behnezhad et al. 2019a). The first discussion of this less generalized problem by (Blum et al. 2015) achieved an approximation ratio of (1/2 ϵ) in unweighted graphs, and then (Assadi, Khanna, and Li 2017) broke the half approximation barrier with an approximation ratio of .5001. This bound was later improved by (Behnezhad et al. 2019b) to .6568 and by (Assadi and Bernstein 2019) to (2/3 ϵ). Afterwards, (Behnezhad, Derakhshan, and Hajiaghayi 2020) and (Behnezhad and Derakhshan 2020) both built on the analysis of the algorithm proposed by (Behnezhad et al. 2019b) to further improve approximation ratios for unweighted and weighted graphs to (1 ϵ), respectively. We adapt this same algorithm as Algorithm 1 below to fit our model.

Our Results In the Crucial Edges and Unweighted Approximation section, we will achieve and prove a .65 approximation ratio for unweighted stochastic matching, i.e. the kidney exchange model. By adapting the analysis techniques of (Behnezhad et al. 2019b) for our new generalization of stochastic matching, we will prove the following theorem and lower bound for the unweighted case:

Theorem 3. For unweighted graph G, constant ϵ > 0, vertex and edge realization probabilities pv, pe (0, 1], there is an algorithm to find an Oϵ,p(1)1-degree subgraph Q of G such that E[µ(Q)]/E[µ(G)] .6568 ϵ.

In the Weighted Approximation section, we further consider the weighted stochastic matching problem, i.e. the freelancing model. Here, we prove the following bounds:

1We use Oϵ,p(.) to hide the dependency on poly(ϵ, pv, pe).

Algorithm 1: An algorithm for the generalized stochastic matching problem.

Input: Input weighted graph G = (V, E) and realization probabilities pv, pe [0, 1]. Parameter: R := 2000 log(1/ϵ) log(1/(ϵp2 vpe)) ϵ4p2 vpe 1: Q (V, ) 2: for r = 1, . . . , R do 3: Construct a sample Gr = (Vr, Er) of G, where any vertex v V appears in Vr independently with probability pv, and each edge e E connecting vertices u, v appears in Er independently with probability pe if and only if u, v Vr. 4: Add the edges in maximum weighted matching M(Er) of Gr to Q. 5: end for 6: Query the edges in Q and report the maximum weighted matching of it.

Theorem 4. For weighted graph G, constant ϵ > 0, vertex and edge realization probabilities pv, pe (0, 1], there is an algorithm to find an Oϵ,p(1)-degree subgraph Q of G such that E[µ(Q)]/E[µ(G)] .501 ϵ. One important distinction between our generalizations for freelancing and kidney exchange from prior work is that some edges are no longer realized completely independently. Specifically, if a vertex is not realized in our model, then every edge connected to it is also not realized. This dependence sets our model apart from previous stochastic matching papers. Thus, our techniques will not utilize independent realizations of certain edges, a property that both (Behnezhad, Derakhshan, and Hajiaghayi 2020) and (Behnezhad and Derakhshan 2020) have relied on before. We also improve our bound for unweighted graphs to (2/3 ϵ) in the EDCS 2/3 Approximation section using edge degree constraint sub-graphs (EDCS). Theorem 5. For unweighted graph G, constant ϵ > 0, vertex and edge realization probabilities pv, pe (0, 1], there is an algorithm to find an Oϵ,p(1)-degree subgraph Q of G such that E[µ(Q)]/E[µ(G)] 2/3 ϵ. While the bound of Theorem 5 currently dominates that of Theorem 3 for unweighted graphs, future improvements to Algorithm 1 and its analysis will eventually most likely overtake the 2/3 approximation ratio provided by the EDCS approach.

Algorithm 1 Analysis Before we can prove our main results, we must introduce the concept of fractional matchings and the related procedures we use to build these fractional matchings. Constructing an integral matching directly on our sparse sub-graph Q is difficult since we want a good approximation ratio in expectation without directly knowing G. Instead, we can relax our matching requirements to allow assigning fractional values to edges of our matching, and then later show that the fractional matching serves as proof of the existence of a integral matching of the same approximation ratio.

Fractional Matchings In order to prove Theorems 3 and 4, we will find a fractional matching x of Q that achieves a .6568 ϵ approximation ratio and .501 ϵ approximation ratio, respectively. In an integral matching, each vertex can only be matched to one other vertex. Alternatively, one can think of an integral matching as assigning a value of either 1 or 0 to every edge such that no vertex has two incident edges with value 1. A fractional matching x provides more flexibility in analysis than an integral matching since it allows assigning fractional values xe [0, 1] to edge e such that for any vertex, xv := P v e xe 1. Once we have our fractional matching and prove that it achieves our target approximation ratios, we will use the following folklore lemma to claim the existence of an integral matching y that achieves the same approximation ratio to complete the proofs of Theorem 3 and Theorem 4. Note that in the following lemma, Lemma 6, given graph G = (V, E) and subset U V , we use E(U) to refer to the edges of the induced sub-graph on G by U which includes every edge (u, v) E such that u, v U.

Lemma 6. Let x be a fractional matching, ϵ > 0 be a constant, and G = (V, E) be an edge weighted graph where we is the weight of edge e E. If for all U V such that |U| 1/ϵ it is true that P e E(U) xe |U|/2 , then G has an integral matching y such that P e E we ye (1 ϵ) P e E we xe.

Proof of Lemma 6 and further discussion about fractional matchings can be found in (Behnezhad et al. 2019b) in section 2.2. Now we see that to prove Algorithm 1 provides a good expectecd maximum matching, a fractional matching x on Q will need to satisfy the requirements of Lemma 6 in addition to achieving the target approximation ratio. To create such a matching, we will combine two smaller matchings over two disjoint sets of edges, a set of non-crucial edges and a set of crucial edges. Each edge will be classified as non-crucial or crucial based on the probability that the edge appears in the maximum matching of G.

Definition 7. For edge e E, we define qe := Pr(e M(G)) as the probability that e appears in the maximum weighted matching of realized sub-graph G, and we will refer to qe as the matching probability of edge e2. Additionally, for vertex v V , let qv := P e v qe. For a vertex v and subset X E, q(X) := P e X qe and q X v := P e:e X,v e qe.

Definition 8. Let threshold τ = ϵ3p2 vpe 20 log(1/ϵ), then edge e is crucial if qe τ and non-crucial if qe < τ. We will use C to denote the set of crucial edges and N to denote the set of non-crucial edges.

The matchings over non-crucial and crucial edges will be constructed with procedures analyzed below. When creating these procedures, we will have to keep a few things in mind about our new model. First, edges are only realized if their incident vertices are realized, and so they can be thought of

2Given a realization, we can assume that edges belong to the maximum weighted matching are unique. These edges can be the edges returned by an arbitrary deterministic algorithm.

as having a realization probability of not just pe but p2 vpe. With this in mind, when we sort edges into non-crucial and crucial sets, we make sure our threshold incorporates this realization probability p2 vpe in Definition 8. Furthermore, note that in Algorithm 1, the product p2 vpe makes an appearance in the number of iterations. One of the main reasons is to easily relate the number of rounds of the algorithm to our matching probability threshold τ. Another quirk of the new model is that there is a correlation in realization probabilities of adjacent edges that share an incident vertex. In step 2 of the upcoming non-crucial edge procedure, we scale down our fractional matching by a factor of pv to account for this. The first procedure we discuss will create a near-optimal matching on the non-crucial edges using the following useful observation. Observation 9. E[µ(G)] = P e E weqe. Essentially, we will use matching probabilities as the assigned values in our fractional matching because the expected size of the maximum matching of G is just the sum of edge weights times matching probabilities. Since we don t actually know the matching probabilities exactly, we will assign the value of fe, defined as the fraction of times edge e appears in a maximum matching of an iteration of Algorithm 1 out of R, the total number of iterations. Definition 10. For an edge e E, let k be the number of times e appeared in a maximum weighted matching of a sampled graph Gi for 1 i R during Algorithm 1. We define fe := k

R. Observe that from the above definition, E[fe] = qe, which is what we wanted.

Non-crucial Edge Procedure Given Q = (V, EQ) from Algorithm 1 with realized subgraph Q = (V, EQ). Let xe = 0 for all edges e E. Then, 1. For any realized edge e EQ N, set xe to be

min{fe/(p2 vpe), 2τ/(p2 vpe)}.

2. Let se be the scaling-factor of e where the default value is se = 1. For each vertex v V and edge e incident to v, set se

se = min n se, max{q N v , ϵ}/(pv X

e v xe) o .

Note that this step may be done for each vertex in an arbitrary order. 3. Finally scale down the fractional matching with se. So for all edges e, let xe := xe se. By definition of a fractional matching, it is required that xv 1 for any vertex v V . So, each vertex can be thought to have a budget of size 1. We want to make sure the noncrucial edge procedure leaves some remaining budget for the crucial edges later. To this end, we set the scaling-factors in step 2 to have a factor of P e v xe in the denominator. Moreover, we place a pv in the denominator because vertices can only be matched if they are first realized. Altogether since

qv pv, in expectation the scaling-factor q N v /(pv P e v xe) should keep xe below 1. Note that in the actual definition of the scaling-factors we have a max over q N v and ϵ, and so, we must make sure it is small enough to stay within the vertex budget even with ϵ. So as discussed, the properties of the following lemma prove an upper bound on the size of the fractional matching per vertex, i.e. some budget remains for crucial edges. Additionally, the first property of the following lemma, Lemma 11, proves that the non-crucial edge matching satisfies the requirements of Lemma 6, which as mentioned will be applied in our last step of this analysis to go from our fractional matching to an integral matching. The proof of Lemma 11 and other missing proofs of this section are available in the Appendix.

Lemma 11. Given graph G = (V, E), constant ϵ (0, 1], and fractional matching x from the non-crucial edge procedure:

1. U V such that |U| 1/ϵ, P e E(U) xe ϵ |U|/2 . 2. v V , xv max{qv, ϵ}/pv.

Now, we have proven that the non-crucial edge procedure is not only a fractional matching, but leaves some room to grow when we discuss crucial edges. However, as previously discussed, we must now also prove that the current matching is nearly optimal in size. The following definition will be useful to see this.

Definition 12. For X E, define φ(X) := P e X we qe as the expected matching weight of X. Also, for a vertex v and subset X E, we define φX v := P e X,v e we qe.

By Observation 9, the expected size of the maximum mathcing on G is φ(E). So in Lemma 13, we will show that matching x over non-crucial edges of our sparse sub-graph is within a (1 ϵ) factor of φ(N). To begin, we will bound the size of xe from the first step of the non-crucial edge procedure.

Lemma 13. E

e EQ N we xe

We then show that the scaling factor is at least (1 5ϵ) with high probability. Therefore, we will know that the scaling factor does not decrease the size of the fractional matching too greatly.

Lemma 14. For vertex v of realized edge e EQ N with probability at least 1 2ϵ, it is true that max{q N v , ϵ}/(pv P e v xe) 1 5ϵ.

Proof of Lemma 14. Note that the following proof is an updated proof from Claim A.5, (Behnezhad et al. 2019b) for the new model. In order to prove this lemma, we will need to relate the numerator of the scaling factor, q N v , to the denominator of the scaling factor, P e v xe. Let xv := P e v xe. To create this relation, we will analyze fe since it is related to qe and xe. Note from the lemma s statement, we have already specified some incident edge e = (u, v) EQ N. If e is not realized, we know that xe will just be 0 and does not contribute to the matching. Therefore, moving forward we assume that e is realized. We will use e1, e2, ..., ek EQ N

to denote the rest of the edges incident to v, realized or unrealized. Now, let f N v := P ei fe; we will begin the proof by showing that f N v is a good approximation of q N v . Trivially, since E[fe] = qe, E[f N v ] q N v . Intuitively, f N v is the fraction representing the number of matchings in which v is matched by some edge ei in a round of Algorithm 1 divided by R 1 since we do not count the round v is matched by e. So, by Hoeffding s inequality we have

Pr(f N v q N v ϵ2) Pr(f N v E[f N v ] ϵ2)

exp( 2(R 1)ϵ4) ϵ.

So far we have that with probability at least 1 ϵ, f N v q N v ϵ2, and this inequality implies that with probability at least 1 ϵ, max{f N v , ϵ} max{q N v , ϵ} ϵ2. Therefore, with probability at least 1 ϵ,

max{f N v , ϵ} max{q N v , ϵ} + ϵ2 (1 + ϵ) max{q N v , ϵ}. (1)

So, we have shown that f N v is close to (1 + ϵ)q N v . Note that in step 2 of the non-crucial edge procedure, se only decreases when xv is greater than max{q N v , ϵ} . Therefore, we will show that the probability of this happening is small, but we will use f N v in place of q N v . To do so, we define X1, X2, ..., Xk to be the random variables conditioned on edge e already being realized where we set Xi to 0 if ei is not realized and (min{fei/(pvpe), 2τ/(pvpe)}) if ei is realized. Note that we conditioned on the realization of edge e to keep the set of events Xi mutually independent. Since edge e is realized, we know that the incident vertex v to each edge ei must also be realized. Furthermore, the probability that each edge ei is realized is now only pvpe since a factor of pv has been removed for the incident vertex. Now, we define X := P ei Xi. Observe that X = xv pv xe since step 1 of the non-crucial edge procedure sets any realized edges ei to be min{fei/(p2 vpe), 2τ/(p2 vpe)}. Additionally by linearity of expectation, E[X] = P i E[Xi] fv. Now we will consider two cases. First, assume E[X] ϵ/(2pv), then by definition of X we also have that E[X] max{f N v , ϵ}/(2pv). From here, we can achieve the following probability bound:

Pr(X > max{f N v , ϵ})

= Pr(X max{f N v , ϵ}/2 max{f N v , ϵ}/2)

Pr(X E[X] max{f N v , ϵ}/2)

= Pr X 1 + max{f N v , ϵ} 2 E[X]

exp max{f N v , ϵ}/2 6τ/(pvpe)

, by Chernoff bound.

exp ϵ 12τ/(pvpe)

exp( 1/ϵ2) ϵ.

For the second case, if E[X] > ϵ/(2pv):

Pr(X > (1 + ϵ) max{f N v , ϵ}/pv) Pr(X > (1 + ϵ)E[X])

, by Chernoff bound.

, since E[X] > ϵ/(2pv).

exp 1 log(1/ϵ)

Thus, in either case we know that

Pr(X (1 + ϵ) max{f N v , ϵ}) ϵ.

Since X = xv xe, we also know that with probability at least 1 ϵ,

xv (1 + ϵ)(max{f N v , ϵ}/pv) + pv xe (1 + ϵ)(max{f N v , ϵ}/pv) + ϵ

(1 + 2ϵ)(max{f N v , ϵ}/pv).

Now, we can use Inequality 1 to obtain our inequality to substitute in q N v . With probability at least 1 2ϵ,

xv (1 + 2ϵ)(max{f N v , ϵ}/pv)

(1 + 2ϵ)(1 + ϵ)(max{q N v , ϵ}/pv)

(1 + 5ϵ)(max{q N v , ϵ}/pv).

Finally, with probability at least 1 2ϵ,

max{q N v , ϵ}/( xv pv) 1 1 + 5ϵ 1 5ϵ.

Lemma 15. Given a non-crucial edge matching x from Procedure 1,

(1 10ϵ)φ(N).

Crucial Edges and Unweighted Approximation Now we will augment the previous fractional matching from the non-crucial edge procedure with a second procedure on crucial edges. The crucial edge procedure is different for unweighted and weighted graphs, and we first give the slightly simpler procedure for unweighted graphs. The resulting fractional matching will give us our desired approximation ratio to prove Theorem 3. In our new stochastic matching model, it is important to emphasize that the realization of non-crucial edges actually gives some information about the realization of crucial edges. In fact, because some non-crucial edges share an incident vertex with crucial edges, the probability of a crucial edge being realized loses a factor of pv when we know an adjacent non-crucial edge already has been realized as that implies the shared incident vertex has been realized as well. In order to avoid this further complexity to the problem, this crucial edge procedure will assign remaining budget to

a crucial edge based on the incident vertices non-crucial edge budget such that no vertex contributes more than size 1 to the whole matching. Specifically, we will set xe of edge e = (u, v) for crucial edges equal to approximately 1 q N v or 1 q N u (with an additional (1 ϵ) factor) in the crucial edge procedure. Additionally, we utilize a probability distribution defined below. Definition 16. Take any matching M of the realized crucial edges in sub-graph Q = (V, EQ) from Algorithm 1. Given G, the appearance probability of M is the probability that M is the exact set of crucial edges that is in both EQ and the maximum matching of G.

Crucial Edge Procedure 1. Draw a matching MC of EQ C based on the appearance probabilities of matchings given G C. 2. For e = (u, v) MC, let xe = (1 ϵ) min{1 q N u , 1 q N v }. Trivially, this procedure makes sure that all vertices v V do not exceed their budget, i.e. xv 1. So, first, we will prove that this procedure still allows fractional matching x to satisfy the constraints of Lemma 6. Lemma 17. U V with |U| 1/ϵ, P e E xe |U|/2 . Second, we show that the size of the fractional matching after the crucial edge procedure is the size we wanted. The proof of the following lemma largely relies on the optimal matching on non-crucial edges from the non-crucial edge procedures as well as a bit of algebra involving matching probabilities and the additional value assignments from the crucial edge procedure. Lemma 18. Given unweighted graph G and fractional matching x from the non-crucial and crucial edge procedures, E[P e EQ xe]/E[µ(G)] (1 2ϵ)(4

Finally, we may prove our first theorem, Theorem 3.

Proof of Theorem 3. Take fractional matching x on Q = (V, EQ) as constructed by the non-crucial edge and crucial edge procedures. By Lemma 17, x satisfies the condition for Lemma 6. Thus, by Lemma 6, there exists an integral matching y on Q with size at least (1 ϵ) times that of x. So, we have that

By Lemma 6.

2 5)E[µ(G)] By Lemma 18. (.6568 ϵ0)E[µ(G)] where ϵ0 = 3(.6568)ϵ.

Unfortunately, it can be shown that this bound is tight for our analysis using the non-crucial and crucial edge procedures as there are examples of graphs where the best approximation using these procedures has ratio (4

2 5) (see (Behnezhad et al. 2019b))

Weighted Approximation For weighted graphs, we will begin similarly to the unweighted graph analysis. The end goal will be to create a fractional matching of non-crucial edges and crucial edges to prove our desired .501 bound. We have slightly changed and improved the results of this section compared to (Behnezhad et al. 2019b) by working with a smaller constant δ in Definition 20. From this update, the approximation ratio is a bit better for the following weighted case analysis. Due the new model, we must again work around the correlation between realization probabilities of adjacent edges in our new model. First, the non-crucial edge procedure will create a near-perfect fractional matching x on non-crucial edges as before. However, a new procedure for crucial edges will be used that will not only assign fractional matching values to crucial edges but may also modify the values in the fractional matching given to non-crucial edges. By lowering the previously assigned non-crucial edge matching values, there will be more budget for crucial edges to take. In weighted graphs, it is possible that crucial edges have high weights and significantly contribute to the maximum matching of G. For this reason, increasing the remaining vertex budget accordingly for crucial edges is necessary. So, the following procedure considers matching probabilities and edge weights when assigning fractional matching values to the crucial edges, and then it modifies the non-crucial fractional matching if needed.

Weighted Crucial Edge Procedure 1. As in the previous crucial edge procedure, draw some matching MC of EQ C according to appearanceprobabilities. For vertex v and constant α, define

g(v, α) := min{q N v , 1 α}

q N v φN v .

2. For e = (u, v) MC, set xe := (1 ϵ) arg max 0 α 1 (g(u, α) + g(v, α) + α we).

3. For any vertex v, if xv > 1, scale down fractional matching of incident non-crucial edges until xv 1. Before analyzing this new procedure, we want to bound φ(N) and φ(C) so that they are easier to work with. To do so, we utilize the following lemma discussing the relation of φ(C) to Algorithm 1 s expected matching size. The proof of the lemma and other missing proofs of this section can be found in Appendix. Lemma 19. Given sub-graph Q = (V, EQ) from Algorithm 1, E[µ(Q)] (1 ϵ)φ(C). Recall that from Procedure and Lemma 15, there is an expected matching of Q of size at least (1 10ϵ)φ(N). From Lemma 19, we know that there is also an expected matching size of Q with size at least (1 ϵ)φ(C). Thus, if either φ(C) or φ(N) is at least .501 E[µ(G)], we have our approximation ratio from Theorem 4. We may now focus on when this is not the case. Since E[µ(G)] = φ(C) + φ(N), we will analyze the case when .499 E[µ(G)] φ(C), φ(N) .501 E[µ(G)].

For this analysis, we will utilize the matching constructed by the weighted crucial edge procedure and further classify crucial edges by edge weights to make claims about this matching s expected weight.

Definition 20. Let δ = .09, then a crucial edge e = (u, v) C is heavy if we (1 + δ)(φN u + φN v ), and we denote the set of heavy edges with H. A crucial edge is semi-heavy if e is not heavy, we 2(1+ δ)φN v , and q N u (1 δ) where q N v q N u . We denote the set of semi-heavy edges with H .

Observe that by definition, the weight of a heavy edge e = (u, v) H is larger than the expected fractional matching of adjacent non-crucial edges. As such, from step 2 of the weighted crucial edge procedure, xe will be maximized when α = 0 and xe = (1 ϵ). Furthermore, step 3 will reduce the fractional matching of adjacent non-crucial edges to 0 so that xu, xv 1. Similarly, for semi-heavy edge e = (u, v) H , by definition the weight of e is greater than the expected non-crucial fractional matching of one of its vertices. Specifically, from step 2 of the procedure we have that xe (1 ϵ)(1 q N u ) (1 ϵ)δ. With these crucial edge bounds, we can prove that if a graph has enough heavy and semi-heavy edges, then will will reach at least the desired .501 approximation ratio.

Lemma 21. If φ(H) + φ(H ) 0.074φ(C), then E[µ(Q)]/E[µ(G)] .501 11ϵ.

Now, we must look at when φ(H) + φ(H ) < 0.074φ(C), i.e. when the heavy and semi-heavy edges do not constitute a large part of the expected maximum matching on crucial edges. We will use C to denote the set of crucial edges that are not heavy nor semi-heavy. We will direct edges of C based on their adjacent non-crucial edges contribution to the maximum matching of G.

Definition 22. We will classify edges e = (u, v) C into the following three types of edges:

1. If φN v φN u , direct e towards v. 2. If φN v < φN u and we 2(1 + δ)φN v , direct e towards v. 3. Else, direct e towards u. Note, in this case φN v < φN u and we > 2(1 + δ)φN v .

Note that edge types 1-3 partition C . Thus, using these definitions and directed edge types, we may prove properties of C and our desired bound for the approximation ratio of sub-graph Q from Algorithm 1.

Theorem 23. Given sub-graph Q from Algorithm 1, E[µ(Q)]/E[µ(G)] .501 11ϵ.

EDCS 2/3 Approximation

In this section we improve our bound for unweighted graphs to 2/3 ϵ for an arbitrary constant ϵ > 0. Inspired by a work of (Assadi and Bernstein 2019) we use edge-degree constrained sub-graph (EDCS) for designing our algorithm. Before stating our result we first give the definition of EDCS from (Bernstein and Stein 2015) and (Bernstein and Stein 2016).

Definition 24 ((Bernstein and Stein 2015, 2016)). For any graph G = (V, E), and integers β β 0, an edgedegree constrained sub-graph(EDCS)(G, β, β ) is a subgraph H = (V, EH) with the following two properties. 1. For every edge (v, u) EH: deg H(v) + deg H(u) β. 2. For every edge (v, u) / EH: deg H(v)+deg H(u) β .

It has been shown in (Bernstein and Stein 2015) and (Bernstein and Stein 2016) that for any graph G, and any parameters β > β , an EDCS of G exists. Also it is easy to see that an EDCS of G is degree-bounded and has a maximum degree of β. An interesting property of EDCS is that for a large enough β and β , it always preserves 2/3 approximation of maximum matching in G. Specifically, we have the following. Theorem 25 ((Assadi and Bernstein 2019)). Let G = (V, E) be any graph, ϵ < 1/2, λ ϵ/32, β 8λ 2 log(1/λ), β (1 λ)β, and Let H be an EDCS(G, β, β ). Then µ(H) (2/3 ϵ)µ(G). A result by (Assadi and Bernstein 2019) shows that for any stochastic graph G where each edge is realized with a probability of pe, an EDCS(G, β, β 1) also preserves a 2/3 ϵ approximation of the expected maximum matching. We show a similar result for the generalized stochastic matching problem where both edges and vertices are stochastic. Specifically, let Q be an EDCS(G, β, β 1) for β C log(1/(ϵ pv pe))

ϵ2pvpe where C is a large constant. Also, Let Q be the realized portion of Q. In the following lemma we show that using the EDCS approach, Q achieves a 2/3 O(ϵ) matching approximation ratio in expectation. The proof of Lemma 26 can be seen in the final section of the Appendix. Lemma 26. E[µ(Q)] (2/3 O(ϵ))E[µ(G)] .

Conclusion We have now proven bounds on the approximation ratio of Algorithm 1 breaking a half-approximation. The natural next step is to improve the ratio up to (1 ϵ). For the old model of stochastic matching, (Behnezhad and Derakhshan 2020) achieved a (1 ϵ) approximation for weighted stochastic matching using 1 complemented by a greedy sub-algorithm. Unfortunately, differences in the models bar us from adapting their new algorithm directly. Previous analysis techniques relied on the complete independence of edge realization, and it seems to us to be nontrivial to overcome this difference.

Acknowledgements This research was supported by the NSF BIGDATA Grant No. 1546108, NSF SPX Grant No. 1822738, NSF AF Grant No. 2114269, and an Amazon AWS award.

References Assadi, S.; and Bernstein, A. 2019. Towards a Unified Theory of Sparsification for Matching Problems. In Fineman, J. T.; and Mitzenmacher, M., eds., 2nd Symposium on

Simplicity in Algorithms, SOSA@SODA 2019, January 8-9, 2019 - San Diego, CA, USA, volume 69 of OASICS, 11:1 11:20. Schloss Dagstuhl - Leibniz-Zentrum f ur Informatik. Assadi, S.; Khanna, S.; and Li, Y. 2016. The Stochastic Matching Problem with (Very) Few Queries. In Conitzer, V.; Bergemann, D.; and Chen, Y., eds., Proceedings of the 2016 ACM Conference on Economics and Computation, EC 16, Maastricht, The Netherlands, July 24-28, 2016, 43 60. ACM. Assadi, S.; Khanna, S.; and Li, Y. 2017. The Stochastic Matching Problem: Beating Half with a Non-Adaptive Algorithm. In Daskalakis, C.; Babaioff, M.; and Moulin, H., eds., Proceedings of the 2017 ACM Conference on Economics and Computation, EC 17, Cambridge, MA, USA, June 26-30, 2017, 99 116. ACM. Behnezhad, S.; and Derakhshan, M. 2020. Stochastic Weighted Matching: $(1-ϵ)$ Approximation. Co RR, abs/2004.08703. Behnezhad, S.; Derakhshan, M.; Farhadi, A.; Hajiaghayi, M.; and Reyhani, N. 2019a. Stochastic Matching on Uniformly Sparse Graphs. In Fotakis, D.; and Markakis, E., eds., Algorithmic Game Theory - 12th International Symposium, SAGT 2019, Athens, Greece, September 30 - October 3, 2019, Proceedings, volume 11801 of Lecture Notes in Computer Science, 357 373. Springer. Behnezhad, S.; Derakhshan, M.; and Hajiaghayi, M. 2020. Stochastic matching with few queries: (1-ϵ) approximation. In Makarychev, K.; Makarychev, Y.; Tulsiani, M.; Kamath, G.; and Chuzhoy, J., eds., Proccedings of the 52nd Annual ACM SIGACT Symposium on Theory of Computing, STOC 2020, Chicago, IL, USA, June 22-26, 2020, 1111 1124. ACM. Behnezhad, S.; Farhadi, A.; Hajiaghayi, M.; and Reyhani, N. 2019b. Stochastic Matching with Few Queries: New Algorithms and Tools. In Chan, T. M., ed., Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2019, San Diego, California, USA, January 6-9, 2019, 2855 2874. SIAM. Behnezhad, S.; and Reyhani, N. 2018. Almost Optimal Stochastic Weighted Matching with Few Queries. In Tardos, E.; Elkind, E.; and Vohra, R., eds., Proceedings of the 2018 ACM Conference on Economics and Computation, Ithaca, NY, USA, June 18-22, 2018, 235 249. ACM. Bernstein, A.; and Stein, C. 2015. Fully Dynamic Matching in Bipartite Graphs. In Halld orsson, M. M.; Iwama, K.; Kobayashi, N.; and Speckmann, B., eds., Automata, Languages, and Programming - 42nd International Colloquium, ICALP 2015, Kyoto, Japan, July 6-10, 2015, Proceedings, Part I, volume 9134 of Lecture Notes in Computer Science, 167 179. Springer. Bernstein, A.; and Stein, C. 2016. Faster Fully Dynamic Matchings with Small Approximation Ratios. In Krauthgamer, R., ed., Proceedings of the Twenty-Seventh Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2016, Arlington, VA, USA, January 10-12, 2016, 692 711. SIAM.

Blum, A.; Dickerson, J. P.; Haghtalab, N.; Procaccia, A. D.; Sandholm, T.; and Sharma, A. 2015. Ignorance is Almost Bliss: Near-Optimal Stochastic Matching With Few Queries. In Roughgarden, T.; Feldman, M.; and Schwarz, M., eds., Proceedings of the Sixteenth ACM Conference on Economics and Computation, EC 15, Portland, OR, USA, June 15-19, 2015, 325 342. ACM. Yamaguchi, Y.; and Maehara, T. 2018. Stochastic Packing Integer Programs with Few Queries. In Czumaj, A., ed., Proceedings of the Twenty-Ninth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2018, New Orleans, LA, USA, January 7-10, 2018, 293 310. SIAM.