# Differentially Private Link Prediction with Protected Connections

Abir De, Soumen Chakrabarti
Indian Institute of Technology Bombay
{abir, soumen}@cse.iitb.ac.in

**Abstract.** Link prediction (LP) algorithms propose to each node a ranked list of nodes that are currently non-neighbors, as the most likely candidates for future linkage. Owing to increasing concerns about privacy, users (nodes) may prefer to keep some of their connections protected or private. Motivated by this observation, our goal is to design a differentially private LP algorithm, which trades off between privacy of the protected node-pairs and link prediction accuracy. More specifically, we first propose a form of differential privacy on graphs, which models the privacy loss only of those node-pairs which are marked as protected. Next, we develop DPLP, a learning to rank algorithm, which applies a monotone transform to base scores from a non-private LP system, and then adds noise. DPLP is trained with a privacy-induced ranking loss, which optimizes the ranking utility for a given maximum allowed level of privacy leakage of the protected node-pairs. Under a recently-introduced latent node embedding model, we present a formal trade-off between privacy and LP utility. Extensive experiments with several real-life graphs and several LP heuristics show that DPLP can trade off between privacy and predictive performance more effectively than several alternatives.

## 1 Introduction

Link prediction (LP) (Liben-Nowell and Kleinberg 2007; Lü and Zhou 2011) is the task of predicting future edges that are likely to emerge between nodes in a social network, given historical views of it. Predicting new research collaborators, new Facebook friends and new LinkedIn connections are examples of LP tasks. LP algorithms usually present to each node u a ranking of the most promising non-neighbors v of u at the time of prediction.

Owing to the growing concerns about privacy (Tucker 2014; Machanavajjhala, Korolova, and Sarma 2011; Waniek et al. 2019), social media users often prefer to mark their sensitive contacts, e.g., dating partners, prospective employers, doctors, etc., as protected from other users and the Internet at large. However, the social media hosting platform itself continues to enjoy full access to the network, and may exploit these connections for link prediction. In fact, many popular LP algorithms often leak neighbor information (Waniek et al. 2019), at least in a probabilistic sense, because they take advantage of rampant triad completion: when node u links to v, very often there exists a node w such that edges (u, w) and (w, v) already exist. Therefore, recommending v to u allows u to learn about the edge (w, v), which may result in a breach of privacy of v and w.

Motivated by these observations, a recent line of work (Machanavajjhala, Korolova, and Sarma 2011; Xu et al. 2018) proposes privacy preserving LP methods. While they introduce sufficient noise to ensure a specified level of privacy, they do not directly optimize for the LP ranking utility. Consequently, they show poor predictive performance.

### 1.1 Present Work

In this work, we design a learning to rank algorithm which aims to optimize the LP utility under a given amount of privacy leakage of the node-pairs (edges or non-edges) which are marked as protected.
Specifically, we make the following contributions.

**Modeling differential privacy of protected node pairs.** Differential privacy (DP) (Dwork and Roth 2014) is concerned with the privacy loss of a single entry in a database upon (noisy) datum disclosure. As emphasized by Dwork and Roth (2014, Section 2.3.3), the meaning of a single entry varies across applications, especially in graph databases. DP algorithms over graphs predominantly consider two notions of granularity, namely node and edge DP, where one entry is represented by a node and an edge, respectively. While node DP provides a stronger privacy guarantee than edge DP, it is often stronger than what is needed for applications. As a result, it can result in unacceptably low utility (Song et al. 2018; Dwork and Roth 2014). To address this challenge, we adopt a variant of differential privacy, called protected-pair differential privacy, which accounts for the loss of privacy of only those node-pairs (edges or non-edges) which are marked as private. Such a privacy model is well suited for LP in social networks, where users may hide only some sensitive contacts and leave the remaining relations public (Dwork and Roth 2014). It is much stronger than edge DP, but weaker than node DP. However, it serves the purpose in the context of our current problem.

**Maximizing LP accuracy subject to privacy constraints.** Having defined an appropriate granularity of privacy, we develop DPLP, a perturbation wrapper around a base LP algorithm, that renders it private in the sense of node-pair protection described above. Unlike DP disclosure of quantities, the numeric scores from the base LP algorithm are not significant in themselves, but induce a ranking. This demands a fresh approach to designing DPLP, which works in two steps. First, it transforms the base LP scores using a monotone deep network (Wehenkel and Louppe 2019), which ensures that the ranks remain unaffected in the absence of privacy. Noise is then added to these transformed scores, and the noised scores are used to sample a privacy-protecting ranking over candidate neighbors. DPLP is trained using a ranking loss (AUC) surrogate. Effectively, DPLP obtains a sampling distribution that optimizes ranking utility, while maintaining privacy for protected pairs. Extensive experiments show that DPLP outperforms state-of-the-art baselines (Machanavajjhala, Korolova, and Sarma 2011; McSherry and Talwar 2007; Geng and Viswanath 2015) in terms of predictive accuracy, under identical privacy constraints (code: https://www.cse.iitb.ac.in/~abir/codes/dplp.zip).

**Bounding ranking utility subject to privacy.** Finally, we theoretically analyze the ranking utility provided by three LP algorithms (Liben-Nowell and Kleinberg 2007), viz., Adamic-Adar, Common Neighbors and Jaccard Coefficient, in the presence of privacy constraints. Even without privacy constraints, these and other LP algorithms are heuristic in nature and can produce imperfect rankings. Therefore, unlike Machanavajjhala, Korolova, and Sarma (2011), we model utility not with reference to a non-DP LP algorithm, but with reference to a generative process that creates the graph. Specifically, we consider the generative model proposed by Sarkar, Chakrabarti, and Moore (2011), which establishes the predictive power of popular (non-DP) LP heuristics. They model latent node embeddings in a geometric space (Hoff, Raftery, and Handcock 2002) as causative forces that drive edge creation, but without any consideration for privacy.
In contrast, we bound the additional loss in ranking utility when privacy constraints are enforced; our results therefore change the setting and require new techniques.

### 1.2 Related Work

The quest for the correct privacy protection granularity has driven much work in the DP community. Kearns et al. (2015) propose a model in which a fraction of nodes are completely protected, whereas the others do not get any privacy assurance. This fits certain applications, e.g., epidemic or terrorist tracing, but not LP, where a node/user may choose to conceal only a few sensitive contacts (e.g., doctor or partner) and leave others public. Edge/non-edge protection constraints can be encoded in the formalism of group (Dwork 2011), Pufferfish (Song, Wang, and Chaudhuri 2017; Kifer and Machanavajjhala 2014; Song and Chaudhuri 2017) and Blowfish (He, Machanavajjhala, and Ding 2014) privacy. However, these do not focus on LP tasks on graphs. Chierichetti et al. (2015) partition a graph into private and public subgraphs, but for designing efficient sketching and sampling algorithms, not directly related to differentially private LP objectives. In particular, they do not use any information from private graphs; in contrast, a differentially private algorithm does use private information after adding noise to it. Moreover, their algorithms do not include any learning to rank method for link prediction. Abadi et al. (2016) design a DP gradient descent algorithm for deep learning, which, however, does not aim to optimize the associated utility. Moreover, they work with a pointwise loss function, whereas our objective is a pairwise loss. Other work (Geng and Viswanath 2015; Geng et al. 2018; Ghosh, Roughgarden, and Sundararajan 2012) aims to find optimal noise distributions for queries seeking real values. However, they predominantly use variants of noise power in their objectives, whereas we use a ranking loss designed specifically for LP tasks.

## 2 Preliminaries

In this section, we first set up the necessary notation and give a brief overview of a generic non-private LP algorithm on a social network. Then we introduce the notion of protected and non-protected node pairs.

### 2.1 Social Network and LP Algorithm

We consider a snapshot of the social network as an undirected graph $G = (V, E)$ with vertices $V$ and edges $E$. Neighbors and non-neighbors of node $u$ are denoted as the sets $\mathrm{nbr}(u)$ and $\overline{\mathrm{nbr}}(u)$, respectively. Given this snapshot, a non-private LP algorithm $A$ first implements a scoring function $s_A(u, \cdot) : \overline{\mathrm{nbr}}(u) \to \mathbb{R}_+$ and then sorts the scores $s_A(u, v)$ of all the non-neighbors $v \in \overline{\mathrm{nbr}}(u)$ in decreasing order to obtain a ranking of the potential neighbors for each query node $u$. Thus, the LP algorithm $A$ provides a map
$$\pi^A_u : \{1, 2, \ldots, |\overline{\mathrm{nbr}}(u)|\} \xrightarrow{1:1} \overline{\mathrm{nbr}}(u).$$
Here, $\pi^A_u(i)$ represents the node at rank $i$, recommended to node $u$ by algorithm $A$. For brevity, we sometimes denote $u^A_i = \pi^A_u(i)$. Note that, although $\pi^A_u$ can provide a ranking of all candidate nodes, most LP applications consider only the top $K$ candidates for some small value of $K$. When clear from context, we will use $\pi^A_u$ to denote the top-$K$ items.
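To make the notation concrete, the following minimal sketch (our illustration, not the authors' code) implements the scoring-and-ranking interface of a base LP algorithm $A$, using Adamic-Adar as an example scorer; all function and variable names are ours.

```python
# Minimal sketch of a non-private base LP algorithm A: score non-neighbors of u
# and rank them in decreasing score order. Names are illustrative, not from the paper.
import math
from collections import defaultdict

def neighbors(edges):
    nbr = defaultdict(set)
    for a, b in edges:
        nbr[a].add(b)
        nbr[b].add(a)
    return nbr

def adamic_adar(nbr, u, v):
    # s_A(u, v) = sum over common neighbors w of 1 / log(deg(w))
    return sum(1.0 / math.log(len(nbr[w])) for w in nbr[u] & nbr[v] if len(nbr[w]) > 1)

def rank_non_neighbors(nodes, edges, u, score_fn, K=5):
    nbr = neighbors(edges)
    candidates = [v for v in nodes if v != u and v not in nbr[u]]      # nbr-bar(u)
    ranked = sorted(candidates, key=lambda v: score_fn(nbr, u, v), reverse=True)
    return ranked[:K]                                                   # pi^A_u truncated at K

# Example usage on a toy graph.
nodes = range(6)
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 4), (3, 4)]
print(rank_non_neighbors(nodes, edges, u=0, score_fn=adamic_adar, K=3))
```

Any other heuristic scorer (Common Neighbors, Jaccard Coefficient) can be plugged in as `score_fn` without changing the ranking interface.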
### 2.2 Protected and Non-protected Node-pairs

We assume that each node $u \in V$ partitions all other nodes $V \setminus \{u\}$ into a protected node set $\mathrm{prot}(u)$ and a non-protected/public node set $\mathrm{pub}(u)$. We call $(u, w)$ a protected node-pair if either $w$ marks $u$ as protected, i.e., $u \in \mathrm{prot}(w)$, or vice-versa, i.e., $w \in \mathrm{prot}(u)$.

In general, any node $u$ can access three sets of information: (i) its own connections with other nodes, i.e., $\mathrm{nbr}(u)$ and $\overline{\mathrm{nbr}}(u)$; (ii) the possible connections between any node $w$ (not necessarily a neighbor) and the nodes $\mathrm{pub}(w)$ not protected by $w$; and (iii) the ranked list of nodes recommended to itself by any LP algorithm. Note that a node-pair $(u, w)$ is accessible to $u$ even if $u \in \mathrm{prot}(w)$, since $u$ knows its own edges (and non-edges). On the other hand, if a third node $v \in V \setminus \{u, w\}$ is marked as protected by $w$, i.e., $v \in \mathrm{prot}(w)$, then $u$ should not know whether $v \in \mathrm{nbr}(w)$ or $v \in \overline{\mathrm{nbr}}(w)$. However, when a non-private LP algorithm recommends nodes to $u$, $u$ can exploit the recommendations to reverse-engineer such protected links. For example, if $w$ is connected to both $u$ and $v$, then a triad-completion based LP heuristic (Adamic-Adar or Common Neighbors) is likely to recommend $v$ to the node $u$, which in turn allows $u$ to guess the presence of the edge $(v, w)$. Figure 1 illustrates our problem setup.

[Figure 1: Problem setup. (a) Nodes in prot(w) colored red; nodes in pub(w) colored green. (b) u cannot see red edges (solid), red non-edges (dotted), or recommendations made to other nodes ($\pi^A_{u'}$, $u' \neq u$). u can see green edges (solid), green non-edges (dotted), and the recommendation ($\pi^A_u$) made to itself. (c) The output of $\widehat{A}$ should be insensitive to protected pairs.]

## 3 Privacy Model for Protected Pairs

In this section, we first present our privacy model, which assesses the privacy loss of protected node-pairs given the predictions made by an LP algorithm. Then we introduce the sensitivity function, specifically designed for graphs with protected pairs.

### 3.1 Protected-pair Differential Privacy

In order to prevent privacy leakage of the protected connections, we aim to design a privacy preserving LP algorithm $\widehat{A}$ based on $A$, meaning that $\widehat{A}$ uses the scores $s_A(u, \cdot)$ to present to each node $u$ a ranking $\pi^{\widehat{A}}_u$, while maintaining the privacy of all the protected node pairs in the graph, where one of the two nodes in such a pair is not $u$ itself. We expect that $u$ has strong reasons to link to the top nodes in $\pi^{\widehat{A}}_u$, and that the same list will occur again with nearly equal probability if one changes (adds or deletes) edges between any other node $w$ and the protected nodes of $w$ other than $u$, i.e., $\mathrm{prot}(w) \setminus \{u\}$. We formally define such changes in the graph in the following.

**Definition 1 (u-neighboring graphs).** Given two graphs $G = (V, E)$ and $G' = (V, E')$ with the same node set $V$, and a node $u \in V$, we call $G, G'$ u-neighboring graphs if there exists some node $w \neq u$ (not necessarily a neighbor), such that
1. the protected nodes $\mathrm{prot}(w)$ are the same in $G$ and $G'$,
2. the public nodes $\mathrm{pub}(w)$ are the same in $G$ and $G'$, and
3. $E' \setminus \{(w, v) \mid v \in \mathrm{prot}(w) \setminus u\} = E \setminus \{(w, v) \mid v \in \mathrm{prot}(w) \setminus u\}$.
The set of all u-neighboring graph pairs is denoted as $\mathcal{N}_u$.

In words, the last requirement is that, for the purpose of recommendation to $u$, the information not protected by $w$ is the same in $G$ and $G'$. u-neighboring graphs may differ only in the protected pairs incident to some other node $w$, except $(w, u)$.
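The construction in Definition 1 can be illustrated with a short sketch (assumed data structures, not from the paper): starting from $G$, we obtain a u-neighboring graph $G'$ by toggling an edge between some $w \neq u$ and a node in $\mathrm{prot}(w) \setminus \{u\}$.

```python
# Illustrative sketch (names are ours): given per-node protected sets, build a
# u-neighboring graph G' from G by toggling an edge between some node w != u
# and a node in prot(w) \ {u}. Per Definition 1, such changes must be invisible
# (up to a factor e^eps_p) in the ranking distribution shown to u.
def toggle(edges, pair):
    e = {frozenset(p) for p in edges}
    fs = frozenset(pair)
    return [tuple(x) for x in (e - {fs} if fs in e else e | {fs})]

def u_neighboring_graph(edges, u, w, v, prot):
    assert w != u and v != u and v in prot[w], "only protected pairs of w (excluding u) may change"
    return toggle(edges, (w, v))

# Example: node 2 protects node 4; toggling (2, 4) yields a 0-neighboring graph.
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 4)]
prot = {0: set(), 1: set(), 2: {4}, 3: set(), 4: {2}}
edges_prime = u_neighboring_graph(edges, u=0, w=2, v=4, prot=prot)
print(sorted(map(sorted, edges_prime)))
```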
Next, we propose our privacy model, which characterizes the privacy leakage of protected node-pairs in the context of link prediction.

**Definition 2 (Protected-pair differential privacy).** We are given a non-private base LP algorithm $A$, and a randomized LP routine $\widehat{A}$ based on $A$, which returns a ranking of $K$ nodes. We say that $\widehat{A}$ ensures $\epsilon_p$-protected-pair differential privacy if, for all nodes $u \in V$ and for all rankings $R_K$ on the candidate nodes, truncated after position $K$, we have
$$\Pr(\pi^{\widehat{A}}_u = R_K \mid A, G) \le e^{\epsilon_p} \Pr(\pi^{\widehat{A}}_u = R_K \mid A, G'),$$
whenever $G$ and $G'$ are u-neighboring graphs.

In words, $\widehat{A}$ promises to recommend the same list to $u$, with nearly the same probability, even if we arbitrarily change the protected partners of any node $w$, except for $u$ itself, as that information is already known to $u$.

### 3.2 Graph Sensitivity

Recall that $A$ recommends $v$ to $u$ based on a score $s_A(u, v)$. A key component of our strategy is to apply a (monotone) transform $f(s_A(u, v))$ to balance it more effectively against samples from a noise generator. To that end, given base LP algorithm $A$, we define the sensitivity of any function on the scores $f(s_A(u, \cdot)) : V \to \mathbb{R}_+$, which will subsequently be used to design any protected-pair differentially private LP algorithm.

**Definition 3 (Sensitivity of $f \circ s_A$).** The sensitivity $\Delta_{f,A}$ of a transformation $f$ on the scoring function $s_A$ is defined as
$$\Delta_{f,A} = \max_u \; \max_{(G, G') \in \mathcal{N}_u} \; \max_v \; \left| f(s_A(u, v \mid G)) - f(s_A(u, v \mid G')) \right| \tag{1}$$
Recall that $\mathcal{N}_u$ represents all u-neighboring graph pairs.

## 4 Design of Privacy Protocols for LP

In this section, we first state our problem of designing a privacy preserving LP algorithm $\widehat{A}$, based on a non-private LP algorithm $A$, which aims to provide an optimal trade-off between privacy leakage of the protected node pairs and LP accuracy. Next, we develop DPLP, our proposed algorithm which solves this problem.

### 4.1 Problem Statement

Privacy can always be trivially protected by ignoring private data and sampling each node uniformly, but such an LP algorithm has no predictive value. Hence, it is important to design a privacy preserving LP algorithm which can also provide reasonable LP accuracy. To this end, given a base LP algorithm $A$, our goal is to design a protected-pair differentially private algorithm $\widehat{A}$ based on $A$, which maximizes the link prediction accuracy. $\widehat{A}$ uses random noise, and induces a distribution over the rankings $\pi^{\widehat{A}}_u$ presented to $u$. Specifically, our goal is to solve the following optimization problem:
$$\text{maximize}_{\widehat{A}} \;\; \sum_{u \in V} \mathbb{E}_{R_K \sim \Pr(\pi^{\widehat{A}}_u \mid A, G)} \mathrm{AUC}(R_K) \tag{2}$$
$$\text{s.t. } \widehat{A} \text{ is } K\epsilon_p\text{-protected-pair differentially private.}$$
Here, $\mathrm{AUC}(R_K)$ measures the area under the ROC curve given by $R_K$, the ranked list of recommended nodes truncated at $K$. Hence, the above optimization problem maximizes the expected AUC over the rankings generated by the randomized protocol $\widehat{A}$, based on the scores produced by $A$. The constraint indicates that predicting a single link using $\widehat{A}$ requires satisfying $\epsilon_p$-protected-pair differential privacy.

### 4.2 DPLP: A Protected-pair Differentially Private Link Predictor

A key question is whether well-known generic privacy protocols like Laplace or exponential noising are adequate solvers of the optimization problem in Eq. (2), or whether a solution tailored to protected-pair differential privacy in LP might do better. We answer in the affirmative by presenting DPLP, a novel DP link predictor, with five salient components.
1. Monotone transformation of the original scores from $A$.
2. Design of a ranking distribution $\pi^{\widehat{A}}_u$, which suitably introduces noise to the original scores to make a sampled ranking insensitive to protected node pairs.
3. Re-parameterization of the ranking distribution to facilitate effective training.
4. Design of a tractable ranking objective that approximates AUC in Eq. (2).
5. Sampling current non-neighbors for recommendation.
**Monotone transformation of base scores.** We equip $\widehat{A}$ with a trainable monotone transformation $f : \mathbb{R}_+ \to \mathbb{R}_+$, which is applied to the base scores $s_A(u, \cdot)$. The transform $f$ helps us more effectively balance the signal from $A$ with the privacy-assuring noise distribution we use. We view this transformation from two perspectives. First, $f(s_A(u, \cdot))$ can be seen as a new scoring function which aims to boost the ranking utility of the original scores $s_A(u, \cdot)$ under a given privacy constraint. In the absence of privacy constraints, we can reasonably demand that the ranking quality of $f \circ s_A$ be at least as good as that of $s_A$. In other words, we expect that the transformed and the original scores provide the same ranking utility, i.e., the same list of recommended nodes, in the absence of privacy constraints. A monotone $f$ ensures such a property. Second, as we shall see, $f$ can also be seen as a probability distribution transformer. Given a sufficiently expressive neural representation of $f$ (as designed in Section 4.3), any arbitrary distribution on the original scores can be fitted with a tractable known distribution on the transformed scores. This helps us build a high-capacity but tractable ranking distribution $\Pr(\pi^{\widehat{A}}_u \mid A, G)$, as described below.

**Design of a ranking distribution.** Having transformed the scores $s_A(u, \cdot)$ into $f(s_A(u, v))$, we draw nodes $v$ using an iterative exponential mechanism on the transformed scores $f(s_A(u, v))$. Specifically, we draw the first node $v$ with probability proportional to $\exp\left(\epsilon_p f(s_A(u, v)) / 2\Delta_{f,A}\right)$, then repeat until we have collected $K$ nodes in all. Thus, our ranking distribution in Eq. (2) is:
$$\Pr(\pi^{\widehat{A}}_u = R_K \mid A, G) = \prod_{j \in [K]} \frac{\exp\left(\epsilon_p f(s_A(u, R_K(j))) / 2\Delta_{f,A}\right)}{\sum_{w \notin R_K(1, \ldots, j-1)} \exp\left(\epsilon_p f(s_A(u, w)) / 2\Delta_{f,A}\right)} \tag{3}$$
In the absence of privacy concerns ($\epsilon_p \to \infty$), the candidate node with the highest score $f(s_A(u, \cdot))$ gets selected in each step, and the algorithm reduces to the original non-private LP algorithm $A$, thanks to the monotonicity of $f$. On the other hand, if $\epsilon_p \to 0$, then every node has an equal chance of getting selected, which preserves privacy but has low predictive utility. Indeed, our ranking distribution in Eq. (3) follows an exponential mechanism on the new scores $f(s_A(u, \cdot))$. However, in principle, such a mechanism can capture any arbitrary ranking distribution $\Pr(\pi^{\widehat{A}}_u \mid A, G)$, given sufficient training and expressiveness of $f$. Finally, we note that $f$ can be designed to limit $\Delta_{f,A}$. E.g., if we can ensure that $f(\cdot)$ is positive and bounded by $B$, then we have $\Delta_{f,A} \le B$; if we can ensure that the derivative $f'(\cdot)$ is bounded by $B'$, then, by the Lipschitz property, $\Delta_{f,A} \le B' \Delta_A$.

**Re-parameterization of the ranking distribution.** In practice, the above-described procedure for sampling a top-K ranking results in high estimation variance during training (Zhao et al. 2011; Marbach and Tsitsiklis 2003; Peters and Schaal 2006; Sehnke et al. 2010). To overcome this problem, we make a slight modification to the softmax method. We first add i.i.d. Gumbel noise $\eta_{u,v} \sim \mathrm{GUM}(2\Delta_{f,A}/\epsilon_p)$ to each score $f(s_A(u, v))$ and then take the top $K$ nodes with the highest noisy scores (Durfee and Rogers 2019, Lemma 4.2). Such a distribution allows an easy re-parameterization trick, $\mathrm{GUM}(2\Delta_{f,A}/\epsilon_p) = (2\Delta_{f,A}/\epsilon_p)\,\mathrm{GUM}(1)$, which reduces the parameter variance. With such a re-parameterization of the softmax ranking distribution in Eq. (3), we get the following alternative representation of our objective in Eq. (2):
$$\sum_{u \in V} \mathbb{E}_{\{\eta_{u,\cdot}\} \sim \mathrm{GUM}(1)}\, \mathrm{AUC}\!\left( R_K\!\left( \left\{ f(s_A(u, \cdot)) + 2\Delta_{f,A}\,\eta_{u,\cdot}/\epsilon_p \right\} \right) \right),$$
where $R_K(\cdot)$ gives a top-K ranking based on the noisy transformed scores $\{ f(s_A(u, \cdot)) + 2\Delta_{f,A}\,\eta_{u,\cdot}/\epsilon_p \}$ over the candidate nodes for recommendation to $u$.
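The re-parameterized sampler can be summarized in a few lines of code; the sketch below is our own illustration (names such as `transform`, `scores` and `sensitivity` are assumptions), and relies on the fact that adding scaled Gumbel noise and keeping the top $K$ reproduces the iterative exponential mechanism of Eq. (3).

```python
# Sketch of the re-parameterized sampler: add Gumbel(2*Delta_{f,A}/eps_p) noise
# to each transformed score and keep the K highest noisy scores.
import math
import random

def gumbel(scale):
    # Inverse-CDF sampling of a Gumbel(0, scale) variate.
    u = random.random()
    return -scale * math.log(-math.log(u if u > 0 else 1e-12))

def dplp_sample_top_k(scores, transform, eps_p, sensitivity, K):
    """scores: dict {candidate v: s_A(u, v)}; transform: monotone f; returns a top-K list."""
    scale = 2.0 * sensitivity / eps_p                      # 2 * Delta_{f,A} / eps_p
    noisy = {v: transform(s) + gumbel(scale) for v, s in scores.items()}
    return sorted(noisy, key=noisy.get, reverse=True)[:K]

# Example: with large eps_p the ranking matches the base scores; with small
# eps_p it approaches a uniform-at-random ranking.
base_scores = {"v1": 3.0, "v2": 2.0, "v3": 0.5}
print(dplp_sample_top_k(base_scores, transform=lambda s: s, eps_p=100.0, sensitivity=1.0, K=2))
print(dplp_sample_top_k(base_scores, transform=lambda s: s, eps_p=0.01, sensitivity=1.0, K=2))
```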
**Designing a tractable ranking objective.** Despite the aforementioned re-parameterization trick, standard optimization tools face several challenges in maximizing the above objective. First, optimizing AUC is an NP-hard problem. Second, truncating the list to $K$ nodes often pushes many neighbors out of the list at the initial stages of learning, which can lead to poor training (Liu 2009). To overcome these limitations, we replace AUC (Gao and Zhou 2015) with a surrogate pairwise hinge loss $\ell(\cdot)$ and consider a non-truncated list $R_{|V|-1}$ of all possible candidate nodes during training (Gao and Zhou 2015; Joachims 2005). Hence our optimization problem becomes:
$$\min_f \; \sum_{u \in V} \mathbb{E}_{\{\eta_{u,\cdot}\} \sim \mathrm{GUM}(1)}\, \ell\!\left(\{f(s_A(u, \cdot)), \eta_{u,\cdot}\} ; G\right) \tag{4}$$
where $\ell(\{f(s_A(u, \cdot)), \eta_{u,\cdot}\} ; G)$ is given by a pairwise ranking loss surrogate over the noisy transformed scores:
$$\sum_{\substack{g \in \mathrm{nbr}(u) \cap \mathrm{pub}(u) \\ b \in \overline{\mathrm{nbr}}(u) \cap \mathrm{pub}(u)}} \mathrm{ReLU}\!\left[ \varrho + f(s_A(u, b)) + \frac{2\Delta_{f,A}\,\eta_{u,b}}{\epsilon_p} - f(s_A(u, g)) - \frac{2\Delta_{f,A}\,\eta_{u,g}}{\epsilon_p} \right] \tag{5}$$
The above loss function encourages the (noisy) transformed score of a neighbor $g$ to exceed the corresponding score of a non-neighbor $b$ by at least a (tunable) margin $\varrho$. In the absence of privacy concerns, i.e., $\epsilon_p \to \infty$, it is simply $\mathrm{ReLU}[\varrho + f(s_A(u, b)) - f(s_A(u, g))]$, which would provide the same ranking as $A$, due to the monotonicity of $f$. We would like to point out that the objective in Eq. (5) uses only non-protected pairs $\{(u, v) \mid v \in \mathrm{pub}(u)\}$ to train $f$. This ensures no privacy is leaked during training.

**Sampling nodes for recommendation.** Once we train $f$ by solving the optimization problem in Eq. (4), we draw the top-K candidate nodes using the trained $f$. Specifically, we first sample a non-neighbor $v$ with probability proportional to $\exp\left(\epsilon_p f(s_A(u, v)) / 2\Delta_{f,A}\right)$, then repeat the same process on the remaining non-neighbors $\overline{\mathrm{nbr}}(u) \setminus \{v\}$, and continue for $K$ steps.

**Algorithm 1** DPLP: learns $\widehat{A}$ based on a non-private LP algorithm $A$ and then uses it to recommend $K$ nodes to $u$.
1: **Input**: $G = (V, E)$; protected and non-protected node-sets $\mathrm{prot}(\cdot)$, $\mathrm{pub}(\cdot)$; scores $s_A(\cdot, \cdot)$ given by $A$; privacy leakage $\epsilon_p$; margin $\varrho$.
2: **Output**: $R_K$, the list of top-K nodes recommended to node $u$.
3: Initialize $R_K(1 \ldots K)$.
4: $f \leftarrow \mathrm{TRAIN}(G, \{s_A(w, v) \mid v \in \mathrm{pub}(w), w \in V\}, \epsilon_p, \varrho)$
5: $\text{candidates} \leftarrow \overline{\mathrm{nbr}}(u)$
6: **for** $j$ in $1 \ldots K$ **do**
7: $\quad w \leftarrow \mathrm{SOFTMAX}\left(\{\epsilon_p f(s_A(u, v)) / 2\Delta_{f,A} \mid v \in \text{candidates}\}\right)$
8: $\quad \text{candidates} \leftarrow \text{candidates} \setminus \{w\}$
9: $\quad R_K(j) \leftarrow w$
10: **Return** $R_K$

**Overall algorithm.** We summarize the overall algorithm in Algorithm 1, where the TRAIN(·) routine solves the optimization problem in Eq. (4). Since there is no privacy leakage during training, the entire process from training to test ensures $K\epsilon_p$-protected-pair differential privacy, which is formalized in the following proposition and proven in (De and Chakrabarti 2020).

**Proposition 4.** Algorithm 1 implements $K\epsilon_p$-protected-pair differentially private LP.

**Incompatibility with node and edge DP.** Note that our proposal is incompatible with node or edge privacy (where every node or edge is protected). In those cases, the objective in Eq. (5) cannot access any data in the training graph. However, our proposal is useful in many applications like online social networks, in which a user usually protects a fraction of his sensitive connections, while leaving others public.
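Before describing the structure of $f$, here is a minimal sketch of the pairwise hinge loss of Eq. (5) over noisy transformed scores, restricted to public pairs. It is our own illustration with assumed names; a real TRAIN routine would backpropagate this loss through a parametric $f$.

```python
# Sketch of the surrogate loss in Eq. (5): for query node u, every public
# neighbor g should out-score every public non-neighbor b by a margin rho,
# after adding the scaled Gumbel noise. Only non-protected pairs are used,
# so training leaks no protected information. Names are illustrative.
import math
import random

def gumbel(scale):
    u = random.random()
    return -scale * math.log(-math.log(u if u > 0 else 1e-12))

def pairwise_hinge_loss(f, s, pub_neighbors, pub_non_neighbors, eps_p, sensitivity, rho=0.1):
    """f: monotone transform; s: dict {v: s_A(u, v)} for candidates v in pub(u)."""
    scale = 2.0 * sensitivity / eps_p
    noisy = {v: f(s[v]) + gumbel(scale) for v in s}
    loss = 0.0
    for g in pub_neighbors:              # g in nbr(u), public
        for b in pub_non_neighbors:      # b in nbr-bar(u), public
            loss += max(0.0, rho + noisy[b] - noisy[g])   # ReLU hinge
    return loss

# Example with the identity transform; an autodiff framework would supply gradients.
s = {"g1": 2.0, "g2": 1.5, "b1": 0.4, "b2": 0.9}
print(pairwise_hinge_loss(lambda x: x, s, ["g1", "g2"], ["b1", "b2"], eps_p=5.0, sensitivity=1.0))
```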
### 4.3 Structure of the Monotonic Transformation

We approximate the required nonlinear function $f$ with $f_\theta$, implemented as a monotonic neural network with parameters $\theta$. Initial experiments suggested that raising the raw score $s_A(u, v)$ to a power $a > 0$ provided a flexible trade-off between privacy and utility over diverse graphs. We provide a basis set $\{a_i > 0\}$ of fixed powers and obtain a positive linear combination over raw scores $s_A(u, \cdot)$ raised to these powers. More specifically, we compute:
$$\nu_\beta(s_A(u, v)) = \sum_{i=1}^{n_a} e^{\tau \beta_i} \left( s_A(u, v) \right)^{a_i}, \tag{6}$$
where $(a_i)_{i \in [n_a]} > 0$ are fixed a priori, $\beta = (\beta_i)_{i \in [n_a]}$ are trainable parameters, and $\tau$ is a temperature hyperparameter. This gave more stable gradient updates than letting $a$ float as a variable. While using $\nu_\beta(s_A(u, v))$ as $f(s_A(u, v))$ is already an improvement upon the raw score $s_A(u, v)$, we propose a further refinement: we feed $\nu_\beta(s_A(u, v))$ to an Unconstrained Monotone Neural Network (UMNN) (Wehenkel and Louppe 2019) to finally obtain $f_\theta(s_A(u, v))$. UMNN is parameterized using the integral of a positive nonlinear function:
$$f_\theta(s_A(u, v)) = \int_0^{\nu_\beta(s_A(u, v))} g_\phi(s)\, ds + b_0, \tag{7}$$
where $g_\phi(\cdot)$ is a positive neural network, parameterized by $\phi$. UMNN offers an unconstrained, highly expressive parameterization of monotonic functions, since monotonicity is achieved by only enforcing a positive integrand $g_\phi(\cdot)$. In theory, with sufficient training and capacity, a universal monotone transformation may absorb $\nu_\beta(\cdot)$ into its input stage, but, in practice, we found that omitting $\nu_\beta(\cdot)$ and passing $s_A(u, v)$ directly to UMNN gave poorer results (De and Chakrabarti 2020). Therefore, we explicitly provide $\nu_\beta(s_A(u, v))$ as an input to it, instead of the score $s_A(u, v)$ itself. Overall, we have $\theta = \{\beta, \phi, b_0\}$ as the trainable parameters.
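A simplified sketch of Eqs. (6)-(7) follows; it replaces the UMNN quadrature of Wehenkel and Louppe (2019) with a crude fixed-grid midpoint rule, so it is a stand-in for, not a reproduction of, the authors' architecture. Layer sizes and the power basis are illustrative choices.

```python
# Simplified sketch of the monotone transform: a power-basis feature nu_beta
# (Eq. 6) followed by a monotone map obtained by integrating a positive
# network g_phi on a fixed grid (Eq. 7, approximated by a midpoint rule).
import torch
import torch.nn as nn

class MonotoneTransform(nn.Module):
    def __init__(self, powers=(0.5, 1.0, 2.0), tau=1.0, n_quad=64, hidden=32):
        super().__init__()
        self.register_buffer("powers", torch.tensor(powers))   # fixed a_i > 0
        self.beta = nn.Parameter(torch.zeros(len(powers)))     # trainable beta_i
        self.b0 = nn.Parameter(torch.zeros(1))
        self.tau, self.n_quad = tau, n_quad
        self.g = nn.Sequential(nn.Linear(1, hidden), nn.Tanh(),
                               nn.Linear(hidden, 1), nn.Softplus())  # positive integrand

    def nu(self, s):                                            # Eq. (6)
        return ((self.tau * self.beta).exp() * s.unsqueeze(-1) ** self.powers).sum(-1)

    def forward(self, s):                                       # Eq. (7), midpoint rule
        upper = self.nu(s)                                      # integration upper limit
        t = (torch.arange(self.n_quad, device=s.device) + 0.5) / self.n_quad
        grid = upper.unsqueeze(-1) * t                          # points in [0, nu(s)]
        integrand = self.g(grid.reshape(-1, 1)).reshape(*grid.shape)
        return (upper / self.n_quad) * integrand.sum(-1) + self.b0

# Monotonicity check on toy scores: a larger base score yields a larger transformed score.
f = MonotoneTransform()
print(f(torch.tensor([0.1, 0.5, 2.0])))
```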
## 5 Privacy-accuracy Trade-off of DPLP

It is desirable that a privacy preserving LP algorithm should be able to hide the protected node-pairs as well as give high predictive accuracy. However, in principle, LP accuracy is likely to deteriorate under a stronger privacy constraint (lower $\epsilon_p$). For example, a uniform ranking distribution, i.e., $\epsilon_p \to 0$, provides extreme privacy, but it would lead to poor prediction. On the other hand, in the absence of privacy, i.e., when $\epsilon_p \to \infty$, the algorithm enjoys unrestricted use of the protected pairs, which is likely to boost its accuracy. In this section, we analyze this trade-off from two perspectives.
1. Relative trade-off analysis. Here, we assess the relative loss of prediction quality of $\widehat{A}$ with respect to the observed scores of the base LP algorithm $A$.
2. Absolute trade-off analysis. Here, we assess the loss of absolute prediction quality of $\widehat{A}$ with respect to the true utility provided by a latent graph generative process.

### 5.1 Relative Trade-off Analysis

Like other privacy preserving algorithms, DPLP also introduces some randomness to the base LP scores, and therefore, any DPLP algorithm $\widehat{A}$ may reduce the predictive quality of $A$. To formally investigate this loss, we quantify the loss in scores suffered by $\widehat{A}$, compared to $A$:
$$\gamma_u(\widehat{A}, \epsilon_p) = \mathbb{E}_{\widehat{A}}\left[ \sum_{i \in [K]} s_A(u, u^{A}_i \mid G) - \sum_{i \in [K]} s_A(u, u^{\widehat{A}}_i \mid G) \right].$$
Recall that $u^A_i = \pi^A_u(i)$ and $u^{\widehat{A}}_i = \pi^{\widehat{A}}_u(i)$. We do not take the absolute difference because the first term cannot be smaller than the second. The expectation is over the randomness introduced by the algorithm $\widehat{A}$. The following proposition bounds $\gamma_u(\widehat{A}, \epsilon_p)$ in two extreme cases (proven in (De and Chakrabarti 2020)).

**Proposition 5.** If $\epsilon_p$ is the privacy parameter and $\kappa_{u,A} := \max_{i \in [|V|-1]} \{ s_A(u, u^A_i) - s_A(u, u^A_{i+1}) \}$, then we have:
$$\lim_{\epsilon_p \to \infty} \gamma_u(\widehat{A}, \epsilon_p) = 0 \quad \text{and} \quad \lim_{\epsilon_p \to 0} \gamma_u(\widehat{A}, \epsilon_p) \le K \kappa_{u,A}.$$
Here, $\kappa_{u,A}$ gives the maximum difference between the scores of two consecutive nodes in the ranking provided by $A$ to $u$.

### 5.2 Absolute Trade-off Analysis

Usually, $A$ provides only imperfect predictions. Hence, the above trade-off analysis, which probes into the relative cost of privacy with respect to $A$, may not always reflect the real effect of privacy on the predictive quality. To fill this gap, we need to analyze the absolute prediction utility in terms of the true latent generative process, which is harder to analyze. Even for a non-private LP algorithm $A$, absolute quality is rarely analyzed, except by Sarkar, Chakrabarti, and Moore (2011), which we discuss below.

**Latent space network model.** We consider the latent space model proposed by Sarkar, Chakrabarti, and Moore (2011), which showed why some popular LP methods succeed. In their model, the nodes lie within a D-dimensional hypersphere having unit volume. Each node $u$ has a coordinate $x_u$ within the sphere, drawn uniformly at random. Given a parameter $r > 0$, the latent generative rule is that nodes $u$ and $v$ get connected if the distance $d_{uv} = \|x_u - x_v\|_2 < r$.

**Relating ranking loss to absolute prediction error.** If an oracle link predictor could know the underlying generative process, then its ranking $\pi^*_u$ would have sorted the nodes in increasing order of the distances $d_{u\cdot}$. An imperfect non-private LP algorithm $A$ does not have access to these distances and therefore, it would suffer from some prediction error. $\widehat{A}$, based on $A$, will incur an additional prediction error due to its privacy preserving randomization. We quantify this absolute loss by extending our loss function in Eq. (4): $\ell(\pi^{\widehat{A}}_u ; \pi^*_u) := \sum_i \ldots$
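For intuition, the latent space model can be simulated in a few lines; the sketch below (our illustration, with assumed parameter values) places nodes uniformly in a D-dimensional ball of unit volume and connects pairs closer than $r$.

```python
# Sketch of the latent space model of Sarkar, Chakrabarti, and Moore (2011):
# draw node coordinates uniformly inside a D-dimensional ball of unit volume
# and connect u, v whenever ||x_u - x_v||_2 < r. Purely illustrative.
import math
import numpy as np

def unit_volume_radius(D):
    # Radius R such that the D-ball has volume 1: V = pi^(D/2) R^D / Gamma(D/2 + 1).
    return (math.gamma(D / 2 + 1) / math.pi ** (D / 2)) ** (1.0 / D)

def sample_latent_graph(n, D, r, seed=0):
    rng = np.random.default_rng(seed)
    R = unit_volume_radius(D)
    xs = []
    while len(xs) < n:                       # rejection-sample points uniformly in the ball
        p = rng.uniform(-R, R, size=D)
        if np.linalg.norm(p) <= R:
            xs.append(p)
    xs = np.stack(xs)
    dist = np.linalg.norm(xs[:, None, :] - xs[None, :, :], axis=-1)
    edges = [(u, v) for u in range(n) for v in range(u + 1, n) if dist[u, v] < r]
    return xs, edges

coords, edges = sample_latent_graph(n=50, D=2, r=0.15)
print(len(edges), "edges generated")
```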