# Unsupervised Adversarially Robust Representation Learning on Graphs

Jiarong Xu¹, Yang Yang¹, Junru Chen¹, Xin Jiang³, Chunping Wang², Jiangang Lu¹, Yizhou Sun³
¹Zhejiang University, ²FinVolution Group, ³University of California, Los Angeles
{xujr, yangya, jrchen_cali, lujg}@zju.edu.cn, jiangxjames@ucla.edu, wangchunping02@xinye.com, yzsun@cs.ucla.edu

Unsupervised/self-supervised pre-training methods for graph representation learning have recently attracted increasing research interest, and they generalize to various downstream applications. Yet, the adversarial robustness of such pre-trained graph learning models remains largely unexplored. More importantly, most existing defense techniques for end-to-end graph representation learning methods require pre-specified label definitions and thus cannot be directly applied to pre-training methods. In this paper, we propose an unsupervised defense technique to robustify pre-trained deep graph models, so that perturbations on the input graph can be successfully identified and blocked before the model is applied to different downstream tasks. Specifically, we introduce a mutual information-based measure, graph representation vulnerability (GRV), to quantify the robustness of graph encoders on the representation space. We then formulate an optimization problem to learn the graph representation by carefully balancing the trade-off between the expressive power and the robustness (i.e., GRV) of the graph encoder. The discrete nature of graph topology and the joint space of graph data make the optimization problem intractable to solve. To handle the above difficulty and to reduce computational expense, we further relax the problem and thus provide an approximate solution. Additionally, we explore a provable connection between the robustness of the unsupervised graph encoder and that of models on downstream tasks. Extensive experiments demonstrate that even without access to labels and tasks, our model is still able to enhance robustness against adversarial attacks on three downstream tasks (i.e., node classification, link prediction, and community detection) by an average of +16.5% compared with existing methods.

Figure 1: Overview of a graph pre-training pipeline under adversarial attacks. The input $S = (A, X)$ is mapped by the graph encoder from the input space to the representation space $\mathcal{Z}$, which feeds downstream tasks (node classification, link prediction, community detection) in the label space $\mathcal{Y}$. If the graph encoder is vulnerable to the attacks, the adversarial risk propagates to every downstream task via the perturbed graph representation.

## 1 Introduction

Graphs, a common mathematical abstraction for modeling pairwise interactions between objects, are widely applied in numerous domains, including bioinformatics, social networks, and finance. Owing to their prevalence, deep learning on graphs, such as graph neural networks (GNNs) (Kipf et al. 2017; Hamilton et al. 2017), has recently undergone rapid development and made major progress in various analytical tasks, including node classification (Kipf et al. 2017; Hamilton et al. 2017), link prediction (Kipf et al. 2016), and graph classification (Xu et al. 2019b). However, most deep learning models on graphs are trained with task-specific labels in an end-to-end manner for a particular task.
This motivates some recent efforts to pre-train an expressive graph encoder on unlabeled data and then feed the learned representations to (supervised or unsupervised) off-the-shelf machine learning models for relevant downstream tasks (Hu et al. 2019, 2020; Qiu et al. 2020). Pre-training models on graphs enable the learned representations to be directly applicable to different applications, with only a simple and inexpensive machine learning model attached after the encoded representations.

Despite the promising results achieved by deep learning models on graphs, recent studies have shown that these models are vulnerable to adversarial attacks (Dai et al. 2018; Zügner et al. 2019a; Bojchevski et al. 2019a; Xu et al. 2020b; Zheng et al. 2021). In other words, even imperceptible perturbations on graph topology and node attributes can significantly affect the learned graph representation, thereby degrading the performance of downstream tasks (Chen et al. 2020; Xu et al. 2020c). This so-called adversarial vulnerability has raised tremendous concerns regarding the use of deep learning models on graphs, especially in security-critical applications such as drug discovery (Gilmer et al. 2017) and financial surveillance (Yang et al. 2019, 2021). However, the adversarial vulnerability of pre-training models on graphs has so far been largely overlooked. In this work, we show that graph pre-training models also suffer from the adversarial vulnerability problem. In fact, owing to its complicated and deep structure, the graph encoder is more vulnerable to adversarial attacks than the simple machine learning models used for downstream tasks in a graph pre-training pipeline (Tanay et al. 2016). As Figure 1 shows, once the graph encoder is vulnerable to adversarial attacks, the adversarial risk propagates to every task via the perturbed representations.

Most efforts targeting this adversarial vulnerability problem focus on supervised, end-to-end models designed for a particular application scenario (Zügner et al. 2019b; Bojchevski et al. 2019b; Wang et al. 2019; Jin et al. 2020; Wang et al. 2021; Xu et al. 2021). However, the dependency on supervised information largely limits the scope of their application and usefulness. For example, these models do not perform well on downstream tasks in which training labels are missing, e.g., community detection in social networks. In addition, training multiple models for different downstream tasks is both costly and insecure (Feurer et al. 2015). In contrast, robust unsupervised pre-training models can easily handle the above issues because adversarial attacks are identified and blocked before propagating to downstream tasks. Moreover, these models are applicable to a more diverse group of applications, including node classification, link prediction, and community detection. Unfortunately, robust pre-training models under the unsupervised setting still remain largely unexplored.

There are many interesting yet challenging questions in this new field of research. Conventionally, the robustness of a model is defined based on the label space (Xu et al. 2020a; Zügner et al. 2019b; Bojchevski et al. 2019b), which is not the case in our setting. Thus the first difficulty we meet is to quantify the robustness of an unsupervised model (without the knowledge of the true or predicted labels).
To overcome the above challenge, in this paper we first introduce the graph representation vulnerability (GRV), an information-theoretic measure used to quantify the robustness of a graph encoder. We then formulate an optimization problem to study the trade-off between the expressive power of a graph encoder and its robustness to adversarial attacks, measured by GRV. However, how to efficiently compute or approximate the objective of the optimization problem becomes the next issue. First, it is difficult to describe the ability of the attack strategies or the boundary of perturbations, because adversarial attacks on graphs perturb both the discrete graph topology and the continuous node attributes. Second, the rigorous definition of the objective is intractable. To handle the above issues, we first quantify the ability of adversarial attacks using the Wasserstein distance between probability distributions and provide a computationally efficient approximation for it. We then adopt a variant of the projected gradient descent method to solve the proposed optimization problem efficiently. A sub-optimal solution of the problem gives us a well-qualified and robust graph encoder. Last but not least, we explore several interesting theoretical connections between the proposed measure of robustness (GRV) and classifier robustness defined on the label space. To show the practical usefulness of our model, we apply the learned representations to three different downstream tasks. Experimental results reveal that under adversarial attacks, our model beats the best baseline by an average of +1.8%, +1.8%, and +45.8% on the node classification, link prediction, and community detection tasks, respectively.

## 2 Preliminaries and Notations

In most cases, we use upper-case letters (e.g., $X$ and $Y$) to denote random variables and calligraphic letters (e.g., $\mathcal{X}$ and $\mathcal{Y}$) to denote their support, while the corresponding lower-case letters (e.g., $\boldsymbol{x}$ and $\boldsymbol{y}$) indicate the realizations of these variables. We denote the probability distributions of the random variables using subscripts (e.g., $\mu_X$ and $\mu_Y$) and the corresponding empirical distributions with hat accents (e.g., $\hat{\mu}_X$ and $\hat{\mu}_Y$). We use bold upper-case letters to represent matrices (e.g., $\mathbf{A}$). When indexing the matrices, $\mathbf{A}_{ij}$ denotes the element at the $i$-th row and the $j$-th column, while $\mathbf{A}_i$ represents the vector at the $i$-th row. Let $(\mathcal{X}, d)$ denote a metric space, where $d: \mathcal{X} \times \mathcal{X} \to \mathbb{R}$ is a distance function on $\mathcal{X}$. The set of all probability measures on $\mathcal{X}$ is $\mathcal{M}(\mathcal{X})$.

We assume a generic unsupervised graph representation learning setup. In brief, we are provided with an undirected and unweighted graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ with node set $\mathcal{V} = \{v_1, v_2, \ldots, v_{|\mathcal{V}|}\}$ and edge set $\mathcal{E} = \{e_1, e_2, \ldots, e_{|\mathcal{E}|}\} \subseteq \mathcal{V} \times \mathcal{V}$. We are also provided with the adjacency matrix $\mathbf{A} \in \{0,1\}^{|\mathcal{V}| \times |\mathcal{V}|}$ of the graph $\mathcal{G}$, a symmetric matrix with elements $\mathbf{A}_{ij} = 1$ if $(v_i, v_j) \in \mathcal{E}$ or $i = j$, and $\mathbf{A}_{ij} = 0$ otherwise. We augment $\mathcal{G}$ with the node attribute matrix $\mathbf{X} \in \mathbb{R}^{|\mathcal{V}| \times c}$ if nodes have attributes. Accordingly, we define our input as $\boldsymbol{s} = (\boldsymbol{a}, \boldsymbol{x}) \in \mathcal{S}$: under a transductive learning setting, $\boldsymbol{x}$ is the attribute matrix and $\boldsymbol{a}$ is the adjacency matrix of $\mathcal{G}$, while under an inductive learning setting, $\boldsymbol{a}$ and $\boldsymbol{x}$ are the adjacency matrix and attribute matrix, respectively, of a node's subgraph. We define an encoder $e: \mathcal{S} \to \mathcal{Z}$, which maps an input $\boldsymbol{s} = (\boldsymbol{a}, \boldsymbol{x}) \in \mathcal{S}$ to a representation $e(\boldsymbol{a}, \boldsymbol{x}) \in \mathcal{Z}$, and a simple machine learning model $f: \mathcal{Z} \to \mathcal{Y}$ that maps a representation $\boldsymbol{z} \in \mathcal{Z}$ to a label $f(\boldsymbol{z}) \in \mathcal{Y}$.
We go on to define $g = f \circ e$ as their composition, such that $(f \circ e)(\boldsymbol{a}, \boldsymbol{x}) = f(e(\boldsymbol{a}, \boldsymbol{x}))$. The mutual information between two random variables $X$ and $Y$ is denoted by $I(X; Y)$. Specifically, it is defined as the Kullback-Leibler divergence between the joint distribution $p(\boldsymbol{x}, \boldsymbol{y})$ and the product of the marginal distributions $p(\boldsymbol{x}) p(\boldsymbol{y})$.

Admissible perturbations on graphs. The Wasserstein distance can be conceptualized as an optimal transport problem: we wish to transport the mass with probability distribution $\mu_S$ into another distribution $\mu_{S'}$ at the minimum cost. Formally, the $p$-th Wasserstein distance between $\mu_S$ and $\mu_{S'}$ is
$$W_p(\mu_S, \mu_{S'}) = \left( \inf_{\pi \in \Pi(\mu_S, \mu_{S'})} \int_{\mathcal{S}^2} d(\boldsymbol{s}, \boldsymbol{s}')^p \, \mathrm{d}\pi(\boldsymbol{s}, \boldsymbol{s}') \right)^{1/p},$$
where $\Pi(\mu_S, \mu_{S'})$ denotes the collection of all measures on $\mathcal{S} \times \mathcal{S}$ with marginals $\mu_S$ and $\mu_{S'}$, respectively. The choice of the $\infty$-Wasserstein distance (i.e., $p = \infty$) is conventional in learning graph representations (Champion et al. 2008). Based on the $\infty$-Wasserstein distance, we can quantify the ability of the adversarial attacks. An attack strategy is modeled as a probability distribution close to that of $S = (A, X)$, and all possible attack strategies stay in a ball around the genuine distribution $\mu_S$, with a pre-defined budget $\tau > 0$:
$$\mathcal{B}_\infty(\mu_S, \tau) = \{\mu_{S'} \in \mathcal{M}(\mathcal{S}) : W_\infty(\mu_S, \mu_{S'}) \le \tau\}.$$
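To make the admissible perturbation set concrete, the sketch below checks whether a perturbed sample stays within a per-sample budget; it anticipates the empirical restriction introduced in §3.3 (an $L_0$ budget on topology changes and an $L_\infty$ budget on attribute changes). The function name and the NumPy-based formulation are illustrative and not part of the original method.

```python
import numpy as np

def within_budget(A, X, A_pert, X_pert, delta, eps):
    """Check whether a perturbed sample (A_pert, X_pert) stays inside an empirical
    perturbation set: at most `delta` modified adjacency entries (L0 budget on
    topology) and attribute changes bounded by `eps` in L-infinity norm.
    A hypothetical helper for illustration only."""
    topology_changes = np.sum(A != A_pert)  # number of flipped adjacency entries
    attribute_change = 0.0 if X is None else np.max(np.abs(X - X_pert))
    return topology_changes <= delta and attribute_change <= eps
```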
## 3 Graph Representations Robust to Adversarial Attacks

In a widely adopted two-phase graph learning pipeline, the first step is to pre-train a graph encoder $e$ (without the knowledge of any labels), which maps the joint input space $\mathcal{S}$ (i.e., the graph topology $\mathbf{A}$ and node attributes $\mathbf{X}$) into some, usually lower-dimensional, representation space $\mathcal{Z}$. The encoded representation is then used to solve some target tasks. In this section, we explain how to obtain a well-qualified graph representation that is robust to adversarial attacks. We first propose a measure that quantifies robustness without label information in §3.1. In §3.2, we formulate an optimization problem to explore the trade-off between the expressive power and the robustness of the graph encoder. We then describe every component of the proposed optimization problem and explain how we obtain a sub-optimal solution efficiently in §3.3.

### 3.1 Quantifying the Robustness of Graph Representations

In this section, we propose the graph representation vulnerability (GRV) to quantify the robustness of an encoded graph representation. Intuitively, the learned graph representation is robust if its quality does not deteriorate too much under adversarial attacks. We now introduce in detail how to measure the quality of representations using MI, and how to describe the difference in representation quality before and after adversarial attacks.

The use of mutual information. A fundamental challenge in achieving a qualified graph representation is the need to find a suitable objective that guides the learning process of the graph encoder. In the case of unsupervised graph representation learning, the commonly used objectives are random walk-based (Perozzi et al. 2014; Grover et al. 2016) or reconstruction-based (Kipf et al. 2016). These objectives impose an inductive bias that neighboring nodes have similar representations. However, this inductive bias is easily broken under adversarial attacks (Jin et al. 2020; Entezari et al. 2020), because the connections among local neighborhoods are exactly what such attacks tend to destroy. As an alternative, we turn to maximizing the MI between the input attributed graph and the representation output by the encoder, i.e., $I(S; e(S))$, from a more global view. In our case, maximizing $I(S; e(S))$ encourages the representations to be maximally informative about the input graph and avoids the above-mentioned inductive bias.

Graph representation vulnerability. In addition to measuring the quality of a graph representation, we also need to describe its robustness. Intuitively, an encoder is called robust if the value of the objective stays relatively stable after tiny perturbations of the input. Therefore, the encoder is robust if the MI before and after an attack are close to each other. Thus, we propose the graph representation vulnerability (GRV) to quantify this difference:
$$\mathrm{GRV}_\tau(e) = I(S; e(S)) - \inf_{\mu_{S'} \in \mathcal{B}_\infty(\mu_S, \tau)} I(S'; e(S')), \tag{1}$$
where $S = (A, X)$ is the random variable following the benign data distribution, and $S' = (A', X')$ follows the adversarial distribution. The first term $I(S; e(S))$ in (1) is the MI between the benign graph data and the encoded representation, while the term $I(S'; e(S'))$ uses the graph data after attack. The attack strategy $\mu_{S'}$ that results in the minimum MI is called the worst-case attack, defined as $\mu_{S'}^* = \operatorname*{argmin}_{\mu_{S'} \in \mathcal{B}_\infty(\mu_S, \tau)} I(S'; e(S'))$. Hence, by definition, the graph representation vulnerability (GRV) describes the difference in the encoder's behavior on benign data and under the worst-case adversarial attack. A lower value of $\mathrm{GRV}_\tau(e)$ implies an encoder that is more robust to adversarial attacks. Formally, an encoder is called $(\tau, \gamma)$-robust if $\mathrm{GRV}_\tau(e) \le \gamma$.

An analogue of the graph representation vulnerability (GRV) has been studied in the image domain (Zhu et al. 2020). However, the extension of (Zhu et al. 2020) to the graph domain requires nontrivial effort. An image is considered to be a single continuous space, while a graph is a joint space $\mathcal{S} = (\mathcal{A}, \mathcal{X})$, consisting of a discrete graph-structure space $\mathcal{A}$ and a continuous feature space $\mathcal{X}$. Moreover, the perturbation on the joint space $(\mathcal{A}, \mathcal{X})$ is difficult to track because a minor change in the graph topology or node attributes will propagate to other parts of the graph via edges. This is different from the image domain, where the distributions of all pixels are assumed to be i.i.d. Therefore, the discrete nature of graph topology and the joint space $(\mathcal{A}, \mathcal{X})$ make the worst-case adversarial attack extremely difficult to estimate. Thus, the optimization method we apply is substantially different from that in (Zhu et al. 2020); see §3.2 and §3.3. Furthermore, a more involved analysis is needed to verify our approach in theory; see §4 for details.
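The GRV in (1) can be estimated empirically by plugging in an MI estimator and an attack routine that searches within the perturbation budget. The sketch below shows this composition; `mi_estimate` and `worst_case_attack` are hypothetical placeholders standing in for the estimator of Eq. (5) and the projected gradient descent topology attack described in §3.3.

```python
def graph_repr_vulnerability(encoder, A, X, mi_estimate, worst_case_attack, delta, eps):
    """Empirical GRV: the MI estimate on the benign graph minus the MI estimate
    under an (approximate) worst-case attack found inside the perturbation budget.
    `mi_estimate` and `worst_case_attack` are placeholders for the estimator and
    attack routine described in Section 3.3."""
    mi_clean = mi_estimate(encoder, A, X)
    A_adv, X_adv = worst_case_attack(encoder, A, X, delta, eps)  # approximately minimizes the MI estimate
    mi_adv = mi_estimate(encoder, A_adv, X_adv)
    return mi_clean - mi_adv
```

A lower returned value indicates a more robust encoder, mirroring the definition above.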
### 3.2 Optimization Problem

The trade-off between model robustness and the expressive power of an encoder has been well studied (Tsipras et al. 2019; Zhang et al. 2019). In our case, this trade-off can be readily explored by the following optimization problem:
$$\text{maximize}\quad \ell_1(\Theta) = I(S; e(S)) - \beta\, \mathrm{GRV}_\tau(e), \tag{2}$$
where the optimization variable is the learnable parameters $\Theta$ of the encoder $e$, and $\beta > 0$ is a pre-defined parameter. However, in practice, the most robust encoder is usually not the desired one, as it sacrifices too much of the encoder's expressive power. An intuitive example of the most robust encoder is the constant map, which always outputs the same representation whatever the input is. Hence, a sufficiently robust encoder is adequate, or even preferable. To this end, we add a soft margin $\gamma$ to GRV and obtain the following optimization problem:
$$\text{maximize}\quad \ell_2(\Theta) = I(S; e(S)) - \beta \max\{\mathrm{GRV}_\tau(e), \gamma\}. \tag{3}$$
The second term depends on the encoder only when $\mathrm{GRV}_\tau(e) > \gamma$ and is constant otherwise. As a result, when the encoder is sufficiently robust, the second term in $\ell_2$ makes no contribution, and Problem (3) reduces to standard MI maximization on benign data. Furthermore, when $\beta = 1$, Problem (3) can be divided into two simple sub-problems, depending on the value of $\mathrm{GRV}_\tau(e)$:
$$\begin{cases} \max_\Theta\ \inf_{\mu_{S'} \in \mathcal{B}_\infty(\mu_S, \tau)} I(S'; e(S')), & \text{if } \mathrm{GRV}_\tau(e) > \gamma, \\ \max_\Theta\ I(S; e(S)), & \text{otherwise.} \end{cases} \tag{4}$$
In this case ($\beta = 1$), when $\mathrm{GRV}_\tau(e) > \gamma$, the problem maximizes the MI under the worst-case adversarial attack. In other words, the robust encoder tries to maintain the mutual dependence between the graph data and the encoded representation under all kinds of adversarial attacks. When the encoder is sufficiently robust (i.e., $\mathrm{GRV}_\tau(e) \le \gamma$), the problem turns to maximizing the encoder's expressive power.

GNN as the parameterized encoder. GNNs have been extensively used as expressive functions for parameterizing the graph encoder (Kipf et al. 2016, 2017). In this paper, we adopt a one-layer GNN: $e(\mathbf{A}, \mathbf{X}) = \sigma(\hat{\mathbf{D}}^{-1/2} \hat{\mathbf{A}} \hat{\mathbf{D}}^{-1/2} \mathbf{X} \Theta)$, where $\hat{\mathbf{A}}$ is the adjacency matrix with self-loops, $\hat{\mathbf{D}}$ is the corresponding degree matrix, $\sigma$ is the ReLU function, and $\Theta$ contains the learnable parameters.

### 3.3 Approximate Solution

Although we formulate robust graph learning as the optimization problem (4), this problem is still difficult to solve for several reasons. First of all, the mutual information $I(S; e(S))$ is extremely hard to compute, mainly because $S = (A, X)$ is a joint random variable involving a high-dimensional discrete variable $A$. In addition, the search space of the adversarial attacks, $\mathcal{B}_\infty(\mu_S, \tau)$, is intractable to quantify: there is no conventional or well-behaved choice for the distance metric $d$ in such a complicated joint space, and even given the metric, the distance between two random variables is difficult to calculate. Apart from the above challenges, the classical, well-known projected gradient descent algorithm does not work in the joint space $S = (A, X)$, so there is no obvious way to find the worst-case adversarial attack $\mu_{S'}^*$. We address these issues in detail below.

MI estimation. Directly computing $I(S; e(S))$ in Problem (4) is intractable, especially for a joint distribution $S = (A, X)$ that includes a high-dimensional discrete random variable $A$. Some authors propose to maximize the average MI between a high-level global representation and local input regions, and show significant improvement in the quality of representations (Hjelm et al. 2019; Veličković et al. 2019). Inspired by the recent work Deep Graph Infomax (Veličković et al. 2019), we use a noise-contrastive objective as an approximation of $I(S; e(S))$:
$$\ell_{\mathrm{enc}}(S, e) = \mathbb{E}_S[\log \mathcal{D}(\boldsymbol{z}, \boldsymbol{z}_\mathcal{G})] + \mathbb{E}_{\tilde{S}}[\log(1 - \mathcal{D}(\tilde{\boldsymbol{z}}, \boldsymbol{z}_\mathcal{G}))], \tag{5}$$
where $\boldsymbol{z}$ denotes the local/node representation obtained from a GNN encoder, $\boldsymbol{z}_\mathcal{G} = \mathrm{sigmoid}(\mathbb{E}_S(\boldsymbol{z}))$ denotes the global/graph representation, $\tilde{S}$ is the random variable of negative examples, and $\tilde{\boldsymbol{z}}$ is the realization of $e(\tilde{S})$. The critic function $\mathcal{D}(\boldsymbol{z}, \boldsymbol{z}_\mathcal{G})$ represents the score assigned to a pair of local and global representations obtained from the natural samples (i.e., the original graph), while $\mathcal{D}(\tilde{\boldsymbol{z}}, \boldsymbol{z}_\mathcal{G})$ is the score for negative samples. We adopt $\mathcal{D}_\Phi(\boldsymbol{z}, \boldsymbol{z}_\mathcal{G}) = \mathrm{sigmoid}(\boldsymbol{z}^T \Phi \boldsymbol{z}_\mathcal{G})$, where $\Phi$ is a learnable scoring matrix. Finally, in practice, the expectation over an underlying distribution is approximated by the mean over $n$ independent samples $\{(\boldsymbol{a}_i, \boldsymbol{x}_i)\}_{i \in [n]}$.
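A minimal PyTorch sketch of the pieces just described: the one-layer GNN encoder, the bilinear critic, and the noise-contrastive surrogate for $I(S; e(S))$ in Eq. (5). Class and function names are illustrative; the readout follows the sigmoid-of-mean summary stated above, and negatives are assumed to come from a corrupted graph supplied by the caller (e.g., row-shuffled attributes).

```python
import torch
import torch.nn as nn

class OneLayerGNNEncoder(nn.Module):
    """e(A, X) = ReLU(D^{-1/2} (A + I) D^{-1/2} X Theta); assumes A has a zero diagonal."""
    def __init__(self, in_dim, hid_dim):
        super().__init__()
        self.theta = nn.Linear(in_dim, hid_dim, bias=False)

    def forward(self, A, X):
        A_hat = A + torch.eye(A.size(0), device=A.device)       # add self-loops
        d_inv_sqrt = A_hat.sum(dim=1).pow(-0.5)
        A_norm = d_inv_sqrt.unsqueeze(1) * A_hat * d_inv_sqrt.unsqueeze(0)
        return torch.relu(A_norm @ self.theta(X))                # node representations z

class BilinearCritic(nn.Module):
    """D_Phi(z, z_G) = sigmoid(z^T Phi z_G), scoring (local, global) pairs."""
    def __init__(self, hid_dim):
        super().__init__()
        self.phi = nn.Parameter(torch.empty(hid_dim, hid_dim))
        nn.init.xavier_uniform_(self.phi)

    def forward(self, z, z_g):
        return torch.sigmoid(z @ self.phi @ z_g)

def contrastive_mi_objective(encoder, critic, A, X, X_neg, eps=1e-8):
    """Noise-contrastive surrogate for I(S; e(S)) in the spirit of Eq. (5):
    positives come from the original graph, negatives from corrupted attributes X_neg."""
    z_pos = encoder(A, X)
    z_neg = encoder(A, X_neg)
    z_g = torch.sigmoid(z_pos.mean(dim=0))                       # global summary z_G
    pos_term = torch.log(critic(z_pos, z_g) + eps).mean()
    neg_term = torch.log(1.0 - critic(z_neg, z_g) + eps).mean()
    return pos_term + neg_term
```

Maximizing this objective on benign data corresponds to the second branch of Problem (4); the first branch evaluates the same objective on adversarially perturbed inputs.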
Adversarial distribution estimation. Besides the estimation of MI, another challenge in solving Problem (4) is how to find the worst-case adversarial distribution $\mu_{S'} \in \mathcal{B}_\infty(\mu_S, \tau)$. Here, we elaborate on the three difficulties in finding $\mu_{S'}^*$ and explain in detail how we solve them one by one.

First, it is difficult to choose an appropriate metric $d$ on the joint space $\mathcal{S} = (\mathcal{A}, \mathcal{X})$ that faithfully measures the distance between each pair of elements. An intuitive choice for the distance between any pair $\boldsymbol{s}_1 = (\boldsymbol{a}_1, \boldsymbol{x}_1)$ and $\boldsymbol{s}_2 = (\boldsymbol{a}_2, \boldsymbol{x}_2)$ in the joint metric space formed by $(\mathcal{A}, d_\mathcal{A})$ and $(\mathcal{X}, d_\mathcal{X})$ would be the $L_p$-norm $\|(d_\mathcal{A}(\boldsymbol{a}_1, \boldsymbol{a}_2),\, d_\mathcal{X}(\boldsymbol{x}_1, \boldsymbol{x}_2))\|_p$. However, this intuition fails in our case because changes in the graph topology and in the node attributes are not of the same order of magnitude. We therefore consider the perturbations on $\mathcal{A}$ and $\mathcal{X}$ separately. With a slight abuse of notation, we redefine the perturbation bound as
$$\mathcal{B}_\infty(\mu_A, \mu_X, \delta, \epsilon) = \{(\mu_{A'}, \mu_{X'}) \in \mathcal{M}(\mathcal{A}) \times \mathcal{M}(\mathcal{X}) \mid W_\infty(\mu_A, \mu_{A'}) \le \delta,\ W_\infty(\mu_X, \mu_{X'}) \le \epsilon\},$$
where the small positive numbers $\delta$ and $\epsilon$ now play the role of perturbation budgets. This is indeed a subset of the previous search space $\mathcal{B}_\infty(\mu_S, \tau)$.

Moreover, although the search space has been restricted, the $\infty$-Wasserstein-constrained optimization problem remains intractable: we still have no knowledge of the underlying probability distribution. Similar to what we did to estimate MI, we replace the real data distribution with an empirical one. Suppose we have a set of i.i.d. samples $\{(\boldsymbol{a}_i, \boldsymbol{x}_i)\}_{i \in [n]}$ (note that $n = 1$ under a transductive learning setting), based on which we can compute the empirical distribution $(\hat{\mu}_A, \hat{\mu}_X)$. The empirical search space is
$$\hat{\mathcal{B}}\big(\{\boldsymbol{a}_i\}_{i=1}^n, \{\boldsymbol{x}_i\}_{i=1}^n, \delta, \epsilon\big) = \big\{(\hat{\mu}_{A'}, \hat{\mu}_{X'}) \mid \|\boldsymbol{a}_i' - \boldsymbol{a}_i\|_0 \le \delta,\ \|\boldsymbol{x}_i' - \boldsymbol{x}_i\|_\infty \le \epsilon,\ \forall i \in [n]\big\},$$
where $\hat{\mu}_{A'}$ and $\hat{\mu}_{X'}$ are the empirical distributions computed from the perturbed samples $\{(\boldsymbol{a}_i', \boldsymbol{x}_i')\}_{i \in [n]}$. Here we use the cardinality (i.e., the $L_0$-norm) to measure the change in graph topology, and the $L_\infty$-norm to measure the change in continuous node attributes (when node attributes are discrete, or even binary, we can also use the $L_0$-norm for them). Finally, we note that the empirical space $\hat{\mathcal{B}}\big(\{\boldsymbol{a}_i\}_{i=1}^n, \{\boldsymbol{x}_i\}_{i=1}^n, \delta, \epsilon\big)$ is again a subset of $\mathcal{B}_\infty(\hat{\mu}_A, \hat{\mu}_X, \delta, \epsilon)$.

The remaining question is how to efficiently find the worst-case adversarial attack. The classical choice in the image domain, the projected gradient descent (PGD) method (Madry et al. 2018), is no longer applicable in our case, as the graph topology is a Boolean random matrix. As a remedy for the discrete case, we adopt a projected gradient descent topology attack for the graph topology (Xu et al. 2019a). More specifically, we first form a convex hull of the discrete feasible set and apply the projected gradient method; a binary sub-optimal solution for the worst-case $\boldsymbol{a}'$ is then recovered using random sampling. This projected gradient descent topology attack helps us identify the worst-case adversarial example efficiently.
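Below is a hedged sketch of the attack loop just described, reusing `contrastive_mi_objective` from the previous sketch: edge flips are relaxed to a continuous perturbation variable, the MI surrogate is minimized by gradient steps, the perturbation is (crudely) projected onto the topology budget, and a binary perturbation is recovered by sampling. The projection shown here is a simple rescaling rather than the exact projection of Xu et al. (2019a), symmetry of the undirected adjacency matrix is not enforced, and attribute perturbations are omitted for brevity.

```python
import torch

def pgd_topology_attack(encoder, critic, A, X, X_neg, delta, steps=50, lr=0.1):
    """Approximate worst-case topology perturbation: minimize the contrastive MI
    surrogate over a relaxed (continuous) edge-flip variable, then sample a binary
    perturbation. Illustrative sketch, not the authors' implementation."""
    n = A.size(0)
    pert = torch.zeros(n, n, requires_grad=True, device=A.device)  # relaxed flip probabilities in [0, 1]
    flip_dir = 1.0 - 2.0 * A                                       # +1 adds a missing edge, -1 removes an existing one
    for _ in range(steps):
        A_adv = A + pert * flip_dir
        mi_surrogate = contrastive_mi_objective(encoder, critic, A_adv, X, X_neg)
        grad, = torch.autograd.grad(mi_surrogate, pert)
        with torch.no_grad():
            pert -= lr * grad                                      # gradient step toward lower MI
            pert.clamp_(0.0, 1.0)
            if pert.sum() > delta:                                 # crude projection onto the topology budget
                pert *= delta / pert.sum()
    with torch.no_grad():
        flips = torch.bernoulli(pert)                              # random sampling recovers a binary solution
        return A + flips * flip_dir
```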
## 4 Theoretical Connection to Label Space

In this section, we examine the ability of the proposed robust graph encoder to block perturbations and to benefit downstream tasks. To better understand the power of our robust model, we establish a theoretical connection between the robustness of representations (measured by our proposed GRV) and the robustness of a model built upon those representations. We take node classification as an example task; the result can be easily generalized to other classical graph learning tasks. We first introduce the concept of adversarial gap (AG) to measure the robustness of the downstream node classifier, and then explore some interesting theoretical connections between GRV and AG.

Adversarial gap. The adversarial gap (AG) is a classical measure of robustness for node classification in inductive learning. Let $\boldsymbol{a}$ and $\boldsymbol{x}$ be the adjacency matrix and the attribute matrix of an induced subgraph. Denote by $(\mathcal{S}, d)$ the input metric space and by $\mathcal{Y}$ the space of labels. For a node classifier $g: \mathcal{S} \to \mathcal{Y}$, we define the adversarial risk of $g$ with budget $\tau \ge 0$ as
$$\mathrm{AdvRisk}_\tau(g) = \mathbb{P}\big(\exists\, \boldsymbol{s}' = (\boldsymbol{a}', \boldsymbol{x}') \in \mathcal{B}(\boldsymbol{s}, \tau)\ \text{s.t.}\ g(\boldsymbol{a}', \boldsymbol{x}') \ne y\big),$$
where $\mathcal{B}(\boldsymbol{s}, \tau) = \{\boldsymbol{s}' \in \mathcal{S} \mid d(\boldsymbol{s}', \boldsymbol{s}) \le \tau\}$. The adversarial gap (AG) is then defined as
$$\mathrm{AG}_\tau(g) = \mathrm{AdvRisk}_\tau(g) - \mathrm{AdvRisk}_0(g),$$
which measures the relative vulnerability of the given model $g$. By definition, a smaller value of AG (or AdvRisk) implies a more robust node classifier $g$.
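For intuition, the adversarial risk and the adversarial gap can be estimated on a finite test set by pairing a classifier with any attack routine that searches within the budget; the sketch below is a hypothetical helper in that spirit (the `attack` argument is a placeholder, not a specific method from the paper).

```python
def empirical_adversarial_gap(classifier, samples, attack, tau):
    """Empirical AG: adversarial error under budget `tau` minus clean error (budget 0).
    `samples` is a list of ((a, x), y) pairs; `attack` returns a perturbed (a, x)
    within distance `tau` that tries to change the prediction."""
    n = len(samples)
    clean_err = sum(classifier(a, x) != y for (a, x), y in samples) / n
    adv_err = 0
    for (a, x), y in samples:
        a_adv, x_adv = attack(classifier, a, x, y, tau)
        adv_err += int(classifier(a_adv, x_adv) != y)
    return adv_err / n - clean_err
```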
Table 1 briefly summarizes the robustness measures, including AG, RV, and GRV. The traditional model robustness measure, the adversarial gap (i.e., $\mathrm{AG}_\epsilon(g)$ and $\mathrm{AG}_\tau(g)$), is based on the label space $\mathcal{Y}$, while the MI-based robustness measures (i.e., $\mathrm{RV}_\epsilon(e)$ and $\mathrm{GRV}_\tau(e)$) are built upon the representation space $\mathcal{Z}$. The prior work (Zhu et al. 2020), which defines $\mathrm{RV}_\epsilon(e)$ on a single input space $\mathcal{X}$ in the image domain, has shown that $\mathrm{RV}_\epsilon(e)$ has a clear connection with classifier robustness. In comparison, the graph representation vulnerability $\mathrm{GRV}_\tau(e)$ is defined on a joint input space $(\mathcal{A}, \mathcal{X})$ in the graph domain. Thus, the new definition is essentially different from the one in the image domain due to the coexistence of discrete and continuous input data structures.

| Robustness measure | Domain | Input space | Output space |
| --- | --- | --- | --- |
| $\mathrm{AG}_\epsilon(g)$ | Image | Single $\mathcal{X}$ | $\mathcal{Y}$ |
| $\mathrm{AG}_\tau(g)$ | Graph | Joint $(\mathcal{A}, \mathcal{X})$ | $\mathcal{Y}$ |
| $\mathrm{RV}_\epsilon(e)$ | Image | Single $\mathcal{X}$ | $\mathcal{Z}$ |
| $\mathrm{GRV}_\tau(e)$ | Graph | Joint $(\mathcal{A}, \mathcal{X})$ | $\mathcal{Z}$ |

Table 1: Summary of robustness measures. The adversarial gap (AG) is built on the label space $\mathcal{Y}$, while the representation vulnerability (RV) and graph representation vulnerability (GRV) are MI-based measures built on the representation space $\mathcal{Z}$. The subscript $\epsilon$ denotes the perturbation budget of $\boldsymbol{x}$ (i.e., the image) in the image domain, while the subscript $\tau$ denotes the perturbation budget of $(\boldsymbol{a}, \boldsymbol{x})$ in the graph domain.

In what follows, some interesting theoretical results are presented to show the inherent relationship between the graph representation vulnerability $\mathrm{GRV}_\tau(e)$ and the adversarial gap $\mathrm{AG}_\tau(g)$. We first work on two special cases under each of the following assumptions, both of which concern the statistical independence between the input random variables (i.e., $A$ or $X$) and the output label $Y$:
- Topology-aware: given $X \perp Y$, $p(Y \mid A, X) = p(Y \mid A)$;
- Attribute-aware: given $A \perp Y$, $p(Y \mid A, X) = p(Y \mid X)$.

Special cases. To obtain a tractable surrogate model, we consider a simplified GNN-based encoder architecture $\boldsymbol{z} = \boldsymbol{a}^T \boldsymbol{x} \Theta$ (Wu et al. 2019a). Thus, the representation of each node depends only on its one-hop neighbors, and the corresponding column of $\mathbf{A}$ can be used directly to compute the representation of each node. Additionally, inspired by (Miyato et al. 2017; Dai et al. 2019), where perturbations on intermediate representations are defined, we opt to define the adversarial distribution w.r.t. $\mu_{A^T X}$ instead of w.r.t. $\mu_A$ and $\mu_X$ separately. This assumption is reasonable owing to our focus on the robustness of our model rather than on the actual attack strategies. Accordingly, we assume that the set of adversarial distributions is
$$\mathcal{B}_\infty(\mu_{A^T X}, \rho) = \{\mu_{A'^T X'} \in \mathcal{M}(\mathcal{H}) : W_\infty(\mu_{A^T X}, \mu_{A'^T X'}) \le \rho\},$$
where $\mathcal{H} = \{\boldsymbol{a}^T \boldsymbol{x} : \boldsymbol{a} \in \mathcal{A}, \boldsymbol{x} \in \mathcal{X}\}$. In Theorems 4.1 and 4.2, we denote by $\boldsymbol{a} \in \{0, 1\}^{|\mathcal{V}|}$ one column of $\mathbf{A}$ and set $\boldsymbol{x} = \mathbf{X}$. The subscript $\rho$ of GRV, AdvRisk, and AG indicates that they are defined via $\mathcal{B}_\infty(\mu_{A^T X}, \rho)$, while $\mathcal{F} = \{f: \boldsymbol{z} \mapsto \boldsymbol{y}\}$ denotes the set of non-trivial downstream classifiers, $f^* = \operatorname*{argmin}_{f \in \mathcal{F}} \mathrm{AdvRisk}_\rho(f \circ e)$ is the optimal classifier built upon $e$, and $H_b$ is the binary entropy function. Moreover, when indexing $\boldsymbol{a}$ and $\boldsymbol{x}$, $\boldsymbol{a}_i$ denotes the $i$-th entry of $\boldsymbol{a}$ and $\boldsymbol{x}_i$ denotes the $i$-th row of $\boldsymbol{x}$.

Theorem 4.1 (Topology-aware). Let $(\mathcal{A}, \ell_0)$ and $(\mathcal{X}, \ell_p)$ be the input metric spaces, $\mathcal{Y} = \{-1, +1\}$ the label space, and $\mathcal{Z} = \{-1, +1\}$ the representation space. The set of encoders with $\Theta \in \mathbb{R}^{|\mathcal{V}|}$ is
$$\mathcal{E} = \{e: (\boldsymbol{a}, \boldsymbol{x}) \in \mathcal{S} \mapsto \mathrm{sgn}[\boldsymbol{a}^T \boldsymbol{x} \Theta] \mid \|\Theta\|_2 = 1\}. \tag{6}$$
Assume that all samples $(\boldsymbol{s}, \boldsymbol{y}) \sim \mu_{SY}$ are generated from $\boldsymbol{y} \sim U\{-1, +1\}$ (uniformly at random), $\boldsymbol{a}_i \overset{\text{i.i.d.}}{\sim} \mathrm{Bernoulli}(0.5 + \boldsymbol{y} \cdot (p - 0.5))$, and $\boldsymbol{x}_i \overset{\text{i.i.d.}}{\sim} \mathcal{N}(0, \sigma^2 \boldsymbol{I}_c)$, where $i = 1, 2, \ldots, |\mathcal{V}|$ and $0 < p < 1$. Then, given $\rho \ge 0$, for any $e \in \mathcal{E}$ we obtain
$$\mathrm{GRV}_\rho(e) = 1 - H_b\big(0.5 + \mathrm{AG}_\rho(f^* \circ e)\big).$$
Next, consider a simpler case in which $\boldsymbol{y} \sim U\{-1, +1\}$ and $\boldsymbol{a}_i \overset{\text{i.i.d.}}{\sim} \mathrm{Bernoulli}(0.5 + \boldsymbol{y} \cdot (p - 0.5))$ still hold, but $\boldsymbol{x}_i = \boldsymbol{1}_c$ for $i = 1, \ldots, |\mathcal{V}|$, and the set of encoders is $\mathcal{E} = \{e: (\boldsymbol{a}, \boldsymbol{x}) \mapsto \mathrm{sgn}[(\boldsymbol{a}^T \boldsymbol{x} - 0.5 |\mathcal{V}| \boldsymbol{1}_c^T) \Theta] \mid \|\Theta\|_2 = 1\}$, which can be regarded as the non-attribute case. Then, given $\rho \ge 0$, for any $e \in \mathcal{E}$ we have
$$\mathrm{GRV}_\rho(e) \ge 1 - H_b\big(0.5 + 0.5\, \mathrm{AG}_\rho(f^* \circ e)\big), \tag{7a}$$
$$\mathrm{GRV}_\rho(e) \le 1 - H_b\big(0.5 + \mathrm{AG}_\rho(f^* \circ e)\big). \tag{7b}$$

Theorem 4.1 reveals an explicit connection between $\mathrm{GRV}_\rho(e)$ and the $\mathrm{AG}_\rho(f^* \circ e)$ achieved by the best classifier in the topology-aware case. We note that $H_b(\theta)$ is concave on $(0, 1)$ and that the maximum of $H_b$ is attained uniquely at $\theta = 0.5$. Thus, a smaller GRV implies a smaller AG, and vice versa.

Theorem 4.2 (Attribute-aware). Let $(\mathcal{A}, \ell_0)$ and $(\mathcal{X}, \ell_p)$ be the input metric spaces, $\mathcal{Y} = \{-1, +1\}$ the label space, and $\mathcal{Z} = \{-1, +1\}$ the representation space. Suppose the set of encoders is as in (6). Assume that the samples $(\boldsymbol{s}, \boldsymbol{y}) \sim \mu_{SY}$ are generated from $\boldsymbol{y} \sim U\{-1, +1\}$ (uniformly at random), $\boldsymbol{a}_i \overset{\text{i.i.d.}}{\sim} \mathrm{Bernoulli}(0.5)$, and $\boldsymbol{x}_i \overset{\text{i.i.d.}}{\sim} \mathcal{N}(\boldsymbol{y} \cdot \boldsymbol{\mu}, \sigma^2 \boldsymbol{I}_c)$, where $i = 1, 2, \ldots, |\mathcal{V}|$. Then, given $\rho \ge 0$, for any $e \in \mathcal{E}$ we have
$$\mathrm{GRV}_\rho(e) = 1 - H_b\big(\tfrac{1}{2} + \mathrm{AG}_\rho(f^* \circ e)\big). \tag{8}$$
Next, consider a simpler case in which $\boldsymbol{y} \sim U\{-1, +1\}$ and $\boldsymbol{x}_i \overset{\text{i.i.d.}}{\sim} \mathcal{N}(\boldsymbol{y} \cdot \boldsymbol{\mu}, \sigma^2 \boldsymbol{I}_c)$, but $\boldsymbol{a} \in \{0, 1\}^{|\mathcal{V}|}$ with $\sum_{i=1}^{|\mathcal{V}|} \boldsymbol{a}_i = n_0 + n_1$, where $n_0 = |\mathcal{V}|/4 + \boldsymbol{y} \cdot (p - |\mathcal{V}|/4)$, $n_1 = |\mathcal{V}|/4 + \boldsymbol{y} \cdot (q - |\mathcal{V}|/4)$, $p + q = |\mathcal{V}|/2$, $0 \le p, q \le |\mathcal{V}|/2$, and $p, q \in \mathbb{Z}$; that is, $\boldsymbol{a}^T \boldsymbol{x}$ aggregates $n_0$ samples with $\boldsymbol{y} = +1$ and $n_1$ samples with $\boldsymbol{y} = -1$. Further suppose that the set of encoders is as presented in (6). Then, given $\rho \ge 0$, (7) also holds for any $e \in \mathcal{E}$. Similarly, a smaller $\mathrm{GRV}_\rho$ corresponds to a smaller $\mathrm{AG}_\rho$ in Theorem 4.2. Note that Theorems 4.1 and 4.2 still hold when $\boldsymbol{a}$ contains self-loops.

General case. We now consider a more general case in which $Y$ depends on both $A$ and $X$. In this case, we can extend (Zhu et al. 2020, Theorem 3.4) to the graph domain. Regardless of the encoder, the theorem below provides a general lower bound on the adversarial risk over all downstream classifiers that involves both MI and GRV.

Theorem 4.3 (Zhu et al. 2020). Let $(\mathcal{S}, d)$ be the input metric space, $\mathcal{Z}$ the representation space, and $\mathcal{Y}$ the label space. Assume that the distribution of labels $\mu_Y$ over $\mathcal{Y}$ is uniform and that $S$ is the random variable following the joint distribution of inputs $\mu_{AX}$. Further suppose that $\mathcal{F}$ is the set of downstream classifiers. Given $\tau \ge 0$,
$$\inf_{f \in \mathcal{F}} \mathrm{AdvRisk}_\tau(f \circ e) \ge 1 - \frac{I(S; e(S)) - \mathrm{GRV}_\tau(e) + \log 2}{\log |\mathcal{Y}|}$$
holds for any encoder $e$.

Theorem 4.3 suggests that a lower adversarial risk over all downstream classifiers cannot be achieved without either a lower GRV or a higher MI between $S$ and $e(S)$. It follows that jointly maximizing $I(S; e(S))$ and minimizing $\mathrm{GRV}_\tau(e)$ enables the learning of robust representations. Note that Theorem 4.3 also holds for the graph classification task.
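To make the monotone relationship concrete, here is a small numerical illustration of the equality in Theorem 4.1 (our own example, not from the paper, with $H_b$ measured in bits). If $\mathrm{AG}_\rho(f^* \circ e) = 0.1$, then
$$H_b(0.6) = -0.6 \log_2 0.6 - 0.4 \log_2 0.4 \approx 0.971, \qquad \mathrm{GRV}_\rho(e) = 1 - 0.971 \approx 0.029,$$
whereas if $\mathrm{AG}_\rho(f^* \circ e) = 0.3$, then
$$H_b(0.8) \approx 0.722, \qquad \mathrm{GRV}_\rho(e) = 1 - 0.722 \approx 0.278.$$
Tripling the adversarial gap thus corresponds to an almost tenfold increase in GRV, so driving GRV down forces the adversarial gap of the best downstream classifier down as well.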
## 5 Experiments

In the experiments, we train our model in a fully unsupervised manner and then apply the output representations to three graph learning tasks. Compared with non-robust and other robust graph representation models, the proposed model produces representations that are more robust against adversarial attacks. Furthermore, the superiority of our model still holds under different attack strengths and under various attack strategies.

### 5.1 Experimental Setup

For evaluation, we use three datasets, Cora, Citeseer, and Polblogs, and compare our model with the following baselines. Non-robust graph representation learning: 1) Raw: concatenating graph topology and node attributes (only graph topology for Polblogs); 2) DeepWalk (Perozzi et al. 2014): a random walk-based unsupervised graph model; 3) DeepWalk+X: concatenating the DeepWalk embedding and the node attributes; 4) GAE (Kipf et al. 2016): a variational graph auto-encoder; and 5) DGI (Veličković et al. 2019): another unsupervised graph model based on MI. Defense models: 1) Dwns_AdvT (Dai et al. 2019): a defense model designed for DeepWalk; 2) RSC (Bojchevski et al. 2017): a robust unsupervised graph model based on spectral clustering; 3) DGI-EdgeDrop (Rong et al. 2020): a defense model that drops 10% of edges during DGI training; 4) DGI-Jaccard (Wu et al. 2019b): DGI applied to a pruned adjacency matrix in which nodes with low Jaccard similarity are forced to be disconnected; and 5) DGI-SVD (Entezari et al. 2020): DGI applied to a low-rank approximation of the adjacency matrix obtained by truncated SVD. We also include Ours-soft, a variant of our model that removes the soft margin on GRV.

Implementation details. In the training phase, we adopt the projected gradient descent topology attack (Xu et al. 2019a) and the PGD attack (Madry et al. 2018) to construct adversarial examples of $\boldsymbol{a}$ and $\boldsymbol{x}$, respectively. We set $\gamma = 5\mathrm{e}{-3}$, $\delta = 0.4|\mathcal{E}|$, and $\epsilon = 0.1$. For Polblogs, we do not perform attacks on the pseudo node attributes. In evaluation, we use the same attack strategy as in training but set $\delta = 0.2|\mathcal{E}|$ to satisfy the imperceptibility constraint. Considering training efficiency and the realism of attacks during evaluation, the step size and the number of iterations are set differently for the two phases. Note that DeepWalk and RSC both require the entire graph, and thus we retrain them using the polluted data. The evaluation is performed on node classification, link prediction, and community detection. We run 10 trials for all experiments and report the average performance and standard deviation. Code is available at: https://github.com/galina0217/robustgraph.
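For reference, the reported hyperparameters can be collected into a single configuration object; the dictionary below simply mirrors the values stated above (names are illustrative and not taken from the released code).

```python
# Hypothetical configuration mirroring the hyperparameters reported above.
config = {
    "gamma": 5e-3,             # soft margin on GRV
    "train_delta_ratio": 0.4,  # topology budget: delta = 0.4 * |E| during training
    "eval_delta_ratio": 0.2,   # delta = 0.2 * |E| at evaluation (imperceptibility)
    "epsilon": 0.1,            # L-infinity budget on node attributes
    "num_trials": 10,          # runs per experiment (mean and std reported)
}
```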
### 5.2 Results

Performance on downstream tasks. Table 2 summarizes the performance of the different models on the three tasks. Our model beats the best baseline by an average of +1.8% on node classification, +1.8% on link prediction, and +45.8% on community detection. It is worth noting that, in community detection, adversarial attacks can dramatically affect model performance because the task itself is very sensitive to the global graph topology. The gap between the performance of our model and that of the non-robust graph learning models indicates the importance of defense. Moreover, our model still stands out by a large margin when compared with existing defense models. Besides, the ablation study, i.e., comparing the last two rows in Table 2, shows the benefit of the soft margin on GRV.

Table 2: Summary of results for the node classification (NC, Acc%), link prediction (LP, AUC%), and community detection (CD, NMI%) tasks using polluted data (mean ± std).

| Model | NC Cora | NC Citeseer | NC Polblogs | LP Cora | LP Citeseer | LP Polblogs | CD Cora | CD Citeseer | CD Polblogs |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Raw | 57.4 ± 3.0 | 49.7 ± 1.6 | 73.9 ± 0.9 | 60.5 ± 0.1 | 50.2 ± 0.5 | 89.0 ± 0.4 | 9.7 ± 7.5 | 1.0 ± 0.5 | 0.2 ± 0.1 |
| DeepWalk | 56.2 ± 1.1 | 16.5 ± 0.9 | 80.4 ± 0.5 | 55.4 ± 0.8 | 50.3 ± 0.3 | 89.2 ± 0.7 | 34.6 ± 0.6 | 11.1 ± 1.0 | 0.4 ± 0.5 |
| DeepWalk + X | 59.3 ± 0.4 | 26.5 ± 0.5 | - | 55.9 ± 0.6 | 50.9 ± 0.3 | - | 34.2 ± 3.7 | 11.1 ± 1.3 | - |
| GAE | 14.0 ± 1.2 | 16.2 ± 1.1 | 49.9 ± 1.2 | 52.4 ± 1.4 | 50.9 ± 1.8 | 50.5 ± 1.3 | 10.9 ± 2.1 | 1.4 ± 1.7 | 9.2 ± 1.0 |
| DGI | 69.3 ± 2.8 | 53.2 ± 2.2 | 75.2 ± 2.4 | 68.6 ± 0.4 | 57.6 ± 2.1 | 91.2 ± 1.1 | 30.3 ± 3.5 | 8.5 ± 3.8 | 6.0 ± 5.6 |
| Dwns_AdvT | 59.2 ± 1.2 | 25.0 ± 1.0 | 80.7 ± 0.5 | 56.0 ± 0.7 | 50.7 ± 0.4 | 89.5 ± 0.8 | 35.0 ± 0.7 | 11.5 ± 1.0 | 0.9 ± 0.7 |
| RSC | 46.9 ± 3.5 | 34.0 ± 2.2 | 58.9 ± 1.7 | 52.5 ± 0.4 | 57.2 ± 0.2 | 61.5 ± 0.4 | 4.9 ± 0.7 | 1.8 ± 0.4 | 4.4 ± 4.3 |
| DGI-EdgeDrop | 56.0 ± 4.3 | 49.0 ± 4.5 | 79.8 ± 1.7 | 66.2 ± 0.8 | 61.3 ± 0.9 | 89.3 ± 1.6 | 30.1 ± 6.8 | 7.34 ± 0.8 | 9.0 ± 7.8 |
| DGI-Jaccard | 69.4 ± 2.8 | 57.1 ± 1.3 | 79.3 ± 0.8 | 63.8 ± 0.8 | 57.6 ± 1.0 | 84.7 ± 0.9 | 16.4 ± 1.1 | 6.1 ± 0.6 | 12.9 ± 0.0 |
| DGI-SVD | 68.1 ± 8.0 | 56.1 ± 16.4 | 81.6 ± 0.7 | 60.1 ± 0.8 | 54.7 ± 1.3 | 85.2 ± 0.7 | 16.2 ± 0.9 | 6.5 ± 0.8 | 13.0 ± 0.0 |
| Ours-soft | 69.4 ± 0.7 | 57.5 ± 2.0 | 79.7 ± 2.1 | 68.1 ± 0.3 | 58.2 ± 1.3 | 90.3 ± 0.5 | 39.2 ± 8.8 | 23.5 ± 1.9 | 12.6 ± 9.6 |
| Ours | 70.7 ± 0.9 | 58.4 ± 1.4 | 82.7 ± 2.2 | 69.2 ± 0.4 | 59.8 ± 1.3 | 91.8 ± 0.4 | 41.4 ± 4.7 | 23.6 ± 2.8 | 14.8 ± 2.7 |

Figure 2: Accuracy of different models (Ours, DGI, DGI-Jaccard, DGI-SVD) under various perturbation rates $\delta$ and $\epsilon$; the downstream task is node classification on the Cora dataset, and the shaded area indicates the standard deviation over 10 runs.

Performance under different rates of perturbation. We further compare our model with several strong competitors under various strengths of adversarial attacks on the graph topology and the node attributes by choosing different perturbation rates $\delta$ and $\epsilon$, respectively. We use the node classification task and the Cora dataset as an illustrative example. As shown in Figure 2, the performance of our model is consistently superior to that of the competitors, both on average and in the worst case. Note that the strong competitor DGI generates negative samples in the training phase, which might explain its robustness. In comparison, the high standard deviation of DGI-SVD might be attributed to the continuous low-rank approximation of the adjacency matrix: the output of truncated SVD is no longer a 0-1 matrix, which violates the discrete nature of graph topology.

Performance under other attack strategies. In practice, we do not know which attack strategies malicious users are going to use. Thus it is interesting and important to know how our model performs across different types of adversarial attacks. We adapt some common attack strategies to the unsupervised setting and use them as baselines: 1) Degree/Betw/Eigen: flip edges based on the sum of the degree/betweenness/eigenvector centrality of the two end nodes; 2) DW (Bojchevski et al. 2019a): a black-box attack method designed for DeepWalk.
We set the size of the sampled candidate set to 20K, as suggested in (Bojchevski et al. 2019a). Here we consider the node classification task on Polblogs for illustration; this choice is sensible because all of the above attack strategies only perturb the graph topology, which is the only information available for Polblogs. The results in Table 3 show that our model's superiority persists under three of the four attack strategies. Comparing Table 2 and Table 3 shows that the projected gradient descent topology attack via MI is the most effective attack strategy used here, which verifies that our model learns from the worst-case adversarial examples that degrade performance the most.

| Model | Degree | Betw | Eigen | DW |
| --- | --- | --- | --- | --- |
| Raw | 87.4 ± 0.3 | 84.1 ± 0.8 | 86.4 ± 0.6 | 87.9 ± 0.4 |
| DeepWalk | 87.8 ± 0.9 | 83.5 ± 1.2 | 84.3 ± 1.0 | 87.7 ± 0.9 |
| DeepWalk + X | 85.8 ± 2.7 | 82.7 ± 2.1 | 85.0 ± 1.1 | 88.3 ± 0.9 |
| GAE | 83.7 ± 0.9 | 81.0 ± 1.6 | 81.5 ± 1.4 | 85.4 ± 1.1 |
| DGI | 86.6 ± 1.1 | 84.8 ± 1.2 | 84.8 ± 1.0 | 86.4 ± 1.1 |
| Dwns_AdvT | 88.0 ± 1.0 | 84.1 ± 1.3 | 84.6 ± 1.0 | 88.0 ± 0.8 |
| RSC | 52.1 ± 1.3 | 51.9 ± 0.7 | 51.4 ± 0.5 | 52.6 ± 1.1 |
| DGI-EdgeDrop | 87.1 ± 0.3 | 87.0 ± 0.6 | 80.5 ± 0.5 | 86.3 ± 0.3 |
| DGI-Jaccard | 82.1 ± 0.3 | 80.7 ± 0.4 | 80.6 ± 0.3 | 82.2 ± 0.2 |
| DGI-SVD | 86.5 ± 0.2 | 85.6 ± 0.2 | 86.1 ± 0.2 | 85.3 ± 0.3 |
| Ours-soft | 88.5 ± 0.7 | 85.7 ± 1.5 | 86.2 ± 0.4 | 88.7 ± 0.7 |
| Ours | 89.3 ± 0.7 | 86.3 ± 1.2 | 86.7 ± 0.4 | 89.0 ± 0.8 |

Table 3: Defense against different attackers on Polblogs for the node classification task.

## 6 Conclusion

In this paper, we study unsupervised robust representation learning on graphs. We introduce the graph representation vulnerability (GRV) to quantify the robustness of an unsupervised graph encoder. We then propose a robust unsupervised graph model that enhances robustness while maintaining expressive power. We further establish sound theoretical connections between GRV and one example task, node classification. Extensive experimental results demonstrate the effectiveness of our method at blocking perturbations on input graphs, regardless of the downstream task.

## Acknowledgements

This work is supported by NSFC (62176233), the National Key Research and Development Project of China (2018AAA0101900), NSF III-1705169, an Okawa Foundation Grant, Amazon Research Awards, a Picsart gift, NSFC (71531006), the Major Scientific Project of Zhejiang Laboratory (2020MC0AE01), and the Fundamental Research Funds for the Central Universities (Zhejiang University New Generation Industrial Control System (NGICS) Platform).

## References

Bojchevski, A.; et al. 2017. Robust spectral clustering for noisy data: Modeling sparse corruptions improves latent embeddings. In SIGKDD, 737-746.
Bojchevski, A.; et al. 2019a. Adversarial attacks on node embeddings via graph poisoning. In ICML, 695-704.
Bojchevski, A.; et al. 2019b. Certifiable robustness to graph perturbations. In NeurIPS, 1656-1665.
Champion, T.; et al. 2008. The ∞-Wasserstein Distance: Local Solutions and Existence of Optimal Transport Maps. SIAM Journal on Mathematical Analysis, 40(1): 1-20.
Chen, L.; Li, J.; Peng, J.; Xie, T.; Cao, Z.; Xu, K.; He, X.; and Zheng, Z. 2020. A survey of adversarial learning on graph. arXiv preprint arXiv:2003.05730.
Dai, H.; Li, H.; Tian, T.; Huang, X.; Wang, L.; Zhu, J.; and Song, L. 2018. Adversarial attack on graph structured data. In ICML, 1115-1124.
Dai, Q.; Shen, X.; Zhang, L.; Li, Q.; and Wang, D. 2019. Adversarial Training Methods for Network Embedding. In WWW, 329-339.
Entezari, N.; Al-Sayouri, S. A.; Darvishzadeh, A.; and Papalexakis, E. E. 2020. All you need is low (rank): Defending against adversarial attacks on graphs. In WSDM, 169-177.
Feurer, M.; Klein, A.; Eggensperger, K.; Springenberg, J.; Blum, M.; and Hutter, F. 2015. Efficient and Robust Automated Machine Learning. In NeurIPS, 2755-2763.
Gilmer, J.; Schoenholz, S. S.; Riley, P. F.; Vinyals, O.; and Dahl, G. E. 2017. Neural message passing for quantum chemistry. In ICML, 1263-1272.
Grover, A.; et al. 2016. node2vec: Scalable feature learning for networks. In SIGKDD, 855-864.
Hamilton, W. L.; et al. 2017. Inductive representation learning on large graphs. In NeurIPS, 1025-1035.
Hjelm, R. D.; Fedorov, A.; Lavoie-Marchildon, S.; Grewal, K.; Bachman, P.; Trischler, A.; and Bengio, Y. 2019. Learning deep representations by mutual information estimation and maximization. In ICLR.
Hu, W.; Liu, B.; Gomes, J.; Zitnik, M.; Liang, P.; Pande, V.; and Leskovec, J. 2019. Strategies for pre-training graph neural networks. arXiv preprint arXiv:1905.12265.
Hu, Z.; Dong, Y.; Wang, K.; Chang, K.-W.; and Sun, Y. 2020. GPT-GNN: Generative pre-training of graph neural networks. In SIGKDD, 1857-1867.
Jin, W.; Ma, Y.; Liu, X.; Tang, X.; Wang, S.; and Tang, J. 2020. Graph structure learning for robust graph neural networks. In SIGKDD, 66-74.
Kipf, T. N.; et al. 2016. Variational graph auto-encoders. NIPS Workshop on Bayesian Deep Learning.
Kipf, T. N.; et al. 2017. Semi-supervised classification with graph convolutional networks. In ICLR.
Madry, A.; Makelov, A.; Schmidt, L.; Tsipras, D.; and Vladu, A. 2018. Towards deep learning models resistant to adversarial attacks. In ICLR.
Miyato, T.; et al. 2017. Adversarial training methods for semi-supervised text classification. In ICLR.
Perozzi, B.; et al. 2014. DeepWalk: Online learning of social representations. In SIGKDD, 701-710.
Qiu, J.; Chen, Q.; Dong, Y.; Zhang, J.; Yang, H.; Ding, M.; Wang, K.; and Tang, J. 2020. GCC: Graph contrastive coding for graph neural network pre-training. In SIGKDD, 1150-1160.
Rong, Y.; Huang, W.; Xu, T.; and Huang, J. 2020. DropEdge: Towards Deep Graph Convolutional Networks on Node Classification. In ICLR.
Tanay, T.; et al. 2016. A boundary tilting persepective on the phenomenon of adversarial examples. arXiv preprint arXiv:1608.07690.
Tsipras, D.; Santurkar, S.; Engstrom, L.; Turner, A.; and Madry, A. 2019. Robustness may be at odds with accuracy. In ICLR.
Veličković, P.; Fedus, W.; Hamilton, W. L.; Liò, P.; Bengio, Y.; and Hjelm, R. D. 2019. Deep Graph Infomax. In ICLR.
Wang, B.; Jia, J.; Cao, X.; and Gong, N. Z. 2021. Certified robustness of graph neural networks against adversarial structural perturbation. In SIGKDD, 1645-1653.
Wang, X.; et al. 2019. GraphDefense: Towards robust graph convolutional networks. arXiv preprint arXiv:1911.04429.
Wu, F.; Zhang, T.; Souza Jr., A. H. d.; Fifty, C.; Yu, T.; and Weinberger, K. Q. 2019a. Simplifying graph convolutional networks. In ICML, 6861-6871.
Wu, H.; Wang, C.; Tyshetskiy, Y.; Docherty, A.; Lu, K.; and Zhu, L. 2019b. Adversarial examples for graph data: Deep insights into attack and defense. In IJCAI, 4816-4823.
Xu, H.; Ma, Y.; Liu, H.-C.; Deb, D.; Liu, H.; Tang, J.-L.; and Jain, A. K. 2020a. Adversarial attacks and defenses in images, graphs and text: A review. International Journal of Automation and Computing, 17(2): 151-178.
Xu, J.; Sun, Y.; Jiang, X.; Wang, Y.; Yang, Y.; Wang, C.; and Lu, J. 2020b. Query-free Black-box Adversarial Attacks on Graphs. arXiv preprint arXiv:2012.06757.
Xu, J.; Yang, Y.; Pu, S.; Fu, Y.; Feng, J.; Jiang, W.; Lu, J.; and Wang, C. 2021. NetRL: Task-aware Network Denoising via Deep Reinforcement Learning. IEEE Transactions on Knowledge and Data Engineering, 1-1.
Xu, J.; Yang, Y.; Wang, C.; Liu, Z.; Zhang, J.; Chen, L.; and Lu, J. 2020c. Robust Network Enhancement from Flawed Networks. IEEE Transactions on Knowledge and Data Engineering, 1-1.
Xu, K.; Chen, H.; Liu, S.; Chen, P.-Y.; Weng, T.-W.; Hong, M.; and Lin, X. 2019a. Topology attack and defense for graph neural networks: An optimization perspective. In IJCAI, 3961-3967.
Xu, K.; Hu, W.; Leskovec, J.; and Jegelka, S. 2019b. How Powerful are Graph Neural Networks? In ICLR.
Yang, Y.; Xu, Y.; Sun, Y.; Dong, Y.; Wu, F.; and Zhuang, Y. 2021. Mining Fraudsters and Fraudulent Strategies in Large-Scale Mobile Social Networks. IEEE Transactions on Knowledge and Data Engineering, 33(1): 169-179.
Yang, Y.; Xu, Y.; Wang, C.; Sun, Y.; Wu, F.; Zhuang, Y.; and Gu, M. 2019. Understanding Default Behavior in Online Lending. In CIKM, 2043-2052. Association for Computing Machinery.
Zhang, H.; Yu, Y.; Jiao, J.; Xing, E.; Ghaoui, L. E.; and Jordan, M. 2019. Theoretically Principled Trade-off between Robustness and Accuracy. In ICML, 7472-7482.
Zheng, Q.; Zou, X.; Dong, Y.; Cen, Y.; Yin, D.; Xu, J.; Yang, Y.; and Tang, J. 2021. Graph Robustness Benchmark: Benchmarking the Adversarial Robustness of Graph Machine Learning. NeurIPS Datasets and Benchmarks.
Zhu, S.; et al. 2020. Learning adversarially robust representations via worst-case mutual information maximization. In ICML, 11609-11618.
Zügner, D.; et al. 2019a. Adversarial attacks on graph neural networks via meta learning. In ICLR.
Zügner, D.; et al. 2019b. Certifiable robustness and robust training for graph convolutional networks. In SIGKDD, 246-256.