# goodnessoffit_tests_for_inhomogeneous_random_graphs__a513d52e.pdf Goodness-of-Fit Tests for Inhomogeneous Random Graphs Soham Dan 1 Bhaswar B. Bhattacharya 2 Hypothesis testing of random networks is an emerging area of modern research, especially in the high-dimensional regime, where the number of samples is smaller or comparable to the size of the graph. In this paper we consider the goodnessof-fit testing problem for large inhomogeneous random (IER) graphs, where given a (known) reference symmetric matrix Q [0, 1]n n and m independent samples from an IER graph given by an unknown symmetric matrix P [0, 1]n n, the goal is to test the hypothesis P = Q versus ||P Q|| ε, where || || is some specified norm on symmetric matrices. Building on recent related work on two-sample testing for IER graphs, we derive the optimal minimax sample complexities for the goodness-of-fit problem in various natural norms, such as the Frobenius norm and the operator norm. We also propose practical implementations of natural test statistics, using their asymptotic distributions and through the parametric bootstrap. We compare the performances of the different tests in simulations, and show that the proposed tests outperform the baseline tests across various natural random graphs models. 1. Introduction With the ubiquitous presence networks in bioinformatics and social sciences, developing statistical methods for graph-valued data has become increasingly important. Although network analysis has been an area of active interest in statistics and machine learning, most classical approaches for graph testing are applicable in the relatively low-dimensional setting, where the population size (number of graphs) is larger than the size of the graphs (number of 1Department of Computer and Information Science, University of Pennsylvania, Philadelphia, USA, 2Department of Statistics, University of Pennsylvania, Philadelphia, USA. Correspondence to: Soham Dan , Bhaswar B. Bhattacharya . Proceedings of the 37 th International Conference on Machine Learning, Online, PMLR 119, 2020. Copyright 2020 by the author(s). vertices). However, in the modern high-dimensional regime [19] the number of samples m could be potentially much smaller or comparable to the size of the graph n, for example, graphs may correspond to different sets of relations constructed for a set of actors [16]. As a result, theoretical understanding for testing random graphs is an area of emerging interesting. Most of the recent work has focussed primarily on detecting planted communities or sparse structures in the network [3, 4, 22, 26], and testing for network dependence [2, 8]. Here, we consider the problem of goodness-of-fit testing for network data, which involves developing statistical tests for assessing whether a given sample of networks fits a specified model. This problem has found many applications recently, for example, in assessing fit of protein-protein interaction (PPI) networks [9, 27] and in functional neuroimaging data [15]. In particular, Ospina-Forero et al. [27] developed a non-parametric procedure for testing whether network models used as backgrounds in community detection of social networks, such as the Erd os-R enyi model or the Chung-Lu model [6], can also be used to describe the appearance of subgraph counts in the Facebook networks in US universities. They also applied their method for assessing fit in PPI networks, which arise in many biological studies, such as the discovery of disease risk pathways and the investigation of genes undergoing age expression changes. Another application is in functional neuro-imaging data [15], where the vertices correspond to regions of interest in the brain, and an edge between two regions indicates functional connectivity, in the sense that the two regions interact together to achieve some higher-order function. Here, Ginestet et al. [15] considered the problem of testing whether the Laplacian matrix of the model generating a given sample of networks is equal to a reference Laplacian matrix, based on the asymptotics of the sample (Fr echet) mean, when the graph sizes are fixed and the sample sizes grow to infinity. Other goodness-of-fit tests for the β-model and the exponential random graph model (ERGM) are discussed in [7] and [23], respectively. In this paper, we propose theoretically optimal and computationally efficient methods for network goodness-of-fit for inhomogeneous Erd os-R enyi (IER) random graph models. This is a general class of random graph model, which includes several popular network models, such as the Chung Lu model [6], the β-model [5], random dot product graphs Goodness-of-Fit Tests for Inhomogeneous Random Graphs [29], and stochastic block models [22]. Our proposed tests attain optimal sample complexities, under different matrix norms, and can be efficiently implemented, either by their asymptotic distributions or the parametric bootstrap. The tests also perform well in a range of simulation experiments, illustrating the broad applicability of the methods. 1.1. Problem Statement and Summary of Results Given a symmetric matrix P (n) [0, 1]n n with zeroes on the diagonal, a graph G is said to be an inhomogeneous Erd os-R enyi (IER) random graph [1] with edge probability P (n) = ((pij)), denoted as G IER(P (n)), if its symmetric adjacency matrix A(G) = ((aij(G))) {0, 1}n n have independent entries satisfying: aij(G) Ber(pij) for all i < j. Recently, Tang et al. [29, 30] and Ghoshdastidar et al. [12, 13, 14] studied the problem of two-sample testing for this model which asks: Given two populations of random graphs, decide whether both populations are generated from the same distribution or not? This paper addresses the related problem of goodness-of-fit, where given independent graph samples G1, G2, . . . , Gm IER(P (n)), and a known matrix Q(n), the goal is to test the hypothesis H0 : P (n) = Q(n) versus H1 : ||P (n) Q(n)|| ε, (1.1) where || || is some norm on symmetric matrices. Here, we will consider the following three natural norms on a symmetric matrix A = ((aij)) [0, 1]n n: Frobenius Norm: ||A||F = q P 1 i,j n a2 ij, which is the root of the sum of squares of all the entries in A. Operator Norm: ||A||op = max{|λn(A)|, |λ1(A)|}, where λn(A) λn 1(A) λ1(A) are the eigenvalues of A. Zero Norm: ||A||0 = P 1 i,j n 1{aij = 0}, which is the number of non-zero entries in A. This is not really a norm, but is a popular way to quantity the sparsity of matrix. Given i.i.d. samples G1, G2, . . . , Gm from IER(P), a test is a binary function φ : Gm := (G1, G2, . . . , Gm) {0, 1}, which is 0 when the test accepts H0 and 1 otherwise. The worst-case risk of a test function φ for the testing problem (1.1) is defined as: Rm(Q(n), φ, || ||) = PQ(n)(φ = 1) + sup P (n): ||P (n) Q(n)|| ε PP (n)(φ = 0), (1.2) which is the sum of the Type I error and the maximum possible Type II error rate of the test φ. (Hereafter, we will omit the dependence on the distance function || || in (1.2) above, whenever it is clear from the context.) We are interested in the asymptotic regime where the risk (1.2) transitions from 0 to 1. This is formalized in the following definition: Definition 1.1. Given G1, G2, . . . , Gm i.i.d. samples from IER(P (n)), where m = mn can depend on n, a sequence of test functions φn,m is said to be asymptotically powerless for (1.1), if there exists a sequence of symmetric matrices Q(n) [0, 1]n n such that limn Rm(Q(n), φm,n) = 1. On the other hand, a sequence of test functions φn,m is said to be asymptotically powerful for (1.1), if for all symmetric matrices Q(n) [0, 1]n n, limn Rm(Q(n), φn,m) = 0. The main focus of this paper is to derive optimality results for goodness-of-fit testing in IER graphs for the various norms described above, and complement these results with implementable tests based on asymptotic properties and the bootstrap. The following is summary of the results obtained in this paper: We show that the optimal sample complexity for testing separation as in (1.1), for both the Frobenius (Theorem 2.2) and the operator norm (Theorem 2.3), is n/ε2. This means that there is a (computationally efficient) test which is asymptotical powerful for (1.1) when the sample size m n/ε2, and all tests asymptotically powerless when the sample size m n/ε2. We also show that testing for any separation is impossible in the zero norm (Theorem 2.1). Next, we derive the asymptotic null distribution and statistical consistency (against a large class of alternatives) of a natural goodness-of-fit test, based on a sample estimate of the Frobenius norm, which can be used to efficiently calibrate the test statistic for moderate to large sized networks (Section 3). This test statistic, however, fails to work when there is only one sample (m = 1), in which case, we propose a test based on the operator norm of the sample adjacency matrix. We also discuss how the method of parametric bootstrap [11] can be used to approximately calibrate any test statistic, and compare the performance of the different tests in finite sample simulations (Section 4). Our experiments show that the proposed asymptotic Frobenius norm based test accurately approximates the null distribution and has good power for a wide class of alternatives for moderate sized networks, even when the sample size m is very small. Goodness-of-Fit Tests for Inhomogeneous Random Graphs 1.2. Organization The rest of the paper is organized as follows: The optimal sample complexities for testing in the norms described above are derived in Section 2. The asymptotic null distribution, consistency, and details of the bootstrap are discussed in Section 3. The performance of the different tests in finite sample simulations are described in Section 4. Proofs of the theorems and additional simulations are given in the appendix. 2. Minimax Sample Complexities In this section, we obtain the optimal sample complexities for goodness-of-fit testing under the three norms described above. We begin by recalling some standard asymptotic notation. For two nonnegative sequences {an}n 1 and {bn}n 1, an bn means an = O(bn); an bn means an = (1 + o(1))bn; an bn means an bn an; an bn means an = o(bn); and an bn means bn = o(an). We start with an impossibility theorem about the zero-norm. The following theorem, which is proved in Appendix 1, shows that testing in zero norm is impossible, for any symmetric matrix Q(n) [0, 1]n n and any ε > 0. This is because, we can increase the zero norm between two matrices to their maximum possible value (which is n(n 1)), by making arbitrarily small perturbations to the elements of the matrices. Theorem 2.1. For the testing problem (1.1) under the zero norm || ||0, all tests are asymptotically powerless for any sequence of symmetric matrices Q(n) [0, 1]n n and any ε > 0. Next, we consider testing in the Frobenius norm. In this case our test is based on the following statistic: 1 i 1. We also discuss a spectral test based on the (properly normalized) adjacency matrix of the observed graph, for the case m = 1. In Section 3.2 we describe the method of parametric bootstrap, a well-known resampling technique [11] for calibrating the null distribution of any test statistic. 3.1. Asymptotic Properties We begin with the asymptotic distribution of Tm,n. The following theorem shows that Tm,n (appropriately scaled) converges in distribution to standard normal under H0. Using this we can construct a test with probability of Type I error converging to α (asymptotically level α), which is also consistent (probability of Type II error converging to zero), under a separation condition in terms of ||P (n) Q(n)||2 F . The asymptotics is in n , where m > 1 is also allowed to depend on n. We denote by zα the (1 α)-th quantile of the standard normal N(0, 1). Theorem 3.1. Suppose {Q(n)}n 1 be a sequence of edgeprobability matrices with entries bounded away from 1, such that limn ||Q(n)||F = . Then under the null hypothesis H0, Zm,n := Tm,n q 1 i zα/2 is asymptotically level α, that is, lim n PH0(|Zm,n| > zα/2) = α. (3.1) On the other hand, if {P (n)}n 1 is such that ||P (n) Q(n)||2 F 1 m||Q(n)||F , then lim n PP (n)(|Zm,n| > zα/2) = 1, (3.2) that is, the power of the test converges to 1. Figure 1. (a) Figure 2. (b) Figure 3. The quantile-quantile (QQ) plots of Zm,n, for sequence of m = 4 graphs on n = 100 vertices each generated from (a) the Erd os-R enyi model ER(100, 0.5), and (b) the planted bisection model PB(100, 0.9, 0.1). The proof of the theorem is given in Appendix 4. This result parallels the CLT for the related two-sample statistic derived in [12]. However, unlike in the two-sample case where the corresponding test is conservative (that is, the limiting Type I error is less than equals to α), for the goodness-of-fit problem the test which rejects H0 when |Zm,n| > zα/2 has asymptotic Type I error exactly α (see (3.1)). This is because in the goodness-of-fit problem, we know the variance of Tm,n under the null, and, hence, the standardized statistic Zm,n can be directly computed from the data. Moreover, since the test based on Tm,n unbiasedly estimates ||P (n) Q(n)||2 F , it is natural to expect consistency when this separation becomes large (as in (3.2)). Figure 3 shows the quantile-quantile (QQ) plots of the statistic Zm,n, under the null, for the Erd os-R enyi model and the planted bisection model: Erd os-R enyi Model: This is a special case of the IER model, where all the edge interconnection probabilities are the equal. More formally, given q [0, 1], we denote this model by ER(n, q), that is, a random graph on n vertices where each edge is present or absent independently with probability q. This is one of the most fundamental models for random networks, and has been extensively studied over the last few decades [20]. Planted Bisection Model: This is a special case of the well-known stochastic block model [25], in which the nodes are divided into two equal-sized communities and then edges are added randomly in a way that depends on the community membership. More formally, the planted bisection model is an IER graph on n vertices where the edge-probability edge Q = ((qij)) has Goodness-of-Fit Tests for Inhomogeneous Random Graphs the following form 2-block structure: a if 1 i = j n 2 < i = j n, b if 1 i n 2 < j n or n 2 < i n and 1 j n 2 0 if i = j, where a, b [0, 1]. Given a, b [0, 1], we denote a random graph on n vertices from this model as PB(n, a, b). To validate the asymptotic results in Theorem 3.1, we simulate the null distribution of Zm,n for the two models described above, for a collection of m = 4 graphs on n = 100 vertices, and compare the empirical quantiles of Zm,n (over 1000 iterations) with the predicted theoretical quantiles of N(0, 1). The plots show that the asymptotics in Theorem 3.1 give accurate approximations for moderate-size graphs (n = 100) sample size as small as 4. We investigate the power of this test for various alternatives in Section 4 and in Appendix 5. Remark 3.1. (The case m = 1.) When the data consists of only a single graph, the statistic (2.1) becomes degenerate, and it no longer unbiasedly estimates ||P (n) Q(n)||2 F . In this case, we propose a test based on the spectral norm of the (scaled) observed adjacency of the graph. To this end, suppose G1 IER(P (n)) and consider the scaled adjacency matrix Wn := ((wij)), where wij := aij(G1) qij p (n 1)qij(1 qij) , (3.3) for 1 i, j n, where Q(n) = ((qij)) is the reference edge-probability matrix under the null H0. Note that, under the null H0, Wn is a symmetric random matrix, whose entries above the diagonal are independent with mean zero and variance 1 n 1. Hence, by directly applying well-known results from random matrix theory [10, 21] we can get the limiting null distribution of the largest and the smallest eigenvalues of Wn. In particular, λ1(Wn) and λn(Wn) has the same limiting distribution as λ1(Dn) and λn(Dn), where Dn is a symmetric random matrix with zero diagonal, whose entries above the diagonal are i.i.d. normal with mean zero and variance 1 n 1 [10]. This combined with results of Lee and Yin [21] about λ1(Dn) and λn(Dn) implies n 2 3 (λ1(Wn) 2) D TW1 and n 2 3 ( λn(Wn) 2) D TW1, where TW1 is Tracy-Widom law for orthogonal ensembles [31]. Then, recalling that ||Wn||op = max{|λ1(Wn)|, |λn(Wn)|}, and a union bound, implies that under the null, lim sup n PH0 n 2 3 (||Wn||op 2) τα/2 α, (3.4) where τα/2 is the (1 α 2 )-th quantile of the TW1 distribution. 3.2. The Parametric Bootstrap The parametric bootstrap is a well-known resampling technique [11], which can be used to approximate the null distribution of any test statistic for the goodness-of-fit problem, by repeatedly sampling from the null model. The details of the algorithm are given below in Algorithm 1. Algorithm 1 Bootstrapping a Test Statistic Input : Samples G1, . . . , Gm from a IER model, the null edge-probability matrix Q, a test statistic Sm,n, a significance Level α (0, 1), and B 1 (the number of bootstrap repetitions). 1: for b = 1 to B do 2: Draw m samples G(b) 1 , . . . , G(b) m from the IER(Q) model. 3: Compute the test statistic S(b) m,n = Sm,n(G(b) 1 , . . . , G(b) m ). 4: Denote by Lm,n and Um,n the α 2 empirical quantiles of {S(1) m,n, S(2) m,n, . . . , S(B) m,n}, respectively. 5: end for Output: Reject H0 if Sm,n(G1, . . . , Gm) / [Lm,n, Um,n]. Otherwise accept H0. Note that sampling a IER(Q) random graph on n vertices takes O(n2) time, therefore, the algorithm above can be easily implemented for moderate sized networks. In Section 4, we compare the asymptotic tests described above with their bootstrap counterparts. We will refer to the bootstrapped analogue of Tm,n as the Bootstrapped Frobenius Test. We will also consider a bootstrap test based on the statistic s=1 A(Gs) m Q(n) where G1, G2, . . . , Gm are i.i.d. IER(Q(n)), which we will refer to as the Bootstrapped Operator-Norm Test.1 In addition, we will consider the bootstrap versions of the following two baseline tests: The Edge Test: Given G1, G2, . . . , Gm i.i.d. samples from IER(Q(n)), the edge test rejects the 1Incidentally, it can be shown by an application of matrix concentration inequalities that the test based on (3.5) is asymptotic powerful for detecting ε separation in the operator norm, whenever m n log n/ε2. One might also expect to remove the factor of log n to match the lower bound in Theorem 2.3, using a moment-method based argument, similar to that in [14], where the analogous two-sample problem was studied. Goodness-of-Fit Tests for Inhomogeneous Random Graphs null for large/small values of the sum of the total number of edges in the observed sample, that is, Pm s=1 P 1 i