# Local Private Hypothesis Testing: Chi-Square Tests

Marco Gaboardi (University at Buffalo, Buffalo, NY, USA) and Ryan Rogers (University of Pennsylvania, Philadelphia, PA, USA). Correspondence to: Marco Gaboardi.

The local model for differential privacy is emerging as the reference model for practical applications of collecting and sharing sensitive information while satisfying strong privacy guarantees. In the local model, there is no trusted entity which is allowed to hold each individual's raw data, as is assumed in the traditional curator model. Instead, individuals' data are perturbed before being shared. We explore the design of private hypothesis tests in the local model, where each data entry is perturbed to ensure the privacy of each participant. Specifically, we analyze locally private chi-square tests for goodness of fit and independence testing.

1. Introduction

Hypothesis testing is a widely applied statistical tool used to test whether given models should be rejected, or not, based on sampled data from a population. Hypothesis testing was initially developed for scientific and survey data, but today it is also an essential tool to test models over collections of social network, mobile, and crowdsourced data (American Statistical Association, 2014; Hunter et al., 2008; Steele et al., 2017). Collected data samples may contain highly sensitive information about the subjects, and the privacy of individuals can be compromised when the results of a data analysis are released. A way to address this concern is by developing new techniques to support privacy-preserving data analysis. Among the different approaches, differential privacy (Dwork et al., 2006b) has emerged as a viable solution: it provides strong privacy guarantees and it allows the release of accurate statistics. A standard way to achieve differential privacy is by injecting statistical noise into the computation of the data analysis. When the noise is carefully chosen, it helps to protect individual privacy without compromising the utility of the data analysis.

Several recent works have studied differentially private hypothesis tests that can be used in place of the standard, non-private hypothesis tests (Uhler et al., 2013; Yu et al., 2014; Sheffet, 2015; Karwa & Slavković, 2016; Wang et al., 2015; Gaboardi et al., 2016; Kifer & Rogers, 2017; Cai et al., 2017). These tests work in the curator model of differential privacy. In this model, the data is centrally stored, and the curator carefully injects noise in the computation of the data analysis in order to satisfy differential privacy. In this work we instead address the local model of privacy, formally introduced by Raskhodnikova et al. (2008). The first differentially private algorithm, randomized response (which in fact predates the definition of differential privacy by more than 40 years), guarantees differential privacy in the local model (Warner, 1965). In this model, there is no trusted centralized entity that is responsible for the noise injection. Instead, each individual adds enough noise to guarantee differential privacy for their own data, which provides a stronger privacy guarantee than the curator model. The data analysis is then run over the collection of the individually sanitized data.
The local model of differential privacy is a convenient model for several applications: for example, it is used to collect statistics about the activity of Google Chrome Web browser users (Erlingsson et al., 2014), and to collect statistics about the typing patterns of Apple's iPhone users (Apple Press Info, 2016). Despite these applications, the local model has received far less attention than the centralized curator model. This is in part due to the stricter requirements imposed by this model, which make the design of effective data analyses harder.

Our main contribution is in designing chi-square hypothesis tests for the local model of differential privacy. Similar to previous works, we focus on goodness of fit and independence hypothesis tests. Most of the private chi-square tests proposed so far are based on mechanisms that add noise in some form to the aggregate data, e.g. the cells of the contingency tables, or the resulting chi-square statistic's value. These approaches cannot be used in the local model, since noise needs to be added at the level of each individual's data. We therefore consider general privatizing techniques in the local model, and we study how to build new hypothesis tests with them. Each test we present is characterized by a specific local model mechanism. The main technical challenge in designing each test is to create statistics, which incorporate the local model mechanisms, that converge as we collect more data to a chi-square distribution, as in the classical chi-square tests. We then use these statistics to find the critical value to correctly bound the Type I error.

We present three different goodness of fit tests: LocalNoiseGOF presents a statistic that guarantees convergence to a chi-square distribution under the null hypothesis, so that we can use the correct critical values when local (concentrated) differential privacy is guaranteed by adding Laplace or Gaussian noise to the individual data; LocalGenRRGOF also provides a statistic that converges to a chi-square under the null hypothesis when a private value for each individual is selected by using a generalized form of randomized response, which can also be thought of as an instantiation of the exponential mechanism (McSherry & Talwar, 2007); finally, LocalBitFlipGOF introduces a statistic that converges to a chi-square distribution when the data is privatized using a bit flipping algorithm (Bassily & Smith, 2015), which provides better accuracy for higher dimensional data. Further, we develop corresponding independence tests: LocalNoiseIND (see supplementary file), LocalGenRRIND, and LocalBitFlipIND. For all these tests we study their asymptotic behavior.

A desideratum for private hypothesis tests is to have a guaranteed upper bound on the probability of a false discovery (or Type I error), i.e. rejecting a null hypothesis or model when the data was actually generated from it, and to minimize the probability of a Type II error, which is failing to reject the null hypothesis when the model is indeed false. This latter criterion corresponds to the power of the statistical test. We then present experimental results showing the power of the different tests, which demonstrate that no single local differentially private algorithm is best across all data dimensions and privacy parameter regimes. However, this evaluation also shows a relation between the power of the test and the noncentral parameter of the test statistic that is used.
This suggests that besides looking at the parameters of the test, a data analyst may also need to consider which test statistic results in the largest noncentral parameter.

2. Related Works

There have been several works developing private hypothesis tests for categorical data, but all consider the traditional model of (concentrated) differential privacy instead of the local model, which we consider here. Several works have explored private statistical inference for GWAS data (Uhler et al., 2013; Yu et al., 2014; Johnson & Shmatikov, 2013). Following these works, there has also been general work in private chi-square hypothesis tests, where the main tests are for goodness of fit and independence testing, although some do extend to more general tests (Wang et al., 2015; Gaboardi et al., 2016; Kifer & Rogers, 2017; Cai et al., 2017; Kakizaki et al., 2017). Among these, the works most related to ours are the ones by Gaboardi et al. (2016) and Kifer & Rogers (2017). One of our mechanisms, LocalNoiseGOF, can be seen as an adaptation of their techniques to the local model. However, the other mechanisms we introduce differ substantially and require novel asymptotic analyses. There has also been work in private hypothesis testing for ordinary least squares regression (Sheffet, 2015).

Duchi et al. (2013b;a) focus on controlling disclosure risk in statistical estimation and inference by ensuring the analysis satisfies local differential privacy. In their work, they show that a generalized version of randomized response gives optimal sample complexity for estimating the multinomial probability vector. We use this idea as the basis for our hypothesis test LocalBitFlipGOF. Kairouz et al. (2014) also consider hypothesis testing in the local model, although they measure utility in terms of f-divergences and do not give a decision rule, i.e. when to reject a given null hypothesis. We provide statistics whose distributions asymptotically follow a chi-square distribution, which allows for approximating statistical p-values that can be used in a decision rule. We consider their extremal mechanisms and empirically confirm their result that for small privacy regimes (small ϵ) one mechanism has higher utility than the others, while for large privacy regimes (large ϵ) a different mechanism outperforms the rest. However, we measure utility in terms of the power of a locally private hypothesis test subject to a given Type I error bound. Other notable works in the local privacy model include Pastore & Gastpar (2016); Kairouz et al. (2016); Ye & Barg (2017).

Independent of this work, another paper (Sheffet, 2018) has addressed local private hypothesis testing. Sheffet (2018) considers finite sample complexity by showing that certain test quantities take different values under the null and alternative hypotheses. In this work, we design and analyze asymptotic statistical tests and empirically evaluate the performance of each test for finite samples.

3. Preliminaries

We consider datasets $x = (x_1, \cdots, x_n) \in \mathcal{X}^n$ in some data universe $\mathcal{X}$, typically $\mathcal{X} = \{0,1\}^d$ where $d$ is the dimensionality. We first present the standard definition of differential privacy, as well as its variant, concentrated differential privacy. We say that two datasets $x, x' \in \mathcal{X}^n$ are neighboring if they differ in at most one element, i.e. there exists $i \in [n]$ such that $x_i \neq x'_i$ and $x_j = x'_j$ for all $j \neq i$.

Definition 3.1 (Dwork et al. (2006b;a)).
An algorithm $M : \mathcal{X}^n \to \mathcal{Y}$ is $(\epsilon, \delta)$-differentially private (DP) if for all neighboring datasets $x, x' \in \mathcal{X}^n$ and for all outcomes $S \subseteq \mathcal{Y}$, we have $\Pr[M(x) \in S] \leq e^{\epsilon} \Pr[M(x') \in S] + \delta$.

Definition 3.2 (Bun & Steinke (2016)). An algorithm $M : \mathcal{X}^n \to \mathcal{Y}$ is $\rho$-zero-mean concentrated differentially private (zCDP) if for all neighboring datasets $x, x' \in \mathcal{X}^n$, we have the following bound for all $t > 0$, where the expectation is over outcomes $y \sim M(x)$:
$$\mathbb{E}\left[\exp\left(t\left(\ln\frac{\Pr[M(x)=y]}{\Pr[M(x')=y]} - \rho\right)\right)\right] \leq e^{t^2 \rho}.$$

Note that in both of these privacy definitions, it is assumed that all the data is stored in a central location and the algorithm $M$ can access all the data. Most of the work in differential privacy has been in this trusted curator model. We then define local differential privacy, formalized by Raskhodnikova et al. (2008) and Dwork & Roth (2014), which does not require the subjects to release their raw data; rather, each data entry is perturbed to prevent the true entry from being stored. Thus, local differential privacy ensures one of the strongest privacy guarantees.

Definition 3.3 (LR Oracle). Given a dataset $x$, a local randomizer oracle $LR_x(\cdot, \cdot)$ takes as input an index $i \in [n]$ and an $\epsilon$-DP algorithm $R$, and outputs $y \in \mathcal{Y}$ chosen according to the distribution of $R(x_i)$, i.e. $LR_x(i, R) = R(x_i)$.

Definition 3.4 (Raskhodnikova et al. (2008)). An algorithm $M : \mathcal{X}^n \to \mathcal{Y}$ is $(\epsilon, \delta)$-local differentially private (LDP) if it accesses the input database $x$ via the LR oracle $LR_x$ with the following restriction: if $LR_x(i, R_j)$ for $j \in [k]$ are $M$'s invocations of $LR_x$ on index $i$, then each $R_j$ for $j \in [k]$ is $(\epsilon_j, \delta_j)$-DP and $\sum_{j=1}^k \epsilon_j \leq \epsilon$, $\sum_{j=1}^k \delta_j \leq \delta$.

From this we have that an $(\epsilon, \delta)$-LDP algorithm is also $(\epsilon, \delta)$-DP. Note that these definitions can be extended to include $\rho$-local zCDP (LzCDP), where each local randomizer is $\rho_j$-zCDP and $\sum_{j=1}^k \rho_j \leq \rho$. We point out the following connection between LzCDP and LDP, which follows directly from results in (Bun & Steinke, 2016).

Lemma 3.5. If $M : \mathcal{X}^n \to \mathcal{Y}$ is $(\epsilon, 0)$-LDP then it is also $\epsilon^2/2$-LzCDP. If $M$ is $\rho$-LzCDP, then it is also $\left(\rho + \sqrt{2\rho \ln(2/\delta)}, \delta\right)$-LDP for any $\delta > 0$.

4. Chi-Square Hypothesis Tests

As was studied in (Gaboardi et al., 2016), (Wang et al., 2015), and (Kifer & Rogers, 2017), we will study hypothesis tests with categorical data. A null hypothesis, or model, $H_0$ is how we might expect the data to be generated. The goal for hypothesis testing is to reject the null hypothesis if the data is not likely to have been generated from the given model. As is common in statistical inference, we want to design hypothesis tests to bound the probability of a false discovery (or Type I error), i.e. rejecting a null hypothesis when the data was actually generated from it, by at most some amount $\alpha$, such as 5%. However, designing tests that achieve this alone is easy, because we can just ignore the data and always fail to reject the null hypothesis, i.e. have an inconclusive test. Thus, we additionally want to design our tests so that they can reject $H_0$ if the data was not actually generated from the given model. We then want to minimize the probability of a Type II error, which is failing to reject $H_0$ when the model is false, subject to a given Type I error.

For goodness of fit testing, we assume that each individual's data $X_i$ for $i \in [n]$ is sampled i.i.d. from $\text{Multinomial}(1, p)$ where $p \in \mathbb{R}^d_{>0}$ and $\|p\|_1 = 1$. The classical chi-square hypothesis test (without privacy) forms the histogram $H = (H_1, \cdots, H_d) = \sum_{i=1}^n X_i$ and computes the chi-square statistic
$$T = \sum_{j=1}^d \frac{(H_j - n p^0_j)^2}{n p^0_j}.$$
The reason for using this statistic is that it converges in distribution to $\chi^2_{d-1}$ as more data is collected, i.e. $n \to \infty$, when $H_0 : p = p^0$ holds. Hence, we can ensure the probability of a false discovery to be close to $\alpha$ as long as we only reject $H_0$ when $T > \chi^2_{d-1,1-\alpha}$, where the critical value $\chi^2_{d-1,1-\alpha}$ is defined by $\Pr\left[\chi^2_{d-1} > \chi^2_{d-1,1-\alpha}\right] = \alpha$.

Prior Private Chi-Square Tests in the Curator Model. One approach to private chi-square hypothesis tests is to add noise (Gaussian or Laplace) directly to the histogram to ensure privacy and then use the classical test statistic (Gaboardi et al., 2016; Wang et al., 2015). Note that the resulting asymptotic distribution needs to be modified for such changes to the statistic: it is no longer a chi-square random variable. To introduce the different statistics, we will consider goodness of fit testing after adding noise $Z$, with coordinates drawn from distribution $\mathcal{D}$, to the histogram of counts, $\tilde{H} = H + Z$, which ensures $\rho$-zCDP when $\mathcal{D} = \mathcal{N}(0, 1/\rho)$ and $\epsilon$-DP when $\mathcal{D} = \text{Lap}(2/\epsilon)$. The chi-square statistic then becomes
$$\tilde{T} = \sum_{i=1}^d \frac{(H_i + Z_i - n p^0_i)^2}{n p^0_i}, \quad \text{where each } Z_i \text{ is an i.i.d. draw from } \mathcal{D}. \qquad (1)$$
The previous works then show that this statistic converges in distribution to a linear combination of chi-square variables when $\mathcal{D} = \mathcal{N}(0, 1/\rho)$ and $\rho$ is also decreasing with $n$. Kifer & Rogers (2017) showed that modifying the chi-square statistic to account for the additional noise leads to tests with better empirical power. The projected statistic from Kifer & Rogers (2017) is the following, where we use the projection matrix $\Pi \overset{\text{defn}}{=} I_d - \frac{1}{d}\mathbf{1}\mathbf{1}^\intercal$, the middle matrix $M_\sigma = \Pi\left(\text{Diag}(p^0) + \sigma I_d - p^0 (p^0)^\intercal\right)^{-1}\Pi$, and sampled noise $Z$ with coordinates from $\mathcal{D}$, with $\hat{H} = H + Z$:
$$T^{(n)}_{KR}(\sigma; \mathcal{D}) = n\left(\frac{\hat{H}}{n} - p^0\right)^\intercal M_\sigma \left(\frac{\hat{H}}{n} - p^0\right). \qquad (2)$$
We use $\mathcal{D} = \text{Lap}(2/\epsilon)$ with $\sigma = \frac{8}{n\epsilon^2}$ for an $\epsilon$-DP claim, or $\mathcal{D} = \mathcal{N}(0, 1/\rho)$ with $\sigma = \frac{1}{n\rho}$ for a $\rho$-zCDP claim. When comparing the power of all our tests, we will be considering the alternate $H_1 : p = p^1_n$ where $p^1_n = p^0 + \frac{\Delta}{\sqrt{n}}$ with $\mathbf{1}^\intercal \Delta = 0$.

Theorem 4.1 (Kifer & Rogers (2017)). Under the null hypothesis $H_0 : p = p^0$, the statistic $T^{(n)}_{KR}\left(\frac{1}{n\rho}; \mathcal{N}(0, 1/\rho)\right)$ given in (2) for $\rho > 0$ converges in distribution to $\chi^2_{d-1}$. Further, under the alternate hypothesis $H_1 : p = p^1_n$, the resulting asymptotic distribution is a noncentral chi-square random variable with $d-1$ degrees of freedom and noncentral parameter $\Delta^\intercal \left(\text{Diag}(p^0) - p^0 (p^0)^\intercal + \frac{1}{\rho} I_d\right)^{-1} \Delta$.

When $\mathcal{D} = \text{Lap}(2/\epsilon)$, Gaboardi et al. (2016) showed that we can still obtain the null hypothesis distribution using Monte Carlo simulations to estimate the critical value, since the asymptotic distribution will no longer be chi-square. That is, we can obtain $m$ samples from the statistic under the null hypothesis with Laplace noise added to the histogram of counts. We can then guarantee that the probability of a false discovery is at most $\alpha$ as long as $m > 1/\alpha$.

5. Local Private Goodness of Fit

We now turn to designing local private goodness of fit tests. We first show how the existing statistics from the previous section can be adapted to the local setting, and then develop new tests based on the generalized randomized response mechanism, which returns one of $d > 1$ categories, and bit flipping (Bassily & Smith, 2015). Each test is locally private because it perturbs each individual's data through a local randomizer. However, each of them has a different asymptotic behavior, and so we need different analyses to identify the different critical values. We empirically check the power of each test to see which tests outperform others in different parameter regimes.
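To make the Monte Carlo step concrete, the following is a minimal sketch (in Python with NumPy; the function name, the default $m = 999$, and the seed argument are our own choices, not the paper's) of how one might estimate the critical value for the Laplace-noised statistic in (1) under the null hypothesis.

```python
import numpy as np

def mc_critical_value(n, p0, eps, alpha=0.05, m=999, seed=None):
    """Estimate the rejection threshold for the Laplace-noised chi-square
    statistic in (1) by sampling it m times under H0: p = p0.

    Each sample draws a histogram from Multinomial(n, p0), adds Lap(2/eps)
    noise to every cell, and evaluates the chi-square-style statistic."""
    rng = np.random.default_rng(seed)
    p0 = np.asarray(p0, dtype=float)
    samples = np.empty(m)
    for k in range(m):
        h = rng.multinomial(n, p0)                       # histogram under the null
        z = rng.laplace(scale=2.0 / eps, size=len(p0))   # privacy noise per cell
        samples[k] = np.sum((h + z - n * p0) ** 2 / (n * p0))
    # Take the upper alpha tail of the sampled null distribution as the
    # critical value; reject H0 when the observed statistic exceeds it.
    idx = int(np.ceil((m + 1) * (1 - alpha))) - 1
    return np.sort(samples)[idx]

# e.g. tau = mc_critical_value(n=10_000, p0=[0.25] * 4, eps=1.0)
```

Rejecting $H_0$ whenever the observed statistic exceeds the returned threshold mirrors the decision rule described above; the same routine can be adapted to the local setting by replacing the per-cell noise with a sum of $n$ independent Laplace draws.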
An interesting result of this analysis is that the power of a test is directly related to the size of the noncentral parameter of the chi-square statistic under the alternate distribution.

Testing with Noise Addition. In the local model we can add independent noise $Z_i \sim \mathcal{N}\left(0, \frac{1}{\rho} I_d\right)$ to each individual's data $X_i$ to ensure $\rho$-LzCDP, or independent noise $Z_i$ with coordinates i.i.d. $\text{Lap}\left(\frac{2}{\epsilon}\right)$ to ensure $\epsilon$-LDP. In either case, the resulting noisy histogram $\hat{H} = H + Z$ where $Z = \sum_i Z_i$ will have variance that scales with $n$ for fixed privacy parameters $\epsilon, \rho > 0$. Consider the case where we add Gaussian noise, which results in the histogram $\hat{H} = H + Z$ where $Z \sim \mathcal{N}\left(0, \frac{n}{\rho} I_d\right)$. Thus, we can use either statistic $\tilde{T}(\rho/n)$ or $T^{(n)}_{KR}(\rho/n)$, with the latter statistic typically having better empirical power (Kifer & Rogers, 2017). We then give our first local private hypothesis test in Algorithm 1.

Algorithm 1 Locally Private GOF Test: LocalNoiseGOF
Input: $x = (x_1, \cdots, x_n)$, $\rho$, $\alpha$, $H_0 : p = p^0$.
  Let $H = \sum_{\ell=1}^n x_\ell$.
  if $\mathcal{D} = \mathcal{N}(0, n/\rho)$ then
    Set $q = T^{(n)}_{KR}(n/\rho; \mathcal{D})$ given in (2).
    if $q > \chi^2_{d-1,1-\alpha}$: Decision ← Reject; else: Decision ← Fail to Reject.
  end if
  if $\mathcal{D} = \sum_{i=1}^n \text{Lap}(2/\epsilon)$ then
    Set $q = T^{(n)}_{KR}(8n/\epsilon^2; \mathcal{D})$ given in (2).
    Sample $m > 1/\alpha$ values from the distribution of $T^{(n)}_{KR}(8n/\epsilon^2; \mathcal{D})$ assuming $H_0$.
    Set $\tau$ to be the $(m+1)(1-\alpha)$-th largest sample.
    if $q > \tau$: Decision ← Reject; else: Decision ← Fail to Reject.
  end if
Output: Decision

Theorem 5.1. LocalNoiseGOF is $\rho$-LzCDP when $\mathcal{D} = \mathcal{N}(0, 1/\rho)$ and $\epsilon$-LDP when $\mathcal{D} = \text{Lap}(2/\epsilon)$.

Although we cannot guarantee a Type I error probability of at most $\alpha$, due to the fact that we use the asymptotic distribution (as in the tests from prior work and the classical chi-square tests without privacy), we expect the Type I errors to be similar to those from the nonprivate test. Note that the test can be modified to accommodate arbitrary noise distributions, e.g. Laplace noise to ensure differential privacy. In this case, we can use a Monte Carlo (MC) approach to estimate the critical value $\tau$ that ensures the probability of a Type I error is at most $\alpha$ if we reject $H_0$ when the statistic is larger than $\tau$. For the local setting, if each individual perturbs each coordinate by adding $\text{Lap}(2/\epsilon)$ noise, then this will ensure our test is $\epsilon$-LDP. However, the sum of independent Laplace random variables is not Laplace, so we will need to estimate a sum of $n$ independent Laplace random variables using MC. We can do this by sampling $m$ entries from the exact distribution under $H_0$ to find the critical value. In the experiments section we will use this method to compare the power of the other local private tests with that of the version of LocalNoiseGOF using Laplace noise, which has better power than the version using Gaussian noise.

Testing with Generalized Randomized Response. Rather than having to add noise to each component of the original histogram, we consider applying randomized response to obtain an LDP hypothesis test. We will use a generalized form of randomized response, given in Algorithm 2, which takes a single data entry from the set $\{e_1, \cdots, e_d\}$, where $e_j \in \mathbb{R}^d$ is the standard basis element with a 1 in the $j$th coordinate and zero elsewhere, and reports the original entry with probability slightly more than uniform and otherwise reports a different element with equal probability. Note that $M_{GenRR}$ is $\epsilon$-DP. We have the following result when we use $M_{GenRR}$ on each data entry to obtain a private histogram.

Algorithm 2 Generalized Randomized Response: $M_{GenRR}$
Input: $x \in \{e_1, \cdots, e_d\}$, $\epsilon$.
  Let $q(x, z) = \mathbb{1}\{x = z\}$.
  Select $\check{x}$ with probability $\frac{\exp(\epsilon \cdot q(x, \check{x}))}{e^\epsilon - 1 + d}$.
Output: $\check{x}$
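As a concrete illustration of Algorithm 2, here is a short sketch in Python with NumPy and SciPy (the helper names are ours) that privatizes each report with generalized randomized response and then runs the goodness of fit decision rule against the folded null probabilities derived in Lemma 5.2 and Theorem 5.3 below.

```python
import numpy as np
from scipy import stats

def gen_rr(x_index, d, eps, rng):
    """Algorithm 2: report the true category with probability
    e^eps / (e^eps + d - 1), otherwise one of the other d-1 uniformly."""
    if rng.random() < np.exp(eps) / (np.exp(eps) + d - 1):
        return x_index
    return (x_index + rng.integers(1, d)) % d  # uniform over the remaining categories

def local_genrr_gof(data_indices, p0, eps, alpha=0.05, seed=None):
    """Sketch of LocalGenRRGOF: privatize each entry, form the histogram of
    reports, and compare the chi-square statistic on the folded null
    probabilities to the chi-square critical value."""
    rng = np.random.default_rng(seed)
    p0 = np.asarray(p0, dtype=float)
    d, n = len(p0), len(data_indices)
    reports = [gen_rr(x, d, eps, rng) for x in data_indices]
    h_check = np.bincount(reports, minlength=d)
    p_check0 = (np.exp(eps) * p0 + (1.0 - p0)) / (np.exp(eps) + d - 1)  # folded null
    q = np.sum((h_check - n * p_check0) ** 2 / (n * p_check0))
    crit = stats.chi2.ppf(1 - alpha, df=d - 1)
    return ("Reject" if q > crit else "Fail to Reject"), q
```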
Lemma 5.2. If we have the histogram $H = \sum_{i=1}^n X_i$, where $\{X_i\}$ are i.i.d. $\text{Multinomial}(1, p)$, and we write $\check{H} = \sum_{i=1}^n M_{GenRR}(X_i, \epsilon)$ for each $i \in [n]$, then $\check{H} \sim \text{Multinomial}(n, \check{p})$ where
$$\check{p} = \frac{1}{e^\epsilon + d - 1}\left(e^\epsilon p + (\mathbf{1} - p)\right). \qquad (3)$$

Once we have $\check{H}$, we can create a chi-square statistic by subtracting its expectation from $\check{H}$ and dividing the difference by that expectation. Hence, testing $H_0 : p = p^0$ after the generalized randomized response mechanism is equivalent to testing $H_0 : p = \check{p}^0$ with data $\check{H}$. We can then form a chi-square statistic using the histogram $\check{H}$ which will have the correct asymptotic distribution.

Theorem 5.3. Let $H \sim \text{Multinomial}(n, p)$ and $\check{H}$ be given as in Lemma 5.2 with privacy parameter $\epsilon > 0$. Under the null hypothesis $H_0 : p = p^0$, we have, for $\check{p}^0 = \frac{1}{e^\epsilon + d - 1}\left(e^\epsilon p^0 + (\mathbf{1} - p^0)\right)$,
$$T^{(n)}_{GenRR}(\epsilon) = \sum_{j=1}^d \frac{(\check{H}_j - n\check{p}^0_j)^2}{n\check{p}^0_j} \xrightarrow{D} \chi^2_{d-1}. \qquad (4)$$
Further, with alternate $H_1 : p = p^1_n$, the resulting asymptotic distribution is a noncentral chi-square distribution with $d-1$ degrees of freedom and noncentral parameter $\left(\frac{e^\epsilon - 1}{e^\epsilon + d - 1}\right)^2 \sum_{j=1}^d \frac{\Delta_j^2}{\check{p}^0_j}$.

We then base our LDP goodness of fit test on this result to obtain the correct critical value to reject the null hypothesis based on a chi-square distribution. The test is presented in Algorithm 3.

Algorithm 3 Local DP GOF Test: LocalGenRRGOF
Input: $x = (x_1, \cdots, x_n)$, $\epsilon$, $\alpha$, $H_0 : p = p^0$.
  Let $\check{p}^0 = \frac{1}{e^\epsilon + d - 1}\left(e^\epsilon p^0 + (\mathbf{1} - p^0)\right)$.
  Let $\check{H} = \sum_{i=1}^n M_{GenRR}(x_i, \epsilon)$.
  Set $q = \sum_{j=1}^d \frac{(\check{H}_j - n\check{p}^0_j)^2}{n\check{p}^0_j}$.
  if $q > \chi^2_{d-1,1-\alpha}$: Decision ← Reject; else: Decision ← Fail to Reject.
Output: Decision

The following result is immediate from the generalized randomized response mechanism being $\epsilon$-DP and the fact that we use it as a local randomizer.

Theorem 5.4. LocalGenRRGOF is $\epsilon$-LDP.

Testing with Bit Flipping. Note that the noncentral parameter in Theorem 5.3 goes to zero as $d$ grows large, due to the coefficient $\left(\frac{e^\epsilon - 1}{e^\epsilon + d - 1}\right)^2$. Thus, for large dimensional data the generalized randomized response cannot reject a false null hypothesis. We next consider another differentially private algorithm $M : \{e_1, \cdots, e_d\} \to \{0, 1\}^d$, given in Algorithm 4 and used in (Bassily & Smith, 2015), that flips each bit with some biased probability.[1]

Algorithm 4 Bit Flip Local Randomizer: $M_{bit}$
Input: $x \in \{e_1, \cdots, e_d\}$, $\epsilon$.
  for $j \in [d]$ do
    Set $z_j = x_j$ with probability $\frac{e^{\epsilon/2}}{e^{\epsilon/2}+1}$, otherwise $z_j = 1 - x_j$.
  end for
Output: $z$

Theorem 5.5. The algorithm $M_{bit}$ is $\epsilon$-DP.

We then want to form a statistic based on the output $z \in \{0, 1\}^d$ that is asymptotically distributed as a chi-square under the null hypothesis. We defer the proof to the supplementary material.

Lemma 5.6. Consider $X_i \sim \text{Multinomial}(1, p)$ for each $i \in [n]$. We define the following covariance matrix $\Sigma(p)$ and mean vector $\tilde{p} = \frac{(e^{\epsilon/2}-1)p + \mathbf{1}}{e^{\epsilon/2}+1}$, in terms of $\alpha_\epsilon = \frac{e^{\epsilon/2}-1}{e^{\epsilon/2}+1}$:
$$\Sigma(p) = \alpha_\epsilon^2\left[\text{Diag}(p) - p\, p^\intercal\right] + \frac{e^{\epsilon/2}}{\left(e^{\epsilon/2}+1\right)^2}\, I_d. \qquad (5)$$
The histogram $\tilde{H} = \sum_{i=1}^n M_{bit}(X_i)$ has the following asymptotic distribution: $\sqrt{n}\left(\frac{\tilde{H}}{n} - \tilde{p}\right) \xrightarrow{D} \mathcal{N}(0, \Sigma(p))$. Further, $\Sigma(p)$ is invertible for any $\epsilon > 0$ and $p > 0$.

Following a similar analysis as in (Kifer & Rogers, 2017), we can form the following statistic for the null hypothesis $H_0 : p = p^0$ in terms of the histogram $\tilde{H}$ and projection matrix $\Pi = I_d - \frac{1}{d}\mathbf{1}\mathbf{1}^\intercal$, as well as the covariance $\Sigma = \Sigma(p^0)$ and mean $\tilde{p}^0$, both given in (5) where we replace $p$ with $p^0$:
$$T^{(n)}_{BitFlip}(\epsilon) = n\left(\frac{\tilde{H}}{n} - \tilde{p}^0\right)^\intercal \Pi\, \Sigma^{-1}\, \Pi \left(\frac{\tilde{H}}{n} - \tilde{p}^0\right). \qquad (6)$$

We can then design a hypothesis test based on the outputs from $M_{bit}$ in Algorithm 5.

Theorem 5.7. LocalBitFlipGOF is $\epsilon$-LDP.

[1] Special thanks to Adam Smith for recommending the use of this particular algorithm.
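For concreteness, the sketch below (Python with NumPy; the names are ours) implements the bit-flip randomizer of Algorithm 4 and the mean and covariance from (5); the last function assembles the projected statistic in the form we reconstructed for (6), so it should be read as an illustration under that reading rather than the authors' exact implementation.

```python
import numpy as np

def bit_flip(x_onehot, eps, rng):
    """Algorithm 4: keep each bit with probability e^{eps/2}/(e^{eps/2}+1),
    otherwise flip it."""
    x_onehot = np.asarray(x_onehot)
    keep = rng.random(len(x_onehot)) < np.exp(eps / 2) / (np.exp(eps / 2) + 1)
    return np.where(keep, x_onehot, 1 - x_onehot)

def bit_flip_moments(p, eps):
    """Mean vector p-tilde and covariance Sigma(p) from (5) for the output of
    M_bit applied to a Multinomial(1, p) sample."""
    p = np.asarray(p, dtype=float)
    a = np.exp(eps / 2)
    alpha_eps = (a - 1) / (a + 1)
    p_tilde = ((a - 1) * p + 1.0) / (a + 1)
    sigma = alpha_eps ** 2 * (np.diag(p) - np.outer(p, p)) \
            + a / (a + 1) ** 2 * np.eye(len(p))
    return p_tilde, sigma

def bit_flip_gof_statistic(h_tilde, n, p0, eps):
    """Projected statistic for H0: p = p0 (our reading of (6))."""
    d = len(p0)
    p_tilde0, sigma0 = bit_flip_moments(p0, eps)
    proj = np.eye(d) - np.ones((d, d)) / d          # Pi = I_d - (1/d) 1 1^T
    v = np.asarray(h_tilde, dtype=float) / n - p_tilde0
    return n * v @ proj @ np.linalg.inv(sigma0) @ proj @ v
```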
Figure 1: The coefficient on $\Delta$ in the noncentral parameter for the local private tests, where "LocalGaussGOF" is LocalNoiseGOF with Gaussian noise, "LocalExpGOF" is LocalGenRRGOF, and LocalBitFlipGOF, for various dimensions $d$ and $\epsilon$ (one panel each for $\epsilon = 1, 2, 3, 4$).

Algorithm 5 Local DP GOF Test: LocalBitFlipGOF
Input: $x = (x_1, \cdots, x_n)$, $\epsilon$, $\alpha$, $H_0 : p = p^0$.
  Let $\tilde{H} = \sum_{i=1}^n M_{bit}(x_i, \epsilon)$.
  Set $q = T^{(n)}_{BitFlip}(\epsilon)$.
  if $q > \chi^2_{d-1,1-\alpha}$: Decision ← Reject; else: Decision ← Fail to Reject.
Output: Decision

We now show that the statistic in (6) is asymptotically distributed as $\chi^2_{d-1}$, with the proof in the supplementary file.

Theorem 5.8. If the null hypothesis $H_0 : p = p^0$ holds, then the statistic $T^{(n)}_{BitFlip}(\epsilon)$ is asymptotically distributed as a chi-square, i.e. $T^{(n)}_{BitFlip}(\epsilon) \xrightarrow{D} \chi^2_{d-1}$. Further, if we consider the alternate $H_1 : p = p^1_n$, then $T^{(n)}_{BitFlip}(\epsilon)$ converges in distribution to a noncentral chi-square with $d-1$ degrees of freedom and noncentral parameter $\left(\frac{e^{\epsilon/2}-1}{e^{\epsilon/2}+1}\right)^2 \Delta^\intercal\, \Sigma(p^0)^{-1}\, \Delta$.

Comparison of Noncentral Parameters. We now compare the noncentral parameters of the three local private tests we presented in Algorithms 1, 3 and 5. We consider the null hypothesis $p^0 = (1/d, \cdots, 1/d)$ for $d > 2$, and alternate $H_1 : p = p^0 + \Delta/\sqrt{n}$. In this case, we can easily compare the various noncentral parameters for various privacy parameters and dimensions $d$. In Figure 1 we give the coefficient on the $\Delta$ term in the noncentral parameter of the asymptotic distribution for each local private test presented thus far. The larger this coefficient is, the better the power will be for any alternate vector $\Delta$. Note that in LocalNoiseGOF we set $\rho = \epsilon^2/8$, which makes the variance the same as for a random variable distributed as $\text{Lap}(2/\epsilon)$ for an $\epsilon$-DP guarantee; recall that LocalNoiseGOF with Gaussian noise does not satisfy $\epsilon$-DP for any $\epsilon > 0$. We give results for $\epsilon \in \{1, 2, 3, 4\}$, which are all in the range of privacy parameters that have been considered in actual locally differentially private algorithms used in practice.[2] From the plots, we see how LocalGenRRGOF may outperform LocalBitFlipGOF depending on the privacy parameter and the dimension of the data. We can use these plots to determine which test to use given $\epsilon$ and the dimension of the data $d$. When $H_0$ is not uniform, we can use the noncentral parameters given for each test to find the test with the largest noncentral parameter for a particular privacy budget $\epsilon$.

[2] From (Erlingsson et al., 2014), we know that Google uses $\epsilon = \ln(3)$ in RAPPOR, and from Aleksandra Korolova's Twitter post on Sept. 13, 2016 (https://twitter.com/korolova/status/775801259504734208), Apple uses $\epsilon = 1, 4$.
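To reproduce the kind of comparison summarized in Figure 1, one can tabulate the coefficient on $\|\Delta\|^2$ in each noncentral parameter under the uniform null. The sketch below (Python with NumPy; the function is ours) specializes the expressions in Theorems 4.1, 5.3 and 5.8 to $p^0 = (1/d, \cdots, 1/d)$ with $\Delta$ orthogonal to the all-ones vector, using the simplification that $\text{Diag}(p^0) - p^0 (p^0)^\intercal$ then acts as $(1/d) I$ on the relevant subspace; treat the exact constants as our reading of those theorems rather than the paper's computation.

```python
import numpy as np

def noncentral_coefficients(d, eps):
    """Coefficient multiplying ||Delta||^2 in each test's noncentral parameter
    under the uniform null (our simplification of Theorems 4.1, 5.3, 5.8)."""
    e, a = np.exp(eps), np.exp(eps / 2)
    alpha = (a - 1) / (a + 1)
    return {
        # Gaussian noise with rho = eps^2 / 8, as in the Figure 1 setup
        "LocalNoiseGOF":   1.0 / (1.0 / d + 8.0 / eps ** 2),
        "LocalGenRRGOF":   d * ((e - 1) / (e + d - 1)) ** 2,
        "LocalBitFlipGOF": alpha ** 2 / (alpha ** 2 / d + a / (a + 1) ** 2),
    }

# Pick the test with the largest coefficient for a given regime, e.g.:
# max(noncentral_coefficients(d=40, eps=2).items(), key=lambda kv: kv[1])
```

Choosing the test with the largest coefficient for the $(d, \epsilon)$ regime at hand is exactly the selection rule suggested by Figure 1.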
Empirical Results. We then empirically compare the power among LocalNoiseGOF with Laplace noise (Algorithm 1), LocalGenRRGOF (Algorithm 3), and LocalBitFlipGOF (Algorithm 5). Recall that all three of these tests have the same privacy benchmark of local differential privacy. For LocalNoiseGOF with Laplace noise, we use $m = 999$ samples in our Monte Carlo simulations. In our experiments we fix $\alpha = 0.05$ and $\epsilon \in \{1, 2, 4\}$. We then consider null hypotheses of the form $p^0 = (1/d, 1/d, \cdots, 1/d)$ and alternate $H_1 : p = p^0 + \eta\,(1, -1, \cdots, 1, -1)$ for some $\eta > 0$. In Figure 2, we plot the number of times our tests correctly reject the null hypothesis in 1000 independent trials, for various sample sizes $n$ and privacy parameters $\epsilon$.

From Figure 2, we can see that the test statistics that have the largest noncentral parameter for a particular dimension $d$ and privacy parameter $\epsilon$ have the best empirical power. When $d = 4$, we see that LocalGenRRGOF performs the best. However, for $d = 40$ it is not so clear cut. When $\epsilon = 4$, we can see that LocalGenRRGOF does best, but when $\epsilon = 2$, LocalBitFlipGOF does best. Thus, the best local DP goodness of fit test depends on the noncentral parameter, which is a function of $\epsilon$, the null hypothesis $p^0$, and the alternate $p = p^0 + \Delta$. Note that the worst local DP test also depends on the privacy parameter and the dimension $d$. Based on our empirical results, we see that no single locally private test is best for all data dimensions. However, knowing the corresponding noncentral parameter for a given problem is useful in determining which tests to use. Indeed, the larger the noncentral parameter is, the higher the power will be.
Figure 2: Plots 1 & 2: comparison of empirical power among the classical non-private test and the local private tests LocalNoiseGOF with Laplace noise (solid line), LocalGenRRGOF (dashed line), and LocalBitFlipGOF (dotted line); in Plot 1, $d = 4$ and $\eta = 0.01$; in Plot 2, $d = 40$ and $\eta = 0.005$. Plots 3 & 4: comparison of empirical power between the classical non-private test and the local private tests adding Laplace noise (solid line), which is described in the supplementary file, LocalGenRRIND (dashed line), and LocalBitFlipIND (dotted line), for the contingency table data distribution given in (12), where in Plot 3 $(r, c) = (2, 2)$ and $\eta = 0.01$, and in Plot 4 $(r, c) = (10, 4)$ and $\eta = 0.005$.

6. Local Private Independence Tests

Our techniques can be extended to include composite hypothesis tests, where we test whether the data comes from a whole family of probability distributions. We will focus on independence testing, but much of the theory can be extended to general chi-square tests. We will closely follow the presentation and notation of (Kifer & Rogers, 2017). We consider two sequences of multinomial random variables, $\{U_\ell\}_{\ell=1}^n$ i.i.d. $\text{Multinomial}(1, \pi^{(1)})$ for $\pi^{(1)} \in \mathbb{R}^r$ and $\{V_\ell\}_{\ell=1}^n$ i.i.d. $\text{Multinomial}(1, \pi^{(2)})$ for $\pi^{(2)} \in \mathbb{R}^c$, where no component of $\pi^{(1)}$ or $\pi^{(2)}$ is zero and each sums to 1. Without loss of generality, we will consider an individual to be in one of $r$ groups who reports a data record that is in one of $c$ categories. The collected data consists of $n$ joint outcomes $H$ whose $(i, j)$th coordinate is $H_{i,j} = \sum_{\ell=1}^n \mathbb{1}\{U_{\ell,i} = 1 \text{ and } V_{\ell,j} = 1\}$. Note that $H$ is then the contingency table over the joint outcomes. Under the null hypothesis of independence between $\{U_\ell\}_{\ell=1}^n$ and $\{V_\ell\}_{\ell=1}^n$, for probability vector $p(\pi^{(1)}, \pi^{(2)}) = \pi^{(1)} \otimes \pi^{(2)}$, we have $H \sim \text{Multinomial}\left(n, p(\pi^{(1)}, \pi^{(2)})\right)$.

What makes this test difficult is that the analyst does not know the data distribution $p(\pi^{(1)}, \pi^{(2)})$ and so cannot simply plug it into the chi-square statistic. Rather, we use the data to estimate the best guess for the unknown probability distribution that satisfies the null hypothesis. Note that without privacy, each individual $\ell \in [n]$ reports an $r \times c$ matrix $X_\ell$ which is 1 in exactly one location. Thus we can alternatively write the contingency table as $H = \sum_{\ell=1}^n X_\ell$. We then use the three local private algorithms we presented earlier to see how we can form a private chi-square statistic for independence testing. We want to be able to ensure the privacy of both the group and the category that each individual belongs to. Due to space constraints, we will only cover private independence tests that use the generalized randomized response mechanism from Algorithm 2 and the bit flipping local randomizer from Algorithm 4. We defer our independence test with noise addition in the local setting to the supplementary file.

Testing with Generalized Randomized Response. We want to design an independence test when the data is generated from $M_{GenRR}$ given in Algorithm 2.
In this case our contingency table can be written as $\check{H} \sim \text{Multinomial}\left(n, \check{p}(\pi^{(1)}, \pi^{(2)})\right)$, where $\beta_\epsilon = \frac{1}{e^\epsilon + rc - 1}$ and we use (3) to get
$$\check{p}(\pi^{(1)}, \pi^{(2)}) = \beta_\epsilon\left((e^\epsilon - 1)\,\pi^{(1)} \otimes \pi^{(2)} + \mathbf{1}\right). \qquad (7)$$
We then obtain an estimate for the unknown parameters, in terms of the row sums $\check{H}_{i\cdot} = \sum_j \check{H}_{i,j}$ and column sums $\check{H}_{\cdot j} = \sum_i \check{H}_{i,j}$:
$$\check{\pi}^{(1)}_i = \frac{1}{\beta_\epsilon(e^\epsilon - 1)}\left(\frac{\check{H}_{i\cdot}}{n} - c\,\beta_\epsilon\right) \text{ for } i \in [r], \qquad \check{\pi}^{(2)}_j = \frac{1}{\beta_\epsilon(e^\epsilon - 1)}\left(\frac{\check{H}_{\cdot j}}{n} - r\,\beta_\epsilon\right) \text{ for } j \in [c],$$
and form the statistic
$$\check{T}^{(n)}_{GenRR}(\epsilon) = \sum_{i,j} \frac{\left(\check{H}_{i,j} - n\,\check{p}_{i,j}(\check{\pi}^{(1)}, \check{\pi}^{(2)})\right)^2}{n\,\check{p}_{i,j}(\check{\pi}^{(1)}, \check{\pi}^{(2)})}. \qquad (8)$$
We can then prove the following result, where the full proof is in the supplementary file.

Theorem 6.1. Assuming $U$ and $V$ are independent with true probability vectors $\pi^{(1)}, \pi^{(2)} > 0$ respectively, then as $n \to \infty$ we have $\check{T}^{(n)}_{GenRR}(\epsilon) \xrightarrow{D} \chi^2_{(r-1)(c-1)}$.

We then use this result to design Algorithm 6.

Algorithm 6 Local DP IND Test: LocalGenRRIND
Input: $x = (x_1, \cdots, x_n)$, $\epsilon$, $\alpha$, $H_0 : p = p^0$.
  Let $\check{H} = \sum_{i=1}^n M_{GenRR}(x_i, \epsilon)$.
  Set $q = \check{T}^{(n)}_{GenRR}(\epsilon)$ from (8).
  if $q > \chi^2_{(r-1)(c-1),1-\alpha}$: Decision ← Reject; else: Decision ← Fail to Reject.
Output: Decision

Theorem 6.2. LocalGenRRIND is $\epsilon$-LDP.

Testing with Bit Flipping. Lastly, we design an independence test when the data is reported via $M_{bit}$ in Algorithm 4. Assuming that $H = \sum_{\ell=1}^n X_\ell \sim \text{Multinomial}\left(n, p(\pi^{(1)}, \pi^{(2)})\right)$, we know that replacing $p^0$ with $p(\pi^{(1)}, \pi^{(2)})$ in Section 5 gives us the following asymptotic distribution (treating the contingency table of values as a vector), with covariance matrix $\Sigma(\cdot)$ given in (5):
$$\sqrt{n}\left(\frac{\tilde{H}}{n} - \underbrace{\frac{(e^{\epsilon/2} - 1)\,\pi^{(1)} \otimes \pi^{(2)} + \mathbf{1}}{e^{\epsilon/2} + 1}}_{\tilde{p}(\pi^{(1)}, \pi^{(2)})}\right) \xrightarrow{D} \mathcal{N}\left(0, \Sigma\left(\pi^{(1)} \otimes \pi^{(2)}\right)\right). \qquad (9)$$
Similar to the analysis for Theorem 6.1, we start with a rough estimate for the unknown parameters, which converges in probability to the true parameters; in terms of $\alpha_\epsilon = \frac{e^{\epsilon/2}-1}{e^{\epsilon/2}+1}$ and the row and column sums of $\tilde{H}$, we use
$$\tilde{\pi}^{(1)}_i = \frac{1}{\alpha_\epsilon}\left(\frac{\tilde{H}_{i\cdot}}{n} - \frac{c}{e^{\epsilon/2} + 1}\right) \text{ for } i \in [r], \qquad \tilde{\pi}^{(2)}_j = \frac{1}{\alpha_\epsilon}\left(\frac{\tilde{H}_{\cdot j}}{n} - \frac{r}{e^{\epsilon/2} + 1}\right) \text{ for } j \in [c]. \qquad (10)$$
We then give the resulting statistic, parameterized by the unknown parameters $\theta^{(\ell)}$ for $\ell \in \{1, 2\}$. For the middle matrix $\tilde{M} = \Pi\, \Sigma\left(\tilde{\pi}^{(1)} \otimes \tilde{\pi}^{(2)}\right)^{-1} \Pi$, we have
$$\tilde{T}^{(n)}_{BitFlip}\left(\theta^{(1)}, \theta^{(2)}; \epsilon\right) = \frac{1}{n}\left(\tilde{H} - n\,\tilde{p}(\theta^{(1)}, \theta^{(2)})\right)^\intercal \tilde{M}\, \left(\tilde{H} - n\,\tilde{p}(\theta^{(1)}, \theta^{(2)})\right). \qquad (11)$$
Minimizing $\tilde{T}^{(n)}_{BitFlip}\left(\theta^{(1)}, \theta^{(2)}; \epsilon\right)$ over $(\theta^{(1)}, \theta^{(2)})$ results in a statistic that is distributed as a chi-square random variable; we defer the full proof to the supplementary file.

Theorem 6.3. Under the null hypothesis where $U$ and $V$ are independent with true probability vectors $\pi^{(1)}, \pi^{(2)} > 0$ respectively, we have, as $n \to \infty$, $\min_{\theta^{(1)}, \theta^{(2)}} \tilde{T}^{(n)}_{BitFlip}\left(\theta^{(1)}, \theta^{(2)}; \epsilon\right) \xrightarrow{D} \chi^2_{(r-1)(c-1)}$.

We present the test in Algorithm 7. The following result follows from the same privacy analysis as before.

Theorem 6.4. LocalBitFlipIND is $\epsilon$-LDP.

Algorithm 7 Local DP IND Test: LocalBitFlipIND
Input: $(x_1, \cdots, x_n)$, $\epsilon$, $\alpha$.
  Let $\tilde{H} = \sum_{i=1}^n M_{bit}(x_i, \epsilon)$.
  Set $q = \min_{\pi^{(1)}, \pi^{(2)}} \tilde{T}^{(n)}_{BitFlip}\left(\pi^{(1)}, \pi^{(2)}; \epsilon\right)$ from (11).
  if $q > \chi^2_{(r-1)(c-1),1-\alpha}$: Decision ← Reject; else: Decision ← Fail to Reject.
Output: Decision

Empirical Results. As we did for the goodness of fit tests, we empirically compare the power of our various tests for independence. We consider the null hypothesis that the two sequences of categorical random variables $\{U_\ell\}_{\ell=1}^n$ and $\{V_\ell\}_{\ell=1}^n$ are independent of one another. Under the alternate hypothesis, we generate the contingency data according to a non-product distribution. We fix the distribution $p^1$ for the contingency table to be of the following form, where $\pi^{(1)} \in \mathbb{R}^r$ is the unknown distribution for $\{U_\ell\}_{\ell=1}^n$, $\pi^{(2)} \in \mathbb{R}^c$ is the unknown distribution for $\{V_\ell\}_{\ell=1}^n$, and $r, c$ are even:
$$p^1 = \pi^{(1)} \otimes \pi^{(2)} + \eta\,(1, -1, \cdots, 1, -1) \otimes (1, -1, \cdots, 1, -1). \qquad (12)$$
Note that the hypothesis test does not know the underlying $\pi^{(i)}$ for $i \in \{1, 2\}$, but to generate the data we must fix these distributions.
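As a sketch of how the pieces of this section fit together, the following Python/NumPy code (function names are ours) generates contingency data from the non-product distribution (12), privatizes each report with $M_{GenRR}$ over the $rc$ joint outcomes, and applies the decision rule of Algorithm 6 with the statistic (8); it is an illustration under our reading of the estimators above, not the authors' reference implementation.

```python
import numpy as np
from scipy import stats

def alt_joint_probs(pi1, pi2, eta):
    """Distribution (12): independence plus an alternating-sign perturbation
    (r and c are assumed even, so the perturbation sums to zero)."""
    s1 = np.array([(-1) ** i for i in range(len(pi1))], dtype=float)
    s2 = np.array([(-1) ** j for j in range(len(pi2))], dtype=float)
    return np.outer(pi1, pi2) + eta * np.outer(s1, s2)

def genrr_contingency(n, p_joint, eps, rng):
    """Privatize n draws from p_joint with M_GenRR over the rc joint outcomes
    and return the resulting r x c table H-check."""
    r, c = p_joint.shape
    d = r * c
    cells = rng.choice(d, size=n, p=p_joint.ravel())
    keep = rng.random(n) < np.exp(eps) / (np.exp(eps) + d - 1)
    flipped = (cells + rng.integers(1, d, size=n)) % d   # uniform over other cells
    return np.bincount(np.where(keep, cells, flipped), minlength=d).reshape(r, c)

def local_genrr_ind(h_check, eps, alpha=0.05):
    """Decision rule of Algorithm 6 using the statistic (8)."""
    r, c = h_check.shape
    n = h_check.sum()
    beta = 1.0 / (np.exp(eps) + r * c - 1)
    # Marginal estimates from the folded table, as in Section 6.
    pi1 = (h_check.sum(axis=1) / n - c * beta) / (beta * (np.exp(eps) - 1))
    pi2 = (h_check.sum(axis=0) / n - r * beta) / (beta * (np.exp(eps) - 1))
    p_check = beta * ((np.exp(eps) - 1) * np.outer(pi1, pi2) + 1.0)   # eq. (7)
    q = np.sum((h_check - n * p_check) ** 2 / (n * p_check))
    crit = stats.chi2.ppf(1 - alpha, df=(r - 1) * (c - 1))
    return ("Reject" if q > crit else "Fail to Reject"), q

# Example usage:
# rng = np.random.default_rng(0)
# table = genrr_contingency(20_000, alt_joint_probs([0.5, 0.5], [0.5, 0.5], 0.01), eps=2.0, rng=rng)
# print(local_genrr_ind(table, eps=2.0))
```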
We show power results when the marginal distributions satisfy $\pi^{(1)} = (1/r, \cdots, 1/r)$ and $\pi^{(2)} = (1/c, \cdots, 1/c)$. In Figure 2, we give results for various $n$ and $\epsilon \in \{1, 2, 4\}$.

7. Conclusion

We have designed several hypothesis tests, each depending on different local differentially private algorithms. We showed that each statistic has a noncentral chi-square distribution when the data is drawn from some alternate hypothesis $H_1$. Depending on the form of the alternate probability distribution, the dimension of the data, and the privacy parameter, either LocalGenRRGOF or LocalBitFlipGOF gave the best power. This corroborates the results from Kairouz et al. (2014), who showed that in hypothesis testing, different privacy regimes have different optimal local differentially private mechanisms, although utility in their work was measured in terms of KL divergence. Our results show that the power of the test is directly related to the noncentral parameter of the test statistic that is used. This requires the data analyst to carefully consider alternate hypotheses, as well as the data dimension and privacy parameter for a particular test, and then see which test statistic results in the largest noncentral parameter.

Acknowledgements

Marco Gaboardi has been partially supported by NSF under grant TWC-1565365.

References

American Statistical Association. Discovery with Data: Leveraging Statistics with Computer Science to Transform Science and Society, 2014.
Apple Press Info. Apple previews iOS 10, the biggest iOS release ever, 2016.
Bassily, R. and Smith, A. Local, private, efficient protocols for succinct histograms. In Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing, STOC '15, pp. 127-135, New York, NY, USA, 2015. ACM. ISBN 978-1-4503-3536-2. doi: 10.1145/2746539.2746632.
Bun, M. and Steinke, T. Concentrated differential privacy: Simplifications, extensions, and lower bounds. In Theory of Cryptography - 14th International Conference, TCC 2016-B, Beijing, China, October 31 - November 3, 2016, Proceedings, Part I, pp. 635-658, 2016. doi: 10.1007/978-3-662-53641-4_24.
Cai, B., Daskalakis, C., and Kamath, G. Priv'it: Private and sample efficient identity testing. In International Conference on Machine Learning, 2017.
Duchi, J. C., Jordan, M. I., and Wainwright, M. J. Local privacy and statistical minimax rates. In 54th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2013, 26-29 October 2013, Berkeley, CA, USA, pp. 429-438, 2013a. doi: 10.1109/FOCS.2013.53.
Duchi, J. C., Wainwright, M. J., and Jordan, M. I. Local privacy and minimax bounds: Sharp rates for probability estimation. In Advances in Neural Information Processing Systems 26: 27th Annual Conference on Neural Information Processing Systems 2013, Lake Tahoe, Nevada, United States, pp. 1529-1537, 2013b.
Dwork, C. and Roth, A. The algorithmic foundations of differential privacy. Foundations and Trends in Theoretical Computer Science, 9(3-4):211-407, 2014. ISSN 1551-305X. doi: 10.1561/0400000042.
Dwork, C., Kenthapadi, K., McSherry, F., Mironov, I., and Naor, M. Our data, ourselves: Privacy via distributed noise generation. In Advances in Cryptology - EUROCRYPT 2006, 25th Annual International Conference on the Theory and Applications of Cryptographic Techniques, St. Petersburg, Russia, May 28 - June 1, 2006, Proceedings, pp. 486-503, 2006a. doi: 10.1007/11761679_29.
Dwork, C., McSherry, F., Nissim, K., and Smith, A.
Calibrating noise to sensitivity in private data analysis. In Proceedings of the 3rd Theory of Cryptography Conference, pp. 265-284. Springer, 2006b.
Erlingsson, U., Pihur, V., and Korolova, A. RAPPOR: Randomized aggregatable privacy-preserving ordinal response. In Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security, CCS '14, pp. 1054-1067, New York, NY, USA, 2014. ACM. ISBN 978-1-4503-2957-6. doi: 10.1145/2660267.2660348.
Gaboardi, M., Lim, H. W., Rogers, R., and Vadhan, S. P. Differentially private chi-squared hypothesis testing: Goodness of fit and independence testing. In Proceedings of the 33rd International Conference on Machine Learning, ICML '16, pp. 2111-2120. JMLR.org, 2016.
Hunter, D. R., Goodreau, S. M., and Handcock, M. S. Goodness of fit of social network models. Journal of the American Statistical Association, 103:248-258, 2008.
Johnson, A. and Shmatikov, V. Privacy-preserving data exploration in genome-wide association studies. In Proceedings of the 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD '13, pp. 1079-1087, New York, NY, USA, 2013. ACM.
Kairouz, P., Oh, S., and Viswanath, P. Extremal mechanisms for local differential privacy. In Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N. D., and Weinberger, K. Q. (eds.), Advances in Neural Information Processing Systems 27, pp. 2879-2887. Curran Associates, Inc., 2014.
Kairouz, P., Bonawitz, K., and Ramage, D. Discrete distribution estimation under local privacy. In Proceedings of the 33rd International Conference on Machine Learning, ICML 2016, New York City, NY, USA, June 19-24, 2016, pp. 2436-2444, 2016.
Kakizaki, K., Fukuchi, K., and Sakuma, J. Differentially private chi-squared test by unit circle mechanism. In Precup, D. and Teh, Y. W. (eds.), Proceedings of the 34th International Conference on Machine Learning, volume 70 of Proceedings of Machine Learning Research, pp. 1761-1770, International Convention Centre, Sydney, Australia, 06-11 Aug 2017. PMLR.
Karwa, V. and Slavković, A. Inference using noisy degrees: Differentially private β-model and synthetic graphs. Annals of Statistics, 44(1):87-112, 2016.
Kifer, D. and Rogers, R. A new class of private chi-square hypothesis tests. In Singh, A. and Zhu, J. (eds.), Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, volume 54 of Proceedings of Machine Learning Research, pp. 991-1000, Fort Lauderdale, FL, USA, 20-22 Apr 2017. PMLR.
McSherry, F. and Talwar, K. Mechanism design via differential privacy. In Annual IEEE Symposium on Foundations of Computer Science (FOCS), Providence, RI, October 2007. IEEE.
Pastore, A. and Gastpar, M. Locally differentially-private distribution estimation. In 2016 IEEE International Symposium on Information Theory (ISIT), pp. 2694-2698, July 2016. doi: 10.1109/ISIT.2016.7541788.
Raskhodnikova, S., Smith, A., Lee, H. K., Nissim, K., and Kasiviswanathan, S. P. What can we learn privately? In 2008 IEEE 49th Annual Symposium on Foundations of Computer Science (FOCS), pp. 531-540, 2008. ISSN 0272-5428. doi: 10.1109/FOCS.2008.27.
Sheffet, O. Differentially private least squares: Estimation, confidence and rejecting the null hypothesis. arXiv preprint arXiv:1507.02482, 2015.
Sheffet, O. Locally private hypothesis testing. CoRR, abs/1802.03441, 2018. URL http://arxiv.org/abs/1802.03441.
Steele, J. E., Sundsøy, P. R., Pezzulo, C., Alegana, V. A., Bird, T. J., Blumenstock, J., Bjelland, J., Engø-Monsen, K., de Montjoye, Y.-A., Iqbal, A. M., Hadiuzzaman, K. N., Lu, X., Wetter, E., Tatem, A. J., and Bengtsson, L. Mapping poverty using mobile phone and satellite data. Journal of The Royal Society Interface, 14(127), 2017. ISSN 1742-5689. doi: 10.1098/rsif.2016.0690.
Uhler, C., Slavkovic, A., and Fienberg, S. E. Privacy-preserving data sharing for genome-wide association studies. Journal of Privacy and Confidentiality, 5(1), 2013.
Wang, Y., Lee, J., and Kifer, D. Differentially private hypothesis testing, revisited. arXiv preprint arXiv:1511.03376, 2015.
Warner, S. L. Randomized response: A survey technique for eliminating evasive answer bias. Journal of the American Statistical Association, 60:63-69, 1965.
Ye, M. and Barg, A. Optimal schemes for discrete distribution estimation under local differential privacy. In 2017 IEEE International Symposium on Information Theory (ISIT), pp. 759-763, June 2017. doi: 10.1109/ISIT.2017.8006630.
Yu, F., Fienberg, S. E., Slavkovic, A. B., and Uhler, C. Scalable privacy-preserving data sharing methodology for genome-wide association studies. Journal of Biomedical Informatics, 50:133-141, 2014.