# Differentially Private Change-Point Detection

Rachel Cummings (Georgia Institute of Technology, rachelc@gatech.edu) · Sara Krehbiel (University of Richmond, krehbiel@richmond.edu) · Yajun Mei (Georgia Institute of Technology, ymei@gatech.edu) · Rui Tuo (Texas A&M University, ruituo@tamu.edu) · Wanrong Zhang (Georgia Institute of Technology, wanrongz@gatech.edu)

Primary author. Authors are listed in alphabetical order.

Abstract. The change-point detection problem seeks to identify distributional changes at an unknown change-point $k^*$ in a stream of data. This problem appears in many important practical settings involving personal data, including biosurveillance, fault detection, finance, signal detection, and security systems. The field of differential privacy offers data analysis tools that provide powerful worst-case privacy guarantees. We study the statistical problem of change-point detection through the lens of differential privacy. We give private algorithms for both online and offline change-point detection, analyze these algorithms theoretically, and provide empirical validation of our results.

32nd Conference on Neural Information Processing Systems (NeurIPS 2018), Montréal, Canada.

1 Introduction

The change-point detection problem seeks to identify distributional changes at an unknown change-point $k^*$ in a stream of data. The estimated change-point should be consistent with the hypothesis that the data are initially drawn from a pre-change distribution $P_0$ but from a post-change distribution $P_1$ starting at the change-point. This problem appears in many important practical settings, including biosurveillance, fault detection, finance, signal detection, and security systems. For example, the CDC may wish to detect a disease outbreak based on real-time data about hospital visits, or smart home IoT devices may want to detect changes in activity within the home. In both of these applications, the data contain sensitive personal information.

The field of differential privacy offers data analysis tools that provide powerful worst-case privacy guarantees. Informally, an algorithm that is $\varepsilon$-differentially private ensures that any particular output of the algorithm is at most an $e^{\varepsilon}$ factor more likely when a single data entry is changed. In the past decade, the theoretical computer science community has developed a wide variety of differentially private algorithms for many statistical tasks. The private algorithms most relevant to this work are based on the simple output perturbation principle: to produce an $\varepsilon$-differentially private estimate of some statistic on the database, we add to the exact statistic noise proportional to $\Delta/\varepsilon$, where $\Delta$ denotes the sensitivity of the statistic, i.e., how much it can be influenced by a single data entry.

We study the statistical problem of change-point detection through the lens of differential privacy. We give private algorithms for both online and offline change-point detection, analyze these algorithms theoretically, and then provide empirical validation of these results.

1.1 Related work

The change-point detection problem originally arose from industrial quality control, and it has since been applied in a wide variety of other contexts including climatology [LR02], econometrics [BP03], and DNA analysis [ZS12]. The problem is studied both in the offline setting, in which the algorithm has access to the full dataset $X = \{x_1, \ldots, x_n\}$ up front, and in the online setting, in which data points arrive one at a time, $X = \{x_1, \ldots\}$.
Change-point detection is a canonical problem in statistics that has been studied for nearly a century; selected results include [She31, Pag54, Shi63, Rob66, Lor71, Pol85, Pol87, Mou86, Lai95, Lai01, Kul01, Mei06, Mei08, Mei10, Cha17]. Our approach is inspired by the commonly used Cumulative Sum (CUSUM) procedure [Pag54]. It follows the generalized log-likelihood ratio principle, calculating the partial log-likelihood ratio $\ell(k)$ for each $k \in [n]$ and declaring that a change occurs if and only if $\ell(\hat{k}) \ge T$ for the MLE $\hat{k} = \operatorname{argmax}_k \ell(k)$ and an appropriate threshold $T > 0$. The existing change-point literature works primarily in the asymptotic setting in which $k^*_n/n \to r$ for some $r \in (0, 1)$ as $n \to \infty$ (see, e.g., [Hin70, Car88]). In contrast, we consider finite databases and provide the first accuracy guarantees for the MLE from a finite sample ($n < \infty$).

In offering the first algorithms for private change-point detection, we primarily use two powerful tools from the differential privacy literature. REPORTMAX [DR14] calculates noisy approximations of a stream of queries on the database and reports which query produced the largest noisy value. We instantiate this with partial log-likelihood queries to produce a private approximation of the change-point MLE in the offline setting. ABOVETHRESH [DNR+09] calculates noisy approximations of a stream of queries on the database iteratively and aborts as soon as a noisy approximation exceeds a specified threshold. We extend our offline results to the harder online setting, in which a bound on $k^*$ is not known a priori, by using ABOVETHRESH to identify a window of fixed size $n$ in which a change is likely to have occurred, so that we can call our offline algorithm at that point to estimate the true change-point.

1.2 Our results

We use existing tools from differential privacy to solve the change-point detection problem in both offline and online settings, neither of which has been studied in the private setting before.

Private offline change-point detection. We develop an offline private change-point detection algorithm OFFLINEPCPD (Algorithm 1) that is accurate under one of two assumptions about the distributions from which data are drawn. As is standard in the privacy literature, we give accuracy guarantees that bound the additive error of our estimate of the true change-point with high probability. Our accuracy theorem statements (Theorems 2 and 4) also provide guarantees for the non-private estimator for comparison. Since traditional statistics typically focuses on the asymptotic consistency and unbiasedness of the estimator, ours are the first finite-sample accuracy guarantees for the standard (non-private) MLE. As expected, MLE accuracy decreases with the sensitivity of the measured quantity but increases as the pre- and post-change distributions grow apart. Interestingly, it is constant with respect to the size of the database. In providing MLE bounds alongside accuracy guarantees for our private algorithms, we are able to quantify the cost of privacy as roughly $D_{KL}(P_0\|P_1)/\varepsilon$.

We are able to prove $\varepsilon$-differential privacy under the first distributional assumption, which is that the measured quantity has bounded sensitivity $\Delta(\ell)$, by instantiating the general-purpose REPORTMAX algorithm from the privacy literature with our log-likelihood queries (Theorem 1). Importantly, and in contrast to our accuracy results, the distributional assumption need only apply to the hypothesized distributions from which data are drawn; privacy holds for arbitrary input databases.
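To make the two privacy primitives above concrete, the following is a minimal sketch of their textbook forms for queries of sensitivity $\Delta$, written in Python with NumPy. The function names are ours, the noise scales follow the standard presentations of these mechanisms [DR14], and the sketch is illustrative rather than the exact subroutines instantiated later in the paper.

```python
import numpy as np

rng = np.random.default_rng()

def report_noisy_max(query_values, sensitivity, epsilon):
    """REPORTMAX: add Lap(sensitivity/epsilon) noise to each query value and
    report the index of the largest noisy value."""
    noise = rng.laplace(scale=sensitivity / epsilon, size=len(query_values))
    return int(np.argmax(np.asarray(query_values) + noise))

def above_thresh(query_stream, threshold, sensitivity, epsilon):
    """ABOVETHRESH: compare noisy query values against a noisy threshold and
    halt at the first query that (noisily) exceeds it; None if none does."""
    noisy_threshold = threshold + rng.laplace(scale=2 * sensitivity / epsilon)
    for j, q in enumerate(query_stream):
        if q + rng.laplace(scale=4 * sensitivity / epsilon) > noisy_threshold:
            return j
    return None
```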
We offer a limited privacy guarantee for our second distributional assumption, ensuring that if an individual data point is drawn from one of the two hypothesized distributions, redrawing that point from either of the distributions will not be detected, regardless of the composition of the rest of the database (Theorem 3).

Private online change-point detection. In ONLINEPCPD (Algorithm 2), we extend our offline results to the online setting by using the ABOVETHRESH framework to first identify a window in which the change is likely to have happened and then call the offline algorithm to identify a more precise approximation of when it occurred. Standard $\varepsilon$-differential privacy under our first distributional assumption follows from composition of the underlying privacy mechanisms (Theorem 5).² Accuracy of our online mechanism relies on an appropriate selection of the threshold that identifies a window in which a change-point has likely occurred, at which point the error guarantees are inherited from the offline algorithm (Theorem 6).

Empirical validation. Finally, we run several Monte Carlo experiments to validate our theoretical results for both the online and offline settings. We consider data drawn from Bernoulli and Gaussian distributions, which satisfy our first and second distributional assumptions, respectively. Our offline experiments are summarized in Figure 1, which shows that change-point detection is easier when $P_0$ and $P_1$ are further apart and harder when the privacy requirement is stronger ($\varepsilon$ is smaller). Additionally, these experiments enhance our theoretical results, finding that OFFLINEPCPD performs well even when we relax the assumptions required for our theoretical accuracy bounds by running our algorithm on imperfect hypotheses $P_0$ and $P_1$ that are closer together than the true distributions from which data are drawn. Figure 2 shows that ONLINEPCPD also performs well, consistent with our theoretical guarantees.

2 Preliminaries

Our work considers the statistical problem of change-point detection through the lens of differential privacy. Section 2.1 defines the change-point detection problem, and Section 2.2 describes the differentially private tools that will be brought to bear.

2.1 Change-point background

Let $X = \{x_1, \ldots, x_n\}$ be $n$ real-valued data points. The change-point detection problem is parametrized by two distributions, $P_0$ and $P_1$. The data points in $X$ are hypothesized to initially be sampled i.i.d. from $P_0$, but at some unknown change time $k^* \in [n]$, an event may occur (e.g., an epidemic disease outbreak) and change the underlying distribution to $P_1$. The goal of a data analyst is to announce that a change has occurred as quickly as possible after $k^*$. Since the $x_i$ may be sensitive information, such as individuals' medical information or behaviors inside their homes, the analyst will wish to announce the change-point time in a privacy-preserving manner.

In the standard non-private offline change-point literature, the analyst wants to test the null hypothesis $H_0: k^* = \infty$, where $x_1, \ldots, x_n \overset{iid}{\sim} P_0$, against the composite alternative hypothesis $H_1: k^* \in [n]$, where $x_1, \ldots, x_{k^*-1} \overset{iid}{\sim} P_0$ and $x_{k^*}, \ldots, x_n \overset{iid}{\sim} P_1$. The log-likelihood ratio of $k^* = \infty$ against $k^* = k$ is given by

$$\ell(k, X) = \sum_{i=k}^{n} \log \frac{P_1(x_i)}{P_0(x_i)}. \qquad (1)$$

The maximum likelihood estimator (MLE) of the change time $k^*$ is given by

$$\hat{k}(X) = \operatorname{argmax}_{k \in [n]} \ell(k, X). \qquad (2)$$

When $X$ is clear from context, we will simply write $\ell(k)$ and $\hat{k}$.
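Since each $\ell(k, X)$ in (1) is a suffix sum of the pointwise log-likelihood ratios, all $n$ values, and hence the MLE (2), can be computed with a single reversed cumulative sum. Below is a minimal NumPy sketch; the Bernoulli hypotheses and the function names are illustrative choices of ours.

```python
import numpy as np

def partial_llrs(x, log_ratio):
    """ell[k] = sum_{i >= k} log(P1(x_i)/P0(x_i)) for every k, as in (1)."""
    r = log_ratio(x)                  # pointwise log-likelihood ratios
    return np.cumsum(r[::-1])[::-1]   # all suffix sums in one O(n) pass

# Illustrative hypotheses: Bernoulli(p0) pre-change, Bernoulli(p1) post-change.
p0, p1 = 0.2, 0.8
log_ratio = lambda x: x * np.log(p1 / p0) + (1 - x) * np.log((1 - p1) / (1 - p0))

x = np.array([0, 0, 1, 0, 1, 1, 1])
ell = partial_llrs(x, log_ratio)
k_hat = int(np.argmax(ell))           # the (non-private) MLE (2), 0-indexed
```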
An important quantity in our accuracy analysis will be the Kullback-Leibler distance between probability distributions $P_0$ and $P_1$, defined as

$$D_{KL}(P_1\|P_0) = \int_{-\infty}^{\infty} P_1(x) \log \frac{P_1(x)}{P_0(x)} \, dx = \mathbb{E}_{x \sim P_1}\left[\log \frac{P_1(x)}{P_0(x)}\right].$$

We always use log to refer to the natural logarithm, and when necessary, we interpret $\log 0 = -\infty$. We will measure the additive error of our estimates of the true change point as follows.

Definition 1 (($\alpha, \beta$)-accuracy). A change-point detection algorithm that produces a change-point estimator $\tilde{k}(X)$, where a distribution change occurred at time $k^*$, is $(\alpha, \beta)$-accurate if $\Pr[|\tilde{k} - k^*| < \alpha] \ge 1 - \beta$, where the probability is taken over the randomness of the algorithm and the sampling of $X$.

²We note that we can relax our distributional assumption and get a weaker privacy guarantee, as in the offline setting, if desired.

2.2 Differential privacy background

Differential privacy bounds the maximum amount that a single data entry can affect analysis performed on the database. Two databases $X, X'$ are neighboring if they differ in at most one entry.

Definition 2 (Differential Privacy [DMNS06]). An algorithm $M: \mathbb{R}^n \to \mathcal{R}$ is $(\varepsilon, \delta)$-differentially private if for every pair of neighboring databases $X, X' \in \mathbb{R}^n$, and for every subset of possible outputs $S \subseteq \mathcal{R}$,
$$\Pr[M(X) \in S] \le \exp(\varepsilon) \Pr[M(X') \in S] + \delta.$$
If $\delta = 0$, we say that $M$ is $\varepsilon$-differentially private.

One common technique for achieving differential privacy is adding Laplace noise. The Laplace distribution with scale $b$ is the distribution with probability density function
$$\mathrm{Lap}(x|b) = \frac{1}{2b} \exp\left(-\frac{|x|}{b}\right).$$
We will write $\mathrm{Lap}(b)$ to denote the Laplace distribution with scale $b$, or (with a slight abuse of notation) a random variable sampled from $\mathrm{Lap}(b)$. The sensitivity of a function or query $f$ is defined as $\Delta(f) = \max_{\text{neighboring } X, X'} |f(X) - f(X')|$. The Laplace Mechanism of [DMNS06] takes in a function $f$, a database $X$, and a privacy parameter $\varepsilon$, and outputs $f(X) + \mathrm{Lap}(\Delta(f)/\varepsilon)$.

Our algorithms rely on two existing differentially private algorithms, REPORTMAX [DR14] and ABOVETHRESH [DNR+09], which are overviewed in Appendix A. Appendix B covers the concentration inequalities used in the proofs of our bounds.

3 Offline private change-point detection

In this section, we investigate the differentially private change-point detection problem in the setting where $n$ data points $X = \{x_1, \ldots, x_n\}$ are known to the algorithm in advance. Given two hypothesized distributions $P_0$ and $P_1$, our algorithm OFFLINEPCPD privately approximates the MLE $\hat{k}$ of the change time $k^*$. We provide accuracy bounds for both the MLE and the output of our algorithm under two different assumptions about the distributions from which the data are drawn, summarized in Table 1.

Table 1: Summary of non-private and private offline accuracy guarantees under $H_1$. The quantities $\Delta(\ell)$, $A_\delta$, $C$, and $C_M$ are defined in (4), (5), (8), and (9), respectively.

| Assumption | MLE | OFFLINEPCPD |
|---|---|---|
| $A := \Delta(\ell) < \infty$ | $O\!\left(\frac{A^2}{C^2}\log\frac{1}{\beta}\right)$ | $O\!\left(\max\left\{\frac{A^2}{C^2},\ \frac{A}{\varepsilon C}\right\}\log\frac{1}{\beta}\right)$ |
| $A := A_\delta < \infty$ | $O\!\left(\frac{A^2}{C_M^2}\log\frac{1}{\beta}\right)$ | $O\!\left(\max\left\{\frac{A^2}{C_M^2},\ \frac{A}{\varepsilon C_M}\right\}\log\frac{1}{\beta}\right)$ |

The first assumption essentially requires that $P_1(x)/P_0(x)$ can be neither arbitrarily large nor arbitrarily small for any $x$. We note that this assumption is not satisfied by several important families of distributions, including Gaussians. The second assumption, motivated by the $\delta > 0$ relaxation of differential privacy, instead requires that the set of $x$ for which this log ratio exceeds some bound $A_\delta$ has probability mass at most $\delta$.
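For intuition about the quantities in Table 1, the sketch below computes the uniform bound $A = \Delta(\ell)$ and the KL distance $C$ for a pair of Bernoulli hypotheses, which satisfy the first assumption; the parameter values and helper names are illustrative choices of ours.

```python
import numpy as np

def bernoulli_kl(p, q):
    """D_KL(Bernoulli(p) || Bernoulli(q))."""
    return p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))

# Illustrative Bernoulli hypotheses; the log-likelihood ratio takes only two
# values, so the uniform bound of the first assumption is finite:
p0, p1 = 0.2, 0.8
log_ratios = np.array([np.log((1 - p1) / (1 - p0)),   # log ratio at x = 0
                       np.log(p1 / p0)])              # log ratio at x = 1
A = log_ratios.max() - log_ratios.min()               # Delta(ell), cf. (4)
C = min(bernoulli_kl(p0, p1), bernoulli_kl(p1, p0))   # KL distance, cf. (8)
```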
Although the accuracy of OFFLINEPCPD only holds under the change-point model's alternative hypothesis $H_1$, it is $\varepsilon$-differentially private for any hypothesized distributions $P_0, P_1$ with finite $\Delta(\ell)$ and privacy parameters $\varepsilon > 0$, $\delta = 0$, regardless of the distributions from which $X$ is drawn. We offer a similar but somewhat weaker privacy guarantee when $\Delta(\ell)$ is infinite but $A_\delta$ is finite, which roughly states that a data point sampled from either $P_0$ or $P_1$ can be replaced with a fresh sample from either $P_0$ or $P_1$ without detection.

3.1 Offline algorithm

Our proposed offline algorithm OFFLINEPCPD applies the report noisy max algorithm [DR14] to the change-point problem by adding noise to the partial log-likelihood ratios $\ell(k)$ used to estimate the change-point MLE $\hat{k}$. The algorithm chooses a Laplace noise parameter $A/\varepsilon$ depending on the input hypothesized distributions $P_0, P_1$ and privacy parameters $\varepsilon, \delta$, and then outputs

$$\tilde{k} = \operatorname{argmax}_{k \in [n]} \{\ell(k) + Z_k\}. \qquad (3)$$

Our algorithm can be easily modified to additionally output an approximation of $\ell(\tilde{k})$ and incur $2\varepsilon$ privacy cost by composition.

Algorithm 1 Offline private change-point detector: OFFLINEPCPD($X, P_0, P_1, \varepsilon, \delta, n$)
Input: database $X$, distributions $P_0, P_1$, privacy parameters $\varepsilon, \delta$, database size $n$
  if $\delta = 0$ then
    Set $A = \max_x \log\frac{P_1(x)}{P_0(x)} - \min_{x'} \log\frac{P_1(x')}{P_0(x')}$  # set $A = \Delta(\ell)$ as in (4)
  else
    Set $A = \min\{t : \max_{i=0,1} \Pr_{x \sim P_i}[2|\log\frac{P_1(x)}{P_0(x)}| > t] < \delta/2\}$  # set $A = A_\delta$ as in (5)
  end if
  for $k = 1, \ldots, n$ do
    Compute $\ell(k) = \sum_{i=k}^{n} \log\frac{P_1(x_i)}{P_0(x_i)}$
    Sample $Z_k \sim \mathrm{Lap}(A/\varepsilon)$
  end for
  Output $\tilde{k} = \operatorname{argmax}_{k \in [n]} \{\ell(k) + Z_k\}$  # report noisy argmax

In the change-point or statistical process control (SPC) literature, when the pre- and post-change distributions are unknown in practical settings, researchers often choose hypotheses $P_0, P_1$ with the smallest justifiable distance. While it is easier to detect and accurately estimate a larger change, larger changes are often associated with a higher-sensitivity MLE, requiring more noise (and therefore additional error) to preserve privacy. We propose that practitioners using our private change-point detection algorithm choose input hypotheses accordingly. This practical setting is considered in our numerical studies, presented in Section 5.

In the case that $\delta = 0$, we sample Laplace noise directly proportional to the sensitivity of the partial log-likelihood ratios we compute:

$$\Delta(\ell) = \max_{k \in [n],\ \text{neighboring } X, X' \in \mathbb{R}^n} |\ell(k, X) - \ell(k, X')| = \max_{x \in \mathbb{R}} \log\frac{P_1(x)}{P_0(x)} - \min_{x' \in \mathbb{R}} \log\frac{P_1(x')}{P_0(x')}. \qquad (4)$$

The algorithm should not be invoked with $\delta = 0$ unless $\Delta(\ell)$ is finite. In the case that $\ell$ has infinite sensitivity, we instead allow the user to select a privacy parameter $\delta > 0$ and identify a value $A_\delta$ such that most values of $x \sim P_0, P_1$ have bounded log-likelihood ratio:

$$A_\delta = \min\left\{t : \max_{i=0,1} \Pr_{x \sim P_i}\left[2\left|\log\frac{P_1(x)}{P_0(x)}\right| > t\right] < \frac{\delta}{2}\right\}. \qquad (5)$$

As a concrete canonical example, $\Delta(\ell)$ is unbounded for two Gaussian distributions, but $A_\delta$ is bounded for Gaussians with different means, as follows:

Example 1. For $P_0 = N(0, 1)$, $P_1 = N(\mu, 1)$, and $\delta > 0$, we have $A_\delta = 2\mu[\Phi^{-1}(1 - \delta/2) + \mu/2]$, where $\Phi$ is the cumulative distribution function (CDF) of the standard normal distribution.

3.2 Theoretical properties under the uniform bound assumption

In this subsection, we prove privacy and accuracy of OFFLINEPCPD when $\delta = 0$ and $P_0, P_1$ are such that $\Delta(\ell)$ is finite. Note that if $\Delta(\ell)$ is infinite, then the algorithm will simply add noise with infinite scale and will still be differentially private.

Theorem 1. For arbitrary data $X$, OFFLINEPCPD($X, P_0, P_1, \varepsilon, 0, n$) is $(\varepsilon, 0)$-differentially private.

The proof follows by instantiation of REPORTMAX [DR14] with the queries $\ell(k)$ for $k \in [n]$, which have sensitivity $A = \Delta(\ell)$; it is included in Appendix C for completeness.
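A minimal NumPy sketch of Algorithm 1 for the $\delta = 0$ case follows. It reuses the suffix-sum computation of $\ell(k)$ from Section 2.1; the Bernoulli hypotheses, parameter values, and function names are illustrative choices of ours.

```python
import numpy as np

rng = np.random.default_rng()

def offline_pcpd(x, log_ratio, A, epsilon):
    """Sketch of Algorithm 1 with delta = 0: report noisy argmax over the
    partial log-likelihood ratios ell(k), using Laplace noise of scale
    A/epsilon, where A is the uniform sensitivity bound Delta(ell) of (4)."""
    r = log_ratio(x)
    ell = np.cumsum(r[::-1])[::-1]                       # ell(k) for every k, as in (1)
    noisy = ell + rng.laplace(scale=A / epsilon, size=len(ell))
    return int(np.argmax(noisy))                         # private estimate, as in (3)

# Illustrative use with Bernoulli hypotheses, where Delta(ell) is finite:
p0, p1 = 0.2, 0.8
log_ratio = lambda x: x * np.log(p1 / p0) + (1 - x) * np.log((1 - p1) / (1 - p0))
A = np.log(p1 / p0) - np.log((1 - p1) / (1 - p0))        # Delta(ell); max at x=1, min at x=0
x = np.concatenate([rng.binomial(1, p0, 100), rng.binomial(1, p1, 100)])
k_tilde = offline_pcpd(x, log_ratio, A, epsilon=0.5)     # 0-indexed; true k* is 100
```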
Next we provide accuracy guarantees for the standard (non-private) MLE $\hat{k}$ and for the output $\tilde{k}$ of our private algorithm OFFLINEPCPD when the data are drawn from $P_0, P_1$ with true change point $k^* \in (1, n)$. By providing both bounds, Theorem 2 quantifies the cost of requiring privacy in change-point detection.

Our result for the standard (non-private) MLE is the first finite-sample accuracy guarantee for this estimator. Such non-asymptotic properties have not been previously studied in traditional statistics, which typically focuses on consistency and unbiasedness of the estimator, with less attention to the convergence rate. We show that the additive error of the MLE is constant with respect to the sample size, which means that the convergence rate is $O_P(1)$. That is, it converges in probability to the true change-point $k^*$ in constant time. Note that accuracy depends on two measures, $A$ and $C$, of the distance between the distributions $P_0$ and $P_1$. Accuracy of both the MLE $\hat{k}$ and the OFFLINEPCPD output $\tilde{k}$ is best for distributions for which $A = \Delta(\ell)$ is small relative to the KL-divergence, which is consistent with the intuition that larger changes are easier to detect, but output sensitivity degrades the robustness of the estimator and requires more noise for privacy, harming accuracy.

A technical challenge that arises in proving accuracy of the private estimator is that the $x_i$ are not identically distributed when the true change-point $k^* \in (1, n]$, and so the partial log-likelihood ratios $\ell(k)$ are dependent across $k$. Hence we need to investigate the impact of adding i.i.d. noise draws to a sequence of $\ell(k)$ that may be neither independent nor identically distributed. Fortunately, the differences $\ell(k) - \ell(k+1) = \log\frac{P_1(x_k)}{P_0(x_k)}$ are piecewise i.i.d. This property is key in our proof. Moreover, we show that we can divide the possible outputs of the algorithm into regions of doubling size with exponentially decreasing probability of being selected by the algorithm, resulting in accuracy bounds that are independent of the number of data points $n$.

Theorem 2. For hypotheses $P_0, P_1$ such that $\Delta(\ell) < \infty$ and $n$ data points $X$ drawn from $P_0, P_1$ with true change time $k^* \in (1, n]$, the MLE $\hat{k}$ is $(\alpha, \beta)$-accurate for any $\beta > 0$ and

$$\alpha = O\!\left(\frac{A^2}{C^2}\log\frac{1}{\beta}\right). \qquad (6)$$

For hypotheses and data drawn this way with privacy parameter $\varepsilon > 0$, OFFLINEPCPD($X, P_0, P_1, \varepsilon, 0, n$) is $(\alpha, \beta)$-accurate for any $\beta > 0$ and

$$\alpha = O\!\left(\max\left\{\frac{A^2}{C^2},\ \frac{A}{\varepsilon C}\right\}\log\frac{1}{\beta}\right). \qquad (7)$$

In both expressions, $A = \Delta(\ell)$ and $C = \min\{D_{KL}(P_1\|P_0), D_{KL}(P_0\|P_1)\}$.

3.3 Relaxing uniform bound assumptions

In this subsection, we prove accuracy and a limited notion of privacy for OFFLINEPCPD when $\delta > 0$ and $P_0, P_1$ are such that $A_\delta$ is finite. Since we are no longer able to uniformly bound $\log P_1(x)/P_0(x)$, these accuracy results include worse constants than those in Section 3.2, but the relaxed assumption about $P_0, P_1$ makes the results applicable to a wider range of distributions, including Gaussian distributions (see Example 1). Note of course that for some pairs of very different distributions, such as distributions with non-overlapping supports, the assumption that $A_\delta < \infty$ may still fail. A true change point $k^*$ can always be detected with perfect accuracy given $x_{k^*-1}$ and $x_{k^*}$, so we should not expect to be able to offer any meaningful privacy guarantee for such distributions. By similar rationale, relaxing the uniform bound assumption means that we may have a single data point $x_j$ that dramatically increases $\ell(k)$ for $k \le j$, so we cannot add noise proportional to $\Delta(\ell)$, and privacy no longer follows from that of REPORTMAX.
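In this relaxed regime, Algorithm 1 instead calibrates its noise scale to $A_\delta$. As a concrete check of Example 1, the sketch below compares the closed form for $A_\delta$ against a direct Monte Carlo evaluation of definition (5); the parameter values, sample size, seed, and slack factor are illustrative choices of ours, and SciPy is assumed for $\Phi^{-1}$.

```python
import numpy as np
from scipy.stats import norm

mu, delta = 1.0, 0.05                        # P0 = N(0,1), P1 = N(mu,1)

# Closed form from Example 1: A_delta = 2*mu*(Phi^{-1}(1 - delta/2) + mu/2).
A_closed = 2 * mu * (norm.ppf(1 - delta / 2) + mu / 2)

# Definition (5), estimated by Monte Carlo: the tail mass of 2|log P1(x)/P0(x)|
# must fall below delta/2 under both P0 and P1; here log P1(x)/P0(x) = mu*x - mu^2/2.
rng = np.random.default_rng(0)
def tail_prob(t, samples=200_000):
    tails = []
    for loc in (0.0, mu):
        x = rng.normal(loc, 1.0, samples)
        llr = mu * x - mu**2 / 2
        tails.append(np.mean(2 * np.abs(llr) > t))
    return max(tails)

# The closed form is tight up to the lighter of the two tails, so a small slack suffices:
assert tail_prob(1.05 * A_closed) < delta / 2
```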
We therefore offer a weaker notion of privacy in Theorem 3 below. As with the usual definition of differential privacy, we guarantee that the output of our algorithm is similarly distributed on neighboring databases, only our notion of neighboring databases depends on the hypothesized distributions. Specifically, a single entry in $X$ drawn from either $P_0$ or $P_1$ may be replaced without detection by another entry drawn from either $P_0$ or $P_1$, even if the rest of the database is arbitrary. The proof is given in Appendix C.

Theorem 3. For any $\varepsilon, \delta > 0$, any hypotheses $P_0, P_1$ such that $A_\delta < \infty$, any index $j \in [n]$, any $i, i' \in \{0, 1\}$, and any $x_1, \ldots, x_{j-1}, x_{j+1}, \ldots, x_n$, let $X_i = \{x_1, \ldots, x_n\}$ denote the random variable with $x_j \sim P_i$, and let $X'_{i'} = \{x_1, \ldots, x_{j-1}, x'_j, x_{j+1}, \ldots, x_n\}$ denote the random variable with $x'_j \sim P_{i'}$. Then for any $S \subseteq [n]$, we have
$$\Pr[\mathrm{OFFLINEPCPD}(X_i, P_0, P_1, \varepsilon, \delta, n) \in S] \le \exp(\varepsilon) \Pr[\mathrm{OFFLINEPCPD}(X'_{i'}, P_0, P_1, \varepsilon, \delta, n) \in S] + \delta,$$
where the probabilities are over the randomness of the algorithm and of $X_i, X'_{i'}$.

Allowing $\Delta(\ell)$ to be infinite precludes our use of Hoeffding's inequality as in Theorem 2. The main idea in that proof, however, can be salvaged by decomposing the change into a change from $P_0$ to the average distribution $(P_0 + P_1)/2$ and then from the average distribution to $P_1$. Correspondingly, we will use $C_M$, an alternate distance measure between $P_0$ and $P_1$, defined below next to $C$ from the previous section for comparison:

$$C = \min\{D_{KL}(P_0\|P_1),\ D_{KL}(P_1\|P_0)\} \qquad (8)$$

$$C_M = \min\left\{D_{KL}\!\left(P_0 \,\Big\|\, \frac{P_0+P_1}{2}\right),\ D_{KL}\!\left(P_1 \,\Big\|\, \frac{P_0+P_1}{2}\right)\right\} = \min_{i=0,1}\ \mathbb{E}_{x \sim P_i}\left[\log\frac{2P_i(x)}{P_0(x)+P_1(x)}\right] \qquad (9)$$

Because $2P_i(x)/(P_0(x) + P_1(x)) \le 2$, we have $0 \le D_{KL}(P_i\|(P_0+P_1)/2) \le \log 2$, and thus the constant $C_M$ in (9) is well-defined. The proof of the following theorem is given in Appendix C.

Theorem 4. For $\delta > 0$, hypotheses $P_0, P_1$ such that $A_\delta < \infty$, and $n$ data points $X$ drawn from $P_0, P_1$ with true change time $k^* \in (1, n)$, the MLE $\hat{k}$ is $(\alpha, \beta)$-accurate for any $\beta > 0$ and
$$\alpha = O\!\left(\frac{A^2}{C_M^2}\log\frac{1}{\beta}\right).$$
For hypotheses and data drawn this way with privacy parameter $\varepsilon > 0$, OFFLINEPCPD($X, P_0, P_1, \varepsilon, \delta, n$) is $(\alpha, \beta)$-accurate for any $\beta > 0$ and
$$\alpha = O\!\left(\max\left\{\frac{A^2}{C_M^2},\ \frac{A}{\varepsilon C_M}\right\}\log\frac{1}{\beta}\right).$$
In both expressions, $A = A_\delta$ and $C_M = \min\left\{D_{KL}\!\left(P_0\|\frac{P_0+P_1}{2}\right), D_{KL}\!\left(P_1\|\frac{P_0+P_1}{2}\right)\right\}$.

4 Online private change-point detection

In this section, we give a new differentially private algorithm for change-point detection in the online setting, ONLINEPCPD. In this setting, the algorithm initially receives $n$ data points $x_1, \ldots, x_n$ and then continues to receive data points one at a time. As before, the goal is to privately identify an approximation of the time $k^*$ when the data change from distribution $P_0$ to $P_1$. Additionally, we want to identify this change shortly after it occurs.

Our offline algorithm is not directly applicable because we do not know a priori how many points must arrive before a true change point occurs. To resolve this, ONLINEPCPD works like ABOVETHRESH, determining after each new data entry arrives whether it is likely that a change occurred in the most recent $n$ entries. When ONLINEPCPD detects a sufficiently large (noisy) partial log-likelihood ratio $\ell(k) = \sum_{i=k}^{j} \log\frac{P_1(x_i)}{P_0(x_i)}$, it calls OFFLINEPCPD to privately determine the most likely change point in the window $\{x_{j-n+1}, \ldots, x_j\}$. Privacy of ONLINEPCPD is immediate from composition of ABOVETHRESH and OFFLINEPCPD, each with privacy loss $\varepsilon/2$. As before, accuracy requires $X$ to be drawn from $P_0, P_1$ with some true change point $k^*$. This algorithm also requires a suitable choice of $T$ to guarantee that OFFLINEPCPD is called for a window of data that actually contains $k^*$.
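Before turning to the formal guarantees, here is a minimal sketch of this online loop. It reuses the offline_pcpd function and log_ratio convention from the Section 3.1 sketch, its noise scales mirror Algorithm 2 below, and the function name is ours.

```python
import numpy as np

rng = np.random.default_rng()

def online_pcpd(stream, log_ratio, A, epsilon, n, T):
    """Sketch of Algorithm 2: scan for a window whose maximum partial
    log-likelihood ratio (noisily) exceeds a noisy threshold, then run the
    offline detector on that window with the remaining epsilon/2 budget."""
    T_hat = T + rng.laplace(scale=4 * A / epsilon)       # noisy threshold
    window = []
    for j, xj in enumerate(stream, start=1):             # j counts arrivals
        window.append(xj)
        if len(window) < n:
            continue
        window = window[-n:]                             # most recent n points
        r = log_ratio(np.array(window))
        ell_max = np.max(np.cumsum(r[::-1])[::-1])       # max_k ell(k) over the window
        if ell_max + rng.laplace(scale=8 * A / epsilon) > T_hat:
            k_in_window = offline_pcpd(np.array(window), log_ratio, A, epsilon / 2)
            return (j - n) + k_in_window                 # shift back to stream position
    return None                                          # no change detected yet
```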
Specifically, $T$ should be large enough that the algorithm is unlikely to call OFFLINEPCPD when $j < k^*$, but small enough that it is likely to call OFFLINEPCPD by time $j = k^* + n/2$. When both of these conditions hold, we inherit the accuracy of OFFLINEPCPD, with an extra $\log n$ factor arising from the fact that the data are no longer distributed exactly as in the change-point model after conditioning on calling OFFLINEPCPD in a correct window. With our final bounds, we note that $n \gtrsim \frac{A}{C}\log(k^*/\beta)$ suffices for the existence of a suitable threshold, and an analyst must have a reasonable approximation of $k^*$ in order to choose such a threshold. Otherwise, the accuracy bound itself has no dependence on the change-point $k^*$.

Algorithm 2 Online private change-point detector: ONLINEPCPD($X, P_0, P_1, \varepsilon, n, T$)
Input: database $X$, distributions $P_0, P_1$, privacy parameter $\varepsilon$, starting size $n$, threshold $T$
  Let $A = \max_x \log\frac{P_1(x)}{P_0(x)} - \min_{x'} \log\frac{P_1(x')}{P_0(x')}$
  Let $\hat{T} = T + \mathrm{Lap}(4A/\varepsilon)$
  for each new data point $x_j$, $j \ge n$ do
    Compute $\ell_j = \max_{j-n+1 \le k \le j} \ell(k)$
    Sample $Z_j \sim \mathrm{Lap}(8A/\varepsilon)$
    if $\ell_j + Z_j > \hat{T}$ then
      Output OFFLINEPCPD($\{x_{j-n+1}, \ldots, x_j\}, P_0, P_1, \varepsilon/2, 0, n$) + $(j - n)$
      Halt
    else
      Output $\perp$
    end if
  end for

Theorem 5. For arbitrary data $X$, ONLINEPCPD($X, P_0, P_1, \varepsilon, n, T$) is $(\varepsilon, 0)$-differentially private.

This privacy guarantee follows from simple composition of ABOVETHRESH and OFFLINEPCPD, each with privacy loss $\varepsilon/2$. The proof of the accuracy bound is given in Appendix D.

Theorem 6. For hypotheses $P_0, P_1$ such that $\Delta(\ell) < \infty$, a stream of data points $X$ with starting size $n$ drawn from $P_0, P_1$ with true change time $k^* \ge n/2$, privacy parameter $\varepsilon > 0$, and threshold $T \in [T_L, T_U]$, where $T_L$ and $T_U$ are given explicitly in Appendix D and scale with $A\sqrt{n \log(8/\beta)}$ and $\frac{16A}{\varepsilon}\log\frac{8}{\beta}$, we have that ONLINEPCPD($X, P_0, P_1, \varepsilon, n, T$) is $(\alpha, \beta)$-accurate for any $\beta > 0$ and
$$\alpha = O\!\left(\max\left\{\frac{A^2}{C^2},\ \frac{A}{\varepsilon C}\right\}\log\frac{n}{\beta}\right).$$
In the above expressions, $A = \Delta(\ell)$ and $C = \min\{D_{KL}(P_0\|P_1), D_{KL}(P_1\|P_0)\}$.

5 Numerical studies

We now report the results of Monte Carlo experiments designed to validate the theoretical results of the previous sections. We only evaluate our accuracy guarantees, because differential privacy is a worst-case guarantee over all hypothetical databases and is therefore impractical and redundant to test empirically. Our simulations consider both offline and online settings for two canonical problems: detecting a change in the mean of Bernoulli and of Gaussian distributions.

We begin with the offline setting to verify the performance of our OFFLINEPCPD algorithm. We use $n = 200$ observations where the true change occurs at time $k^* = 100$. This process is repeated $10^4$ times. For both the Bernoulli and Gaussian models, we consider the following three different change scenarios, corresponding to the size of the change and the parameter selection for OFFLINEPCPD; a simulation sketch follows the list. For each of these cases, we consider privacy parameters $\varepsilon = 0.1, 0.5, 1, \infty$, where $\varepsilon = \infty$ corresponds to the non-private problem, which serves as our baseline. The results are summarized in Figure 1, which plots the empirical probabilities $\beta = \Pr[|\tilde{k} - k^*| > \alpha]$ as a function of $\alpha$.

(A) Large change. Bernoulli model: detecting a change from $p_0 = 0.2$ to $p_1 = 0.8$. Gaussian model: detecting a change from $\mu_0 = 0$ to $\mu_1 = 1$.

(B) Small change. Bernoulli model: detecting a change from $p_0 = 0.2$ to $p_1 = 0.4$. Gaussian model: detecting a change from $\mu_0 = 0$ to $\mu_1 = 0.5$.

(C) Misspecified change. Bernoulli model: the algorithm tests for a change from $p_0 = 0.2$ to $p_1 = 0.4$ when the true distributions have $p_0 = 0.2$ and $p_1 = 0.8$. Gaussian model: the algorithm tests for a change from $\mu_0 = 0$ to $\mu_1 = 0.5$ when the true distributions have $\mu_0 = 0$ and $\mu_1 = 1$.
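The sketch below reproduces the shape of these offline experiments for the Bernoulli model in scenario (A); it assumes the offline_pcpd function from the Section 3.1 sketch, the trial count and parameters mirror the setup above, and run_offline_trials is an illustrative name of ours.

```python
import numpy as np

rng = np.random.default_rng()

def run_offline_trials(p0, p1, epsilon, n=200, k_star=100, trials=10_000):
    """Estimate beta(alpha) = Pr[|k_tilde - k*| > alpha] by Monte Carlo,
    using offline_pcpd from the Section 3.1 sketch."""
    log_ratio = lambda x: x * np.log(p1 / p0) + (1 - x) * np.log((1 - p1) / (1 - p0))
    A = abs(np.log(p1 / p0)) + abs(np.log((1 - p1) / (1 - p0)))   # Delta(ell)
    errors = np.empty(trials)
    for t in range(trials):
        x = np.concatenate([rng.binomial(1, p0, k_star),
                            rng.binomial(1, p1, n - k_star)])     # change at k*
        errors[t] = abs(offline_pcpd(x, log_ratio, A, epsilon) - k_star)
    alphas = np.arange(0, 101)
    return alphas, np.array([(errors > a).mean() for a in alphas])

# Scenario (A), Bernoulli, at several privacy levels:
for eps in (0.1, 0.5, 1.0):
    alphas, beta = run_offline_trials(p0=0.2, p1=0.8, epsilon=eps)
```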
Figure 1 highlights three positive results for our algorithm when data are drawn from Bernoulli or Gaussian distributions: accuracy is best when the true change in the data is large (plots a and d) compared to small (plots b and e); accuracy deteriorates as $\varepsilon$ decreases for stronger privacy; and the algorithm performs well even when the true change is larger than that hypothesized (plots c and f). This figure emphasizes that our algorithm performs well even for quite strong privacy guarantees ($\varepsilon < 1$). The misspecified change experiments bolster our theoretical results substantially, indicating that our hypotheses can be quite far from the distributions of the true data and our algorithms will still identify a change-point accurately. We also run Monte Carlo simulations of our online change-point detection algorithm ONLINEPCPD. These are displayed in Figure 2 and discussed in Appendix E.

[Figure 1: six panels, (a) Bernoulli, large change; (b) Bernoulli, small change; (c) Bernoulli, misspecified change; (d) Gaussian, large change; (e) Gaussian, small change; (f) Gaussian, misspecified change. Each panel plots $\beta$ against $\alpha$ for $\varepsilon = 0.1, 0.5, 1$ and the non-private MLE ($\varepsilon = \infty$).]

Figure 1: Accuracy for large change, small change, and misspecified change Monte Carlo simulations with Bernoulli and Gaussian data. Each simulation involves $10^4$ runs of OFFLINEPCPD with varying $\varepsilon$ on data generated by 200 i.i.d. samples from appropriate distributions with change point $k^* = 100$.

Acknowledgments

R.C. and S.K. were supported in part by a Mozilla Research Grant. Y.M. and W.Z. were supported in part by NSF grant CMMI-1362876. R.T. was supported in part by NSF grant DMS-156443. R.T.'s contribution was completed while the author was visiting the Georgia Institute of Technology.

References

[BP03] J. Bai and P. Perron. Computation and analysis of multiple structural change models. Journal of Applied Econometrics, 18(1):1–22, 2003.
[Car88] E. Carlstein. Nonparametric change-point estimation. Ann. Statist., 16:188–197, 1988.
[Cha17] H. P. Chan. Optimal sequential detection in multi-stream data. Ann. Statist., 45(6):2736–2763, 2017.
[DMNS06] Cynthia Dwork, Frank McSherry, Kobbi Nissim, and Adam Smith. Calibrating noise to sensitivity in private data analysis. In Proceedings of the 3rd Conference on Theory of Cryptography, TCC '06, pages 265–284, 2006.
[DNR+09] Cynthia Dwork, Moni Naor, Omer Reingold, Guy N. Rothblum, and Salil P. Vadhan. On the complexity of differentially private data release: efficient algorithms and hardness results. In STOC '09, pages 381–390, 2009.
[DR14] Cynthia Dwork and Aaron Roth. The algorithmic foundations of differential privacy. Foundations and Trends in Theoretical Computer Science, 9(3–4):211–407, 2014.
[Hin70] D. V. Hinkley. Inference about the change-point in a sequence of random variables. Biometrika, 57:1–17, 1970.
[Kul01] M. Kulldorff. Prospective time periodic geographical disease surveillance using a scan statistic. J. Roy. Statist. Soc. Ser. A, 164(1):61–72, 2001.
[Lai95] T. L. Lai. Sequential changepoint detection in quality control and dynamical systems (with discussions). J. Roy. Statist. Soc. Ser. B, 57(4):613–658, 1995.
[Lai01] T. L. Lai.
Sequential analysis: some classical problems and new challenges (with discussions). Statistica Sinica, 11(2):303–408, 2001.
[Lor71] G. Lorden. Procedures for reacting to a change in distribution. Ann. Math. Statist., 42(6):1897–1908, 1971.
[LR02] R. Lund and J. Reeves. Detection of undocumented changepoints: A revision of the two-phase regression model. Journal of Climate, 15(17):2547–2554, 2002.
[Mei06] Y. Mei. Sequential change-point detection when unknown parameters are present in the pre-change distribution. Ann. Statist., 34(1):92–122, 2006.
[Mei08] Y. Mei. Is average run length to false alarm always an informative criterion? (with discussions). Sequential Analysis, 27(4):354–419, 2008.
[Mei10] Y. Mei. Efficient scalable schemes for monitoring a large number of data streams. Biometrika, 97(2):419–433, 2010.
[Mou86] G. V. Moustakides. Optimal stopping times for detecting changes in distributions. Ann. Statist., 14(4):1379–1387, 1986.
[Pag54] E. S. Page. Continuous inspection schemes. Biometrika, 41(1/2):100–115, 1954.
[Pol85] M. Pollak. Optimal detection of a change in distribution. Ann. Statist., 13(1):206–227, 1985.
[Pol87] M. Pollak. Average run lengths of an optimal method of detecting a change in distribution. Ann. Statist., 15(2):749–779, 1987.
[Rob66] S. W. Roberts. A comparison of some control chart procedures. Technometrics, 8(3):411–430, 1966.
[She31] W. A. Shewhart. Economic Control of Quality of Manufactured Product. D. Van Nostrand, New York, 1931. Reprinted by ASQC Quality Press, Wisconsin, 1980.
[Shi63] A. N. Shiryaev. On optimum methods in quickest detection problems. Theory of Probability & Its Applications, 8(1):22–46, 1963.
[VDVW96] Aad W. van der Vaart and Jon A. Wellner. Weak Convergence and Empirical Processes. Springer, 1996.
[ZS12] N. Zhang and D. O. Siegmund. Model selection for high-dimensional, multi-sequence change-point problems. Statistica Sinica, 22:1507–1538, 2012.