# private_lossless_multiple_release__f1a161c2.pdf

Private Lossless Multiple Release

Joel Daniel Andersson 1 2 Lukas Retschmeier 1 2 Boel Nelson 2 Rasmus Pagh 1 2

Koufogiannis et al. (2016) showed a gradual release result for Laplace noise-based differentially private mechanisms: given an ε-DP release, a new release with privacy parameter ε > ε can be computed such that the combined privacy loss of both releases is at most ε

and the distribution of the latter is the same as a single release with parameter ε . They also showed gradual release techniques for Gaussian noise, later also explored by Whitehouse et al. (2022). In this paper, we consider a more general multiple release setting in which analysts hold private releases with different privacy parameters corresponding to different access/trust levels. These releases are determined one by one, with privacy parameters in arbitrary order. A multiple release is lossless if having access to a subset S of the releases has the same privacy guarantee as the least private release in S, and each release has the same distribution as a single release with the same privacy parameter. Our main result is that lossless multiple release is possible for a large class of additive noise mechanisms. For the Gaussian mechanism we give a simple method for lossless multiple release with a short, self-contained analysis that does not require knowledge of the mathematics of Brownian motion. We also present lossless multiple release for the Laplace and Poisson mechanisms. Finally, we consider how to efficiently do gradual release of sparse histograms, and present a mechanism with running time independent of the number of dimensions.

1. Introduction

Differential privacy (Dwork et al., 2006) is a statistical notion that provides provable privacy guarantees. Differentially private (DP) algorithms typically introduce inaccuracy through noise to achieve privacy, and the resulting privacy-

1Basic Algorithms Research Copenhagen (BARC), Denmark 2University of Copenhagen, Denmark. Correspondence to: Joel Daniel Andersson <jda@di.ku.dk>, Lukas Retschmeier <lure@di.ku.dk>, Boel Nelson <bn@di.ku.dk>, Rasmus Pagh <pagh@di.ku.dk>.

Proceedings of the 42 nd International Conference on Machine Learning, Vancouver, Canada. PMLR 267, 2025. Copyright 2025 by the author(s).

accuracy trade-off is the key object of study in the area. Of specific interest is the privacy budget that determines how much information the output of a differentially private algorithm may reveal about its inputs a smaller budget means more private but less accurate results.

Motivation. In deployments of differential privacy, it may be hard to determine the appropriate privacy budget to grant an analyst since it depends on trust and accuracy assumptions that may change over time. A system may also have different security clearance levels a data analyst might have a higher clearance level than a developer, but both might require access to some statistics. Similarly, a company that wants to release, for example, user statistics could make an accurate release for their own data analysts, a less accurate release for external consultants, and include an even less accurate release in a report for shareholders or other external actors. A different example setting is users that want to sell their data on data markets: users could let the accuracy of the release depend on how much they are paid, and use different budgets for different releases.

Another usage scenario, relevant in distributed or federated settings, is local differential privacy where the data owner could be sharing a differentially private function of their data with multiple servers, and the set of servers might change over time. Here the privacy budget might depend on the server: for example, a patient may trust their local hospital more than their national hospitals, but might not trust the hospitals not to collude by sharing data among themselves.

Multiple releases. These scenarios motivate creating multiple releases with different privacy budgets aimed at different analysts. However, these releases should be coordinated such that a group of analysts who combine their information do not gain more knowledge about the input than the most knowledgeable member of the group. This kind of collusion resilience was first studied by Xiao et al. (2009) with a non DP privacy objective we refer to their work for additional motivation.

A related aspect is that we may want to provide an analyst with a less private, more accurate release after the trust we place in them increases. In this case, we want the accuracy of the latest release to match the accuracy that can be obtained, given the combined privacy budget of both releases. That is, no additional cost should be incurred for

Private Lossless Multiple Release

making two releases rather than one. For example, an external consultant that later gets employed directly by the company should be able to get access to the more accurate release without constituting a privacy violation or requiring an increased privacy budget to reach the same accuracy.

Baseline. It is possible to create multiple releases for any differentially private mechanism if we are willing to increase the privacy budget by a constant factor. In particular, we can create a sequence of independent releases with geometrically increasing privacy parameters, referred to as ε-doubling by Ligett et al. (2017), and provide each analyst with the most accurate release they are entitled to. Using composition results, the combined privacy budget for the information given to a set of analysts is a constant factor from the highest budget of a single analyst in the group. In this paper, we study how to make multiple releases in a lossless way without accuracy or privacy penalty.

1.1. Related Work

Koufogiannis, Han, and Pappas (2016) introduced the concept of (lossless) gradual release, which makes it possible to increase the privacy budget with no loss in accuracy. They also considered privacy tightening for the Laplace mechanism, where successive releases are increasingly private. In the journal version of the paper, they also introduce gradual release for the Gaussian mechanism under approximate DP which is based on the machinery of Brownian motion.

This technique was later applied in a noise reduction framework by Ligett, Neel, Roth, Waggoner, and Wu (2017) with the goal of producing mechanisms with ex-post privacy, i.e., where the privacy budget is set based on what is required to achieve a desired accuracy level. Follow-up work by Whitehouse, Ramdas, Wu, and Rogers (2022) gave results for noise reduction under approximate DP using Brownian motion. These works were motivated by work on privacy filters and odometers (Rogers et al., 2016), keeping track of privacy budgets over time, rather than multiple release settings. More recently, Pan (2024) demonstrated lossless gradual release for randomized response.

Girgis et al. (2020) studied privacy-utility-randomness tradeoffs for a multi-analyst setting under local differential privacy. Here each user has a sample Xi D and makes a single public release of Xi. Each analyst then utilizes the public release (together with shared randomness among the users) to estimate D at a more accurate level than what the public release alone would allow. Later Zhang and He (2023) also used lossless gradual release techniques for the Gaussian mechanism in designing a database system supporting analysts with different privilege levels.

Similar research questions have been investigated outside of the DP literature. Xiao, Tao, and Chen (2009) studied

ρ ρk ρk+1 ρ ρ1

Yρ1 Yρk Yρk+1 Yρ Yρ

most private least private

Figure 1. The idea behind lossless multiple release. For concreteness we consider additive noise mechanism and zero-concentrated differential privacy. Each Yρi denotes a noisy estimate, and to release a new estimate with ρ > 0, we can combine the adjacent estimates for ρk and ρk+1 together with some fresh noise to obtain a new release Yρ that is exactly ρ-z CDP. Note that these estimates do not need to be strictly increasing (or decreasing) but can be released in any order. Furthermore, releasing any subset of these estimates is exactly max(S)-z CDP, where S is the set of privacy parameters.

releasing a sensitive dataset where each element is kept with probability p, and otherwise sampled uniformly from the universe. Li, Chen, Li, and Zhang (2012) considered privatizing data by additive Gaussian noise of scale σ. Both works dealt with arbitrary sequences of parameters (probabilities p or noise scalings σ), and demonstrated how to correlate releases to guarantee that (1) each release matches the single-release case and, (2) limiting the sensitive information derived from combining releases. Our method for adaptively producing Gaussian releases deviates from (Li et al., 2012), in that ours does not need to maintain any covariance matrix for past releases. This makes our approach more timeand space-efficient.

Finally, we mention the classic work of Equitz and Cover (1991) who studied successive refinement of information where a source is encoded once and then refined in later stages without decreasing the rate compared to a one-shot optimal code. They showed that this is possible precisely when the sequence of reconstructions can be written as a Markov chain. While their work did not focus on privacy, successive refinement is conceptually related to lossless gradual release. Indeed, parts of the mathematical machinery was independently re-discovered by subsequent work on lossless gradual release.

1.2. Basic Technique

We illustrate our framework in the setting of additive Gaussian noise and ρ-zero-concentrated differential privacy (ρ-z CDP)1. We want to release a private estimate of a real-valued query f : X R with ℓ2sensitivity 2 = 1 using the Gaussian mechanism. Now consider the simple case where one wants to privately release an estimate with two privacy levels ρ < ρ . Then releasing Yρ = f(x) + N(0, 1

2ρ) together with Yρ ρ = f(x) + N(0, 1 2(ρ ρ)) is ρ -z CDP by compo-

1see definitions in Appendix D

Private Lossless Multiple Release

sition. Now observe that we can combine these estimates to produce a better estimate using inverse variance weighting: Y = ρ ρ+(ρ ρ)Yρ + ρ ρ ρ+(ρ ρ)Yρ ρ yields exactly the same utility as a single release under ρ -z CDP. That is, instead of using a privacy budget of ρ + ρ for independently releasing both estimates, the overall budget spent is just the maximum of both, which is ρ .

To do multiple lossless releases for more general additive noise mechanisms, it turns out that one can always combine existing releases with fresh noise as illustrated in Figure 1. In fact, to make any new release with privacy parameter ρ, it suffices to have saved the two adjacent releases whose privacy parameters are closest to ρ. Our approach applies more generally to a class of (independent) additive noise mechanisms that support gradual release, but the releases can be made in any order, and we do not require the set of releases to be known in advance.

Our contributions. We introduce a framework for lossless multiple release, generalizing past work on gradual release. In addition to allowing for making multiple private releases and having the privacy loss only scale with the least private release, we impose no specific ordering on the privacy parameters used. Furthermore, we formalize a general theorem (Section 4.2) showing that lossless multiple release is possible for a large class of mechanisms based on adding i.i.d. noise to each coordinate whose distribution satisfies a convolution preorder, as defined next.

Definition 1.1 (Convolution preorder). A family of realvalued distributions D(ρ) parameterized by ρ R+ is said to satisfy a convolution preorder, if given D(ρ1) and D(ρ2) for ρ1 < ρ2, there exists a distribution C(ρ2, ρ1) such that D(ρ2) C(ρ2, ρ1) = D(ρ1).

This relation is a natural way of stating that the distributions become more noisy as the parameter ρ gets smaller. We are unaware of any existing term in the literature but the definition is closely related to convolution order (Shaked & Shanthikumar, 2007). Examples of noise distributions satisfying Definition 1.1 include the Laplace, Poisson, and the Gaussian mechanisms, parameterized by a decreasing function of their variance.

Theorem 1.2 (Meta theorem, informal). Let Af,ρ : X Rd be a mechanism that adds independent, identically distributed noise to coordinates of a function f : X Rd. If the noise distribution of Af,ρ satisfies convolution preorder, then there exists an algorithm enabling lossless multiple release. Also, any invertible post-processing of Af,ρ preserves this property.

For the Gaussian mechanism in particular, we show how to get lossless multiple release by only using basic properties of Gaussians, and we show similar results for the Laplace and Poisson mechanisms from first principles. We give concrete

instantiations of Gaussian sparse histograms (Section 5.2) and factorization mechanisms (Section 5.1).

2. Threat Model and Goals

Our setting is a multi-user system with a set of analysts/users S, all able to query the same dataset using a differentially private mechanism. User s has their own security clearance level with corresponding privacy budget ρs. This setting allows for multi-level security where each user belongs to a security clearance level, like in the classic Bell La Padula model (Bell & La Padula, 1976), where users with high clearance levels have higher values of ρ. A common example of such clearance levels is using increasing levels with labels such as public, restricted, confidential, and top secret. The user with the highest clearance in the system has a privacy budget of max s S (ρs), which we denote ρmax.

We consider an adversary who gains partial knowledge, that is, an adversary that sees some of the releases. Our goal is to design a differentially private mechanism such that an adversary with access to releases from some users S S, can learn at most what they could have learned from a release with privacy parameter max s S (ρs ). This means

that even by compromising or colluding with more users, the adversary s knowledge may not increase. In case an adversary observes all releases (e.g., by compromising all users), the privacy loss would be bounded by ρmax.

However, our goal is to design a mechanism where the releases are lossless in the sense that the noise distribution from multiple releases with a combined budget ρmax would be indistinguishable from one single release with the privacy budget ρmax. In other words, the privacy loss should be determined by ρmax, while independent releases would usually have privacy parameter P

s S ρs due to composition.

3. Gaussian Lossless Multiple Release

Extending the work in (Koufogiannis et al., 2016; Li et al., 2012), we demonstrate next that in the case of the Gaussian mechanism, providing lossless multiple release is clean and follows immediately from simple properties of Gaussians. Throughout the paper, we use the symbol Y to be a random variable that depends on the private dataset and Z to be one that does not.

We first consider the gradual release setting where an increasingly accurate estimate is released, and the overall privacy loss is determined solely by the latest, least private release. In Lemma 3.3, we drop this restriction, allowing releases in any order while still guaranteeing that the overall privacy guarantee is the maximum ρ value provided. For simplicity, we consider the one-dimensional case, where we

Private Lossless Multiple Release

want to release some query f : X R. The d-dimensional setting is handled by sampling each coordinate independently.

Getting started. Foreshadowing the usage of ρ-z CDP, we initially use 1/(2ρ) for denoting the variance of a Gaussian. Our inquiry starts with a basic observation about inversevariance weighting of Gaussians.

Lemma 3.1. Let Yρ N(β, 1

2ρ) and Yρ N(β, 1 2ρ ) where β R and ρ, ρ > 0. Then

Y = ρ ρ+ρ Yρ + ρ

ρ+ρ Yρ N β, 1 2(ρ+ρ )

Proof. The mean of Y is immediate by the fact that Y is a weighted-average of Yρ and Yρ , both of mean β. For the variance, direct computation yields:

Var[Y ] = ρ2/(2ρ) + ρ 2/(2ρ )

(ρ1 + ρ )2 = 1 2(ρ + ρ ) .

As the sum of two Gaussians is itself Gaussian, we are done.

The essence of what is being claimed is that a Gaussian of variance 1 2ρ and another Gaussian of variance 1 2ρ with the same mean can be combined into a new Gaussian with variance 1 2(ρ+ρ ) and the same mean. Inspired by this, we can repeatedly invoke the lemma for the following result.

Lemma 3.2. Given 0 < ρ1 < < ρm and β R define Yρ1, . . . , Yρm R where Yρ1 N(β, 1

ρ1 ), and for k > 1:

Yρk+1 = ρk ρk+1 Yρk + ρk+1 ρk

ρk+1 N β, 1 2(ρk+1 ρk)

Then for any i [m] : Yρi N(β, 1 2ρi ) and for any j [m] : Cov(Yρi, Yρj) = 1 2 max(ρi,ρj).

Proof. We first show the distribution of Yρk by induction on k. Assume that Yρk N(β, 1

ρk ), which is true for the base case of k = 1. Note that the inductive step follows immediately from invoking Lemma 3.1. For the covariance, note that we can expand the expressions inside the covariance and throw away the independent noise added in each recurrence. Assuming i j we get Cov(Yρi, Yρj) = Cov(Yρi, ρi

ρj Yρi) = ρi

ρj Var[Yρi] = 1 2ρj , implying the stated covariance.

Releases in arbitrary order. The Gaussian sequence in Lemma 3.2 will be the basis for our lossless multiple release version of the Gaussian mechanism. What the lemma does not address is generating the sequence in arbitrary order: Statically, given the full sequence (ρk)k [m], we can generate the Gaussians, but what if we receive them one-by-one and in arbitrary order? We will address this next.

Lemma 3.3. For ρ R+ (possibly ρ = ) and β R, let M = {(0, ), (ρ , Yρ )} where Yρ N(β, 1 2ρ ). Consider a finite subset S (0, ρ ) and the following process that runs for |S| iterations:

1. Pick an arbitrary ρ S and delete it from S;

2. Let (ρl, Yρl), (ρr, Yρr) M where ρ (ρl, ρr) and ρr ρl is minimal;

3. Sample Z N 0, (1 ρl/ρ)(1/ρ 1/ρr)

Yρ = 1 ρl/ρ 1 ρl/ρr Yρr + ρl/ρ ρl/ρr

1 ρl/ρr Yρl + Z,

and add (ρ, Yρ) to M.

Then the sequence of random variables generated by the process has the same distribution as described in Lemma 3.2.

Proof sketch. The argument is inductive. Under the hypothesis that all values generated up to the given point have the distribution described by Lemma 3.2, we argue that the newly generated value does too. The argument considers the four different cases for ρl, ρr, e.g., ρl = 0, ρr = ρ . As the proof involves tedious computation we refer the reader to Appendix B for the formal proof.

Formalizing lossless multiple release. Lemma 3.3 will constitute the basis for our implementation of lossless multiple release, but we have yet to formally define this notion. We do so next.

Definition 3.4 (Lossless multiple release). Let Mρ : X Y be a family of mechanisms on a domain X, indexed by a privacy parameter ρ R+. We say that M : X R+ Y implements Mρ with lossless multiple release if for every x X it satisfies:

1. ρ: M(x, ρ) and Mρ(x) are identically distributed.

2. For every finite subset S R+, processed in arbitrary order by M, and y Y, conditioned on M(x, max(S)) = y the joint distribution of (M(x, ρ))ρ S is uniquely determined by y and S.

Functionally, a mechanism meets the definition if its outputs can be correlated such that for any set of outputs, their joint distribution can be viewed as (randomized) post-processing of the least private release. An implementation necessarily has to store information about releases that have been made, i.e., the sequence of inputs to M and the corresponding outputs, to fulfill the requirement that releases for different privacy parameters are correlated. When M only supports outputting releases for a sequence of increasing privacy parameters, we call it lossless gradual release.

Private Lossless Multiple Release

Algorithm 1 Gaussian Multiple Release Parameters: ℓ2 sensitivity 2 Inputs: Set of releases M, privacy parameter ρ

1: Find (ρk, Yρk), (ρk+1, Yρk+1) M such that 2: ρ [ρk, ρk+1) and (ρ , ) M : ρ / (ρk, ρk+1)

3: Sample Zρ N(0, 2 2 (1 ρk/ρ) (1/ρ 1/ρk+1)

2(1 ρk/ρk+1) )

4: Let Yρ := Zρ + (1 ρk/ρ)Yρk+1+(ρk/ρ ρk/ρk+1)Yρk

1 ρk/ρk+1 5: Add M = M {(ρ, Yρ)} 6: Return Yρ

Having stated the definition for lossless multiple release, consider Algorithm 1. It implements Lemma 3.3, and the idea is visualized in Figure 1. Corollary 3.5. With M initialized as M = {(0, ), ( , f(x))}, Algorithm 1 implements the Gaussian mechanism with lossless multiple release.

Proof. Let {Yρ}ρ S be the set of outputs produced by Algorithm 1 on receiving the set S of privacy parameters in arbitrary order. Observe that the algorithm is implementing the (adaptive) sampling in Lemma 3.3, and so it produces outputs with the same distribution as Lemma 3.2. Property 1 of Definition 3.4 follows immediately from observing that Yρ N(f(x), 1

2ρ). For property 2, note that every release in Lemma 3.2 can be viewed as randomized post-processing of the least private release. One way to see this is to note that for any two releases Yρ and Yρ where ρ < ρ , we have that Cov(Yρ, Yρ ) = Var[Yρ ], implying that Yρ = Yρ + Z for aptly scaled zero-mean Gaussian noise Z.

4. Extending to Independent Additive Noise

We will next show that lossless multiple release holds for a larger class of mechanisms. To proceed we introduce the notion of an independent additive noise mechanism. Definition 4.1 (Independent Additive Noise Mechanism). We define an independent additive noise mechanism Af,ρ : X Rd as a mechanism of the form

Af,ρ(x) = f(x) + Z , Z D(ρ) ,

where D(ρ) is a probability distribution parameterized in ρ, that draws a d-dimensional vector with i.i.d. samples.

It turns out that any independent additive noise mechanism Af,ρ satisfying Definition 1.1 supports lossless multiple release. Lemma 4.2. Any independent additive noise mechanism Af,ρ with noise distribution D(ρ) satisfying convolution preorder can be implemented with lossless multiple release.

Proof. The proof is for the one-dimensional case, but the same argument can be invoked for the multidimensional

case as each coordinate has independent noise. We begin by proving that given S = {ρk : k [m]} R+, where ρ1 < < ρm, it is possible to construct a set of releases {Yρk : k [m]} that satisfy Definition 3.4. By Definition 1.1, it is possible to express each Yρk as

Yρk = f(x) + Zρm +

j=k Wρj+1,ρj , (1)

where Zρm D(ρm) and Wρj+1,ρj C(ρj+1, ρj) are sam-

pled independently. To prove that Yρk d= Af,ρk(x), observe that Yρk is distributed as f(x) plus a random variable drawn from D(ρm) C(ρm, ρm 1) C(ρk+1, ρk) = D(ρk), as needed. For the second property in Definition 3.4, observe that conditioning on Yρm = y we can express every other Yρk for k [m 1] as

j=k Wρj+1,ρj .

As a result, the joint distribution on (Yρ1, . . . , Yρm)|Yρm=y is uniquely determined by y, proving the second property.

We will now argue (inductively) that we can produce these releases adaptively. Let S1 S2 Sm = S be a sequence of subsets of S where |Si| = i. For the base case of S1 we can make a single release using Af,ρ. For our inductive hypothesis, assume we have produced releases corresponding to all the privacy parameters in Sn. Now, for Sn+1, we re-label the privacy parameters such that Sn+1 = {ρk}k [n+1] where ρ1 < < ρn+1. For the unique ρi Sn+1 \ Sn, we can conditionally sample it from the joint distribution over (Yρk)k [n+1], conditioned on the value of each release made in the previous round. By induction, it follows that we can adaptively release S.

4.1. Sampling in Concrete Settings

Lemma 4.2 proves the existence of a sampling procedure for lossless multiple release. In Section 3 we showed a sampling procedure in the particular case of Gaussian noise. In Appendix A we show that the structure of Algorithm 1 holds for general independent additive noise mechanisms. Namely, if the privacy parameters we support come from a bounded range (ρ0, ρ ) R+, then the corresponding algorithm has a similar structure (see Algorithm 5).

More precisely, consider two neighboring releases Yρk and Yρk+1 with noise parameters ρk and ρk+1, respectively, and a new lossless release Yρ with parameter ρ [ρk, ρk+1]. We can use (1) on this set of m + 1 releases and condition on the values of the m previous releases Yρ1, . . . , Yρm. Next, write Wρk+1,ρk = W1 + W2 where W1 = Yρ Yρk+1 and W2 = Yρk Yρ. In Appendix A, we show that sampling

Private Lossless Multiple Release

Yρ can be reduced to the following task:

Sample W1 conditioned on W1+W2 = Yρk Yρk+1. (2)

Example: Poisson mechanism. Consider the independent additive noise mechanism using the Poisson distribution, Poi(λ). This mechanism has the property that noise is always a non-negative integer, making it a natural noise distribution for integer vectors in settings where negative noise is undesirable. Appendix E states some basic privacy properties of the Poisson mechanism.

The family of Poisson distributions parameterized by ρ = 1/λ satisfies convolution preorder since for any λ1 > λ2, Poi(λ1) = Poi(λ1 λ2) Poi(λ2). To adaptively perform private lossless multiple release of a value f(x) using the Poisson mechanism and parameters λ1 > > λm we notice that the bridging distribution in the kth term of the sum in (1) has distribution Poi(λk λk+1). Note that without conditioning on Yρ1, . . . , Yρm, W1 Poi(λ λk+1) and W2 Poi(λk λ). By Lemma E.4 in the appendix we have that the sampling in (2) reduces to W1 Binomial(N, p) for N = Yρk Yρk+1 and p = (λ λk+1)/(λk λk+1).

Example: Laplace Mechanism. Koufogiannis et al. (2016) have already shown that the Laplace distribution parameterized by ρ = 1/b satisfies convolution preorder for any scale parameters b1 > b2. Let Lap Bridge(b2, b1) be the probability distribution that draws 0 with probability b2 2/b2 1 and from Lap(0, b1) with the remaining probability. Then Lap(0, b2) Lap Bridge(b2, b1) = Lap(0, b1) exactly. To implement lossless release via the sampling in (2), it turns out that the conditional distribution is a mixture W1 Lap Bridge(b, b2) of three distributions: With some probability W1 is equal to either 0 or Yρk Yρk+1, and otherwise it is sampled from the convolution of two Laplace distributions. We also show that the related exponential distribution satisfies convolution preorder, see Appendix F for the full details.

4.2. Lossless Multiple Release as a Blackbox

Inspired by Corollary 15 in Koufogiannis et al. (2016), we provide a meta theorem for lossless multiple release based on independent additive noise mechanisms. The central component is the following lemma, showing that the class of lossless multiple release mechanisms is closed under invertible post-processing.

Lemma 4.3. Let Mρ : X Y satisfy lossless multiple release, and let H : Y Y be an invertible function. Then M ρ = H Mρ also satisfies lossless multiple release.

Proof. Let M : X R+ Y be an implementation of Mρ satisfying lossless multiple release, and let M = H M, which we will argue implements lossless multiple release. The only property that does not

trivially hold for M is the second property of Definition 3.4. For a set S R+, note that to condition on M (x, max(S)) = y is equivalent to conditioning on H 1(M (x, max(S))) = M(x, max(S)) = H 1(y) Y. We thus have that conditioning M (x, max(S)) = y implies that the joint distribution over (M(x, ρ))ρ S is fully determined, and consequently so is (H(M(x, ρ)))ρ S = (M (x, ρ))ρ S, and we are done.

Theorem 4.4 (Meta theorem). Let Mρ : X Y be a family of mechanisms on a domain X, indexed by a privacy parameter ρ R+. Furthermore, assume Mρ can be decomposed as Mρ = H Af,ρ where

Af,ρ : X Rd is an independent additive noise mechanism for releasing f : X Rd with privacy parameter ρ and with noise distribution satisfying a convolution preorder (Definition 1.1);

H : Rd Y is an invertible post-processing step;

Then Mρ can be implemented with lossless multiple release.

Proof. The theorem follows from invoking Lemma 4.2 together with Lemma 4.3.

Supporting non-invertible post-processing. To support non-invertible post-processing, we also introduce a weaker notion: weakly lossless multiple release.

Definition 4.5 (Weakly Lossless Multiple Release). A mechanism Mρ supports weakly lossless multiple release if it can be written as Mρ = H M ρ where M ρ supports lossless multiple release and H is an arbitrary function (possibly chosen from some distribution).

We define weakly lossless gradual release analogously. It follows that these classes of mechanisms are closed under all post-processing. While this notion is indeed weaker, any algorithm M(x, ρ) implementing weakly lossless multiple release for a ρ-private mechanism Mρ(x), will have the property that the set of releases (M(x, ρ))ρ S are max(S)- private, if ρ-privacy is closed under post-processing. Examples of such privacy notions are ε-DP and ρ-z CDP. We get the following immediate corollary to Theorem 4.4.

Corollary 4.6. If the function H in Theorem 4.4 is not invertible, then Mρ = H Af,ρ supports weakly lossless multiple release.

Weakly lossless multiple release will play a role in the next section, where we consider non-invertible post-processing such as truncation.

Private Lossless Multiple Release

5. Applications

In this section, we describe two applications supported by our framework for lossless multiple release:

Factorization mechanisms (Li et al., 2015), where we want to privately release a linear query Ax, and

Sparse Gaussian histograms, also known as stability histograms (Wilkins et al., 2024; Google Anonymization Team, 2020).

We will make use of both lossless multiple release (Definition 3.4), and its weaker variant (Definition 4.5). This will be necessary since, e.g., the truncation used for sparse Gaussian histograms is not an invertible function, and so not covered by Theorem 4.4.

Because the noise generation for histograms on large domains is expensive, we give a dimension-independent algorithm that works in the gradual release setting. Throughout, we state our results as post-processings of the Gaussian mechanism, but analogous results hold for any other independent additive noise mechanism meeting Definition 1.1.

5.1. Lossless Multiple Release Factorization Mechanism

Let A be a query matrix and fix a public factorization A = LR. Denote the factorization mechanism (Li et al., 2015) as Fρ(x) = Ax + Lz = L(Rx + Z) on some private dataset x where Z is a vector drawn from a distribution D(ρ) satisfying Definition 1.1 parameterized by ρ, typically Gaussian or Laplace noise.

Lemma 5.1. Fρ can be implemented with weakly lossless multiple release, and lossless multiple release if L has a left-inverse.

Proof. Observe that Fρ(x) = L AR,ρ for an IAN mechanism AR,ρ(x) = Rx + Z where Z D(ρ) for a D(ρ) satisfying Definition 1.1. The result follows from Theorem 4.4 and Corollary 4.6, and noting that the linear map L is invertible exactly when L has a left-inverse.

Besides proving existence in Lemma 5.1, we also give an explicit instantiation in Algorithm 2 using the Gaussian mechanism.

Lemma 5.2. With initialization M = {(0, ), ( , Ax)}, Algorithm 2 implements the Gaussian noise factorization mechanism Fρ with (weakly) lossless multiple release.

Proof sketch. Note that the algorithm is practically a copy of Algorithm 1, but for the specific sensitive function Rx. The only structural difference is that the linearity of the post-processing L allows for storing correlated Gaussian noise directly in M.

Algorithm 2 Factorization Multiple Release Parameters: Factorization A = LR, ℓ2 sensitivity 2 Inputs: Set of releases M, privacy parameter ρ

1: Find (ρk, Yρk), (ρk+1, Yρk+1) M such that 2: ρ [ρk, ρk+1) and (ρ , ) M : ρ / (ρk, ρk+1)

3: Zρ N 0, 2 2 (1 ρk/ρ) (1/ρ 1/ρk+1)

2(1 ρk/ρk+1) d

4: Let Yρ := L Zρ + (1 ρk/ρ)Yρk+1+(ρk/ρ ρk/ρk+1)Yρk

1 ρk/ρk+1 5: Set M = M {(ρ, Yρ)} 6: Return Z

5.2. Weakly Lossless Multiple Release of Histograms

We will now describe an example where the post-processing is neither linear nor invertible: releasing sparse (Gaussian) histograms (Korolova et al., 2009; Google Anonymization Team, 2020; Balle & Wang, 2018). Let x = (Xi)n be a dataset of n users, and we want to privately release the histogram H(x) = Pn i=1 Xi where Xi {0, 1}d and a user can contribute to l distinct counts and the Gaussian mechanism which scales proportional to

l is preferable. In many natural settings, the domain size d can be very large, and therefore, the resulting histogram is usually very sparse, k = H(x) 0 d. Releasing a noisy histogram where noise has been added to each coordinate would destroy the sparsity. One can cope by introducing a threshold τ > 0 where only noisy counts that exceed τ are released. τ is usually set high enough such that the noise to a zero count will (after thresholding with high probability) still be zero. We denote the support of the histogram as U(H(x)) = {i [d] : H(x)i = 0} as the index of the non-zero coordinates. Define the thresholding function Tτ : Rd Rd as:

Tτ(x) = (yi)d i=1 , where yi =

( xi, if xi τ 0, otherwise .

Denote the (independent additive noise) sparse histogram mechanism as Hρ(x) = Tτ(H(x) + Z) where Z D(ρ) is a distribution satisfying a convolution preorder (Definition 1.1).

Lemma 5.3. Hρ can be implemented with weakly lossless multiple release.

Proof. Observe that Hρ can be expressed as a composite function Tτ Aρ,f(x) for an independent additive noise mechanism Aρ,f(x) meeting Definition 1.1. The statement thus follows from Corollary 4.6.

It is straightforward to implement weakly lossless gradual release for stability histograms using our framework with time and space complexity linear in dimension d; see Algorithm 3. The following lemma is proved in Appendix C.

Private Lossless Multiple Release

Algorithm 3 Histogram Gradual Release Parameters: ℓ2 sensitivity 2 Inputs: histogram H(x), set of releases M, privacy parameter ρ, threshold τ 1: Extract the single element (ρ , Z ) M

2: Sample Z N 0, 1

2 2 2(ρ ρ ) d

3: Let Z = ρ

4: for each i [d] do 5: if H(x)i + Zi > τ then Yi = H(x)i + Zi 6: else Yi = 0 7: end for 8: Set M = {(ρ, Z)} 9: Return Y

Lemma 5.4. With initialization M = {(0, 0)}, Algorithm 3 implements Hρ with weakly lossless gradual release.

Improving efficiency for weakly lossless gradual release. Korolova, Kenthapadi, Mishra, and Ntoulas (2009) showed that under approximate differential privacy, one can skip the noise generation for zero counts if thresholding is enabled, adding only a small probability for infinite privacy loss if this coordinate is non-zero in a neighboring dataset. Unfortunately, this approach does not fit our meta theorem, so instead, we apply a technique due to Cormode, Procopiuc, Srivastava, and Tran (2012) that efficiently simulates the noise distribution of those zero counts that would exceed the threshold and thus gives an identical distribution to the truncated noisy histogram. The procedure is described in the following lemma.

Lemma 5.5. For a fixed noise distribution D, the output distribution of releasing Tτ(H(x) + Z) with Z D are independent noise samples is equal to the following process: (a) add noise to every non-zero coordinate in H(x)i for i U(H(x)) and then (b) draw q Binomial(d k, p) where p = Pr Z D[Z > τ]. (c) Sample a subset Q [d] \ U(H(x)) of size |Q| = q uniformly at random and (d) set the noise for H(x) to Z D conditioned on being above the threshold τ.

Proof sketch. A formal argument can be found in Cormode et al. (2012), here we just provide a sketch. It is clear that every entry in the support gets the correct output distribution. Furthermore, the number of zero counts q whose noise exceeds τ follows a binomial distribution, and we can add conditional noise to a uniformly drawn subset of size q.

Algorithm 6 in Appendix C implements the efficient routine described by Lemma 5.5 under weakly lossless gradual release. The key challenge is that the probability for a histogram count to exceed the threshold in a given round now depends on whether it exceeded the threshold in any prior

round. Intuitively, all zero counts that have never exceeded the threshold have an equal probability to exceed the threshold in any given round, and so the simulation technique in Lemma 5.5 can be used for these counts (with different probabilities and conditional distributions). However, once any zero count exceeds the threshold, it will have to be treated the same as non-zero counts in all future rounds.

The utility and privacy of Algorithm 6 is proved by arguing that its output distribution matches that of Algorithm 3. A proof sketch for the following lemma is given in Appendix C. Lemma 5.6. Let H(x) be a histogram over [d], ρ1, . . . , ρm be a sequence increasing of privacy budgets, τ1, . . . , τm be a sequence of thresholds and 2 > 0 the ℓ2-sensitivity of the histogram. Also let the sequences of outputs (Y (1), . . . , Y (m)) and ( ˆY (1), . . . , ˆY (m)) be derived from running Algorithm 3 and Algorithm 6 respectively with the preceding parameters as input. Then the sequences (Y (1), . . . , Y (m)) and ( ˆY (1), . . . , ˆY (m)) are identically distributed.

The benefit of the efficient sampling technique is that the number of sampled Gaussians will never exceed that of the na ıve approach and can potentially be in the order of the sparsity depending on the parameter regime. Lemma 5.7. Let c be the number of zero-counts in the histogram output by Algorithm 6 least once over its execution. Then Algorithm 6 will sample (k + c)m (truncated or otherwise) Gaussians.

We again refer the reader to Appendix C for a proof.

6. Empirical Evaluation

To confirm our theoretical claims, we empirically evaluate the accuracy of lossless multiple release against a baseline algorithm that uses independent releases. We evaluate the impact by focusing on noise in isolation to avoid capturing the effect of specific queries. The baseline algorithm is a simple Gaussian mechanism where noise is drawn independently for each consecutive release. To showcase our algorithm s performance, we demonstrate how the cost incurred by uncoordinated releases grow with the amount of releases, in contrast to the lossless multiple release where there is no additional cost. We repeat our experiments 106

times, and measure the variance of the noise. The plot (Figure 2) shows, as expected, that our mechanism does not lose any utility from making multiple releases. As we can see, there is an expected increase in variance when going from one release to multiple releases the initial jump is larger the more releases we want to make as the budget used for the second release is the difference between the starting point (ρ = 0.001) and a constant increase in budget, whereas the subsequent releases all have the same difference in budget.

Private Lossless Multiple Release

10 2 10 1 100 Cumulated ρ

100 releases 50 releases 20 releases 10 releases 5 releases Lossless

Figure 2. Accuracy comparison of multiple uncorrelated releases compared to lossless multiple release. Budgets are spaced evenly on a logarithmic scale between ρ = 0.001 and ρ = 5 on the x-axis. Creating independent releases with a denser set of privacy parameters comes at the cost of increased variance. In the lossless setting we get the best possible variance with no bound on the number of releases.

7. Conclusion and Open Questions

We have initiated a systematic study of differential privacy with multiple releases, motivated by settings in which many levels of privacy or trust may co-exist. The main message is that it is possible to generalize a large class of lossless gradual release techniques to this setting, even when releases are determined online and in no particular order. For mechanisms based on Gaussian, Laplace, or Poisson additive noise we give simple and efficient sampling procedures for creating new lossless releases. In particular, we are able to do lossless multiple release for any factorization mechanism using invertible matrices. Finally, we consider algorithmic challenges related to lossless gradual release and show that private sparse histograms may be computed much more efficiently than what a direct application of our general results would imply.

There are still many open questions concerning mechanisms that do not inherit their privacy guarantee from an independent additive noise mechanism. In particular, it would be interesting to determine under which conditions the exponential mechanism (Mc Sherry & Talwar, 2007) supports lossless multiple release. Koufogiannis et al. (2016) conjecture that this is always possible. Other central private algorithms, such as report noisy max (Dwork et al., 2014) also do not have known lossless multiple release mechanisms, even in the gradual release setting. A final challenge we would like to mention is implementing our mechanisms on a finite computer, e.g., creating a multiple release version of the discrete Gaussian mechanism (Canonne et al., 2020). An appealing approach would be to base this on Poisson noise, which meets the technical conditions for our framework and approaches the Gaussian distribution in the

Acknowledgments

Andersson, Pagh and Retschmeier carried out this work at Basic Algorithms Research Copenhagen (BARC), which was supported by the VILLUM Investigator grant 54451. Providentia, a Data Science Distinguished Investigator grant from the Novo Nordisk Fonden, supported Andersson, Pagh and Retschmeier. Nelson carried out part of this work at Uppsala University. We would like to thank the ICML attendees at our poster presentation for pointing us to Equitz & Cover (1991).

Impact Statement

This paper presents work whose goal is to advance the field of private machine learning. There are many potential societal consequences of our work, none which we feel must be specifically highlighted here.

Agarwal, N., Suresh, A. T., Yu, F. X., Kumar, S., and Mc Mahan, B. cp SGD: Communication-Efficient and Differentially-Private Distributed SGD. In Advances in Neural Information Processing Systems, volume 31, pp. 7575 7586, 2018.

Balle, B. and Wang, Y. Improving the Gaussian Mechanism for Differential Privacy: Analytical Calibration and Optimal Denoising. In Proceedings of the 35th International Conference on Machine Learning, ICML 2018, Stockholm, Sweden, July 10-15, 2018, volume 80 of Proceedings of Machine Learning Research, pp. 403 412. PMLR, 2018.

Bell, D. E. and La Padula, L. J. Secure Computer System: Unified Exposition and Multics Interpretation:. Technical report, Defense Technical Information Center, Fort Belvoir, VA, mar 1976.

Bun, M. and Steinke, T. Concentrated Differential Privacy: Simplifications, Extensions, and Lower Bounds. In Theory of Cryptography, Lecture Notes in Computer Science, pp. 635 658, Berlin, Heidelberg, 2016. Springer. ISBN 978-3-662-53641-4. doi: 10.1007/978-3-662-536414 24.

Canonne, C. L., Kamath, G., and Steinke, T. The Discrete Gaussian for Differential Privacy. In Advances in Neural Information Processing Systems, volume 33, pp. 15676 15688. Curran Associates, Inc., 2020.

Cormode, G., Procopiuc, C. M., Srivastava, D., and Tran, T. T. L. Differentially Private Summaries for Sparse Data.

Private Lossless Multiple Release

In 15th International Conference on Database Theory, pp. 299 311, New York, NY, USA, 2012. ACM. ISBN 9781450307918. doi: 10.1145/2274576.2274608.

Ding, Z., Kifer, D., Saghaian N. E., S. M., Steinke, T., Wang, Y., Xiao, Y., and Zhang, D. The Permute-and Flip Mechanism is Identical to Report-Noisy-Max with Exponential Noise. Co RR, abs/2105.07260, 2021.

Dwork, C., Mc Sherry, F., Nissim, K., and Smith, A. Calibrating Noise to Sensitivity in Private Data Analysis. In Theory of Cryptography, Lecture Notes in Computer Science, pp. 265 284. Springer Berlin Heidelberg, 2006. ISBN 978-3-540-32732-5. doi: 10.1007/11681878 14.

Dwork, C., Roth, A., et al. The Algorithmic Foundations of Differential Privacy. Foundations and Trends in Theoretical Computer Science, 9(3 4):211 407, 2014. doi: 10.1561/0400000042.

Equitz, W. H. R. and Cover, T. M. Successive Refinement of Information. IEEE Trans. Inf. Theory, 37(2):269 275, 1991. doi: 10.1109/18.75242.

Girgis, A. M., Data, D., Chaudhuri, K., Fragouli, C., and Diggavi, S. N. Successive refinement of privacy. IEEE Journal on Selected Areas in Information Theory, 1(3): 745 759, 2020. doi: 10.1109/JSAIT.2020.3040403.

Google Anonymization Team. Delta for Thresholding. https://github.com/ google/differential-privacy/blob/ a7cc26ad91f74756fbe39bab44af6d655b37cc61/ common_docs/Delta_For_Thresholding. pdf, 2020.

Korolova, A., Kenthapadi, K., Mishra, N., and Ntoulas, A. Releasing Search Queries and Clicks Privately. In Proceedings of the 18th International Conference on World Wide Web, pp. 171 180, 2009. doi: 10.1145/1526709. 1526733.

Kotz, S., Kozubowski, T. J., and Podg orski, K. The Laplace Distribution and Generalizations. Birkh auser Boston, 2001. ISBN 978-1-4612-6646-4 978-1-4612-0173-1. doi: 10.1007/978-1-4612-0173-1.

Koufogiannis, F., Han, S., and Pappas, G. J. Gradual Release of Sensitive Data under Differential Privacy. Journal of Privacy and Confidentiality, 7(2), 2016. ISSN 2575-8527. doi: 10.29012/jpc.v7i2.649. Number: 2.

Li, C., Miklau, G., Hay, M., Mc Gregor, A., and Rastogi, V. The Matrix Mechanism: Optimizing Linear Counting Queries under Differential Privacy. VLDB J., 24(6):757 781, 2015. doi: 10.1007/s00778-015-0398-x.

Li, Y., Chen, M., Li, Q., and Zhang, W. Enabling Multilevel Trust in Privacy Preserving Data Mining. IEEE Trans. Knowl. Data Eng., 24(9):1598 1612, 2012. doi: 10.1109/ TKDE.2011.124.

Ligett, K., Neel, S., Roth, A., Waggoner, B., and Wu, Z. S. Accuracy First: Selecting a Differential Privacy Level for Accuracy Constrained ERM. In Advances in Neural Information Processing Systems, volume 30, pp. 2566 2576, 2017.

Mc Sherry, F. and Talwar, K. Mechanism Design via Differential Privacy. In IEEE Symposium on Foundations of Computer Science, pp. 94 103. IEEE, 2007. doi: 10.1109/FOCS.2007.41.

Pan, M. Randomized Response with Gradual Release of Privacy Budget. Co RR, abs/2401.13952, 2024. doi: 10. 48550/ar Xiv.2401.13952.

Rogers, R. M., Vadhan, S. P., Roth, A., and Ullman, J. R. Privacy Odometers and Filters: Pay-as-you-Go Composition. In Advances in Neural Information Processing Systems, volume 29, pp. 1921 1929, 2016.

Shaked, M. and Shanthikumar, J. G. (eds.). Univariate Stochastic Orders, pp. 3 79. Springer New York, New York, NY, 2007. ISBN 978-0-387-34675-5. doi: 10.1007/ 978-0-387-34675-5 1.

Whitehouse, J., Ramdas, A., Wu, Z. S., and Rogers, R. M. Brownian Noise Reduction: Maximizing Privacy Subject to Accuracy Constraints. In Advances in Neural Information Processing Systems, volume 35, Red Hook, NY, USA, 2022. Curran Associates Inc. ISBN 9781713871088.

Wilkins, A., Kifer, D., Zhang, D., and Karrer, B. Exact Privacy Analysis of the Gaussian Sparse Histogram Mechanism. Journal of Privacy and Confidentiality, 14(1), feb 2024. ISSN 2575-8527. doi: 10.29012/jpc.823.

Xiao, X., Tao, Y., and Chen, M. Optimal Random Perturbation at Multiple Privacy Levels. Proc. VLDB Endow., 2 (1):814 825, 2009. doi: 10.14778/1687627.1687719.

Zhang, S. and He, X. DProv DB: Differentially Private Query Processing with Multi-Analyst Provenance. Proc. ACM Manag. Data, 1(4):267:1 267:27, 2023. doi: 10. 1145/3626761.

Private Lossless Multiple Release

Algorithm 4 Generic Multiple Release Parameters: Noise distribution family D(ρ) satisfying a convolution preorder, bridging noise distribution C(ρ, ρ ) where D(ρ ) C(ρ , ρ) = D(ρ) for 0 < ρ < ρ . Inputs: Sensitive value f(x), new privacy parameter ρk, set of past releases M = {(ρi, Yρi) : i [m] \ {k}}

1: if M = then 2: Sample Zρk from D(ρρk) 3: Set Yρk = f(x) + Zρk 4: else if k = 1 then 5: Sample Wρk+1,ρk from C(ρk+1, ρk) 6: Set Yρk = Yρk+1 + Wρk+1,ρk 7: else if k (1, m) then 8: Sample Wρk+1,ρk from C(ρk+1, ρk) conditioned on Wρk+1,ρk + Wρk,ρk 1 = Yρk 1 Yρk+1 9: Set Yρk = Yρk+1 + Wρk+1,ρk 10: else if k = m then 11: Sample Zρk from D(ρk) conditioned on Zρk + Wρk,ρk 1 = Yρk 1 f(x) 12: Set Yρk = f(x) + Zρk 13: end if 14: Set M = M {(ρk, Yρk)} 15: Return Yρk

A. On Lossless Multiple Release Sampling

Lemma 4.2 makes no claim about how easy it is to sample noise from the conditional noise distribution, but guarantees its existence. We will use this section for discussing this sampling in greater detail. Consider the joint distribution of the releases defined by Equation (1) from the proof of Lemma 4.2, re-stated below for convenience:

Yρk = f(x) + Zρm +

j=k Wρj+1,ρj , k = 1, . . . , m , (1)

where Yρk is the ρk-private release, Zρm D(ρm) and Wρj+1,ρj C(ρj+1, ρj). Recall that C(ρ , ρ) is the unique distribution where for any 0 < ρ < ρ : D(ρ) = D(ρ ) C(ρ , ρ). As argued for in the proof of Lemma 4.2, to make a new ρ-private release, we consider the joint distribution, and sample the new releases conditioned on all past releases.

Formally, for a set of privacy parameters ρ1 < < ρm, we are interested in releasing ρk for a single k [m], conditioned on the set of past releases M = {(ρi, Yρi) : i [m] \ {k}}. We give pseudocode for this in Algorithm 4, and a lemma for its correctness.

Lemma A.1. Initializing M = and then running Algorithm 4 for a set of privacy parameters {ρi : i [m]} (processed in arbitrary order), will produce a set of outputs {Yρi : i [m]} with distribution given by Equation (1).

Proof. For M = , we have that Yρk = f(x) + D(ρk) = Af,ρk(x) on lines 2-3, as expected, where A is our independent additive noise noise mechanism releasing f with ρ-privacy. For the remaining cases, assume M contains m 1 releases with the correct joint distribution, and we are generating a new release at privacy level ρk. We will argue that Algorithm 4 produces a Yρk whose joint distribution with the releases in M matches (1). For k = 1, observe that Yρ1 = Yρ2 + Wρ2,ρ1, and so lines 5-6 are correct. For 1 < k < m, note that Yρk = Yρk+1 + Wρk+1,ρk and Yρk 1 = Yρk + Wρk,ρk 1, which combined allows us to identify the correct conditional distribution. Yρk is given by Yρk+1 + Wρk+1,ρk, conditioned on Wρk+1,ρk + Wρk,ρk 1 = Yρk 1 Yρk+1, matching lines 8-9. For the final case of k = m, then Yρm = f(x) + Zρm and we have that Yρm 1 = Yρm + Wρm,ρm 1. It follows that the correct noise distribution is Yρm = f(x) + Zρm, conditioned on Zρm + Wρm,ρm 1 = Yρm 1 f(x), matching lines 11-12. An inductive argument identical to that given in the proof of Lemma 4.2 completes the proof.

A.1. Simplification and Removing Dependency on the Dataset

Note that f(x) is only used in Algorithm 4 when a more accurate release is generated (lines 1 and 10). In the remaining cases we are only adding noise to past releases. This already allows us to simplify the algorithm, and argue for not having to

Private Lossless Multiple Release

Algorithm 5 Simplified Generic Multiple Release Parameters: Noise distribution family D(ρ) satisfying a convolution preorder, bridging noise distribution C(ρ, ρ ) where D(ρ ) C(ρ , ρ) = D(ρ) for 0 < ρ < ρ

Inputs: Privacy parameter ρ, set of releases M

1: Find (ρk, Yρk), (ρk+1, Yρk+1) M such that ρ [ρk, ρk+1) and (ρ , ) M : ρ / (ρk, ρk+1) 2: Sample Wρk+1,ρ from C(ρk+1, ρ) conditioned on Wρk+1,ρ + Wρ,ρk = Yρk Yρk+1 3: Set Yρ = Yρk+1 + Wρk+1,ρ 4: Set M = M {(ρ, Yρ)} 5: Return Yρ

store f(x) in memory indefinitely. The algorithm reduces to the case (see line 7 of Algorithm 4) where only the two closest releases are combined into a new one. Such an algorithm is given in Algorithm 5, together with Lemma A.2.

Lemma A.2. Let M = {(ρ0, Yρ0), (ρ , Yρ )} for Yρ = Aρ ,f(x), and Yρ0 = Yρ +Wρ ,ρ0. Then running Algorithm 5 with input M and a set of privacy parameters {ρi : i [m]} (ρ0, ρ ) (processed in arbitrary order), will produce a set of outputs {Yρi : i [m]} with distribution given by Equation (1). Moreover, the set M will at all times satisfy ρ -privacy.

Proof. The statement follows from carefully comparing to Algorithm 4. Let S = (ρ1, . . . , ρm) be a sequence of noise values from the lemma statement, and let O = (Yρ1, . . . , Yρm) be the outputs. Consider a second sequence S = (ρ , ρ0, ρ1, . . . , ρm), and consider the corresponding sequence of outputs O = (Y ρ , Y ρ0, Y ρ1, . . . , Y ρm) from inputting

S and M = to Algorithm 4. We can directly check that Yρ0 d= Y ρ0 and Yρ d= Y ρ . For the remaining outputs, note that just before each algorithm is called, their internal states M and M are also identically distributed. Now, the fact that each ρi (ρ0, ρ ) implies that the remaining outputs Y ρ1, . . . , Y ρm are generated from lines 10-12 in Algorithm 4.

Checking carefully, lines 1-3 in Algorithm 5 are implementing the same routine, and since M d= M , it follows that (Yρ1, . . . , Yρm) d= (Y ρ1, . . . , Y ρm), and so the statement follows from Lemma A.1. The last statement on the ρ -privacy of M follows from Algorithm 5 implements Equation (1), and so each release in M can at any time during execution be viewed as randomized post-processing of Y .

Essentially, if we commit to supporting a bounded range of privacy parameters then we get a simpler algorithm. Lemma A.2 also says something more: If we commit to supporting a lowest level of privacy ρ , then Algorithm 5 can be implemented in such a way that its internal state is ρ -private. After initializing M using the sensitive function f(x), we can erase f(x) from memory and Y will contain enough private information to generate all future releases. This could prove useful in settings where the time between releases is large, and we want to limit the private information leaked if the state of the algorithm were to be compromised. This can be compared with the gradual release setting, where natural implementations would require consistent access to f(x).

B. Omitted Proof for Gaussian Lossless Multiple Release

Proof of Lemma 3.3. Throughout the proof we will ignore the factor 2 in the denominator of the variance. To argue that the values generated by the process match the distribution Lemma 3.2, we will argue for increasing subsets of releases. Let Sn = {ρk : k [n 1]} where 0 < ρ1 < < ρn < ρ be the set of values in S for which we have generated Gaussians at the start of the nth round of the process. Note that we re-label the ρ s between the rounds such that ρk Sn always is the kth smallest value in Sn. Our argument will proceed by induction: assume that all the Gaussians {Yρk : k [n]} generated after n rounds have the distribution given by Lemma 3.2. Then we will show that {Yρk : k [n 1]} {Yρ} has the same distribution as predicted by Sn {ρ} from invoking Lemma 3.2.

We begin with our base case. For n = 1, we have that ρl = 0 and ρr = ρ , and so Yρ = Yρ + N(0, 1/ρ 1/ρ ). Since Yρ N(β, 1/ρ ), we have that Yρ N(β, 1/ρ), as expected, and so the base case passes.

For n 2, we first consider the following cases.

Case 1: ρl = 0, ρr = ρ1 Sn. In this case, Yρ = Yρ1 + N(1/ρ 1/ρ1), and so Yρ N(β, 1/ρ). Since ρ < ρ1, we have that i [n 1] : Cov(Yρ, Yρi) = Cov(Yρ1, Yρi) = 1/ρi, as expected for the release with the smallest value in {ρ} Sn, and so the case is complete.

Private Lossless Multiple Release

Case 2: ρl = ρn 1, ρr = ρ = . We have to deal with the case where we have set ρ = separately, as it is a bit of a

trick. Note that we get Yρ = (1 ρn 1)β+ρn 1Yρn 1

ρ + N(0, 1 ρn 1/ρ

ρ ), since Yρ = β in this case. The end result is once

more a sum of Gaussians, with mean β, and for the variance we can explicitly compute that Var[Yρ] = ρ2 n 1 ρn 1ρ2 ρ ρn 1

ρ2 = 1/ρ, as expected. For the covariance, for i [n 1] : Cov(Yρ, Yρi) = ρn 1

ρ Cov(Yρn 1, Yρi) = 1/ρ, as expected for the largest value in {ρ} Sn, and so we are done. Now we consider the most general case.

Case 3: ρl Sn 1 and ρr = . We begin with computing the variance of Y using the hypothesis Var[Yρr] = 1/ρr and Var[Yρl] = 1/ρl

Var[Yρ] = (ρ 1 l ρ 1)(ρ 1 ρ 1 r ) ρ 1 l ρ 1 r + (ρ 1 l ρ 1)2 Var[Yρr] + (ρ 1 ρ 1 r )2 Var[Yρl] (ρ 1 l ρ 1 r )2

= 1 (ρ 1 l ρ 1 r )2

(ρ 1 l ρ 1)(ρ 1 ρ 1 r )(ρ 1 l ρ 1 r ) + ρ 1 r (ρ 1 l ρ 1)2 + ρ 1 l (ρ 1 ρ 1 r )2

= 1 (ρ 1 l ρ 1 r )2 ρ 1(ρ 1 l ρ 1 r )2 = 1

where the third equality follows from applying the identity

(a b)(b c)(a c) + c(a b)2 + a(b c)2 = b(a c)2 ,

to the expression in the brackets for a = ρ 1 l , b = ρ 1 and c = ρ 1 r , proving the correctness of the variance. Furthermore, Yρ is a sum of Gaussians and a convex combination of two Gaussians with mean β, so Yρ N(β, 1/ρ). What remains to show is that the covariances match up, which we do next.

We start with the case where ρr = ρ , and so there exists ρk, ρk+1 Sn such that ρ (ρk, ρk + 1) For j [n 1], we therefore have that

Cov(Yρ, Yρj) = Cov (ρ 1 l ρ 1)Yρr + (ρ 1 ρ 1 r )Yρl ρ 1 l ρ 1 r , Yρj

= 1 ρ 1 l ρ 1 r

(ρ 1 l ρ 1)Cov(Yρr, Yρj) + (ρ 1 ρ 1 r )Cov(Yρl, Yρj)

= 1 ρ 1 l ρ 1 r

(ρ 1 l ρ 1) min(ρ 1 r , ρ 1 j ) + (ρ 1 ρ 1 r ) min(ρ 1 l , ρ 1 j ) .

We consider the cases of j k and j > k separately. For j k, we have that

Cov(Yρ, Yρj) = 1 ρ 1 l ρ 1 r

(ρ 1 l ρ 1)ρ 1 r + (ρ 1 ρ 1 r )ρ 1 l

and similarly for j k + 1 we get

Cov(Yρ, Yρj) = 1 ρ 1 k ρ 1 k+1

(ρ 1 k ρ 1)ρ 1 j + (ρ 1 ρ 1 k+1)ρ 1 j

It follows that Cov(Yρ, Yρj) = 2 2 2 max(ρ,ρj) for j [m], and so we are done with this part of the covariances.

For the last step, we consider what happens when ρr = ρ < . In this case, ρl = ρn 1 and so for j [n 1]:

Cov(Yρ, Yρj) = Cov (1 ρn 1/ρ)Yρ + (ρn 1/ρ ρn 1/ρ )Yρn 1

1 ρn 1/ρ , Yρj

= 1 ρn 1/ρ 1 ρn 1/ρ Cov(Yρ , Yρj) + ρn 1/ρ ρn 1/ρ

1 ρn 1/ρ Cov(Yρn 1, Yρj)

ρ ρn 1 + 1/ρ 1/ρ

1 ρn 1/ρ = 1 ρn 1/ρ + ρ /ρ 1

ρ ρn 1 = 1/ρ ,

and now we are done.

Private Lossless Multiple Release

Algorithm 6 Efficient Histogram Gradual Release Parameters: ℓ2-sensitivity 2 Inputs: Private histogram H(x), privacy budgets ρ1 < < ρm, thresholds τ1, . . . , τm

1: Let S(0) = U(H(x)) // Tracking counts that have already been released 2: Also let Z(0) = {0}d and ρ0 = 0. 3: r [m] : let distribution D(r) = N 0, 1

2 2 2(ρr ρr 1)

4: for each round r [m] do 5: Initialize Y (r) = {0}d

6: for each tracked count in preceding round j S(r 1) do

7: Draw fresh noise Z(r) j D(r)

8: Update aggregate noise Z(r) j = ρr 1

ρr Z(r 1) j + 1

ρr Z(r) j 9: if H(x)j + Z(r) j > τr then

10: Y (r) j = H(x)j + Z(r) j 11: end if 12: end for // Simulating Noise for zero counts 13: Let Z(k) D(k) for k [r] be random variables.

14: Compute p(r) = Pr h Pr k=1 Z(k) > τrρr | { ℓ [r 1] : Pℓ k=1 Z(k) ρℓτℓ} i

15: Draw q Binomial d |S(r 1)|, p(r) . 16: Select a subset Q [d] \ S(r 1) uniformly at random of size q. 17: for each index j Q do 18: Initialize Z(r) j = 0 19: for k [r 1] do

20: Draw fresh noise Z(k) j D(k) conditioned on Z(k) j ρkτk Z(r) j 21: Update aggregate noise Z(r) j = Z(r) j + Z(k) j 22: end for 23: Draw fresh noise Z(k) j D(k) conditioned on Z(k) j > ρkτk Z(r) j 24: Update aggregate noise Z(r) j = Z(r) j + Z(k) j 25: Y (r) j = H(x)j + Z(r) j // Guaranteed to be above the threshold. 26: end for 27: Set S(r) = S(r 1) Q 28: output private histogram Y (r)

29: end for

C. Details on (Efficient) Weakly Lossless Gradual Release of Sparse Gaussian Histograms

Algorithm 3 implements weakly lossless gradual release of private histograms. We prove this next.

Proof of Lemma 5.4. Observe that for any increasing sequence (ρk)k m, the corresponding sequence of variables Z on line 3 produced by the algorithm, have the same distribution as in Lemma 3.1 for β = 0. The Y that is ultimately returned, however, is a post-processing of H(x) + Z, which has the same distribution as Lemma 3.1 for β = H(x). Therefore Algorithm 3 is implementing lossless gradual release for the Gaussian mechanism applied to H(x), combined with a non-invertible post-processing. The lemma statement follows from Corollary 4.6.

Note that one only has to store the noisy terms from the preceding round to implement Algorithm 3. Nevertheless, it might be infeasible, say, when the domain is really large, to sample the noise for the zero coordinates. Algorithm 6 uses the computational trick in Lemma 5.5 to speed up this computation. The algorithm is static, where the privacy budgets are fixed upfront but can easily be converted to an online algorithm. Recall that U(H(x)) = {i [d] : H(x)i = 0} is the support of the histogram.

We proceed to give proofs for Lemma 5.6 and 5.7.

Private Lossless Multiple Release

Proof sketch for Lemma 5.6. The claim directly follows an inductive argument. For the base case, note that the first iteration of Algorithm 6 is running the same routine as described by Lemma 5.5, and so ˆY (1) must be identically distributed to Y (1). For our inductive hypothesis, assume that the subsequences (Y (1), . . . , Y (k)) and ( ˆY (1), . . . , ˆY (k)) are identically distributed. Note that for the (k + 1)st release, Algorithm 6 handles all the true nonzero counts and every zero count that has ever exceeded the threshold in a past round in the same way. These counts in ˆY (k+1) and Y (k+1) clearly have the same distribution.

Now, note that each of the remaining zero counts have up to and including the kth round been reported as zero in each round, and so their probability of exceeding the threshold, p(k+1), should be equal across all of them.

The sampling performed by Algorithm 6 is structured in the same manner as Lemma 5.5, but with sampling probability p(k+1), and the noise terms added for any zero count exceeding the threshold, are different. For the distributions to match, the probability of exceeding the threshold and the noise distribution for an element chosen to exceed the threshold are more complex. The probability of exceeding the threshold is conditioned on prefixes of Gaussians, which correctly simulates the probability of not exceeding the threshold at any prior round until first doing so in the (k + 1)st one. The noise added once selected to exceed the threshold is a sum of truncated Gaussians, which simulate the same event. As the principle is built on the same logic as Lemma 5.5, we have that (Y (1), . . . , Y (k+1)) and ( ˆY (1), . . . , ˆY (k+1)) are identically distributed, and so the claim is true by induction.

Proof of Lemma 5.7. Note that each of the k non-zero true counts will, in each round, have fresh noise added on line 7. If a non-zero count is selected by the binomial sampling in the kth round, the for-loop starting on line 17 will result in sampling k noise terms, after which for future rounds it will get treated as a non-zero count. It follows that non-zero entries contribute km samples, and zero counts exceeding the threshold contribute cm samples.

D. Definitions

We begin with the most common version of differential privacy.

Definition D.1 ((ε, δ)-Differential Privacy (Dwork et al., 2014)). A randomized algorithm M : X Y is (ε, δ)- differentially private if for all S Range(M) and all pairs of neighboring inputs x, x X, it holds that

Pr[M(x) S] exp(ε) Pr[M(x ) S] + δ ,

where (ε, 0)-DP is referred to as ε-DP.

Zero-Concentrated Differential Privacy (z CDP) is a notion of differential privacy that provides a simple but accurate analysis of privacy loss, particularly under composition.

Definition D.2 (Bun & Steinke (2016), ρ-z CDP). Let ρ > 0. An algorithm M : X Y satisfies ρ-z CDP, if for all α > 1 and all pairs of neighboring inputs x, x X, it holds that

Dα (M (x) ||M (x )) ρα,

where Dα (M (x) ||M (x )) denotes the α-R enyi divergence between two output distributions of M(x) and M(x ).

Lemma D.3 (Bun & Steinke (2016), Composition). If M1(x) and M2(x) satisfy ρ1-z CDP and ρ2-z CDP, respectively, then (M1(x), M2(x)) satisfies (ρ1 + ρ2)-z CDP.

Lemma D.4 (Bun & Steinke (2016), Gaussian Mechanism). Let f : X Rd be a query with ℓ2-sensitivity 2. Consider

the mechanism Gf,ρ : X Rd that, on private input x, releases a sample from f(x) + N 0, 2 2 2ρ d . Then Gf,ρ satisfies ρ-z CDP.

One way to uniquely describe probability distributions is via their Characteristic Functions.

Definition D.5 (Characteristic Function). The characteristic function of a random variable X is defined as φX(t) = E[eit X].

For proving Lemma F.4 and Claim F.7, we will use a nice property of CFs for the convolution of two random variables:

Lemma D.6 (Convolution of Characteristic Functions). Let X, Y be two independent RVs with CFs φX(t) and φY (t) respectively, then φX+Y (t) = φX(t) φY (t).

Private Lossless Multiple Release

Proof. Furthermore, by linearity of expectation, we have for the sum of X + Y :

φX+Y (t) = E[eit(X+Y )] = E[eit X] + E[eit Y ] = φX(t) φX(t) .

E. The Poisson Mechanism

We are not aware of any explicit statements in the literature on the privacy guarantees obtained by adding Poisson distributed noise to a d-dimensional vector of integers. However, since the Poisson distribution is the limiting distribution of binomial distributions with the same mean λ = Np, where N is the number of trials, such bounds can be derived from existing bounds on the binomial mechanism. For the sake of completeness, we include such statements based on the following theorem from (Agarwal et al., 2018):

Theorem E.1 (Agarwal et al. (2018)). For any δ, parameters N, p and sensitivity bounds 1, 2, such that

Np(1 p) max 23 log 10d

the d-dimensional Binomial mechanism is (ε, δ)-differentially private for

Np(1 p) + 2cp q

Np(1 p)(1 δ/10) +

2 3 log 1.25

δ + dp log 20d

δ Np(1 p) .

3 p2 + (1 p)2 , bp 2(p2 + (1 p)2)

3 + (1 2p), cp

2 3p3 + 3(1 p)3 + 2p2 + 2(1 p)2 .

Setting p = λ/N and considering the limiting bound when N we get:

Theorem E.2 (Privacy guarantees of the Poisson Mechanism). The d-dimensional Poisson mechanism with parameter λ > max(23 log(10d/δ), 2 ), is (ε, δ)-differentially private with

λ(1 δ/10) +

2 3 log 1.25

A simpler expression can be derived for unit sensitivities by relaxing the constants and assuming that δ is not too large:

Corollary E.3 (Simplified upper bound with unit sensitivities). Assume that 1 = 2 = = 1. Then for δ < 1/100, the d-dimensional Poisson mechanism with parameter λ > 23 log(10d/δ) is (ε, δ)-differentially private for

λ + 2 log 20d

We will need the following lemma that determines the sampling step of private lossless multiple release for the Poisson mechanism:

Lemma E.4. Let X1 Poi(λ1) and X2 Poi(λ2) be independent Poisson random variables. Then, for any nonnegative integer k, the conditional distribution of X1 given X1 + X2 = k is

X1 | (X1 + X2 = k) Binomial k, λ1 λ1 + λ2

Proof. Since X1 and X2 are independent, their joint probability mass function is for all x = 0, 1, , k

P[X1 = x, X2 = k x] = e (λ1+λ2) λx 1 x! λ k x 2 (k x)! .

Private Lossless Multiple Release

Moreover, the sum X1 + X2 is Poisson with parameter λ1 + λ2, so that

Pr[X1 + X2 = k] = e (λ1+λ2) (λ1 + λ2)k

Thus, by the definition of conditional probability,

Pr[X1 = x | X1 + X2 = k] = Pr[X1 = x, X2 = k x]

Pr[X1 + X2 = k]

= e (λ1+λ2) λx 1 x! λ k x 2 (k x)!

e (λ1+λ2) (λ1 + λ2)k

x λ2 λ1 + λ2

This is precisely the probability mass function of a Binomial random variable with parameters k and λ1 λ1+λ2 .

F. The Laplace Mechanism

Koufogiannis et al. (2016) showed how to do lossless gradual releases for the Laplace mechanism and supports either tightening the privacy guarantees or loosening them. We will now strengthen this result by exactly showing how to do them in arbitrary order, similar to what was done in Section 4 for the Gaussian mechanism. We will first show that the Laplace distribution satisfies Definition 1.1. This already implies the existence of an algorithm supporting gradual lossless release via Lemma 4.2. After, we will derive how to sample a new release with scale parameter b given two distinct releases with scaling b2 < b and b < b1.

Definition F.1 (Laplace distribution). The zero-centered Laplace distribution Lap(0, b) with scale parameter b > 0 has probability density function fb(x) = 1 2b exp( |x|/b) for all x R.

Lemma F.2 ((Kotz et al., 2001) Characteristic function of Laplace). Let X be Laplace random variable with probability density function as in Definition F.1, then for all t R, the characteristic function of X is

φX(t) = E[eit X] =

2be |x|/bdx = 1 1 + b2t2 .

Proof. By a simple integration:

E[eit X] = 1

eitx |x|/bdx = 1

e(1/b+ti)xdx +

0 e( 1/b+ti)xdx

1 1/b + ti + 1 1/b ti

= 1 1 + b2t2 .

We will now show that the zero-centered Laplace distribution with scale parameter b satisfies convolution preorder (Definition 1.1). Note that a larger value of b corresponds to a more private release by definition. Building on this, we will condition on the two closest releases to create a new one in the middle, as done in Lemma 3.3 for the Gaussian. The following fact was already shown in Koufogiannis et al. (2016), but we give a proof here for completeness using a technique based on characteristic functions as done by Equitz & Cover (1991).

Claim F.3 (Convolution preorder: Laplace). Fix b2 < b1 R+ let X Lap(0, b2) and draw

( 0 with probability b2 2/b2 1 Lap(0, b1) otherwise , then X + W Lap(0, b1) .

Private Lossless Multiple Release

Proof. We know that the characteristic function of X is φX(t) = 1 1+b2 2t2 and for the convolution φX+W = 1 1+b2 1t2 . Because of independence, the convolution is defined for all t as:

φX(t)φW (t) = φX+W (t)

φW (t) = 1 + b2 2t2

1 + b2 1t2 = b2 1 + b2 1b2 2t2

b2 1(1 + b2 1t2) = b2 2(1 + b2 1t2) + b2 1 b2 2 b2 1(1 + b2 1t2) = b2 2 b2 1 + 1 b2 2 b2 1

1 1 + b2 1t2 .

The last expression encodes the convex combination of the claimed mixture distribution because 1 1+b2 1t2 is again the characteristic function of a zero-centered Laplace distribution with scaling parameter b1.

We next prove a simple result about the convolution of two Laplace distributions (compare also eq., 2.3.23 of Kotz et al. (2001)).

Lemma F.4 (Convolution). For two fixed scaling parameters b1 = b2 R+, let X1 Lap(0, b1) and X2 Lap(0, b2). Then the density of X1 + X2 is given by

(fb1 fb2)(t) = 1 2(b2 1 b2 2)

b1e |t|/b1 b2e |t|/b2 .

Proof. Assume t 0 and note that the other case follows by symmetry. We compute the density of the convolution by a straightforward integration:

(fb1 fb2)(t) = Z

fb1(x)fb2(t x)dx = 1 4b1b2

= 1 4b1b2 Z 0

b2 dx + Z t

= 1 4b1b2 exp (x/b1 (t x)/b2)

b 1 1 + b 1 2

+ exp( x/b1 (t x)/b2)

b 1 2 b 1 1

0 + exp( x/b1 (x t)/b2)

b 1 2 b 1 1

4 exp( t/b2)

b1 + b2 + exp( t/b1) exp( t/b2)

b1 b2 + exp( t/b1)

= 1 2(b2 1 b2 2)

b1e t/b1 b2e t/b2 .

Now, we are ready to show that the Laplace mechanism can be implemented with multiple releases.

Lemma F.5 (Multiple release Laplace). For fixed 0 < b2 < b < b1, let µ1 = b2/b2 1 and µ2 = b2 2/b2 and D1 Ber(µ1) and D2 Ber(µ2). Furthermore, let

( 0 if D1 = 1 Lap(0, b) otherwise and X2 =

( 0 if D2 = 1 Lap(0, b2) otherwise .

Then we have that Pr[X1 = 0 | X1 + X2 = 0] = 1, and for every real number k = 0 the distribution of X1 conditioned on X1 + X2 = k is:

X1 | (X1 + X2 = k)

0 with probability µ1 (1 µ2) exp( |k|/b2) 2b2 f X1+X2(k)

k with probability (1 µ1) µ2 exp( |k|/b) 2b f X1+X2(k)

H(b, b2, k) with probability (1 µ1) (1 µ2) (fb fb2)(k)

Private Lossless Multiple Release

where 0 and k denote the constant distributions, H(b, b2, k) is the probability distribution with probability density function

h(x) = fb(x)fb2(k x)

(fb fb2)(k) ,

f X1+X2(k) = µ1(1 µ2

2b2 e |k|/b2 + (1 µ1)µ2

2b e |k|/b (1 µ1)(1 µ2)

be |k|/b b2e |k|/b2 + µ1µ2δ0(x), and

(fb fb2)(k) = 1 2(b2 1 b2 2)

b1e t/b1 b2e t/b2 ,

where δ0 is the Dirac delta function.

Proof. First we consider Pr[X1 = 0|X1 + X2 = 0], arguing that

Pr[X1 = 0|X1 + X2 = 0] = Pr[X1 = 0] Pr[X2 = 0]

Pr[X1 + X2 = 0] = µ1µ2 µ1µ2 + 0 = 1 .

To see why the first equality holds split Pr[X1 + X2 = 0] into each of the four combinations of discrete/continuous for X1, X2. The only non-zero contribution to the probability mass comes from the discrete/discrete case, as the remaining cases contribute mass proportional to the probability that a continuous distribution assumes an exact value, which is zero.

Next we turn to the distribution of X1 conditioned on X1 + X2 = k where k = 0. Denote by fb the probability density function of Lap(0, b). Note that the mixture densities become

f X1(x) = µ1δ0(x) + (1 µ1)fb(x) ,

f X2(x) = µ2δ0(x) + (1 µ2)fb2(x) .

Before analyzing the different cases separately, we compute the convolution X1 + X2.

Convolution f X1+X2.

Note that have to take care of a subtle technicality: The case that both X1 and X2 are zero from the discrete part can only happen when we condition on X1 + X2 = 0. We can now give the density of the convolution X1 + X2 at any real point x:

f X1+X2(x) = X

(d1,d2) {0,1}2 f X1+X2|D1=d1,D2=d2(x) Pr[D1 = d1 D2 = d2]

= µ1(1 µ2) fb2(x) + (1 µ1)µ2 fb(x) + (1 µ1)(1 µ2) (fb fb2)(x) + µ1µ2δ0(x)

2b2 e |x|/b2 + (1 µ1)µ2

2b e |x|/b + (1 µ1)(1 µ2) 1 2(b2 b2 2)

be |x|/b b2e |x|/b2

+ µ1µ2δ0(x) ,

where the third term follows from Lemma F.4. Note that the last term only contributes when x = 0.

We can now show how the sampling procedure in the claim is justified. We first assume k = 0 and analyze three possible cases how X1 + X2 is built up: Either both of them are drawn from the (continuous) Laplace distribution or exactly one. (The case where both are from their respective discrete parts can only happen if k = 0, analyzed above.)

Case 1: X1 = 0 and X2 = k. X2 = k is necessarily from its continuous part. By the definition of conditional probability, we have for k = 0:

Pr[X1 = 0|X1 + X2 = k, k = 0] = Pr[X1 = 0 X1 + X2 = k]

Pr[X1 + X2 = k] = Pr[X1 = 0] Pr[X2 = k]

Pr[X1 + X2 = k]

= µ1 f X2(k) f X1+X2(k) = µ1(1 µ2) 2b2f X1+X2(k) e |k|/b2 := p1 .

For the third equality, we simply used definition of a probability density function via its limit:

lim 0+ Pr [X2 [k , k + ]] Pr [X1 + X2 [k , k + ]] = f X2(k) f X1+X2(k) .

Private Lossless Multiple Release

Case 2: X1 = k and X2 = 0 Now assume the flipped case. By a similar argument:

Pr[X1 = k|X1 + X2 = k] = Pr[X1 = k, X1 + X2 = k]

Pr[X1 + X2 = k] = Pr[X1 = k] Pr[X2 = 0]

Pr[X1 + X2 = k] = (1 µ1)µ2 2bf X1+X2(k) e |k|/b := p2 .

Case 3: X1 = 0, X2 = 0, X1 + X2 = k In the remaining case, both X1 and X2 are independently sampled from continuous Laplace distributions with probability density functions fb(x) and fb2(x). Therefore, with the remaining probability 1 (p1 + p2), we know that X1 is sampled according to the following conditional probability density function:

f X1(x)|X1+X2=k = f X1,X1+X2(x, k)

(fb fb2)(k) = fb(x)fb2(k x)

(fb fb2)(k) ,

where the last line follows from the trivial identity X1 + X2 = k X2 = k X1. This is a valid probability density function because f is trivially non-negative due to its parts being non-negative and furthermore Z

fb(x)fb2(x k)

(fb fb2)(k) dx = 1 (fb fb2)(k)

fb(x)fb2(k x)dx = (fb fb2)(k)

(fb fb2)(k) = 1 .

What is left is to verify that the probabilities in Equation (3) indeed add up to one for k = 0:

(1 µ1)(1 µ2)(fb fb2)(k)

f X1+X2(k) + (1 µ1)µ2 exp( |k|/b)

2bf X1+X2(k) + µ1(1 µ2) exp( |k|/b2)

2b2f X1+X2(k)

= 1 f X1+X2(k)

(1 µ1)(1 µ2)(fb fb2)(k) + (1 µ1)µ2 exp( |k|/b)

2b + µ1(1 µ2) exp( |k|/b2)

2b2 e λ2|k|

| {z } f X1+X2(k) for k =0

F.1. Showing Convolution Preorder for Exponential Noise

The exponential distribution is closely related to the Laplace distribution, but gives poor differential privacy guarantees. Nevertheless, it still serves as a building block for some private mechanisms, e.g., Report-Noisy-Max (Ding et al., 2021). We show next that it also satisfies a convolution preorder.

Definition F.6 (Exponential distribution). The exponential distribution Exp(λ) with rate parameter λ > 0 has probability density function fλ(x) = λ exp( λx) for all x R+. Furthermore, its characteristic function is given by φX(t) = E[eit X] = λ λ it Claim F.7 (Convolution preorder: Exponential distribution). Fix λ1 < λ2 R+, let X Exp(λ2) and draw

( 0 with probability λ1/λ2 Exp(λ1) otherwise , then X + W Exp(λ1) .

Proof. Using the same trick as in the proof of Claim F.3, we have that

φW (t) = φX+W (t)

where the final expression is the characteristic function of the claimed mixture distribution.