# Unbounded Differentially Private Quantile and Maximum Estimation

David Durfee, Anonym Inc. (david@anonymco.com)

37th Conference on Neural Information Processing Systems (NeurIPS 2023).

Abstract

In this work we consider the problem of differentially private computation of quantiles for the data, especially the highest quantiles such as the maximum, but with an unbounded range for the dataset. We show that this can be done efficiently through a simple invocation of Above Threshold, a subroutine that is iteratively called in the fundamental Sparse Vector Technique, even when there is no upper bound on the data. In particular, we show that this procedure can give more accurate and robust estimates on the highest quantiles with applications towards clipping that is essential for differentially private sum and mean estimation. In addition, we show how two invocations can handle the fully unbounded data setting. Within our study, we show that an improved analysis of Above Threshold can improve the privacy guarantees for the widely used Sparse Vector Technique, which is of independent interest. We give a more general characterization of privacy loss for Above Threshold which we immediately apply to our method for improved privacy guarantees. Our algorithm only requires one 𝑂(𝑛) pass through the data, which can be unsorted, and each subsequent query takes 𝑂(1) time. We empirically compare our unbounded algorithm with the state-of-the-art algorithms in the bounded setting. For inner quantiles, we find that our method often performs better on non-synthetic datasets. For the maximal quantiles, which we apply to differentially private sum computation, we find that our method performs significantly better.

1 Introduction

In statistics, quantiles are values that divide the data into specific proportions, such as the median, which divides the data in half. Quantiles are a central statistical method for better understanding a dataset. However, releasing quantile values could leak information about specific individuals within a sensitive dataset. As a result, it becomes necessary to protect individual privacy within this computation. Differential privacy offers a rigorous method for measuring the amount that one individual can change the output of a computation. Due to its rigorous guarantees, differential privacy has become the gold standard for measuring privacy. This measurement method then offers an inherent tradeoff between accuracy and privacy, with outputs of pure noise achieving perfect privacy. Thus, the goal of designing algorithms for differentially private quantile computation is to maximize accuracy for a given level of privacy. There are a variety of previous methods for computing a given quantile of the dataset that we will cover in Section 1.2, but each of these requires known bounds on the dataset. The most effective and practical method invokes the exponential mechanism Smith (2011). For computing multiple quantiles this method can be called iteratively. Follow-up work showed that it could be called recursively by splitting the dataset at each call to reduce the privacy cost of composition Kaplan et al. (2022). Further, a generalization can be called efficiently in one shot Gillenwater et al. (2021).
1.1 Our contributions

In this work, we offer an alternative practical and accurate approach, Unbounded Quantile Estimation (UQE), that also invokes a well-known technique and can additionally be applied to the unbounded setting. While the commonly-used technique designs a distribution to draw from that is specific to the dataset, our method will simply perform a noisy guess-and-check. Initially we assume there is only a lower bound on the data, as non-negative data is common in real-world datasets with sensitive individual information. Our method will simply iteratively increase the candidate value by a small percentage and halt when the number of data points below the value exceeds the desired amount dictated by the given quantile. While the relative increase will be small each iteration, the exponential nature still implies that the candidate value will become massive within a reasonable number of iterations. As a consequence, our algorithm can handle the unbounded setting, where we also show that two calls to this procedure can handle fully unbounded data. Computing multiple quantiles can be achieved by applying the recursive splitting framework from Kaplan et al. (2022).

Performing our guess-and-check procedure with differential privacy exactly fits Above Threshold, a method that is iteratively called in the Sparse Vector Technique Dwork et al. (2009). We also take a deeper look at Above Threshold and unsurprisingly show that, similar to report noisy max algorithms, the noise addition can come from the Laplace, Gumbel or Exponential distributions. We further push this analysis to show that for monotonic queries, a common class of queries which we will also utilize in our methods, the privacy bounds for composition within the Sparse Vector Technique can be further improved. Given the widespread usage of this technique,[1] we believe this result is of independent interest. Furthermore, we give a more general characterization of query properties that can improve the privacy bounds of Above Threshold. We immediately utilize this characterization in our unbounded quantile estimation algorithm to improve privacy guarantees.

While the commonly used algorithms for quantile estimation can still be run with incredibly loose bounds to ensure the data is contained within them, this can have a substantial impact upon the accuracy of estimating the highest quantiles such as the maximum. This leads to an especially important application for our algorithm, differentially private sum computation, which can thereby be used to compute the mean as well. Performing this computation practically without assumptions upon the distribution often requires clipping the data and adding noise proportionally. Clipping too high adds too much noise, and clipping too low changes the sum of the data too much. The highest quantiles of the data are used for clipping to optimize this tradeoff. The unbounded nature of our approach fundamentally allows us to estimate the highest quantiles more robustly and improve the accuracy of differentially private sum computation. This improvement in differentially private sum computation is further evidenced by our empirical evaluation, with significant improvements in accuracy. Our empirical comparison will be upon the same datasets from previous work in the bounded setting. We also compare private computation of the inner quantiles on these datasets.
For synthetic datasets generated from uniform or Gaussian distributions, we see that the more structured approach of designing a distribution for the data from the exponential mechanism consistently performs better. However, for the real-world datasets, we see that our unstructured approach tends to perform better even within this bounded setting. By design our algorithm is less specific to the data, so our alternative approach becomes advantageous when less is known about the structure and bounds of the data a priori. As such, for large-scale privacy systems that provide statistical analysis for a wide variety of datasets, our methods will be more flexible to handle greater generality accurately.

1.2 Background literature

The primary algorithm for privately computing a given quantile, against which we compare our technique, applies the exponential mechanism with a utility function based upon closeness to the true quantile Smith (2011). We will discuss this algorithm, which we denote as Exponential Mechanism Quantile (EMQ), in greater detail in Appendix A. This approach was then extended to computing multiple quantiles more cleverly by recursively splitting the data and establishing that only one partition of the dataset can change between neighbors, thereby reducing the composition costs Kaplan et al. (2022). Additional follow-up work showed that a generalization of the utility function to multiple quantiles could be efficiently drawn upon in one shot Gillenwater et al. (2021). Another recent result examined this problem in the streaming data setting and gave a method that only uses strongly sublinear space complexity Alabi et al. (2022).

Quantile computation can also be achieved through CDF estimation Bun et al. (2015); Kaplan et al. (2020a). However these techniques offer limited practicality as they rely upon several reductions and parameter tuning. Recursively splitting the data is also done for CDF estimation algorithms, where the statistics from each split can be aggregated for quantile computation Dwork et al. (2010); Chan et al. (2011). These techniques tend to be overkill for quantile estimation and thus suffer in accuracy comparatively.

We will also give improved privacy analysis of the Sparse Vector Technique, which was originally introduced in Dwork et al. (2009). A more detailed analysis of the method can be found in Lyu et al. (2017). Additional recent work has shown that more information can be output from the method at no additional privacy cost Kaplan et al. (2021); Ding et al. (2023).

[1] For example, see Roth and Roughgarden (2010); Hardt and Rothblum (2010); Dwork et al. (2015); Nissim et al. (2016); Nissim and Stemmer (2018); Kaplan et al. (2020a,b); Hasidim et al. (2020); Bun et al. (2017); Bassily et al. (2018); Cummings et al. (2020); Ligett et al. (2017); Barthe et al. (2016a,b); Steinke and Ullman (2016); Cummings et al. (2015); Ullman (2015); Nandi and Bassily (2020); Shokri and Shmatikov (2015); Hsu et al. (2013); Sajed and Sheffet (2019); Feldman and Steinke (2017); Blum et al. (2015); Chen et al. (2016).

1.3 Organization

We provide the requisite notation and definitions in Section 2. In Section 3, we review the Above Threshold algorithm from the literature and show that its privacy analysis can be further improved. In Section 4, we provide our unbounded quantile estimation method. In Section 5, we test our method compared to the previous techniques on synthetic and real-world datasets.
In Appendix A, we consider the estimation of the highest quantiles, which has immediate application to differentially private sum and mean estimation. In Appendix B, we give further results on the Above Threshold algorithm and provide the missing proofs from Section 3. In Appendix C, we provide further variants and extensions of our unbounded quantile estimation technique.

2 Preliminaries

We will let π‘₯, π‘₯β€² denote datasets in our data universe 𝒳.

Definition 2.1. Datasets π‘₯, π‘₯β€² are neighboring if at most one individual's data has been changed.

Note that we use the swap definition, but our analysis of the Above Threshold algorithm will be agnostic to the definition of neighboring. Using this definition as opposed to the add-subtract definition is necessary to apply the same experimental setup as in Gillenwater et al. (2021). Our differentially private quantile estimation will apply to either, and we will give the privacy guarantees if we instead use the add-subtract definition in Appendix C.2.

Definition 2.2. A function 𝑓 : 𝒳 β†’ ℝ has sensitivity Ξ” if for any neighboring datasets π‘₯, π‘₯β€² we have |𝑓(π‘₯) βˆ’ 𝑓(π‘₯β€²)| ≀ Ξ”.

Definition 2.3. Dwork et al. (2006b,a) A mechanism 𝑀 is (πœ€, 𝛿)-differentially private (DP) if for any neighboring datasets π‘₯, π‘₯β€² and all 𝑆 βŠ† Range(𝑀): Pr[𝑀(π‘₯) ∈ 𝑆] ≀ e^πœ€ Β· Pr[𝑀(π‘₯β€²) ∈ 𝑆] + 𝛿.

We will primarily work with pure differential privacy in this work, where 𝛿 = 0. We will also be considering the composition properties of the Sparse Vector Technique, and the primary method for comparison will be Concentrated Differential Privacy, which has become widely used in practice due to its tighter and simpler advanced composition properties Bun and Steinke (2016). This definition is instead based upon the RΓ©nyi divergence, where for probability distributions 𝑃, 𝑄 over the same domain and 𝛼 > 1

𝐷_𝛼(𝑃||𝑄) = (1/(𝛼 βˆ’ 1)) ln E_{𝑧 ∼ 𝑃}[(𝑃(𝑧)/𝑄(𝑧))^(𝛼 βˆ’ 1)]

Definition 2.4. Bun and Steinke (2016) A mechanism 𝑀 is 𝜌-zero-concentrated differentially private (zCDP) if for any neighboring datasets π‘₯, π‘₯β€² and all 𝛼 ∈ (1, ∞): 𝐷_𝛼(𝑀(π‘₯)||𝑀(π‘₯β€²)) ≀ πœŒπ›Ό.

We can translate DP into zCDP in the following way.

Proposition 1. Bun and Steinke (2016) If 𝑀 satisfies πœ€-DP then 𝑀 satisfies (πœ€Β²/2)-zCDP.

In our examination of Above Threshold we will add different types of noise, similar to the report noisy max algorithms Ding et al. (2021). Accordingly, we will consider noise from the Laplace, Gumbel and Exponential distributions, where Lap(𝑏) has PDF 𝑝_Lap(𝑧; 𝑏), Gumbel(𝑏) has PDF 𝑝_Gumbel(𝑧; 𝑏), and Expo(𝑏) has PDF 𝑝_Expo(𝑧; 𝑏) with

𝑝_Lap(𝑧; 𝑏) = (1/(2𝑏)) exp(βˆ’|𝑧|/𝑏)
𝑝_Gumbel(𝑧; 𝑏) = (1/𝑏) exp(βˆ’(𝑧/𝑏 + e^(βˆ’π‘§/𝑏)))
𝑝_Expo(𝑧; 𝑏) = (1/𝑏) exp(βˆ’π‘§/𝑏) for 𝑧 β‰₯ 0, and 𝑝_Expo(𝑧; 𝑏) = 0 for 𝑧 < 0

We let Noise(𝑏) denote noise addition from any of Lap(𝑏), Gumbel(𝑏), or Expo(𝑏). We will also utilize the definition of the exponential mechanism to analyze the addition of Gumbel noise.

Definition 2.5. McSherry and Talwar (2007) The Exponential Mechanism is a randomized mapping 𝑀 such that Pr[𝑀(π‘₯) = 𝑦] ∝ exp(πœ€ Β· π‘ž(π‘₯, 𝑦)/Ξ”) where π‘ž : 𝒳 Γ— 𝒴 β†’ ℝ has sensitivity Ξ”.

3 Improved Analysis for Sparse Vector Technique

In this section, we review the Above Threshold algorithm from the literature. To our knowledge, this technique has only been used with Laplace noise in the literature. Unsurprisingly, we show that Gumbel and Exponential noise can also be applied, with the former allowing for a closed form expression of each output probability. We further show that for monotonic queries the privacy analysis of the Sparse Vector Technique, which iteratively applies Above Threshold, can be improved.
All proofs are pushed to Appendix B, where we also give a more general characterization of query properties that can improve the privacy bounds of Above Threshold.

3.1 Above Threshold Algorithm

We first provide the algorithm for Above Threshold, where noise can be applied from any of the Laplace, Gumbel or Exponential distributions.

Algorithm 1 Above Threshold
Require: Input dataset π‘₯, a stream of queries {𝑓_𝑖 : 𝒳 β†’ ℝ} with sensitivity Ξ”, and a threshold 𝑇
1: Set 𝑇̂ = 𝑇 + Noise(Ξ”/πœ€1)
2: for each query 𝑖 do
3:   Set 𝜈_𝑖 = Noise(Ξ”/πœ€2)
4:   if 𝑓_𝑖(π‘₯) + 𝜈_𝑖 β‰₯ 𝑇̂ then
5:     Output ⊀ and halt
6:   else
7:     Output βŠ₯
8:   end if
9: end for

We will also define a common class of queries within the literature that is often seen to achieve a factor of 2 improvement in privacy bounds.

Definition 3.1. We say that a stream of queries {𝑓_𝑖 : 𝒳 β†’ ℝ} with sensitivity Ξ” is monotonic if for any neighboring π‘₯, π‘₯β€² we have either 𝑓_𝑖(π‘₯) ≀ 𝑓_𝑖(π‘₯β€²) for all 𝑖 or 𝑓_𝑖(π‘₯) β‰₯ 𝑓_𝑖(π‘₯β€²) for all 𝑖.

To our knowledge, all previous derivations of Above Threshold in the literature apply Lap noise, which gives the following privacy guarantees.

Lemma 3.1 (Theorems 2 and 3 of Lyu et al. (2017)). If the noise addition is Lap then Algorithm 1 is (πœ€1 + 2πœ€2)-DP for general queries and is (πœ€1 + πœ€2)-DP for monotonic queries.

Given that Expo noise is one-sided Lap noise, it can often be applied for comparative algorithms such as this one and report noisy max as well. We will show this extension in the appendix for completeness.

Corollary 3.1. If the noise addition is Expo then Algorithm 1 is (πœ€1 + 2πœ€2)-DP for general queries and (πœ€1 + πœ€2)-DP for monotonic queries.

While the proofs for Expo noise generally follow from the Lap noise proofs, it will require different techniques to show that Gumbel noise can be applied as well. In particular, we utilize the known connection between adding Gumbel noise and the exponential mechanism.

Lemma 3.2. If the noise addition is Gumbel and πœ€1 = πœ€2 then Algorithm 1 is (πœ€1 + 2πœ€2)-DP for general queries and (πœ€1 + πœ€2)-DP for monotonic queries.

We defer the proof of this to the appendix. In all of our empirical evaluations we will use Expo noise in our calls to Above Threshold because it has the lowest variance for the same parameter. While we strongly believe that Expo noise will be most accurate under the same noise parameters, we leave a more rigorous examination to future work. We also note that this examination was implicitly done for report noisy max between Gumbel and Expo noise in McKenna and Sheldon (2020), where their algorithm is equivalent to adding Expo noise Ding et al. (2021), and Expo noise was shown to be clearly superior.

3.2 Improved privacy analysis for Sparse Vector Technique

In this section, we further consider the iterative application of Above Threshold, which is known as the Sparse Vector Technique. We show that for monotonic queries, we can improve the privacy analysis of the Sparse Vector Technique to obtain better utility for the same level of privacy. Our primary metric for measuring privacy through composition will be zCDP, which we defined in Section 2 and which has become commonly used particularly due to its composition properties. We further show in the appendix that our analysis also enjoys improvement under the standard definition of differential privacy. These improved properties immediately apply to our unbounded quantile estimation algorithm as our queries will be monotonic.
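Before stating the improved bound, the following is a minimal Python sketch of Algorithm 1 with the three noise options. It is an illustrative reading of the pseudocode above, not our released implementation; all names are our own and only numpy is assumed.

```python
import numpy as np

def sample_noise(kind, scale, rng):
    # Lap, Gumbel, and Expo noise as defined in Section 2, with scale = Delta / eps.
    if kind == "laplace":
        return rng.laplace(scale=scale)
    if kind == "gumbel":
        return rng.gumbel(scale=scale)  # Lemma 3.2 additionally requires eps1 = eps2
    if kind == "expo":
        return rng.exponential(scale=scale)
    raise ValueError(f"unknown noise kind: {kind}")

def above_threshold(x, queries, T, sens, eps1, eps2, kind="expo", seed=None):
    """Run Algorithm 1: return the index of the first noisy query to cross the
    noisy threshold (the ⊀ output), or None if the stream is exhausted."""
    rng = np.random.default_rng(seed)
    noisy_T = T + sample_noise(kind, sens / eps1, rng)
    for i, f in enumerate(queries):
        if f(x) + sample_noise(kind, sens / eps2, rng) >= noisy_T:
            return i  # ⊀: halt
        # otherwise βŠ₯: continue to the next query
    return None
```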
Theorem 1. If the queries are monotonic, then for any noise addition of Lap, Gumbel, or Expo we have that Algorithm 1 is (πœ€Β²/2)-zCDP where πœ€ = πœ€1/2 + πœ€2. If the noise addition is Gumbel then we further require πœ€1 = πœ€2.

Note that applying Proposition 1 will instead give πœ€ = πœ€1 + πœ€2. It will require further techniques to reduce this by πœ€1/2, which will immediately allow for better utility with the same privacy guarantees. This bound also follows the intuitive factor of 2 improvement that is often expected for monotonic queries. The analysis will be achieved through providing a range-bounded property, a definition that was introduced in Durfee and Rogers (2019). This definition is ideally suited to characterizing the privacy loss of selection algorithms, by which we can view Above Threshold. As such it will also enjoy the improved composition bounds upon the standard differential privacy definition shown in Dong et al. (2020). This range-bounded property was then unified with zCDP with improved privacy guarantees in Cesar and Rogers (2021). We give a proof of this theorem along with further discussion in Appendix B.3.

We will also give a generalized characterization of when we can take advantage of properties of the queries to tighten the privacy bounds in Above Threshold. We will immediately utilize this characterization to improve the privacy guarantees for our method in Appendix C.

4 Unbounded Quantile Estimation

In this section we give our method for unbounded quantile estimation. We focus upon the lower bounded setting, which we view as most applicable to real-world problems, where non-negative data is incredibly common, particularly for datasets that contain information about individuals. This method can be symmetrically applied to upper bounded data, and in the appendix we will show how this approach can be extended to the fully unbounded setting.

For the quantile problem we will assume our data is π‘₯ ∈ ℝ^𝑛. Given quantile π‘ž ∈ [0, 1] and dataset π‘₯ ∈ ℝ^𝑛, the goal is to find 𝑑 ∈ ℝ such that |{π‘₯_𝑗 ∈ π‘₯ | π‘₯_𝑗 < 𝑑}| is as close to π‘žπ‘› as possible.

4.1 Unbounded quantile mechanism

The idea behind unbounded quantile estimation will be a simple guess-and-check method that will just invoke Above Threshold. In particular, we will guess candidate values 𝑑 such that |{π‘₯_𝑗 ∈ π‘₯ | π‘₯_𝑗 < 𝑑}| is close to π‘žπ‘›. We begin with the smallest candidate (recall that we assume lower bounded data here and generalize later) and iteratively increase by a small percentage. At each iteration we check if |{π‘₯_𝑗 ∈ π‘₯ | π‘₯_𝑗 < 𝑑}| has exceeded π‘žπ‘›, and terminate when it does, outputting the most recent candidate. In order to achieve this procedure privately, we will simply invoke Above Threshold. We also discuss how this procedure can be achieved efficiently in Section 4.2. Thus we give our algorithm here, where the lower bound of the data, 𝓁, is our starting candidate and 𝛽 is the scale at which the threshold increases. Given that we want our candidate value to increase by a small percentage, it must start as a positive value. As such we will essentially just shift the data such that the lower bound is instead 1. In all our experiments we set 𝛽 = 1.001, so the increase is by 0.1% each iteration.
Algorithm 2 Unbounded quantile mechanism
Require: Input dataset π‘₯, a quantile π‘ž, a lower bound 𝓁, and parameter 𝛽 > 1
1: Run Above Threshold with π‘₯, 𝑇 = π‘žπ‘› and 𝑓_𝑖(π‘₯) = |{π‘₯_𝑗 ∈ π‘₯ | π‘₯_𝑗 βˆ’ 𝓁 + 1 < 𝛽^𝑖}|
2: Output 𝛽^π‘˜ + 𝓁 βˆ’ 1 where π‘˜ is the query at which Above Threshold halted

Given that our method simply calls Above Threshold, it will enjoy all the privacy guarantees from Section 3.1. Furthermore, we will show that our queries are monotonic.

Lemma 4.1. For any sequence of thresholds {𝑑_𝑖 ∈ ℝ} let 𝑓_𝑖(π‘₯) = |{π‘₯_𝑗 ∈ π‘₯ | π‘₯_𝑗 < 𝑑_𝑖}| for all 𝑖. For any neighboring datasets under Definition 2.1, we have that {𝑓_𝑖} are monotonic queries with sensitivity 1.

Proof. Let π‘₯_𝑗 be the value that differs between neighbors π‘₯, π‘₯β€², taking value π‘₯β€²_𝑗 in π‘₯β€². Define π‘₯_{<𝑑} = {π‘₯_𝑗 ∈ π‘₯ | π‘₯_𝑗 < 𝑑}. We consider the case π‘₯β€²_𝑗 > π‘₯_𝑗, and the other will follow symmetrically. For all thresholds 𝑑_𝑖 ∈ (βˆ’βˆž, π‘₯_𝑗] we have π‘₯_𝑗 βˆ‰ π‘₯_{<𝑑_𝑖} and π‘₯β€²_𝑗 βˆ‰ π‘₯β€²_{<𝑑_𝑖}, so 𝑓_𝑖(π‘₯) = 𝑓_𝑖(π‘₯β€²). For all thresholds 𝑑_𝑖 ∈ (π‘₯β€²_𝑗, ∞) we have π‘₯_𝑗 ∈ π‘₯_{<𝑑_𝑖} and π‘₯β€²_𝑗 ∈ π‘₯β€²_{<𝑑_𝑖}, so 𝑓_𝑖(π‘₯) = 𝑓_𝑖(π‘₯β€²). Finally, for all thresholds 𝑑_𝑖 ∈ (π‘₯_𝑗, π‘₯β€²_𝑗] we have π‘₯_𝑗 ∈ π‘₯_{<𝑑_𝑖} and π‘₯β€²_𝑗 βˆ‰ π‘₯β€²_{<𝑑_𝑖}, so 𝑓_𝑖(π‘₯) = 𝑓_𝑖(π‘₯β€²) + 1. Therefore, 𝑓_𝑖(π‘₯) β‰₯ 𝑓_𝑖(π‘₯β€²) for all 𝑖, and the sensitivity is 1.

Note that under the swap definition of neighboring the dataset size 𝑛, and hence the threshold π‘žπ‘›, remains constant. We will discuss how to extend our algorithm and further improve the privacy bounds for the add-subtract definition of neighboring in the appendix.

4.2 Simple and scalable implementation

In this section we show how our call to Above Threshold can be done with a simple linear time pass through the data, after which each subsequent query takes 𝑂(1) time. While the running time could potentially be infinite, if we set 𝛽 = 1.001, then after 50,000 iterations our threshold is already over 10^21 and thus highly likely to have halted. Unless the scale of the data is absurdly high or the 𝛽 value chosen converges to 1, our guess-and-check process will finish reasonably quickly.[2]

[2] Note that the process can also be terminated at any time without affecting the privacy guarantees.

In our initial pass through the data, for each data point π‘₯_𝑗 we will find the index 𝑖 such that 𝛽^𝑖 ≀ π‘₯_𝑗 βˆ’ 𝓁 + 1 < 𝛽^(𝑖+1), which can be done by simply computing log_𝛽(π‘₯_𝑗 βˆ’ 𝓁 + 1), as our lower bound ensures π‘₯_𝑗 βˆ’ 𝓁 + 1 β‰₯ 1. Using a dictionary or similar data structure we can efficiently store |{π‘₯_𝑗 ∈ π‘₯ | 𝛽^𝑖 ≀ π‘₯_𝑗 βˆ’ 𝓁 + 1 < 𝛽^(𝑖+1)}| for each 𝑖 with the default being 0. This preprocessing does not require sorted data and takes 𝑂(𝑛) arithmetic time, where we note that the previous algorithms also measure runtime arithmetically. Finally, for each query, if we already have |{π‘₯_𝑗 ∈ π‘₯ | π‘₯_𝑗 βˆ’ 𝓁 + 1 < 𝛽^𝑖}|, then we can add |{π‘₯_𝑗 ∈ π‘₯ | 𝛽^𝑖 ≀ π‘₯_𝑗 βˆ’ 𝓁 + 1 < 𝛽^(𝑖+1)}| in 𝑂(1) time to get |{π‘₯_𝑗 ∈ π‘₯ | π‘₯_𝑗 βˆ’ 𝓁 + 1 < 𝛽^(𝑖+1)}|. Inductively, each query will take 𝑂(1) time. We provide the code in the appendix for easier reproducibility, and also sketch an implementation below.
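The following minimal Python sketch of Algorithm 2 inlines the Above Threshold check with Expo noise: one 𝑂(𝑛) bucketing pass, then 𝑂(1) work per query. It is an illustrative reading of the algorithm under the stated assumptions (numpy only; the max_iters cutoff and all names are our own), not our released code.

```python
import numpy as np
from collections import defaultdict

def uqe(x, q, lower, beta=1.001, eps1=0.5, eps2=0.5, max_iters=100_000, seed=None):
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    n = len(x)
    # One O(n) pass: bucket each shifted point by the power of beta it falls under,
    # i.e. the index i with beta^i <= x_j - lower + 1 < beta^(i+1).
    shifted = x - lower + 1.0  # the lower bound guarantees shifted >= 1
    buckets = defaultdict(int)
    for idx in np.floor(np.log(shifted) / np.log(beta)).astype(int):
        buckets[idx] += 1
    noisy_T = q * n + rng.exponential(scale=1.0 / eps1)  # sensitivity 1 by Lemma 4.1
    count = 0  # f_i(x) = |{x_j : x_j - lower + 1 < beta^i}|, maintained in O(1) per query
    for i in range(max_iters):  # stopping early never harms privacy
        if count + rng.exponential(scale=1.0 / eps2) >= noisy_T:
            return beta**i + lower - 1.0  # the candidate at the halting query
        count += buckets.get(i, 0)  # points in [beta^i, beta^(i+1)) now fall below beta^(i+1)
    return beta**max_iters + lower - 1.0
```

For instance, uqe(data, q=0.5, lower=0.0) would privately estimate the median of non-negative data under these assumptions.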
4.3 Extension to multiple quantiles

The framework for computing multiple quantiles set up in Kaplan et al. (2022) is agnostic to the technique used for computing a single quantile. Their method will first compute the middle quantile and split the data according to the result. Through recursive application the number of levels of computation will be logarithmic. Furthermore, at each level we can see that at most one partition of the data will differ between neighbors, allowing for instead a logarithmic number of compositions. As such, our approach can easily be applied to this recursive splitting framework to achieve the same improvements in composition. This will require some minor updating of their proofs to the swap definition, which we do in the appendix.

5 Empirical Evaluation

In this section we empirically evaluate our approach compared to the previous approaches. We give further detail of the previous approaches, particularly EMQ, in Appendix A, along with strong intuition upon why our approach will better handle maximal quantile estimation for data clipping. We first go over the datasets and settings used for our experiments, which will follow recent related work Gillenwater et al. (2021); Kaplan et al. (2022). Next we evaluate how accurately our method estimates quantiles for the different datasets in the bounded setting. Finally, we will consider the application of computing a differentially private sum, which also gives mean computation, and show how our algorithm allows for a significantly more robust and accurate method when tight bounds are not known for the dataset.

5.1 Datasets

We borrow the same setup and datasets as Gillenwater et al. (2021); Kaplan et al. (2022). We test our algorithm compared to the state-of-the-art on six different datasets. Two datasets will be synthetic. One draws 10,000 data points from the uniform distribution in the range [βˆ’5, 5] and the other draws 10,000 data points from the normal distribution with mean zero and standard deviation of five. Two datasets will come from Soumik (2019) with 11,123 data points, where one has book ratings and the other has book page counts. Two datasets will come from Dua and Graf (2019) with 48,842 data points, where one has the number of hours worked per week and the other has the age for different people. We provide histograms of our datasets for better understanding in Figure 1.

5.2 Quantile estimation experiments

For our quantile estimation experiments, for a given quantile π‘ž ∈ [0, 1] we consider the error of outcome π‘œ_π‘ž from one of the private methods to be |π‘œ_π‘ž βˆ’ 𝑑_π‘ž|, where 𝑑_π‘ž is the true quantile value. We use the in-built quantile function in the numpy library with the default settings to get the true quantile value. As in previous related works, we randomly sample 1000 datapoints from each dataset and run the quantile computation on each method. This process is then iterated upon 100 times and the error is averaged. We set πœ€ = 1 as in the previous works, which will require setting πœ€1 = πœ€2 = 1/2 for the call to Above Threshold in our method. We will also tighten the ranges to the following: [βˆ’5, 5] for the uniform dataset, [βˆ’25, 25] for the normal dataset, [0, 10] for the ratings dataset, [0, 10000] for the pages dataset, [0, 100] for the hours dataset, and [0, 100] for the ages dataset. Given that EMQ suffers in performance when many datapoints are equal, we add small independent noise to our non-synthetic datasets. This noise will be from the normal distribution with standard deviation 0.001 for the ratings dataset and 0.1 for the other three that have integer values. Our method does not require the noise addition but we will use the perturbed dataset for fair comparison. True quantiles are still computed upon the original data. For the datasets with integer values we rounded each output to the nearest integer. For our method we set 𝛽 = 1.001 for all datasets.

Figure 1: Histograms for each of our datasets: (a) uniform, (b) Gaussian, (c) book ratings, (d) book pages, (e) census ages, (f) census hours.
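The following short Python sketch spells out one reading of this evaluation protocol; it is an assumption about the setup, not our benchmarking code, and the names mean_abs_error and estimator are our own.

```python
import numpy as np

def mean_abs_error(data, estimator, q, trials=100, sample_size=1000, seed=0):
    # Average |o_q - t_q| over repeated random subsamples, as in Section 5.2.
    rng = np.random.default_rng(seed)
    errors = []
    for _ in range(trials):
        sample = rng.choice(data, size=sample_size, replace=False)
        t_q = np.quantile(sample, q)  # numpy's in-built quantile, default settings
        o_q = estimator(sample, q)    # e.g. UQE or EMQ with eps = 1
        errors.append(abs(o_q - t_q))
    return float(np.mean(errors))
```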
For these experiments we only compare our method UQE and the previous method EMQ, using the implementations from Gillenwater et al. (2021). The other procedures, discussed in the appendix, are more generalized and thus for this specific setting do not perform nearly as well, which can be seen in the previous experiments Gillenwater et al. (2021); Kaplan et al. (2022), so we omit them from our results. For this experiment we consider estimating each quantile from 5% to 95% at a 1% interval. In Figure 2 we plot the mean absolute error of each normalized by the mean absolute error of UQE to make for an easier visualization.

Figure 2: Plots of UQE = (mean absolute error of UQE) / (mean absolute error of UQE) and EMQ = (mean absolute error of EMQ) / (mean absolute error of UQE). Normalizing in this way will make for an easier visualization. When the EMQ curve is below the UQE curve its error is lower, and when the EMQ curve is above the UQE curve its error is higher.

As we can see in Figure 2, EMQ consistently performs better on synthetic data, and UQE tends to perform better on the non-synthetic data. This fits with our intuition that UQE will be best suited to situations where the data is unstructured and less is known about the dataset beforehand, because our guess-and-check methodology is designed to better handle ill-behaving datasets.

5.3 Sum estimation experiments

As the primary application of our method we will also be considering differentially private sum computation, which can thereby compute the mean as well. We will be using the following 2πœ€-DP general procedure for computing the sum of non-negative data (a code sketch is given after Table 1):

1. Let Ξ” = PrivateQuantile(π‘₯, π‘ž, πœ€) where PrivateQuantile is any differentially private quantile computation algorithm and π‘ž β‰ˆ 1.
2. Output Lap(Ξ”/πœ€) + Ξ£_{𝑗=1}^{𝑛} min(π‘₯_𝑗, Ξ”).

We further test this upon the non-synthetic datasets. For large-scale privacy systems that provide statistical analysis for a wide variety of datasets, if we use a PrivateQuantile that requires an upper bound then we ideally want this bound to be agnostic to the dataset. This is particularly true for sum computations upon groupby queries, as the range and size can differ substantially amongst groups. As such, we fix the range at [0, 10000] to encompass all the datasets. We will otherwise use the same general setup as in Section 5.2. We will measure the error of this procedure as the absolute difference between the output and the true sum of the dataset. For each of the 100 iterations of choosing 1000 samples randomly from the full dataset, we also add Lap noise 100 times. Averaging over all these iterations gives our mean absolute error.

| Privacy | Method | Ratings data | Pages data | Ages data | Work hours data |
|---|---|---|---|---|---|
| πœ€ = 1 | UQE | 4.78 (Β±0.21) | 4385.23 (Β±2077.16) | 103.05 (Β±16.04) | 180.48 (Β±44.92) |
| | EMQ | 5.73 (Β±0.54) | 4324.38 (Β±2343.25) | 187.06 (Β±33.55) | 339.06 (Β±75.82) |
| | AT | 8.75 (Β±0.31) | 4377.45 (Β±2340.93) | 293.11 (Β±117.32) | 471.98 (Β±174.07) |
| πœ€ = 0.5 | UQE | 9.22 (Β±0.31) | 7102.34 (Β±3093.13) | 180.61 (Β±27.03) | 277.89 (Β±77.60) |
| | EMQ | 6906.13 (Β±7123.36) | 7601.47 (Β±3963.52) | 678.22 (Β±1931.11) | 2131.80 (Β±4669.11) |
| | AT | 29.01 (Β±115.04) | 7491.97 (Β±4367.31) | 473.19 (Β±238.19) | 582.58 (Β±369.29) |
| πœ€ = 0.1 | UQE | 44.59 (Β±1.79) | 21916.37 (Β±6423.75) | 821.77 (Β±157.68) | 981.10 (Β±219.66) |
| | EMQ | 45861.79 (Β±28366.46) | 46552.09 (Β±27944.18) | 50558.32 (Β±28843.72) | 47185.58 (Β±29437.22) |
| | AT | 11351.77 (Β±21423.40) | 30830.49 (Β±20422.05) | 14490.06 (Β±25268.71) | 9928.18 (Β±20159.17) |

Table 1: Mean absolute error for differentially private sum estimation. The standard deviation over the 100 iterations is given in parentheses. UQE = our unbounded quantile estimation method. EMQ = the exponential mechanism based quantile estimation method. AT = the aggregate tree method for quantile estimation. For our method we only use π‘ž = 0.99. For the others we use the best performance for π‘ž ∈ {0.95, 0.96, 0.97, 0.98, 0.99}.
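The following minimal Python sketch of the 2πœ€-DP procedure above is illustrative only; the private_quantile argument stands in for any differentially private quantile routine such as UQE, and all names are our own.

```python
import numpy as np

def private_sum(x, q, eps, private_quantile, seed=None):
    rng = np.random.default_rng(seed)
    # Step 1: spend eps to privately estimate a high quantile as the clip bound.
    clip = private_quantile(x, q, eps)
    # Step 2: clip, sum, and add Laplace noise scaled to the clip bound, since
    # changing one non-negative point moves the clipped sum by at most clip.
    clipped_sum = np.minimum(np.asarray(x, dtype=float), clip).sum()
    return clipped_sum + rng.laplace(scale=clip / eps)
```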
We will run this procedure with πœ€ ∈ {0.1, 0.5, 1}. Further, we will always use π‘ž = 0.99 for our method, but give the absolute error for the best performing π‘ž ∈ {0.95, 0.96, 0.97, 0.98, 0.99} for the other methods. It is important to note that this value would have to be chosen ahead of time, which would add more error to the other methods. The previous methods we consider here are again EMQ, but also the aggregate tree (AT) method, where we use the implementation along with the generally best performing height (3) and branching factor (10) from Gillenwater et al. (2021). We also implemented the bounding technique using the interquartile range within Algorithm 1 of Smith (2011), but this performed notably worse than the others so we omitted the results from our table.

As we can see in Table 1, our method is far more robust and accurate. Furthermore, for our method the choice of π‘ž remained constant, and we can see that our results still stayed consistently accurate when πœ€ changed. Note that the noise added to the clipped sum is also scaled proportional to πœ€, so the amount the error increased as πœ€ decreased for our method is what would be expected proportionally. Once again these findings are consistent with our intuition. Our technique is more robust to differing datasets and privacy parameters, and especially better performing for this important use case. Recall that the sampled data had size 1000, so dividing accordingly can give the error on mean estimates.

There is a long line of literature on differentially private mean estimation.[3] To our knowledge, all of these more complex algorithms either require assumptions upon the data distribution, such as sub-Gaussian or bounded moments, or bounds upon the data range or related parameters, and most often require both. These results also focus upon proving strong theoretical guarantees of accuracy with respect to asymptotic sample complexity. We first note that our approach will provide better initial bounds upon the data, as seen in our experiments, which directly improve the theoretical guarantees in the results that require a data range. But also our focus here is upon practical methods that are agnostic to data distributions and more widely applicable to real-world data. Consequently, a rigorous comparison among all of these methods would be untenable and outside the scope of this work.

[3] See Kamath et al. (2022) for an extensive review of this literature.

5.4 Parameter tuning

We kept our 𝛽 parameter fixed in all experiments for consistency but also to make our method data agnostic. However, our choice was aggressively small in order to achieve higher precision in the inner quantile estimation comparison in Section 5.2. This choice was still highly resilient to changes in πœ€ for our sum experiments, as we see our error only scaled proportional to the increase in noise. But for more significant decreases in πœ€ or in the data size, i.e. the conditions under which all private algorithms suffer substantial accuracy loss, this choice of 𝛽 could be too small. Those settings imply that the noise added is larger and the distance between queries and threshold shrinks, so our method is more likely to terminate earlier than desired. Smaller 𝛽 values will then intensify this issue as the candidate values increase more slowly.
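As a rough illustration of this trade-off (our own arithmetic, not from the experiments), the number of guess-and-check queries needed for the candidate 𝛽^𝑖 to reach a value 𝐡 is ⌈log 𝐡 / log π›½βŒ‰:

```python
import math

def iters_to_reach(B, beta):
    # Queries needed for the candidate beta^i to first reach the value B.
    return math.ceil(math.log(B) / math.log(beta))

print(iters_to_reach(1e21, 1.001))  # 48379, consistent with Section 4.2
print(iters_to_reach(1e21, 1.01))   # 4860: ~10x fewer queries, but a coarser grid
```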
For the clipping application, we generally think using a value of 𝛽 = 1.01 would be a more stable choice. In fact, replicating our empirical testing with this value actually improves our results in Table 1. Furthermore, increasing 𝛽 will also reduce the number of queries and thus the computational cost. In general, we find that setting 𝛽 ∈ [1.001, 1.01] gives good performance, with 𝛽 = 1.01 as a default for computational efficiency. We leave to future work a more thorough analysis of this parameter to fully optimize setting it with relation to data size and privacy parameter. Additionally, all our methods and proofs only require an increasing sequence of candidate values, and it is possible that other potential sequences would be even more effective. For example, if tight upper and lower bounds are known on the data, such as within the recursive computation of multiple quantiles, then it likely makes more sense to simply uniformly partition the interval and check in increasing order. But we leave more consideration upon this to future work as well.

Our choice of π‘ž = 0.99 in the differentially private sum experiments was also to maintain consistency with the choices for the previous method, but also to keep the variance of the estimates lower. This will create some negative bias as we then expect to clip the data. As we can see in our illustrative example Figure 3, the PDF will exponentially decrease once we pass the true quantile value, but will do so less sharply once all the queries have value 𝑛. Accordingly, setting π‘ž = 1.0 would add slightly more variance to the estimation, but initial testing showed improvement in error. However, if the user would prefer slightly higher variance to avoid negative bias, then setting the threshold at 𝑛 or even 𝑛 + 1/πœ€ would make it far more likely that the process terminates with a value slightly above the maximum. This is particularly useful for heavy-tailed data, where clipping at the 99th percentile can have an out-sized impact on the bias.

Acknowledgments and Disclosure of Funding

We thank our colleagues Matthew Clegg, James Honaker, and Kenny Leftin for helpful discussions and feedback. We also thank anonymous reviewers for their helpful feedback.

References

Alabi, D., Ben-Eliezer, O., and Chaturvedi, A. (2022). Bounded space differentially private quantiles. arXiv preprint arXiv:2201.03380.

Barthe, G., Fong, N., Gaboardi, M., GrΓ©goire, B., Hsu, J., and Strub, P.-Y. (2016a). Advanced probabilistic couplings for differential privacy. In Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pages 55–67.

Barthe, G., Gaboardi, M., GrΓ©goire, B., Hsu, J., and Strub, P.-Y. (2016b). Proving differential privacy via probabilistic couplings. In Proceedings of the 31st Annual ACM/IEEE Symposium on Logic in Computer Science, pages 749–758.

Bassily, R., Thakkar, O., and Guha Thakurta, A. (2018). Model-agnostic private learning. Advances in Neural Information Processing Systems, 31.

Blum, A., Morgenstern, J., Sharma, A., and Smith, A. (2015). Privacy-preserving public information for sequential games. In Proceedings of the 2015 Conference on Innovations in Theoretical Computer Science, pages 173–180.

Bun, M., Nissim, K., Stemmer, U., and Vadhan, S. (2015). Differentially private release and learning of threshold functions. In 2015 IEEE 56th Annual Symposium on Foundations of Computer Science, pages 634–649. IEEE.
Bun, M. and Steinke, T. (2016). Concentrated differential privacy: Simplifications, extensions, and lower bounds. In Theory of Cryptography: 14th International Conference, TCC 2016-B, Beijing, China, October 31–November 3, 2016, Proceedings, Part I, pages 635–658. Springer.

Bun, M., Steinke, T., and Ullman, J. (2017). Make up your mind: The price of online queries in differential privacy. In Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 1306–1325. SIAM.

Cesar, M. and Rogers, R. (2021). Bounding, concentrating, and truncating: Unifying privacy loss composition for data analytics. In Algorithmic Learning Theory, pages 421–457. PMLR.

Chan, T.-H. H., Shi, E., and Song, D. (2011). Private and continual release of statistics. ACM Transactions on Information and System Security (TISSEC), 14(3):1–24.

Chen, Y., Machanavajjhala, A., Reiter, J. P., and Barrientos, A. F. (2016). Differentially private regression diagnostics. In ICDM, pages 81–90.

Cummings, R., Kearns, M., Roth, A., and Wu, Z. S. (2015). Privacy and truthful equilibrium selection for aggregative games. In Web and Internet Economics: 11th International Conference, WINE 2015, Amsterdam, The Netherlands, December 9-12, 2015, Proceedings 11, pages 286–299. Springer.

Cummings, R., Krehbiel, S., Lut, Y., and Zhang, W. (2020). Privately detecting changes in unknown distributions. In International Conference on Machine Learning, pages 2227–2237. PMLR.

Ding, Z., Kifer, D., Steinke, T., Wang, Y., Xiao, Y., Zhang, D., et al. (2021). The permute-and-flip mechanism is identical to report-noisy-max with exponential noise. arXiv preprint arXiv:2105.07260.

Ding, Z., Wang, Y., Xiao, Y., Wang, G., Zhang, D., and Kifer, D. (2023). Free gap estimates from the exponential mechanism, sparse vector, noisy max and related algorithms. The VLDB Journal, 32(1):23–48.

Dong, J., Durfee, D., and Rogers, R. (2020). Optimal differential privacy composition for exponential mechanisms. In International Conference on Machine Learning, pages 2597–2606. PMLR.

Dua, D. and Graf, C. (2019). UCI machine learning repository. http://archive.ics.uci.edu/ml.

Durfee, D. and Rogers, R. M. (2019). Practical differentially private top-k selection with pay-what-you-get composition. Advances in Neural Information Processing Systems, 32.

Dwork, C., Feldman, V., Hardt, M., Pitassi, T., Reingold, O., and Roth, A. L. (2015). Preserving statistical validity in adaptive data analysis. In Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing, pages 117–126.

Dwork, C., Kenthapadi, K., McSherry, F., Mironov, I., and Naor, M. (2006a). Our data, ourselves: Privacy via distributed noise generation. In Advances in Cryptology-EUROCRYPT 2006: 24th Annual International Conference on the Theory and Applications of Cryptographic Techniques, St. Petersburg, Russia, May 28-June 1, 2006. Proceedings 25, pages 486–503. Springer.

Dwork, C., McSherry, F., Nissim, K., and Smith, A. (2006b). Calibrating noise to sensitivity in private data analysis. In Theory of Cryptography: Third Theory of Cryptography Conference, TCC 2006, New York, NY, USA, March 4-7, 2006. Proceedings 3, pages 265–284. Springer.

Dwork, C., Naor, M., Pitassi, T., and Rothblum, G. N. (2010). Differential privacy under continual observation. In Proceedings of the Forty-Second ACM Symposium on Theory of Computing, pages 715–724.
Dwork, C., Naor, M., Reingold, O., Rothblum, G. N., and Vadhan, S. (2009). On the complexity of differentially private data release: efficient algorithms and hardness results. In Proceedings of the Forty-First Annual ACM Symposium on Theory of Computing, pages 381–390.

Feldman, V. and Steinke, T. (2017). Generalization for adaptively-chosen estimators via stable median. In Conference on Learning Theory, pages 728–757. PMLR.

Gillenwater, J., Joseph, M., and Kulesza, A. (2021). Differentially private quantiles. In International Conference on Machine Learning, pages 3713–3722. PMLR.

Hardt, M. and Rothblum, G. N. (2010). A multiplicative weights mechanism for privacy-preserving data analysis. In 2010 IEEE 51st Annual Symposium on Foundations of Computer Science, pages 61–70. IEEE.

Hasidim, A., Kaplan, H., Mansour, Y., Matias, Y., and Stemmer, U. (2020). Adversarially robust streaming algorithms via differential privacy. Advances in Neural Information Processing Systems, 33:147–158.

Hsu, J., Roth, A., and Ullman, J. (2013). Differential privacy for the analyst via private equilibrium computation. In Proceedings of the Forty-Fifth Annual ACM Symposium on Theory of Computing, pages 341–350.

Kamath, G., Mouzakis, A., Singhal, V., Steinke, T., and Ullman, J. (2022). A private and computationally-efficient estimator for unbounded gaussians. In Conference on Learning Theory, pages 544–572. PMLR.

Kaplan, H., Ligett, K., Mansour, Y., Naor, M., and Stemmer, U. (2020a). Privately learning thresholds: Closing the exponential gap. In Conference on Learning Theory, pages 2263–2285. PMLR.

Kaplan, H., Mansour, Y., and Stemmer, U. (2021). The sparse vector technique, revisited. In Conference on Learning Theory, pages 2747–2776. PMLR.

Kaplan, H., Schnapp, S., and Stemmer, U. (2022). Differentially private approximate quantiles. In International Conference on Machine Learning, pages 10751–10761. PMLR.

Kaplan, H., Sharir, M., and Stemmer, U. (2020b). How to find a point in the convex hull privately. SoCG.

Ligett, K., Neel, S., Roth, A., Waggoner, B., and Wu, S. Z. (2017). Accuracy first: Selecting a differential privacy level for accuracy constrained ERM. Advances in Neural Information Processing Systems, 30.

Lyu, M., Su, D., and Li, N. (2017). Understanding the sparse vector technique for differential privacy. Proceedings of the VLDB Endowment, 10(6).

McKenna, R. and Sheldon, D. R. (2020). Permute-and-flip: A new mechanism for differentially private selection. Advances in Neural Information Processing Systems, 33:193–203.

McSherry, F. and Talwar, K. (2007). Mechanism design via differential privacy. In 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS'07), pages 94–103. IEEE.

Nandi, A. and Bassily, R. (2020). Privately answering classification queries in the agnostic PAC model. In Algorithmic Learning Theory, pages 687–703. PMLR.

Nissim, K., Raskhodnikova, S., and Smith, A. (2007). Smooth sensitivity and sampling in private data analysis. In Proceedings of the Thirty-Ninth Annual ACM Symposium on Theory of Computing, pages 75–84.

Nissim, K. and Stemmer, U. (2018). Clustering algorithms for the centralized and local models. In Algorithmic Learning Theory, pages 619–653. PMLR.

Nissim, K., Stemmer, U., and Vadhan, S. (2016). Locating a small cluster privately. In Proceedings of the 35th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, pages 413–427.

Roth, A. and Roughgarden, T. (2010). Interactive privacy via the median mechanism. In Proceedings of the Forty-Second ACM Symposium on Theory of Computing, pages 765–774.
Sajed, T. and Sheffet, O. (2019). An optimal private stochastic-MAB algorithm based on optimal private stopping rule. In International Conference on Machine Learning, pages 5579–5588. PMLR.

Shokri, R. and Shmatikov, V. (2015). Privacy-preserving deep learning. In Proceedings of the 22nd ACM SIGSAC Conference on Computer and Communications Security, pages 1310–1321.

Smith, A. (2011). Privacy-preserving statistical estimation with optimal convergence rates. In Proceedings of the Forty-Third Annual ACM Symposium on Theory of Computing, pages 813–822.

Soumik (2019). Goodreads-books dataset. https://www.kaggle.com/jealousleopard/goodreadsbooks.

Steinke, T. and Ullman, J. (2016). Between pure and approximate differential privacy. Journal of Privacy and Confidentiality.

Ullman, J. (2015). Private multiplicative weights beyond linear queries. In Proceedings of the 34th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems, pages 303–312.

A Maximal Quantiles Estimation

In this section, we specifically consider applying our unbounded quantile estimation to computing the highest quantiles. This is commonly needed for clipping the dataset to privately compute the sum and mean, which we empirically test in Section 5.3. Our primary goal here is to build intuition upon why our approach fundamentally gives more accurate and robust estimations of the highest quantiles when only loose upper bounds on the data are known. We first give more details upon the previous methods, particularly the EMQ method. Then we give a more detailed look at how these methods are negatively affected by loose upper bounds and why ours performs well in comparison.

A.1 Previous techniques

The most effective and practical previous method for quantile estimation, EMQ, builds a distribution over an assumed bounded range [π‘Ž, 𝑏] through an invocation of the exponential mechanism. Assuming the data is sorted, it will partition the range based upon the data and select an interval [π‘₯_𝑗, π‘₯_(𝑗+1)] with probability proportional to

exp(βˆ’πœ€|𝑗 βˆ’ π‘žπ‘›|/2) Β· (π‘₯_(𝑗+1) βˆ’ π‘₯_𝑗)

where the intervals [π‘Ž, π‘₯_1] and [π‘₯_𝑛, 𝑏] are also considered. A uniformly random point is then drawn from within the selected interval. Note that this utility function is not monotonic and will not enjoy the same improved privacy bounds.

There are other approaches that recursively partition the data range while computing statistics on each partition, then aggregate these statistics across logarithmic levels to reduce the noise for quantile computation Dwork et al. (2010); Chan et al. (2011). Another technique is to instead utilize the local sensitivity while maintaining privacy by using a smoothing procedure that can also be used for computing quantiles Nissim et al. (2007). There is also a similar approach to ours within Chen et al. (2016) for bounding an unbounded dataset that increases the bounds by a factor of 2 each iteration, but we will discuss in Section A.2 why this variant performs substantially worse.

A.2 Effect of bounds upon maximum quantiles

As mentioned, EMQ will construct a probability distribution over the range [π‘Ž, 𝑏]. The corresponding PDF is a unimodal step function of the intervals defined by the data, with the peak interval [π‘₯_𝑗, π‘₯_(𝑗+1)] being such that 𝑗 is closest to π‘žπ‘›. Note then that if 𝑏 ≫ π‘₯_𝑛 then the probability of selecting [π‘₯_𝑛, 𝑏] increases dramatically. The exponential decay as the PDF moves away from the peak interval diminishes the impact of [π‘₯_𝑛, 𝑏] significantly if 𝑛 is far away from π‘žπ‘›.
However, for computing the highest quantiles, we want π‘ž close to 1 by definition, and the looseness of the upper bound drastically affects the accuracy. In contrast, as soon as the candidate value for our method exceeds the true quantile value, each successive query will have an output of at least π‘žπ‘›, which is the threshold. The probability of continuing for πœ† more queries then decreases exponentially in πœ†.[4]

[4] We expect πœ† to most likely be small, and this only adds a factor of 𝛽^πœ† additional error. Note that setting 𝛽 too high, such as 𝛽 = 2 which gives a similar algorithm to Chen et al. (2016), implies that even taking five additional queries will lead to a value that is over 32 times too large.

For ease of comparison, we can modify our Algorithm 2 such that in the last step, the output is drawn uniformly from [𝛽^(π‘˜βˆ’1), 𝛽^π‘˜], assuming 𝓁 = 1 for simplicity. This would then create a continuous probability distribution that would similarly be a step function with each interval [𝛽^(π‘˜βˆ’1), 𝛽^π‘˜] being a step. For better intuition we plot the approximate PDF of each in Figure 3. We use the Gumbel noise in the call to Above Threshold to take advantage of the closed form expression in Lemma B.2 for easier PDF computation. As we can see in Figure 3, the upper bound 𝑏 increasing from 10 to 20 dramatically increases the probability of selecting a point in [π‘₯_𝑛, 𝑏], which changes the normalization for the other intervals, significantly altering the PDF of EMQ. Given that our method is unbounded, the PDF for UQE stays the same in both figures and sees the expected exponential decay once the candidate quantile passes the true quantile.

Figure 3: Illustrative example of how the approximate PDF of the previous method, EMQ, is affected by looser upper bounds compared to our unbounded method UQE. A small amount of data was drawn uniformly from [0, 10] and we set π‘ž = 0.9, so accurate output would be about 9. The left and right side assumed a range of [0, 20] and [0, 10] respectively for EMQ.

There are also alternative methods for bounding the data range by computing the interquartile range more accurately and scaling up. However these methods make strong assumptions upon the data distribution being close to the normal distribution, as well as bounds upon the third moment Smith (2011). For real-world datasets, the tails of the data can vary significantly, and scaling up in a data-agnostic manner can often be similarly inaccurate. We also found this to be true in our experiments. The smooth sensitivity framework will run into similar issues when π‘ž is closer to 1, because fewer data points need to be changed to push the quantile value to the upper bound. If this upper bound is large, then the smooth local sensitivity will still be substantial. Aggregate tree methods are slightly more robust to loose upper bounds as they aren't partitioning specific to the data. However, for accurate estimates the partitioning of the range still requires the data be well-distributed across the evenly split partitions, which is significantly affected by loose upper bounds.

B Improved analysis for Sparse Vector Technique

In this section we complete the analysis for improving the privacy bounds for the Sparse Vector Technique with monotonic queries. We will first give a more generalized characterization of query properties by which we can improve the privacy bounds for Above Threshold.
While this characterization is more complex, we will immediately utilize it in Section C to improve the privacy bounds of our methods under an alternate common definition of neighboring. Additionally, the class of queries that we apply it to there is not monotonic, so improvements can be made in a multitude of settings. Next we will review the definition of range-bounded, a property of privacy mechanisms that can be used to improve composition bounds, and apply it to our setting. Finally we will utilize these results to show that the privacy analysis of the Sparse Vector Technique can be improved more generally, but also specifically for monotonic queries.

B.1 Generalized characterization

We provide a more general characterization for the privacy loss of Above Threshold that is specific to the input stream of queries. This will be achieved by providing a one-sided privacy parameter for each pair of neighboring datasets where order matters. For a given pair π‘₯, π‘₯β€², our goal will be to upper bound the output distribution of π‘₯ by the output distribution of π‘₯β€² up to an exponential factor of the one-sided privacy parameter. The specificity of the parameter to the neighboring datasets will reduce conciseness, but it is for this reason that we will be able to further improve the privacy analysis. This precise characterization will also help improve bounds for our method in Section C without requiring onerous analysis.

Definition B.1. For a stream of queries {𝑓_𝑖} with sensitivity Ξ”, we define our one-sided privacy loss for neighboring datasets π‘₯, π‘₯β€² as

πœ€(π‘₯, π‘₯β€²) = max_π‘˜ ( (πœ€1/Ξ”) Β· Ξ”_π‘˜(π‘₯, π‘₯β€²) + (πœ€2/Ξ”) Β· max{0, Ξ”_π‘˜(π‘₯, π‘₯β€²) βˆ’ (𝑓_π‘˜(π‘₯β€²) βˆ’ 𝑓_π‘˜(π‘₯))} )

where Ξ”_π‘˜(π‘₯, π‘₯β€²) = max_{𝑖<π‘˜} max{0, 𝑓_𝑖(π‘₯β€²) βˆ’ 𝑓_𝑖(π‘₯)}.

Given that our definition is meant to encompass the one-sided privacy loss for Above Threshold, we will now prove that fact for Expo and Gumbel noise (and Lap follows equivalently to Expo). We first prove this claim for Expo noise.

Lemma B.1. For a stream of queries {𝑓_𝑖} with sensitivity Ξ”, threshold 𝑇 and neighboring datasets π‘₯, π‘₯β€², if we run Algorithm 1 with Expo noise, then for any given outcome (βŠ₯^(π‘˜βˆ’1), ⊀) we have

Pr[AboveThreshold(π‘₯, {𝑓_𝑖}, 𝑇) = (βŠ₯^(π‘˜βˆ’1), ⊀)] ≀ exp(πœ€(π‘₯, π‘₯β€²)) Β· Pr[AboveThreshold(π‘₯β€², {𝑓_𝑖}, 𝑇) = (βŠ₯^(π‘˜βˆ’1), ⊀)]

Proof. Let 𝜈_𝑖 ∼ Expo(Ξ”/πœ€2) denote the noise drawn for query 𝑓_𝑖, and let 𝜈 ∼ Expo(Ξ”/πœ€1) denote the noise drawn for the threshold. Further let 𝜈_{𝑖≀π‘˜} denote all 𝜈_𝑖 such that 𝑖 ≀ π‘˜. By construction, we have

Pr[AboveThreshold(π‘₯, {𝑓_𝑖}, 𝑇) = (βŠ₯^(π‘˜βˆ’1), ⊀)] = Pr_{𝜈, 𝜈_{𝑖≀π‘˜}}[max_{𝑖<π‘˜} 𝑓_𝑖(π‘₯) + 𝜈_𝑖 < 𝑇 + 𝜈 < 𝑓_π‘˜(π‘₯) + 𝜈_π‘˜]

We will fix the randomness of 𝜈_{𝑖<π‘˜}, denoting 𝜈_𝑖 such that 𝑖 < π‘˜, for datasets π‘₯ and π‘₯β€², and let 𝜏 = max_{𝑖<π‘˜} 𝑓_𝑖(π‘₯) + 𝜈_𝑖 and 𝜏′ = max_{𝑖<π‘˜} 𝑓_𝑖(π‘₯β€²) + 𝜈_𝑖. It suffices then to show that for any fixed randomness 𝜈_{𝑖<π‘˜} we have

Pr_{𝜈, 𝜈_π‘˜}[𝜏 < 𝑇 + 𝜈 < 𝑓_π‘˜(π‘₯) + 𝜈_π‘˜] ≀ exp(πœ€(π‘₯, π‘₯β€²)) Β· Pr_{𝜈, 𝜈_π‘˜}[𝜏′ < 𝑇 + 𝜈 < 𝑓_π‘˜(π‘₯β€²) + 𝜈_π‘˜]

Let Ξ”1 = max{0, 𝜏′ βˆ’ 𝜏} and Ξ”2 = max{0, Ξ”1 βˆ’ (𝑓_π‘˜(π‘₯β€²) βˆ’ 𝑓_π‘˜(π‘₯))}, and let πœˆβ€² = 𝜈 + Ξ”1 and πœˆβ€²_π‘˜ = 𝜈_π‘˜ + Ξ”2.
It suffices to show that for every pair of draws 𝜈, 𝜈_π‘˜ that satisfies 𝜏 < 𝑇 + 𝜈 < 𝑓_π‘˜(π‘₯) + 𝜈_π‘˜, the corresponding πœˆβ€², πœˆβ€²_π‘˜ are i) non-negative, ii) satisfy 𝜏′ < 𝑇 + πœˆβ€² < 𝑓_π‘˜(π‘₯β€²) + πœˆβ€²_π‘˜, and iii) satisfy

𝑝_Expo(𝜈; Ξ”/πœ€1) Β· 𝑝_Expo(𝜈_π‘˜; Ξ”/πœ€2) ≀ exp(πœ€(π‘₯, π‘₯β€²)) Β· 𝑝_Expo(πœˆβ€²; Ξ”/πœ€1) Β· 𝑝_Expo(πœˆβ€²_π‘˜; Ξ”/πœ€2)

We note that condition iii) does not require any scalar on the RHS, as our change of variables is a shift for each, so the Jacobian of the mapping is the identity matrix with determinant 1. Condition i) follows from the fact that 𝜈, 𝜈_π‘˜ are non-negative because they're drawn from the exponential distribution, and Ξ”1, Ξ”2 are non-negative by construction. For condition ii), if 𝜏 < 𝑇 + 𝜈 we must have 𝜏′ < 𝑇 + 𝜈 + Ξ”1 by definition of Ξ”1. Similarly, if 𝑇 + 𝜈 < 𝑓_π‘˜(π‘₯) + 𝜈_π‘˜ then 𝑇 + 𝜈 + Ξ”1 < 𝑓_π‘˜(π‘₯β€²) + 𝜈_π‘˜ + Ξ”2 because Ξ”2 β‰₯ Ξ”1 βˆ’ (𝑓_π‘˜(π‘₯β€²) βˆ’ 𝑓_π‘˜(π‘₯)). For condition iii), the PDF of the exponential distribution implies 𝑝_Expo(𝜈; Ξ”/πœ€1) ≀ exp(πœ€1Ξ”1/Ξ”) Β· 𝑝_Expo(πœˆβ€²; Ξ”/πœ€1) and 𝑝_Expo(𝜈_π‘˜; Ξ”/πœ€2) ≀ exp(πœ€2Ξ”2/Ξ”) Β· 𝑝_Expo(πœˆβ€²_π‘˜; Ξ”/πœ€2). Due to the fixing of the randomness of 𝜈_{𝑖<π‘˜}, we have that 𝜏′ βˆ’ 𝜏 ≀ max_{𝑖<π‘˜}(𝑓_𝑖(π‘₯β€²) βˆ’ 𝑓_𝑖(π‘₯)), so Ξ”1 ≀ Ξ”_π‘˜(π‘₯, π‘₯β€²) from Definition B.1 and Ξ”2 ≀ max{0, Ξ”_π‘˜(π‘₯, π‘₯β€²) βˆ’ (𝑓_π‘˜(π‘₯β€²) βˆ’ 𝑓_π‘˜(π‘₯))}, which implies condition iii).

This proof then easily applies to Lap as well. The proof for Gumbel will differ though, as we no longer have similar closeness of PDF properties, but it will follow from the fact that we can give a closed form expression for each probability.

Lemma B.2. Given a dataset π‘₯, a stream of queries {𝑓_𝑖} and threshold 𝑇, for any outcome (βŠ₯^(π‘˜βˆ’1), ⊀) from running Above Threshold with Gumbel noise and πœ€ = πœ€1 = πœ€2, we have

Pr[AboveThreshold(π‘₯, {𝑓_𝑖}, 𝑇) = (βŠ₯^(π‘˜βˆ’1), ⊀)] = [exp((πœ€/Ξ”)𝑓_π‘˜(π‘₯)) / (exp((πœ€/Ξ”)𝑇) + Ξ£_{𝑖=1}^{π‘˜} exp((πœ€/Ξ”)𝑓_𝑖(π‘₯)))] Β· [exp((πœ€/Ξ”)𝑇) / (exp((πœ€/Ξ”)𝑇) + Ξ£_{𝑖=1}^{π‘˜βˆ’1} exp((πœ€/Ξ”)𝑓_𝑖(π‘₯)))]

Proof. We add noise Gumbel(Ξ”/πœ€) to all 𝑓_𝑖(π‘₯) and the threshold 𝑇, and in order to output (βŠ₯^(π‘˜βˆ’1), ⊀) we must have that, among the first π‘˜ queries and the threshold, the noisy value of 𝑓_π‘˜(π‘₯) is the largest and the noisy threshold is the second largest. It is folklore in the literature that adding Gumbel noise (with the same noise parameter) and choosing the largest index is equivalent to the exponential mechanism. This can also be extended to showing that adding Gumbel noise and choosing the top-k indices in order is equivalent to the peeling exponential mechanism. The peeling exponential mechanism first selects the top index using the exponential mechanism and removes it from the candidate set, repeating iteratively until accumulating the top-k indices. A formal proof of this folklore result is also provided in Lemma 4.2 of Durfee and Rogers (2019). Applying this result and the definition of the exponential mechanism gives the desired equality.

We then utilize this closed form expression to get our desired one-sided privacy bounds when using Gumbel noise.

Lemma B.3. For a stream of queries {𝑓_𝑖} with sensitivity Ξ”, threshold 𝑇 and neighboring datasets π‘₯, π‘₯β€², if we run Algorithm 1 with Gumbel noise with πœ€1 = πœ€2, then for any outcome (βŠ₯^(π‘˜βˆ’1), ⊀) we have

Pr[AboveThreshold(π‘₯, {𝑓_𝑖}, 𝑇) = (βŠ₯^(π‘˜βˆ’1), ⊀)] ≀ exp(πœ€(π‘₯, π‘₯β€²)) Β· Pr[AboveThreshold(π‘₯β€², {𝑓_𝑖}, 𝑇) = (βŠ₯^(π‘˜βˆ’1), ⊀)]

Proof. Let πœ€ = πœ€1 = πœ€2.
We then utilize this closed-form expression to obtain our desired one-sided privacy bounds when using Gumbel noise.

Lemma B.3. For a stream of queries {f_i} with sensitivity Δ, threshold T and neighboring datasets x, x', if we run Algorithm 1 with Gumbel noise and ε_1 = ε_2, then for any outcome {⊥^(k−1), ⊤} we have

    Pr[AboveThreshold(x, {f_i}, T) = {⊥^(k−1), ⊤}] ≤ exp(ε(x, x')) Pr[AboveThreshold(x', {f_i}, T) = {⊥^(k−1), ⊤}]

Proof. Let ε = ε_1 = ε_2. From Lemma B.2 we know that

    Pr[AboveThreshold(x, {f_i}, T) = {⊥^(k−1), ⊤}]
        = [ exp((ε/Δ) f_k(x)) · exp((ε/Δ) T) ] / [ ( exp((ε/Δ) T) + Σ_{i=1}^{k} exp((ε/Δ) f_i(x)) ) · ( exp((ε/Δ) T) + Σ_{i=1}^{k−1} exp((ε/Δ) f_i(x)) ) ]

Similar to the proof of Lemma B.1, we let Δ_1 = max{0, max_{i<k}(f_i(x') − f_i(x))} and Δ_2 = max{0, Δ_1 + (f_k(x) − f_k(x'))}. We first want to show that

    exp((ε/Δ) T) / ( exp((ε/Δ) T) + Σ_{i=1}^{k−1} exp((ε/Δ) f_i(x)) ) ≤ exp(εΔ_1/Δ) · exp((ε/Δ) T) / ( exp((ε/Δ) T) + Σ_{i=1}^{k−1} exp((ε/Δ) f_i(x')) )    (1)

This inequality reduces to showing exp((ε/Δ) T) + Σ_{i=1}^{k−1} exp((ε/Δ) f_i(x')) ≤ exp(εΔ_1/Δ) ( exp((ε/Δ) T) + Σ_{i=1}^{k−1} exp((ε/Δ) f_i(x)) ). This holds from the fact that exp((ε/Δ) T) ≤ exp(εΔ_1/Δ) exp((ε/Δ) T) because εΔ_1/Δ ≥ 0, and exp((ε/Δ) f_i(x')) ≤ exp(εΔ_1/Δ) exp((ε/Δ) f_i(x)) for all i < k by our definition of Δ_1. Next we want to show that

    exp((ε/Δ) f_k(x)) / ( exp((ε/Δ) T) + Σ_{i=1}^{k} exp((ε/Δ) f_i(x)) ) ≤ exp(εΔ_2/Δ) · exp((ε/Δ) f_k(x')) / ( exp((ε/Δ) T) + Σ_{i=1}^{k} exp((ε/Δ) f_i(x')) )    (2)

First we let Δ'_1 = max{0, max_{𝓁≤k}(f_𝓁(x') − f_𝓁(x))} and Δ'_2 = f_k(x) − f_k(x'), and we will achieve our desired inequality (2) by bounding the numerator with respect to Δ'_2 and the denominator with respect to Δ'_1. We have exp((ε/Δ) f_k(x)) = exp(εΔ'_2/Δ) exp((ε/Δ) f_k(x')) by definition of Δ'_2. Further, exp((ε/Δ) f_i(x')) ≤ exp(εΔ'_1/Δ) exp((ε/Δ) f_i(x)) for all i ≤ k by definition of Δ'_1, and exp((ε/Δ) T) ≤ exp(εΔ'_1/Δ) exp((ε/Δ) T) because εΔ'_1/Δ ≥ 0. In order to show our desired inequality (2) it then suffices to show Δ_2 ≥ Δ'_1 + Δ'_2. By definition we know Δ_2 ≥ Δ_1 + Δ'_2. Furthermore, if Δ'_1 > Δ_1 then Δ'_1 = f_k(x') − f_k(x) = −Δ'_2, which implies Δ'_1 + Δ'_2 = 0, and we know Δ_2 ≥ 0 by construction. Combining inequalities (1) and (2), along with the fact that Δ_1 = Δ_k(x, x') from Definition B.1, completes the proof.

Now that we have bounded the one-sided privacy loss between neighboring datasets for each type of noise, this immediately implies an overall privacy bound over all neighbors that utilizes our general characterization.

Corollary B.1. For a stream of queries {f_i} with sensitivity Δ and threshold T, Algorithm 1 with Lap, Gumbel, or Expo noise is ε-DP where

    ε = max_{x ~ x'} ε(x, x')

with the maximum taken over ordered pairs of neighboring datasets, and for Gumbel noise we must have ε_1 = ε_2.

B.2 Range-bounded definition and properties

The definition of range-bounded was originally introduced in Durfee and Rogers (2019) to tighten the privacy bounds on the composition of exponential mechanisms. The analysis was further extended in Dong et al. (2020) to give the optimal composition of range-bounded mechanisms. This definition was then unified with other definitions in Cesar and Rogers (2021).

Definition B.2 (Durfee and Rogers (2019)). A mechanism M is ε-range-bounded if for any neighboring datasets x, x' and outcomes y, y':

    Pr[M(x) = y] / Pr[M(x') = y] ≤ e^ε · Pr[M(x) = y'] / Pr[M(x') = y']

If we have bounds on this property that are stronger than those immediately implied by DP, then it was shown in Dong et al. (2020) that substantial improvements can be made in bounding the overall DP of the composition of these mechanisms. While those improvements can be applied to our results here as well, we focus on privacy bounds with respect to zCDP, as it is much cleaner to work with in providing strong composition bounds. From the proof of Lemma 3.4 in Cesar and Rogers (2021) we see that the range-bounded property gives a significant improvement in zCDP guarantees compared to similar DP guarantees.

Proposition 2 (Cesar and Rogers (2021)). If a mechanism M is ε-range-bounded then it is (ε²/8)-zCDP.
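To see the size of this improvement concretely, the following one-liner (our own illustration, not code from the paper) compares the zCDP parameter given by Proposition 2 with the standard (ε²/2)-zCDP guarantee implied by ε-DP alone.

def zcdp_params(eps):
    # An ε-range-bounded mechanism is (ε²/8)-zCDP by Proposition 2, while
    # ε-DP alone only gives (ε²/2)-zCDP: a 4x smaller zCDP parameter,
    # and the savings compound over composition.
    return {"range_bounded": eps**2 / 8, "dp_only": eps**2 / 2}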
Our general characterization is already set up to provide range-bounded properties of Above Threshold. This allows us to give tighter guarantees on range-boundedness, which can be translated into better privacy composition bounds.

Corollary B.2. For a stream of queries {f_i} and threshold T, Algorithm 1 with Lap, Gumbel, or Expo noise is ε-range-bounded where

    ε = max_{x ~ x'} ( ε(x, x') + ε(x', x) )

and for Gumbel noise we must have ε_1 = ε_2.

Proof. We can equivalently rewrite the inequality from Definition B.2 as

    Pr[M(x) = y] · Pr[M(x') = y'] ≤ e^ε · Pr[M(x') = y] · Pr[M(x) = y']

Our claim then follows immediately by applying Lemma B.1 and Lemma B.3.

B.3 Improved composition analysis of SVT

In this section we complete our analysis for improving the privacy bounds of the Sparse Vector Technique. While our generalized characterization was stated in a more complex form, we simplify it here for general queries and for monotonic queries. This will then match the privacy guarantees that we gave in Section 3 for Above Threshold.

Lemma B.4. Given a stream of queries {f_i} with sensitivity Δ, for any neighboring datasets x, x' we have ε(x, x') ≤ ε_1 + 2ε_2. If the queries are monotonic, then for any neighboring datasets ε(x, x') ≤ ε_1 + ε_2, and furthermore ε(x, x') + ε(x', x) ≤ ε_1 + 2ε_2.

Proof. By Definition 2.2 we must always have Δ_k(x, x') ≤ Δ for any neighboring datasets. Furthermore, we must also have |f_k(x') − f_k(x)| ≤ Δ for any neighboring datasets, so Δ_k(x, x') − (f_k(x') − f_k(x)) ≤ 2Δ. This implies ε(x, x') ≤ ε_1 + 2ε_2 for any neighboring datasets.

Now assume the queries are monotonic. Without loss of generality, assume f_k(x) ≥ f_k(x') for all k. Therefore Δ_k(x, x') = 0 and f_k(x) − f_k(x') ≤ Δ, which implies ε(x, x') ≤ ε_2. Similarly, we have Δ_k(x', x) ≤ Δ but also f_k(x) − f_k(x') ≥ 0, which implies ε(x', x) ≤ ε_1 + ε_2. Combining these implies max{ε(x, x'), ε(x', x)} ≤ ε_1 + ε_2 and ε(x, x') + ε(x', x) ≤ ε_1 + 2ε_2 for any neighboring datasets.

This lemma along with Corollary B.1 immediately implies Corollary 3.1 and Lemma 3.2.

Proof of Theorem 1. From Lemma B.4 and Corollary B.2 we have that Algorithm 1 is (ε_1 + 2ε_2)-range-bounded. Applying Proposition 2 then gives the zCDP guarantees.

The Sparse Vector Technique is simply an iterative call to Above Threshold, and as such its privacy bounds come from composition. By analyzing the composition through zCDP, which has become commonplace in the privacy community, we can immediately improve the privacy guarantees for monotonic queries in the Sparse Vector Technique. Furthermore, we can also improve the privacy guarantees under the standard definition using the improved composition bounds for range-bounded mechanisms from Dong et al. (2020).

We also note here that the generalized Sparse Vector Technique given in Lyu et al. (2017) only adds noise once to the threshold and only has to scale the noise for the queries, and not the threshold, with the number of calls to Above Threshold. Our analysis can extend to give the same properties for Lap and Expo noise, but utilizing advanced composition properties will give far more accuracy. More specifically, the threshold and queries having noise proportional to √c/ε is preferable to all the queries having noise proportional to c/ε with only the threshold noise proportional to 1/ε, where c is the number of calls to Above Threshold.
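As a concrete sketch of this composition accounting (our own illustration; the function name and the conversion choice are assumptions, not from the paper), the following computes an approximate-DP guarantee for c calls to Above Threshold by combining Theorem 1 with additive zCDP composition and the standard zCDP-to-(ε, δ) conversion of Bun and Steinke (2016).

import math

def svt_approx_dp(c, eps_1, eps_2, delta):
    # Each call to Above Threshold is (eps_1 + 2*eps_2)-range-bounded, hence
    # ((eps_1 + 2*eps_2)**2 / 8)-zCDP by Proposition 2; zCDP composes additively.
    rho = c * (eps_1 + 2 * eps_2) ** 2 / 8
    # rho-zCDP implies (rho + 2*sqrt(rho * ln(1/delta)), delta)-DP.
    return rho + 2 * math.sqrt(rho * math.log(1 / delta))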
B.4 Iterative exponential mechanism

It is folklore that the peeling exponential mechanism is equivalent to taking the top-k after adding Gumbel noise, as formally shown in Lemma 4.2 of Durfee and Rogers (2019). We show a similar property here: Above Threshold with Gumbel noise is equivalent to iteratively running the exponential mechanism.

Algorithm 3 Iterative Exponential Mechanism
Require: Input dataset x, a stream of queries {f_i : X → ℝ} with sensitivity Δ, and a threshold T
1: for each query i do
2:     Run the exponential mechanism over the candidate set {0, ..., i} where q(x, 0) = T and q(x, j) = f_j(x) for all j > 0
3:     if i is selected then
4:         Output ⊤ and halt
5:     else
6:         Output ⊥
7:     end if
8: end for

Lemma B.5. If ε_1 = ε_2 = ε/2 then Algorithm 1 with Gumbel noise gives an output distribution equivalent to that of Algorithm 3.

Proof. We first define

    p_k = exp((ε/(2Δ)) f_k(x)) / ( exp((ε/(2Δ)) T) + Σ_{i=1}^{k} exp((ε/(2Δ)) f_i(x)) )

By construction, the probability of Algorithm 3 outputting {⊥^(k−1), ⊤} is equal to p_k · Π_{i=1}^{k−1} (1 − p_i). Furthermore, by our definition of p_k we have

    1 − p_k = ( exp((ε/(2Δ)) T) + Σ_{i=1}^{k−1} exp((ε/(2Δ)) f_i(x)) ) / ( exp((ε/(2Δ)) T) + Σ_{i=1}^{k} exp((ε/(2Δ)) f_i(x)) )

Through telescoping cancellation,

    Π_{i=1}^{k−1} (1 − p_i) = exp((ε/(2Δ)) T) / ( exp((ε/(2Δ)) T) + Σ_{i=1}^{k−1} exp((ε/(2Δ)) f_i(x)) )

From Lemma B.2 we then see that the output probabilities are equivalent.

C Additional unbounded quantile estimation results

In this section we first extend our method to the fully unbounded setting by simply making two calls to Above Threshold, essentially searching through the positive and negative numbers in the respective calls. Next we show how our methods extend to the add-subtract definition of neighboring. Further, we apply our approach to the framework set up in Kaplan et al. (2022) for computing multiple quantiles and extend it to the swap definition.

C.1 Fully unbounded quantile estimation

Recall that our unbounded quantile estimation algorithm assumed that the data was lower bounded. It then slowly increased that bound by a small percentage until the appropriate amount of data fell below the threshold for the given quantile. In order to extend to the fully unbounded setting, we will simply first apply this guess-and-check method to the positive numbers and then apply it to the negative numbers.

Algorithm 4 Fully unbounded quantile mechanism
Require: Input dataset x, a quantile q, and parameter β > 1
1: Run Above Threshold with x, T = qn and f_i(x) = |{x_j ∈ x : x_j + 1 < β^i}|
2: Run Above Threshold with x, T = (1 − q)n and f_i(x) = |{x_j ∈ x : x_j − 1 > −β^i}|
3: If the first Above Threshold halts at k > 0 then output β^k − 1
4: If the second Above Threshold halts at k > 0 then output −β^k + 1
5: Otherwise return 0

Note that the second call could equivalently be achieved by flipping the sign of all the datapoints and again applying queries f_i(x) = |{x_j ∈ x : x_j + 1 < β^i}|. Therefore all our privacy guarantees from Section 4 still apply, now composing over two calls to Above Threshold, which is the Sparse Vector Technique. In the first call to Above Threshold we are assuming a lower bound of 0 and searching the positive numbers. If it terminates immediately, then it is likely that more than a q fraction of the data is below 0. We then symmetrically search through the negative numbers by assuming an upper bound of 0. If this also halts immediately, then it is likely that the quantile is already near 0.
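The following is a minimal Python sketch of Algorithm 4 (our own rendering, assuming Expo noise and sensitivity Δ = 1). For readability it recomputes each query with a pass over the data rather than using the bucketed one-pass implementation of Section D, and it skips the second call when the first already halts at k > 0, since the second call's output would then be unused.

import numpy as np

def fully_unbounded_quantile(data, q, b, eps_1, eps_2):
    n = len(data)

    def above_threshold(values, T):
        # Above Threshold with Expo noise; query f_i counts values with v + 1 < b**i.
        rho = np.random.exponential(1 / eps_1)  # threshold noise, Δ = 1
        i = 0
        while True:
            f_i = sum(1 for v in values if v + 1 < b**i)
            if f_i + np.random.exponential(1 / eps_2) >= T + rho:
                return i
            i += 1

    k = above_threshold(data, q * n)  # line 1: search the positive numbers
    if k > 0:
        return b**k - 1
    k = above_threshold([-v for v in data], (1 - q) * n)  # line 2: flip signs
    if k > 0:
        return -(b**k - 1)
    return 0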
We could also apply other variants of this, such as three total calls to get both the maximum and minimum of the data if we wanted full data bounds. Further note that computing the smallest quantiles will be challenging for our algorithm, because it may take many queries to reach the appropriate threshold, and each query will have a reasonable chance of halting early if q is very small. To account for these quantiles in the lower-bounded setting, we can invert all the datapoints, instead search for the quantile 1 − q on this transformed data, and then invert our resulting estimate. We would also need to reduce our parameter β a reasonable amount in this setting.

C.2 Extension to add-subtract neighbors

We also extend our results to the add-subtract definition of neighboring datasets.

Definition C.1. Datasets x, x' are neighboring if one of them can be obtained from the other by adding or removing one individual's data.

Under this definition, n is no longer fixed across neighboring datasets, so neither is our threshold T = qn; we will instead fold it into each query by setting f_i(x) = |{x_j ∈ x : x_j − ℓ + 1 < β^i}| − qn. Unfortunately this query is no longer monotonic, so we instead take advantage of our more general characterization in Section B.1 to further tighten the privacy bounds in this setting.

Lemma C.1. Given the stream of queries f_i(x) = |{x_j ∈ x : x_j − ℓ + 1 < β^i}| − qn with sensitivity Δ = 1, for any neighboring datasets x, x' under Definition C.1 and quantile q ∈ [0, 1] we have ε(x, x') ≤ max{(1 − q)ε_1, qε_1 + ε_2}.

Proof. Without loss of generality, assume that x contains one more individual's data, so x ∈ ℝ^n and x' ∈ ℝ^(n−1). Let g_i(x) = |{x_j ∈ x : x_j − ℓ + 1 < β^i}|. By construction we must have g_i(x) − g_i(x') ∈ {0, 1} because x has one additional datapoint. Furthermore, because the thresholds β^i are increasing, if g_k(x) − g_k(x') = 1 then g_i(x) − g_i(x') = 1 for all i > k, and similarly if g_k(x) − g_k(x') = 0 then g_i(x) − g_i(x') = 0 for all i < k. We further see that f_i(x) − f_i(x') = g_i(x) − g_i(x') − q for all i.

First consider an index k with g_k(x) − g_k(x') = 0. We therefore have g_i(x) − g_i(x') = 0 for all i < k, so Δ_k(x, x') = q, and also Δ_k(x, x') − (f_k(x') − f_k(x)) = 0, so the term for k in ε(x, x') is at most qε_1. Similarly, we have Δ_k(x', x) = 0 and Δ_k(x', x) − (f_k(x) − f_k(x')) = q, so the term for k in ε(x', x) is at most qε_2.

Next consider an index k with g_k(x) − g_k(x') = 1. We then have Δ_k(x, x') ≤ q, and also f_k(x') − f_k(x) = q − 1, which implies Δ_k(x, x') − (f_k(x') − f_k(x)) ≤ 1, so the term for k in ε(x, x') is at most qε_1 + ε_2. Similarly, we have Δ_k(x', x) ≤ 1 − q and Δ_k(x', x) − (f_k(x) − f_k(x')) ≤ 0, so the term for k in ε(x', x) is at most (1 − q)ε_1.

Combining these bounds over all indices k and both orderings gives our desired result.

With these improved bounds we can then show that our methods also extend to the add-subtract definition of neighboring, with tighter privacy guarantees.

Corollary C.1. If we run Algorithm 2 with ε_1 = ε_2 for a quantile q under Definition C.1, then it is (qε_1 + ε_2)-DP.

Proof. Combining Lemma C.1 and Corollary B.1 immediately implies the desired result.
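To make the improvement concrete with a hypothetical instantiation (the numbers here are ours, purely for illustration): for a maximal quantile such as q = 0.99 with ε_1 = ε_2 = ε, Corollary C.1 gives

    max{(1 − 0.99)ε, 0.99ε + ε} = 1.99ε

whereas the generic bound of Lemma B.4 for non-monotonic queries would only give ε_1 + 2ε_2 = 3ε.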
C.3 Extension to multiple quantile estimation

As previously established, the framework in Kaplan et al. (2022) is agnostic to the single-quantile estimation method. However, their proof is for add-subtract neighbors, though they note that it can easily be extended to swap neighbors. For completeness, we also discuss here how it can be extended. Under the swap neighboring definition, at each level of the recursive partitioning scheme either two partitions differ under the add-subtract definition, or one partition differs under the swap definition. Applying their framework to our algorithm, we will instead compute all the thresholds in advance of the splitting. Accordingly, these thresholds remain fixed. If the thresholds are fixed, then we can actually further improve our privacy bounds for the add-subtract definition. Once again we use our more general characterization from Section B.1.

Lemma C.2. Given the stream of queries f_i(x) = |{x_j ∈ x : x_j − ℓ + 1 < β^i}| with sensitivity Δ = 1, for any neighboring datasets x, x' under Definition C.1 we have ε(x, x') ≤ max{ε_1, ε_2}.

Proof. Without loss of generality, assume that x contains one more individual's data, so x ∈ ℝ^n and x' ∈ ℝ^(n−1). By construction we must have f_i(x) − f_i(x') ∈ {0, 1} because x has one additional datapoint. Furthermore, because the thresholds β^i are increasing, if f_k(x) − f_k(x') = 0 then f_i(x) − f_i(x') = 0 for all i < k, so the case of f_k(x) − f_k(x') = 0 is easy, as all the relevant differences vanish. Instead consider an index k with f_k(x) − f_k(x') = 1. We know Δ_k(x, x') = 0, so Δ_k(x, x') − (f_k(x') − f_k(x)) = 1, and hence the term for k in ε(x, x') is at most ε_2. Further, we must have Δ_k(x', x) ≤ 1 and Δ_k(x', x) − (f_k(x) − f_k(x')) ≤ 0, so the term for k in ε(x', x) is at most ε_1. Combining these implies ε(x, x'), ε(x', x) ≤ max{ε_1, ε_2} for any neighboring datasets.

Applying these bounds, we see that applying the framework of Kaplan et al. (2022) under the swap definition leads, at each level, either to one composition of (ε_1 + ε_2)-DP under the swap definition or to two compositions of max{ε_1, ε_2}-DP under the add-subtract definition. Setting ε_1 = ε_2, we then have that the privacy cost of computing m quantiles with this framework is equivalent to log(m + 1) compositions of (ε_1 + ε_2)-DP.

D Implementation code

For ease of implementation, we provide simple Python code to run our method, with access to a noise generator of the user's choosing. In our experiments we used exponential noise from the numpy library.

import math
from collections import defaultdict

import numpy as np

def noise(scale):
    # Exponential noise, as used in our experiments; any suitable
    # noise generator can be substituted here.
    return np.random.exponential(scale)

def unbounded_quantile(data, l, b, q, eps_1, eps_2):
    # Bucket each datapoint by the power of b its shifted value falls under.
    d = defaultdict(int)
    for x in data:
        i = int(math.log(x - l + 1, b))
        d[i] += 1
    # Noisy threshold for Above Threshold.
    t = q * len(data) + noise(1 / eps_1)
    cur, i = 0, 0
    while True:
        cur += d[i]
        i += 1
        # Halt once the noisy count of points below b**i crosses the threshold.
        if cur + noise(1 / eps_2) > t:
            break
    return b**i + l - 1  # undo the shift x -> x - l + 1
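As a hypothetical usage example (the dataset and every parameter choice below are illustrative assumptions, not settings from our experiments), estimating the 0.99-quantile of non-negative heavy-tailed data with lower bound l = 0 and multiplicative step b = 1.001 might look as follows. A smaller b gives finer-grained candidate values at the cost of more iterations before halting.

import numpy as np

data = np.random.lognormal(mean=3.0, sigma=1.5, size=10_000)
estimate = unbounded_quantile(data, l=0, b=1.001, q=0.99, eps_1=0.5, eps_2=0.5)
print(estimate)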