# Private Frequency Estimation via Projective Geometry

Vitaly Feldman*1, Jelani Nelson*2, Huy Nguyen*3, Kunal Talwar*1

Abstract. In this work, we propose a new algorithm Projective Geometry Response (PGR) for locally differentially private (LDP) frequency estimation. For a universe size of k and with n users, our ϵ-LDP algorithm has communication cost log2 k bits in the private coin setting and ϵ log2 e + O(1) bits in the public coin setting, and has computation cost O(n + k exp(ϵ) log k) for the server to approximately reconstruct the frequency histogram, while achieving the optimal privacy-utility tradeoff (not just asymptotically). In many parameter settings used in practice this is a significant improvement over the O(n + k²) computation cost achieved by the recent PI-RAPPOR algorithm (Feldman and Talwar, 2021). Our empirical evaluation shows a speedup of over 50x over PI-RAPPOR while using approximately 75x less memory for practically relevant parameter settings. In addition, the running time of our algorithm is within an order of magnitude of Hadamard Response (Acharya, Sun, and Zhang, 2019) and Recursive Hadamard Response (Chen, Kairouz, and Ozgur, 2020), which have significantly worse reconstruction error. The error of our algorithm essentially matches that of the communication- and time-inefficient but utility-optimal Subset Selection (SS) algorithm (Ye and Barg, 2017). Our new algorithm is based on using projective planes over a finite field to define a small collection of sets that are close to being pairwise independent, together with a dynamic programming algorithm for approximate histogram reconstruction on the server side. We also give an extension of PGR that allows trading off computation time with utility smoothly.

*Equal contribution. 1Apple, Cupertino, CA, USA; 2UC Berkeley, CA, USA; 3Northeastern University, MA, USA.
Correspondence to: Vitaly Feldman, Jelani Nelson, Huy Nguyen, Kunal Talwar. Proceedings of the 39th International Conference on Machine Learning, Baltimore, Maryland, USA, PMLR 162, 2022. Copyright 2022 by the author(s).

1. Introduction

In the so-called federated setting, user data is distributed over many devices that each communicate with some central server, after some local processing, for downstream analytics and/or machine learning tasks. We desire schemes that (1) minimize communication cost, (2) maintain privacy of the user data while still providing utility to the server, and (3) support efficient algorithms for the server to extract knowledge from messages sent by the devices. Such settings have found applications to training language models for applications such as autocomplete and spellcheck, and other analytics applications in Apple iOS (Thakurta et al., 2017) and analytics on settings in Google Chrome (Erlingsson et al., 2014).

The gold standard for protecting privacy is for a scheme to satisfy differential privacy. In the so-called local model that is relevant to the federated setting, there are n users, with each user i holding some data d_i ∈ D. Each user then uses its own private randomness r_i and data d_i to run a local randomizer algorithm that produces a random message M_i to send to the server. We say the scheme is ϵ-differentially private if for all users i, any possible message m, and any d ≠ d′,

Pr(M_i = m | d_i = d) ≤ e^ϵ · Pr(M_i = m | d_i = d′).

Note a user could simply send an unambiguous encoding of d_i, which allows the server to learn d_i exactly (perfect utility), but privacy is not preserved; such a scheme is not ϵ-DP for any finite ϵ. On the opposite extreme, the user could simply send a uniformly random message that is independent of d_i, which provides zero utility but perfect privacy (ϵ = 0). One can hope to develop schemes that smoothly increase utility by relaxing privacy (i.e., by increasing ϵ).
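To make the definition concrete, the following small sketch (ours, not from the paper) numerically checks the ϵ-LDP inequality for the classical k-ary randomized response randomizer; the function names are our own.

```python
import math

def krr_prob(k, eps, d, m):
    """P(M = m | input d) for k-ary randomized response: report the true value
    with probability e^eps/(e^eps + k - 1), else a uniform different value."""
    e = math.exp(eps)
    return e / (e + k - 1) if m == d else 1.0 / (e + k - 1)

def is_eps_ldp(k, eps):
    """Check Pr(M = m | d) <= e^eps * Pr(M = m | d') for all d, d', m."""
    e = math.exp(eps)
    return all(krr_prob(k, eps, d, m) <= e * krr_prob(k, eps, dp, m) + 1e-12
               for d in range(k) for dp in range(k) for m in range(k))

assert is_eps_ldp(8, 1.0)
```

The worst-case ratio is attained when m equals d but not d′, where it is exactly e^ϵ.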
This work addresses the problem of designing efficient schemes for locally differentially private frequency estimation. In this problem, one defines a histogram x ∈ R^k where x_d is the number of users i with d_i = d, and k = |D|. From the n randomized messages it receives, the server would like to approximately reconstruct the histogram, i.e., compute some x̂ such that ‖x̂ − x‖ is small with good probability over the randomness r = (r_1, ..., r_n), for some norm ‖·‖. Our goal is to design schemes that obtain the best-known privacy-utility trade-offs, while being efficient in terms of communication, computation time, and memory. In this work we measure utility loss as the mean squared error (MSE) E_r[(1/k)·‖x̂ − x‖²₂], with lower MSE yielding higher utility. Note that such a scheme should specify both the local randomizer employed by users and the reconstruction algorithm used by the server.

There are several known algorithms for this problem; see Table 1. To summarize, optimal utility was achieved in prior work by Subset Selection (matching the lower bound of (Ye & Barg, 2019), including the leading constant factor of 4), with roughly similar utility achieved by the RAPPOR algorithm (Erlingsson et al., 2014) that is based on the classical binary randomized response (Warner, 1965). Unfortunately, both RAPPOR and Subset Selection have a very high communication cost of k·H(1/(e^ϵ + 1)) bits, where H is the binary entropy function, and server-side running time of O(nk/exp(ϵ)). Large k is common in practice, e.g., k may be the size of a lexicon when estimating word frequencies to train language models.
This has led to numerous and still ongoing efforts to design low-communication protocols for the problem (Hsu et al., 2012; Erlingsson et al., 2014; Bassily & Smith, 2015; Kairouz et al., 2016; Wang et al., 2019; 2017; Ye & Barg, 2017; Acharya et al., 2019; Bun et al., 2019; Bassily et al., 2020; Chen et al., 2020; Feldman & Talwar, 2021; Shah et al., 2021). One simple approach to achieving low communication and computational complexity is to use the simple k-ary Randomized Response algorithm (e.g., (Wang et al., 2017)). Unfortunately, its utility loss is suboptimal by up to an Ω(k/e^ϵ) factor; recall k is often large and ϵ is at most a small constant, and thus this represents a large increase in utility loss. In the ϵ < 1 regime, asymptotically optimal utility bounds are known to be achievable with low communication and computational costs (Bassily & Smith, 2015; Bun et al., 2019; Bassily et al., 2020). The first low-communication algorithm that achieves asymptotically optimal bounds in the ϵ > 1 regime is given in (Wang et al., 2017). It communicates O(ϵ) bits and relies on shared randomness. However, it matches the bounds achieved by RAPPOR only when e^ϵ is an integer, and its computational cost is still very high and comparable to that of RAPPOR. Two algorithms, Hadamard Response (Acharya et al., 2019) and Recursive Hadamard Response (Chen et al., 2020), show that it is possible to achieve low communication, efficient computation (only Θ(log k) slower than Randomized Response), and asymptotically optimal utility. However, their utility loss in practice is suboptimal by a constant factor (e.g., our experiments show that these algorithms have an MSE that is over 2× higher for ϵ = 5 than Subset Selection; see Figure 2).
Recent work of Feldman and Talwar (Feldman & Talwar, 2021) describes a general technique for reducing the communication of a local randomizer without sacrificing utility and, in particular, derives a new low-communication algorithm for private frequency estimation via pairwise independent derandomization of RAPPOR. Their algorithm, referred to as PI-RAPPOR, achieves the same utility loss as RAPPOR and has a server-side running time of O(min(n + k², nk/exp(ϵ))). The running time of this algorithm is still prohibitively high when both n and k are large.

We remark that while goals (1)-(3) from the beginning of this section are all important, goal (2) of achieving a good privacy/utility tradeoff is unique in that poor performance cannot be mitigated by devoting more computational resources (more parallelism, better hardware, increased bandwidth, etc.). After deciding upon a required level of privacy ϵ, there is a fundamental limit on how much utility can be extracted given that level of privacy; our goal in this work is to understand whether that limit can be attained in a communication- and computation-efficient way.

Our main contributions. We give a new private frequency estimation algorithm Projective Geometry Response (PGR) that maintains optimal utility and low communication while significantly improving computational efficiency amongst algorithms with similarly good utility. Using our ideas, we additionally give a new reconstruction algorithm that can be used with the PI-RAPPOR mechanism to speed up its runtime from O(k²/exp(ϵ)) to O(k exp(2ϵ) log k) (albeit, this runtime is still slower than PGR's reconstruction algorithm by an exp(ϵ) factor). We also show a general approach that can further improve the server-side runtime at the cost of slightly higher reconstruction error, giving a smooth tradeoff: for any prime 2 ≤ q ≤ exp(ϵ) + 1, we can get running time O(n + qk log k) with error only (1 + 1/(q − 1)) times larger than the best known bound¹.
Note that for q = 2 we recover the bounds achieved by HR and RHR. Our mechanisms require log2 k bits per device in the private coin model, or ϵ log2 e + O(1) bits in the public coin model (see Appendix B). As in previous work, our approximate reconstruction algorithm for the server is also parallelizable, supporting linear speedup for any number of processors P ≤ min{n, k exp(ϵ)}. We also perform an empirical evaluation of our algorithms and prior work and show that the error of our algorithm indeed matches the state of the art while still being time-efficient. As has been observed in previous work (Acharya et al.,

¹For both PGR and HPGR we have stated runtime bounds assuming that certain quantities involving k and exp(ϵ) are prime powers. If this is not the case, runtimes may increase by a factor of exp(ϵ) for PGR, or q for HPGR; we note that PI-RAPPOR also has this feature.

| scheme name | communication | utility loss | server time |
|---|---|---|---|
| Randomized Response | log2 k | n(2e^ϵ + k)/(e^ϵ − 1)² | n + k |
| RAPPOR (Erlingsson et al., 2014) | k | 4ne^ϵ/(e^ϵ − 1)² | n·k/e^ϵ |
| Subset Selection (Ye & Barg, 2017; Wang et al., 2019) | (k/e^ϵ)·(ϵ + O(1)) | 4ne^ϵ/(e^ϵ − 1)² | n·k/e^ϵ |
| PI-RAPPOR (Feldman & Talwar, 2021) | log2 k + O(ϵ) | 4ne^ϵ/(e^ϵ − 1)² | min(n + k², n·k/e^ϵ), or n + ke^{2ϵ} log k (this work) |
| Hadamard Response (Acharya et al., 2019) | log2 k | 36ne^ϵ/(e^ϵ − 1)² | n + k log k |
| Recursive Hadamard Response (Chen et al., 2020) | log2 k | 8ne^ϵ/(e^ϵ − 1)² | n + k log k |
| Projective Geometry Response | log2 k | 4ne^ϵ/(e^ϵ − 1)² | n + ke^ϵ log k |
| Hybrid Projective Geometry Response | log2 k | (1 + 1/(q − 1))·4ne^ϵ/(e^ϵ − 1)² | n + kq log k |

Table 1. Known local-DP schemes for private frequency estimation compared with ours. Utility bounds are given up to 1 + o_k(1) multiplicative accuracy for ease of display and running times are asymptotic. For brevity we only state bounds for ϵ ≤ log k.
Some of the algorithms assume k is either a power of 2 or some other prime power and otherwise potentially worsen in some parameters due to round-up issues; we ignore this issue in the table. The communication and server time for RAPPOR are random variables which are never more than k and nk, respectively, but RAPPOR can be implemented so that in expectation the communication and runtimes are asymptotically equal to those of Subset Selection. For Hybrid Projective Geometry Response, q can be chosen as any prime in [2, exp(ϵ) + 1]. The utility loss here is the proven upper bound on the variance for PGR, HR and RHR, and the analytic expression for the variance for the others. The communication bounds are in the setting of private coin protocols. As with RHR, PGR and HPGR can both also achieve improved communication in the public coin model; see Appendix B.

2019), the problem of designing a local randomizer is closely related to the question of the existence of set systems consisting of sets of density exp(−ϵ) which are highly symmetric and do not have positive pairwise dependencies. The size of the set system then determines the communication cost, and its structural properties may allow for efficient decoding. We show that projective planes over finite fields give us set systems with the desired properties, leading to low communication and a utility loss achieving the known lower bound of (Ye & Barg, 2019) (including even the leading constant). We also show a novel dynamic programming algorithm that allows us to achieve server runtime that is not much worse than the fastest known algorithms. As in a lot of recent work on this problem, we have concentrated on the setting of moderately large values for the local privacy parameter ϵ.
This is a setting of interest due to recent work on privacy amplification by shuffling (Bittau et al., 2017; Cheu et al., 2019; Erlingsson et al., 2019; Balle et al., 2019; Feldman et al., 2021), which shows that local DP responses, when shuffled across a number of users so that the server does not know which user sent which message, satisfy a much stronger central privacy guarantee. Asymptotically, ϵ-DP local randomizers aggregated over n users satisfy (O(e^{ϵ/2}·√(log(1/δ)/n)), δ)-DP. The hidden constants here are small: as an example, with n = 10,000 and ϵ = 6, shuffling gives a central DP guarantee of (0.3, 10⁻⁶)-DP. This motivation from shuffling is also the reason why our work concentrates on the setting of private coin protocols, as shared randomness seems to be incompatible with shuffling of private reports.

We note that while constant factor improvements in error may seem small, these algorithms are typically used for discovering frequent items from power law distributions. A constant factor reduction in the variance of estimating any particular item frequency then translates to a correspondingly smaller noise floor (for a fixed false positive rate, say), which in turn translates to a constant factor more items being discovered.

1.1. Related Work

A closely related problem is finding "heavy hitters", namely all elements j ∈ [k] with counts higher than some given threshold; equivalently, one wants to recover an approximate histogram x̂ such that ‖x̂ − x‖_∞ is small (the non-heavy hitters i can simply be approximated by x̂_i = 0). In this problem the goal is to avoid a linear runtime dependence on k that would result from doing frequency estimation and then checking all the estimates. This problem is typically solved using a frequency oracle, which is an algorithm that for a given j ∈ [k] returns an estimate of the number of j's held by users (typically without computing the entire histogram) (Bassily & Smith, 2015; Bassily et al., 2020; Bun et al., 2019).
Frequency estimation is also closely related to the discrete distribution estimation problem, in which inputs are sampled from some distribution over [k] and the goal is to estimate the distribution (Ye & Barg, 2017; Acharya et al., 2019). Indeed, bounds for frequency estimation can be translated directly to bounds on distribution estimation by adding the sampling error. We note that even for the problem of implementing a private frequency oracle, our PGR scheme supports answering queries faster than PI-RAPPOR by a factor of Θ(exp(ϵ)).

2. Preliminaries

Our mechanisms are based on projective spaces, and below we review some basic definitions and constructions of such spaces from standard vector spaces.

Definition 2.1. For a given vector space V, the projective space P(V) is the set of equivalence classes of V \ {0}, where 0 denotes the zero vector, under the following equivalence relation: x ∼ y iff x = cy for some scalar c. Each equivalence class is called a (projective) point of the projective space. Let p : V \ {0} → P(V) be the mapping from each vector v ∈ V to its equivalence class. If V has dimension t then P(V) has dimension t − 1.

We will also use subspaces of the projective space P(V).

Definition 2.2. A projective subspace W of P(V) is a subset of P(V) such that there is a subspace U of V where p(U \ {0}) = W. If U has dimension t then W has dimension t − 1. It should be noted that intersections of projective subspaces are projective subspaces.

Let q be a prime power and F_q^t the t-dimensional vector space over the field F_q. We will work with P(F_q^t) and its subspaces.

Definition 2.3. A vector x ∈ F_q^t is called canonical if its first non-zero coordinate is 1. Each equivalence class can be specified by its unique canonical member.

3. Projective Geometry Response: description and analysis

Our PGR scheme is an instantiation of the framework due to (Acharya et al., 2019).
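The canonical representatives of Definition 2.3 are easy to enumerate for small parameters. The following sketch (our illustration; the function name is ours) lists one canonical vector per projective point and checks that there are (q^t − 1)/(q − 1) of them:

```python
from itertools import product

def canonical_vectors(q, t):
    """One canonical vector (first nonzero coordinate = 1) per point of P(F_q^t)."""
    return [v for v in product(range(q), repeat=t)
            if next((c for c in v if c != 0), None) == 1]

q, t = 3, 3
pts = canonical_vectors(q, t)
assert len(pts) == (q ** t - 1) // (q - 1)   # 13 projective points for q = 3, t = 3
```

The count follows by summing, over the position i of the first nonzero coordinate, the q^{t−1−i} free choices of the remaining entries.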
In their framework, the local randomizer is implemented as follows. There is a universe U of outputs which we call the "message space", and each input v has a corresponding "preferred list" of messages S(v) ⊆ U. All the subsets S(v) for different values of v have the same size c_set. We use the name preferred list because when holding v, the local randomizer is more likely to send an element if it is preferred for v. More specifically, any element not in S(v) is sent with probability p, whereas each element in S(v) is sent with probability e^ϵ·p. Since the probabilities of sending the items sum to 1, we have e^ϵ·p·c_set + p·(|U| − c_set) = 1, so that p = 1/(|U| + c_set·(e^ϵ − 1)). Equivalently, given the input v, the local randomizer returns a uniformly random element of S(v) with probability e^ϵ·c_set/(|U| + c_set·(e^ϵ − 1)) and a uniformly random element of U \ S(v) with probability (|U| − c_set)/(|U| + c_set·(e^ϵ − 1)). The crux of the construction is in specifying the universe U and the subsets S(v). Small |U| is beneficial, as the communication is log |U| bits.

Now we describe the preferred list construction for our new mechanism. PGR works for k = (q^t − 1)/(q − 1) for some integer t (other values of k need to be rounded up to the nearest such value). We identify the k input values with the k canonical vectors in F_q^t and the corresponding projective points in P(F_q^t). We also identify the output values with projective points in P(F_q^t). The subsets S(v) are the (t − 2)-dimensional projective subspaces of P(F_q^t). There are (q^t − 1)/(q − 1) such (t − 2)-dimensional projective subspaces, which is the same as the number of projective points. For a canonical vector v, the set S(v) is the (t − 2)-dimensional projective subspace such that for all u ∈ p^{-1}(S(v)), we have ⟨u, v⟩ = 0. Each (t − 2)-dimensional projective subspace contains (q^{t−1} − 1)/(q − 1) projective points. In other words, each set S(v) contains (q^{t−1} − 1)/(q − 1) messages out of the universe of (q^t − 1)/(q − 1) messages.
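A minimal sketch of this local randomizer for small q and t, assuming brute-force enumeration of the projective points (function names are ours; a practical implementation would index points rather than materialize these lists on every call):

```python
import math
import random
from itertools import product

def canonical_vectors(q, t):
    """One canonical representative per projective point of P(F_q^t)."""
    return [v for v in product(range(q), repeat=t)
            if next((c for c in v if c != 0), None) == 1]

def pgr_randomize(v, q, t, eps, rng=random):
    """PGR local randomizer sketch: with probability
    e^eps * c_set / (|U| + c_set*(e^eps - 1)) send a uniform preferred message
    (a point u with <u, v> = 0 over F_q), else a uniform non-preferred message."""
    points = canonical_vectors(q, t)             # |U| = (q^t - 1)/(q - 1) messages
    pref = [u for u in points
            if sum(a * b for a, b in zip(u, v)) % q == 0]
    prefset = set(pref)
    c_set = len(pref)                            # (q^{t-1} - 1)/(q - 1)
    e = math.exp(eps)
    if rng.random() < e * c_set / (len(points) + c_set * (e - 1)):
        return rng.choice(pref)
    return rng.choice([u for u in points if u not in prefset])

msg = pgr_randomize((1, 0, 0), 3, 3, 1.0)
assert msg in canonical_vectors(3, 3)
```

The two branch probabilities match the framework's e^ϵ·p and p per-message probabilities after multiplying by the branch sizes c_set and |U| − c_set.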
An important property of the construction is the symmetry among the intersections of any two subsets S(v).

Claim 3.1. Consider a t-dimensional vector space V. The intersection of any two (t − 2)-dimensional projective subspaces of P(V) is a (t − 3)-dimensional projective subspace.

Proof. Let I be the intersection of two projective subspaces S_1 and S_2. Recall that I, S_1, S_2 are projective subspaces corresponding to subspaces of V. Assume for contradiction that the dimension d − 1 of the intersection I is lower than t − 3. Starting from a basis v_1, ..., v_d of p^{-1}(I) ∪ {0}, we can extend it with u_1, ..., u_{t−1−d} to form a basis of the subspace p^{-1}(S_1) ∪ {0}. We can also extend v_1, ..., v_d with w_1, ..., w_{t−1−d} to form a basis of p^{-1}(S_2) ∪ {0}. Because d + 2(t − 1 − d) = t + (t − 2 − d) > t, the collection of vectors v_1, ..., v_d, u_1, ..., u_{t−1−d}, w_1, ..., w_{t−1−d} must be linearly dependent: there must exist coefficients, not all zero, such that Σ_i α_i v_i + Σ_j β_j u_j + Σ_k γ_k w_k = 0. This means Σ_j β_j u_j is a non-zero vector in p^{-1}(S_1) ∩ p^{-1}(S_2) but it is not in p^{-1}(I), which is a contradiction.

Now we have that c_int = (q^{t−2} − 1)/(q − 1) is the size of the intersection of any two subsets S(v), and c_set = (q^{t−1} − 1)/(q − 1) is the size of each subset S(v). Notice that c²_set ≥ k·c_int, i.e., (c_set/c_int)² ≥ k/c_int.

As mentioned above, each user with input v sends a projective point e with probability e^ϵ·p if e is in S(v) and probability p otherwise. The server keeps the counts of the received projective points in a vector y ∈ Z^k. Thus, the total server storage is O(k). We estimate x_v by computing

x̂_v = α·Σ_{u ∈ S(v)} y_u + β·n,

where α and β are chosen so that it is an unbiased estimator. Note Σ_u y_u = n. We would like E[x̂_v] = x_v for all v. Notice that by linearity of expectation, it suffices to focus on the contribution to x̂_v from a single user. If that user's input is v, the expectation of the sum Q := Σ_{u ∈ S(v)} y_u is e^ϵ·p·c_set.
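Claim 3.1 implies that |S(v) ∩ S(w)| = c_int for every pair of distinct inputs, which can be checked directly for small parameters (illustrative code with our own names, not from the paper):

```python
from itertools import product, combinations

def canonical_vectors(q, t):
    return [v for v in product(range(q), repeat=t)
            if next((c for c in v if c != 0), None) == 1]

def preferred_set(v, q, t):
    """S(v): projective points u with <u, v> = 0 over F_q."""
    return frozenset(u for u in canonical_vectors(q, t)
                     if sum(a * b for a, b in zip(u, v)) % q == 0)

q, t = 3, 4
pts = canonical_vectors(q, t)
S = {v: preferred_set(v, q, t) for v in pts}
c_set = (q ** (t - 1) - 1) // (q - 1)   # size of each S(v)
c_int = (q ** (t - 2) - 1) // (q - 1)   # size of every pairwise intersection
assert all(len(S[v]) == c_set for v in pts)
assert all(len(S[v] & S[w]) == c_int for v, w in combinations(pts, 2))
```

For q = 3, t = 4 this checks all 780 pairs among the 40 points, each with |S(v)| = 13 and pairwise intersections of size exactly 4.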
On the other hand, if the input is not v, the expectation of the sum Σ_{u ∈ S(v)} y_u is e^ϵ·p·c_int + p·(c_set − c_int). We want α·E[Q] + β = [[input is v]], where [[T]] is defined to be 1 if T is true and 0 if false. Thus,

α·e^ϵ·p·c_set + β = 1 and α·p·((e^ϵ − 1)·c_int + c_set) + β = 0.

Substituting p and solving for α, β, we get

α = ((e^ϵ − 1)·c_set + k) / ((e^ϵ − 1)·(c_set − c_int));

β = −(((e^ϵ − 1)·c_set + k) / ((e^ϵ − 1)·(c_set − c_int))) · (((e^ϵ − 1)·c_int + c_set) / ((e^ϵ − 1)·c_set + k)) = −((e^ϵ − 1)·c_int + c_set) / ((e^ϵ − 1)·(c_set − c_int)).

We next state the variance bound, which suggests that q should be chosen close to exp(ϵ) + 1 for the best utility. The proof can be found in Appendix C.

Lemma 3.2. E[‖x̂ − x‖²₂] ≤ (n·e^ϵ·c²_set/c²_int + n·(k − 1)·((e^ϵ − 1) + c_set/c_int)²) / ((e^ϵ − 1)²·(c_set/c_int − 1)). In particular, if c_set/c_int = e^ϵ + 1 then E[(1/k)·‖x̂ − x‖²₂] ≲ 4n·e^ϵ/(e^ϵ − 1)².

Next we discuss the algorithms to compute x̂_v. The naive algorithm takes O(k·c_set) = O(k²/q) time, and this is the algorithm of choice for t ≤ 3. For t > 3, we can use dynamic programming to obtain a faster algorithm. Note in the below that q should be chosen close to exp(ϵ) + 1.

Theorem 3.3. In the Projective Geometry Response scheme, there exists an O(((q^t − 1)/(q − 1))·tq) time algorithm for server reconstruction, using O((q^t − 1)/(q − 1)) memory. These bounds are at best O(ktq) time and O(k) memory, and increase by at most a factor of q each if rounding up to the next power of q is needed so that (q^t − 1)/(q − 1) ≥ k.

Proof. We use dynamic programming. For a ∈ F_q^j, b ∈ F_q^{t−j}, z ∈ F_q, where a is further restricted to have its first nonzero entry be a 1 (it may also be the all-zeroes vector), and b is restricted to be a canonical vector when j = 0, define

f(a, b, z) = Σ_{u canonical : pref_j(u) = a, ⟨suff_{t−j}(u), b⟩ = z} y_u,

where pref_i(u) denotes the length-i prefix vector of u, and suff_i(u) denotes the length-i suffix vector of u. Then, we would like to compute x̂_v = α·f(⊥, v, 0) + β·n for all projective points v, where ⊥ denotes the length-0 empty vector.
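Before turning to the recurrence, the values of α and β above can be sanity-checked against their two defining constraints (a sketch with our own function name; any consistent k, c_set, c_int work):

```python
import math

def pgr_coefficients(k, c_set, c_int, eps):
    """alpha, beta for the unbiased estimator x_hat_v = alpha*sum_{u in S(v)} y_u + beta*n,
    solved from alpha*e^eps*p*c_set + beta = 1 and
    alpha*p*((e^eps - 1)*c_int + c_set) + beta = 0, where p = 1/(k + c_set*(e^eps - 1))."""
    e = math.exp(eps)
    denom = (e - 1) * (c_set - c_int)
    alpha = ((e - 1) * c_set + k) / denom
    beta = -((e - 1) * c_int + c_set) / denom
    return alpha, beta

# check the two single-user constraints with q = 3, t = 3 parameters
k, c_set, c_int, eps = 13, 4, 1, 1.0
e = math.exp(eps)
p = 1.0 / (k + c_set * (e - 1))
alpha, beta = pgr_coefficients(k, c_set, c_int, eps)
assert abs(alpha * e * p * c_set + beta - 1.0) < 1e-9            # input is v
assert abs(alpha * p * ((e - 1) * c_int + c_set) + beta) < 1e-9  # input is not v
```

Subtracting the second constraint from the first gives α·p·(e^ϵ − 1)·(c_set − c_int) = 1, which is where the closed forms come from.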
We next observe that f satisfies a recurrence relation, so that we can compute the full array of values (f(⊥, v, 0))_{v canonical} efficiently using dynamic programming and then efficiently obtain x̂ ∈ R^k. We now describe the recurrence relation. For w ∈ F_q and a vector v, let v∘w denote v with w appended as one extra entry. If j denotes the length of the vector a, then the base case is j = t. In this case, f(a, ⊥, z) = y_a iff both a ≠ 0 and z = 0; else, f(a, ⊥, z) = 0. The recursive step is then when 0 ≤ j < t. Essentially, we have to sum over all ways to extend a by one more coordinate. Let suff_{−1}(b) denote the vector b but with the first entry removed (so it is a vector of length one shorter). There are two cases: a is the all-zeroes vector, versus it is not. In the former case, the recurrence is

f(0, b, z) = f(0∘0, suff_{−1}(b), z) + f(0∘1, suff_{−1}(b), (z − b_1) mod q).

Note we are not allowed to append w ∈ {2, 3, ..., q − 1} to a since that would not satisfy the requirement that the first argument to f either be all-zeroes or be canonical. The other case for the recurrence relation is when a ≠ 0, in which case the recurrence relation becomes

f(a, b, z) = Σ_{w=0}^{q−1} f(a∘w, suff_{−1}(b), (z − w·b_1) mod q).

We now analyze the running time and memory requirements to obtain all f(a, b, z) values via dynamic programming. The runtime is proportional to the number of (a, b, z) triples with j > 0, times q, plus the work at j = 0. This is because for j > 0, for each (a, b, z) triple we do at most q work. When j = 0, there is only one possible value for a (namely ⊥) and k = (q^t − 1)/(q − 1) values for v, plus we are only concerned with z = 0 in this case. For larger j, the number of possibilities for a is (q^j − 1)/(q − 1) + 1 (the additive 1 is since a can be the all-zeroes vector), whereas the number of possibilities for b is q^{t−j}. Thus the total runtime is proportional to

Σ_{j=1}^{t} ((q^j − 1)/(q − 1) + 1) · q^{t−j} · q · q = O(ktq²).
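The recurrence can be checked against the naive sum on small parameters. The sketch below (our names) uses plain top-down recursion for clarity; the paper's bottom-up, memoized version, with the canonical-b optimization described next, is what attains the stated time bound:

```python
from itertools import product
import random

def canonical_vectors(q, t):
    """One canonical representative per projective point of P(F_q^t)."""
    return [v for v in product(range(q), repeat=t)
            if next((c for c in v if c != 0), None) == 1]

def f(a, b, z, y, q):
    """f(a, b, z): sum of y_u over canonical u with prefix a whose remaining
    suffix has inner product z with b (mod q)."""
    if not b:                                   # base case j = t: u = a is fully fixed
        return y.get(a, 0) if any(a) and z == 0 else 0
    if not any(a):                              # all-zeroes prefix: may append only 0 or 1
        return (f(a + (0,), b[1:], z, y, q)
                + f(a + (1,), b[1:], (z - b[0]) % q, y, q))
    return sum(f(a + (w,), b[1:], (z - w * b[0]) % q, y, q) for w in range(q))

q, t = 3, 3
pts = canonical_vectors(q, t)
y = {u: random.randint(0, 5) for u in pts}
for v in pts:  # f(⊥, v, 0) agrees with the naive O(k*c_set)-time sum over S(v)
    naive = sum(y[u] for u in pts
                if sum(ai * bi for ai, bi in zip(u, v)) % q == 0)
    assert f((), v, 0, y, q) == naive
```

Each level of the recursion peels one coordinate off b and fixes one more coordinate of the prefix a, which is exactly the structure the bottom-up DP exploits.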
For the memory requirement, note the f(·) values for some fixed j only depend on the values for j + 1, and thus using bottom-up dynamic programming we can save a factor of t in the memory, for a total memory requirement of only O(kq) (for any fixed j there are only O(k) (a, b) pairs, and there are q values for z).

Finally, we add an optimization which improves both the runtime and memory by a factor of q. Specifically, suppose b is not canonical and is not the all-zeroes vector. Let the value of its first nonzero entry be ζ. Then f(a, b, z) is equal to f(a, b/ζ, z/ζ), where the division is over F_q. Thus, we only need to compute f(·) for b either canonical or equal to the 0 vector. This reduces the number of possibilities for b from q^{t−j} to (q^{t−j} − 1)/(q − 1) + 1, which improves the runtime to O(ktq) and the memory to O(k). Note finite field division over F_q can be implemented in O(1) time after preprocessing. First, factor q − 1 and generate all its divisors in o(q) time, from which we can find a generator g of F_q^* in o(q) expected time by rejection sampling (g is a generator iff g^p ≢ 1 mod q for every nontrivial divisor p of q − 1, and we can compute g^p mod q in O(log q) time via repeated squaring). Then, in O(q) time create a lookup table A[0..q − 1] with A[i] := g^i mod q. Then create an inverse lookup table by, for each 0 ≤ i < q, setting the inverse of A[i] to A[q − 1 − i].

4. Hybrid Projective Geometry Response: trading off error and time

In this section, we describe a hybrid scheme using an intermediate value for the field size q to trade off between the variance and the running time. Roughly speaking, larger values for q lead to slower running time but also smaller variance. The approach is similar to the way (Acharya et al., 2019) extended their scheme from the high privacy regime to the general setting. We choose h, q, t such that, with b = (q^t − 1)/(q − 1) denoting the number of projective points per block, they satisfy q > 1 and bh ≥ k > c_set·h.
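Returning briefly to the proof of Theorem 3.3, the generator-based inverse-table preprocessing described there can be sketched as follows for prime q ≥ 3 (function name is ours; we use a brute-force generator test rather than the divisor-based one, which only matters for the preprocessing speed):

```python
def inverse_table(q):
    """Multiplicative inverses in F_q (q an odd prime) via a generator g of F_q^*:
    with A[i] = g^i mod q, the inverse of A[i] is A[q - 1 - i] since g^(q-1) = 1."""
    def is_generator(g):
        # brute-force order check; the divisor-based test is only a speedup
        x, seen = 1, set()
        for _ in range(q - 1):
            x = x * g % q
            seen.add(x)
        return len(seen) == q - 1
    g = next(g for g in range(2, q) if is_generator(g))
    A = [1] * q                       # A[i] = g^i; note A[q - 1] = g^(q-1) = 1 = A[0]
    for i in range(1, q):
        A[i] = A[i - 1] * g % q
    inv = [0] * q                     # inv[0] is unused
    for i in range(q):
        inv[A[i]] = A[q - 1 - i]
    return inv

inv7 = inverse_table(7)
assert all(x * inv7[x] % 7 == 1 for x in range(1, 7))
```

After this O(q) preprocessing, each division in the DP is a single table lookup plus one multiplication mod q.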
Let c_set = (q^{t−1} − 1)/(q − 1), c_int = (q^{t−2} − 1)/(q − 1), and z = c_set/c_int. Note that c²_set ≥ b·c_int and q + 1 ≥ z ≥ q. Choose h·z as close as possible to e^ϵ + 1. The k input values are partitioned into blocks of size at most b each. The algorithm's response consists of two parts: the index of the block and the index inside the block. First, the algorithm uses randomized response to report the block. Next, if the response has the correct block then the algorithm uses the scheme described in the previous section with field size q to describe the coordinate inside the block. If the first response has the wrong block then the algorithm uses a uniformly random response in the second part.

Formally, the algorithm is another instantiation of the framework of (Acharya et al., 2019), so we only need to specify the universe U of messages and the subsets S(·) for each input value. Each input value is identified with a pair (i, v) where i ∈ Z_h and v is a canonical vector in F_q^t. We divide the input values evenly among the blocks so that each block i ∈ Z_h has either ⌊k/h⌋ or ⌈k/h⌉ input values. The universe U of responses is the set of all pairs (j, u) where j ∈ Z_h and u is a canonical vector in F_q^t. The subset S(i, v) for input (i, v) consists of all messages (i, u) ∈ U such that ⟨u, v⟩ = 0. Recall from the framework that each message (j, u) ∈ S(i, v) is chosen with probability e^ϵ·p and each message (j, u) ∉ S(i, v) is chosen with probability p, for an appropriate p so that the probabilities sum to 1. Thus e^ϵ·p·c_set + p·(bh − c_set) = 1, so that p = 1/(bh + (e^ϵ − 1)·c_set).

Let x̂_{i,v} be our estimate for the frequency of input (i, v). The estimates are computed as

x̂_{i,v} = α·Σ_{u : ⟨v,u⟩ = 0} y_{i,u} + β·Σ_u y_{i,u} + γ·n.

We need to choose α, β and γ so that x̂_{i,v} is an unbiased estimator of x_{i,v}. By linearity of expectation, we only need to consider the case with exactly one user.
If the input is (i, v) then we have

E[x̂_{i,v}] = α·e^ϵ·p·c_set + β·p·((e^ϵ − 1)·c_set + b) + γ = 1.

If the input is not (i, v) but is in the same block then E[x̂_{i,v}] should equal zero, and it does equal

α·p·((e^ϵ − 1)·c_int + c_set) + β·p·((e^ϵ − 1)·c_set + b) + γ.

Finally, if the input is in a different block then

E[x̂_{i,v}] = α·p·c_set + β·p·b + γ = 0.

We solve for α, β, γ and get

α = 1/(p·(e^ϵ − 1)·(c_set − c_int)) = (bh + (e^ϵ − 1)·c_set)/((e^ϵ − 1)·(c_set − c_int)),

β = −(c_int/c_set)/(p·(e^ϵ − 1)·(c_set − c_int)) = −(bh + (e^ϵ − 1)·c_set)·c_int/((e^ϵ − 1)·(c_set − c_int)·c_set),

γ = −α·p·c_set − β·p·b = −(c²_set − b·c_int)/(c_set·(e^ϵ − 1)·(c_set − c_int)) ≤ 0.

We note that α + β = (1 − c_int/c_set)·α = (bh/c_set + (e^ϵ − 1))/(e^ϵ − 1). This fully specifies the scheme; the utility and runtime analyses can be found in Appendix D.

5. Experimental Results

In this section, we compare previously-known algorithms (RAPPOR, PI-RAPPOR, Hadamard Response (HR), Recursive Hadamard Response (RHR), Subset Selection (SS)) and our new algorithms Projective Geometry Response (PGR) and Hybrid Projective Geometry Response (HPGR). As the variance upper bounds of these algorithms do not depend on the underlying data, we perform our experiments on simple synthetic data that realize the worst case for variance. Our experiments show that Projective Geometry Response matches the best of these algorithms, namely SS, RAPPOR, and PI-RAPPOR, and achieves noticeably better MSE than other communication- and computation-efficient approaches. At the same time it is significantly more efficient than those three in terms of server computation time, while also achieving optimal communication. All experiments were run on a Dell Precision T3600 with six Intel 3.2 GHz Xeon E5-1650 cores running Ubuntu 20.04 LTS, though our implementation did not take advantage of parallelism.
We implemented all algorithms and ran experiments in C++, using the GNU C++ compiler version 9.3.0; code and details on how to run the experiments used to generate data and plots are in our public repository at https://github.com/minilek/private_frequency_oracles/.

We first performed one experiment to show the big gap in running times. We took ϵ = 5, a practically relevant setting, and n = 10,000, k = 3,307,948; this setting of n is smaller than one would see in practice, but the runtimes of the algorithms considered are all linear in n plus additional terms that depend on k and ϵ, and our aim was to measure the impact of these additive terms, which can be significant even for large n. Furthermore, in practice the server can immediately process messages from each of the n users dynamically as the messages arrive asynchronously, whereas it is the additive terms that must be paid all at once at the time of histogram reconstruction. For our settings, the closest prime to exp(ϵ) + 1 ≈ 149.4 is q = 151. Recall that PGR rounds up to universe sizes of the form (q^t − 1)/(q − 1); then (q⁴ − 1)/(q − 1) is less than 5% larger than k, so that the negative effect of such rounding on the runtime of PGR is minimal. Meanwhile, PI-RAPPOR picks the largest prime q smaller than exp(ϵ) + 1, which in this case is q = 149, and assumes universe sizes of the form q^t − 1; in this case q³ − 1 = k exactly, so rounding issues do not negatively impact the running time of PI-RAPPOR (we chose this particular value of k intentionally, to show PI-RAPPOR's performance in the best light for some fairly large universe size). The runtimes of the various algorithms with this setting of ϵ, k, n are shown in Table 2. Note RHR and HR sacrifice a constant factor in utility compared to

Figure 1.
Randomized Response has significantly worse error than other algorithms, even for moderately large universes, followed by Hadamard Response and Recursive Hadamard Response, which have roughly double the error of state-of-the-art algorithms. Hybrid Projective Geometry Response trades off having slightly worse error than state-of-the-art for faster runtime.

PI-RAPPOR and PGR, the former of which is four orders of magnitude slower, while the latter is only one order of magnitude slower and approximately 51x faster than PI-RAPPOR. Meanwhile, HPGR's runtime is of the same order of magnitude as (though roughly 5x slower than) RHR, but as we will see shortly, HPGR can provide significantly improved utility over RHR and HR.

Next we discuss error. Many of our experiments showing reconstruction error with fixed ϵ take ϵ = 5, a practically relevant setting, and universe size k = 22,000, for which the closest prime to exp(ϵ) + 1 ≈ 149.4 is q = 151. Recall that PGR rounds up to universe sizes of the form (q^t − 1)/(q − 1); then (q³ − 1)/(q − 1) = 22,953 is not much larger than k, so that the runtime of PGR is not severely impacted. Also, c_set/c_int as defined in Section 3 is very close to exp(ϵ) + 1, so that the MSE bound in Lemma 3.2 nearly matches that of SS. Furthermore, for HPGR for this setting of ϵ, k, if we choose q = 5, h = 30, t = 5, then h·(q^t − 1)/(q − 1) = 23,430, which is not much bigger than k, so that the runtime of HPGR is not majorly impacted. Furthermore, h·z as defined in Section 4 is approximately 150.19, which is very close to exp(ϵ) + 1 as recommended by Lemma D.1 to obtain minimal error.

| scheme name | runtime (in seconds) |
|---|---|
| PI-RAPPOR | 1,893.82 (approximately 31.5 minutes) |
| PGR | 36.92 |
| HPGR | 5.94 |
| RHR | 1.20 |
| HR | 0.64 |
| RR | 0.02 |

Table 2. Server runtimes for ϵ = 5, k = 3,307,948. For HPGR, we chose the parameters h = 50, q = 3, t = 11, so that the mechanism rounded up the universe size to h·(q^t − 1)/(q − 1), which is about 34% larger than k.
We also do one experiment with ϵ = 1.386 ≈ ln(4), so that PGR takes q = 5 ≈ e^ϵ + 1. We first draw attention to Figures 2a and 2b. These plots run RAPPOR, PI-RAPPOR, PGR, and SS with k, n, ϵ as in the figure and show that their error distributions are essentially equivalent. We show the plots for only some particular parameter settings, but the picture has looked essentially the same to us regardless of which parameters we have tried. In Figure 2a, we have n users each holding the same item in the universe (item 0); we call this a spike distribution as noted in the plot. We have each user apply its local randomizer to send a message to the server, and we ask the server to then reconstruct the histogram (which should be (n, 0, . . . , 0)) and calculate the MSE. We repeat this experiment 300 times, and in this plot we have 300 dots plotted per algorithm, where a dot at point (x, y) signifies that the MSE was at most y for x% of the trial runs; thus, it is a plot of the CDF of the empirical error distribution. In Figure 2b, we plot MSE as a function of increasing ϵ, where for each value of ϵ we repeat the above experiment 10 times and then plot the average MSE across those 10 trial runs. Because the error performance of RAPPOR, PI-RAPPOR, SS, and PGR is so similar, in all other plots we do not include RAPPOR and PI-RAPPOR, since their runtimes are so slow that doing extensive experiments is very time-consuming computationally (note: our implementation of RAPPOR requires O(nk) server time, though O(n(k/e^ϵ + 1)) expected time is possible by having each user transmit only a sparse encoding of the locations of the 1 bits in its message). We finally draw attention to Figures 2c to 2g. Here we run several algorithms where the distribution over the universe amongst the users is Zipfian (a power law), with power law exponent either 0.1 (an almost flat distribution) or 3.0 (rapid decay). The HPGR algorithm was run with q = 5.
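As a minimal sketch of the spike experiment just described, the following uses k-ary randomized response (the RR baseline, which needs no projective machinery) as the local randomizer, together with the standard debiasing estimator; all function names are ours:

```python
import math
import random

def rr_spike_mses(k, n, eps, trials, seed=0):
    """Spike distribution: every user holds item 0. Each user reports via
    k-ary randomized response; the server debiases the counts, and we record
    the MSE of the reconstructed histogram once per trial."""
    rng = random.Random(seed)
    p_true = math.exp(eps) / (math.exp(eps) + k - 1)  # prob. of the true item
    p_other = 1.0 / (math.exp(eps) + k - 1)           # prob. of each other item
    mses = []
    for _ in range(trials):
        counts = [0] * k
        for _ in range(n):
            item = 0 if rng.random() < p_true else rng.randrange(1, k)
            counts[item] += 1
        # debias: E[counts[v]] = n*p_other + x_v*(p_true - p_other)
        est = [(c - n * p_other) / (p_true - p_other) for c in counts]
        truth = [n] + [0] * (k - 1)
        mses.append(sum((a - b) ** 2 for a, b in zip(est, truth)) / k)
    return sorted(mses)  # plotting (x%, sorted value) traces the empirical CDF

mses = rr_spike_mses(k=1000, n=10000, eps=5.0, trials=20)
```

Plotting the sorted per-trial MSEs against their percentile rank yields CDF curves of the kind shown in Figure 2a.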
As can be seen, the qualitative behavior and relative ordering of all the algorithms is essentially unchanged by the Zipf parameter: PGR and SS always have the best error, followed by HPGR, followed by RHR and HR. Figures 2c and 2d show the CDF of the empirical MSE over 300 independent trials, as discussed above. Figures 2e and 2f show how the MSE varies as ϵ is increased; in these last plots we do not include HPGR, as one should essentially select a different q for each ϵ carefully to obtain a good tradeoff between runtime and error (as specified by Lemma D.1) due to round-up issues in powering q. Figure 2g is similar to Figures 2c and 2d, but the y-axis denotes the ℓ∞ error ∥x̂ − x∥∞ instead of the MSE. Finally, Figure 2h takes smaller ϵ = 1.386 ≈ ln(4), so that PGR takes q = 5; we note that even with such small ϵ, the qualitative picture does not change.

Acknowledgments

We thank Noga Alon for pointing out the relevance of projective geometry for constructing the type of set system our mechanism relies on.

Figure 2. Error distributions from experiments.

References

Acharya, J., Sun, Z., and Zhang, H. Hadamard response: Estimating distributions privately, efficiently, and with little communication. In Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics (AISTATS), pp. 1120-1129, 2019.

Balle, B., Bell, J., Gascón, A., and Nissim, K. The privacy blanket of the shuffle model. In Boldyreva, A. and Micciancio, D. (eds.), Advances in Cryptology - CRYPTO 2019, pp. 638-667, Cham, 2019. Springer International Publishing. ISBN 978-3-030-26951-7.

Bassily, R. and Smith, A. Local, private, efficient protocols for succinct histograms. In Proceedings of the forty-seventh annual ACM symposium on Theory of computing, pp. 127-135, 2015.

Bassily, R., Nissim, K., Stemmer, U., and Thakurta, A. Practical locally private heavy hitters.
Journal of Machine Learning Research, 21(16):1-42, 2020.

Bittau, A., Erlingsson, U., Maniatis, P., Mironov, I., Raghunathan, A., Lie, D., Rudominer, M., Kode, U., Tinnes, J., and Seefeld, B. Prochlo: Strong privacy for analytics in the crowd. In Proceedings of the 26th Symposium on Operating Systems Principles, SOSP '17, pp. 441-459, 2017.

Bun, M., Nelson, J., and Stemmer, U. Heavy hitters and the structure of local privacy. ACM Transactions on Algorithms (TALG), 15(4):1-40, 2019.

Chen, W., Kairouz, P., and Özgür, A. Breaking the communication-privacy-accuracy trilemma. In Proceedings of the 33rd Annual Conference on Advances in Neural Information Processing Systems (NeurIPS), 2020.

Cheu, A., Smith, A., Ullman, J., Zeber, D., and Zhilyaev, M. Distributed differential privacy via shuffling. In Ishai, Y. and Rijmen, V. (eds.), Advances in Cryptology - EUROCRYPT 2019, pp. 375-403, Cham, 2019. Springer International Publishing. ISBN 978-3-030-17653-2.

Erlingsson, U., Pihur, V., and Korolova, A. RAPPOR: Randomized aggregatable privacy-preserving ordinal response. In Proceedings of the 2014 ACM SIGSAC Conference on Computer and Communications Security (CCS), 2014.

Erlingsson, U., Feldman, V., Mironov, I., Raghunathan, A., Talwar, K., and Thakurta, A. Amplification by shuffling: From local to central differential privacy via anonymity. In Proceedings of the Thirtieth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA '19, pp. 2468-2479, USA, 2019. Society for Industrial and Applied Mathematics.

Feldman, V. and Talwar, K. Lossless compression of efficient private local randomizers. In Proceedings of the 38th International Conference on Machine Learning (ICML), pp. 3208-3219, 2021.

Feldman, V., McMillan, A., and Talwar, K. Hiding among the clones: A simple and nearly optimal analysis of privacy amplification by shuffling. In Proceedings of the 62nd Annual IEEE Symposium on Foundations of Computer Science (FOCS), 2021. arXiv:2012.12803 [cs.LG].
Hsu, J., Khanna, S., and Roth, A. Distributed private heavy hitters. In International Colloquium on Automata, Languages, and Programming, pp. 461-472. Springer, 2012.

Kairouz, P., Bonawitz, K., and Ramage, D. Discrete distribution estimation under local privacy. arXiv preprint arXiv:1602.07387, 2016.

Shah, A., Chen, W.-N., Balle, J., Kairouz, P., and Theis, L. Optimal compression of locally differentially private mechanisms. arXiv preprint arXiv:2111.00092, 2021.

Thakurta, A. G., Vyrros, A. H., Vaishampayan, U. S., Kapoor, G., Freudiger, J., Sridhar, V. R., and Davidson, D. Learning new words, 2017. URL https://www.google.com/patents/US9594741. US Patent 9,594,741.

Wang, S., Huang, L., Nie, Y., Zhang, X., Wang, P., Xu, H., and Yang, W. Local differential private data aggregation for discrete distribution estimation. IEEE Trans. Parallel Distributed Syst., 30(9):2046-2059, 2019.

Wang, T., Blocki, J., Li, N., and Jha, S. Locally differentially private protocols for frequency estimation. In 26th USENIX Security Symposium (USENIX Security 17), pp. 729-745, Vancouver, BC, August 2017. USENIX Association. ISBN 978-1-931971-40-9. URL https://www.usenix.org/conference/usenixsecurity17/technical-sessions/presentation/wang-tianhao.

Warner, S. L. Randomized response: A survey technique for eliminating evasive answer bias. Journal of the American Statistical Association, 60(309):63-69, 1965.

Ye, M. and Barg, A. Optimal schemes for discrete distribution estimation under local differential privacy. In Proceedings of the 14th Annual IEEE International Symposium on Information Theory (ISIT), pp. 759-763, 2017.

Ye, M. and Barg, A. Optimal locally private estimation under ℓp loss for 1 ≤ p ≤ 2. The Electronic Journal of Statistics, 13(2):4102-4120, 2019.

A. Fast dynamic programming for PI-RAPPOR

In this section, we describe an adaptation of our dynamic programming approach to PI-RAPPOR.
First, we briefly review the construction of PI-RAPPOR. We use $\mathbb{F}_q$ with the field size $q$ close to $e^\epsilon + 1$. Let $t$ be the minimum integer such that $k \le q^t$. We identify the $k$ input values with vectors in $\mathbb{F}_q^t$. Let $x \in \mathbb{Z}^{q^t}$ denote the input frequency vector, i.e., $x_v$ is the number of users with input $v \in \mathbb{F}_q^t$. For each input $v$, we define a set $S(v) \subseteq \mathbb{F}_q^t \times \mathbb{F}_q$ where $(a, b) \in S(v)$ if and only if $\langle a, v \rangle + b = 0$. Each user with input $v$ sends a random element $e$ of $\mathbb{F}_q^t \times \mathbb{F}_q$, with probability $e^\epsilon p$ if $e \in S(v)$ and probability $p$ if $e \notin S(v)$. Thus, $p = \frac{1}{e^\epsilon q^t + (q-1) q^t}$. The server keeps the counts on the received elements in a vector $y$ indexed by elements of $\mathbb{F}_q^t \times \mathbb{F}_q$. The total storage is $O(q^{t+1})$. We estimate the frequency vector $x$ by computing
$$\hat{x}_v = \alpha \sum_{u, w :\, \langle u, v \rangle + w = 0} y_{u,w} + \beta n,$$
where $\alpha$ and $\beta$ are chosen so that this is an unbiased estimator. This condition implies two equations:
$$\alpha\,\frac{e^\epsilon q^t}{e^\epsilon q^t + (q-1) q^t} + \beta = 1, \qquad \alpha\,\frac{e^\epsilon q^{t-1} + (q-1) q^{t-1}}{e^\epsilon q^t + (q-1) q^t} + \beta = 0,$$
whose solution is
$$\alpha = \frac{e^\epsilon q + (q-1) q}{(e^\epsilon - 1)(q - 1)}, \qquad \beta = -\frac{e^\epsilon + (q-1)}{(e^\epsilon - 1)(q - 1)}.$$
Next, we describe a fast algorithm to compute $\hat{x}$ with running time $O(t q^{t+2})$. Specifically, for $a \in \mathbb{F}_q^j$, $b \in \mathbb{F}_q^{t-j}$, $z \in \mathbb{F}_q$, define
$$f_j(a, b, z) = \sum_{\substack{u, w :\ \mathrm{pref}_j(u) = a \\ \langle \mathrm{suff}_{t-j}(u), b \rangle + w = z}} y_{u,w},$$
where $\mathrm{pref}_i(u)$ denotes the length-$i$ prefix vector of $u$, and $\mathrm{suff}_i(u)$ denotes the length-$i$ suffix vector of $u$. Then, we would like to compute
$$\hat{x}_v = \alpha\, f_0(\varnothing, v, 0) + \beta n \quad \text{for all } v \in \mathbb{F}_q^t,$$
where $\varnothing$ denotes the length-0 empty vector. We next observe that $f$ satisfies a recurrence relation, so that we can compute the full array of values $f_0(\varnothing, v, 0)$ efficiently using dynamic programming and then efficiently obtain $\hat{x} \in \mathbb{R}^k$. We have
$$f_j(a, b, z) = \sum_{i=0}^{q-1} f_{j+1}\left(a \circ i,\ \mathrm{suff}_{t-j-1}(b),\ (z - i \cdot b_1) \bmod q\right),$$
where $a \circ i$ denotes $a$ with the symbol $i$ appended. Note that we have the base cases $f_t(a, \varnothing, w) = y_{a,w}$. We need to compute the values of $f_j(a, b, z)$ for $j \in \{0, 1, \ldots, t-1\}$, $a \in \mathbb{F}_q^j$, $b \in \mathbb{F}_q^{t-j}$, $z \in \mathbb{F}_q$, and each value takes $O(q)$ time, so the total running time is $O(t q^{t+2})$.
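The recurrence above can be sketched on toy parameters as follows (our variable names; q and t far smaller than in practice), checking the dynamic program against the naive summation:

```python
import itertools
import random

q, t = 3, 3  # toy parameters; in practice q is a prime near e^eps + 1

def inner(u, v):
    return sum(ui * vi for ui, vi in zip(u, v)) % q

# y[(u, w)]: count of received messages (u, w) with u in F_q^t, w in F_q
vecs = list(itertools.product(range(q), repeat=t))
rng = random.Random(0)
y = {(u, w): rng.randrange(5) for u in vecs for w in range(q)}

def naive(v):
    # direct O(q^(t+1)) sum over all (u, w) with <u, v> + w = 0 (mod q)
    return sum(c for (u, w), c in y.items() if (inner(u, v) + w) % q == 0)

# dynamic program: f[(a, b, z)] sums y[(u, w)] over u with prefix a and
# <suffix(u), b> + w = z (mod q); base case f_t(a, (), w) = y[(a, w)]
f = {(u, (), w): c for (u, w), c in y.items()}
for j in range(t - 1, -1, -1):
    for a in itertools.product(range(q), repeat=j):
        for b in itertools.product(range(q), repeat=t - j):
            for z in range(q):
                # expand over the (j+1)-st coordinate i of u
                f[(a, b, z)] = sum(f[(a + (i,), b[1:], (z - i * b[0]) % q)]
                                   for i in range(q))

# f_0((), v, 0) matches the naive summation for every input v
assert all(f[((), v, 0)] == naive(v) for v in vecs)
```

The estimate for input v would then be obtained as alpha * f_0((), v, 0) + beta * n.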
B. The public coin setting

We show that versions of PGR and HPGR can be implemented in the public coin setting in a way that the communication is $\log_2 q = \epsilon \log_2 e + O(1)$ bits, which is asymptotically the optimal communication for achieving asymptotically optimal utility loss (Barnes et al., 2020, Corollary 7). We begin with PGR. Recall that as described, PGR associates each of the $k$ input values with a canonical vector in $\mathbb{F}_q^t$. In the public coin variant we now describe, we further assume that the canonical vectors have a non-zero last coordinate. This can be ensured by picking $q, t$ such that $k \le 1 + (1 - 1/q)\left(\frac{q^t - 1}{q - 1} - 1\right) = q^{t-1}$. We will use $C_{q,t}$ to denote the set of canonical vectors in $\mathbb{F}_q^t$ and $C'_{q,t}$ to denote those with a non-zero last coordinate. With this setup, recall that each output in the set $S_v$ can be associated with a vector $u \in C_{q,t}$ such that $\langle u, v \rangle = 0$. Thus a user with input $v$ sends a vector $u \in C_{q,t}$ with probability $e^\epsilon p$ if $\langle u, v \rangle = 0$ and with probability $p$ otherwise. For a vector $u$, let $\mathrm{pref}_{t-1}(u)$ denote its length-$(t-1)$ prefix. Note that for a vector $u \in C_{q,t}$, either $\mathrm{pref}_{t-1}(u)$ is itself a canonical vector in $C_{q,t-1}$, or $u = u^* := (0, \ldots, 0, 1)$. Also note that for any $v \in C'_{q,t}$, $u^* \notin S_v$. This then suggests the following algorithm. We use public randomness to select a vector $w \in \mathbb{F}_q^{t-1}$ such that $w = (0, \ldots, 0)$ with probability $p$, and $w$ is a uniformly random vector in $C_{q,t-1}$ otherwise. Thus there are $1 + \frac{q^{t-1} - 1}{q - 1}$ possible values of $w$. Given a $w \in C_{q,t-1}$ and a $v \in C'_{q,t}$, there is a unique $a \in \mathbb{F}_q$ such that $\langle w \circ a, v \rangle = 0 \pmod q$, where $w \circ a$ denotes $w$ with the symbol $a$ appended. When $w \ne (0, \ldots, 0)$, a user with input $v \in C'_{q,t}$ sends message $a$ with probability $\frac{e^\epsilon}{e^\epsilon + q - 1}$ if $\langle w \circ a, v \rangle = 0 \pmod q$, and with probability $\frac{1}{e^\epsilon + q - 1}$ otherwise. If $w = (0, \ldots, 0)$, the user always sends 1. The server, given $w$ derived from the shared public randomness and the message $a \in \mathbb{F}_q$, decodes it as $\mathrm{Dec}(w, a) = w \circ a$. We claim that the distribution of $\mathrm{Dec}(w, a)$ is identical to the output in the private coin PGR. First observe that by construction, $\mathrm{Dec}(w, a) \in C_{q,t}$.
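This claimed equivalence can also be checked numerically; a toy-parameter sketch (q = 3, t = 3, e^ϵ = 4; all names ours) computes the exact decoded distribution:

```python
import itertools
import math

q, t = 3, 3
e = 4.0  # e^eps, i.e., eps = ln 4

def canonical(dim):
    # nonzero vectors over F_q whose first nonzero entry is 1 (projective reps)
    return [v for v in itertools.product(range(q), repeat=dim)
            if any(v) and next(x for x in v if x) == 1]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v)) % q

C_t = canonical(t)          # C_{q,t}
C_prev = canonical(t - 1)   # C_{q,t-1}
c_set = len(C_prev)         # |S_v| = (q^(t-1) - 1)/(q - 1)
p = 1.0 / ((e - 1) * c_set + len(C_t))  # private-coin PGR base probability

def public_coin_dist(v):
    # exact distribution of Dec(w, a) = w o a for an input v with v[-1] != 0
    dist = {u: 0.0 for u in C_t}
    dist[(0,) * (t - 1) + (1,)] += p   # w = 0: the user always sends a = 1
    for w in C_prev:                   # each canonical w has mass (1 - p)/|C_prev|
        pw = p * (e + q - 1)
        a_star = (-dot(w, v[:-1]) * pow(v[-1], q - 2, q)) % q  # <w o a, v> = 0
        for a in range(q):
            pa = e / (e + q - 1) if a == a_star else 1.0 / (e + q - 1)
            dist[w + (a,)] += pw * pa
    return dist

v = next(u for u in C_t if u[-1] != 0)
dist = public_coin_dist(v)
assert math.isclose(sum(dist.values()), 1.0)
# matches private-coin PGR: mass e^eps * p on S_v, mass p elsewhere
assert all(math.isclose(pr, e * p if dot(u, v) == 0 else p)
           for u, pr in dist.items())
```

The assertions confirm that every u orthogonal to v receives mass e^ϵ p and every other canonical u receives mass p, exactly as in the private coin protocol.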
Next notice that for any $u, u' \in S_v$, we have
$$\begin{aligned}
\Pr(\mathrm{Dec}(w, a) = u) &= \Pr(w = \mathrm{pref}_{t-1}(u)) \cdot \Pr(a = u_t \mid w = \mathrm{pref}_{t-1}(u)) \\
&= \Pr(w = \mathrm{pref}_{t-1}(u)) \cdot \frac{e^\epsilon}{e^\epsilon + q - 1} \\
&= \Pr(w = \mathrm{pref}_{t-1}(u')) \cdot \frac{e^\epsilon}{e^\epsilon + q - 1} \quad \text{(by uniformity of $w$ over canonical vectors)} \\
&= \Pr(w = \mathrm{pref}_{t-1}(u')) \cdot \Pr(a = u'_t \mid w = \mathrm{pref}_{t-1}(u')) \\
&= \Pr(\mathrm{Dec}(w, a) = u').
\end{aligned}$$
Similarly, for any $u, u' \in C_{q,t} \setminus S_v$ such that $u, u' \ne u^*$, we can write
$$\begin{aligned}
\Pr(\mathrm{Dec}(w, a) = u) &= \Pr(w = \mathrm{pref}_{t-1}(u)) \cdot \Pr(a = u_t \mid w = \mathrm{pref}_{t-1}(u)) \\
&= \Pr(w = \mathrm{pref}_{t-1}(u)) \cdot \frac{1}{e^\epsilon + q - 1} \\
&= \Pr(w = \mathrm{pref}_{t-1}(u')) \cdot \frac{1}{e^\epsilon + q - 1} \quad \text{(by uniformity of $w$ over canonical vectors)} \\
&= \Pr(w = \mathrm{pref}_{t-1}(u')) \cdot \Pr(a = u'_t \mid w = \mathrm{pref}_{t-1}(u')) \\
&= \Pr(\mathrm{Dec}(w, a) = u').
\end{aligned}$$
Further, an identical calculation shows that for $u \in S_v$ and $u' \in C_{q,t} \setminus S_v$ with $u' \ne u^*$,
$$\Pr[\mathrm{Dec}(w, a) = u] = e^\epsilon \Pr(\mathrm{Dec}(w, a) = u').$$
Moreover, the distribution of $w$ ensures that $\Pr(\mathrm{Dec}(w, a) = u^*) = p$. It follows that for all $u \in C_{q,t}$, $\Pr(\mathrm{Dec}(w, a) = u)$ is $e^\epsilon p$ if $u \in S_v$ and $p$ if $u \in C_{q,t} \setminus S_v$. In other words, we have shown how to simulate the output distribution of PGR in the public coin setting while sending only a single element of $\mathbb{F}_q$.

An implementation of HPGR in the public coin model is similar. A message in HPGR is a pair $(j, u)$ where $j \in \{1, \ldots, h\}$ is the index of a block and $u \in \mathbb{F}_q^t$ is the name of a canonical vector; as above, in the public coin setting we require $u$ to have a non-zero last coordinate (so that now we need $h q^{t-1} \ge k$). As described in Section 4, $h, q$ are chosen so that $hq \approx e^\epsilon + 1$. In the public coin model, the user selects $j$ using private randomness and sends it explicitly, then uses the PGR public coin protocol described above to determine the first $t - 1$ entries of $u$ with no communication required, and then sends the final entry of $u$ to obey the HPGR distribution. The total communication is $\log_2(hq) = \epsilon \log_2 e + O(1)$ bits.

C. Utility analysis for PGR

We now restate Lemma 3.2 and then provide its proof, analyzing the utility loss of PGR.

Lemma C.1.
$$\mathbb{E}\left[\|\hat{x} - x\|_2^2\right] \le n \cdot \frac{e^\epsilon c_{\mathrm{set}}^2/c_{\mathrm{int}}^2 + (k-1)\left((e^\epsilon - 1) + c_{\mathrm{set}}/c_{\mathrm{int}}\right)^2}{(e^\epsilon - 1)^2 \left(c_{\mathrm{set}}/c_{\mathrm{int}} - 1\right)}.$$
In particular, if $c_{\mathrm{set}}/c_{\mathrm{int}} = e^\epsilon + 1$ then $\mathbb{E}\left[\frac{1}{k}\|\hat{x} - x\|_2^2\right] \le \left(1 + O(1/k)\right) \frac{4 e^\epsilon n}{(e^\epsilon - 1)^2}$.

Proof. By independence, we only need to analyze the variance when there is exactly one user with input $v$. The lemma then follows from adding up the variances from all users. For the coordinate $v$ itself,
$$\mathbb{E}\left[(\hat{x}_v - 1)^2\right] = e^\epsilon p\, c_{\mathrm{set}} (\alpha + \beta - 1)^2 + p (k - c_{\mathrm{set}}) (\beta - 1)^2 = (\alpha + \beta - 1)(1 - \beta),$$
where the second equality uses the unbiasedness conditions. Substituting the values of $\alpha$ and $\beta$, and then using $k \le c_{\mathrm{set}}^2/c_{\mathrm{int}}$ (which holds for our set system),
$$(\alpha + \beta - 1)(1 - \beta) = \frac{k - c_{\mathrm{set}}}{(e^\epsilon - 1)(c_{\mathrm{set}} - c_{\mathrm{int}})} \cdot \frac{e^\epsilon c_{\mathrm{set}}}{(e^\epsilon - 1)(c_{\mathrm{set}} - c_{\mathrm{int}})} = \frac{\left(k/c_{\mathrm{int}} - c_{\mathrm{set}}/c_{\mathrm{int}}\right) e^\epsilon c_{\mathrm{set}}/c_{\mathrm{int}}}{(e^\epsilon - 1)^2 \left(c_{\mathrm{set}}/c_{\mathrm{int}} - 1\right)^2} \le \frac{\left(c_{\mathrm{set}}^2/c_{\mathrm{int}}^2 - c_{\mathrm{set}}/c_{\mathrm{int}}\right) e^\epsilon c_{\mathrm{set}}/c_{\mathrm{int}}}{(e^\epsilon - 1)^2 \left(c_{\mathrm{set}}/c_{\mathrm{int}} - 1\right)^2} = \frac{e^\epsilon c_{\mathrm{set}}^2/c_{\mathrm{int}}^2}{(e^\epsilon - 1)^2 \left(c_{\mathrm{set}}/c_{\mathrm{int}} - 1\right)}.$$
Let $z = c_{\mathrm{set}}/c_{\mathrm{int}}$. Note that $\frac{z^2}{z - 1}$ is an increasing function for $z \in [2, +\infty)$, so this part of the variance gets larger as $q$ gets larger. Next we analyze the contribution to the variance from the $k - 1$ coordinates $u \ne v$:
$$\mathbb{E}\left[\hat{x}_u^2\right] = p\left((e^\epsilon - 1) c_{\mathrm{int}} + c_{\mathrm{set}}\right)(\alpha + \beta)^2 + \left(1 - e^\epsilon p\, c_{\mathrm{int}} - p (c_{\mathrm{set}} - c_{\mathrm{int}})\right)\beta^2 = -\beta(\alpha + \beta),$$
again by the unbiasedness conditions. Substituting,
$$-\beta(\alpha + \beta) = \frac{(e^\epsilon - 1) c_{\mathrm{int}} + c_{\mathrm{set}}}{(e^\epsilon - 1)(c_{\mathrm{set}} - c_{\mathrm{int}})} \cdot \frac{(e^\epsilon - 2) c_{\mathrm{set}} + k - (e^\epsilon - 1) c_{\mathrm{int}}}{(e^\epsilon - 1)(c_{\mathrm{set}} - c_{\mathrm{int}})} = \frac{(e^\epsilon - 1) + z}{(e^\epsilon - 1)(z - 1)} \cdot \frac{(e^\epsilon - 2) z + k/c_{\mathrm{int}} - (e^\epsilon - 1)}{(e^\epsilon - 1)(z - 1)} \le \frac{\left((e^\epsilon - 1) + z\right)\left((e^\epsilon - 2) z + z^2 - (e^\epsilon - 1)\right)}{(e^\epsilon - 1)^2 (z - 1)^2} = \frac{\left((e^\epsilon - 1) + z\right)^2}{(e^\epsilon - 1)^2 (z - 1)},$$
where the last equality uses $(e^\epsilon - 2) z + z^2 - (e^\epsilon - 1) = (z - 1)(z + e^\epsilon - 1)$. Note that the function $\frac{((e^\epsilon - 1) + z)^2}{(e^\epsilon - 1)^2 (z - 1)}$ is decreasing for $z \in (1, e^\epsilon + 1]$ and increasing for $z \in [e^\epsilon + 1, +\infty)$, so this part of the variance is minimized when $z = e^\epsilon + 1$. For $z = e^\epsilon + 1$, we can substitute and get $\frac{4 e^\epsilon}{(e^\epsilon - 1)^2}$.

D. Utility and runtime analyses for HPGR

Lemma D.1.
$$\mathbb{E}\left[\|\hat{x} - x\|_2^2\right] \le n\left(1 + \frac{zh + (e^\epsilon - 1)}{(e^\epsilon - 1)^2 (z - 1)} + \frac{2}{e^\epsilon - 1} + \frac{e^\epsilon\left(zh - e^\epsilon + 1\right)}{(e^\epsilon - 1)^2}\right) + \frac{n\left(zh + (e^\epsilon - 1)\right) z}{(e^\epsilon - 1)^2 (z - 1)}\left(k - \lceil k/h \rceil + \left(\lceil k/h \rceil - 1\right) \frac{z + e^\epsilon - 1}{z}\right).$$
In particular, if $zh = e^\epsilon + 1$ then
$$\mathbb{E}\left[\frac{1}{k}\|\hat{x} - x\|_2^2\right] \le \frac{n}{k} + \left(1 + \frac{1}{k}\right) \frac{z}{z - 1} \cdot \frac{4 e^\epsilon n}{(e^\epsilon - 1)^2}.$$

Proof. By independence, we only need to analyze the variance when there is exactly one user with input $(i, v)$ and response $(j, u)$. The lemma follows from adding up the variances from all users.
For the coordinate $(i, v)$ itself,
$$\begin{aligned}
\mathbb{E}\left[(\hat{x}_{i,v} - 1)^2\right] &\le \mathbb{E}\left[(\hat{x}_{i,v} - 1 - \gamma)^2\right] \\
&= \Pr[j \ne i] \cdot (-1)^2 + \Pr[j = i,\ \langle u, v \rangle \ne 0]\,(\beta - 1)^2 + \Pr[j = i,\ \langle u, v \rangle = 0]\,(\alpha + \beta - 1)^2 \\
&= \left(1 - (e^\epsilon - 1) p\, c_{\mathrm{set}} - p b\right) + p\,(b - c_{\mathrm{set}})\,(\beta - 1)^2 + e^\epsilon p\, c_{\mathrm{set}}\,(\alpha + \beta - 1)^2 \\
&= 1 + p\,(b - c_{\mathrm{set}})\left(\beta^2 - 2\beta\right) + e^\epsilon p\, c_{\mathrm{set}}\,(\alpha + \beta)(\alpha + \beta - 2).
\end{aligned}$$
We expand the second and third terms individually. Using $b \le c_{\mathrm{set}}^2/c_{\mathrm{int}}$, so that $b/c_{\mathrm{set}} \le z$,
$$\begin{aligned}
p\,(b - c_{\mathrm{set}})\left(\beta^2 - 2\beta\right) &= p\,(b - c_{\mathrm{set}})\left(\frac{c_{\mathrm{int}}/c_{\mathrm{set}}}{p\,(e^\epsilon - 1)(c_{\mathrm{set}} - c_{\mathrm{int}})}\right)^2 + \frac{2\,(b - c_{\mathrm{set}})\, c_{\mathrm{int}}/c_{\mathrm{set}}}{(e^\epsilon - 1)(c_{\mathrm{set}} - c_{\mathrm{int}})} \\
&= \frac{\left(bh + (e^\epsilon - 1) c_{\mathrm{set}}\right)(b - c_{\mathrm{set}})\, c_{\mathrm{int}}^2/c_{\mathrm{set}}^2}{(e^\epsilon - 1)^2 (c_{\mathrm{set}} - c_{\mathrm{int}})^2} + \frac{2\,(b - c_{\mathrm{set}})\, c_{\mathrm{int}}/c_{\mathrm{set}}}{(e^\epsilon - 1)(c_{\mathrm{set}} - c_{\mathrm{int}})} \\
&= \frac{\left(bh/c_{\mathrm{set}} + (e^\epsilon - 1)\right)\left(b/c_{\mathrm{set}} - 1\right)}{(e^\epsilon - 1)^2 \left(c_{\mathrm{set}}/c_{\mathrm{int}} - 1\right)^2} + \frac{2\left(b/c_{\mathrm{set}} - 1\right)}{(e^\epsilon - 1)\left(c_{\mathrm{set}}/c_{\mathrm{int}} - 1\right)} \\
&\le \frac{zh + (e^\epsilon - 1)}{(e^\epsilon - 1)^2 (z - 1)} + \frac{2}{e^\epsilon - 1},
\end{aligned}$$
and
$$e^\epsilon p\, c_{\mathrm{set}}\,(\alpha + \beta)(\alpha + \beta - 2) = \frac{e^\epsilon c_{\mathrm{set}}}{bh + (e^\epsilon - 1) c_{\mathrm{set}}} \cdot \frac{bh/c_{\mathrm{set}} + (e^\epsilon - 1)}{e^\epsilon - 1} \cdot \frac{bh/c_{\mathrm{set}} - (e^\epsilon - 1)}{e^\epsilon - 1} = \frac{e^\epsilon\left(bh/c_{\mathrm{set}} - e^\epsilon + 1\right)}{(e^\epsilon - 1)^2} \le \frac{e^\epsilon\left(zh - e^\epsilon + 1\right)}{(e^\epsilon - 1)^2}.$$
When $zh = e^\epsilon + 1$, we have
$$\mathbb{E}\left[(\hat{x}_{i,v} - 1)^2\right] \le 1 + \frac{2 e^\epsilon}{(e^\epsilon - 1)^2 (z - 1)} + \frac{2}{e^\epsilon - 1} + \frac{2 e^\epsilon}{(e^\epsilon - 1)^2} < 1 + \frac{4 e^\epsilon}{(e^\epsilon - 1)^2} \cdot \frac{z}{z - 1}.$$
Next consider $v' \ne v$ in the same block:
$$\begin{aligned}
\mathbb{E}\left[\hat{x}_{i,v'}^2\right] &\le \mathbb{E}\left[(\hat{x}_{i,v'} - \gamma)^2\right] \\
&= \Pr[j \ne i] \cdot 0 + \Pr[j = i,\ \langle u, v' \rangle \ne 0]\,\beta^2 + \Pr[j = i,\ \langle u, v' \rangle = 0]\,(\alpha + \beta)^2 \\
&= p\left(b + (e^\epsilon - 2) c_{\mathrm{set}} - (e^\epsilon - 1) c_{\mathrm{int}}\right)\beta^2 + p\left((e^\epsilon - 1) c_{\mathrm{int}} + c_{\mathrm{set}}\right)(\alpha + \beta)^2 \\
&= \frac{\left(b/c_{\mathrm{set}} + (e^\epsilon - 2) - (e^\epsilon - 1) c_{\mathrm{int}}/c_{\mathrm{set}}\right)\left(bh/c_{\mathrm{set}} + (e^\epsilon - 1)\right)}{(e^\epsilon - 1)^2 \left(1 - c_{\mathrm{int}}/c_{\mathrm{set}}\right)^2} \cdot \frac{c_{\mathrm{int}}^2}{c_{\mathrm{set}}^2} + \frac{\left((e^\epsilon - 1) c_{\mathrm{int}}/c_{\mathrm{set}} + 1\right)\left(bh/c_{\mathrm{set}} + (e^\epsilon - 1)\right)}{(e^\epsilon - 1)^2} \\
&\le \frac{\left(z + (e^\epsilon - 2) - (e^\epsilon - 1)/z\right)\left(zh + (e^\epsilon - 1)\right)}{(e^\epsilon - 1)^2 (z - 1)^2} + \frac{\left((e^\epsilon - 1)/z + 1\right)\left(zh + (e^\epsilon - 1)\right)}{(e^\epsilon - 1)^2} \\
&= \frac{(z + e^\epsilon - 1)\left(zh + (e^\epsilon - 1)\right)}{(e^\epsilon - 1)^2 (z - 1)}.
\end{aligned}$$
When $zh = e^\epsilon + 1$, the last expression equals
$$\frac{2 e^\epsilon (z + e^\epsilon - 1)}{(e^\epsilon - 1)^2 (z - 1)} = \frac{z + e^\epsilon - 1}{z} \cdot \frac{2 e^\epsilon z}{(e^\epsilon - 1)^2 (z - 1)} \le (1 + h) \cdot \frac{2 e^\epsilon z}{(e^\epsilon - 1)^2 (z - 1)},$$
using $\frac{z + e^\epsilon - 1}{z} \le 1 + h$. Finally, consider $i' \ne i$ and arbitrary $v'$.
$$\begin{aligned}
\mathbb{E}\left[\hat{x}_{i',v'}^2\right] &\le \mathbb{E}\left[(\hat{x}_{i',v'} - \gamma)^2\right] \\
&= \Pr[j \ne i'] \cdot 0^2 + \Pr[j = i',\ \langle u, v' \rangle \ne 0]\,\beta^2 + \Pr[j = i',\ \langle u, v' \rangle = 0]\,(\alpha + \beta)^2 \\
&= p\,(b - c_{\mathrm{set}})\,\beta^2 + p\, c_{\mathrm{set}}\,(\alpha + \beta)^2 \\
&= \frac{\left(b/c_{\mathrm{set}} - 1\right)\left(bh/c_{\mathrm{set}} + (e^\epsilon - 1)\right)}{(e^\epsilon - 1)^2 \left(1 - c_{\mathrm{int}}/c_{\mathrm{set}}\right)^2} \cdot \frac{c_{\mathrm{int}}^2}{c_{\mathrm{set}}^2} + \frac{bh/c_{\mathrm{set}} + (e^\epsilon - 1)}{(e^\epsilon - 1)^2} \\
&\le \frac{zh + (e^\epsilon - 1)}{(e^\epsilon - 1)^2 (z - 1)} + \frac{zh + (e^\epsilon - 1)}{(e^\epsilon - 1)^2} = \frac{\left(zh + (e^\epsilon - 1)\right) z}{(e^\epsilon - 1)^2 (z - 1)}.
\end{aligned}$$
When $zh = e^\epsilon + 1$, the last expression is bounded by
$$\frac{2 e^\epsilon}{(e^\epsilon - 1)^2} \cdot \frac{1}{z - 1} + \frac{2 e^\epsilon}{(e^\epsilon - 1)^2} = \frac{2 e^\epsilon}{(e^\epsilon - 1)^2} \cdot \frac{z}{z - 1}.$$
There are $b_i \le \lceil k/h \rceil \le b$ valid coordinates in the same block as the input $(i, v)$. There are $k - b_i$ coordinates in the other blocks. Thus the total variance across all coordinates except for coordinate $(i, v)$ is bounded by
$$\frac{\left(zh + (e^\epsilon - 1)\right) z}{(e^\epsilon - 1)^2 (z - 1)}\left(k - b_i + (b_i - 1)\,\frac{z + e^\epsilon - 1}{z}\right) \le \frac{\left(zh + (e^\epsilon - 1)\right) z}{(e^\epsilon - 1)^2 (z - 1)}\left(k - \lceil k/h \rceil + \left(\lceil k/h \rceil - 1\right) \frac{z + e^\epsilon - 1}{z}\right).$$
For $zh = e^\epsilon + 1$, we have $\frac{z + e^\epsilon - 1}{z} \le 1 + h$ and
$$k - b_i + (b_i - 1)\,\frac{z + e^\epsilon - 1}{z} \le k - \lceil k/h \rceil + \left(\lceil k/h \rceil - 1\right)(1 + h) = k - 1 + h\left(\lceil k/h \rceil - 1\right) < 2k.$$
Regarding the decoding algorithms, notice that the estimates are computed separately by blocks, except for an offset $\gamma$ scaled by the total number of received messages across all blocks. Thus, using the naive algorithm, the time to estimate one count is $O(c_{\mathrm{set}}) = O\left(q^{\lceil \log_q(k/h) \rceil - 1}\right)$. Using the fast algorithm to estimate all counts takes $O(bqt)$ time per block and in total, $O(bqth) = O\left(\lceil \log_q(k/h) \rceil\, h\, q^{1 + \lceil \log_q(k/h) \rceil}\right)$ time.
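As a numerical sanity check on the recommendation z = e^ϵ + 1 appearing in the analyses above, the per-coordinate variance bound from the proof of Lemma C.1 can be minimized over z directly; a sketch (our names):

```python
import math

def per_coord_bound(z, eps):
    # ((e^eps - 1) + z)^2 / ((e^eps - 1)^2 (z - 1)): the variance bound for a
    # coordinate u != v in the proof of Lemma C.1, as a function of z = c_set/c_int
    e = math.exp(eps)
    return ((e - 1) + z) ** 2 / ((e - 1) ** 2 * (z - 1))

eps = 2.0
grid = [1.5 + 0.001 * i for i in range(20000)]   # z ranging over [1.5, 21.5)
z_best = min(grid, key=lambda z: per_coord_bound(z, eps))
# the minimizer is z = e^eps + 1, where the bound equals 4 e^eps / (e^eps - 1)^2
assert abs(z_best - (math.exp(eps) + 1)) < 0.002
e = math.exp(eps)
assert math.isclose(per_coord_bound(e + 1, eps), 4 * e / (e - 1) ** 2)
```

This matches the closed-form minimization in the proof: the derivative of (c + z)^2/(z - 1) with c = e^ϵ - 1 vanishes exactly at z = c + 2 = e^ϵ + 1.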