# Blackbox Attacks via Surrogate Ensemble Search

Zikui Cai, Chengyu Song, Srikanth Krishnamurthy, Amit Roy-Chowdhury, M. Salman Asif
University of California, Riverside

Blackbox adversarial attacks can be categorized into transfer- and query-based attacks. Transfer methods do not require any feedback from the victim model, but provide lower success rates compared to query-based methods. Query attacks often require a large number of queries for success. To achieve the best of both approaches, recent efforts have tried to combine them, but still require hundreds of queries to achieve high success rates (especially for targeted attacks). In this paper, we propose a novel method for Blackbox Attacks via Surrogate Ensemble Search (BASES) that can generate highly successful blackbox attacks using an extremely small number of queries. We first define a perturbation machine that generates a perturbed image by minimizing a weighted loss function over a fixed set of surrogate models. To generate an attack for a given victim model, we search over the weights in the loss function using queries generated by the perturbation machine. Since the dimension of the search space is small (same as the number of surrogate models), the search requires a small number of queries. We demonstrate that our proposed method achieves better success rates with at least 30× fewer queries compared to state-of-the-art methods on different image classifiers trained on ImageNet (including VGG-19, DenseNet-121, and ResNeXt-50). In particular, our method requires as few as 3 queries per image (on average) to achieve more than a 90% success rate for targeted attacks, and 1-2 queries per image for over a 99% success rate for untargeted attacks. Our method is also effective on the Google Cloud Vision API, where it achieved a 91% untargeted attack success rate with 2.9 queries per image. We also show that the perturbations generated by our proposed method are highly transferable and can be adopted for hard-label blackbox attacks. Furthermore, we argue that BASES can be used to create attacks for a variety of tasks and show its effectiveness for attacks on object detection models. Our code is available at https://github.com/CSIPlab/BASES.

1 Introduction

Deep neural network (DNN) models are known to be vulnerable to adversarial attacks [1-4]. Many methods have been proposed in recent years to generate adversarial attacks [2, 5-11] (or to defend against such attacks [6, 12-20]). Attack methods for blackbox models can be divided into two broad categories: transfer- and query-based methods. Transfer-based methods generate attacks for some (whitebox) surrogate models via backpropagation and test if they fool the victim models [3, 4]. They are usually agnostic to victim models as they do not require or readily use any feedback, and they often provide lower success rates compared to query-based methods. On the other hand, query-based attacks achieve high success rates, but at the expense of querying the victim model several times to find perturbation directions that reduce the victim model loss [21-25]. One possible way to achieve a high success rate while keeping the number of queries small is to combine the transfer and query attacks. While there has been impressive recent work along this direction [26, 10, 27, 11], the

Corresponding authors: Zikui Cai (zcai032@ucr.edu) and M. Salman Asif (sasif@ucr.edu)

36th Conference on Neural Information Processing Systems (NeurIPS 2022).
Figure 1: BASES for score-based attack. (Top-left) We define a perturbation machine (PM) using a fixed set of N surrogate models, each of which is assigned a weight value as w = [w_1, ..., w_N]. The PM generates a perturbed image x*(w) for a given input image x by minimizing the perturbation loss that is defined as a function of w. To fool a victim model, we update one coordinate in w at a time (w_n ← w_n ± η) while querying the victim model using x*(w) generated by the PM. We can view this approach as a bi-level optimization or search procedure; the PM generates a perturbed image x*(w) with the given weights in the inner level, while we update w in the outer level. (Bottom-left) We visualize weights and perturbed images for a few iterations. We stop as soon as the attack is successful (e.g., the original label "Butterfly" is changed to the target label "Primate" for a targeted attack). (Right) Victim loss values for different weights along the barycentric coordinates on the triangle (N = 3 models). We start with equal weights (at the centroid) and traverse the space of w to reduce the loss (concentrating on model f_3). Red indicates large loss values (unsuccessful attack), and blue indicates low loss (successful attack).
It is worth noting that perturbations generated by a surrogate ensemble with an arbitrary set of weights often fools all the surrogate models, but they do not guarantee success on unseen victim models; therefore, searching over the weights space for surrogate models is necessary. Since the number of models in the surrogate ensemble is small, the search space is low dimensional and requires extremely small number of queries compared to other query-based approaches. In our method, we further simplify the search process by updating one weight element at a time, which is equivalent to coordinate descent, which has been shown to be effective in query-based attacks [21, 23]. Since we search along one coordinate at a time instead of estimating the full gradients, the method is extremely efficient in terms of query count. In particular, our method requires two queries per coordinate update but offers success rates as good as that given by performing a full gradient update step (as shown in Section 4). Reducing the dimension of the search space while maintaining high success rate for query-based attacks is an active area of research [23, 10, 27, 11], and our proposed method pushes the boundary in this area. We perform extensive experiments for (score-based) blackbox attacks using a variety of surrogate and blackbox victim models for both targeted and untargeted attacks. We select Py Torch Torchvision [30] as our model zoo, which contains 56 image classification models trained on Image Net [31] that span a wide range of architectures. We demonstrate superior performance by a large margin over state-of-the-art approaches, especially for targeted attacks. Furthermore, we tested the perturbations generated by our method for attacks on hard-label classifiers. Our results show that the perturbations generated by our method are highly transferable. We present also present experiments for attacks on object detectors in the supplementary material, which demonstrate the effectiveness of our attack method for tasks beyond image classification. The main contributions of this paper are as follows. We propose a novel, yet simple method, BASES, for effective and query-efficient blackbox attacks. The method adjusts weights of the surrogate ensemble by querying the victim model and achieves high fooling rate targeted attack with a very small number of queries. We perform extensive experiments to demonstrate that BASES outperforms state-of-the-art methods [26, 10, 27, 11] by a large margin; over 90% targeted success rate with less than 3 queries, which is at least 30 fewer queries than other methods. We also demonstrate the effectiveness under a real-world blackbox setting by attacking Google Cloud Vision API and achieve 91% untargeted fooling rate with 2.9 queries (3 less than [10]). The perturbations from BASES are highly transferable and can also be used for hard-label attacks. In this challenging setting, we can achieve over 90% fooling rate for targeted and almost perfect fooling rate for untargeted attacks on a variety of models using less than 3 and 2 queries, respectively. We show that BASES can be used for different tasks by creating attacks for object detectors that significantly improve the fooling rate over transfer attacks. 2 Related work Ensemble-based transfer attacks. Transferable adversarial examples that can fool one model can also fool a different model [3, 4, 29, 32]. 
Transfer-based untargeted attacks are considered easy since the adversarial examples can disrupt feature extractors into unrelated directions (e.g., in MIM [7], the fooling rate for some models can be as high as 87.9%). In contrast, transfer-based targeted attacks often suffer from low fooling rates (e.g., MIM shows a transfer rate of about 20% at best). To improve the transfer rate, several methods use an ensemble-based approach. To combine the information from different surrogate models, [29] fuses probability scores and [7] proposes combining logits. While these methods have been effective, the most natural and generic approach is to combine losses, which can be used for tasks beyond classification [33-35]. MGAA [36] iteratively selects a set of surrogate models from an ensemble and performs meta-train and meta-test steps to reduce the gap between whitebox and blackbox gradient directions. The Simulator-Attack method [37] uses several surrogate models to train a generalized substitute model as a "simulator" for the victim model; however, training such simulators is computationally expensive and difficult to scale to large datasets. Previous ensemble approaches typically assign equal weights to each surrogate model. In contrast, we update the weights for different surrogate models based on the victim model feedback.

Query-based attacks. Unlike transfer-based attacks, query-based attacks do not assume that surrogate models share similarity with the victim model. They can usually achieve high fooling rates even for targeted attacks, but at the expense of queries [21, 25, 22]. The query complexity is proportional to the dimension of the search space. Queries over the entire image space can be extremely expensive [21], requiring millions of queries for a targeted attack [22]. To reduce the query complexity, a number of approaches have attempted to reduce the search space dimension or leverage transferable priors or surrogate models to generate queries [23, 10, 27, 38, 39]. SimBA-DCT [23] searches over the low DCT frequencies. P-RGF [26] utilizes surrogate gradients as a transfer-based prior and draws random vectors from a low-dimensional subspace for gradient estimation. TREMBA [10] trains a perturbation generator and traverses the low-dimensional latent space. ODS [27] optimizes in the logit space to diversify perturbations in the output space. GFCS [11] searches along the direction of surrogate gradients and falls back to ODS if surrogate gradients fail. Some other methods [38, 40, 37] also reuse the query feedback to update surrogate models or a blackbox simulator, but such a fine-tuning process provides very slight improvements. We summarize the typical search space and average number of queries for some state-of-the-art methods in Table 1. In our approach, we further shrink the search dimension to as low as the number of models in the ensemble. Since our search space is dense with adversarial perturbations, we show that a moderate-size ensemble with 20 models can generate successful targeted attacks for a variety of victim models while requiring only 3 queries (on average), which is at least 30× fewer than that of existing methods.

3.1 Preliminaries

We use additive perturbation [1, 2, 6] to generate a perturbed image as x* = x + δ, where δ denotes the perturbation vector of the same size as the input image x. To ensure that the perturbation is imperceptible to humans, we usually constrain its ℓp norm to be less than a threshold, i.e., ‖δ‖_p ≤ ε, where p is usually chosen from {2, ∞}.
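For concreteness, the projection onto the ε-ball implied by this constraint can be written in a few lines of PyTorch. This is a minimal sketch under our own naming, not taken from the released code; it assumes the perturbation tensor carries a leading batch dimension.

```python
import torch

def project(delta, eps, norm="linf"):
    """Project a perturbation onto the l-inf or l-2 ball of radius eps."""
    if norm == "linf":
        # element-wise clipping keeps every pixel change within [-eps, eps]
        return delta.clamp(-eps, eps)
    # l-2 case: rescale only if the perturbation lies outside the ball
    flat = delta.flatten(start_dim=1)                       # shape (batch, num_elements)
    norms = flat.norm(p=2, dim=1, keepdim=True).clamp_min(1e-12)
    factor = (eps / norms).clamp(max=1.0)                   # shrink factor, never enlarge
    return (flat * factor).view_as(delta)
```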
Such adversarial attacks for a victim model f can be generated by minimizing the so-called adversarial loss function L over δ such that the output f(x + δ) is as close to the desired (adversarial) output as possible. Specifically, the attack generator maps the input image x to an adversarial image x* such that the output f(x*) is either far/different from the original output y for untargeted attacks, or close/identical to the desired output y* for targeted attacks.

Let us consider a multi-class classifier f : x ↦ z, where z = [z_1, ..., z_C] represents the logit vector at the last layer. The logit vector can be converted to a probability vector p = softmax(z). We refer to such a classifier as a score-based or soft-label classifier. In contrast, a hard-label classifier provides a single label index out of a total of C classes. We can derive the hard label from the soft labels as y = argmax_c f(x)_c. For untargeted attacks, the objective is to find x* such that argmax_c f(x*)_c ≠ y. For targeted attacks, the objective is to find x* such that argmax_c f(x*)_c = y*, where y* is the target label.

Many efforts on adversarial attacks use iterative variants of the fast gradient sign method (FGSM) [2] because of their simplicity and effectiveness. Notable examples include I-FGSM [5], PGD [6], and MIM [7]. We use the PGD attack in our PM, which iteratively optimizes perturbations as

δ_{t+1} ← Π_ε( δ_t − λ · sign(∇_δ L(x + δ_t, y*)) ),   (1)

where L is the loss function and Π_ε denotes the projection operator onto the ε-ball. There are many loss functions suitable for crafting adversarial attacks. We mainly employ the following margin loss, which has been shown to be effective in C&W attacks [41]:

L(f(x), y*) = max( max_{j ≠ y*} f(x)_j − f(x)_{y*}, −κ ),   (2)

where κ is the margin parameter that adjusts the extent to which the example is adversarial. A larger κ corresponds to a lower optimization loss. One advantage of the C&W loss function is that its sign directly indicates whether the attack is successful or not (a positive value indicates failure; a negative value indicates success). Cross-entropy loss is also a popular choice, and it has similar performance to the margin loss (comparison results are provided in the supplementary material).

3.2 Perturbation machine with surrogate ensemble

Controlled query generation with the PM. We define a perturbation machine (PM) to generate queries for the victim model, as shown in Figure 1. The PM accepts an image and generates a perturbation to fool all the surrogate models. Furthermore, we seek some control over the perturbations generated by the PM to steer them in a direction that fools the victim model. To achieve these goals, we construct the PM such that it minimizes a weighted adversarial loss function over the surrogate ensemble.

Adversarial loss functions for ensembles. Suppose our PM consists of N surrogate models given as F = {f_1, ..., f_N}, each of which is assigned a weight in w = [w_1, ..., w_N] such that Σ_{i=1}^{N} w_i = 1. For any given image x and weights w, we seek to find a perturbed image x*(w) that fools the surrogate ensemble. Below we discuss three possible optimization problems based on weighted ensemble loss functions for targeted attacks; loss functions for untargeted attacks can be derived similarly.

weighted probabilities: x*(w) = x + argmin_δ L( Σ_{i=1}^{N} w_i softmax(f_i(x + δ)), 1_{y*} ),   (3)

weighted logits: x*(w) = x + argmin_δ L( Σ_{i=1}^{N} w_i f_i(x + δ), y* ),   (4)

weighted loss: x*(w) = x + argmin_δ Σ_{i=1}^{N} w_i L(f_i(x + δ), y*).   (5)

Here y* denotes the target label, 1_{y*} denotes its one-hot encoding, and L represents some adversarial loss function (e.g., the C&W loss).
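To make the margin loss in (2) and the weighted-loss formulation in (5) concrete, the following PyTorch sketch shows one way they can be written. The function names are ours and this is an illustration rather than the released implementation; with κ = 0, the sign of the returned margin directly indicates success, as described above.

```python
import torch

def margin_loss(logits, target, kappa=0.0):
    """C&W-style targeted margin loss: negative once the target class wins by kappa."""
    # logits: (B, C) tensor; target: (B,) tensor of class indices
    target_logit = logits.gather(1, target.unsqueeze(1)).squeeze(1)
    # mask out the target column before taking the runner-up maximum
    masked = logits.clone()
    masked.scatter_(1, target.unsqueeze(1), float("-inf"))
    best_other = masked.max(dim=1).values
    return torch.clamp(best_other - target_logit, min=-kappa)

def ensemble_loss(models, weights, x_adv, target):
    """Weighted sum of per-model margin losses, as in the weighted-loss formulation (5)."""
    return sum(w * margin_loss(f(x_adv), target).mean()
               for f, w in zip(models, weights))
```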
The first problem, (3), minimizes the softmax cross-entropy loss defined on the weighted combination of probability vectors from all the models in the ensemble [29]. The second problem, (4), optimizes the adversarial loss over a weighted combination of logits from the models [7]. The third problem, (5), optimizes a weighted combination of adversarial losses over all models. The weighted loss formulation is the simplest and most generic ensemble approach: it works not only for the classification task with logit or probability vectors, but also for other tasks (e.g., object detection, segmentation) as long as the model losses can be aggregated [34]. Here, we focus on the weighted loss formulation, since it shows superior performance compared to the weighted probabilities and weighted logits formulations in our experiments (additional experiments are presented in the supplementary material).

Algorithm 1 presents pseudocode for the PM module for a fixed set of weights. The PM accepts an image x and weights w along with the surrogate ensemble, and returns the perturbed image x* = x + δ after a fixed number of signed gradient descent steps (denoted as T) on the ensemble loss.

Algorithm 1 Perturbation Machine: δ, x*(w) = PM(x, w, δ_init)
Input: image x and target class y* (for untargeted attacks, any y* ≠ y); surrogate ensemble F = {f_1, f_2, ..., f_N}; ensemble weights w = {w_1, w_2, ..., w_N}; initial perturbation δ_init; step size λ; perturbation norm (ℓ2/ℓ∞) and bound ε; number of signed gradient steps T
Output: adversarial perturbation δ and perturbed image x*(w)
1: δ = δ_init
2: for t = 1 to T do
3:    Calculate L_ens = Σ_{i=1}^{N} w_i L_i(x + δ, y*)    ▷ ensemble loss
4:    Update δ ← δ − λ sign(∇_δ L_ens)    ▷ gradient of ensemble via backpropagation
5:    Project δ ← Π_ε(δ)    ▷ project onto the feasible ℓ∞ or ℓ2 ball
6: end for
7: x*(w) ← x + δ
8: return δ, x*(w)

3.3 Surrogate ensemble search as bilevel optimization

Let us assume that we are given a blackbox victim model, f_v, that we seek to fool using a perturbed image generated by the PM (as illustrated in Figure 1). Suppose the adversarial loss for the victim model is defined as L_v. To generate a perturbed image that fools the victim model, we want to solve the following optimization problem:

w* = argmin_w L_v(f_v(x*(w)), y*).   (6)

The problem in (6) is a bilevel optimization that seeks to update the weight vector w for the PM so that the generated x*(w) fools the victim model. The PM in Algorithm 1 can be viewed as a function that solves the inner optimization problem in our bilevel optimization. The outer optimization problem searches over w to steer the PM towards a perturbation that fools the victim model.

BASES: Blackbox Attacks via Surrogate Ensemble Search. Our objective is to maximize the attack success rate and minimize the number of queries on the victim model; hence, we adopt a simple yet effective iterative procedure to update the weights w and generate a sequence of queries. Pseudocode for our approach is shown in Algorithm 2. We initialize all entries in w to 1/N and generate the initial perturbed image x*(w) for input x. We stop if the attack succeeds for the victim model; otherwise, we update w and generate a new set of perturbed images. We follow [21] and update w in a coordinate-wise manner: at every outer iteration, we select the n-th index and generate two instances of w, denoted w+ and w−, by updating w_n as w_n + η and w_n − η, respectively, where η is a step size. We normalize the weight vectors so that the entries are non-negative and add up to 1.
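The coordinate-wise update just described amounts to nudging a single weight by ±η and renormalizing onto the probability simplex. A minimal sketch of one natural way to do this (our own helper, not part of the paper's notation):

```python
import torch

def perturb_weight(w, n, eta):
    """Return the two candidate weight vectors w+ and w- for coordinate n."""
    candidates = []
    for sign in (+1.0, -1.0):
        v = w.clone()
        v[n] = v[n] + sign * eta
        v = v.clamp(min=0.0)     # keep entries non-negative
        v = v / v.sum()          # renormalize so the weights sum to 1
        candidates.append(v)
    return candidates            # [w_plus, w_minus]
```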
We generate perturbations x*(w+) and x*(w−) using the PM and query the victim model. We compute the victim loss (or score) for {w+, w−} and select the weights, the perturbation vector, and the perturbed image corresponding to the smaller victim loss. We stop if the attack is successful with any query.

Algorithm 2 BASES: Blackbox Attack via Surrogate Ensemble Search
Input: image x and target class y* (for untargeted attacks, any y* ≠ y); victim model f_v; maximum number of queries Q; learning rate η; perturbation machine (PM) with surrogate ensemble
Output: adversarial perturbation δ and perturbed image x*
1: Initialize δ = 0; q = 0; w = {1/N, 1/N, ..., 1/N}
2: Generate perturbation via PM: δ, x*(w) = PM(x, w, δ)    ▷ first query with equal weights
3: Query victim model: z = f_v(x + δ)
4: Update query count: q ← q + 1
5: if argmax_c z_c = y* then
6:    break    ▷ stop if attack is successful
7: end if
8: while q < Q do
9:    Update surrogate ensemble weights as follows.    ▷ outer level updates weights
10:   Pick a surrogate index n    ▷ cyclic or random order
11:   Compute w+, w− by updating w_n as w_n + η, w_n − η, respectively
12:   Generate perturbations x*(w+), x*(w−) via PM    ▷ inner level generates queries
13:   Query victim model: f_v(x*(w+)), f_v(x*(w−))    ▷ 2 queries per coordinate
14:   Calculate victim model loss for {w+, w−} as L_v(w+), L_v(w−)
15:   Select w, δ, x*(w) for the weight vector with the smallest loss
16:   Increment q after every query, and stop if the attack is successful for any query
17: end while
18: return δ

4 Experiments

4.1 Experiment setup

In this section, we present experiments on attacking the image classification task. Additional experiments on attacking the object detection task can be found in the supplementary material.

Surrogate and victim models. We present experiments for blackbox attacks mainly using pretrained image classification models from PyTorch Torchvision [30], which is a comprehensive and actively updated package for computer vision tasks. At the time of writing this paper, Torchvision offers 56 classification models trained on the ImageNet dataset [31]. These models have different architectures and include the VGG [42], ResNet [43], SqueezeNet [44], DenseNet [45], ResNeXt [46], MobileNet [47, 48], EfficientNet [49], RegNet [50], Vision Transformer [51], and ConvNeXt [52] families. We choose different models as the victim blackbox models for our experiments, as shown in Figures 2, 3, and 4. To construct an effective surrogate ensemble for the PM, we sample 20 models from different families: {VGG-16-BN, ResNet-18, SqueezeNet-1.1, GoogLeNet, MNASNet-1.0, DenseNet-161, EfficientNet-B0, RegNet-y-400, ResNeXt-101, ConvNeXt-Small, ResNet-50, VGG-13, DenseNet-201, Inception-v3, ShuffleNet-1.0, MobileNet-v3-Small, Wide-ResNet-50, EfficientNet-B4, RegNet-x-400, ViT-B-16}. We vary the ensemble size N ∈ {4, 10, 20} by picking the first N models from the set. In most of the experiments, our method uses N = 20 models in the PM, unless otherwise specified. We also tested a different set of models pretrained on the Tiny ImageNet dataset, the details of which are included in the supplementary material. To validate the effectiveness of our method in a practical blackbox setting, we also tested the Google Cloud Vision API.
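For reference, an ensemble of this kind can be assembled directly from torchvision; the sketch below loads a few of the surrogate families listed above. The constructor names follow the torchvision API (newer torchvision versions may prefer the weights= argument over pretrained=True), and the preprocessing details are omitted.

```python
import torch
from torchvision import models

# a small subset of the surrogate ensemble described above
surrogates = [
    models.vgg16_bn(pretrained=True),
    models.resnet18(pretrained=True),
    models.squeezenet1_1(pretrained=True),
    models.densenet161(pretrained=True),
]
surrogates = [m.eval() for m in surrogates]   # inference mode
for m in surrogates:
    for p in m.parameters():
        p.requires_grad_(False)               # we only need gradients w.r.t. the input image
```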
Comparison with other methods. We compare our method with some of the state-of-the-art methods for score-based blackbox attacks. TREMBA [10] is a powerful attack method that searches for perturbations by changing the latent code of a generator trained using a set of surrogate models. GFCS [11] is a recently proposed surrogate-based attack method that probes the victim model using the surrogate gradient directions. We use their original code repositories [53, 54]. For completeness, we also compare with two earlier methods, ODS [27] and P-RGF [26], that leverage transferable priors, even though they have been shown to be less effective than GFCS and TREMBA. Additional details about the comparison with TREMBA and GFCS are provided in the supplementary material.

Dataset. We mainly use 1000 ImageNet-like images from the NeurIPS-17 challenge [55, 56], which provides the ground truth label and a target label for each image. We also provide evaluation results on Tiny ImageNet in the supplementary material.

Query budget. In this paper, we move towards a limited-access setting, since for many real-life applications, legitimate users will not be able to run many queries [28]. In contrast with TREMBA and GFCS, which set the maximum query count to 10,000 and 50,000, respectively, we set the maximum count to 500 and only run our method for 50 queries in the worst case. (TREMBA also uses only 500 queries for the Google Cloud Vision API to cut down the cost.)

Perturbation budget. We evaluated our method under both ℓ∞ and ℓ2 norm bounds, with the commonly used perturbation budgets of ℓ∞ ≤ 16 and ℓ2 ≤ 255·√(0.001·D) ≈ 3128 on a 0-255 pixel intensity scale, where D denotes the number of pixels in the image. For attacking the Google Cloud Vision API, we reduce the norm bound to ℓ∞ ≤ 12 to align with the setting in TREMBA. Results for the ℓ2 norm bound are provided in the supplementary material.

Targeted vs. untargeted attacks. All the methods achieve near perfect fooling rates for untargeted attacks in our experiments. This is because untargeted attacks on image classifiers are not challenging [11], especially when the number of classes is large. Thus, we primarily report experimental results on targeted attacks in the main text and report results for untargeted attacks in the supplementary material. We use the target labels provided in the dataset [55] in the experiments discussed in the main text. We provide additional analysis on different target label selection methods, such as choosing the easiest and hardest labels according to the confidence scores, in the supplementary material.

4.2 Score-based attacks

Targeted attacks. Figure 2 presents a performance comparison of five methods for targeted attacks on three blackbox victim models. Our proposed method provides the highest fooling rate with the least number of queries. P-RGF is found to be ineffective (almost 0% success) for targeted attacks under low query budgets. TREMBA and GFCS are similar in performance; TREMBA shows better performance when the query count is small, but GFCS matches TREMBA after nearly 100 queries. Nevertheless, our method clearly outperforms these two powerful methods by a large margin at any level of query count. We summarize the search space dimension D and query counts vs. fooling rates of different methods under a limited (and realistic) query budget for both targeted and untargeted attacks in Table 1. Our method is the most effective in terms of fooling rate vs. number of queries (and has the smallest search dimension). Additional results and details about the fair comparison and fine-tuning of TREMBA and GFCS are provided in the supplementary material.

Surrogate ensemble size (N).
To evaluate the effect of the surrogate ensemble size on the performance of our method, we performed targeted blackbox attack experiments on three different victim models using three different sizes of the surrogate ensemble: N ∈ {4, 10, 20}. The results are presented in Figure 3 in terms of fooling success rate vs. number of queries. As we increase the ensemble size, the fooling rate also increases. With N = 20, the targeted attack fooling rate is almost perfect within 50 queries. Specifically, for VGG-19 with N = 20, we improve from a 54% success rate at the first query (with equal ensemble weights) to a 96% success rate at the end of 50 queries, which equates to a 78% relative improvement. DenseNet-121 and ResNeXt-50 can achieve a 100% fooling rate with N = 20. With DenseNet-121, using 10 surrogate models, we can achieve a fooling rate of 98%. While using 4 models is challenging with respect to all victim models, we can see a rapid and significant improvement in fooling rates when the number of queries increases.

Figure 2: Comparison of 5 attack methods on three victim models, (a) VGG-19, (b) DenseNet-121, and (c) ResNeXt-50, under perturbation budget ℓ∞ ≤ 16 for targeted attacks. Our method achieves a high success rate (over 90%) with few queries (average of 3 per image).

Table 1: Number of queries (mean ± std per image) vs. fooling rate of different methods, and the search space dimension D.

| Method | D | VGG-19 targeted | VGG-19 untargeted | DenseNet-121 targeted | DenseNet-121 untargeted | ResNeXt-50 targeted | ResNeXt-50 untargeted |
|---|---|---|---|---|---|---|---|
| P-RGF [26] | 7,500 | - | 156 ± 113 (93.5%) | - | 164 ± 112 (92.9%) | - | 166 ± 116 |
| TREMBA [10] | 1,568 | 92 ± 107 | | | | | |
| ODS [27] | 1,000 | 261 ± 125 | | | | | |
| GFCS [11] | 1,000 | 101 ± 95 | | | | | |
| Ours | 20 | 3.0 ± 5.4 | | | | | |

Comparison of whitebox (gradient) vs. blackbox (queries). To check the effectiveness of our query-based coordinate descent approach for updating w, we compare its performance with the alternative approach of calculating the exact gradient of the victim loss under the whitebox setting. The results are presented in Figure 3 as dotted lines. We observe that our blackbox query approach provides similar results as the whitebox version, which implies that the coordinate-wise update of w is as good as a complete gradient update.

Figure 3: Comparison of targeted attack fooling rates with different numbers of ensemble models, N ∈ {4, 10, 20}, in the PM, for (a) VGG-19, (b) DenseNet-121, and (c) ResNeXt-50. Every experiment is performed with the whitebox gradient (denoted as WB, dotted lines) and blackbox score-based coordinate descent (denoted as BB, solid lines). The experiment was run on 100 images.

4.3 Hard-label attacks

Figure 4: Performance of blackbox attacks on 6 hard-label classifiers for (a) targeted and (b) untargeted attacks. Our method generates a sequence of queries for targeted attacks using VGG-19 as a victim model while the PM has N = 20 models in the surrogate ensemble. The experiment was performed on 100 images.

The queries generated by our PM are highly transferable and can be used to craft successful attacks on hard-label classifiers. To generate a sequence of queries for hard-label classifiers, we pick a surrogate victim model and generate queries by updating w in the same manner as the score-based attacks for Q iterations (without termination). We store the queries generated at every iteration in a query set {δ_1, ..., δ_Q}. We test the victim hard-label blackbox model using x + δ by selecting δ from the set in sequential order until either the attack succeeds or the queries are exhausted.
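The replay procedure above can be sketched as follows. This is a simplified illustration under our own naming: pm stands in for Algorithm 1, perturb_weight for the earlier coordinate-update sketch, and surrogate_loss / victim_predict are hypothetical callables; it stores one perturbation per outer iteration rather than reproducing the exact bookkeeping of Algorithm 2.

```python
import torch

def build_query_set(pm, surrogate_loss, x, target, w, Q, eta):
    """Run the score-based weight search against a *surrogate* victim (no early stop),
    saving the perturbation produced at every outer iteration."""
    deltas, delta = [], torch.zeros_like(x)
    for q in range(Q):
        n = q % len(w)                               # cycle through the coordinates
        best = None
        for w_cand in perturb_weight(w, n, eta):     # w+ and w- candidates
            d, x_adv = pm(x, w_cand, delta)          # inner level: Algorithm 1
            loss = surrogate_loss(x_adv, target)
            if best is None or loss < best[0]:
                best = (loss, w_cand, d)
        _, w, delta = best
        deltas.append(delta.clone())
    return deltas

def replay_on_hard_label(victim_predict, x, target, deltas):
    """Query the hard-label victim with the stored perturbations, in sequential order."""
    for q, delta in enumerate(deltas, start=1):
        if victim_predict(x + delta) == target:      # the victim returns only a label
            return delta, q                          # success after q queries
    return None, len(deltas)                         # no success within the stored set
```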
In our experiments, we observed that this approach can achieve a high targeted attack fooling rate on a variety of models. We present the results of our experiment in Figure 4, where we report attack success rate vs. query count for 6 models: {MobileNet-V2, ResNet-34, ConvNeXt-Base, EfficientNet-B2, RegNet-x-8, ViT-L-16}. We used VGG-19 as the surrogate victim model to generate the queries using the PM with 20 surrogate models. Using the saved surrogate perturbations, we can achieve a fooling rate of almost 100% for all models, except for ViT-L-16 [51], which is a vision transformer and architecturally very different from the majority of the surrogate ensemble models (and thus difficult to attack). Nevertheless, its fooling rate increases from 18% to 63%, which is a 250% relative improvement.

4.4 Attack on the commercial Google Cloud Vision API

We demonstrate the effectiveness of our approach under a practical blackbox setting by attacking the Google Cloud Vision (GCV) label detection API. GCV detects and extracts information about entities in an image, across a very broad group of categories containing general objects, locations, activities, animal species, and products. Thus, the label set is very different from that of ImageNet, and largely unknown to us. We have no knowledge about the detection models in this API either. We randomly select 100 images from the aforementioned ImageNet dataset that are correctly classified by GCV, and perform untargeted attacks against GCV using 20 surrogate models with a perturbation budget of ℓ∞ ≤ 12 to align with the setting in TREMBA [10]. For each input image, GCV returns a list of labels, which are usually the top 10 labels ranked by probability. Under the success metric of changing the top-1 label to any other label, same as in [10], our attack can achieve a fooling rate of 91% with only 2.9 queries per image on average, which is much lower than the 8 queries TREMBA reported for a similar experiment. We present some successful examples in Figure 5.

Figure 5: Visualization of some successful attacks on Google Cloud Vision: (a) original image (Bus), (b) attacked image, (c) original image (Fly), (d) attacked image.

We present additional results in the supplementary material that show our attacks from classification can transfer to object detection models.

5 Conclusion and discussion

We propose a novel and simple approach, BASES, to effectively perform blackbox attacks in a query-efficient manner by searching over the weight space of ensemble models. Our extensive experiments demonstrate that a wide range of models are vulnerable to our attacks at a fooling rate of over 90% with as few as 3 queries for targeted attacks. The attacks generated by our method are highly transferable and can also be used to attack hard-label classifiers. Attacks on the Google Cloud Vision API further demonstrate that our attacks generalize beyond the surrogate and victim models in our experiments.

Limitations. 1) Our method needs a diverse ensemble for attacks to be successful. Even though the search space is low-dimensional, the generated queries should span a large space so that they can fool any given victim model. This is not a major limitation for the image classification task, as a large number of models are available, but it can be a limitation for other tasks. 2) Our method relies on the PM to generate a perturbation query for every given set of weights. The perturbation generation over the surrogate ensemble is computationally expensive, especially as the ensemble size becomes large.
In our experiments, one query generation with {4, 10, 20} surrogate models requires nearly {2.4 s, 9.6 s, 18 s} per image on an Nvidia GeForce RTX 2080 Ti. Since our method requires a small number of queries, its overall computation time remains small.

Societal impacts. We propose an effective and query-efficient approach for blackbox attacks. Such adversarial attacks can potentially be used for malicious purposes. Our work can help further explain the vulnerabilities of DNN models and reduce technological surprise. We also hope this work will motivate the community to develop more robust and reliable models, since DNNs are widely used in real-life and even safety-critical applications.

Acknowledgments. This material is based upon work supported by the Defense Advanced Research Projects Agency (DARPA) under agreement number HR00112090096. Approved for public release; distribution is unlimited.

References

[1] Christian Szegedy, Wojciech Zaremba, Ilya Sutskever, Joan Bruna, Dumitru Erhan, Ian Goodfellow, and Rob Fergus. Intriguing properties of neural networks. In International Conference on Learning Representations, 2014.
[2] Ian J Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and harnessing adversarial examples. In International Conference on Learning Representations, 2015.
[3] Nicolas Papernot, Patrick McDaniel, and Ian Goodfellow. Transferability in machine learning: from phenomena to black-box attacks using adversarial samples. arXiv preprint arXiv:1605.07277, 2016.
[4] Nicolas Papernot, Patrick McDaniel, Ian Goodfellow, Somesh Jha, Z Berkay Celik, and Ananthram Swami. Practical black-box attacks against machine learning. In Proceedings of the 2017 ACM on Asia Conference on Computer and Communications Security, pages 506-519, 2017.
[5] Alexey Kurakin, Ian Goodfellow, and Samy Bengio. Adversarial examples in the physical world. arXiv preprint arXiv:1607.02533, 2016.
[6] Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu. Towards deep learning models resistant to adversarial attacks. In International Conference on Learning Representations, 2018.
[7] Yinpeng Dong, Fangzhou Liao, Tianyu Pang, Hang Su, Jun Zhu, Xiaolin Hu, and Jianguo Li. Boosting adversarial attacks with momentum. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 9185-9193, 2018.
[8] Cihang Xie, Zhishuai Zhang, Yuyin Zhou, Song Bai, Jianyu Wang, Zhou Ren, and Alan L Yuille. Improving transferability of adversarial examples with input diversity. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2730-2739, 2019.
[9] Jiadong Lin, Chuanbiao Song, Kun He, Liwei Wang, and John E Hopcroft. Nesterov accelerated gradient and scale invariance for adversarial attacks. arXiv preprint arXiv:1908.06281, 2019.
[10] Zhichao Huang and Tong Zhang. Black-box adversarial attack with transferable model-based embedding. In International Conference on Learning Representations, 2019.
[11] Nicholas A. Lord, Romain Mueller, and Luca Bertinetto. Attacking deep networks with surrogate-based adversarial black-box methods is easy. In International Conference on Learning Representations, 2022.
[12] Florian Tramèr, Alexey Kurakin, Nicolas Papernot, Dan Boneh, and Patrick McDaniel. Ensemble adversarial training: Attacks and defenses. In International Conference on Learning Representations, 2018.
[13] Weilin Xu, David Evans, and Yanjun Qi. Feature squeezing: Detecting adversarial examples in deep neural networks. In Annual Network and Distributed System Security Symposium, 2017.
[14] Chuan Guo, Mayank Rana, Moustapha Cisse, and Laurens Van Der Maaten. Countering adversarial images using input transformations. In International Conference on Learning Representations, 2017.
[15] Dongyu Meng and Hao Chen. MagNet: A two-pronged defense against adversarial examples. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, pages 135-147, 2017.
[16] Xuanqing Liu, Minhao Cheng, Huan Zhang, and Cho-Jui Hsieh. Towards robust neural networks via random self-ensemble. In Proceedings of the European Conference on Computer Vision (ECCV), pages 369-385, 2018.
[17] Pouya Samangouei, Maya Kabkab, and Rama Chellappa. Defense-GAN: Protecting classifiers against adversarial attacks using generative models. In International Conference on Learning Representations, 2018.
[18] Cihang Xie, Yuxin Wu, Laurens van der Maaten, Alan L Yuille, and Kaiming He. Feature denoising for improving adversarial robustness. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 501-509, 2019.
[19] Kui Ren, Tianhang Zheng, Zhan Qin, and Xue Liu. Adversarial attacks and defenses in deep learning. Engineering, 6(3):346-360, 2020.
[20] Tao Bai, Jinqi Luo, Jun Zhao, Bihan Wen, and Qian Wang. Recent advances in adversarial training for adversarial robustness. In International Joint Conference on Artificial Intelligence, 2021.
[21] Pin-Yu Chen, Huan Zhang, Yash Sharma, Jinfeng Yi, and Cho-Jui Hsieh. ZOO: Zeroth order optimization based black-box attacks to deep neural networks without training substitute models. In Proceedings of the 10th ACM Workshop on Artificial Intelligence and Security, pages 15-26, 2017.
[22] Chun-Chen Tu, Paishun Ting, Pin-Yu Chen, Sijia Liu, Huan Zhang, Jinfeng Yi, Cho-Jui Hsieh, and Shin-Ming Cheng. AutoZOOM: Autoencoder-based zeroth order optimization method for attacking black-box neural networks. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pages 742-749, 2019.
[23] Chuan Guo, Jacob Gardner, Yurong You, Andrew Gordon Wilson, and Kilian Weinberger. Simple black-box adversarial attacks. In International Conference on Machine Learning, pages 2484-2493. PMLR, 2019.
[24] Huichen Li, Xiaojun Xu, Xiaolu Zhang, Shuang Yang, and Bo Li. QEBA: Query-efficient boundary-based blackbox attack. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 1221-1230, 2020.
[25] Andrew Ilyas, Logan Engstrom, Anish Athalye, and Jessy Lin. Black-box adversarial attacks with limited queries and information. In International Conference on Machine Learning, 2018.
[26] Shuyu Cheng, Yinpeng Dong, Tianyu Pang, Hang Su, and Jun Zhu. Improving black-box adversarial attacks with a transfer-based prior. In Advances in Neural Information Processing Systems, volume 32, 2019.
[27] Yusuke Tashiro, Yang Song, and Stefano Ermon. Diversity can be transferred: Output diversification for white- and black-box attacks. In Advances in Neural Information Processing Systems, volume 33, pages 4536-4548, 2020.
[28] Ian Goodfellow. A research agenda: Dynamic models to defend against correlated attacks. arXiv preprint arXiv:1903.06293, 2019.
[29] Yanpei Liu, Xinyun Chen, Chang Liu, and Dawn Song. Delving into transferable adversarial examples and black-box attacks. In International Conference on Learning Representations, 2017.
[30] Adam Paszke, Sam Gross, Soumith Chintala, Gregory Chanan, Edward Yang, Zachary DeVito, Zeming Lin, Alban Desmaison, Luca Antiga, and Adam Lerer. Automatic differentiation in PyTorch. 2017.
[31] Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. ImageNet: A large-scale hierarchical image database. In 2009 IEEE Conference on Computer Vision and Pattern Recognition, pages 248-255. IEEE, 2009.
[32] Maosen Li, Cheng Deng, Tengjiao Li, Junchi Yan, Xinbo Gao, and Heng Huang. Towards transferable targeted attack. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 641-649, 2020.
[33] Zhaohui Che, Ali Borji, Guangtao Zhai, Suiyi Ling, Jing Li, and Patrick Le Callet. A new ensemble adversarial attack powered by long-term gradient memories. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, pages 3405-3413, 2020.
[34] Zuxuan Wu, Ser-Nam Lim, Larry S Davis, and Tom Goldstein. Making an invisibility cloak: Real world adversarial attacks on object detectors. In European Conference on Computer Vision, pages 1-17. Springer, 2020.
[35] Zikui Cai, Xinxin Xie, Shasha Li, Mingjun Yin, Chengyu Song, Srikanth V Krishnamurthy, Amit K Roy-Chowdhury, and M. Salman Asif. Context-aware transfer attacks for object detection. In Proceedings of the AAAI Conference on Artificial Intelligence, 2022.
[36] Zheng Yuan, Jie Zhang, Yunpei Jia, Chuanqi Tan, Tao Xue, and Shiguang Shan. Meta gradient adversarial attack. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 7748-7757, 2021.
[37] Chen Ma, Li Chen, and Jun-Hai Yong. Simulating unknown target models for query-efficient black-box attacks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 11835-11844, 2021.
[38] Fnu Suya, Jianfeng Chi, David Evans, and Yuan Tian. Hybrid batch attacks: Finding black-box adversarial examples with limited queries. In USENIX Security Symposium, 2019.
[39] Shasha Li, Abhishek Aich, Shitong Zhu, M. Salman Asif, Chengyu Song, Amit Roy-Chowdhury, and Srikanth Krishnamurthy. Adversarial attacks on black box video classifiers: Leveraging the power of geometric transformations. Advances in Neural Information Processing Systems, 34:2085-2096, 2021.
[40] Jiancheng Yang, Yangzhou Jiang, Xiaoyang Huang, Bingbing Ni, and Chenglong Zhao. Learning black-box attackers with transferable priors and query feedback. Advances in Neural Information Processing Systems, 33:12288-12299, 2020.
[41] Nicholas Carlini and David Wagner. Towards evaluating the robustness of neural networks. In 2017 IEEE Symposium on Security and Privacy (SP), pages 39-57. IEEE, 2017.
[42] Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
[43] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770-778, 2016.
[44] Forrest N Iandola, Song Han, Matthew W Moskewicz, Khalid Ashraf, William J Dally, and Kurt Keutzer. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5 MB model size. arXiv preprint arXiv:1602.07360, 2016.
[45] Gao Huang, Zhuang Liu, Laurens Van Der Maaten, and Kilian Q Weinberger. Densely connected convolutional networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4700-4708, 2017.
[46] Saining Xie, Ross Girshick, Piotr Dollár, Zhuowen Tu, and Kaiming He. Aggregated residual transformations for deep neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1492-1500, 2017.
[47] Mark Sandler, Andrew Howard, Menglong Zhu, Andrey Zhmoginov, and Liang-Chieh Chen. MobileNetV2: Inverted residuals and linear bottlenecks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4510-4520, 2018.
[48] Andrew Howard, Mark Sandler, Grace Chu, Liang-Chieh Chen, Bo Chen, Mingxing Tan, Weijun Wang, Yukun Zhu, Ruoming Pang, Vijay Vasudevan, et al. Searching for MobileNetV3. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 1314-1324, 2019.
[49] Mingxing Tan and Quoc Le. EfficientNet: Rethinking model scaling for convolutional neural networks. In International Conference on Machine Learning, pages 6105-6114. PMLR, 2019.
[50] Ilija Radosavovic, Raj Prateek Kosaraju, Ross Girshick, Kaiming He, and Piotr Dollár. Designing network design spaces. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 10428-10436, 2020.
[51] Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, et al. An image is worth 16x16 words: Transformers for image recognition at scale. In International Conference on Learning Representations, 2020.
[52] Zhuang Liu, Hanzi Mao, Chao-Yuan Wu, Christoph Feichtenhofer, Trevor Darrell, and Saining Xie. A ConvNet for the 2020s. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2022.
[53] Nicholas A. Lord, Romain Mueller, and Luca Bertinetto. GFCS. https://github.com/fiveai/GFCS, 2022. [Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License].
[54] Zhichao Huang and Tong Zhang. TREMBA. https://github.com/TransEmbedBA/TREMBA, 2019. [No license provided].
[55] Google Brain. NeurIPS 2017: Targeted adversarial attack. https://www.kaggle.com/competitions/nips-2017-targeted-adversarial-attack/data, 2017. [On Kaggle].
[56] Alexey Kurakin, Ian Goodfellow, Samy Bengio, Yinpeng Dong, Fangzhou Liao, Ming Liang, Tianyu Pang, Jun Zhu, Xiaolin Hu, Cihang Xie, et al. Adversarial attacks and defences competition. In The NIPS '17 Competition: Building Intelligent Systems, pages 195-231. Springer, 2018.
[57] Maksym Andriushchenko, Francesco Croce, Nicolas Flammarion, and Matthias Hein. Square attack: A query-efficient black-box adversarial attack via random search. In European Conference on Computer Vision, pages 484-501. Springer, 2020.
[58] Olga Russakovsky, Jia Deng, Hao Su, Jonathan Krause, Sanjeev Satheesh, Sean Ma, Zhiheng Huang, Andrej Karpathy, Aditya Khosla, Michael Bernstein, et al. ImageNet large scale visual recognition challenge. International Journal of Computer Vision, 115(3):211-252, 2015.
[59] Jiawei Du, Hu Zhang, Joey Tianyi Zhou, Yi Yang, and Jiashi Feng. Query-efficient meta attack to deep neural networks. In International Conference on Learning Representations, 2020. URL https://openreview.net/forum?id=Skxd6gSYDS.
[60] Andrew Ilyas, Logan Engstrom, and Aleksander Madry. Prior convictions: Black-box adversarial attacks with bandits and priors. arXiv preprint arXiv:1807.07978, 2018.
[61] Kai Chen, Jiaqi Wang, Jiangmiao Pang, Yuhang Cao, Yu Xiong, Xiaoxiao Li, Shuyang Sun, Wansen Feng, Ziwei Liu, Jiarui Xu, Zheng Zhang, Dazhi Cheng, Chenchen Zhu, Tianheng Cheng, Qijie Zhao, Buyu Li, Xin Lu, Rui Zhu, Yue Wu, Jifeng Dai, Jingdong Wang, Jianping Shi, Wanli Ouyang, Chen Change Loy, and Dahua Lin. MMDetection: OpenMMLab detection toolbox and benchmark. arXiv preprint arXiv:1906.07155, 2019.
[62] Kai Chen, Jiaqi Wang, Jiangmiao Pang, Yuhang Cao, Yu Xiong, Xiaoxiao Li, Shuyang Sun, Wansen Feng, Ziwei Liu, Jiarui Xu, Zheng Zhang, Dazhi Cheng, Chenchen Zhu, Tianheng Cheng, Qijie Zhao, Buyu Li, Xin Lu, Rui Zhu, Yue Wu, Jifeng Dai, Jingdong Wang, Jianping Shi, Wanli Ouyang, Chen Change Loy, and Dahua Lin. MMDetection. https://github.com/open-mmlab/mmdetection, 2019. [Apache License 2.0].
[63] Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. Faster R-CNN: Towards real-time object detection with region proposal networks. In Advances in Neural Information Processing Systems, pages 91-99, 2015.
[64] Joseph Redmon and Ali Farhadi. YOLOv3: An incremental improvement. arXiv preprint arXiv:1804.02767, 2018.
[65] Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, and Piotr Dollár. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision, pages 2980-2988, 2017.
[66] Xiaosong Zhang, Fang Wan, Chang Liu, Rongrong Ji, and Qixiang Ye. FreeAnchor: Learning to match anchors for visual object detection. In Neural Information Processing Systems, 2019.
[67] Ze Yang, Shaohui Liu, Han Hu, Liwei Wang, and Stephen Lin. RepPoints: Point set representation for object detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 9657-9666, 2019.
[68] Xingyi Zhou, Dequan Wang, and Philipp Krähenbühl. Objects as points. arXiv preprint arXiv:1904.07850, 2019.
[69] Nicolas Carion, Francisco Massa, Gabriel Synnaeve, Nicolas Usunier, Alexander Kirillov, and Sergey Zagoruyko. End-to-end object detection with transformers. In European Conference on Computer Vision, pages 213-229. Springer, 2020.
[70] Xizhou Zhu, Weijie Su, Lewei Lu, Bin Li, Xiaogang Wang, and Jifeng Dai. Deformable DETR: Deformable transformers for end-to-end object detection. In International Conference on Learning Representations, 2021. URL https://openreview.net/forum?id=gZ9hCDWe6ke.
[71] Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C Lawrence Zitnick. Microsoft COCO: Common objects in context. In European Conference on Computer Vision, pages 740-755. Springer, 2014.
Checklist

1. For all authors...
   (a) Do the main claims made in the abstract and introduction accurately reflect the paper's contributions and scope? [Yes]
   (b) Did you describe the limitations of your work? [Yes]
   (c) Did you discuss any potential negative societal impacts of your work? [Yes]
   (d) Have you read the ethics review guidelines and ensured that your paper conforms to them? [Yes]
2. If you are including theoretical results...
   (a) Did you state the full set of assumptions of all theoretical results? [N/A]
   (b) Did you include complete proofs of all theoretical results? [N/A]
3. If you ran experiments...
   (a) Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] Code is available at https://github.com/CSIPlab/BASES.
   (b) Did you specify all the training details (e.g., data splits, hyperparameters, how they were chosen)? [Yes] We provide these details in the experiment section and supplementary material.
   (c) Did you report error bars (e.g., with respect to the random seed after running experiments multiple times)? [No] Our method provides (almost) identical results for each run.
   (d) Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [Yes] We provide these details in the experiment section and supplementary material.
4. If you are using existing assets (e.g., code, data, models) or curating/releasing new assets...
   (a) If your work uses existing assets, did you cite the creators? [Yes]
   (b) Did you mention the license of the assets? [N/A]
   (c) Did you include any new assets either in the supplemental material or as a URL? [N/A]
   (d) Did you discuss whether and how consent was obtained from people whose data you're using/curating? [N/A]
   (e) Did you discuss whether the data you are using/curating contains personally identifiable information or offensive content? [N/A]
5. If you used crowdsourcing or conducted research with human subjects...
   (a) Did you include the full text of instructions given to participants and screenshots, if applicable? [N/A]
   (b) Did you describe any potential participant risks, with links to Institutional Review Board (IRB) approvals, if applicable? [N/A]
   (c) Did you include the estimated hourly wage paid to participants and the total amount spent on participant compensation? [N/A]