# Batched Energy-Entropy acquisition for Bayesian Optimization

Felix Teufel¹ ², Carsten Stahlhut¹, Jesper Ferkinghoff-Borg¹
¹Machine Intelligence, Novo Nordisk A/S   ²Department of Biology, University of Copenhagen
{fegt,ctqs,jfgb}@novonordisk.com

38th Conference on Neural Information Processing Systems (NeurIPS 2024).

Abstract

Bayesian optimization (BO) is an attractive machine learning framework for performing sample-efficient global optimization of black-box functions. The optimization process is guided by an acquisition function that selects points to acquire in each round of BO. In batched BO, where multiple points are acquired in parallel, commonly used acquisition functions are often high-dimensional and intractable, leading to the use of sampling-based alternatives. We propose a statistical-physics-inspired acquisition function for BO with Gaussian processes that can natively handle batches. Batched Energy-Entropy acquisition for BO (BEEBO) enables tight control of the explore-exploit trade-off of the optimization process and generalizes to heteroskedastic black-box problems. We demonstrate the applicability of BEEBO on a range of problems, showing competitive performance to existing methods.

1 Introduction

Figure 1: q-UCB does not allow for controlling its explore-exploit trade-off with large batches. A GP surrogate (background) was initialized with 100 random points of the Ackley function. q-UCB was run with κ = 0.1 and κ = 100, BEEBO with T′ = 0.05 and T′ = 50. Batch size Q = 100.

Bayesian Optimization (BO) has since its inception [1, 2] made a profound contribution to the realm of global optimization of black-box functions through the usage of Bayesian statistics. For global optimization problems pursuing $x^* = \arg\max_{x \in \mathcal{X}} f_{\mathrm{true}}(x)$, BO has surfaced as a premier strategy for efficiently handling especially complex and costly unknown functions $f_{\mathrm{true}}(x)$. While BO is traditionally formulated in a single-point scenario, where individual points are queried and results are observed sequentially, there are situations where batched acquisition is needed. Such situations arise when $f_{\mathrm{true}}(x)$ is expensive to evaluate in either time or cost, but can be effectively evaluated in parallel by dispatching multiple experiments, reducing the overall optimization time. This is often the case in e.g. drug discovery, materials design or hyperparameter tuning for deep models [3, 4, 5, 6, 7].

The realization that BO could be employed for the training of deep neural networks, as suggested by [3], sparked renewed research interest, with advancements encompassing a variety of areas, including the generalization to accommodate noisy inputs [8, 9], heteroskedastic noise [10, 11], multi-task problems [12], multi-fidelity [13], high-dimensional input spaces [14], and parallel methods with batch queries [15, 16]. Generally, these desired properties are addressed by customizing one of the two key components in BO, either the surrogate model or the acquisition function. The surrogate model f approximates the black-box function $f_{\mathrm{true}}$ using the available data. In BO, the surrogate is formulated from a Bayesian perspective, allowing us to quantify the model's uncertainty when evaluating new points. Typically, the model of choice is a Gaussian Process (GP) [17].
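As an illustration of this surrogate component, the sketch below fits a GP to a toy function with BoTorch and GPyTorch (the libraries used for the experiments, see Appendix C); the toy function, data sizes, and variable names are illustrative assumptions, not the paper's code.

```python
import torch
from botorch.models import SingleTaskGP
from botorch.fit import fit_gpytorch_mll
from gpytorch.mlls import ExactMarginalLogLikelihood

# Toy stand-in for the unknown black-box function f_true (illustrative only).
def f_true(x):
    return torch.sin(6.0 * x).sum(dim=-1, keepdim=True)

train_X = torch.rand(20, 2, dtype=torch.double)                              # observed inputs in [0, 1]^2
train_Y = f_true(train_X) + 0.05 * torch.randn(20, 1, dtype=torch.double)    # noisy observations

gp = SingleTaskGP(train_X, train_Y)                  # GP surrogate (Matern-5/2 kernel by default)
mll = ExactMarginalLogLikelihood(gp.likelihood, gp)
fit_gpytorch_mll(mll)                                # fit hyperparameters by marginal likelihood

# The posterior mean and variance at candidate points quantify the model's uncertainty,
# which is what the acquisition functions discussed next operate on.
candidates = torch.rand(5, 2, dtype=torch.double)
posterior = gp.posterior(candidates)
print(posterior.mean.squeeze(-1), posterior.variance.squeeze(-1))
```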
The acquisition function is responsible for guiding the selection of new input point(s) to evaluate at each optimization step, utilizing the surrogate model to identify promising regions in the input domain and to explore the unknown function further. Any acquisition process needs to trade off exploration (reducing uncertainty to learn a better surrogate model) against exploitation (selecting points with a high expected $f_{\mathrm{true}}(x)$ based on the current surrogate). In this work, we are particularly interested in acquisition processes that make this trade-off controllable using a hyperparameter. Controllability can be a desirable property if e.g. domain knowledge relating to the difficulty of the optimization process and the quality of the surrogate model is available, or if the strategy needs to be adjusted depending on future experimental budgets. Similarly, it can be desirable to acquire multiple x with high $f_{\mathrm{true}}(x)$ in a batch (as opposed to just finding the optimum $x^*$, with the remaining x being considered explorative). This is useful when optima identified in BO can be subject to constraints that are unknown at optimization time, but may render $x^*$ intractable [18]. Such constraints arise when the $f_{\mathrm{true}}$ explored in BO is a necessary simplification of the actual objective. Practical examples include e.g. the synthesizability of a material at larger scale, when BO experiments are performed at lab scale; or the in vivo activity of a molecule, with BO experiments performed in vitro.

A wide range of batch-mode acquisition functions has been proposed, with approaches often leveraging random sampling strategies or Monte Carlo (MC) integration, which can adversely affect controllability for large batches (Figure 1). In contrast, we here introduce BEEBO (Batched Energy-Entropy acquisition for BO), a statistical-physics-inspired acquisition function for BO with GP surrogate models that natively generalizes to batched acquisition. BEEBO enables

- parallel gradient-based optimization of the inputs, without requiring sampling or Monte Carlo integrals,
- tight control of the explore-exploit trade-off in batch mode using a single temperature hyperparameter,
- risk-averse BO under heteroskedastic noise.

We demonstrate the application of BEEBO on a wide range of test problems, and investigate its behaviour under heteroskedastic noise.

2 Related works

Batch variants of traditional strategies

Parallel acquisition in BO has seen a variety of approaches, often starting from established single-point acquisition functions like probability of improvement (PI), expected improvement (EI), knowledge gradient (KG) or upper confidence bound (UCB) [2, 19, 20, 21, 22]. Reformulating these to batch mode with Q query points, we obtain q-PI, q-EI, and q-UCB [23, 24]. While the single-point specifications provide an analytical form and enable gradient-based optimization, batch expressions are more challenging and require different optimization strategies, typically involving greedy algorithms [25] or deriving an integral expression over multiple points. For instance, in the popular EI acquisition function, a single point is selected by maximizing the expression

$$a_{\mathrm{EI}}(x) = \mathbb{E}\big[\max(0, f(x) - f_t)\big] = \int \max(0, f(x) - f_t)\, P(f \mid x)\, df,$$

where $f_t$ represents the best observed evaluation of $f_{\mathrm{true}}$ so far. With a surrogate model in the form of a GP, the acquisition function depends only on the predictive mean and variance functions, $\mu(x)$ and $C(x)$.
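For reference, under a Gaussian posterior with mean $\mu(x)$ and variance $C(x)$ the single-point EI integral above has the standard closed form (not restated in the text, but it is the basis of the tractability of the $Q = 1$ case):

$$a_{\mathrm{EI}}(x) = \big(\mu(x) - f_t\big)\,\Phi(z) + \sqrt{C(x)}\,\phi(z), \qquad z = \frac{\mu(x) - f_t}{\sqrt{C(x)}},$$

with $\Phi$ and $\phi$ the standard normal cumulative distribution and density functions.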
Effectively, we need to evaluate the cumulative normal distribution, which quickly becomes intractable for large batch sizes, and approximating the gradient of the q-EI acquisition function typically requires MC estimation [26, 27]. However, proper MC integration can be laborious and is sensitive to both the dimension of the problem and the choice of batch size Q. Specifically, MC methods face the curse of dimensionality when applied to high-dimensional integrals, as they require an exponentially increasing number of sample points to maintain accuracy, making them computationally impractical for such tasks [28, 29]. Of particular interest is Wilson et al. [24], who adopt the reparameterization trick [30, 31] for acquisition function integrals, enabling gradient-based approaches to the optimization of PI, EI, and UCB. This proves particularly useful in moderate to high dimensions. While EI trades off exploration and exploitation, users do not have direct control over the balance. To alleviate this, Sobester et al. [32] proposed a weighted EI formulation. An alternative strategy with an explicit explore-exploit trade-off is offered by the UCB acquisition function,

$$a_{\mathrm{UCB}}(x) = \mu(x) + \kappa\sqrt{C(x)},$$

which directly expresses exploration and exploitation as two terms, traded off by the parameter κ. As we are particularly interested in enabling this direct user control, we focus our primary comparison on q-UCB in the main text, while a more extensive comparison with alternative methods can be found in Appendix B, both theoretically and experimentally.

Greedy strategies

As mentioned, a popular approach for leveraging single-point acquisition functions is devising batch-filling strategies that score candidate points sequentially. Kriging Believer (KB) [33] uses EI to select points and iteratively updates the GP by fantasizing an observation with the posterior mean. Likewise, GP-BUCB [34] uses fantasized observations to update $\sqrt{C(x)}$ at each step. Local penalization (LP) [35] introduces a penalization function that repels the selection away from already selected points. Contal et al. [36] propose selecting a single point using UCB and dedicating the remainder of the batch budget to exploration in a restricted region around the believed optimum. GLASSES [37] treats batch selection as a multi-step lookahead problem to overcome the myopia of only considering the immediate effect of selecting a point.

Entropy-based strategies

From an information theory perspective, BO can be interpreted as seeking to reduce uncertainty over the location of optima of the unknown function. This has given rise to entropy-based acquisition functions such as entropy search (ES) [38], predictive entropy search (PES) [39] and max-value entropy search (MES) [13, 40, 41]. MES is distinct in that it seeks to quantify the mutual information between the unknown $f_{\mathrm{true}}(x^*)$ and the observations $y \mid D$, rather than the location of $x^*$. General-purpose Information-Based Bayesian OptimizatioN (GIBBON) [42] provides an extension of MES that enables application to batched acquisition as well as other challenges such as multi-fidelity BO. GIBBON proposes a lower-bound formulation for the intractable batch MES criterion, which is then optimized using greedy selection. Despite being formulated to handle a large degree of parallelism, Moss et al. [42] reported that GIBBON fails in practice for large batches with Q > 50. Potentially, this behaviour is a consequence of the accuracy of the lower-bound approximation.
A heuristic scaling of the batch diversity was proposed to improve performance with large batches. GIBBON may also be interpreted as a determinantal point process (DPP) [4, 43]. In Appendix B we provide a detailed discussion of the relationship of the BEEBO acquisition function to GIBBON and DPPs. Note that while we will also make use of the term entropy in BEEBO, the quantity is distinct from the ones leveraged by the aforementioned approaches in the sense that it does not relate to an unknown optimum.

Thompson sampling

Given the challenges of generalizing acquisition functions to batch mode, Thompson sampling (TS), which was originally adopted from bandit problems [44, 45, 46, 47, 48], is a popular alternative strategy for guiding batched BO. While being an attractive approach in general, it has been demonstrated that default TS can become too exploitative, motivating the use of alternatives such as Bayesian Quadrature [49], or advanced strategies on top of TS that ensure diversity [18]. Eriksson et al. [50] demonstrate that over-exploration can also be problematic in higher dimensions, and alleviate this using local trust regions in TuRBO. Maintaining such regions with a high-precision discretization can be memory-expensive, as indicated by [51], who suggest using MCMC-BO with adaptive local optimization to address this by transitioning a set of candidate points towards more promising positions.

3 The BEEBO acquisition function

Assume $f_{\mathrm{true}} : \mathcal{X} \to \mathbb{R}$ maps an input to some real output of interest, and let a set of data $D = \{(x_i, y_i)\}_{i=1}^{N}$ be given, where $y_i \in \mathbb{R}$ represents a noisy observation of $f_{\mathrm{true}}(x_i)$, say

$$y_i = f_{\mathrm{true}}(x_i) + \epsilon_i \qquad (1)$$

with $\epsilon_i$ denoting the measurement noise. Let $x = (x_1, \dots, x_Q) \in \mathcal{X}^Q$ represent a collection of test points we wish to assign an acquisition value to. In keeping with the BO framework, we assume a given posterior probability distribution over the surrogate function f evaluated at x,

$$f(x) \sim P(f \mid D, x). \qquad (2)$$

The lack of knowledge we have of the surrogate function at x is quantified by the differential entropy H:

$$H\big(f \mid D, x\big) = -\int P\big(f \mid D, x\big) \ln P\big(f \mid D, x\big)\, df \qquad (3)$$

This entropy can be contrasted with the expected entropy of the surrogate function if Q observations $y = (y_1, \dots, y_Q)$ were acquired at x, i.e. if the training data D were augmented with $D'(y) = \{(x_q, y_q)\}_{q=1}^{Q}$ to form the joint data set $D_{\mathrm{aug}}(y) = D \cup D'(y)$. We refer to this entropy as $H_{\mathrm{aug}}$:

$$H_{\mathrm{aug}}\big(f \mid D, x\big) = \int P(y \mid D, x)\, H\big(f \mid D_{\mathrm{aug}}(y), x\big)\, dy, \qquad (4)$$

where $P(y \mid D, x)$ represents the posterior predictive distribution at x. The expected information gain, I(x), from acquiring observations at x is given by the expected reduction of entropy from this process:

$$I(x) = H\big(f \mid D, x\big) - H_{\mathrm{aug}}\big(f \mid D, x\big) \qquad (5)$$

We propose to represent the explore component of the acquisition function $a_{\mathrm{BEEBO}}$ by I(x). The information gain I(x) is distinct from the quantities exploited by entropy search approaches, as it quantifies global uncertainty reduction, rather than estimating the information over an unknown $x^*$. The information gain is directly applicable to multivariate functions and to heteroskedastic settings where $\sigma^2 = \sigma^2(x)$. Since large measurement uncertainties imply smaller information gain, $a_{\mathrm{BEEBO}}$ exhibits risk-averse behaviour [11] by automatically prioritizing regions of small uncertainties from where more precise information about $f_{\mathrm{true}}$ can be obtained, everything else being equal.
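To make the risk-averse behaviour concrete, consider the single-point case Q = 1 (this special case is not spelled out in the text, but follows from the standard GP posterior update used in Section 3.1): observing at a point with posterior variance $C(x)$ and measurement noise $\sigma^2(x)$ reduces the variance to $C(x)\,\sigma^2(x) / \big(C(x) + \sigma^2(x)\big)$, so the information gain is

$$I(x) = \frac{1}{2}\ln C(x) - \frac{1}{2}\ln\frac{C(x)\,\sigma^2(x)}{C(x) + \sigma^2(x)} = \frac{1}{2}\ln\left(1 + \frac{C(x)}{\sigma^2(x)}\right),$$

which shrinks towards zero as $\sigma^2(x)$ grows: noisy regions offer little expected information, everything else being equal.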
The exploit component of BEEBO relies on taking expectation values of a scalar function $\mathcal{E} : \mathbb{R}^Q \to \mathbb{R}$ of the random variable f(x) that summarizes the optimality properties of a given batch x. Natural choices would be the mean or the maximum of f(x). Of particular interest is expressing the optimality as a softmax-weighted sum over f(x), as this allows us to smoothly interpolate between the two regimes:

$$E(x) = Q\, \mathbb{E}\big[\mathcal{E}(x)\big], \qquad \mathcal{E}(x) = \sum_{q=1}^{Q} \mathrm{softmax}\big(\beta f\big)_q\, f_q, \qquad (6)$$

where β is the softmax inverse temperature. At β = 0, we recover the mean. We scale the expectation with Q so that both I and E scale linearly with increasing batch size. While the mean provides a closed-form expression for its expectation, this is not the case for the general softmax-weighted sum of a multivariate normal. Using a Taylor expansion, we introduce an approximation of the expectation of the softmax-weighted sum that is fully differentiable and can be computed in closed form. A detailed derivation is provided in Appendix A. At β = 0, all Q points contribute equally to E(x), whereas at β > 0, points that do not compete for optimality are dynamically released. This effect can be quantified as the effective number of points via the entropy of the softmax weights. In the following, we will refer to the (exact) β = 0 limit as mean BEEBO, and the (approximated) general case as max BEEBO. The BEEBO acquisition function then takes the form

$$a_{\mathrm{BEEBO}}(x) = E(x) + T\, I(x), \qquad (7)$$

where T sets the balance between exploitation (small T) and exploration (large T). As both E and I scale with the batch size Q, a given choice of T sets the explore-exploit balance in an approximately Q-independent manner. This acquisition function bears a strong similarity to the definition of (negative) free energies in statistical physics, where E and I correspond to the thermodynamic energy and entropy of the system, respectively, and T corresponds to the temperature.

3.1 BEEBO with Gaussian processes

Gaussian processes offer a particularly convenient framework for BO, due to the availability of closed-form expressions for the inference step [17]. Specifically,

$$P(f \mid D, x) = \mathcal{N}\big(f \mid \mu(x), C(x)\big)$$
$$\mu(x) = K(x, x_D)\, M_D^{-1}\, y_D$$
$$C(x) = K(x, x) - K(x, x_D)\, M_D^{-1}\, K(x_D, x)$$
$$M_D = K(x_D, x_D) + \sigma^2(x_D) \qquad (8)$$

where $\mathcal{N}(\cdot \mid \mu, C)$ is the multivariate Gaussian distribution with mean µ and covariance C, $x_D$ and $y_D$ are the x and y values of the acquired data, $\sigma^2(x_D) = \mathrm{diag}(\sigma_1^2, \dots, \sigma_N^2)$ is a diagonal matrix with the measurement uncertainties on the diagonal, and $K(\cdot, \cdot)$ are matrices derived from the GP kernel $k(\cdot, \cdot)$, i.e. $K(x, x')_{ij} = k(x_i, x'_j)$. It is worth noting that C(x) only depends on the input locations of the test points x and the data points $x_D$ with their corresponding measurement uncertainties $\sigma^2(x_D)$, but not on the actual observations $y_D$. Consequently, the entropy of the posterior distribution

$$H\big(f \mid D, x\big) = \frac{Q}{2}\ln(2\pi e) + \frac{1}{2}\ln\det(C(x)) \qquad (9)$$

is independent of $y_D$ as well, with ln det denoting the log determinant. Similarly, the expected entropy of f if observations at x were acquired simply reads

$$H_{\mathrm{aug}}\big(f \mid D, x\big) = \frac{Q}{2}\ln(2\pi e) + \frac{1}{2}\ln\det(C_{\mathrm{aug}}(x)), \qquad (10)$$

$$C_{\mathrm{aug}}(x) = K(x, x) - K(x, x_{\mathrm{aug}})\, M_{\mathrm{aug}}^{-1}\, K(x_{\mathrm{aug}}, x)$$
$$M_{\mathrm{aug}} = K(x_{\mathrm{aug}}, x_{\mathrm{aug}}) + \sigma^2(x_{\mathrm{aug}}) \qquad (11)$$

and $x_{\mathrm{aug}} = x_{D_{\mathrm{aug}}}$, i.e. the inputs of the augmented data set. The BEEBO acquisition function is then given by

$$a_{\mathrm{BEEBO}}(x) = Q\, \mathbb{E}\big[\mathcal{E}(x)\big] + T\, I(x), \qquad (12)$$

where the expectation is either the mean, $\frac{1}{Q}\sum_{q=1}^{Q}\mu_q$, or the closed-form approximation of the softmax-weighted sum described in Appendix A, and

$$I(x) = \frac{1}{2}\ln\det(C(x)) - \frac{1}{2}\ln\det(C_{\mathrm{aug}}(x)). \qquad (13)$$
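The quantities in Equations (8)-(13) can be computed directly from kernel matrices. The sketch below does this for mean BEEBO (β = 0) with an RBF kernel; the kernel choice, fixed hyperparameters, and variable names are illustrative assumptions rather than the paper's implementation, which builds on BoTorch/GPyTorch (Appendix C.1).

```python
import torch

def rbf_kernel(a, b, lengthscale=0.3, amplitude=1.0):
    # k(x, x') = A * exp(-||x - x'||^2 / (2 * l^2)); illustrative kernel choice.
    d2 = torch.cdist(a, b).pow(2)
    return amplitude * torch.exp(-0.5 * d2 / lengthscale ** 2)

def gp_posterior(x, x_D, y_D, noise_D):
    # Equation (8): posterior mean and covariance of the GP at the batch x.
    M_D = rbf_kernel(x_D, x_D) + torch.diag(noise_D)
    K_xD = rbf_kernel(x, x_D)
    mu = K_xD @ torch.linalg.solve(M_D, y_D)
    C = rbf_kernel(x, x) - K_xD @ torch.linalg.solve(M_D, K_xD.T)
    return mu, C

def mean_beebo(x, x_D, y_D, noise_D, noise_x, T=1.0, jitter=1e-8):
    mu, C = gp_posterior(x, x_D, y_D, noise_D)
    # Equations (10)-(11): covariance after augmenting the data with the batch itself.
    x_aug = torch.cat([x_D, x])
    noise_aug = torch.cat([noise_D, noise_x])
    M_aug = rbf_kernel(x_aug, x_aug) + torch.diag(noise_aug)
    K_xaug = rbf_kernel(x, x_aug)
    C_aug = rbf_kernel(x, x) - K_xaug @ torch.linalg.solve(M_aug, K_xaug.T)
    # Equation (13): information gain as a difference of log determinants
    # (the paper's implementation uses an SVD-based log det for stability, cf. Appendix C.1).
    eye = jitter * torch.eye(x.shape[0], dtype=x.dtype)
    info_gain = 0.5 * (torch.logdet(C + eye) - torch.logdet(C_aug + eye))
    # Equation (12) at beta = 0: the energy term is the sum of the posterior means.
    return mu.sum() + T * info_gain
```

Because all of these operations are differentiable, the batch x can be maximized by gradient ascent on this value using automatic differentiation (e.g. x.requires_grad_(True) followed by repeated optimizer steps), which is what Algorithm 1 below formalizes.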
Algorithm 1: mean BEEBO optimization

Input: model GP, initial batch points x, temperature T
repeat
    Calculate µ(x), C(x) from Equation 8 using GP
    E ← Σ_{q=1}^{Q} µ_q
    GP_aug ← fantasize(GP, x)
    Calculate C_aug(x) from Equation 11 using GP_aug
    I ← ½ ln det(C(x)) − ½ ln det(C_aug(x))
    a ← E + T · I
    x ← x + γ ∇_x a
until converged
Output: optimized batch points x

All operations needed to compute the acquisition value $a_{\mathrm{BEEBO}}(x)$ are analytical. Using automatic differentiation, the batch of points x can therefore be optimized with gradient-based methods, as laid out for mean BEEBO in Algorithm 1, with learning rate γ. In the pseudocode, GP denotes a trained GP model object that holds the training data and the kernel function. Using the kernel's learned amplitude A, we can relate BEEBO's T parameter to the κ of UCB. This allows us to configure BEEBO using a scaled temperature T′ that ensures both methods have equal gradients at iso-surfaces, enabling the user to follow existing guidance and intuition from UCB to control the trade-off. A derivation is provided in Appendix B.1.

4 Experiments

Table 1: Overview of the test problems used in the experiments.

| Function | Dimension |
| --- | --- |
| Ackley | 2, 10, 20, 50, 100 |
| Shekel | 4 |
| Hartmann | 6 |
| Cosine | 8 |
| Rastrigin | 2, 10, 20, 50, 100 |
| Rosenbrock | 2, 10, 20, 50, 100 |
| Styblinski-Tang | 2, 10, 20, 50, 100 |
| Powell | 10, 20, 50, 100 |
| Embedded Hartmann 6 | 100 |

Test problems

We benchmark acquisition function performance on a range of maximization test problems with varying dimensions (Table 1) available in BoTorch [52]. Test problems that are evaluated on multiple dimensions support specifying an arbitrary dimension d. As a high-dimensional problem with low inherent dimensionality, we embed the six-dimensional Hartmann function in d = 100 [50, 53, 54]. We additionally test on two robot control problems (robot arm pushing and rover trajectory planning) in Appendix D.3 [55, 56]. On each test problem, we perform 10 rounds of BO using q-UCB or BEEBO with a given explore-exploit parameter for direct comparison. We use the scaled temperature T′ (Appendix B.1) to ensure that both methods operate at the same trade-off. In round 0, we seed the surrogate GP with Q random points that were drawn so that each point has a minimum distance of 0.5 to the test problem's true optimum. We perform ten replicate runs for each problem and method, with replicate seeds controlled so that all methods start from the same Q random points in a replicate. As we evaluate performance in a fixed-round, fixed-Q optimization scenario, we set the explore hyperparameter to 0 in the last round (for max BEEBO, we also set the softmax β to 0). We use Q = 100 for all experiments, which is commonly understood to be a large batch size [50]. Additional results on small batch sizes (5, 10) are provided in Appendix D.2. All experiments use BoTorch's default utilities for acquisition function optimization and GPyTorch [57] GP training (Appendix C.1).

Heteroskedastic noise

We investigate performance when optimizing under heteroskedastic noise on the 2D Branin function with three global optima. To construct a heteroskedastic problem, we specify noise so that the noise level is maximal at optima 2 and 3, decaying exponentially with distance to either of the two noised optima (Appendix C.4). No noise maximum is added at optimum 1. Therefore, while all three optima share the same $f_{\mathrm{true}}(x)$ (Figure A1), only optimum 1 is favorable in terms of heteroskedastic risk.
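The noise model for this experiment is defined precisely in Appendix C.4 (Equation A45); a short sketch of it is given below, with the optimum locations of the (inverted) Branin function taken from that appendix (the helper name and the printed checks are illustrative).

```python
import numpy as np

# Optima of the (inverted) Branin function (coordinates from Appendix C.4).
X_OPT_1 = np.array([9.42478, 2.475])   # optimum 1: no noise maximum added here
X_OPT_2 = np.array([-np.pi, 12.275])   # optimum 2: noised
X_OPT_3 = np.array([np.pi, 2.275])     # optimum 3: noised

def noise_variance(x, sigma2_max=100.0, lam=0.05):
    """Heteroskedastic noise of Equation A45: maximal at optima 2 and 3, decaying
    exponentially with the squared distance to the nearer of the two."""
    d2 = min(np.sum((x - X_OPT_2) ** 2), np.sum((x - X_OPT_3) ** 2))
    return sigma2_max * np.exp(-lam * d2)

print(noise_variance(X_OPT_2))  # = 100 at a noised optimum
print(noise_variance(X_OPT_1))  # considerably lower at the risk-favourable optimum 1
```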
We perform BO for ten rounds with β = 0.1 and Q = 10 using a heteroskedastic GP that learns surrogate models for both $f_{\mathrm{true}}(x)$ and $\sigma^2(x)$. We report results over five replicate runs.

Metrics

We report the mean best observed objective value after 10 rounds over the five replicates. As test problems have highly varying scales, we normalize the results on each test problem using min-max normalization. Typically, the minimum of a maximization problem is not known explicitly. We therefore set the minimum for normalization to the highest value observed among the random seed points. The maximum is given by the $f_{\mathrm{true}}(x^*)$ of the problem. The metric thus directly quantifies how much progress has been made towards the true optimum from the random starting configuration, on a 0-1 scale. As we are not only interested in identifying a single x with good $f_{\mathrm{true}}(x)$, we additionally quantify the overall quality of the final (exploitative) batch. We compute the batch instantaneous regret $R = \sum_q \dots$

[…] 5, numerical errors prevent a reliable calculation of the expectation. In practice, $A^{1/2}$ lies in a range that allows numerically accurate solutions.

Softmax

When $y_{\max}$ grows much larger than the softmax input vector f (a situation that can arise easily when initializing with random points for gradient-based optimization), the softmax weights ω can become numerically zero for all "real" points, thus leading to E(x) = 0 and vanishing gradients. As we always wish to preserve a minimal energy contribution from the real points, we parametrize the inverse temperature applied to $y_{\max}$, $\beta_y$, using a hyperparameter α that denotes the minimal fraction of probability mass pertaining to the real points. This parametrization resembles the LogEI version of the expected improvement acquisition function [58], which addresses the problem of vanishing EI gradients. Let N denote the softmax denominator excluding $y_{\max}$, $N = \sum_{j=1}^{Q} \exp(\beta f_j)$. We define

$$\exp(\beta_y\, y_{\max}) = \min\left(\frac{1-\alpha}{\alpha}\, N,\ \exp(\beta\, y_{\max})\right) \qquad (A28)$$

We used α = 0.05 as a default in all our experiments.

A.4 Number of effective points

We can interpret the entropy of the softmax weights in terms of the number of effective points contributing to the energy of the batch. The entropy H of the softmax is given by

$$H(\omega) = -\sum_{i=1}^{Q} \omega_i \ln(\omega_i), \qquad (A29)$$

and the number of effective points, $D_{\mathrm{eff}}$, is $\exp(H(\omega))$, so that

$$D_{\mathrm{eff}} = \exp\left(-\sum_{i=1}^{Q} \omega_i \ln(\omega_i)\right). \qquad (A30)$$

$D_{\mathrm{eff}}$ is bounded by 1 (approaching the maximum) and Q (approaching the mean). Note that if we include $y_{\max}$ in the softmax denominator, we add $-\omega_{y_{\max}} \ln(\omega_{y_{\max}})$ to H(ω), and the resulting number becomes bounded by 1 and Q + 1.

B Relationship to other acquisition strategies

In the following section, we will discuss how BEEBO is related to UCB, GIBBON, Determinantal Point Processes (DPP), the Local Penalization heuristic, and RAHBO. We will base our analysis on mean BEEBO, as the softmax-mediated interdependency of points in max BEEBO prevents a simple interpretation of the objective in a single-point stepwise manner and does not allow for the same direct analogies to other strategies.

B.1 Relationship of the BEEBO T and UCB κ hyperparameters

BEEBO bears some resemblance to the UCB acquisition function, which in the single-particle mode, Q = 1, reads

$$a_{\mathrm{UCB}}(x) = \mu(x) + \kappa\sqrt{C(x)}, \qquad (A31)$$

where the parameter κ controls the balance between exploitation and exploration, and µ(x) and C(x) are respectively the mean and variance of the posterior distribution P(f | x, D), as before. We note that $a_{\mathrm{UCB}}$ does not account for the uncertainty of the measurement at x, and therefore remains risk-neutral under heteroskedastic noise [11].
To understand the relationship between BEEBO and UCB, we will therefore limit ourselves to the homoskedastic case and furthermore assume that the measurement variances σ² are much smaller than the typical prior variance A of the GP surrogate of f, e.g. $A \approx N^{-1}\operatorname{Tr}(K)$, so $\sigma^2 \ll A$ and $M^{-1} = (K + \sigma^2)^{-1} \approx K^{-1}$. In this limit, the variance of f(x) after measurement (indexed at i = n, say) reduces to σ²:

$$C_{\mathrm{aug}}(x) = \big[(K^{-1} + \sigma^{-2} I)^{-1}\big]_{nn} = \big[K (K + \sigma^2 I)^{-1} \sigma^2\big]_{nn} \approx \sigma^2 \qquad (A32)$$

and the information gain becomes $I(x) \approx \frac{1}{2}\ln(C(x)) - \ln(\sigma)$. Consequently, the gradients of the two acquisition functions read

$$\nabla a_{\mathrm{UCB}}(x) = \nabla\mu(x) + \frac{\kappa}{2\sqrt{C(x)}}\, \nabla C(x), \qquad \nabla a_{\mathrm{BEEBO}}(x) = \nabla\mu(x) + \frac{T}{2\, C(x)}\, \nabla C(x).$$

The two gradients will be identical at points x where the posterior uncertainties satisfy $\sqrt{C(x)} = T/\kappa$. For comparison, we may desire equal gradients at iso-surfaces corresponding to a given fraction ν of the prior uncertainty scale $\sqrt{A}$, by setting T accordingly as $T = \nu\sqrt{A}\,\kappa$. In our experiments, we use ν = 1/2 and configure BEEBO using a dimensionless explore-exploit parameter T′, defined as $T' = T/\sqrt{A}$, and set $T' = \frac{1}{2}\kappa$ for a given benchmark experiment.

B.2 GIBBON

GIBBON [42] approximates the (intractable) general-purpose max-value entropy search acquisition function, which quantifies the mutual information $MI(f^*_{\mathrm{true}}; y \mid D)$ of a batch of measurements y and the unknown optimum $f^*_{\mathrm{true}}$. It does so using a lower bound on the information gain and MC estimation of the expectation over $f^*_{\mathrm{true}}$. It can be written as

$$\alpha_{\mathrm{GIBBON}}(x) = \frac{1}{2}\ln\det(R) - \frac{1}{2|M|}\sum_{m \in M}\sum_{i=1}^{Q} \ln\left[1 - \rho_i^2\, \frac{\phi(\gamma_i(m))}{\Phi(\gamma_i(m))}\left(\gamma_i(m) + \frac{\phi(\gamma_i(m))}{\Phi(\gamma_i(m))}\right)\right]$$
$$\alpha_{\mathrm{GIBBON}}(x) = \frac{1}{2}\ln\det(R) + \sum_{i=1}^{Q} \hat{\alpha}_{\mathrm{GIBBON}}(x_i), \qquad (A33)$$

where R is the correlation matrix with entries $R_{ij} = C(x)_{ij} / \sqrt{C(x)_{ii}\, C(x)_{jj}}$, M is a set of samples of the max-value $f^*_{\mathrm{true}}$, and $\rho_i$ is the correlation of $y_i$ and $f_{\mathrm{true}}(x_i)$. Φ and φ are the standard normal cumulative distribution and probability density functions, and $\gamma_i(m) = \big(m - \mu(x_i)\big)/\sqrt{C(x)_{ii}}$.

The definition of BEEBO introduced in Equation 7, with the scalar summarization function set to the expected mean, $\mathbb{E}[\mathcal{E}(x)] = \frac{1}{Q}\sum_{i=1}^{Q}\mu(x_i)$, gives

$$\alpha_{\mathrm{BEEBO}}(x) = T\, \frac{1}{2}\big(\ln\det(C(x)) - \ln\det(C_{\mathrm{aug}}(x))\big) + \sum_{i=1}^{Q}\mu(x_i). \qquad (A34)$$

From the second formulation of GIBBON, it becomes obvious that, although distinct in their motivation and derivation, BEEBO and GIBBON implement acquisition functions with a similar structure. Taking an information-theoretic and multi-fidelity BO standpoint, GIBBON refers to this trade-off as diversity against quality, whereas in BEEBO we follow the intuitions of UCB and use exploration and exploitation.

Quality / Exploitation: GIBBON employs an MC estimate of the lower-bound approximation of the information gain provided by each point, whereas BEEBO directly summarizes the optimality of all points in closed form, either as their mean or as an approximated softmax-weighted sum.

Diversity / Exploration: In GIBBON, the diversity derived from the differential entropy H(f | D, x) is the entropy of the posterior correlation, $\frac{1}{2}\ln\det(R)$. In BEEBO, we employ the reduction of entropy, the information gain I(x). Under homoskedastic noise, I(x) reduces to $\frac{1}{2}\ln\det(C(x))$ up to an additive constant. Since $R(x) = \mathrm{diag}(C(x))^{-1/2}\, C(x)\, \mathrm{diag}(C(x))^{-1/2}$, we have that $\ln\det(R) = \ln\det(C(x)) - \sum_{i}^{Q}\ln(C(x)_{ii})$. Maximizing the log determinant of R therefore penalizes points that have high variance.

Therefore, while GIBBON presents an attractive approximation of max-value entropy search for batched acquisition, BEEBO is an alternative that avoids approximating a quality criterion using MC.
Moreover, GIBBON's diversity criterion implicitly penalizes points that have high variance, whereas BEEBO's criterion maximizes the reduction of variance. We find that BEEBO is orders of magnitude faster to compute than GIBBON (Figure A3).

In the context of large batches (Q ≫ 10), a modification of GIBBON exists that is further similar to BEEBO. Departing from the strict max-value entropy search derivation, a scaling factor $Q^{-2}$ is introduced to counteract a growing dominance of the diversity term:

$$\alpha_{\mathrm{GIBBON}}^{\mathrm{scaled}}(x) = \frac{1}{2Q^2}\ln\det(R) + \sum_{i=1}^{Q} \hat{\alpha}_{\mathrm{MES}}(x_i). \qquad (A35)$$

This scaling is motivated by the fact that R contains Q² elements. However, we note that R is summarized by its log determinant, which scales linearly in Q: as the determinant is the product of the eigenvalues, the log determinant is the sum of the log-eigenvalues. The number of eigenvalues scales linearly with the matrix size Q, and so does the log determinant.

B.3 Determinantal Point Processes

A Determinantal Point Process [43] specifies a probability over a set of points, or a "configuration of points", drawn from a ground set. Specifically, the probability of a set of Q points x is given by

$$P(x) \propto \det(L_x), \qquad (A36)$$

where $L_x$ is a Q × Q symmetric matrix. Kulesza et al. [43] provide a decomposition of the general DPP kernel L that makes quality and diversity components explicit, so that

$$L_{ij} = q(x_i)\, q(x_j)\, k(x_i, x_j), \qquad (A37)$$

with $k : \mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}_+$ being a similarity kernel, and $q : \mathbb{R}^d \to \mathbb{R}$ being a unary scalar quality function. This framework is naturally amenable to batch BO, as we seek to select a collection of points that trade off quality (optimality) and diversity. Note that both k and q are distinct functions that need to be specified by the user, leading to the practical complication that they must be chosen very carefully so that their scales do not dominate each other, which limits the utility of this decomposition in practice [42]. In the following, we show how BEEBO is equivalent to a DPP, and derive the necessary k and q. Again, we consider BEEBO

$$\alpha_{\mathrm{BEEBO}}(x) = E(x) + T\, I(x), \qquad (A38)$$

with the scalar summarization function set to $E(x) = \sum_{i=1}^{Q} f(x_i)$. We will first focus on the information gain term I(x), which we can rearrange as

$$\frac{1}{2}\ln\det(C(x)) - \frac{1}{2}\ln\det(C_{\mathrm{aug}}(x)) = \frac{1}{2}\ln\det\big(C(x)\, C_{\mathrm{aug}}^{-1}(x)\big). \qquad (A39)$$

Our similarity kernel k is therefore given by the entries of the matrix $S = C(x)\, C_{\mathrm{aug}}(x)^{-1}$, so that $k(x_i, x_j) = S_{ij}$. Note that due to the augmented covariance term, the implied k also depends on all other currently selected points in x, and $L_x$ is not a submatrix of an all-sample L. Therefore, BEEBO does not implement a DPP under heteroskedastic noise. However, if we only consider homoskedastic noise, BEEBO's I(x) simplifies to the posterior entropy [63], and therefore S = C(x). As C(x) can be accessed as a submatrix of an all-sample C, this permits a DPP. Given the choice of E(f), we can rewrite BEEBO as

$$\alpha_{\mathrm{BEEBO}}(x) = \frac{T}{2}\ln\det(S) + \sum_{i=1}^{Q}\mu_i$$
$$\frac{2}{T}\,\alpha_{\mathrm{BEEBO}}(x) = \ln\det(S) + \frac{2}{T}\sum_{i=1}^{Q}\mu_i$$
$$\frac{2}{T}\,\alpha_{\mathrm{BEEBO}}(x) = \ln\det(S) + \ln\det(D), \qquad D = \mathrm{diag}\big(\exp(\tfrac{2}{T}\mu_i)\big)$$
$$\frac{2}{T}\,\alpha_{\mathrm{BEEBO}}(x) = \ln\det\big(D^{\frac{1}{2}}\, S\, D^{\frac{1}{2}}\big), \qquad \big(D^{\frac{1}{2}}\big)_{ii} = \exp(\tfrac{1}{T}\mu_i)$$
$$\frac{2}{T}\,\alpha_{\mathrm{BEEBO}}(x) = \ln\det(L), \qquad (A40)$$

where L is a matrix with entries $L_{ij} = S_{ij}\, \exp(\tfrac{1}{T}\mu_i)\, \exp(\tfrac{1}{T}\mu_j)$. BEEBO therefore uses the DPP quality function $q(x_i) = \exp(\tfrac{1}{T}\mu_i)$, and, as proven previously for GIBBON, a batch x with maximal $\alpha_{\mathrm{BEEBO}}$ corresponds to the MAP of a DPP.
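The algebra above is straightforward to verify numerically. The sketch below (illustrative; a random positive-definite matrix stands in for S) checks that $\frac{2}{T}\alpha_{\mathrm{BEEBO}}$ equals the log determinant of the quality-weighted kernel L:

```python
import numpy as np

rng = np.random.default_rng(0)
Q, T = 5, 2.0
A = rng.normal(size=(Q, Q))
S = A @ A.T + Q * np.eye(Q)      # random symmetric positive-definite stand-in for S
mu = rng.normal(size=Q)          # posterior means of the batch points

# alpha_BEEBO = sum(mu) + (T/2) ln det S   (Equations A38-A39 with E(x) = sum_i f(x_i))
alpha = mu.sum() + 0.5 * T * np.linalg.slogdet(S)[1]

# DPP kernel: L_ij = S_ij exp(mu_i / T) exp(mu_j / T)   (Equation A40)
q = np.exp(mu / T)
L = S * np.outer(q, q)

# The two quantities agree, so maximizing alpha_BEEBO maximizes det(L), i.e. the DPP MAP.
assert np.isclose(2.0 / T * alpha, np.linalg.slogdet(L)[1])
```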
B.4 Local penalization

Local penalization (LP) is a greedy batch selection strategy that, given any arbitrary single-point acquisition function, ensures diversity by applying a penalization function $\psi(x, x')$ that down-weights the acquisition value of candidate locations x based on their proximity to already selected points. The criterion for selecting $x_i$ is given by

$$x_i = \arg\max_x\; \alpha(x) \prod_{j} \psi(x, x_j). \qquad (A41)$$

Note that in this formulation, the product includes all previously selected points, not just the current batch. The penalization function ψ may in principle be chosen freely. Gonzalez et al. [35] propose exploiting the fact that $f_{\mathrm{true}}$ is Lipschitz continuous in order to bound the position of the unknown optimum and penalize accordingly. The Lipschitz constant L is inferred from the GP surrogate and used to parametrize ψ. In LP, acquisition function optimization proceeds iteratively. After an $x_i$ is chosen, the corresponding penalizing multiplier is added to the objective before optimizing for the next $x_{i+1}$.

While BEEBO enables optimization to proceed in parallel, it is of course possible to also optimize BEEBO greedily (under homoskedastic noise, I is submodular). In this case, it implements an LP strategy where α(x) = µ(x). Rather than a product of individual $\mathbb{R}^d \times \mathbb{R}^d \to \mathbb{R}$ function evaluations, the penalizer implied by BEEBO is the information gain $I(x) : \mathbb{R}^{i \times d} \to \mathbb{R}$, which we evaluate by concatenating a candidate point to the already acquired x at each iteration. As in GIBBON, this constitutes an LP strategy that does not require estimation of any properties of $f_{\mathrm{true}}$ beyond learning the GP surrogate.

B.5 RAHBO

Risk-averse Heteroskedastic Bayesian Optimization (RAHBO) [11] is a UCB-derived single-point acquisition function that avoids heteroskedastic risk, preferentially selecting points with low noise. While it is not applicable to batched acquisition directly, we here compare it to single-sample BEEBO to highlight different ways of addressing noise. Given a heteroskedastic surrogate model that learns an additional GP for the noise, the variance proxy, RAHBO reads

$$\alpha_{\mathrm{RAHBO}}(x) = \mathrm{UCB}_f(x) - \alpha\, \mathrm{LCB}_{\mathrm{var}}(x)$$
$$\alpha_{\mathrm{RAHBO}}(x) = \mu_f(x) + \beta_f\, \sigma_f(x) - \alpha\big(\mu_{\mathrm{var}}(x) - \beta_{\mathrm{var}}\, \sigma_{\mathrm{var}}(x)\big), \qquad (A42)$$

where $\mu_f$ and $\sigma_f$ are the posterior mean and variance of the surrogate model and $\beta_f$ is the standard UCB trade-off hyperparameter, yielding the standard upper confidence bound $\mathrm{UCB}_f$. α is the chosen risk tolerance, and LCB is the lower confidence bound of the variance GP with posterior mean $\mu_{\mathrm{var}}$ and variance $\sigma_{\mathrm{var}}$, traded off using $\beta_{\mathrm{var}}$. At Q = 1, BEEBO can be expressed as

$$\alpha_{\mathrm{BEEBO}}(x) = \mu_f(x) + T\,\tfrac{1}{2}\ln(\sigma_f(x)) - T\,\tfrac{1}{2}\ln(\sigma_f^{\mathrm{aug}}(x))$$
$$\alpha_{\mathrm{BEEBO}}(x) = \mu_f(x) + T\,\tfrac{1}{2}\ln\frac{\sigma_f(x)}{\sigma_f^{\mathrm{aug}}(x)}, \qquad (A43)$$

where the variance proxy at x is considered via the augmented posterior variance $\sigma_f^{\mathrm{aug}}$. While RAHBO penalizes risk on an absolute scale, subject to α, BEEBO optimizes for high uncertainty reduction, quantified as the log ratio of the variance before and after making measurements. Moreover, RAHBO differentiates between known and unknown variance proxies, and uses the $\mathrm{LCB}_{\mathrm{var}}$ term to discount the predicted variance according to its uncertainty. In its closed-form analytical expression, BEEBO does not allow the uncertainty of the variance proxy to be taken into account, making it more similar to the known-variance RAHBO

$$\alpha_{\mathrm{RAHBO}}(x) = \mu_f(x) + \beta_f\, \sigma_f(x) - \alpha\, \mu_{\mathrm{var}}(x), \qquad (A44)$$

where $\mu_{\mathrm{var}}$ is a noise-free proxy. Either a sampling-based approach or approximations to I(x) would need to be introduced to handle variance-proxy uncertainty in BEEBO.
C Implementation details

C.1 Acquisition function optimization

BEEBO was implemented for full compatibility with the BoTorch framework (version 0.9.4) [52] as an AnalyticAcquisitionFunction. Standard BoTorch utilities for initializing and training GPs, initializing q-batches, and performing gradient-descent optimization of the acquisition function are used. We trained GPyTorch (version 1.11) [57] GP models with KeOps [64] Matérn-5/2 kernels (following BoTorch defaults with a separate length scale for each input dimension, and Gamma priors on the length and output scales). Log determinants for the information gain were computed using singular value decomposition for numerical stability. GPyTorch provides a get_fantasy_model method that allows for the efficient augmentation of the training data of a GP with a set of points, as done in BEEBO. However, we observed that GPyTorch's implementation suffers from GPU memory leaks when used with automatic differentiation enabled. We therefore instantiate augmented models explicitly, not making use of the (more efficient) augmentation strategy. All experiments were performed with double precision. SobolQMCNormalSampler was used for acquisition functions making use of the reparametrization trick. Experiments were run on individual Nvidia RTX 6000 and V100 GPUs. Five replicates for the benchmarking experiments required a total of approx. 5,000 RTX 6000 GPU hours, with the majority of the run time dedicated to the GIBBON baseline rather than to BEEBO itself (Figure A3, Table A9).

C.2 Benchmark BO methods

All methods were benchmarked in BoTorch. For q-EI, we used LogEI [58]. For TS, 10,000 base Sobol samples were drawn and sampled with MaxPosteriorSampling using the Cholesky decomposition of the covariance matrix. GIBBON was optimized using sequential optimization following the BoTorch tutorial. We additionally implemented a custom version of GIBBON that applies the $Q^{-2}$ scaling factor to the diversity term, as proposed in GIBBON's supplementary material. We used 100,000 random discretized candidates for max-value sampling. In a few iterations, optimizing GIBBON seemed challenging, with BoTorch reporting that no nonzero initialization candidate could be identified. KB was optimized using a custom greedy optimization loop with fantasized observations, using (single-sample) LogEI as the underlying acquisition function. TuRBO-1 was optimized following its BoTorch tutorial. None of these methods use a hyperparameter for controlling their explore-exploit trade-off. The results are therefore based on 10 iterations at defaults.

C.3 Test problems

Test functions: All test functions were used in their BoTorch implementations. As done in previous work, the embedded Hartmann function was created by appending all-0 dummy dimensions to the original six dimensions [53, 54, 50].

Control problems: We consider two control problems from previous work: a 14-dimensional parameter tuning task for controlling robot arms pushing two objects to a target location [55], and a 60-dimensional trajectory planning task for a rover navigating through a maze of obstacles [56]. Instead of converting the problem objectives into rewards as in the original work, we operate on the actual minimization objectives directly (distance to target, navigation loss), and follow BoTorch's approach of simply inverting the objective in order to yield maximization problems. Both problems were adapted from their available implementations in Wang et al.
[56] to follow the Bo Torch test problem API. C.4 Heteroskedastic noise The (inverted) Branin function has three global optima f(x ) = 0.397887 at x 1 = (9.42478, 2.475), x 2 = ( π, 12.275) and x 3 = (π, 2.275). We define heteroskedastic noise so that the variance is maximal at x 2 and x 3. The noise decays exponentially with the distance from any of the two noised optima at a rate λ. σ2(x) = σ2 max exp( λ min( x x 2 2, x x 3 2) (A45) For our experiments, we set σ2 max = 100 and λ = 0.05. As the surrogate function, we use a Heteroskedastic Single Task GP provided in Bo Torch. This model learns two GPs simultaneously, one for the function f(x) and one for the (also unknown) variance function σ2(x). When querying the oracle with a batch of points, noised observations of f(x) are provided together with the true σ2 at each point. The homoskedastic control experiment uses a Single Task GP with inferred noise level. The homoskedastic noise is set to σ2 = 77.5, which is the average noise level of the heteroskedastic function over the whole domain. Figure A1: The Branin function with added heteroskedastic noise following Equation A45. σ2 max = 100, λ = 0.05. D Extended results D.1 Results including additional baselines Table A1: BO on noise-free synthetic test problems. The normalized highest observed value after 10 rounds of BO with q=100 is shown. Colors are normalized row-wise. The BEE-BO and q-UCB columns are equivalent to Table 2. Higher means better. Results are means over five replicate runs. Problem d mean BEEBO max BEEBO q-UCB q-EI TS KB GIBBON Tu RBO T =0.05 T =0.5 T =5.0 T =0.05 T =0.5 T =5.0 κ=0.1 κ=1.0 κ=10.0 - - - default scaled - Ackley 2 0.993 0.005 0.985 0.031 0.975 0.035 0.982 0.023 0.980 0.035 0.988 0.013 0.973 0.023 0.967 0.022 0.988 0.011 0.987 0.012 1.000 0.000 0.981 0.014 0.878 0.100 0.951 0.027 0.951 0.027 Levy 2 1.000 0.000 0.999 0.001 0.999 0.001 1.000 0.000 1.000 0.000 0.998 0.002 1.000 0.000 1.000 0.000 0.998 0.002 1.000 0.000 0.999 0.002 1.000 0.000 0.988 0.008 0.993 0.010 0.993 0.010 Rastrigin 2 0.981 0.024 0.989 0.016 0.983 0.016 0.993 0.007 0.983 0.011 0.993 0.006 0.951 0.021 0.983 0.015 0.933 0.025 0.995 0.007 1.000 0.000 0.976 0.021 0.903 0.087 0.944 0.038 0.944 0.038 Rosenbrock 2 0.976 0.045 0.956 0.071 0.955 0.080 0.982 0.032 0.979 0.027 0.938 0.123 0.949 0.074 0.943 0.129 0.962 0.079 0.982 0.029 0.966 0.079 0.976 0.068 0.633 0.355 0.843 0.301 0.843 0.301 Styblinski-Tang 2 0.961 0.072 1.000 0.000 1.000 0.000 1.000 0.000 1.000 0.000 1.000 0.001 1.000 0.000 1.000 0.000 0.999 0.001 1.000 0.000 1.000 0.000 1.000 0.000 0.996 0.003 0.999 0.001 0.999 0.001 Shekel 4 0.540 0.242 0.915 0.082 0.698 0.296 0.300 0.079 0.378 0.230 0.411 0.222 0.244 0.116 0.330 0.212 0.264 0.023 0.515 0.253 0.155 0.033 0.371 0.220 0.187 0.057 0.282 0.076 0.282 0.076 Hartmann 6 1.000 0.000 1.000 0.000 0.986 0.013 0.894 0.073 0.976 0.045 0.974 0.042 0.918 0.058 0.950 0.052 0.889 0.062 0.993 0.014 0.810 0.056 0.993 0.013 0.844 0.073 0.887 0.074 0.887 0.074 Cosine 8 1.000 0.000 0.999 0.001 0.619 0.099 0.999 0.001 0.972 0.016 0.895 0.077 0.934 0.032 0.924 0.032 0.621 0.192 0.802 0.060 1.000 0.000 0.985 0.011 0.900 0.071 0.937 0.046 0.937 0.046 Ackley 10 0.915 0.033 0.908 0.042 0.822 0.034 0.819 0.024 0.736 0.053 0.546 0.073 0.800 0.048 0.772 0.051 0.513 0.140 0.802 0.049 1.000 0.000 0.746 0.045 0.410 0.062 0.548 0.100 0.548 0.100 Levy 10 0.989 0.003 0.966 0.022 0.966 0.032 0.966 0.023 0.953 0.015 0.914 0.045 0.904 0.041 0.904 0.041 0.560 0.238 0.931 0.023 0.958 0.008 0.978 0.012 0.889 0.049 0.881 
0.075 0.881 0.075 Powell 10 0.987 0.010 0.970 0.015 0.861 0.122 0.951 0.046 0.949 0.040 0.909 0.085 0.920 0.056 0.916 0.047 0.283 0.236 0.920 0.052 0.883 0.057 0.971 0.009 0.755 0.204 0.834 0.118 0.834 0.118 Rastrigin 10 0.463 0.123 0.536 0.157 0.595 0.066 0.558 0.091 0.573 0.122 0.590 0.087 0.420 0.075 0.522 0.081 0.311 0.138 0.456 0.101 1.000 0.000 0.431 0.099 0.359 0.135 0.210 0.158 0.210 0.158 Rosenbrock 10 0.994 0.005 0.991 0.007 0.904 0.068 0.991 0.007 0.986 0.010 0.975 0.028 0.966 0.033 0.971 0.024 0.645 0.255 0.984 0.005 0.870 0.050 0.993 0.005 0.974 0.021 0.979 0.015 0.979 0.015 Styblinski-Tang 10 0.837 0.072 0.835 0.068 0.289 0.155 0.822 0.050 0.638 0.107 0.229 0.156 0.309 0.190 0.492 0.167 0.049 0.104 0.740 0.147 0.430 0.198 0.852 0.071 0.056 0.080 0.255 0.193 0.255 0.193 Ackley 20 0.827 0.045 0.851 0.032 0.777 0.067 0.818 0.023 0.781 0.035 0.404 0.100 0.741 0.027 0.753 0.030 0.474 0.148 0.731 0.048 1.000 0.000 0.755 0.050 0.252 0.106 0.299 0.110 0.299 0.110 Levy 20 0.949 0.028 0.943 0.017 0.889 0.073 0.945 0.025 0.904 0.041 0.907 0.056 0.926 0.031 0.900 0.045 0.819 0.075 0.934 0.025 0.977 0.004 0.955 0.024 0.879 0.097 0.746 0.157 0.746 0.157 Powell 20 0.955 0.028 0.965 0.017 0.872 0.085 0.939 0.020 0.913 0.077 0.915 0.061 0.948 0.031 0.913 0.058 0.845 0.104 0.936 0.040 0.966 0.016 0.967 0.020 0.912 0.061 0.876 0.094 0.876 0.094 Rastrigin 20 0.399 0.083 0.473 0.064 0.508 0.043 0.484 0.089 0.472 0.088 0.522 0.074 0.423 0.081 0.480 0.061 0.401 0.073 0.456 0.070 1.000 0.000 0.447 0.084 0.413 0.047 0.397 0.098 0.397 0.098 Rosenbrock 20 0.993 0.003 0.995 0.003 0.907 0.069 0.992 0.005 0.983 0.013 0.933 0.047 0.973 0.014 0.982 0.008 0.924 0.049 0.987 0.007 0.946 0.021 0.995 0.002 0.953 0.044 0.980 0.016 0.980 0.016 Styblinski-Tang 20 0.737 0.065 0.689 0.114 0.330 0.165 0.667 0.099 0.394 0.090 0.274 0.055 0.203 0.105 0.561 0.143 0.034 0.053 0.621 0.104 0.210 0.201 0.645 0.112 0.104 0.081 0.332 0.112 0.332 0.112 Ackley 50 0.235 0.275 0.342 0.276 0.823 0.045 0.623 0.264 0.594 0.218 0.465 0.103 0.638 0.041 0.759 0.015 0.730 0.021 0.745 0.042 1.000 0.000 0.546 0.140 0.273 0.067 0.179 0.121 0.179 0.121 Levy 50 0.940 0.082 0.965 0.020 0.943 0.018 0.971 0.019 0.958 0.016 0.879 0.025 0.948 0.016 0.951 0.035 0.941 0.015 0.952 0.012 0.987 0.001 0.955 0.016 0.880 0.023 0.901 0.095 0.901 0.095 Powell 50 0.954 0.029 0.982 0.008 0.950 0.023 0.975 0.008 0.969 0.010 0.938 0.024 0.970 0.004 0.961 0.015 0.980 0.007 0.965 0.014 0.986 0.003 0.957 0.030 0.955 0.014 0.939 0.027 0.939 0.027 Rastrigin 50 0.322 0.137 0.476 0.042 0.432 0.051 0.472 0.034 0.470 0.021 0.439 0.024 0.431 0.039 0.397 0.050 0.481 0.041 0.468 0.036 1.000 0.000 0.409 0.068 0.447 0.045 0.305 0.098 0.305 0.098 Rosenbrock 50 0.971 0.016 0.984 0.002 0.983 0.010 0.976 0.008 0.981 0.005 0.962 0.019 0.968 0.005 0.986 0.003 0.981 0.012 0.979 0.005 0.977 0.003 0.984 0.011 0.973 0.013 0.977 0.008 0.977 0.008 Styblinski-Tang 50 0.584 0.087 0.675 0.062 0.393 0.161 0.509 0.064 0.342 0.090 0.325 0.060 0.312 0.105 0.694 0.039 0.356 0.079 0.632 0.129 0.236 0.205 0.699 0.044 0.203 0.039 0.506 0.084 0.506 0.084 Ackley 100 0.277 0.343 0.190 0.344 0.863 0.028 0.417 0.361 0.540 0.281 0.682 0.093 0.708 0.015 0.645 0.102 0.844 0.016 0.742 0.074 0.007 0.004 0.299 0.120 0.244 0.231 0.348 0.246 0.348 0.246 Emb. 
Hartmann 6 100 0.951 0.090 0.987 0.008 0.928 0.076 0.957 0.079 0.936 0.068 0.907 0.057 0.896 0.076 0.913 0.064 0.916 0.122 0.870 0.153 0.446 0.294 0.912 0.118 0.633 0.163 0.636 0.169 0.636 0.169 Levy 100 0.837 0.155 0.961 0.024 0.944 0.016 0.966 0.013 0.950 0.017 0.940 0.019 0.950 0.008 0.934 0.035 0.964 0.009 0.952 0.021 0.183 0.290 0.903 0.053 0.763 0.283 0.913 0.182 0.913 0.182 Powell 100 0.810 0.036 0.952 0.066 0.983 0.008 0.985 0.002 0.980 0.006 0.979 0.006 0.982 0.005 0.981 0.009 0.984 0.005 0.971 0.015 0.397 0.332 0.964 0.022 0.691 0.115 0.685 0.123 0.685 0.123 Rastrigin 100 0.497 0.034 0.401 0.160 0.459 0.024 0.441 0.117 0.455 0.026 0.455 0.029 0.446 0.016 0.442 0.041 0.443 0.020 0.482 0.023 0.332 0.461 0.634 0.088 0.290 0.112 0.286 0.085 0.286 0.085 Rosenbrock 100 0.822 0.075 0.971 0.012 0.980 0.009 0.953 0.084 0.976 0.009 0.970 0.011 0.972 0.006 0.969 0.016 0.978 0.013 0.974 0.003 0.290 0.377 0.936 0.052 0.966 0.075 0.931 0.114 0.931 0.114 Styblinski-Tang 100 0.537 0.045 0.474 0.110 0.401 0.106 0.423 0.045 0.353 0.055 0.296 0.030 0.308 0.042 0.532 0.109 0.278 0.041 0.536 0.077 0.205 0.248 0.627 0.058 0.222 0.053 0.221 0.054 0.221 0.054 Mean 0.795 0.828 0.788 0.811 0.790 0.744 0.758 0.801 0.678 0.819 0.734 0.813 0.631 0.667 0.727 Median 0.940 0.961 0.889 0.951 0.949 0.907 0.920 0.913 0.819 0.931 0.966 0.955 0.755 0.834 0.819 Table A2: BO on noise-free synthetic test problems. The relative batch instantaneous regret of the last, exploitative batch is shown. Colors are normalized row-wise. The BEEBO and q-UCB columns are equivalent to Table 3. Lower means better. Results are means over five replicate runs. Problem d mean BEEBO max BEEBO q-UCB q-EI TS KB GIBBON Tu RBO T =0.05 T =0.5 T =5.0 T =0.05 T =0.5 T =5.0 κ=0.1 κ=1.0 κ=10.0 - - - default scaled - Ackley 2 0.292 0.102 0.268 0.120 0.257 0.098 0.259 0.125 0.245 0.109 0.165 0.098 1.006 0.020 0.999 0.022 1.002 0.024 0.946 0.050 0.498 0.203 0.800 0.157 1.048 0.025 0.821 0.191 0.248 0.138 Levy 2 0.134 0.055 0.092 0.056 0.102 0.037 0.114 0.012 0.102 0.037 0.111 0.012 1.236 0.378 1.046 0.134 1.114 0.167 1.106 0.138 0.280 0.288 0.184 0.194 1.637 0.568 0.678 0.477 0.091 0.160 Rastrigin 2 0.455 0.053 0.425 0.147 0.407 0.335 0.578 0.230 0.454 0.150 0.500 0.055 1.010 0.069 0.999 0.042 1.020 0.060 0.839 0.115 0.692 0.180 0.763 0.169 1.106 0.130 0.600 0.223 0.062 0.064 Rosenbrock 2 0.001 0.001 0.001 0.001 0.002 0.001 0.004 0.002 0.004 0.002 0.002 0.001 0.992 0.325 1.094 0.337 1.014 0.194 1.215 0.314 0.004 0.005 0.008 0.009 1.875 1.701 0.045 0.052 0.000 0.000 Styblinski-Tang 2 0.168 0.012 0.169 0.008 0.170 0.008 0.172 0.006 0.170 0.009 0.170 0.008 1.024 0.095 1.027 0.062 1.051 0.095 0.851 0.155 0.039 0.002 0.285 0.413 1.674 1.055 1.240 0.719 0.181 0.064 Shekel 4 0.810 0.094 0.688 0.090 0.688 0.090 0.776 0.073 0.730 0.057 0.695 0.082 0.993 0.009 0.995 0.004 0.988 0.006 0.964 0.032 0.936 0.030 1.000 0.016 1.003 0.003 0.985 0.028 0.587 0.125 Hartmann 6 0.060 0.022 0.078 0.030 0.100 0.015 0.229 0.111 0.086 0.072 0.098 0.028 0.968 0.057 0.971 0.015 0.862 0.054 0.867 0.043 0.358 0.007 0.352 0.194 1.029 0.011 1.023 0.016 0.049 0.023 Cosine 8 0.045 0.135 0.001 0.001 0.222 0.063 0.001 0.000 0.016 0.008 0.061 0.039 0.953 0.066 0.975 0.050 0.922 0.059 1.112 0.158 0.446 0.026 1.069 0.283 1.690 0.142 0.678 0.290 0.087 0.047 Ackley 10 0.478 0.143 0.314 0.082 0.253 0.056 0.338 0.055 0.345 0.080 0.452 0.063 0.931 0.014 0.943 0.015 0.950 0.022 0.942 0.070 0.983 0.012 0.917 0.230 1.017 0.003 1.014 0.004 0.316 0.088 Levy 10 0.041 0.042 0.023 0.020 0.261 0.067 
0.030 0.026 0.048 0.030 0.103 0.066 1.188 0.166 1.011 0.083 1.111 0.081 0.741 0.202 0.520 0.082 0.352 0.188 2.704 0.222 1.387 1.054 0.022 0.011 Powell 10 0.016 0.021 0.009 0.002 0.067 0.022 0.027 0.012 0.067 0.027 0.151 0.044 1.037 0.184 1.101 0.156 1.215 0.275 0.232 0.093 0.148 0.042 0.041 0.018 2.882 0.202 0.432 0.568 0.003 0.002 Rastrigin 10 0.629 0.091 0.523 0.132 0.567 0.130 0.563 0.156 0.541 0.104 0.402 0.150 0.920 0.027 0.907 0.048 0.905 0.036 0.931 0.069 0.809 0.100 0.962 0.100 1.191 0.022 0.782 0.060 0.344 0.107 Rosenbrock 10 0.002 0.000 0.004 0.002 0.074 0.009 0.013 0.008 0.015 0.007 0.052 0.021 0.906 0.109 0.770 0.098 0.918 0.075 0.078 0.041 0.098 0.016 0.005 0.001 2.462 0.153 0.251 0.161 0.004 0.006 Styblinski-Tang 10 0.196 0.027 0.223 0.029 0.559 0.056 0.220 0.040 0.337 0.022 0.496 0.063 1.174 0.136 1.126 0.077 1.219 0.055 1.247 0.344 0.809 0.103 0.689 0.144 2.339 0.229 1.118 0.393 0.177 0.075 Ackley 20 0.629 0.143 0.282 0.157 0.226 0.068 0.219 0.036 0.292 0.063 0.586 0.084 0.945 0.030 0.950 0.020 0.917 0.012 0.927 0.091 0.979 0.002 1.017 0.001 1.016 0.002 0.841 0.118 0.571 0.104 Levy 20 0.128 0.095 0.063 0.067 0.140 0.111 0.241 0.151 0.113 0.055 0.182 0.062 0.839 0.109 0.914 0.148 1.056 0.121 0.941 1.228 0.734 0.040 0.212 0.038 3.269 0.155 0.967 0.762 0.058 0.025 Powell 20 0.093 0.059 0.010 0.012 0.028 0.019 0.081 0.022 0.074 0.015 0.110 0.024 0.809 0.117 0.689 0.117 0.870 0.217 0.106 0.028 0.487 0.102 0.019 0.005 3.510 0.321 0.238 0.153 0.009 0.003 Rastrigin 20 0.686 0.068 0.610 0.053 0.541 0.094 0.600 0.044 0.635 0.058 0.555 0.044 0.864 0.034 0.838 0.042 0.852 0.035 0.784 0.116 0.861 0.020 0.858 0.330 1.246 0.072 0.651 0.111 0.487 0.109 Rosenbrock 20 0.047 0.031 0.004 0.002 0.036 0.026 0.105 0.057 0.048 0.038 0.051 0.013 0.591 0.171 0.578 0.147 0.903 0.130 0.060 0.018 0.387 0.096 0.006 0.002 2.681 0.116 0.326 0.233 0.005 0.004 Styblinski-Tang 20 0.426 0.187 0.378 0.128 0.691 0.165 0.398 0.074 0.504 0.036 0.578 0.074 1.113 0.099 1.107 0.121 1.177 0.094 0.924 0.100 0.887 0.035 0.765 0.127 2.765 0.114 1.070 0.459 0.287 0.108 Ackley 50 0.895 0.042 0.738 0.246 0.177 0.044 0.464 0.270 0.606 0.233 0.530 0.102 0.949 0.033 0.947 0.037 0.874 0.025 0.842 0.045 0.986 0.001 0.567 0.255 1.015 0.001 0.835 0.128 0.820 0.051 Levy 50 0.055 0.047 0.033 0.029 0.051 0.044 0.029 0.016 0.085 0.116 0.268 0.071 0.611 0.105 0.681 0.092 0.892 0.277 0.113 0.070 0.881 0.014 0.093 0.030 1.807 0.611 0.196 0.125 0.232 0.043 Powell 50 0.018 0.009 0.014 0.005 0.018 0.008 0.021 0.020 0.078 0.066 0.064 0.029 0.542 0.149 0.499 0.137 0.785 0.166 0.052 0.033 0.868 0.034 0.021 0.010 0.646 0.640 0.162 0.362 0.053 0.028 Rastrigin 50 0.793 0.184 0.653 0.247 0.795 0.374 0.573 0.061 0.592 0.048 0.585 0.075 0.813 0.030 0.810 0.044 0.768 0.019 0.662 0.050 0.934 0.012 0.860 0.353 1.234 0.220 0.716 0.121 0.560 0.037 Rosenbrock 50 0.016 0.009 0.049 0.123 0.010 0.007 0.021 0.009 0.031 0.019 0.048 0.021 0.539 0.175 0.520 0.134 0.594 0.142 0.033 0.010 0.801 0.037 0.014 0.005 1.337 0.430 0.031 0.022 0.055 0.027 Styblinski-Tang 50 0.463 0.206 0.676 1.257 0.681 0.142 0.478 0.099 0.574 0.079 0.727 0.031 1.012 0.065 1.196 0.136 0.981 0.049 0.685 0.238 0.961 0.018 0.620 0.358 1.696 0.490 0.617 0.178 0.454 0.055 Ackley 100 0.718 0.340 0.900 0.256 0.137 0.027 0.636 0.313 0.466 0.275 0.321 0.092 0.948 0.031 0.935 0.036 0.863 0.038 0.805 0.087 0.997 0.001 0.731 0.156 1.005 0.013 0.940 0.080 0.902 0.009 Emb. 
Hartmann 6 100 0.068 0.052 0.035 0.031 0.175 0.119 0.144 0.134 0.086 0.089 0.172 0.117 0.573 0.042 0.863 0.041 0.692 0.131 0.423 0.307 0.869 0.023 0.110 0.107 0.864 0.021 0.872 0.026 0.098 0.094 Levy 100 0.119 0.103 0.044 0.032 0.042 0.013 0.031 0.013 0.164 0.103 0.056 0.025 0.615 0.085 0.716 0.107 0.586 0.107 0.139 0.096 0.975 0.021 0.094 0.050 1.129 0.956 0.116 0.208 0.303 0.028 Powell 100 0.094 0.017 0.027 0.022 0.013 0.009 0.011 0.003 0.041 0.054 0.018 0.016 0.465 0.070 0.493 0.092 0.524 0.130 0.086 0.161 1.008 0.049 0.027 0.010 0.431 0.080 0.420 0.183 0.104 0.013 Rastrigin 100 0.506 0.051 0.604 0.142 0.540 0.035 0.501 0.092 0.557 0.053 0.544 0.047 0.759 0.019 0.832 0.034 0.780 0.025 0.658 0.050 0.986 0.012 0.624 0.087 0.918 0.214 0.713 0.144 0.584 0.018 Rosenbrock 100 0.114 0.043 0.027 0.011 0.014 0.009 0.044 0.051 0.048 0.039 0.031 0.015 0.518 0.100 0.589 0.127 0.507 0.092 0.091 0.072 0.962 0.045 0.046 0.026 0.758 0.520 0.127 0.205 0.133 0.019 Styblinski-Tang 100 0.389 0.030 0.503 0.222 0.562 0.142 0.522 0.095 0.582 0.132 0.742 0.082 0.924 0.042 1.203 0.188 0.930 0.049 0.584 0.208 0.979 0.013 0.333 0.059 0.752 0.052 0.855 0.179 0.538 0.027 Mean 0.29 0.26 0.26 0.26 0.26 0.29 0.87 0.89 0.90 0.64 0.70 0.44 1.57 0.66 0.26 Median 0.13 0.09 0.17 0.22 0.16 0.17 0.93 0.94 0.92 0.78 0.86 0.35 1.23 0.71 0.18 Table A3: Paired t-test p-values for the results of mean BEEBO in Table 2. The combined p-value was computed using Fisher s method. P-values smaller than 0.05 are indicated in bold. mean BEEBO T =0.05 mean BEEBO T =0.5 mean BEEBO T =5.0 Problem d q-UCB q-EI TS KB GIBBON GIBBON (s) Tu RBO q-UCB q-EI TS KB GIBBON GIBBON (s) Tu RBO q-UCB q-EI TS KB GIBBON GIBBON (s) Tu RBO Ackley 2 8E-03 7E-02 1E+00 9E-03 3E-03 8E-04 1E-01 1E-01 6E-01 9E-01 3E-01 8E-03 1E-02 4E-01 9E-01 8E-01 1E+00 7E-01 8E-03 7E-02 6E-01 Levy 2 4E-01 5E-01 2E-01 1E+00 6E-04 4E-02 7E-02 7E-01 1E+00 3E-01 1E+00 5E-04 5E-02 7E-02 2E-01 1E+00 7E-01 1E+00 9E-04 7E-02 7E-02 Rastrigin 2 7E-03 9E-01 1E+00 3E-01 1E-02 1E-02 9E-01 2E-01 8E-01 1E+00 4E-02 7E-03 2E-03 5E-01 5E-04 1E+00 1E+00 2E-01 4E-03 9E-03 8E-01 Rosenbrock 2 1E-01 8E-01 2E-01 5E-01 6E-03 1E-01 3E-02 3E-01 9E-01 8E-01 9E-01 6E-03 1E-01 1E-01 8E-01 9E-01 9E-01 1E+00 9E-03 1E-01 8E-02 Styblinski-Tang 2 9E-01 9E-01 9E-01 9E-01 9E-01 9E-01 2E-05 3E-01 5E-01 1E-01 9E-01 2E-03 5E-02 5E-06 1E-01 6E-01 6E-02 1E+00 2E-03 1E-02 5E-06 Shekel 4 5E-03 4E-01 4E-04 4E-02 4E-04 6E-03 9E-01 4E-06 4E-04 8E-10 2E-05 1E-08 7E-09 5E-03 6E-04 9E-02 1E-04 1E-02 2E-04 6E-04 4E-01 Hartmann 6 8E-04 7E-02 1E-06 7E-02 4E-05 5E-04 1E-04 7E-03 7E-02 1E-06 7E-02 4E-05 5E-04 1E-04 3E-04 9E-01 2E-06 9E-01 1E-04 1E-03 3E-04 Cosine 8 6E-05 1E-06 1E+00 1E-03 8E-04 1E-03 6E-05 2E-05 1E-06 1E+00 2E-03 9E-04 1E-03 8E-05 5E-01 1E+00 1E+00 1E+00 1E+00 1E+00 1E+00 Ackley 10 4E-05 3E-05 1E+00 3E-06 5E-09 1E-06 8E-05 6E-05 2E-04 1E+00 5E-06 5E-10 8E-07 1E-04 8E-05 1E-01 1E+00 8E-04 4E-09 1E-05 1E-02 Levy 10 5E-05 1E-05 1E-07 6E-03 5E-05 7E-04 3E-03 1E-03 3E-03 1E-01 9E-01 8E-04 5E-03 7E-01 2E-04 1E-03 2E-01 8E-01 1E-04 5E-04 7E-01 Powell 10 3E-03 1E-03 1E-04 3E-04 3E-03 1E-03 1E-01 2E-03 9E-03 6E-04 6E-01 4E-03 2E-03 9E-01 1E-05 9E-01 7E-01 1E+00 4E-02 3E-01 1E+00 Rastrigin 10 2E-01 4E-01 1E+00 2E-01 6E-02 6E-05 1E+00 4E-01 6E-02 1E+00 5E-03 5E-03 4E-05 1E+00 1E-04 7E-05 1E+00 3E-04 8E-06 4E-05 1E+00 Rosenbrock 10 5E-03 2E-03 6E-06 1E-01 1E-03 8E-04 3E-01 5E-03 3E-03 5E-06 1E+00 4E-03 4E-03 8E-01 4E-03 1E+00 1E-01 1E+00 1E+00 1E+00 1E+00 Styblinski-Tang 10 6E-06 1E-02 3E-04 8E-01 2E-09 2E-06 3E-03 1E-05 1E-02 
2E-04 8E-01 3E-09 3E-06 3E-03 8E-04 1E+00 1E+00 1E+00 2E-04 3E-01 1E+00 Ackley 20 8E-04 7E-04 1E+00 9E-03 4E-08 1E-07 3E-06 7E-05 1E-04 1E+00 1E-03 6E-09 2E-07 1E-06 9E-06 4E-02 1E+00 2E-01 2E-08 1E-06 8E-06 Levy 20 6E-02 2E-01 1E+00 8E-01 2E-02 1E-03 1E-01 1E-02 2E-01 1E+00 1E+00 3E-02 1E-03 2E-01 2E-02 1E+00 1E+00 1E+00 4E-01 1E-02 9E-01 Powell 20 3E-01 1E-02 1E+00 1E+00 2E-02 7E-03 1E+00 2E-03 1E-02 6E-01 6E-01 9E-03 5E-03 8E-01 2E-01 1E+00 1E+00 1E+00 1E+00 6E-01 1E+00 Rastrigin 20 8E-01 1E+00 1E+00 9E-01 7E-01 5E-01 1E+00 6E-01 2E-01 1E+00 1E-01 2E-02 2E-02 9E-01 1E-04 1E-02 1E+00 2E-02 2E-03 2E-03 6E-01 Rosenbrock 20 6E-04 1E-02 2E-05 1E+00 9E-03 1E-02 3E-01 4E-05 3E-04 1E-05 8E-01 8E-03 6E-03 1E-01 8E-01 1E+00 1E+00 1E+00 1E+00 1E+00 1E+00 Styblinski-Tang 20 1E-08 5E-04 2E-05 2E-03 3E-10 8E-07 1E-04 2E-02 6E-02 6E-05 1E-01 3E-08 2E-05 3E-02 5E-05 1E+00 1E-01 1E+00 5E-04 5E-01 1E+00 Ackley 50 1E+00 1E+00 1E+00 1E+00 6E-01 3E-01 4E-01 1E+00 1E+00 1E+00 1E+00 3E-01 6E-02 8E-02 2E-04 4E-03 1E+00 2E-04 1E-09 2E-07 6E-10 Levy 50 6E-01 7E-01 1E+00 7E-01 3E-02 8E-02 9E-05 2E-01 4E-02 1E+00 8E-02 8E-06 2E-02 3E-08 4E-01 9E-01 1E+00 9E-01 6E-06 1E-01 2E-06 Powell 50 9E-01 8E-01 1E+00 6E-01 5E-01 1E-01 1E-02 2E-04 3E-04 1E+00 8E-03 2E-06 9E-05 1E-06 1E+00 1E+00 1E+00 7E-01 8E-01 9E-02 1E-02 Rastrigin 50 1E+00 1E+00 1E+00 1E+00 1E+00 4E-01 1E+00 5E-04 3E-01 1E+00 3E-02 5E-02 1E-04 1E-01 1E+00 1E+00 1E+00 2E-01 8E-01 4E-03 8E-01 Rosenbrock 50 3E-01 9E-01 9E-01 1E+00 6E-01 8E-01 2E-02 9E-01 8E-03 3E-05 4E-01 1E-02 2E-02 4E-04 3E-01 2E-01 4E-02 6E-01 2E-02 1E-01 3E-04 Styblinski-Tang 50 1E-04 9E-01 2E-04 1E+00 3E-07 3E-02 3E-03 9E-01 1E-01 4E-05 8E-01 9E-10 3E-04 1E-05 2E-01 1E+00 8E-02 1E+00 2E-03 9E-01 9E-01 Ackley 100 1E+00 1E+00 2E-02 6E-01 5E-01 7E-01 7E-02 1E+00 1E+00 6E-02 8E-01 6E-01 7E-01 2E-01 4E-02 5E-04 3E-15 3E-07 2E-05 2E-04 2E-15 Emb. Hartmann 6 100 7E-02 1E-02 4E-04 1E-01 2E-05 3E-05 3E-01 2E-03 2E-02 1E-04 4E-02 4E-05 5E-05 3E-02 4E-01 2E-01 8E-05 4E-01 6E-05 8E-05 7E-01 Levy 100 1E+00 1E+00 2E-05 9E-01 3E-01 9E-01 6E-03 2E-02 2E-01 6E-06 1E-03 3E-02 2E-01 3E-09 1E+00 8E-01 9E-06 3E-02 4E-02 3E-01 5E-10 Powell 100 1E+00 1E+00 1E-03 1E+00 3E-03 5E-03 1E+00 9E-01 8E-01 2E-04 8E-01 4E-07 7E-05 1E-03 7E-01 3E-02 2E-04 2E-02 1E-05 2E-05 2E-08 Rastrigin 100 2E-03 1E-01 1E-01 1E+00 5E-04 4E-06 6E-05 8E-01 9E-01 3E-01 1E+00 2E-03 2E-02 4E-01 7E-02 1E+00 2E-01 1E+00 3E-04 9E-05 6E-05 Rosenbrock 100 1E+00 1E+00 3E-04 1E+00 1E+00 1E+00 9E-01 2E-01 7E-01 2E-04 2E-02 4E-01 2E-01 6E-08 4E-01 5E-02 1E-04 2E-02 3E-01 1E-01 5E-08 Styblinski-Tang 100 3E-06 5E-01 8E-04 1E+00 4E-08 4E-08 3E-06 9E-01 9E-01 9E-03 1E+00 3E-05 3E-05 3E-02 3E-03 1E+00 2E-02 1E+00 4E-04 1E-03 4E-01 Combined 4E-16 2E-02 8E-02 1E+00 6E-47 1E-40 2E-18 1E-20 2E-12 7E-06 1E-03 2E-63 3E-51 2E-36 2E-21 1E+00 1E+00 1E+00 4E-44 4E-27 2E-23 Table A4: Paired t-test p-values for the results of max BEEBO in Table 2. The combined p-value was computed using Fisher s method. P-values smaller than 0.05 are indicated in bold. 
max BEEBO T =0.05 max BEEBO T =0.5 max BEEBO T =5.0 Problem d q-UCB q-EI TS KB GIBBON GIBBON (s) Tu RBO q-UCB q-EI TS KB GIBBON GIBBON (s) Tu RBO q-UCB q-EI TS KB GIBBON GIBBON (s) Tu RBO Ackley 2 2E-01 7E-01 1E+00 5E-01 3E-03 2E-03 4E-01 2E-01 7E-01 9E-01 5E-01 1E-02 4E-02 5E-01 5E-01 5E-01 1E+00 1E-01 4E-03 5E-03 3E-01 Levy 2 8E-01 9E-01 3E-01 1E+00 7E-04 5E-02 7E-02 3E-01 9E-01 2E-01 1E+00 5E-04 4E-02 7E-02 5E-01 1E+00 9E-01 1E+00 3E-03 9E-02 7E-02 Rastrigin 2 2E-04 7E-01 1E+00 8E-03 5E-03 2E-03 3E-01 5E-01 1E+00 1E+00 2E-01 1E-02 4E-03 9E-01 3E-05 7E-01 1E+00 2E-02 6E-03 1E-03 2E-01 Rosenbrock 2 8E-02 5E-01 2E-01 3E-01 6E-03 9E-02 3E-02 2E-01 6E-01 3E-01 4E-01 5E-03 9E-02 4E-02 9E-01 9E-01 1E+00 1E+00 1E-02 2E-01 2E-01 Styblinski-Tang 2 4E-01 2E-01 7E-04 1E+00 2E-03 1E-02 5E-06 5E-01 8E-01 2E-01 1E+00 2E-03 3E-02 5E-06 4E-01 8E-01 6E-01 9E-01 4E-03 1E-01 5E-06 Shekel 4 1E-01 1E+00 2E-04 8E-01 5E-04 3E-01 1E+00 3E-01 9E-01 5E-03 5E-01 2E-02 1E-01 1E+00 3E-02 9E-01 4E-03 3E-01 7E-03 4E-02 1E+00 Hartmann 6 9E-01 1E+00 2E-05 1E+00 4E-03 3E-01 1E+00 1E-01 8E-01 1E-05 8E-01 2E-04 2E-03 2E-02 9E-04 9E-01 7E-06 9E-01 2E-04 3E-03 1E-02 Cosine 8 6E-05 1E-06 1E+00 1E-03 8E-04 1E-03 7E-05 1E-03 2E-05 1E+00 1E+00 4E-03 2E-02 8E-04 2E-03 1E-02 1E+00 1E+00 6E-01 9E-01 9E-01 Ackley 10 2E-01 2E-01 1E+00 2E-04 2E-08 1E-05 2E-02 9E-01 1E+00 1E+00 7E-01 6E-08 6E-05 9E-01 2E-01 1E+00 1E+00 1E+00 1E-04 5E-01 1E+00 Levy 10 5E-04 3E-03 9E-02 9E-01 6E-04 5E-03 8E-01 3E-03 4E-03 9E-01 1E+00 7E-04 5E-03 1E+00 1E-03 9E-01 1E+00 1E+00 1E-01 1E-01 1E+00 Powell 10 8E-02 4E-02 1E-03 9E-01 2E-03 1E-03 1E+00 4E-02 4E-02 2E-03 9E-01 3E-03 2E-03 1E+00 9E-06 8E-01 7E-02 1E+00 1E-02 5E-02 1E+00 Rastrigin 10 3E-04 1E-02 1E+00 3E-03 6E-04 9E-05 1E+00 2E-01 2E-03 1E+00 1E-04 1E-03 1E-04 1E+00 2E-04 5E-03 1E+00 4E-03 1E-04 7E-05 9E-01 Rosenbrock 10 1E-02 4E-03 6E-06 1E+00 6E-03 6E-03 8E-01 1E-02 3E-01 4E-06 1E+00 9E-03 2E-02 1E+00 1E-03 8E-01 4E-06 1E+00 4E-01 7E-01 1E+00 Styblinski-Tang 10 7E-06 5E-02 2E-04 1E+00 5E-10 4E-06 2E-03 7E-03 9E-01 8E-03 1E+00 2E-07 8E-05 6E-01 1E-02 1E+00 1E+00 1E+00 9E-03 6E-01 1E+00 Ackley 20 1E-04 6E-04 1E+00 9E-03 1E-08 3E-07 2E-06 1E-02 2E-02 1E+00 9E-02 4E-08 7E-07 2E-06 9E-01 1E+00 1E+00 1E+00 6E-04 2E-02 1E+00 Levy 20 1E-01 1E-01 1E+00 8E-01 6E-02 9E-04 1E-01 3E-01 1E+00 1E+00 1E+00 3E-01 2E-03 8E-01 2E-02 9E-01 1E+00 1E+00 2E-01 3E-03 7E-01 Powell 20 9E-01 4E-01 1E+00 1E+00 6E-02 2E-02 1E+00 5E-01 9E-01 1E+00 1E+00 5E-01 2E-02 1E+00 3E-03 9E-01 1E+00 1E+00 4E-01 1E-02 1E+00 Rastrigin 20 5E-03 7E-02 1E+00 3E-02 1E-02 3E-03 9E-01 6E-01 1E-01 1E+00 1E-01 4E-02 1E-02 9E-01 9E-05 1E-02 1E+00 2E-02 1E-03 5E-03 4E-01 Rosenbrock 20 5E-04 1E-02 1E-05 1E+00 8E-03 9E-03 4E-01 4E-01 9E-01 4E-06 1E+00 2E-02 2E-01 1E+00 3E-01 1E+00 9E-01 1E+00 9E-01 1E+00 1E+00 Styblinski-Tang 20 5E-08 1E-01 9E-05 3E-01 2E-09 2E-06 1E-01 1E+00 1E+00 1E-02 1E+00 4E-06 7E-02 1E+00 4E-07 1E+00 1E-01 1E+00 1E-05 9E-01 1E+00 Ackley 50 6E-01 9E-01 1E+00 2E-01 2E-02 9E-04 8E-04 1E+00 1E+00 1E+00 3E-01 7E-04 3E-03 4E-04 1E+00 1E+00 1E+00 9E-01 5E-04 3E-04 5E-05 Levy 50 2E-02 2E-02 1E+00 8E-03 3E-06 1E-02 1E-07 3E-01 1E-01 1E+00 4E-01 8E-07 4E-02 4E-07 1E+00 1E+00 1E+00 1E+00 5E-01 7E-01 5E-05 Powell 50 4E-02 3E-02 1E+00 5E-02 2E-04 7E-04 8E-06 8E-02 2E-01 1E+00 9E-02 6E-03 1E-03 5E-06 1E+00 1E+00 1E+00 1E+00 1E+00 5E-01 4E-03 Rastrigin 50 1E-02 4E-01 1E+00 2E-02 4E-02 6E-04 2E-01 2E-04 5E-01 1E+00 1E-02 8E-02 3E-04 2E-01 1E+00 1E+00 1E+00 1E-01 7E-01 5E-04 7E-01 Rosenbrock 50 7E-03 9E-01 7E-01 1E+00 2E-01 7E-01 5E-03 1E+00 
2E-01 2E-02 8E-01 4E-02 2E-01 7E-04 1E+00 1E+00 1E+00 1E+00 1E+00 1E+00 3E-01 Styblinski-Tang 50 5E-05 1E+00 2E-03 1E+00 3E-07 5E-01 2E-01 1E+00 1E+00 8E-02 1E+00 4E-05 1E+00 1E+00 8E-01 1E+00 1E-01 1E+00 6E-05 1E+00 1E+00 Ackley 100 1E+00 1E+00 3E-03 2E-01 6E-02 1E-01 1E-02 9E-01 1E+00 9E-05 2E-02 1E-02 1E-01 4E-04 1E+00 9E-01 1E-09 8E-07 2E-04 2E-03 1E-08 Emb. Hartmann 6 100 3E-02 4E-02 7E-05 9E-02 2E-05 3E-05 2E-01 3E-01 1E-01 2E-04 2E-01 2E-04 2E-04 5E-01 6E-01 2E-01 3E-04 6E-01 3E-04 4E-04 9E-01 Levy 100 5E-03 7E-02 6E-06 1E-03 2E-02 2E-01 5E-10 1E-01 6E-01 7E-06 1E-02 4E-02 3E-01 1E-09 1E+00 9E-01 7E-06 2E-02 4E-02 3E-01 2E-09 Powell 100 6E-02 6E-03 2E-04 9E-03 1E-05 2E-05 7E-09 7E-01 7E-02 2E-04 2E-02 1E-05 2E-05 9E-09 1E+00 1E-01 2E-04 3E-02 1E-05 2E-05 5E-08 Rastrigin 100 6E-01 8E-01 2E-01 1E+00 1E-02 4E-03 6E-02 2E-01 1E+00 2E-01 1E+00 1E-03 9E-05 3E-04 2E-01 1E+00 2E-01 1E+00 2E-03 1E-04 2E-04 Rosenbrock 100 8E-01 8E-01 1E-04 2E-01 6E-01 4E-01 4E-03 1E-01 2E-01 1E-04 1E-02 3E-01 1E-01 3E-08 9E-01 9E-01 1E-04 4E-02 4E-01 2E-01 2E-07 Styblinski-Tang 100 3E-04 1E+00 6E-03 1E+00 2E-05 5E-06 1E-01 1E+00 1E+00 4E-02 1E+00 5E-04 5E-04 9E-01 1E-01 1E+00 1E-01 1E+00 4E-03 2E-03 1E+00 Combined 1E-27 3E-04 1E+00 1E+00 6E-69 1E-44 1E-24 9E-02 1E+00 1E+00 1E+00 3E-51 9E-36 3E-16 1E-04 1E+00 1E+00 1E+00 8E-31 6E-16 3E-03 D.2 Results for batch sizes 5 and 10 Table A5: BO on noise-free synthetic test problems. The normalized highest observed value after 10 rounds of BO with q=5 is shown. Colors are normalized row-wise. Higher means better. Results are means over ten replicate runs. Problem d mean BEEBO max BEEBO q-UCB q-EI TS KB GIBBON Tu RBO T =0.05 T =0.5 T =5.0 T =0.05 T =0.5 T =5.0 κ=0.1 κ=1.0 κ=10.0 - - - default - Ackley 2 0.116 0.249 0.240 0.358 0.903 0.052 0.865 0.073 0.865 0.044 0.849 0.049 0.793 0.096 0.919 0.062 0.785 0.084 0.885 0.042 0.555 0.312 0.865 0.063 0.799 0.091 0.550 0.409 Levy 2 0.800 0.198 0.880 0.186 0.995 0.005 0.994 0.008 0.998 0.003 0.995 0.009 0.997 0.004 0.996 0.008 0.992 0.006 0.997 0.004 0.912 0.128 0.998 0.006 0.975 0.018 0.821 0.208 Rastrigin 2 0.513 0.210 0.493 0.289 0.920 0.038 0.875 0.075 0.924 0.044 0.851 0.095 0.842 0.088 0.950 0.029 0.839 0.136 0.871 0.129 0.701 0.163 0.875 0.087 0.808 0.096 0.509 0.275 Rosenbrock 2 0.955 0.134 0.973 0.056 0.976 0.029 0.961 0.097 0.984 0.019 0.829 0.331 0.926 0.224 0.894 0.314 0.892 0.314 0.912 0.228 0.763 0.324 0.889 0.313 0.872 0.308 0.884 0.312 Styblinski-Tang 2 0.594 0.295 0.705 0.316 0.897 0.106 0.669 0.284 0.850 0.161 0.893 0.167 0.837 0.227 0.898 0.168 0.913 0.138 0.948 0.068 0.692 0.298 0.909 0.131 0.866 0.108 0.422 0.313 Shekel 4 0.198 0.046 0.236 0.087 0.194 0.071 0.167 0.052 0.177 0.079 0.112 0.065 0.131 0.065 0.260 0.069 0.103 0.041 0.196 0.072 0.131 0.038 0.160 0.053 0.142 0.085 0.175 0.075 Hartmann 6 0.950 0.040 0.948 0.044 0.880 0.097 0.935 0.046 0.862 0.204 0.681 0.255 0.795 0.118 0.954 0.037 0.486 0.207 0.944 0.040 0.792 0.106 0.950 0.039 0.905 0.075 0.916 0.049 Cosine 8 0.553 0.101 0.701 0.093 0.847 0.058 0.611 0.105 0.702 0.079 0.663 0.051 0.698 0.119 0.824 0.054 0.793 0.089 0.759 0.064 0.722 0.322 0.741 0.085 0.685 0.084 0.624 0.110 Ackley 10 0.053 0.068 0.272 0.295 0.537 0.162 0.143 0.146 0.425 0.228 0.290 0.157 0.191 0.104 0.620 0.061 0.228 0.117 0.597 0.068 0.348 0.451 0.532 0.127 0.530 0.081 0.310 0.129 Levy 10 0.595 0.222 0.570 0.291 0.910 0.093 0.695 0.086 0.802 0.115 0.839 0.081 0.848 0.127 0.847 0.111 0.855 0.119 0.826 0.155 0.558 0.289 0.817 0.097 0.844 0.077 0.640 0.215 Powell 10 0.938 0.041 
0.932 0.045 0.925 0.048 0.936 0.032 0.924 0.028 0.797 0.258 0.844 0.146 0.932 0.040 0.656 0.233 0.940 0.029 0.600 0.323 0.933 0.035 0.795 0.153 0.898 0.076 Rastrigin 10 0.377 0.118 0.331 0.133 0.375 0.101 0.276 0.097 0.292 0.095 0.357 0.094 0.400 0.085 0.413 0.168 0.269 0.142 0.406 0.065 0.231 0.152 0.426 0.032 0.400 0.087 0.365 0.156 Rosenbrock 10 0.967 0.030 0.984 0.018 0.987 0.009 0.974 0.020 0.971 0.015 0.964 0.022 0.965 0.057 0.984 0.014 0.974 0.014 0.976 0.017 0.776 0.224 0.973 0.021 0.944 0.043 0.962 0.041 Styblinski-Tang 10 0.631 0.135 0.662 0.106 0.270 0.170 0.623 0.136 0.552 0.131 0.166 0.176 0.394 0.117 0.594 0.111 0.139 0.153 0.632 0.133 0.261 0.191 0.602 0.192 0.258 0.222 0.400 0.105 Ackley 20 0.019 0.011 0.044 0.062 0.354 0.175 0.059 0.061 0.182 0.173 0.148 0.086 0.276 0.112 0.231 0.208 0.186 0.092 0.199 0.141 0.109 0.313 0.172 0.147 0.303 0.161 0.052 0.024 Levy 20 0.571 0.155 0.608 0.175 0.795 0.082 0.663 0.118 0.681 0.132 0.760 0.127 0.788 0.098 0.663 0.145 0.780 0.068 0.665 0.112 0.373 0.214 0.660 0.148 0.751 0.110 0.567 0.158 Powell 20 0.864 0.062 0.889 0.061 0.902 0.060 0.872 0.075 0.874 0.053 0.854 0.101 0.920 0.048 0.894 0.060 0.812 0.125 0.882 0.076 0.460 0.180 0.894 0.049 0.891 0.071 0.764 0.134 Rastrigin 20 0.237 0.121 0.256 0.103 0.272 0.127 0.217 0.120 0.239 0.119 0.266 0.143 0.311 0.124 0.223 0.111 0.257 0.091 0.225 0.111 0.109 0.083 0.227 0.125 0.308 0.047 0.167 0.095 Rosenbrock 20 0.923 0.012 0.958 0.011 0.964 0.021 0.924 0.026 0.942 0.024 0.932 0.034 0.928 0.023 0.958 0.012 0.942 0.026 0.941 0.036 0.368 0.257 0.960 0.018 0.932 0.030 0.891 0.037 Styblinski-Tang 20 0.576 0.080 0.529 0.125 0.401 0.085 0.529 0.078 0.487 0.079 0.271 0.092 0.275 0.092 0.558 0.083 0.180 0.110 0.497 0.101 0.130 0.088 0.591 0.075 0.371 0.096 0.258 0.077 Ackley 50 0.012 0.011 0.009 0.007 0.024 0.016 0.013 0.006 0.013 0.005 0.023 0.011 0.014 0.012 0.014 0.007 0.070 0.030 0.043 0.038 0.008 0.007 0.017 0.009 0.029 0.014 0.044 0.010 Levy 50 0.464 0.086 0.411 0.145 0.347 0.119 0.393 0.122 0.363 0.143 0.382 0.116 0.241 0.114 0.374 0.161 0.504 0.158 0.419 0.153 0.191 0.078 0.486 0.135 0.450 0.087 0.501 0.102 Powell 50 0.708 0.073 0.732 0.056 0.716 0.077 0.634 0.098 0.694 0.051 0.745 0.076 0.416 0.141 0.711 0.069 0.786 0.079 0.678 0.135 0.292 0.251 0.783 0.055 0.804 0.068 0.697 0.089 Rastrigin 50 0.206 0.090 0.209 0.061 0.132 0.058 0.177 0.060 0.190 0.058 0.170 0.061 0.072 0.039 0.187 0.058 0.277 0.085 0.185 0.071 0.076 0.057 0.162 0.083 0.210 0.083 0.262 0.046 Rosenbrock 50 0.713 0.083 0.751 0.077 0.698 0.117 0.640 0.114 0.705 0.067 0.679 0.094 0.430 0.158 0.768 0.065 0.819 0.076 0.724 0.116 0.257 0.163 0.766 0.057 0.777 0.076 0.684 0.061 Styblinski-Tang 50 0.428 0.060 0.392 0.069 0.384 0.079 0.382 0.048 0.351 0.060 0.234 0.067 0.203 0.074 0.368 0.055 0.287 0.124 0.336 0.057 0.133 0.069 0.414 0.038 0.293 0.054 0.269 0.077 Ackley 100 0.042 0.013 0.035 0.016 0.079 0.021 0.037 0.011 0.060 0.012 0.081 0.021 0.048 0.021 0.051 0.022 0.180 0.041 0.069 0.021 0.005 0.004 0.042 0.011 0.013 0.005 0.028 0.006 Emb. 
Hartmann 6 100 0.554 0.216 0.577 0.198 0.645 0.225 0.561 0.234 0.413 0.197 0.649 0.112 0.547 0.227 0.566 0.212 0.693 0.125 0.678 0.163 0.411 0.161 0.703 0.155 0.485 0.163 0.765 0.163 Levy 100 0.522 0.069 0.545 0.053 0.598 0.053 0.595 0.050 0.589 0.101 0.680 0.082 0.511 0.101 0.599 0.064 0.796 0.041 0.611 0.092 0.115 0.072 0.600 0.051 0.216 0.063 0.398 0.075 Powell 100 0.660 0.071 0.643 0.103 0.810 0.076 0.657 0.063 0.725 0.057 0.860 0.073 0.626 0.186 0.679 0.085 0.938 0.014 0.673 0.089 0.228 0.194 0.725 0.088 0.303 0.150 0.654 0.107 Rastrigin 100 0.259 0.046 0.256 0.058 0.304 0.032 0.280 0.052 0.290 0.052 0.323 0.028 0.263 0.049 0.290 0.056 0.409 0.029 0.304 0.056 0.060 0.041 0.279 0.035 0.119 0.040 0.188 0.039 Rosenbrock 100 0.610 0.085 0.600 0.096 0.714 0.058 0.556 0.127 0.617 0.136 0.748 0.044 0.528 0.142 0.652 0.091 0.885 0.029 0.643 0.077 0.212 0.108 0.712 0.061 0.364 0.090 0.552 0.102 Styblinski-Tang 100 0.354 0.065 0.356 0.061 0.331 0.043 0.332 0.053 0.318 0.070 0.321 0.054 0.267 0.062 0.331 0.049 0.324 0.047 0.303 0.037 0.095 0.048 0.350 0.042 0.148 0.067 0.275 0.077 Mean 0.514 0.537 0.609 0.553 0.578 0.558 0.525 0.612 0.577 0.605 0.354 0.612 0.533 0.500 Median 0.554 0.570 0.698 0.611 0.617 0.679 0.511 0.652 0.693 0.665 0.261 0.703 0.485 0.509 Table A6: BO on noise-free synthetic test problems. The normalized highest observed value after 10 rounds of BO with q=10 is shown. Colors are normalized row-wise. Higher means better. Results are means over ten replicate runs. Problem d mean BEEBO max BEEBO q-UCB q-EI TS KB GIBBON Tu RBO T =0.05 T =0.5 T =5.0 T =0.05 T =0.5 T =5.0 κ=0.1 κ=1.0 κ=10.0 - - - default - Ackley 2 0.154 0.307 0.466 0.334 0.913 0.062 0.861 0.119 0.936 0.045 0.939 0.038 0.875 0.059 0.951 0.021 0.869 0.060 0.940 0.048 0.755 0.293 0.909 0.057 0.775 0.082 0.575 0.411 Levy 2 0.908 0.124 0.998 0.001 0.997 0.003 0.999 0.001 0.999 0.001 0.996 0.004 0.998 0.002 0.997 0.004 0.988 0.015 0.997 0.003 0.919 0.083 0.998 0.002 0.985 0.013 0.844 0.154 Rastrigin 2 0.545 0.307 0.684 0.238 0.953 0.038 0.942 0.053 0.932 0.048 0.911 0.034 0.911 0.043 0.936 0.033 0.839 0.096 0.939 0.034 0.731 0.255 0.913 0.054 0.902 0.070 0.726 0.300 Rosenbrock 2 0.976 0.048 0.979 0.040 0.883 0.313 0.992 0.009 0.961 0.100 0.987 0.014 0.955 0.118 0.977 0.035 0.989 0.014 0.972 0.047 0.863 0.312 0.994 0.008 0.837 0.305 0.829 0.303 Styblinski-Tang 2 0.441 0.263 0.983 0.027 0.941 0.065 0.991 0.011 0.991 0.013 0.964 0.052 0.989 0.012 0.998 0.003 0.984 0.020 0.997 0.003 0.836 0.186 0.983 0.021 0.911 0.063 0.266 0.187 Shekel 4 0.287 0.099 0.257 0.071 0.250 0.103 0.245 0.073 0.265 0.096 0.145 0.037 0.151 0.051 0.266 0.064 0.101 0.028 0.200 0.054 0.132 0.049 0.197 0.103 0.136 0.068 0.300 0.131 Hartmann 6 0.968 0.035 0.968 0.034 0.929 0.062 0.966 0.046 0.949 0.046 0.880 0.081 0.917 0.054 0.964 0.037 0.626 0.184 0.956 0.047 0.855 0.025 0.964 0.037 0.887 0.043 0.968 0.028 Cosine 8 0.728 0.100 0.821 0.106 0.775 0.095 0.747 0.079 0.756 0.088 0.679 0.104 0.763 0.059 0.785 0.083 0.817 0.082 0.753 0.058 1.000 0.000 0.743 0.063 0.626 0.048 0.797 0.083 Ackley 10 0.105 0.075 0.505 0.252 0.718 0.088 0.256 0.176 0.525 0.251 0.276 0.113 0.559 0.078 0.731 0.040 0.226 0.094 0.642 0.045 1.000 0.000 0.675 0.062 0.437 0.084 0.648 0.056 Levy 10 0.833 0.093 0.836 0.099 0.950 0.020 0.828 0.155 0.905 0.039 0.866 0.039 0.897 0.062 0.914 0.065 0.927 0.034 0.928 0.043 0.955 0.052 0.864 0.080 0.722 0.130 0.773 0.141 Powell 10 0.961 0.028 0.949 0.037 0.898 0.042 0.933 0.046 0.906 0.090 0.877 0.068 0.927 0.052 0.928 0.055 0.846 0.118 0.940 
0.043 0.798 0.139 0.960 0.020 0.741 0.217 0.962 0.020 Rastrigin 10 0.397 0.148 0.277 0.123 0.378 0.101 0.274 0.157 0.348 0.162 0.325 0.159 0.389 0.147 0.418 0.100 0.266 0.151 0.386 0.075 0.759 0.396 0.421 0.069 0.318 0.128 0.449 0.112 Rosenbrock 10 0.985 0.011 0.982 0.014 0.978 0.009 0.979 0.013 0.967 0.023 0.957 0.032 0.948 0.042 0.976 0.017 0.991 0.007 0.979 0.012 0.930 0.050 0.984 0.010 0.913 0.104 0.989 0.013 Styblinski-Tang 10 0.689 0.127 0.706 0.133 0.274 0.182 0.675 0.149 0.455 0.134 0.182 0.153 0.408 0.140 0.597 0.154 0.105 0.088 0.624 0.128 0.220 0.163 0.660 0.168 0.170 0.130 0.585 0.143 Ackley 20 0.032 0.029 0.198 0.290 0.600 0.148 0.058 0.038 0.309 0.233 0.229 0.080 0.397 0.085 0.685 0.062 0.337 0.082 0.582 0.037 0.111 0.313 0.576 0.077 0.365 0.067 0.173 0.106 Levy 20 0.714 0.091 0.656 0.146 0.895 0.047 0.720 0.094 0.864 0.061 0.798 0.054 0.892 0.043 0.831 0.055 0.879 0.060 0.774 0.067 0.291 0.185 0.646 0.086 0.870 0.063 0.682 0.103 Powell 20 0.889 0.098 0.919 0.055 0.899 0.061 0.890 0.072 0.911 0.041 0.897 0.038 0.908 0.074 0.924 0.031 0.848 0.082 0.914 0.051 0.355 0.213 0.931 0.036 0.875 0.073 0.839 0.060 Rastrigin 20 0.243 0.101 0.213 0.115 0.352 0.101 0.278 0.078 0.323 0.082 0.326 0.080 0.425 0.093 0.335 0.079 0.300 0.066 0.353 0.074 0.087 0.075 0.278 0.127 0.330 0.095 0.224 0.122 Rosenbrock 20 0.952 0.016 0.973 0.015 0.989 0.005 0.961 0.020 0.975 0.011 0.945 0.025 0.957 0.020 0.979 0.006 0.974 0.011 0.973 0.010 0.434 0.307 0.979 0.006 0.964 0.021 0.914 0.066 Styblinski-Tang 20 0.626 0.115 0.596 0.093 0.381 0.111 0.601 0.089 0.519 0.120 0.262 0.107 0.272 0.065 0.570 0.108 0.186 0.112 0.562 0.094 0.131 0.101 0.611 0.094 0.348 0.072 0.391 0.080 Ackley 50 0.020 0.014 0.011 0.006 0.035 0.031 0.013 0.012 0.017 0.011 0.038 0.006 0.153 0.161 0.032 0.017 0.342 0.128 0.153 0.088 0.012 0.006 0.019 0.005 0.093 0.065 0.065 0.012 Levy 50 0.473 0.087 0.578 0.169 0.463 0.117 0.490 0.066 0.403 0.113 0.541 0.150 0.647 0.180 0.439 0.111 0.760 0.068 0.609 0.137 0.284 0.257 0.506 0.094 0.518 0.131 0.523 0.082 Powell 50 0.758 0.073 0.785 0.059 0.851 0.037 0.706 0.090 0.777 0.084 0.752 0.102 0.800 0.148 0.834 0.061 0.919 0.025 0.790 0.069 0.367 0.104 0.853 0.036 0.843 0.053 0.766 0.054 Rastrigin 50 0.164 0.042 0.173 0.040 0.200 0.043 0.186 0.035 0.172 0.062 0.186 0.088 0.231 0.129 0.206 0.056 0.306 0.096 0.270 0.078 0.085 0.051 0.157 0.090 0.169 0.069 0.300 0.055 Rosenbrock 50 0.822 0.025 0.834 0.063 0.903 0.034 0.743 0.055 0.789 0.085 0.692 0.074 0.892 0.068 0.875 0.049 0.949 0.017 0.882 0.033 0.423 0.119 0.861 0.034 0.893 0.021 0.787 0.090 Styblinski-Tang 50 0.449 0.032 0.454 0.061 0.434 0.067 0.444 0.069 0.427 0.047 0.325 0.067 0.263 0.083 0.455 0.035 0.285 0.060 0.432 0.047 0.167 0.215 0.529 0.039 0.426 0.059 0.287 0.094 Ackley 100 0.050 0.022 0.050 0.014 0.180 0.075 0.039 0.010 0.044 0.020 0.134 0.034 0.096 0.042 0.066 0.018 0.350 0.090 0.145 0.069 0.008 0.004 0.052 0.016 0.016 0.006 0.044 0.011 Emb. 
Hartmann 6 100 0.720 0.133 0.873 0.102 0.868 0.083 0.732 0.166 0.745 0.186 0.735 0.168 0.688 0.251 0.825 0.153 0.845 0.132 0.692 0.258 0.463 0.194 0.845 0.163 0.536 0.231 0.845 0.087 Levy 100 0.532 0.132 0.677 0.066 0.722 0.111 0.633 0.040 0.608 0.045 0.745 0.063 0.595 0.152 0.676 0.066 0.855 0.027 0.664 0.097 0.143 0.051 0.691 0.052 0.284 0.062 0.467 0.048 Powell 100 0.699 0.102 0.772 0.063 0.898 0.050 0.701 0.114 0.730 0.081 0.880 0.049 0.717 0.144 0.833 0.051 0.959 0.012 0.805 0.047 0.319 0.137 0.835 0.020 0.443 0.146 0.741 0.060 Rastrigin 100 0.270 0.043 0.357 0.062 0.351 0.032 0.306 0.029 0.320 0.065 0.352 0.035 0.271 0.075 0.336 0.037 0.412 0.031 0.352 0.059 0.054 0.038 0.320 0.028 0.111 0.032 0.271 0.041 Rosenbrock 100 0.701 0.113 0.746 0.058 0.883 0.041 0.679 0.072 0.708 0.084 0.830 0.074 0.668 0.093 0.789 0.060 0.927 0.025 0.742 0.052 0.218 0.152 0.797 0.028 0.383 0.128 0.614 0.046 Styblinski-Tang 100 0.402 0.041 0.364 0.021 0.355 0.047 0.361 0.052 0.347 0.067 0.315 0.037 0.309 0.031 0.361 0.047 0.330 0.054 0.374 0.046 0.106 0.045 0.416 0.045 0.163 0.060 0.305 0.063 Mean 0.560 0.625 0.670 0.613 0.633 0.605 0.632 0.681 0.647 0.676 0.449 0.672 0.545 0.574 Median 0.626 0.684 0.851 0.701 0.730 0.735 0.688 0.789 0.839 0.742 0.355 0.743 0.518 0.614 Table A7: BO on noise-free synthetic test problems. The relative batch instantaneous regret of the last, exploitative batch with q=5 is shown. Colors are normalized row-wise. Lower means better. Results are means over ten replicate runs. Problem d mean BEEBO max BEEBO q-UCB q-EI TS KB GIBBON Tu RBO T =0.05 T =0.5 T =5.0 T =0.05 T =0.5 T =5.0 κ=0.1 κ=1.0 κ=10.0 - - - default - Ackley 2 1.003 0.296 0.916 0.306 0.166 0.097 0.372 0.233 0.336 0.149 0.229 0.091 0.867 0.083 0.836 0.101 0.791 0.130 0.838 0.207 0.631 0.269 0.963 0.119 0.933 0.080 0.622 0.346 Levy 2 0.288 0.220 0.238 0.371 0.037 0.039 0.034 0.051 0.021 0.021 0.027 0.040 1.217 0.686 0.856 0.832 1.212 1.060 0.626 0.535 1.300 2.236 0.255 0.446 1.409 0.605 0.219 0.232 Rastrigin 2 0.951 0.343 0.819 0.520 0.396 0.325 0.579 0.320 0.480 0.401 0.487 0.220 1.027 0.344 0.913 0.151 0.988 0.188 0.768 0.268 0.806 0.400 0.958 0.329 1.293 0.372 0.612 0.319 Rosenbrock 2 0.009 0.009 0.018 0.032 0.006 0.007 0.117 0.347 0.003 0.006 0.023 0.027 6.307 14.676 2.837 3.999 3.306 4.617 1.373 2.340 0.162 0.163 0.061 0.119 8.714 13.216 0.005 0.013 Styblinski-Tang 2 0.446 0.490 0.478 0.463 0.354 0.456 0.288 0.265 0.202 0.125 0.306 0.389 0.974 0.852 0.994 0.561 0.954 0.721 1.003 0.785 0.987 0.567 0.779 0.702 1.047 0.362 0.298 0.195 Shekel 4 0.892 0.082 0.817 0.064 0.816 0.077 0.902 0.055 0.879 0.077 0.894 0.069 0.970 0.020 0.938 0.018 0.964 0.019 0.949 0.063 0.952 0.051 0.958 0.079 0.984 0.037 0.848 0.050 Hartmann 6 0.157 0.179 0.089 0.088 0.110 0.093 0.106 0.118 0.135 0.154 0.297 0.264 0.749 0.128 0.781 0.104 0.794 0.161 0.634 0.222 0.467 0.228 0.568 0.489 0.747 0.277 0.175 0.108 Cosine 8 0.425 0.207 0.240 0.079 0.117 0.045 0.301 0.111 0.233 0.066 0.283 0.040 0.719 0.203 0.728 0.212 0.742 0.205 0.454 0.261 0.562 0.139 0.464 0.199 0.994 0.781 0.399 0.120 Ackley 10 0.942 0.078 0.732 0.296 0.456 0.162 0.879 0.092 0.586 0.220 0.695 0.159 0.917 0.052 0.711 0.091 0.893 0.072 0.784 0.180 0.899 0.113 0.672 0.180 0.824 0.113 0.760 0.108 Levy 10 0.420 0.226 0.325 0.185 0.053 0.045 0.273 0.132 0.135 0.060 0.117 0.083 0.715 0.489 0.985 0.500 0.911 0.554 0.434 0.521 0.610 0.269 0.337 0.279 0.720 0.741 0.408 0.249 Powell 10 0.008 0.004 0.009 0.006 0.014 0.013 0.012 0.010 0.022 0.012 0.045 0.047 0.583 0.519 0.748 0.479 0.757 
0.848 0.080 0.069 0.535 0.308 0.135 0.258 0.288 0.327 0.026 0.019 Rastrigin 10 0.644 0.111 0.666 0.171 0.583 0.102 0.708 0.092 0.658 0.141 0.648 0.049 0.838 0.081 0.864 0.110 0.881 0.097 0.754 0.105 0.941 0.118 0.755 0.237 0.792 0.131 0.729 0.095 Rosenbrock 10 0.009 0.006 0.004 0.003 0.003 0.002 0.009 0.010 0.014 0.014 0.023 0.021 0.525 0.217 0.614 0.505 0.439 0.279 0.058 0.053 0.451 0.217 0.052 0.047 0.211 0.118 0.043 0.030 Styblinski-Tang 10 0.319 0.220 0.230 0.099 0.522 0.121 0.266 0.117 0.304 0.107 0.601 0.178 0.866 0.242 0.879 0.268 1.208 0.312 1.015 0.783 0.953 0.171 1.043 0.977 0.724 0.222 0.475 0.190 Ackley 20 0.967 0.033 0.946 0.082 0.634 0.183 0.930 0.079 0.807 0.185 0.834 0.093 0.881 0.095 0.906 0.078 0.914 0.046 0.820 0.142 0.981 0.069 0.854 0.125 0.856 0.115 0.965 0.016 Levy 20 0.475 0.205 0.377 0.194 0.159 0.092 0.313 0.220 0.245 0.067 0.200 0.111 0.824 0.475 0.858 0.201 0.669 0.316 0.341 0.103 0.895 0.112 0.315 0.139 0.345 0.131 0.588 0.139 Powell 20 0.053 0.029 0.043 0.021 0.039 0.029 0.054 0.039 0.053 0.031 0.075 0.059 0.758 0.526 1.041 0.665 0.728 0.337 0.058 0.025 1.196 0.463 0.052 0.028 0.111 0.028 0.294 0.235 Rastrigin 20 0.781 0.187 0.738 0.201 0.825 0.122 0.793 0.163 0.726 0.092 0.722 0.158 0.880 0.161 0.970 0.097 0.878 0.156 0.898 0.140 1.044 0.074 0.827 0.142 0.774 0.063 0.872 0.084 Rosenbrock 20 0.037 0.019 0.024 0.015 0.016 0.009 0.036 0.017 0.031 0.020 0.039 0.021 0.521 0.320 0.776 0.334 0.402 0.350 0.043 0.028 0.984 0.206 0.027 0.014 0.446 1.091 0.147 0.089 Styblinski-Tang 20 0.694 0.669 0.574 0.508 0.424 0.093 0.356 0.096 0.380 0.055 0.532 0.090 0.866 0.173 0.897 0.178 0.932 0.234 0.481 0.071 0.990 0.102 1.028 1.281 0.619 0.177 0.741 0.128 Ackley 50 0.989 0.013 0.991 0.006 0.974 0.018 0.989 0.009 0.989 0.004 0.979 0.014 0.998 0.004 0.998 0.004 0.968 0.024 0.967 0.042 1.002 0.004 0.984 0.007 0.972 0.013 0.966 0.009 Levy 50 1.403 1.462 1.022 1.053 0.558 0.117 0.583 0.083 0.637 0.216 0.549 0.100 1.220 0.283 1.253 0.393 0.852 0.184 0.567 0.154 1.072 0.144 0.499 0.129 0.603 0.114 0.559 0.095 Powell 50 0.151 0.026 0.141 0.031 0.150 0.038 0.194 0.055 0.173 0.035 0.151 0.039 1.274 0.495 1.198 0.589 0.798 0.283 0.317 0.317 1.052 0.184 0.126 0.036 0.140 0.047 0.303 0.089 Rastrigin 50 0.913 0.306 0.846 0.139 0.871 0.066 0.823 0.089 0.794 0.076 0.817 0.051 1.039 0.042 0.960 0.040 0.919 0.065 0.900 0.085 1.008 0.042 0.882 0.077 0.839 0.081 0.777 0.061 Rosenbrock 50 0.223 0.085 0.193 0.077 0.236 0.110 0.284 0.102 0.227 0.063 0.257 0.069 1.167 0.286 0.977 0.376 0.697 0.180 0.295 0.197 1.074 0.140 0.198 0.053 0.234 0.075 0.395 0.076 Styblinski-Tang 50 0.499 0.058 0.617 0.278 0.537 0.045 0.540 0.046 0.570 0.062 0.677 0.069 1.178 0.251 1.321 0.360 0.917 0.178 0.664 0.179 0.996 0.064 0.542 0.069 0.726 0.133 0.745 0.061 Ackley 100 0.959 0.016 0.965 0.016 0.920 0.022 0.962 0.013 0.938 0.012 0.915 0.021 0.989 0.007 0.989 0.009 0.957 0.016 0.938 0.022 1.001 0.004 0.961 0.009 0.994 0.005 0.972 0.004 Emb. 
Hartmann 6 100 0.430 0.214 0.371 0.159 0.348 0.242 0.389 0.209 0.532 0.173 0.362 0.128 0.887 0.071 0.860 0.081 0.851 0.044 0.474 0.231 0.909 0.086 0.300 0.162 0.960 0.084 0.390 0.135 Levy 100 0.465 0.100 0.397 0.052 0.363 0.051 0.345 0.040 0.354 0.087 0.301 0.077 1.039 0.208 0.952 0.070 0.853 0.024 0.362 0.096 1.024 0.063 0.362 0.045 0.882 0.094 0.588 0.049 Powell 100 0.231 0.064 0.233 0.066 0.139 0.049 0.224 0.046 0.181 0.036 0.099 0.048 0.886 0.129 1.024 0.252 0.760 0.107 0.516 0.903 0.982 0.128 0.195 0.071 0.769 0.070 0.306 0.053 Rastrigin 100 0.709 0.048 0.706 0.072 0.686 0.040 0.710 0.095 0.679 0.058 0.671 0.037 0.939 0.030 0.922 0.030 0.924 0.028 0.719 0.068 1.006 0.037 0.706 0.057 0.943 0.030 0.790 0.021 Rosenbrock 100 0.323 0.064 0.328 0.052 0.248 0.048 0.366 0.084 0.316 0.082 0.227 0.051 0.968 0.152 0.935 0.113 0.830 0.078 0.378 0.172 1.027 0.062 0.255 0.055 0.818 0.091 0.443 0.061 Styblinski-Tang 100 0.577 0.055 0.581 0.060 0.607 0.042 0.612 0.054 0.625 0.043 0.615 0.045 0.987 0.114 0.941 0.093 0.916 0.051 0.649 0.058 1.000 0.050 0.592 0.050 0.912 0.091 0.704 0.060 Mean 0.527 0.475 0.375 0.435 0.402 0.415 1.075 0.987 0.927 0.611 0.894 0.536 0.989 0.520 Median 0.465 0.397 0.354 0.356 0.336 0.306 0.917 0.939 0.881 0.634 0.984 0.542 0.818 0.559 Table A8: BO on noise-free synthetic test problems. The relative batch instantaneous regret of the last, exploitative batch with q=10 is shown. Colors are normalized row-wise. Lower means better. Results are means over ten replicate runs. Problem d mean BEEBO max BEEBO q-UCB q-EI TS KB GIBBON Tu RBO T =0.05 T =0.5 T =5.0 T =0.05 T =0.5 T =5.0 κ=0.1 κ=1.0 κ=10.0 - - - default - Ackley 2 0.935 0.250 0.582 0.306 0.213 0.092 0.333 0.167 0.221 0.113 0.181 0.081 0.907 0.036 0.887 0.061 0.885 0.053 0.749 0.093 0.622 0.333 0.846 0.240 0.958 0.029 0.530 0.402 Levy 2 0.157 0.099 0.060 0.075 0.063 0.052 0.033 0.027 0.087 0.039 0.077 0.067 1.166 0.735 1.137 1.309 1.049 0.691 0.638 0.350 0.407 0.453 0.073 0.113 1.996 1.164 0.138 0.148 Rastrigin 2 0.631 0.269 0.676 0.294 0.404 0.209 0.566 0.189 0.482 0.399 0.662 0.296 0.994 0.255 0.927 0.213 1.016 0.234 0.608 0.109 0.831 0.427 0.708 0.121 1.217 0.214 0.292 0.299 Rosenbrock 2 0.002 0.002 0.002 0.004 0.005 0.004 0.003 0.003 0.002 0.002 0.007 0.012 1.054 0.835 1.335 1.611 2.055 4.821 0.471 0.644 0.116 0.228 0.009 0.011 2.095 3.809 0.001 0.001 Styblinski-Tang 2 0.233 0.080 0.124 0.071 0.371 0.396 0.190 0.181 0.290 0.353 0.234 0.259 1.129 0.298 1.104 0.304 0.861 0.295 0.930 0.436 0.359 0.312 0.294 0.400 1.273 0.469 0.261 0.105 Shekel 4 0.825 0.067 0.827 0.039 0.783 0.095 0.862 0.050 0.827 0.078 0.871 0.039 0.983 0.014 0.965 0.015 0.978 0.012 0.936 0.031 0.968 0.034 1.007 0.013 0.990 0.012 0.720 0.138 Hartmann 6 0.189 0.110 0.118 0.070 0.106 0.096 0.059 0.080 0.061 0.064 0.126 0.098 0.861 0.097 0.853 0.079 0.775 0.092 0.834 0.169 0.347 0.157 0.748 0.293 0.876 0.085 0.039 0.027 Cosine 8 0.217 0.084 0.159 0.100 0.167 0.073 0.213 0.066 0.188 0.064 0.270 0.092 0.831 0.118 0.917 0.166 0.797 0.215 1.055 0.570 0.570 0.194 0.793 0.604 1.003 0.159 0.200 0.067 Ackley 10 0.866 0.093 0.548 0.223 0.277 0.090 0.734 0.162 0.531 0.226 0.707 0.123 0.833 0.061 0.842 0.063 0.924 0.044 0.891 0.102 0.859 0.026 0.965 0.095 0.972 0.018 0.445 0.082 Levy 10 0.196 0.167 0.143 0.123 0.024 0.010 0.123 0.098 0.086 0.055 0.118 0.064 0.936 0.297 0.845 0.284 0.866 0.284 0.504 0.386 0.591 0.168 0.305 0.230 0.747 0.286 0.139 0.060 Powell 10 0.012 0.016 0.017 0.015 0.018 0.017 0.027 0.026 0.032 0.017 0.056 0.033 0.885 0.507 0.948 0.604 0.985 
0.461 0.213 0.204 0.682 0.268 0.322 0.709 0.442 0.327 0.009 0.007 Rastrigin 10 0.773 0.187 0.775 0.116 0.615 0.122 0.714 0.177 0.593 0.117 0.598 0.169 0.869 0.097 0.909 0.084 0.856 0.128 0.719 0.042 0.828 0.090 0.738 0.149 0.791 0.067 0.423 0.073 Rosenbrock 10 0.008 0.008 0.006 0.005 0.004 0.002 0.011 0.008 0.013 0.007 0.034 0.017 0.550 0.287 0.684 0.259 0.744 0.178 0.155 0.115 0.303 0.136 0.087 0.072 0.633 0.537 0.005 0.006 Styblinski-Tang 10 0.276 0.155 0.389 0.329 0.523 0.135 0.257 0.086 0.419 0.077 0.545 0.070 1.036 0.164 1.079 0.255 1.074 0.198 1.060 0.498 0.907 0.125 1.064 0.761 0.926 0.186 0.256 0.084 Ackley 20 0.957 0.040 0.858 0.204 0.392 0.146 0.923 0.048 0.674 0.233 0.757 0.071 0.850 0.045 0.822 0.088 0.899 0.044 0.778 0.141 0.984 0.034 0.492 0.078 0.978 0.012 0.871 0.062 Levy 20 0.232 0.066 0.685 0.872 0.065 0.033 0.262 0.209 0.099 0.039 0.135 0.034 0.683 0.291 0.970 0.274 0.794 0.390 0.348 0.134 0.950 0.119 0.425 0.183 0.431 0.394 0.327 0.102 Powell 20 0.078 0.144 0.025 0.017 0.029 0.022 0.038 0.027 0.030 0.012 0.036 0.011 0.568 0.241 0.748 0.472 0.603 0.397 0.082 0.039 0.937 0.209 0.035 0.014 0.126 0.054 0.098 0.057 Rastrigin 20 0.739 0.102 0.730 0.076 0.707 0.074 0.732 0.101 0.660 0.102 0.660 0.099 0.843 0.119 0.865 0.055 0.824 0.059 0.712 0.035 0.999 0.053 0.747 0.093 0.724 0.061 0.783 0.081 Rosenbrock 20 0.019 0.007 0.013 0.008 0.004 0.002 0.018 0.009 0.011 0.005 0.031 0.030 0.461 0.335 0.784 0.288 0.410 0.200 0.036 0.008 0.788 0.119 0.018 0.007 0.095 0.039 0.074 0.045 Styblinski-Tang 20 0.268 0.074 0.288 0.069 0.445 0.058 0.311 0.082 0.410 0.117 0.679 0.152 1.104 0.212 1.104 0.216 1.080 0.296 0.628 0.224 0.985 0.078 0.852 1.063 0.859 0.657 0.523 0.131 Ackley 50 0.981 0.018 0.990 0.005 0.960 0.034 0.988 0.015 0.987 0.009 0.961 0.011 0.960 0.046 0.994 0.007 0.862 0.058 0.893 0.085 1.001 0.005 0.982 0.005 0.934 0.064 0.949 0.010 Levy 50 0.464 0.115 0.965 1.096 0.445 0.107 0.482 0.119 0.489 0.120 0.417 0.127 1.023 0.305 1.373 0.357 0.611 0.174 0.516 0.309 1.032 0.194 1.034 1.514 0.554 0.194 0.487 0.055 Powell 50 0.127 0.030 0.109 0.018 0.076 0.032 0.147 0.031 0.113 0.037 0.177 0.076 1.177 0.598 1.302 0.455 0.522 0.332 0.269 0.418 0.972 0.192 0.082 0.031 0.398 0.913 0.218 0.062 Rastrigin 50 0.889 0.124 0.858 0.050 0.764 0.111 0.804 0.065 0.789 0.045 0.776 0.085 0.943 0.093 0.940 0.037 0.829 0.063 0.826 0.133 0.997 0.046 1.000 0.323 0.832 0.041 0.696 0.035 Rosenbrock 50 0.131 0.030 0.123 0.054 0.075 0.035 0.197 0.043 0.159 0.073 0.249 0.103 0.784 0.390 1.066 0.234 0.417 0.180 0.107 0.023 1.039 0.137 0.117 0.042 0.135 0.056 0.285 0.085 Styblinski-Tang 50 0.450 0.039 0.500 0.148 0.534 0.238 0.467 0.089 0.487 0.068 0.593 0.053 1.050 0.210 1.313 0.213 0.797 0.117 0.520 0.073 0.990 0.055 0.633 0.631 0.804 0.821 0.697 0.068 Ackley 100 0.950 0.023 0.948 0.016 0.815 0.075 0.961 0.010 0.954 0.019 0.861 0.033 0.968 0.022 0.992 0.010 0.914 0.056 0.905 0.057 0.999 0.002 0.949 0.015 0.992 0.002 0.959 0.007 Emb. 
Hartmann 6 100 0.265 0.179 0.118 0.092 0.107 0.085 0.220 0.151 0.199 0.155 0.208 0.125 0.784 0.109 0.815 0.127 0.756 0.133 0.543 0.163 0.870 0.048 0.157 0.175 0.941 0.065 0.301 0.120 Levy 100 0.402 0.108 0.291 0.069 0.242 0.083 0.328 0.058 0.351 0.077 0.247 0.053 1.017 0.196 1.154 0.164 0.976 0.179 0.442 0.386 0.999 0.040 0.318 0.120 0.821 0.054 0.515 0.037 Powell 100 0.189 0.063 0.141 0.031 0.066 0.024 0.190 0.067 0.166 0.047 0.083 0.027 0.865 0.237 1.232 0.274 0.861 0.315 0.141 0.029 0.916 0.092 0.112 0.020 0.727 0.063 0.226 0.039 Rastrigin 100 0.726 0.068 0.702 0.077 0.628 0.042 0.709 0.084 0.666 0.085 0.636 0.046 0.946 0.057 0.946 0.051 0.806 0.090 0.788 0.119 1.002 0.027 0.730 0.058 0.927 0.024 0.715 0.037 Rosenbrock 100 0.227 0.076 0.200 0.059 0.093 0.040 0.255 0.080 0.229 0.080 0.153 0.080 0.880 0.167 1.060 0.106 0.823 0.223 0.225 0.047 0.988 0.070 0.164 0.016 0.766 0.057 0.366 0.047 Styblinski-Tang 100 0.527 0.052 0.568 0.036 0.598 0.054 0.566 0.044 0.581 0.036 0.625 0.029 1.051 0.145 1.267 0.286 0.965 0.017 0.667 0.271 0.987 0.039 0.522 0.020 0.898 0.033 0.677 0.041 Mean 0.422 0.410 0.322 0.386 0.360 0.387 0.909 1.005 0.867 0.581 0.813 0.525 0.844 0.401 Median 0.268 0.291 0.242 0.262 0.290 0.249 0.936 0.965 0.861 0.628 0.937 0.522 0.859 0.327 D.3 Control problems Figure A2: Experiments on the 14D robot arm pushing and 60D rover trajectory planning control problems. 10 replicates each. GIBBON (s) refers to the scaled larged-batch variant of GIBBON. D.4 Run time Figure A3: Example run times for the 10-round BO experiment on the 6D Hartmann problem with Q=100. Error bars are over 5 replicate runs. Run times vary depending on the test problem, with GIBBON appearing especially sensitive, becoming e.g. 10x slower on the 50D Ackley problem. Table A9: Total run times for five replicates of the experiments presented in Table A1. We sum over all test problems. Method Configuration Total time [h] mean BEEBO T = 0.05 66.12 mean BEEBO T = 0.5 47.08 mean BEEBO T = 5.0 37.13 max BEEBO T = 0.05 54.85 max BEEBO T = 0.5 44.20 max BEEBO T = 5.0 47.63 q-UCB κ = 0.1 3.33 q-UCB κ = 1.0 3.70 q-UCB κ = 10.0 4.41 q-EI - 24.33 TS - 6.56 KB - 223.78 GIBBON default 3380.48 GIBBON scaled 1055.93 D.5 Results with random initialization in round 0 Table A10: BO with random initialization on noise-free synthetic test problems. The normalized highest observed value after 10 rounds of BO with q=100 is shown. Colors are normalized row-wise. Higher means better. Results are means over five replicate runs. 
Problem d mean BEEBO max BEEBO q-UCB q-EI TS KB Tu RBO T =0.05 T =0.5 T =5.0 T =0.05 T =0.5 T =5.0 κ=0.1 κ=1.0 κ=10.0 - - - - Ackley 2 0.994 0.011 0.971 0.041 0.988 0.006 0.985 0.012 0.977 0.029 0.912 0.069 0.793 0.255 0.904 0.059 0.968 0.019 0.957 0.056 1.000 0.000 0.928 0.076 0.972 0.027 Levy 2 0.995 0.008 0.997 0.002 0.995 0.005 0.997 0.002 0.998 0.002 0.966 0.044 0.983 0.023 0.985 0.010 0.954 0.047 0.994 0.004 0.993 0.004 0.997 0.003 0.992 0.017 Rastrigin 2 0.838 0.190 0.674 0.433 0.836 0.138 0.784 0.350 0.756 0.356 0.847 0.199 0.540 0.359 0.675 0.250 0.296 0.255 0.873 0.173 0.800 0.447 0.691 0.299 0.902 0.220 Rosenbrock 2 0.893 0.080 0.900 0.103 0.605 0.395 0.525 0.496 0.705 0.413 0.242 0.306 0.678 0.394 0.475 0.452 0.469 0.485 0.728 0.304 0.888 0.249 0.753 0.423 0.875 0.266 Styblinski-Tang 2 1.000 0.001 1.000 0.000 1.000 0.000 0.998 0.002 0.999 0.002 0.999 0.002 0.998 0.001 0.999 0.002 0.993 0.008 0.999 0.001 0.998 0.002 1.000 0.000 1.000 0.000 Shekel 4 0.839 0.311 0.829 0.360 0.706 0.326 0.390 0.353 0.476 0.274 0.388 0.316 0.183 0.068 0.376 0.308 0.266 0.073 0.530 0.350 0.087 0.060 0.178 0.071 0.824 0.247 Hartmann 6 1.000 0.000 1.000 0.000 0.984 0.016 0.955 0.063 0.989 0.024 0.992 0.010 0.960 0.052 0.998 0.002 0.919 0.057 0.991 0.020 0.715 0.150 0.998 0.001 0.965 0.045 Cosine 8 1.000 0.000 0.997 0.002 0.377 0.144 0.999 0.000 0.968 0.023 0.871 0.057 0.906 0.079 0.903 0.050 0.390 0.242 0.787 0.118 1.000 0.000 0.987 0.009 0.912 0.066 Ackley 10 0.935 0.026 0.904 0.039 0.820 0.037 0.816 0.037 0.742 0.052 0.507 0.137 0.790 0.015 0.776 0.038 0.561 0.170 0.782 0.036 1.000 0.000 0.789 0.023 0.784 0.026 Levy 10 0.981 0.013 0.957 0.028 0.930 0.034 0.956 0.026 0.926 0.024 0.919 0.022 0.866 0.036 0.839 0.068 0.813 0.126 0.920 0.064 0.942 0.026 0.949 0.041 0.941 0.072 Powell 10 0.971 0.029 0.953 0.030 0.672 0.387 0.923 0.109 0.900 0.103 0.702 0.398 0.886 0.061 0.822 0.161 0.219 0.302 0.851 0.191 0.833 0.169 0.939 0.058 0.985 0.008 Rastrigin 10 0.465 0.144 0.495 0.180 0.570 0.091 0.526 0.087 0.516 0.140 0.642 0.083 0.394 0.143 0.564 0.152 0.193 0.172 0.441 0.065 1.000 0.000 0.394 0.088 0.672 0.165 Rosenbrock 10 0.992 0.006 0.990 0.002 0.865 0.068 0.990 0.003 0.986 0.007 0.966 0.037 0.965 0.043 0.952 0.019 0.220 0.373 0.975 0.017 0.820 0.029 0.992 0.003 0.993 0.004 Styblinski-Tang 10 0.815 0.051 0.814 0.062 0.245 0.050 0.784 0.025 0.643 0.124 0.217 0.157 0.165 0.115 0.584 0.177 0.028 0.062 0.619 0.093 0.399 0.163 0.827 0.066 0.654 0.085 Robot Pushing 14 0.350 0.121 0.377 0.124 0.560 0.172 0.425 0.107 0.310 0.140 0.395 0.131 0.424 0.154 0.522 0.170 0.379 0.093 0.417 0.160 0.247 0.118 0.694 0.255 0.518 0.149 Ackley 20 0.843 0.016 0.857 0.028 0.789 0.048 0.819 0.030 0.788 0.040 0.390 0.106 0.706 0.063 0.775 0.039 0.460 0.052 0.740 0.054 1.000 0.000 0.763 0.048 0.438 0.103 Levy 20 0.936 0.035 0.939 0.019 0.896 0.044 0.953 0.027 0.901 0.046 0.911 0.038 0.928 0.016 0.929 0.029 0.768 0.048 0.921 0.024 0.979 0.003 0.956 0.013 0.925 0.041 Powell 20 0.947 0.036 0.966 0.013 0.840 0.111 0.936 0.019 0.880 0.100 0.908 0.076 0.946 0.007 0.928 0.050 0.819 0.122 0.926 0.047 0.964 0.014 0.969 0.016 0.957 0.036 Rastrigin 20 0.373 0.042 0.462 0.049 0.518 0.054 0.514 0.059 0.480 0.077 0.491 0.087 0.450 0.115 0.463 0.069 0.383 0.094 0.451 0.053 1.000 0.000 0.481 0.068 0.523 0.064 Rosenbrock 20 0.992 0.004 0.993 0.004 0.920 0.041 0.991 0.005 0.982 0.013 0.923 0.054 0.967 0.018 0.984 0.005 0.915 0.044 0.984 0.007 0.939 0.020 0.994 0.003 0.993 0.002 Styblinski-Tang 20 0.706 0.061 0.669 0.113 0.305 0.198 0.607 0.086 0.417 
0.122 0.279 0.077 0.204 0.130 0.524 0.184 0.054 0.074 0.639 0.082 0.271 0.262 0.665 0.129 0.604 0.088 Ackley 50 0.221 0.293 0.146 0.116 0.842 0.010 0.622 0.243 0.705 0.042 0.457 0.098 0.627 0.053 0.739 0.034 0.736 0.020 0.727 0.051 1.000 0.000 0.683 0.122 0.175 0.022 Levy 50 0.976 0.010 0.978 0.012 0.943 0.021 0.977 0.012 0.955 0.018 0.867 0.013 0.952 0.016 0.966 0.025 0.933 0.007 0.943 0.017 0.987 0.002 0.926 0.041 0.793 0.054 Powell 50 0.940 0.036 0.978 0.010 0.959 0.025 0.976 0.009 0.970 0.014 0.929 0.024 0.965 0.014 0.958 0.016 0.978 0.007 0.964 0.013 0.985 0.004 0.957 0.022 0.920 0.039 Rastrigin 50 0.273 0.156 0.505 0.031 0.453 0.040 0.473 0.042 0.466 0.016 0.445 0.027 0.466 0.073 0.418 0.023 0.503 0.047 0.468 0.012 1.000 0.000 0.423 0.085 0.459 0.070 Rosenbrock 50 0.976 0.012 0.985 0.003 0.988 0.005 0.978 0.011 0.981 0.005 0.968 0.016 0.974 0.004 0.987 0.003 0.984 0.013 0.981 0.003 0.979 0.003 0.987 0.003 0.968 0.018 Styblinski-Tang 50 0.605 0.067 0.693 0.038 0.417 0.163 0.536 0.072 0.415 0.055 0.332 0.037 0.341 0.082 0.716 0.031 0.371 0.040 0.690 0.069 0.254 0.220 0.720 0.041 0.499 0.073 Rover trajectory 60 0.448 0.209 0.678 0.029 0.708 0.060 0.533 0.066 0.657 0.104 0.629 0.040 0.626 0.076 0.613 0.070 0.665 0.079 0.635 0.074 0.265 0.069 0.616 0.043 0.764 0.074 Ackley 100 0.310 0.408 0.347 0.452 0.864 0.023 0.526 0.335 0.536 0.293 0.707 0.086 0.696 0.015 0.721 0.080 0.850 0.007 0.747 0.071 0.007 0.005 0.289 0.081 0.110 0.014 Emb. Hartmann 6 100 0.980 0.009 0.988 0.008 0.916 0.101 0.982 0.016 0.933 0.057 0.914 0.051 0.941 0.035 0.915 0.038 0.913 0.116 0.922 0.110 0.554 0.315 0.949 0.065 0.931 0.084 Levy 100 0.890 0.150 0.966 0.024 0.943 0.019 0.962 0.012 0.942 0.017 0.946 0.013 0.952 0.005 0.937 0.029 0.964 0.014 0.943 0.030 0.310 0.382 0.908 0.056 0.692 0.013 Powell 100 0.786 0.051 0.929 0.099 0.985 0.004 0.985 0.002 0.981 0.009 0.981 0.003 0.983 0.004 0.978 0.013 0.983 0.008 0.963 0.018 0.288 0.165 0.967 0.021 0.860 0.027 Rastrigin 100 0.522 0.027 0.367 0.194 0.467 0.027 0.481 0.057 0.479 0.007 0.469 0.037 0.442 0.021 0.442 0.047 0.432 0.020 0.493 0.014 0.238 0.426 0.674 0.126 0.394 0.022 Rosenbrock 100 0.810 0.029 0.972 0.012 0.976 0.008 0.928 0.119 0.978 0.006 0.975 0.008 0.977 0.008 0.968 0.012 0.985 0.008 0.974 0.003 0.270 0.406 0.943 0.058 0.857 0.010 Styblinski-Tang 100 0.564 0.034 0.470 0.152 0.396 0.079 0.432 0.043 0.331 0.061 0.309 0.022 0.321 0.027 0.542 0.130 0.280 0.017 0.591 0.023 0.198 0.273 0.593 0.063 0.412 0.034 Mean 0.776 0.793 0.751 0.779 0.762 0.697 0.714 0.768 0.618 0.788 0.692 0.788 0.750 Median 0.890 0.929 0.840 0.923 0.880 0.847 0.793 0.822 0.665 0.851 0.888 0.908 0.857 Table A11: BO with random initialization on noise-free synthetic test problems. The relative batch instantaneous regret of the last, exploitative batch is shown. Colors are normalized row-wise. Lower means better. Results are means over five replicate runs. 
Problem d mean BEEBO max BEEBO q-UCB q-EI TS KB Tu RBO T =0.05 T =0.5 T =5.0 T =0.05 T =0.5 T =5.0 κ=0.1 κ=1.0 κ=10.0 - - - - Ackley 2 0.268 0.132 0.189 0.049 0.334 0.082 0.259 0.187 0.221 0.145 0.299 0.151 1.011 0.027 0.993 0.030 0.994 0.018 0.806 0.209 0.624 0.361 0.749 0.266 0.145 0.101 Levy 2 0.153 0.024 0.130 0.055 0.066 0.059 0.111 0.010 0.091 0.034 0.109 0.009 1.260 0.454 1.195 0.283 1.256 0.263 1.219 0.204 0.280 0.401 0.088 0.100 0.000 0.000 Rastrigin 2 0.427 0.019 0.600 0.381 0.523 0.228 0.306 0.210 0.543 0.073 0.491 0.053 1.009 0.061 0.991 0.104 1.047 0.058 0.808 0.060 0.728 0.196 0.851 0.103 0.031 0.063 Rosenbrock 2 0.001 0.000 0.001 0.000 0.002 0.001 0.003 0.002 0.002 0.000 0.003 0.003 0.895 0.134 0.898 0.131 0.917 0.264 1.101 0.303 0.002 0.001 0.003 0.004 0.000 0.000 Styblinski-Tang 2 0.173 0.007 0.170 0.009 0.170 0.008 0.169 0.007 0.171 0.008 0.170 0.008 1.118 0.087 1.046 0.080 1.047 0.097 0.751 0.154 0.471 0.591 0.169 0.320 0.000 0.000 Shekel 4 0.790 0.049 0.635 0.094 0.707 0.047 0.757 0.097 0.644 0.229 0.727 0.096 0.992 0.006 0.989 0.006 0.992 0.004 0.959 0.041 0.945 0.033 1.001 0.011 0.387 0.223 Hartmann 6 0.052 0.017 0.087 0.030 0.096 0.012 0.189 0.119 0.085 0.029 0.065 0.029 0.959 0.075 0.971 0.017 0.851 0.087 0.863 0.067 0.356 0.006 0.288 0.171 0.028 0.031 Cosine 8 0.060 0.119 0.004 0.006 0.304 0.037 0.000 0.000 0.015 0.010 0.062 0.033 0.987 0.097 0.971 0.071 0.966 0.068 1.111 0.214 0.436 0.018 1.217 0.099 0.080 0.053 Ackley 10 0.447 0.082 0.329 0.074 0.250 0.035 0.324 0.037 0.321 0.047 0.485 0.075 0.936 0.025 0.930 0.021 0.949 0.020 0.937 0.038 0.983 0.015 0.903 0.260 0.299 0.004 Levy 10 0.079 0.068 0.025 0.018 0.296 0.067 0.037 0.031 0.048 0.032 0.093 0.078 1.324 0.110 0.979 0.112 1.088 0.175 0.737 0.322 0.595 0.100 0.552 0.407 0.024 0.011 Powell 10 0.019 0.028 0.009 0.001 0.051 0.004 0.026 0.012 0.052 0.008 0.156 0.056 1.045 0.237 0.926 0.175 1.248 0.273 0.262 0.173 0.144 0.029 0.046 0.038 0.003 0.003 Rastrigin 10 0.625 0.076 0.550 0.082 0.599 0.129 0.533 0.133 0.524 0.119 0.420 0.151 0.911 0.017 0.930 0.018 0.921 0.016 0.961 0.070 0.763 0.108 0.926 0.100 0.355 0.163 Rosenbrock 10 0.003 0.002 0.004 0.002 0.083 0.008 0.016 0.011 0.010 0.005 0.065 0.014 0.895 0.170 0.803 0.115 0.962 0.127 0.044 0.014 0.085 0.011 0.004 0.001 0.001 0.000 Styblinski-Tang 10 0.200 0.023 0.225 0.042 0.571 0.053 0.228 0.028 0.333 0.018 0.487 0.068 1.236 0.149 1.216 0.053 1.184 0.027 1.173 0.326 0.815 0.044 0.676 0.113 0.170 0.049 Robot Pushing 14 0.800 0.177 0.797 0.081 0.879 0.082 0.970 0.057 0.972 0.057 0.986 0.058 0.970 0.026 0.795 0.123 0.984 0.031 0.892 0.103 0.949 0.043 0.673 0.077 0.506 0.070 Ackley 20 0.668 0.127 0.261 0.133 0.211 0.050 0.219 0.030 0.309 0.078 0.607 0.090 0.924 0.019 0.931 0.016 0.910 0.007 0.959 0.085 0.980 0.002 0.912 0.221 0.636 0.111 Levy 20 0.078 0.032 0.078 0.078 0.117 0.092 0.186 0.157 0.114 0.060 0.207 0.052 0.924 0.108 0.859 0.093 1.151 0.105 0.473 0.078 0.743 0.042 0.219 0.086 0.093 0.045 Powell 20 0.097 0.068 0.006 0.002 0.035 0.023 0.082 0.020 0.077 0.008 0.118 0.030 0.757 0.147 0.690 0.195 0.842 0.134 0.086 0.040 0.446 0.142 0.020 0.007 0.011 0.006 Rastrigin 20 0.713 0.083 0.614 0.063 0.506 0.090 0.618 0.049 0.644 0.077 0.562 0.006 0.860 0.048 0.833 0.015 0.850 0.022 0.923 0.228 0.864 0.018 0.725 0.020 0.476 0.102 Rosenbrock 20 0.038 0.039 0.004 0.002 0.029 0.019 0.117 0.064 0.055 0.054 0.050 0.016 0.645 0.113 0.587 0.109 0.978 0.124 0.065 0.022 0.394 0.125 0.008 0.004 0.005 0.001 Styblinski-Tang 20 0.405 0.163 0.357 0.111 0.730 0.069 0.396 0.047 0.529 
0.030 0.560 0.038 1.161 0.102 1.093 0.077 1.167 0.034 1.193 0.399 0.903 0.037 0.723 0.090 0.257 0.059 Ackley 50 0.897 0.052 0.859 0.128 0.159 0.010 0.402 0.248 0.465 0.246 0.538 0.098 0.932 0.036 0.957 0.024 0.849 0.024 0.863 0.032 0.986 0.001 0.360 0.115 0.868 0.009 Levy 50 0.039 0.032 0.033 0.041 0.043 0.012 0.020 0.007 0.048 0.015 0.239 0.080 0.667 0.072 0.582 0.135 0.883 0.224 0.099 0.019 0.873 0.009 0.109 0.026 0.227 0.053 Powell 50 0.022 0.010 0.016 0.006 0.015 0.008 0.025 0.026 0.036 0.035 0.076 0.031 0.470 0.158 0.533 0.048 0.575 0.194 0.046 0.024 0.880 0.034 0.017 0.006 0.041 0.015 Rastrigin 50 0.753 0.083 0.601 0.031 0.891 0.420 0.587 0.062 0.585 0.055 0.579 0.093 0.798 0.052 0.820 0.021 0.792 0.028 0.644 0.031 0.932 0.003 0.769 0.203 0.558 0.042 Rosenbrock 50 0.016 0.007 0.010 0.003 0.007 0.003 0.014 0.006 0.035 0.024 0.051 0.021 0.656 0.165 0.540 0.085 0.698 0.174 0.030 0.006 0.794 0.039 0.012 0.003 0.057 0.033 Styblinski-Tang 50 0.460 0.223 1.063 1.782 0.709 0.130 0.449 0.113 0.574 0.097 0.721 0.032 1.032 0.109 1.134 0.117 0.995 0.045 0.613 0.127 0.960 0.012 0.436 0.334 0.483 0.059 Rover trajectory 60 0.475 0.127 0.403 0.115 0.511 0.187 0.380 0.083 0.485 0.110 0.679 0.139 0.684 0.150 0.450 0.145 0.473 0.115 0.561 0.060 0.923 0.029 0.479 0.160 0.186 0.042 Ackley 100 0.684 0.402 0.815 0.360 0.136 0.022 0.554 0.291 0.476 0.283 0.295 0.082 0.952 0.027 0.908 0.057 0.883 0.039 0.672 0.129 0.997 0.001 0.799 0.159 0.904 0.010 Emb. Hartmann 6 100 0.089 0.057 0.039 0.028 0.179 0.142 0.140 0.093 0.100 0.051 0.194 0.128 0.636 0.095 0.843 0.026 0.717 0.197 0.562 0.324 0.882 0.019 0.118 0.125 0.068 0.036 Levy 100 0.089 0.105 0.043 0.034 0.044 0.014 0.039 0.012 0.173 0.085 0.061 0.031 0.629 0.051 0.735 0.134 0.624 0.127 0.131 0.111 0.980 0.021 0.086 0.037 0.300 0.020 Powell 100 0.117 0.020 0.038 0.029 0.008 0.002 0.010 0.003 0.034 0.040 0.011 0.004 0.482 0.066 0.549 0.089 0.477 0.067 0.031 0.012 1.019 0.069 0.021 0.009 0.112 0.017 Rastrigin 100 0.503 0.074 0.584 0.192 0.548 0.029 0.474 0.068 0.550 0.075 0.554 0.061 0.759 0.021 0.825 0.041 0.774 0.031 0.628 0.049 0.990 0.007 0.833 0.365 0.584 0.009 Rosenbrock 100 0.119 0.018 0.026 0.010 0.015 0.004 0.059 0.071 0.061 0.049 0.036 0.019 0.415 0.078 0.600 0.100 0.502 0.068 0.199 0.102 0.982 0.040 0.043 0.037 0.141 0.012 Styblinski-Tang 100 0.372 0.034 0.448 0.158 0.505 0.074 0.547 0.103 0.635 0.145 0.774 0.097 0.947 0.064 1.211 0.140 0.910 0.048 0.429 0.077 0.988 0.009 0.349 0.051 0.527 0.035 Mean 0.307 0.287 0.295 0.264 0.286 0.329 0.882 0.866 0.899 0.624 0.734 0.434 0.245 Median 0.173 0.170 0.179 0.189 0.173 0.239 0.924 0.908 0.917 0.672 0.873 0.360 0.145 D.6 BO curves for all experiments in Table 2 and Table A1 Figure A4: Experiments on the Shekel, Hartmann, Cosine and embedded Hartmann test functions with κ = 0.1 for BEEBO and q-UCB. Figure A5: Experiments on the Shekel, Hartmann, Cosine and embedded Hartmann test functions with κ = 1.0 for BEEBO and q-UCB. Figure A6: Experiments on the Shekel, Hartmann, Cosine and embedded Hartmann test functions with κ = 10.0 for BEEBO and q-UCB. Figure A7: Experiments on the Ackley test function with κ = 0.1 for BEEBO and q UCB. Figure A8: Experiments on the Ackley test function with κ = 1.0 for BEEBO and q UCB. Figure A9: Experiments on the Ackley test function with κ = 10.0 for BEEBO and q UCB. Figure A10: Experiments on the Levy test function with κ = 0.1 for BEEBO and q UCB. Figure A11: Experiments on the Levy test function with κ = 1.0 for BEEBO and q UCB. 
Figure A12: Experiments on the Levy test function with κ = 10.0 for BEEBO and q-UCB.
Figure A13: Experiments on the Rastrigin test function with κ = 0.1 for BEEBO and q-UCB.
Figure A14: Experiments on the Rastrigin test function with κ = 1.0 for BEEBO and q-UCB.
Figure A15: Experiments on the Rastrigin test function with κ = 10.0 for BEEBO and q-UCB.
Figure A16: Experiments on the Rosenbrock test function with κ = 0.1 for BEEBO and q-UCB.
Figure A17: Experiments on the Rosenbrock test function with κ = 1.0 for BEEBO and q-UCB.
Figure A18: Experiments on the Rosenbrock test function with κ = 10.0 for BEEBO and q-UCB.
Figure A19: Experiments on the Powell test function with κ = 0.1 for BEEBO and q-UCB.
Figure A20: Experiments on the Powell test function with κ = 1.0 for BEEBO and q-UCB.
Figure A21: Experiments on the Powell test function with κ = 10.0 for BEEBO and q-UCB.
NeurIPS Paper Checklist
1. Claims
Question: Do the main claims made in the abstract and introduction accurately reflect the paper's contributions and scope?
Answer: [Yes]
Justification: As claimed, we experimentally demonstrate a) the controllability of the acquisition strategy, b) competitive performance on 33 test problems compared to q-UCB, q-EI, Thompson sampling, GIBBON, TuRBO and Kriging Believer, and c) behaviour under heteroskedastic noise.
Guidelines: The answer NA means that the abstract and introduction do not include the claims made in the paper. The abstract and/or introduction should clearly state the claims made, including the contributions made in the paper and important assumptions and limitations. A No or NA answer to this question will not be perceived well by the reviewers. The claims made should match theoretical and experimental results, and reflect how much the results can be expected to generalize to other settings. It is fine to include aspirational goals as motivation as long as it is clear that these goals are not attained by the paper.
2. Limitations
Question: Does the paper discuss the limitations of the work performed by the authors?
Answer: [Yes]
Justification: We address limitations in our discussion section, highlighting computational complexity constraints in exact GP inference as well as challenges under heteroskedastic noise.
Guidelines: The answer NA means that the paper has no limitation while the answer No means that the paper has limitations, but those are not discussed in the paper. The authors are encouraged to create a separate "Limitations" section in their paper. The paper should point out any strong assumptions and how robust the results are to violations of these assumptions (e.g., independence assumptions, noiseless settings, model well-specification, asymptotic approximations only holding locally). The authors should reflect on how these assumptions might be violated in practice and what the implications would be. The authors should reflect on the scope of the claims made, e.g., if the approach was only tested on a few datasets or with a few runs. In general, empirical results often depend on implicit assumptions, which should be articulated. The authors should reflect on the factors that influence the performance of the approach. For example, a facial recognition algorithm may perform poorly when image resolution is low or images are taken in low lighting. Or a speech-to-text system might not be used reliably to provide closed captions for online lectures because it fails to handle technical jargon.
The authors should discuss the computational efficiency of the proposed algorithms and how they scale with dataset size. If applicable, the authors should discuss possible limitations of their approach to address problems of privacy and fairness. While the authors might fear that complete honesty about limitations might be used by reviewers as grounds for rejection, a worse outcome might be that reviewers discover limitations that aren't acknowledged in the paper. The authors should use their best judgment and recognize that individual actions in favor of transparency play an important role in developing norms that preserve the integrity of the community. Reviewers will be specifically instructed to not penalize honesty concerning limitations.
3. Theory Assumptions and Proofs
Question: For each theoretical result, does the paper provide the full set of assumptions and a complete (and correct) proof?
Answer: [NA]
Justification: The paper does not make use of any theoretical results. All reported results are based on empirical experiments. All underlying assumptions are standard in research on BO with GPs.
Guidelines: The answer NA means that the paper does not include theoretical results. All the theorems, formulas, and proofs in the paper should be numbered and cross-referenced. All assumptions should be clearly stated or referenced in the statement of any theorems. The proofs can either appear in the main paper or the supplemental material, but if they appear in the supplemental material, the authors are encouraged to provide a short proof sketch to provide intuition. Inversely, any informal proof provided in the core of the paper should be complemented by formal proofs provided in appendix or supplemental material. Theorems and Lemmas that the proof relies upon should be properly referenced.
4. Experimental Result Reproducibility
Question: Does the paper fully disclose all the information needed to reproduce the main experimental results of the paper to the extent that it affects the main claims and/or conclusions of the paper (regardless of whether the code and data are provided or not)?
Answer: [Yes]
Justification: As described in the methods section, we use standard BoTorch and GPyTorch utilities for all our experiments, and provide extended details on the technical implementation in the supplementary section. Our repository includes the full benchmarking setup with appropriate run scripts and instructions.
Guidelines: The answer NA means that the paper does not include experiments. If the paper includes experiments, a No answer to this question will not be perceived well by the reviewers: Making the paper reproducible is important, regardless of whether the code and data are provided or not. If the contribution is a dataset and/or model, the authors should describe the steps taken to make their results reproducible or verifiable. Depending on the contribution, reproducibility can be accomplished in various ways. For example, if the contribution is a novel architecture, describing the architecture fully might suffice, or if the contribution is a specific model and empirical evaluation, it may be necessary to either make it possible for others to replicate the model with the same dataset, or provide access to the model.
In general, releasing code and data is often one good way to accomplish this, but reproducibility can also be provided via detailed instructions for how to replicate the results, access to a hosted model (e.g., in the case of a large language model), releasing of a model checkpoint, or other means that are appropriate to the research performed. While NeurIPS does not require releasing code, the conference does require all submissions to provide some reasonable avenue for reproducibility, which may depend on the nature of the contribution. For example: (a) If the contribution is primarily a new algorithm, the paper should make it clear how to reproduce that algorithm. (b) If the contribution is primarily a new model architecture, the paper should describe the architecture clearly and fully. (c) If the contribution is a new model (e.g., a large language model), then there should either be a way to access this model for reproducing the results or a way to reproduce the model (e.g., with an open-source dataset or instructions for how to construct the dataset). (d) We recognize that reproducibility may be tricky in some cases, in which case authors are welcome to describe the particular way they provide for reproducibility. In the case of closed-source models, it may be that access to the model is limited in some way (e.g., to registered users), but it should be possible for other researchers to have some path to reproducing or verifying the results.
5. Open access to data and code
Question: Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material?
Answer: [Yes]
Justification: The repository includes the implementation of the proposed method as well as the benchmarking setup with alternative methods. No additional data is required for reproduction.
Guidelines: The answer NA means that the paper does not include experiments requiring code. Please see the NeurIPS code and data submission guidelines (https://nips.cc/public/guides/CodeSubmissionPolicy) for more details. While we encourage the release of code and data, we understand that this might not be possible, so No is an acceptable answer. Papers cannot be rejected simply for not including code, unless this is central to the contribution (e.g., for a new open-source benchmark). The instructions should contain the exact command and environment needed to run to reproduce the results. See the NeurIPS code and data submission guidelines (https://nips.cc/public/guides/CodeSubmissionPolicy) for more details. The authors should provide instructions on data access and preparation, including how to access the raw data, preprocessed data, intermediate data, and generated data, etc. The authors should provide scripts to reproduce all experimental results for the new proposed method and baselines. If only a subset of experiments are reproducible, they should state which ones are omitted from the script and why. At submission time, to preserve anonymity, the authors should release anonymized versions (if applicable). Providing as much information as possible in supplemental material (appended to the paper) is recommended, but including URLs to data and code is permitted.
6. Experimental Setting/Details
Question: Does the paper specify all the training and test details (e.g., data splits, hyperparameters, how they were chosen, type of optimizer, etc.) necessary to understand the results?
Answer: [Yes]
Justification: We follow GPyTorch and BoTorch for all hyperparameters pertaining to GPs, and describe this accordingly. Our appendix includes additional details on method hyperparameters to ensure reproducibility.
Guidelines: The answer NA means that the paper does not include experiments. The experimental setting should be presented in the core of the paper to a level of detail that is necessary to appreciate the results and make sense of them. The full details can be provided either with the code, in appendix, or as supplemental material.
7. Experiment Statistical Significance
Question: Does the paper report error bars suitably and correctly defined or other appropriate information about the statistical significance of the experiments?
Answer: [Yes]
Justification: We include full BO curves with standard deviations over five replicates for all quantitative experiments in the appendix. These detailed curves are referenced in the main text at the appropriate place.
Guidelines: The answer NA means that the paper does not include experiments. The authors should answer "Yes" if the results are accompanied by error bars, confidence intervals, or statistical significance tests, at least for the experiments that support the main claims of the paper. The factors of variability that the error bars are capturing should be clearly stated (for example, train/test split, initialization, random drawing of some parameter, or overall run with given experimental conditions). The method for calculating the error bars should be explained (closed form formula, call to a library function, bootstrap, etc.). The assumptions made should be given (e.g., Normally distributed errors). It should be clear whether the error bar is the standard deviation or the standard error of the mean. It is OK to report 1-sigma error bars, but one should state it. The authors should preferably report a 2-sigma error bar rather than state that they have a 96% CI, if the hypothesis of Normality of errors is not verified. For asymmetric distributions, the authors should be careful not to show in tables or figures symmetric error bars that would yield results that are out of range (e.g. negative error rates). If error bars are reported in tables or plots, the authors should explain in the text how they were calculated and reference the corresponding figures or tables in the text.
8. Experiments Compute Resources
Question: For each experiment, does the paper provide sufficient information on the computer resources (type of compute workers, memory, time of execution) needed to reproduce the experiments?
Answer: [Yes]
Justification: We list the used hardware and total GPU hours in the supplement and provide example timings for experiment runtimes.
Guidelines: The answer NA means that the paper does not include experiments. The paper should indicate the type of compute workers (CPU or GPU, internal cluster, or cloud provider), including relevant memory and storage. The paper should provide the amount of compute required for each of the individual experimental runs as well as estimate the total compute. The paper should disclose whether the full research project required more compute than the experiments reported in the paper (e.g., preliminary or failed experiments that didn't make it into the paper).
9. Code Of Ethics
Question: Does the research conducted in the paper conform, in every respect, with the NeurIPS Code of Ethics (https://neurips.cc/public/EthicsGuidelines)?
Answer: [Yes]

Justification: The paper does not make use of human participants or datasets. To the best of our understanding, no potentially harmful consequences or wider negative societal impacts are expected from the proposed method.

Guidelines:
- The answer NA means that the authors have not reviewed the NeurIPS Code of Ethics.
- If the authors answer No, they should explain the special circumstances that require a deviation from the Code of Ethics.
- The authors should make sure to preserve anonymity (e.g., if there is a special consideration due to laws or regulations in their jurisdiction).

10. Broader Impacts

Question: Does the paper discuss both potential positive societal impacts and negative societal impacts of the work performed?

Answer: [NA]

Justification: The paper introduces a method for Bayesian optimization (BO). While BO has widespread applications in the sciences and engineering, no direct societal impact is expected from this contribution.

Guidelines:
- The answer NA means that there is no societal impact of the work performed.
- If the authors answer NA or No, they should explain why their work has no societal impact or why the paper does not address societal impact.
- Examples of negative societal impacts include potential malicious or unintended uses (e.g., disinformation, generating fake profiles, surveillance), fairness considerations (e.g., deployment of technologies that could make decisions that unfairly impact specific groups), privacy considerations, and security considerations.
- The conference expects that many papers will be foundational research and not tied to particular applications, let alone deployments. However, if there is a direct path to any negative applications, the authors should point it out. For example, it is legitimate to point out that an improvement in the quality of generative models could be used to generate deepfakes for disinformation. On the other hand, it is not needed to point out that a generic algorithm for optimizing neural networks could enable people to train models that generate deepfakes faster.
- The authors should consider possible harms that could arise when the technology is being used as intended and functioning correctly, harms that could arise when the technology is being used as intended but gives incorrect results, and harms following from (intentional or unintentional) misuse of the technology.
- If there are negative societal impacts, the authors could also discuss possible mitigation strategies (e.g., gated release of models, providing defenses in addition to attacks, mechanisms for monitoring misuse, mechanisms to monitor how a system learns from feedback over time, improving the efficiency and accessibility of ML).

11. Safeguards

Question: Does the paper describe safeguards that have been put in place for responsible release of data or models that have a high risk for misuse (e.g., pretrained language models, image generators, or scraped datasets)?

Answer: [NA]

Justification: The paper does not introduce any trained models or novel data.

Guidelines:
- The answer NA means that the paper poses no such risks.
- Released models that have a high risk for misuse or dual-use should be released with necessary safeguards to allow for controlled use of the model, for example by requiring that users adhere to usage guidelines or restrictions to access the model or implementing safety filters.
- Datasets that have been scraped from the Internet could pose safety risks. The authors should describe how they avoided releasing unsafe images.
- We recognize that providing effective safeguards is challenging, and many papers do not require this, but we encourage authors to take this into account and make a best faith effort.

12. Licenses for existing assets

Question: Are the creators or original owners of assets (e.g., code, data, models), used in the paper, properly credited and are the license and terms of use explicitly mentioned and properly respected?

Answer: [Yes]

Justification: We credit the GPyTorch and BoTorch packages that our codebase builds upon. The packages are used as dependencies and, as such, are not included directly as assets.

Guidelines:
- The answer NA means that the paper does not use existing assets.
- The authors should cite the original paper that produced the code package or dataset.
- The authors should state which version of the asset is used and, if possible, include a URL.
- The name of the license (e.g., CC-BY 4.0) should be included for each asset.
- For scraped data from a particular source (e.g., website), the copyright and terms of service of that source should be provided.
- If assets are released, the license, copyright information, and terms of use in the package should be provided. For popular datasets, paperswithcode.com/datasets has curated licenses for some datasets. Their licensing guide can help determine the license of a dataset.
- For existing datasets that are re-packaged, both the original license and the license of the derived asset (if it has changed) should be provided.
- If this information is not available online, the authors are encouraged to reach out to the asset's creators.

13. New Assets

Question: Are new assets introduced in the paper well documented and is the documentation provided alongside the assets?

Answer: [Yes]

Justification: The implementation of BEEBO constitutes the only new asset. It follows the BoTorch acquisition-function API and ships with a README file demonstrating its application within BoTorch (an illustrative usage sketch is given after this checklist).

Guidelines:
- The answer NA means that the paper does not release new assets.
- Researchers should communicate the details of the dataset/code/model as part of their submissions via structured templates. This includes details about training, license, limitations, etc.
- The paper should discuss whether and how consent was obtained from people whose asset is used.
- At submission time, remember to anonymize your assets (if applicable). You can either create an anonymized URL or include an anonymized zip file.

14. Crowdsourcing and Research with Human Subjects

Question: For crowdsourcing experiments and research with human subjects, does the paper include the full text of instructions given to participants and screenshots, if applicable, as well as details about compensation (if any)?

Answer: [NA]

Justification: None of the above are included in this paper.

Guidelines:
- The answer NA means that the paper does not involve crowdsourcing nor research with human subjects.
- Including this information in the supplemental material is fine, but if the main contribution of the paper involves human subjects, then as much detail as possible should be included in the main paper.
- According to the NeurIPS Code of Ethics, workers involved in data collection, curation, or other labor should be paid at least the minimum wage in the country of the data collector.
15. Institutional Review Board (IRB) Approvals or Equivalent for Research with Human Subjects

Question: Does the paper describe potential risks incurred by study participants, whether such risks were disclosed to the subjects, and whether Institutional Review Board (IRB) approvals (or an equivalent approval/review based on the requirements of your country or institution) were obtained?

Answer: [NA]

Justification: The presented paper does not involve any human subjects.

Guidelines:
- The answer NA means that the paper does not involve crowdsourcing nor research with human subjects.
- Depending on the country in which research is conducted, IRB approval (or equivalent) may be required for any human subjects research. If you obtained IRB approval, you should clearly state this in the paper.
- We recognize that the procedures for this may vary significantly between institutions and locations, and we expect authors to adhere to the NeurIPS Code of Ethics and the guidelines for their institution.
- For initial submissions, do not include any information that would break anonymity (if applicable), such as the institution conducting the review.
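For readers who want a concrete picture of the workflow referred to in items 6 and 13, the following is a minimal sketch of how a BoTorch-compatible acquisition function is typically fitted and optimized in batch mode, with the GP surrogate left at GPyTorch/BoTorch defaults. The BEEBO import path and constructor shown in the comments are assumptions about the released code and are not verified here; the runnable stand-in uses BoTorch's qUpperConfidenceBound so the snippet executes as written.

```python
import torch
from botorch.models import SingleTaskGP
from botorch.fit import fit_gpytorch_mll
from botorch.acquisition import qUpperConfidenceBound
from botorch.optim import optimize_acqf
from gpytorch.mlls import ExactMarginalLogLikelihood

# Toy initial design on the unit square; GP hyperparameters are left at the
# GPyTorch/BoTorch defaults and fitted by maximizing the marginal likelihood.
train_X = torch.rand(20, 2, dtype=torch.double)
train_Y = train_X.sin().sum(dim=-1, keepdim=True)

gp = SingleTaskGP(train_X, train_Y)
fit_gpytorch_mll(ExactMarginalLogLikelihood(gp.likelihood, gp))

# Hypothetical BEEBO construction (names assumed, see the released README):
# from beebo import BatchedEnergyEntropyBO
# acq = BatchedEnergyEntropyBO(gp, temperature=0.5)
# Runnable stand-in so the sketch executes end to end:
acq = qUpperConfidenceBound(gp, beta=0.1)

bounds = torch.stack([torch.zeros(2), torch.ones(2)]).to(train_X)
candidates, value = optimize_acqf(
    acq_function=acq,
    bounds=bounds,
    q=10,            # batch size Q, acquired jointly
    num_restarts=10,
    raw_samples=128,
)
print(candidates.shape)  # torch.Size([10, 2])
```

Because optimize_acqf operates on any BoTorch AcquisitionFunction, swapping the stand-in for the released BEEBO implementation should only require changing the acquisition-function construction.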