# estimation_of_spectral_risk_measures__35d49dca.pdf

Estimation of Spectral Risk Measures

Ajay Kumar Pandey, 1 Prashanth L. A., 1 Sanjay P. Bhat 2

1 Department of Computer Science and Engineering, Indian Institute of Technology Madras, Chennai, India 2 TCS Research, Hyderabad, India, and Department of Electrical Engineering, Indian Institute of Technology Madras, Chennai, India ajaykp@cse.iitm.ac.in, prashla@cse.iitm.ac.in, sanjay.bhat@tcs.com

We consider the problem of estimating a spectral risk measure (SRM) from i.i.d. samples, and propose a novel method that is based on numerical integration. We show that our SRM estimate concentrates exponentially when the underlying distribution has bounded support. Further, we also consider the case when the underlying distribution satisﬁes an exponential moment bound, which includes sub-Gaussian and subexponential distributions. For these distributions, we derive a concentration bound for our estimation scheme. We validate the theoretical ﬁndings on a synthetic setup, and in a vehicular trafﬁc routing application.

1 Introduction

Conditional value-at-risk (CVa R) is a popular measure in the context of risk-sensitive optimization. CVa R is the expectation of a random variable (r.v.) (usually the loss in a ﬁnancial application) conditioned on the r.v. exceeding the value-atrisk (Va R). The latter is a high-probability upper bound on the random loss that could be incurred. The advantage of employing CVa R instead of Va R in a risk-sensitive optimization setting is that CVa R is a coherent risk measure (Artzner et al. 1999), while Va R is not. Spectral risk measures (SRM), proposed in (Acerbi 2002), generalize Va R and CVa R. SRM uses a risk-aversion function to assign a weight to each loss. Note that Va R gives a zero weight for each loss beyond a given quantile, while CVa R assigns a constant weight in the tail region beyond Va R. In contrast, SRMs can assign larger weights to higher losses, and thus model a user s risk attitude better. Moreover, SRMs satisfy coherency provided the risk-aversion function is positive, increasing and integrates to one. In this paper, we consider the problem of estimating the SRM of a r.v., given independent and identically distributed (i.i.d.) samples from the underlying distribution. In this context, our contributions are as follows: First, we provide a natural estimation scheme for SRM that uses the empirical distribution function (EDF) to estimate quantiles, together with a trapezoidal rule-based approximation to the integral in the deﬁnition of the SRM. Second, we provide concentration bounds for our proposed SRM estimate for the following two

Copyright 2021, Association for the Advancement of Artiﬁcial Intelligence (www.aaai.org). All rights reserved.

cases: ﬁrst, when the underlying distribution is assumed to have bounded support; and second, when the distribution satisﬁes an exponential moment bound. The latter class includes sub-Gaussian and sub-exponential distributions (Wainwright 2019). Third, we perform simulation experiments to show the efﬁcacy of our proposed SRM estimation scheme. In particular, we consider a synthetic setup, and show that our scheme provides accurate estimates of SRM. Next, we incorporate our SRM estimation scheme in the inner loop of the successive rejects (SR) algorithm (Audibert, Bubeck, and Munos 2010), which is a popular algorithm in the best arm identiﬁcation framework for multi-armed bandits. We test the resulting SR algorithm variant in a vehicular trafﬁc routing application using SUMO trafﬁc simulator (Behrisch et al. 2011). The application is motivated by the fact that, in practice, human road users may not always prefer the route with lowest mean delay. Instead, a route that minimizes worst-case delay, while doing reasonably well on the average, is preferable, and such a preference can be encoded into the risk aversion function employed in SRMs. Related work. In (Brown 2007; Wang and Gao 2010) concentration bounds for the classic CVa R estimator are derived assuming that the underlying distribution has bounded support. Our bound, using a different estimator, exhibits a similar rate of exponential convergence around true CVa R. For the case of distributions with unbounded support, concentration bounds for empirical CVa R have been derived recently in (Kolla et al. 2019; Kagrecha, Nair, and Jagannathan 2019; Bhat and Prashanth 2019; Prashanth, Jagannathan, and Kolla 2020). In (Kolla et al. 2019) (resp. (Bhat and Prashanth 2019; Prashanth, Jagannathan, and Kolla 2020)), the authors derive a one-sided concentration bound (resp. two-sided bounds), when the underlying distributions are either sub-Gaussian or sub-exponential. In comparison to (Kolla et al. 2019), we derive two-sided concentration results, and our bounds show an improved dependence on the number of samples, say n, and accuracy, say ϵ. In (Khim et al. 2020), the authors employ tools from statistical learning theory to derive a concentration result for a class of risk measures that includes CVa R under the assumption that the underlying loss distribution has bounded support. The closest related work is (Bhat and Prashanth 2019), where the authors employ a Wasserstein distance approach to provide concentration bounds for SRM estimation. In com-

The Thirty-Fifth AAAI Conference on Artificial Intelligence (AAAI-21)

parison to (Bhat and Prashanth 2019), our bounds have a similar dependence on n and ϵ for the sub-Gaussian case, and a much improved dependence on ϵ for the sub-exponential case. Further, unlike (Bhat and Prashanth 2019), we specify all the constants in our bounds. From an application viewpoint, CVa R-based models have been explored in different contexts, for instance, in a bandit application (Galichet, Sebag, and Teytaud 2013), in a portfolio optimization problem (Krokhmal, Palmquist, and Uryasev 2002), and in a general risk management setting (Mulvey and Erkan 2006). In the simulation experiments, we consider a bandit application in the context of vehicular trafﬁc routing, with SRM as the performance objective, and show the efﬁcacy of our proposed estimation scheme.

2 Preliminaries For a r.v. X, Va R Vβ(X) and CVa R Cβ(X) at the level β, β (0, 1), are deﬁned as follows:

Vβ(X) := inf{c : P(X c) β}, (1)

Cβ(X) := Vβ(X) + 1 1 β E[X Vβ(X)]+, (2)

where [x]+ = max(0, x) for a real number x. Vβ(X) can be interpreted as the smallest loss that will not be exceeded with probability β. Note that, if X has a continuous and strictly increasing cumulative distribution function (CDF) F, then Vβ(X) = F 1(β). Further, Cβ(X) can be interpreted as the expected loss, conditional on the event that the loss exceeds Vβ(X), i.e., Cβ(X) = E[X|X Vβ(X)]. Acerbi s formula (Acerbi and Tasche 2002), an alternative form for Cα(X) deﬁned in (2), is as follows:

Cα(X) = 1 1 α

α Vβ(X) dβ. (3)

Let Xi, i = 1, . . . , n denote i.i.d. samples from the distribution of X. Then, the estimate of Vβ(X), denoted by b Vn,β, is formed as follows (Serﬂing 2009):

b Vn,β = b F 1 n (β) = inf{x : b Fn(x) β}, (4)

where b Fn(x) = 1

n Pn i=1 I[Xi x] is the EDF of X. Here I { } denotes the indicator function, i.e., for an event A, I {A} = 1 if A happens, and I {A} = 0 otherwise. Letting X(1), . . . , X(n) denote the order statistics, i.e., X(1) X(2) X(n), we have b Vn,β = X( nβ ).

3 Spectral Risk Measure (SRM) Let L1 ([0, 1]) denote the space of equivalence classes of Lebesgue-integrable functions on [0, 1], where two functions are considered equivalent if they agree everywhere except possibly on a set of zero Lebesgue measure. For any f L1 ([0, 1]), we deﬁne the following norm: f = R 1 0 |f(p)|dp. The deﬁnition of SRM requires a risk-aversion function ϕ L1 ([0, 1]), which is said to be admissible if it satisﬁes the following properties (see Deﬁnition 2.4 in (Acerbi 2002)): (i) ϕ is positive; (ii) ϕ is increasing; and (iii) ϕ = 1.

The SRM of a r.v. X is deﬁned by

0 ϕ(β)Vβ(X) dβ, (5)

where Vβ is as deﬁned in (1), and ϕ L1 ([0, 1]) is admissible. In Theorem 4.1 of (Acerbi 2002), the author establishes that admissibility of ϕ is both necessary and sufﬁcient for ensuring coherency of S(X). Note that by virtue of working with the Lebesgue space L1 ([0, 1]), our conclusions remain unaffected if Vβ is arbitrarily deﬁned at β = 0 and β = 1. On the other hand, using the Riemann integral in (5) would have required us to restrict our attention to the open interval (0, 1) to get around the fact that limβ 0 Vβ = . We choose to focus on the family of r.v.s for which S(X) is well-deﬁned, i.e., {X | ϕ( )V( )(X) L1 ([0, 1])}. As noted in (Acerbi 2002), if ϕ is bounded, and X is integrable, then S(X) is well-deﬁned. SRM can be seen as a weighted average of the quantiles (Va R) of the underlying distribution, with the risk-aversion function ϕ used to deﬁne the weights. Moreover, CVa R can be recovered by using the following risk-aversion function:

ϕ(β) = 1 1 αI {β > α} , α (0, 1). (6)

In particular, using (6) in (5) leads to (3). Notice that CVa R uses the same weight for all tail-loss Va R values. In contrast, the risk-aversion function ϕ can be chosen such that higher losses receive a higher weight or at least, the same weight as lower losses (Dowd and Blake 2006). Thus, SRM can model a user s risk aversion better.

4 SRM Estimation: Bounded Case Estimation Scheme We estimate S(X), given i.i.d. samples X1, . . . , Xn from the distribution of X, by approximating the integral in SRM deﬁnition (5). Notice that the integrand Vβ(X) in (5) has to be estimated using the samples. Recall that b Vn,β is the estimate of Vβ(X), given by (4). We use the weighted Va R estimate to form a discrete sum to approximate the integral, an idea motivated by the trapezoidal rule (Talvila 2016). The estimate b Sn,m of S(X) is formed as follows:

ϕ(βk 1)b Vn,βk 1 + ϕ(βk)b Vn,βk

In the equation above, {βk}m k=0 is a partition of [0, 1] such that β0 = 0 and βk = βk 1 + β, where β = 1/m is the length of each sub-interval. For the special case of ϕ( ) as deﬁned in (6), we obtain an estimate of CVa R from (7) as follows:

b C n,m,α = β 2(1 α)

I {βk 1 > α} b Vn,βk 1

+ I {βk > α} b Vn,βk

In the next section, we present concentration bounds for the SRM estimator in (7), assuming that the underlying distribution has bounded support, and subsequently, specialize the bound to handle the case of CVa R.

Concentration Bounds

For notational convenience, we shall use Vβ and S to denote Vβ(X) and S(X), for any β (0, 1).

For all the results presented below, we let b S n,m denote the SRM estimate formed using (7). Let F and f denote the distribution and density of X, respectively. For the sake of analysis, we make the following assumptions on the risk-aversion function and the underlying distribution: (A1) The function ϕ( ) is differentiable in int(supp ϕ), where int(supp ϕ) denotes the interior of the support of the function ϕ. Further, ϕ satisﬁes the following conditions: |ϕ(β)| C1 and |ϕ (β)| C2, β int(supp ϕ). (A2) The r.v. X is continuous, with a CDF F that is differentiable on the interior of the support of X, and |X| B a.s. Further, the density f of X satisﬁes f(x) 1/δ1 > 0, for |x| B.

In (Dowd and Blake 2006), the authors recommend the exponential risk-aversion function, deﬁned by ϕ(β) = κ e κ(1 β)

1 e κ , β [0, 1], for some κ > 0. This choice for riskaversion satisﬁes (A1). Next, in (A2), we assumed that the density f of X is bounded below by 1 δ1 > 0. A similar assumption is used for analyzing the classic CVa R estimation scheme, cf. (Kagrecha, Nair, and Jagannathan 2019; Prashanth, Jagannathan, and Kolla 2020; Sun and Hong 2010). Further, one could argue that an assumption like (A2) is necessary to get a meaningful Va R concentration result, since the underlying distribution could be arbitrarily ﬂat see Remark 3.5 in (Prashanth, Jagannathan, and Kolla 2020). Truncated normal and exponential distributions are examples that satisfy (A2). The bounds on the error in a trapezoidal approximation of any integral usually assume the derivative of the integrand to be bounded (cf. Corollary 2.2 in (Talvila 2016)). In the context of SRM estimation, this condition translates to a bound on the derivatives of ϕ and Va R. (A1) ensures ϕ ( ) is bounded, while the condition f(x) 1/δ1 > 0 in (A2) together with V β = 1 f(Vβ), β (0, 1), ensures the derivative of Va R is bounded. Note that (A2) imposes a boundedness assumption on the r.v. X. If the random variable is unbounded, then for every ϵ > 0, there is an x such that 0 < f(x) < ϵ. This implies 1/f cannot be bounded above uniformly w.r.t x, and hence the derivative of Va R cannot be bounded either. We now present the main result that establishes an exponential tail bound for the SRM estimator in (7).

Theorem 4.1 (SRM concentration: bounded case). Assume (A1) and (A2). Choose K1 B C2 + δ1 C1, where B, δ1, C1, C2 are speciﬁed in (A1) and (A2). Fix ϵ > 0. Set the number of sub-intervals m = K1

2ϵ in (7). Then, we have

the following bound:

P S b S n,m > ϵ 4 K1

where δ1 is as speciﬁed in (A2).

Proof. See Section 7.

Next, we present a specialization of the tail bound above, to handle the case of CVa R.

Corollary 4.2 (CVa R concentration: bounded case). Assume (A2). Fix α [0, 1], and form the CVa R estimate b C n,m,α using (8), with m = K1

2ϵ , where K1 δ1 1 α. Then, for any ϵ > 0, we have

P[|Cα(X) b C n,m,α |> ϵ] 4 K1

exp n (1 α)2ϵ2

Proof. Follows from Theorem 4.1 after observing that the risk-aversion function ϕ( ) deﬁned in (6) is continuous and differentiable in the interior of its support, i.e., on (α, 1).

Under stronger conditions on the risk-aversion function and the underlying density, we can obtain an improved concentration result. These conditions are variants of (A1) and (A2), and are formalized below. (A1 ) (A1) holds. In addition, ϕ( ) is a twice differentiable function satisfying |ϕ (β)| C3, β int(supp ϕ). (A2 ) (A2) holds. In addition, the density f( ) is a differentiable function satisfying |f (x)|/f(x)3 δ2, for |x| B. As before, the exponential risk-aversion function satisﬁes (A1 ). Next, the stronger condition |f (x)|/f(x)3 δ2 used

in (A2 ), in conjunction with V β = f (Vβ) f(Vβ)3 , implies that the second derivative of Va R is bounded. Now, as before, a bounded second derivative implies that the underlying r.v. X is bounded. To see this, the expression for the second derivative of Va R involves a f /f 3 term, and if the r.v. X is unbounded, then a uniform bound on f /f 3 would mean that, as x , f decays too slowly to integrate to something ﬁnite, leading to a contradiction. More precisely, the differential inequality |f |/f 3 < K can be solved to get f(x) > C/

a + bx for large x and suitable constants a, b, and C. However, the expression on the RHS integrates to inﬁnity, and hence, no density f with unbounded support can have f /f 3 bounded.

Theorem 4.3 (SRM concentration: bounded case (variant)). Assume (A1 ) and (A2 ). Choose K2 such that |B C3 +2 δ1 C2 +δ2 C1| K2, where B, δ1, C1, C2, C3 are speciﬁed in (A1 ) and (A2 ). Fix ϵ > 0. Set the number of sub-

intervals m = q

in (7). Then, we have the following

P S b S n,m > ϵ 4

Proof. See Section 7.

For small values of ϵ, the bound in (10) is better than that in (9). However, the bound in (9) is derived under weaker assumptions on the r.v. X and the risk-aversion function ϕ, as compared to the bound in (10). A corollary of the result in Theorem 4.3 can be derived for the case of CVa R, and the reader is referred to (Pandey, Prashanth, and Bhat 2019) for further details.

5 SRM Estimation: Unbounded Case In this section, we focus on distributions that satisfy an exponential moment bound an assumption made precise below. (A3) The r.v. X is continuous, with a differentiable CDF F. Further, there exist ρ 1, ξ > 0, and χ < such that E (exp (ξ|X|ρ)) < χ. Sub-Gaussian and sub-exponential distributions with a differentiable CDF satisfy the assumption above (Vershynin 2018). In particular, a sub-Gaussian r.v. X satisﬁes (A3) with ρ = 2, while a sub-exponential r.v. X satisﬁes (A3) with ρ = 1.

Estimation Scheme The bounds for the trapezoidal-rule-based approximation of SRM in the section above required the underlying r.v. X to be bounded. Using these bounds together with a truncationbased estimation approach, we control the trapezoidal error approximation for the case of distributions with unbounded support. Let X1, . . . , Xn denote i.i.d. samples from the distribution of X. We form a truncated set of samples as follows: for each i = 1, . . . , n, deﬁne

Xi =Xi I {Xi Bn} , with Bn =

h ρ log log n

ξ , ρ = 1. (11)

Let e Fn(x) = 1

n Pn i=1 I[ Xi x], x R, denote the EDF of the truncated r.v. X = XI {X Bn}. Let e F 1 n denote the inverse of e Fn as deﬁned by (4). We form an SRM estimate along the lines of (7), except that the samples used are truncated, i.e.,

ϕ(βk 1)e Vn,βk 1 + ϕ(βk)e Vn,βk

where e Vn,β = e F 1 n (β), β (0, 1).

Concentration Bounds For the concentration bound presented below, we require the following assumption: (A4) There exists a positive sequence {ψ(n), n 1} such that, for all n 1, ψ(n) = o( n)1, and the density f underlying the r.v. X satisﬁes f(x) 1 ψ(n) for all |x| Bn, where Bn is deﬁned in (11). It is apparent that a Gaussian r.v. with variance σ2 satisﬁes (A3) with ρ = 2, and (A4) with ψ(n) =

2πσ (log n)

1f(n) = o(g(n)) if f(n)

g(n) 0, as n .

Further, an exponential r.v. with parameter λ satisﬁes (A3) with ρ = 1, and (A4) with ψ(n) = 1

λ ξ . Theorem 5.1 (SRM concentration: Exponential moment bounded case). Assume (A1). Suppose the r.v. X with CDF F satisﬁes (A3) and (A4). Let

(eξ) 1 ρ (ρ 1), ρ > 1,

ξ 1 + χ (1 F (Bn))e , ρ = 1.

Fix n 1 and ϵ > c log n. Let the number m of sub-intervals

satisfy m (Bn C2+ψ(n) C1)

(ϵ c log n) , where Bn is given by (11). Let

e Sn,m be formed using (12). Then, we have

P h S e S n,m > ϵ i [Bn C2 + ψ(n) C1] ϵ c log n

n h ϵ c log n i2

Proof. See Section 7.

The bound in the theorem above can be interpreted as follows: For every ϵ > 0, there exists an n0 such that the tail bound for SRM estimation is applicable for all n n0. It is easy to see that n0 is the smallest natural number satisfying n > exp c

ϵ . We believe that a bound for any estimator based on truncation will involve such a minimum number of samples constraint (cf. Proposition 4 in (Bhat and Prashanth 2019). For the sub-exponential case, i.e., ρ = 1, the tail bound above exhibits exponential decay, while the corresponding bound in (Bhat and Prashanth 2019) shows exponential decay for ϵ 1, and power law decay for ϵ > 1. Further, unlike (Bhat and Prashanth 2019), our bounds make all the constants explicit. A specialization of the result in 5.1 for the case of CVa R is provided in (Pandey, Prashanth, and Bhat 2019).

6 Simulation Experiments In this section, we demonstrate the efﬁcacy of our proposed method for SRM estimation (7), which we shall refer to as SRM-Trapz. In our experiments, we use the risk aversion function ϕ(β) = 5 e 5(1 β)

1 e 5 , β [0, 1]. In the following sub-section, we consider a synthetic experimental setting to compare the accuracy of SRM estimators. Subsequently, we use SRM-Trapz as a subroutine in a vehicular trafﬁc routing application (see section 6).

Synthetic Setup Figure 1 presents the estimation error as a function of the sample size for SRM-Trapz. The algorithm is run with two different sub-divisions. The samples are generated using a Gaussian distribution with mean 0.5 and variance 25. We observe that SRM-Trapz with 500 subdivisions performs on

par with SRM-Trapz with 150 subdivisions for every sample size. Further, as expected, increasing sample size leads to lower estimation error, while also increasing the conﬁdence (demonstrated by the shrinkage in standard error).

Figure 1: Error in SRM estimation (|SRM-True - Empirical SRM|) for different sample sizes. SRM-True is approximated using (7) with a large number of samples. Empirical SRM is calculated by two methods, (i) SRM-Trapz method with m = 150 subdivisions (SRM-Trapz 150), and (ii) SRM-Trapz method with m = 500 subdivisions (SRM-Trapz 500). In both methods, SRM is estimated using (7). The underlying distribution considered for this simulation is X N 0.5, 52 . The bars in the plot shows standard error averaged over 103 trials.

Distribution SRM-True SRM 1000

Exp(0.2) 10.99 0.01 11.02 1.21 N(0, 102) 107.36 0.01 107.80 1.32 Exp(0.01) 221.30 0.01 221.39 2.47 U( 103, 103) 612.47 0.02 612.65 4.91

Table 1: The results for SRM estimation, on four distributions, using two methods. Distributions are (a) Exponential distribution with mean 1/0.2 (Exp(0.2)), (b) Normal distribution with mean zero and variance 102 (N(0, 102)), (c) Exponential distribution with mean 1/0.01 (Exp(0.01)), (d) Uniform distribution with range 103 to 103 (U( 103, 103)). Methods are (i) Calculation of SRM (SRM-True) using (7) with m = 105 subdivisions, (ii) SRM-Trapz method (SRM-Trapz 1000) using (7) with m = 103 subdivisions. In method (i), 1012 and in method (ii), 104 i.i.d. samples are used for estimating SRM on each distribution, and the standard error is averaged over 106 and 103 iterations respectively.

Table 1 presents the results obtained by SRM-Trapz with 1000 subdivisions, for four different input distributions, three of which have unbounded support. SRM-Trapz, which is the un-truncated estimator, is found to exhibit low estimation error even for distributions with unbounded support. In Table 1, SRM-True is used as a proxy for the true SRM value in the experiments, and is set using (7) with a large number

(approximately, 1012) of samples. Given the lack of a closed form expression for SRM, the estimation scheme that we propose is a viable alternative to form a proxy SRM-True for the true SRM value.

Vehicular Trafﬁc Routing

Figure 2: Area of an urban city map, used for SUMO network.

In the vehicular routing application, the traditional objective is to ﬁnd a route with the lowest expected delay. However, such an objective ignores risk factors. An alternative is to consider the weighted-sum delay of each route, and we use SRM to quantify this objective. Thus, given a set of routes, the aim is to ﬁnd the route (by adaptive sampling) with the lowest SRM of the delay. Simulation of Urban MObility (SUMO) (Behrisch et al. 2011) is an open source, highly portable, microscopic road trafﬁc simulation package designed to handle large road networks. Trafﬁc Control Interface (Tra CI) (Wegener et al. 2008) is a library, providing extensive commands to control the behavior of the simulation online, including vehicle state, road conﬁguration, and trafﬁc lights. We implement our routing algorithm using SUMO and TRACI. For the experiments, we use the street map of the area around IIT Madras, Chennai, India (see Figure 2) obtained from Open Street Map (OSM) (Haklay and Weber 2008), and then used Netconvert tool to load the map in SUMO. The network has 426 junctions and a total edge length of 123 km. We ran SUMO on this network for 30, 000 time-steps, in which 7000 cars, 500 buses, 2000 bikes, 1000 cycles, and 1000 pedestrians were added at different time-steps and in different lanes uniformly. We choose K = 5 routes between two ﬁxed points, marked as S and D in Figure 2. On these selected routes, we added n = 1000 cars and tracked them. In Table 2, b Xn,i is the estimated average delay of the ith route, and b S n,m,i is the SRM estimate for the ith route, i = 1, . . . , K, using (7), and with n samples. The estimates are averages over 1000 independent trials. We set the number of subdivisions m = 100. From Table 2 it is apparent that ROUTE4 has the lowest expected delay, and ROUTE2 has the lowest SRM. We consider a best-arm identiﬁcation (BAI) bandit framework (Audibert, Bubeck, and Munos 2010), where an algorithm is given a ﬁxed budget. Here, the budget refers to the total number of

ROUTE1 ROUTE2 ROUTE3 ROUTE4 ROUTE5 ˆ Xn,i 283.81 287.15 306.80 266.85 325.86 b S n,m,i 431.28 361.81 455.83 378.68 390.95

Table 2: Results for the estimated average delay ( ˆXn,i) and estimated SRM (b S n,m,i), for ith route, where i = 1, . . . , K.

samples across routes. After the sampling budget, the algorithm is expected to recommend a route, and is judged by the probability that the recommended route is correct (i.e. the best route). We ran successive rejects (SR) (Audibert, Bubeck, and Munos 2010), which is a popular BAI algorithm, except that SR is modiﬁed to ﬁnd the route with lowest SRM. Note that the regular SR algorithm ﬁnds the route with the lowest expected delay. The setting of SUMO is as noted above. We set the budget n = 1000, number of routes K = 5, and m = 100 subdivisions for SRM-Trapz. We observed that SR algorithm picks ROUTE2 with probability 0.91.

7 Convergence Proofs

Proof of Theorem 4.1

We establish two results concerning the error of a trapezoidalrule-based approximation, and a concentration bound for the Va R estimate in (4). These results will be used in the proof of Theorem 4.1, which is provided subsequently.

Lemma 7.1. Under conditions of Theorem 4.1, we have

0 ϕ(β)Vβ dβ

ϕ(βk 1)Vβk 1 + ϕ(βk)Vβk

Proof. Let L ([0, 1]) denote the space of essentially bounded functions on [0, 1]. For any f L ([0, 1]), let f = inf{τ | |f(x)| τ for almost every x [0, 1]} denote the associated norm. Then, for k = 1, . . . , m, we have

βk ϕ(β)Vβ dβ

ϕ(βk 1)Vβk 1 + ϕ(βk)Vβk

ϕ( )V( ) K1

where the ﬁrst inequality in (14) follows from Corollary 2.2 in (Talvila 2016). The latter result provides a bound on the error in the trapezoidal approximation of a Lebesgue integral. For the second inequality in (14), we have used (A1), (A2), and the choice of K1 to infer ϕ( )V( ) B C2 + δ1 C1 K1.

The main claim can be inferred as follows:

0 ϕ(β)Vβ dβ

ϕ(βk 1)Vβk 1 + ϕ(βk)Vβk

k=1 Ek m K1

Lemma 7.2 (Va R concentration). Assume (A2). Then, for any β (0, 1) and ϵ > 0, we have

P h Vβ b Vn,β ϵ i 2 exp 2nϵ2

where δ1 is speciﬁed in (A2).

Proof. Following the proof of Proposition 2 in (Kolla et al. 2019), we obtain

P |b Vn,β Vβ| ϵ 2 exp 2nζ2 ϵ , (16)

where ζϵ = min{F(Vα +ϵ) F(Vα), F(Vα) F(Vα ϵ)}. Using (A2), we have F(Vα +ϵ) F(Vα) = f( V )ϵ ϵ δ1 , for some V (Vα, min{Vα + ϵ, B}). Along similar lines, we have F(Vα) F(Vα ϵ) ϵ δ1 , and we obtain ζϵ ϵ δ1 . The main claim follows by using the inequality above in (16).

Proof of Theorem 4.1.. Notice that

P h S b S n,m > ϵ i

ϕ(βk 1)Vβk 1 + ϕ(βk)Vβk

ϕ(βk 1)b Vn,βk 1 + ϕ(βk)b Vn,βk

The inequality above follows by using Lemma 7.1 to infer that for m K1

2ϵ , we have R 1 0 ϕ(β)Vβ dβ Pm k=1 ϕ(βk 1)Vβk 1+ϕ(βk)Vβk

2. Now, we have

P[| S b Sn,m |> ϵ]

ϕ(βk 1)Vβk 1 + ϕ(βk)Vβk

ϕ(βk 1)b Vn,βk 1 + ϕ(βk)b Vn,βk

P ϕ(β0)Vβ0 ϕ(β0)b Vn,β0 > ϵ 2m β

+ 2P ϕ(β1)Vβ1 ϕ(β1)b Vn,β1 > ϵ 2m β

2P ϕ(βm 1)Vβm 1 ϕ(βm 1)b Vn,βm 1 > ϵ 2m β

+ P ϕ(βm)Vβm ϕ(βm)b Vn,βm > ϵ 2m β

ϵ 2 mϕ(β0) β

ϵ 2 mϕ(β1) β

ϵ 2 mϕ(βm 1) β

ϵ 2 mϕ(βm) β

where we applied Lemma 7.2 to obtain the ﬁnal inequality. Thus,

P h S b S n,m > ϵ i 4m exp

= 4m exp n ϵ2

The claim follows.

Proof of Theorem 4.3.. The proof follows in a similar manner as that of Theorem 4.1, using a variation of the result in Lemma 7.1. The reader is referred to Section 6 of (Pandey, Prashanth, and Bhat 2019) for the details.

Proof of Theorem 5.1 Proof. Notice that

P h S e S n,m > ϵ i = P [I1 + I2 > ϵ] , where

0 ϕ(β)Vβ dβ

ϕ[βk 1]e Vn,βk 1+ϕ[βk]e Vn,βk

and I2 = Z 1

η ϕ(β)Vβ dβ.

In the above, η = F(Bn), with Bn as deﬁned in the theorem statement. We bound I2 as follows:

1 β = P (X > Vβ) = P exp(ξXρ) > exp(ξVρ β)

χ exp ξVρ β , (18)

or equivalently, Vβ 1

ξ log χ 1 β

We now handle the case of ρ > 1 and ρ = 1 separately. Case ρ > 1:

Using log x x e x > 0 , we obtain Vβ χ eξ(1 β) 1

ρ , leading to Z 1

(1 β) 1 ρ = χ

ρ ρ(1 η)1 1

ρ ρ ρ 1χ exp ξVρ η(ρ 1)

(using (18))

ρ ρ ρ 1χ exp ξBρ n(ρ 1)

(since Vη = Bn)

η Vβ dβ ρχC1 (eξ) 1 ρ (ρ 1) log n , (20)

where the ﬁnal inequality uses the deﬁnition of Bn in (11). Case ρ = 1: Using (19), we have Z 1

η log χ 1 β

1+ χ (1 η)e

χ exp ( ξVη)

1+ χ (1 η)e

= χ exp ( ξBn)

1 + χ (1 η)e

. (since Vη = Bn)

Using (A1) and Bn = log log(n)

ξ , we obtain

η Vβ dβ C1χ ξ log n

1 + χ (1 η)e

Applying the bound in the Theorem 4.1 to the truncated r.v. Z = XI {X Bn}, and using (A4), we obtain

P [I1 > ϵ] K1

With c as in the theorem statement, we obtain

P [I1 + I2 > ϵ]

K1 ϵ c log n exp

n ϵ c log n 2

(using (20), (21) and (22))

= (Bn C2 + ψ(n) C1) ϵ c log n exp

n ϵ c log n 2

By using a parallel argument, a concentration result for bounding the lower semi-deviations can be derived. We omit the details.

8 Conclusions We proposed a novel SRM estimation scheme that is based on numerical integration, and derived concentration bounds for our SRM estimator for the case of distributions with bounded support, as well as distributions satisfying an exponential moment bound. As future work, it would be interesting to derive a lower bound for SRM estimation, and close the gap (if any) w.r.t. the upper bound that we have derived.

Acknowledgments This research work was supported in part by a Department of Science and Technology (DST) grant under the ECRA program.

References Acerbi, C. 2002. Spectral measures of risk: A coherent representation of subjective risk aversion. Journal of Banking & Finance 26(7): 1505 1518. Acerbi, C.; and Tasche, D. 2002. On the coherence of expected shortfall. Journal of Banking & Finance 26(7): 1487 1503. Artzner, P.; Delbaen, F.; Eber, J. M.; and Heath, D. 1999. Coherent measures of risk. Mathematical ﬁnance 9(3): 203 228. Audibert, J. Y.; Bubeck, S.; and Munos, R. 2010. Best arm identiﬁcation in multi-armed bandits. In Conference on Learning Theory, 41 53. Behrisch, M.; Bieker, L.; Erdmann, J.; and Krajzewicz, D. 2011. SUMO Simulation of Urban MObility. In The Third International Conference on Advances in System Simulation (SIMUL 2011), volume 42. Bhat, S. P.; and Prashanth, L. A. 2019. Concentration of risk measures: A Wasserstein distance approach. In Advances in Neural Information Processing Systems, 11739 11748. Brown, D. B. 2007. Large deviations bounds for estimating conditional value-at-risk. Operations Research Letters 35(6): 722 730. Dowd, K.; and Blake, D. 2006. After Va R: the theory, estimation, and insurance applications of quantile-based risk measures. Journal of Risk and Insurance 73(2): 193 229. Galichet, N.; Sebag, M.; and Teytaud, O. 2013. Exploration vs exploitation vs safety: Risk-aware multi-armed bandits. In Asian Conference on Machine Learning, 245 260. Haklay, M.; and Weber, P. 2008. Openstreetmap: Usergenerated street maps. IEEE Pervasive Computing 7(4): 12 18. Kagrecha, A.; Nair, J.; and Jagannathan, K. 2019. Distribution oblivious, risk-aware algorithms for multi-armed bandits with unbounded rewards. In Advances in Neural Information Processing Systems, 11269 11278. Khim, J.; Leqi, L.; Prasad, A.; and Ravikumar, P. 2020. Uniform convergence of rank-weighted learning. In International Conference on Machine Learning, 5254 5263. PMLR. Kolla, R. K.; Prashanth, L. A.; Bhat, S. P.; and Jagannathan, K. P. 2019. Concentration bounds for empirical conditional value-at-risk: The unbounded case. Operations Research Letters 47(1): 16 20. Krokhmal, P.; Palmquist, J.; and Uryasev, S. 2002. Portfolio optimization with conditional value-at-risk objective and constraints. Journal of Risk 4: 43 68. Mulvey, J. M.; and Erkan, H. G. 2006. Applying CVa R for decentralized risk management of ﬁnancial companies. Journal of Banking & Finance 30(2): 627 644.

Pandey, A. K.; Prashanth, L. A.; and Bhat, S. P. 2019. Estimation of Spectral Risk Measures. Co RR abs/1912.10398. Prashanth, L. A.; Jagannathan, K.; and Kolla, R. K. 2020. Concentration bounds for CVa R estimation: The cases of light-tailed and heavy-tailed distributions. In International Conference on Machine Learning, volume 119, 5577 5586. PMLR.

Serﬂing, R. J. 2009. Approximation theorems of mathematical statistics, volume 162. John Wiley & Sons. Sun, L.; and Hong, L. J. 2010. Asymptotic representations for importance-sampling estimators of value-at-risk and conditional value-at-risk. Operations Research Letters 38(4): 246 251. Talvila, E. 2016. Higher order corrected trapezoidal rules in Lebesgue and Alexiewicz spaces. ar Xiv preprint ar Xiv:1604.08643 .

Vershynin, R. 2018. High-dimensional probability: An introduction with applications in data science, volume 47. Cambridge university press.

Wainwright, M. J. 2019. High-dimensional statistics: A nonasymptotic viewpoint, volume 48. Cambridge University Press.

Wang, Y.; and Gao, F. 2010. Deviation inequalities for an estimator of the conditional value-at-risk. Operations Research Letters 38(3): 236 239. Wegener, A.; et al. 2008. Tra CI: an interface for coupling road trafﬁc and network simulators. In Proceedings of the 11th communications and networking simulation symposium, 155 163. ACM.