# Online Knapsack with Frequency Predictions

Sungjin Im (Electrical Engineering and Computer Science, University of California, Merced; sim3@ucmerced.edu) · Ravi Kumar (Google Research, Mountain View, CA; ravi.k53@gmail.com) · Mahshid Montazer Qaem (Electrical Engineering and Computer Science, University of California, Merced; mmontazerqaem@ucmerced.edu) · Manish Purohit (Google Research, Mountain View, CA; mpurohit@google.com)

**Abstract.** There has been recent interest in using machine-learned predictions to improve the worst-case guarantees of online algorithms. In this paper we continue this line of work by studying the online knapsack problem, but with very weak predictions: in the form of knowing an upper and lower bound for the number of items of each value. We systematically derive online algorithms that attain the best possible competitive ratio for any fixed prediction; we also extend the results to more general settings such as generalized one-way trading and two-stage online knapsack. Our work shows that even seemingly weak predictions can be utilized effectively to provably improve the performance of online algorithms.

## 1 Introduction

An algorithm designer's ultimate goal is to develop algorithms that are reliable and work well in practice. High reliability has been the major focus of discrete algorithmic research, which primarily seeks to give strong worst-case guarantees for all inputs. Unfortunately, such algorithms treat all input instances on an equal footing, even pathological ones that rarely occur in practice, and do not necessarily exploit the latent structure that can be present in realistic instances. In contrast, machine learning has demonstrated amazing successes in many real-world applications, yet occasionally exhibits unacceptable failures on specific (worst-case) instances.
Learning-augmented algorithm design has recently emerged as a model to achieve both reliability and worst-case guarantees, particularly in the online setting. In this model, an online algorithm is given certain predictions on the input it will face in the future, and the algorithm then makes irrevocable decisions as the actual input is revealed one by one. Ideally, the algorithm should perform close to the optimum when the predictions are good, and yet possess worst-case guarantees even when the predictions are malicious. Recent work has shown that this is achievable for online problems such as caching [17, 26, 29], ski-rental [2, 14, 22], scheduling [6, 15, 22, 28], load balancing [23], the secretary problem [5], metrical task systems [4], set cover [7], flow/matching [24], bin packing [3], etc.

Most prior work on learning-augmented online algorithms assumes that we are given as prediction a specific value of key parameters of the online instance, e.g., the number of days to ski in the ski-rental problem or the next request time of a page in the caching problem. In this paper, we take a relaxed approach and assume that the prediction only gives a range of values for key parameters. This is a natural way to model weak predictions where the predictor has high uncertainty and can only provide a reliable range in which the value must lie. Given such weak predictions, we seek to obtain beyond-worst-case guarantees for all inputs respecting the prediction, by possibly exploiting the structure provided by these predictions. In fact, we seek algorithms that give worst-case guarantees for all inputs while promising sharper bounds for instances that indeed respect the predictions. In this paper, we design such learning-augmented algorithms with weak predictions for the online knapsack problem, which is a basic problem of practical importance [31, 30].

35th Conference on Neural Information Processing Systems (NeurIPS 2021).

**Knapsack Problem and the Prediction Model.**
In the classic online knapsack problem, we have a knapsack of unit capacity^i and n items that arrive online. Each item i has a profit p_i ≥ 0 and a size 0 < s_i ≤ 1. For convenience, we define the value of an item, v_i := p_i/s_i, to be the profit of the item per unit size. We assume that the value of each item lies in a range known a priori, i.e., we have v_min ≤ v_i ≤ v_max for some known constants v_min and v_max.^ii When an item arrives, an online algorithm must make an irrevocable choice of whether to accept or reject the item. The goal is to select a subset of items of the highest total profit whose total size is at most the knapsack capacity. Recall that an online algorithm is said to have a competitive ratio c, or equivalently, to be c-competitive, if the algorithm's objective (profit) is at least c times the total profit obtained by the optimal solution for all inputs. Zhou, Chakrabarty, and Lukose [31] gave a 1/(1 + ln(v_max/v_min))-competitive algorithm (that we call ZCL) under the assumption that all items have infinitesimal sizes (or can be accepted fractionally) and further showed that no algorithm can obtain a better competitive ratio.

Online knapsack is a fundamental problem in online algorithm design and finds applications in many real-world settings, e.g., in the context of online auctions [9, 31] and charging electric vehicles [30]. For these practical applications, however, the worst-case guarantees offered by standard competitive analysis are too pessimistic to be useful. Indeed, practical instances are not typically worst-case, and one often possesses some foreknowledge of the kind of items that are expected to arrive in the future. In an online ad-auction, for instance, the advertiser can reliably predict a range such that the number of queries in a day for a particular keyword lies within the range.
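For reference, the ZCL algorithm admits a very short implementation: it accepts an arriving item iff its value clears a threshold ϕ(t) = (e·v_max/v_min)^t · (v_min/e) that depends only on the fraction t of capacity already used (the threshold function is restated in Section 1.1). A minimal sketch, with function names of our own choosing:

```python
import math

def zcl_threshold(t, v_min, v_max):
    # phi(t) = (e * v_max / v_min)^t * (v_min / e); phi(0) = v_min/e, phi(1) = v_max
    return (math.e * v_max / v_min) ** t * (v_min / math.e)

def zcl(items, v_min, v_max, capacity=1.0):
    """Run ZCL on a sequence of (value, size) pairs; sizes are assumed tiny.
    Returns the total profit collected."""
    used, profit = 0.0, 0.0
    for value, size in items:
        if used < capacity and value >= zcl_threshold(used, v_min, v_max):
            take = min(size, capacity - used)  # accept fractionally at the boundary
            used += take
            profit += value * take
    return profit
```

On a stream consisting only of value-1 items (with v_min = 1, v_max = 100), this accepts roughly a 1/(1 + ln 100) fraction of the capacity, matching the worst-case guarantee.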
In this work, we explore formal models to utilize such predictions to obtain provably good algorithms for online knapsack beyond what is possible via classical competitive analysis. In particular, we introduce and study the online knapsack problem with frequency predictions (KNAPSACK-FP), described next. Let V denote the set of all possible values that items may take, and let v_min = min(V) and v_max = max(V) denote the minimum and maximum values respectively. We may assume that the items have discrete values by rounding them to the closest power of (1 + ε) for some ε > 0, yielding |V| = O(log(v_max/v_min)) and making the competitive ratio only marginally worse. For any value v ∈ V, let s_v = Σ_{i : v_i = v} s_i denote the total size of items with value v. In the online knapsack with frequency predictions problem, for each value v ∈ V, the frequency predictions P = {ℓ_v, u_v}_{v∈V} provided to the algorithm are a lower bound ℓ_v and an upper bound u_v such that ℓ_v ≤ s_v ≤ u_v. We say that the algorithm has a competitive ratio of α for KNAPSACK-FP with prediction P if the algorithm's profit is at least α times the optimum for all inputs respecting the given frequency prediction P. Our goal is to design an online algorithm with the best competitive ratio for this prediction model.

## 1.1 Our Results and Techniques

Our main result is a nearly optimal algorithm for KNAPSACK-FP.

Theorem 1. For KNAPSACK-FP, given any frequency predictions, we can find an algorithm of competitive ratio α − (v_max/v_min)(|V| + 1)·max_i s_i, where α is the best possible competitive ratio for the problem (with predictions).

Our algorithm, called SENTINEL, pre-computes a budget b_v for each value v and accepts items so as to maintain the invariant that the total size of accepted items of value at most v is at most B_v := Σ_{v_min ≤ x ≤ v} b_x for all v ∈ V. In some sense, B_v can be viewed as a sentinel to guard against accepting too many low-valued items.
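Operationally, SENTINEL needs nothing beyond the precomputed budgets: track how much has been accepted at each value, and accept an arriving item iff every prefix budget B_x with x ≥ v still has room. A minimal sketch of this accept rule (the budget computation itself is described in Section 3; with items of nonzero size, the strict check A(x) < B(x) used there becomes a ≤ check after adding the item's size; class and method names are our own):

```python
from bisect import bisect_left
from itertools import accumulate

class Sentinel:
    """Budget-based accept rule: an item of value v is accepted iff, after adding
    its size, the total size of accepted items of value <= x stays within B(x)
    for every x >= v.  `budgets[i]` is the budget b for values[i]."""
    def __init__(self, values, budgets):
        self.values = list(values)                # assumed sorted ascending
        self.B = list(accumulate(budgets))        # B[i] = cumulative budget
        self.accepted = [0.0] * len(self.values)  # accepted size per value

    def offer(self, value, size):
        i = bisect_left(self.values, value)
        prefix = list(accumulate(self.accepted))  # prefix[j] = A(values[j])
        if all(prefix[j] + size <= self.B[j] + 1e-12 for j in range(i, len(self.B))):
            self.accepted[i] += size
            return True
        return False
```

For example, with V = {1, 2} and budgets (4/7, 3/7) (cf. Section 3.2), the rule accepts value-1 items only up to a total size of 4/7, while value-2 items may consume all remaining capacity.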
The key intuition underlying our algorithm is that the knowledge regarding the arrival of future higher-valued items enables it to not over-allocate capacity to earlier lower-valued items. Our algorithm can also be implemented using only O(log |V|) time per item. We note that the additive loss of (v_max/v_min)(|V| + 1)·max_i s_i in the competitive ratio in Theorem 1 is due to a reduction to a continuous version of the problem (as discussed in Section 2), and this quantity approaches zero as max_i s_i → 0.

^i This is without loss of generality by scaling all sizes.
^ii Such an assumption is necessary since otherwise we cannot hope for a bounded competitive ratio in the adversarial model [27, 31].

Interestingly, our algorithm has a distinguishing feature from the ZCL [31] algorithm for the standard online knapsack. The ZCL algorithm constructs a threshold function ϕ(t) = (e·v_max/v_min)^t · (v_min/e) a priori and accepts an item online if its value is no smaller than ϕ(t), where t is the knapsack capacity that has been consumed. In contrast, we show that no algorithm that maintains a threshold function based only on the capacity used can be optimal (Section 3.2). We note that by combining our algorithm and ZCL (say, by using a γ fraction of the capacity to run our algorithm and the other 1 − γ fraction to run ZCL), we can simultaneously obtain a worst-case guarantee of (1 − γ)/(1 + ln(v_max/v_min)) as well as our improved guarantees for instances that respect the prediction. Further, we can easily extend our result to show that our algorithm loses little in the competitive ratio when the predictions are slightly incorrect, for example, when we have ℓ_v/(1 + ε) ≤ s_v ≤ u_v(1 + ε).

**Extensions.** We extend our model and results to two generalizations of the knapsack problem: the generalized one-way trading problem and the two-stage online knapsack problem. In generalized one-way trading [13, 30], each item of type t is associated with a certain concave function f_t with f_t(0) = 0.
If an algorithm accepts s amount of items having type t, then it obtains profit f_t(s). The goal, as always, is to accept a set of items subject to the knapsack capacity so as to maximize the total profit. This is a well-studied problem, motivated by converting a given amount of money from one currency to another in the presence of time-varying conversion rates.

The two-stage online knapsack is a generalization that we introduce in this paper. The key difference from the standard setting is that after seeing all the items, the online algorithm is given another chance, in a second stage, to accept any items that it rejected before. But an item accepted in the second stage gives only a λ ≤ 1 fraction of its original profit. The two-stage knapsack effectively interpolates between the online and offline problems: it becomes the standard online knapsack problem if λ = 0, and the offline problem if λ = 1. The two-stage setting has natural applications. For example, while clients typically prefer to know instantly whether their requests will be accepted, some might choose to wait for a discount, albeit with the possibility of getting no service if the service provider accepts other competing requests arriving later.

For both extensions we generalize our results and obtain the best competitive algorithms given frequency predictions (Sections 4 and 5). This shows that our prediction model and analysis could find more applications, providing beyond-worst-case guarantees even with weak predictions.

**Experiments.** In Section 6, we conduct experiments on synthetic data sets and show that our algorithm augmented with frequency predictions considerably outperforms the classic worst-case ZCL algorithm.

## 1.2 Other Related Work

The knapsack problem has been extensively studied in the literature. For a comprehensive survey of the results on the offline knapsack problem, see [18]. The offline version of the standard knapsack problem and most of its variants are well known to be NP-hard.
The classic online knapsack problem falls into a general class of problems called online packing problems, and one can obtain an Ω(1/ln(v_max/v_min))-competitive algorithm via the online primal-dual technique [10, 11]. For a nice survey of the online primal-dual technique and its applications to various packing and covering problems, see [12]. Sun et al. [30] generalize the threshold algorithm in [31] by introducing a control parameter that adjusts how fast the threshold grows. They empirically learn the best parameter within a certain range to retain some worst-case guarantees, but do not provide optimality. Kong et al. [21] empirically show that reinforcement learning can find a good threshold for the online knapsack.

Another line of work considers the stochastic online knapsack problem where items' profits and sizes are drawn from a distribution (e.g., see [27, 25, 20] and their follow-up works). In the random arrival model, the items are assumed to be chosen adversarially but arrive in a uniformly random order. In this model, one can obtain constant-competitive algorithms [1, 19]; further, if all items have infinitesimal size, one can obtain nearly optimal online algorithms [19]. There are many other variants studied and we only mention a few here, e.g., removable online knapsack problems [16] and online knapsack with reservation costs [8].

## 2 Preliminaries

To ease the presentation, we will consider the following relaxed model, which differs from KNAPSACK-FP in two aspects: first, we allow an item to be accepted fractionally, and second, we allow an item to have any value in [v_min, v_max], not just values from a discrete set. The second relaxation is purely for ease of presentation and it preserves the competitive ratio. The first relaxation results in a small additive loss in the competitive ratio, since at most one item is accepted fractionally among those of the same value and all items are assumed to be small. We now formalize the relaxed model.
**Continuous-Valued Online Knapsack with Frequency Predictions (C-KNAPSACK-FP).** In the continuous version, we assume that there is a continuum of items, each having infinitesimal size and some value v ∈ [v_min, v_max]. In a particular problem instance, for any value v ∈ [v_min, v_max], let s(v)dv be the total size of items having value in the range [v, v + dv] for some sufficiently small dv. For each value v ∈ [v_min, v_max], we are given a lower bound^iii ℓ(v) and an upper bound u(v) such that s(v) ∈ [ℓ(v), u(v)]. We assume that s(v), ℓ(v), u(v) are all continuous in v. Define S(v) := ∫_{v_min}^{v} s(x)dx, L(v) := ∫_{v_min}^{v} ℓ(x)dx, and U(v) := ∫_{v_min}^{v} u(x)dx, which are differentiable at all v ∈ [v_min, v_max] by definition. As in the standard model, we have a knapsack of unit capacity, and the goal is to select a subset of items of total size at most one that maximizes total value. The following relates the competitive ratios of the standard and the continuous-valued versions.

Lemma 2. Given a prediction P = {ℓ_v, u_v}_{v∈V} for KNAPSACK-FP, we can create a prediction P′ = {ℓ(v), u(v)}_{v∈[v_min,v_max]} for C-KNAPSACK-FP in polynomial time such that:
1. An α-competitive algorithm for C-KNAPSACK-FP with prediction P′ implies an (α − (v_max/v_min)·|V|·max_i s_i)-competitive algorithm for KNAPSACK-FP with prediction P.
2. If no algorithm has a competitive ratio better than α for C-KNAPSACK-FP with prediction P′, then no algorithm has a competitive ratio better than α + max_i s_i for KNAPSACK-FP with prediction P.

Thus, if we find an algorithm with the optimum competitive ratio for C-KNAPSACK-FP, then we will have Theorem 1. In the rest of the paper we only consider the continuous version for convenience.

## 3 The SENTINEL Algorithm

Let {ℓ(v), u(v)}_{v∈[v_min,v_max]} be a fixed set of predictions; this defines a family of knapsack instances, where each instance is a sequence of items, each with a profit and size. Let α denote the desired competitive ratio of our algorithm.
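The cumulative quantities S, L, U are plain integrals, so in any numerical treatment of the continuous model they reduce to prefix sums on a grid. A small sketch (grid resolution and names are our choices):

```python
import numpy as np

def cumulative(f, v_min, v_max, n=1000):
    """F(v) = integral of f from v_min to v, evaluated on a uniform grid.
    Used for S(v), L(v), U(v); returns (grid, F-values)."""
    xs = np.linspace(v_min, v_max, n + 1)
    ys = np.array([f(x) for x in xs])
    # trapezoidal cumulative integral; F[0] = 0 by definition
    F = np.concatenate(([0.0], np.cumsum((ys[1:] + ys[:-1]) / 2) * (xs[1] - xs[0])))
    return xs, F
```

For instance, with a constant upper-bound density u(v) = 0.02 on [1, 100], U(v_max) = 1.98 > 1, so the knapsack cannot simply accept everything (the case distinction made in Step 4 of Section 3).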
We first derive an online algorithm that guarantees a competitive ratio of α assuming one exists, and then discuss how one can find the optimal such competitive ratio α. For any instance I, let OPT(I) denote the total value obtained by the optimal solution for instance I. To systematically derive our algorithm, it is instructive to consider the following family of instances. For any value v ∈ [v_min, v_max], let M(v) denote an instance containing the maximum possible amount of items of value at most v and the minimum possible amount of items of value larger than v. Formally, M(v) is an instance such that s(x) = u(x) for all x ≤ v and s(x) = ℓ(x) for all x > v. Further, items in M(v) arrive in the following order: first ℓ(x) items of each value x ∈ [v_min, v_max] arrive in non-decreasing order of value, followed by u(x) − ℓ(x) items of each value x ≤ v arriving in non-decreasing order. Note that we consider M(v) only to derive our algorithm.

Step 1. Let v* denote the largest value such that

∫_{v*}^{v_max} x·ℓ(x)dx = α·OPT(M(v*)). (1)

Intuitively, v* denotes the largest value such that an algorithm is guaranteed to be α-competitive even if it does not accept any items of value less than v*. Note that since we assume that the functions ℓ(·) and u(·) are continuous, such a value v* is guaranteed to exist by the intermediate value theorem.

^iii We use the functional notation (e.g., ℓ(v)) for the continuous case and the subscript notation (e.g., ℓ_v) for the discrete case.

Step 2. Consider an adversary that has constructed an instance of the form M(x) for some unknown value x. By instead constructing an instance of the form M(x + dx), the adversary can gain an additional profit of (d/dx)OPT(M(x))·dx. In order to maintain a competitive ratio of α, our algorithm must accept enough additional items of value x so that it receives a profit of at least α·(d/dx)OPT(M(x))·dx. Motivated by this discussion, we define a function β : [v*, v_max] → [0, ∞) to determine the additional amount of items of a particular value that our algorithm accepts.
x·β(x) = α·(d/dx)OPT(M(x)), ∀x ∈ [v*, v_max]. (2)

Step 3. For any value x ∈ [v_min, v_max], we define a budget function as follows:

b(x) := ℓ(x) + β(x) for all x ≥ v*, and b(x) := 0 for all x < v*; and B(v) := ∫_{v_min}^{v} b(x)dx for all v ∈ [v_min, v_max]. (3)

For any value v, the budget function B(v) defined above prescribes the maximum total amount of items of value at most v that may be accepted. Formally, let A(v) denote the total amount of items of value at most v that has been accepted by the online algorithm so far. Then, when an item of value v arrives, the algorithm accepts the item if and only if A(x) < B(x) for all x ≥ v.

Step 4. Finally, we discuss how to find the optimal competitive ratio. The functions β(·), b(·), and B(·) defined above are all (implicitly) functions of the desired competitive ratio α. To make this dependence explicit, let B_α(·) denote the budget function for a fixed α. It can be easily verified that B_α(v_max) is a continuous and monotonically increasing function of α. Assume U(v_max) > 1, since otherwise we can accept all items by setting b(x) = u(x) and we obtain a competitive ratio of 1. We set α ∈ (0, 1) to be such that B_α(v_max) = 1. Again, by the intermediate value theorem, such an α is guaranteed to exist, and one can compute it up to arbitrary precision by binary search. For clarity, we include a formal description of the algorithm in the Supplementary Material.

## 3.1 Analysis

We begin by proving that our algorithm is indeed α-competitive for any instance of C-KNAPSACK-FP that respects the given frequency predictions. Let ṽ be the largest value such that A(ṽ) = B(ṽ); if no such value exists, set ṽ = v_min. Let ALG and OPT denote the total value of items chosen by our online algorithm and the optimal solution, respectively. In the following two claims, we bound ALG and OPT respectively.

Claim 3. ALG ≥ ∫_{v_min}^{ṽ} x·b(x)dx + ∫_{ṽ}^{v_max} x·s(x)dx.

Proof. For convenience of notation, let a(v) = (d/dx)A(x)|_{x=v} denote the marginal amount of items of value v accepted by the algorithm.
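Steps 1–4 can be carried out numerically by discretizing [v_min, v_max]: compute OPT(M(v)) for each grid value by fractional greedy, locate v* via (1), obtain β from (2) by differencing, and binary-search the α at which the total budget is exactly 1. A sketch under our own discretization choices (not the paper's formal algorithm, which is in their Supplementary Material):

```python
import numpy as np

def opt_greedy(xs, density, dx, cap=1.0):
    """Fractional-knapsack optimum: take the highest values first until cap is full."""
    total, used = 0.0, 0.0
    for x, d in zip(xs[::-1], density[::-1]):
        amt = min(d * dx, cap - used)
        total += x * amt
        used += amt
        if used >= cap:
            break
    return total

def budget_density(alpha, xs, ell, u, dx):
    """Budget density b(x) on the grid for a candidate ratio alpha (Steps 1-3)."""
    n = len(xs)
    # OPT(M(v)): density u below v, ell above v
    opt_M = np.array([opt_greedy(xs, np.where(xs <= xs[i], u, ell), dx)
                      for i in range(n)])
    tail = np.cumsum((xs * ell)[::-1])[::-1] * dx  # ~ integral_v^{vmax} x*ell(x) dx
    ok = tail >= alpha * opt_M                     # eq. (1): v* = largest such v
    i_star = int(np.nonzero(ok)[0].max()) if ok.any() else 0
    beta = alpha * np.gradient(opt_M, dx) / xs     # eq. (2)
    return np.where(np.arange(n) >= i_star, ell + beta, 0.0)

def find_alpha(xs, ell, u, dx, iters=30):
    """Binary-search alpha so that B_alpha(v_max) = 1 (Step 4)."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if budget_density(mid, xs, ell, u, dx).sum() * dx <= 1.0:
            lo = mid
        else:
            hi = mid
    return lo
```

As a sanity check, with perfect predictions ℓ = u (and enough items to overfill the knapsack), β vanishes, the budgets reduce to the top items, and the search drives α to 1, consistent with Section 3.1.1.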
Since we have A(v) < B(v) for all v > ṽ, the algorithm accepts all items having value larger than ṽ, and thus a(x) = s(x) for all x > ṽ. Also, by the definition of ṽ, we have A(ṽ) = B(ṽ). Thus, we have

ALG = ∫_{v_min}^{ṽ} x·a(x)dx + ∫_{ṽ}^{v_max} x·s(x)dx.

Integrating the first term by parts yields

∫_{v_min}^{ṽ} x·a(x)dx = ṽ·A(ṽ) − v_min·A(v_min) − ∫_{v_min}^{ṽ} A(x)dx ≥ ṽ·B(ṽ) − v_min·B(v_min) − ∫_{v_min}^{ṽ} B(x)dx = ∫_{v_min}^{ṽ} x·b(x)dx,

since A(ṽ) = B(ṽ) and A(x) ≤ B(x) for all x, by the definition of our algorithm. The claim follows. □

Claim 4. OPT ≤ OPT(M(ṽ)) + ∫_{ṽ}^{v_max} x·(s(x) − ℓ(x))dx.

Proof. Let the optimum solution accept s(x) − ℓ(x) items of value x for free, without consuming the knapsack capacity, for all x ∈ [ṽ, v_max]. The remaining items form an instance I′ that is dominated by M(ṽ), i.e., for any value x ∈ [v_min, v_max], the amount of items having value x in the instance I′ is at most the amount of items with value x in M(ṽ). However, by definition, OPT(M(ṽ)) is the maximum profit that one can obtain from M(ṽ), and the claim follows. □

The following claim is immediate from the definition of SENTINEL and is useful to relate the value of the optimal solution for any instance of the form M(v) to the budget function b(·) utilized by our algorithm.

Claim 5. For any value v ∈ [v*, v_max], α·OPT(M(v)) = ∫_{v}^{v_max} x·ℓ(x)dx + ∫_{v_min}^{v} x·b(x)dx.

Proof. For all v ≥ v*, we have

OPT(M(v)) = OPT(M(v*)) + ∫_{v*}^{v} (d/dx)OPT(M(x))dx;

substituting (1) and (2),

α·OPT(M(v)) = ∫_{v*}^{v_max} x·ℓ(x)dx + ∫_{v*}^{v} x·β(x)dx = ∫_{v}^{v_max} x·ℓ(x)dx + ∫_{v*}^{v} x·(ℓ(x) + β(x))dx.

The claim follows from the definition of b(x) = ℓ(x) + β(x) for all x ≥ v*, and b(x) = 0 for all x < v*. □

Finally, we are ready to show that the algorithm is indeed α-competitive. Claims 3 and 5 together yield

ALG ≥ α·OPT(M(ṽ)) − ∫_{ṽ}^{v_max} x·ℓ(x)dx + ∫_{ṽ}^{v_max} x·s(x)dx ≥ α·(OPT(M(ṽ)) + ∫_{ṽ}^{v_max} x·(s(x) − ℓ(x))dx) ≥ α·OPT,

where the middle inequality uses α ≤ 1 and s(x) ≥ ℓ(x), and the last inequality follows from Claim 4. Thus, we have the following theorem. As discussed, the value of α depends on the predictions.

Theorem 6. For any input instance that respects the frequency predictions {ℓ(v), u(v)}_{v∈[v_min,v_max]}, the SENTINEL algorithm is α-competitive for C-KNAPSACK-FP.

### 3.1.1 Optimality of Competitive Ratio

In this section we show that the competitive ratio α obtained by SENTINEL is optimal.
In other words, no algorithm, even a randomized one, can obtain a competitive ratio better than α for C-KNAPSACK-FP. We will consider the family of instances {M(v)}_{v∈[v_min,v_max]}. Recall that in the instance M(v), first ℓ(x) amount of items of each value x ∈ [v_min, v_max] arrives in non-decreasing order of value, followed by u(x) − ℓ(x) amount of items of each value x ∈ [v_min, v], which also arrive in non-decreasing order of value. Consequently, until an algorithm sees items of value higher than v in the second phase of arrivals, it cannot distinguish between the instances in {M(x)}_{x∈[v,v_max]}. In other words, the instance M(v) is a prefix of each of the instances in {M(x)}_{x∈[v,v_max]}. Thus for any value v, the adversary can either strategically stop the instance at M(v) or continue to release higher-valued items.

Lemma 7. For any given set {ℓ(v), u(v)}_{v∈[v_min,v_max]} of frequency predictions, let α denote the competitive ratio of the SENTINEL algorithm. Then no deterministic algorithm can obtain a competitive ratio larger than α on all instances in the family {M(v)}_{v∈[v_min,v_max]}.

To see why randomization does not help, consider the deterministic algorithm that accepts each item by the expected amount that the optimum randomized algorithm accepts. This is well defined as we are allowed to choose items fractionally. It is easy to see that this algorithm never exceeds the knapsack capacity and obtains as much value as the randomized algorithm. Thus, we have:

Theorem 8. For any given set {ℓ(v), u(v)}_{v∈[v_min,v_max]} of frequency predictions, there exists no randomized algorithm that attains a competitive ratio better than that of the SENTINEL algorithm.

Figure 1: Illustration of the optimal competitive ratio obtained by SENTINEL (in red) for predictions ℓ_v = 1/100 and u_v = (1 + δ)ℓ_v for integer values v ∈ [v_min = 1, v_max = 100]. The competitive ratio is shown for δ ∈ [0, 2]. For comparison, the competitive ratio resulting from accepting ℓ_v items of each value v is shown in blue, i.e., OPT(M(0))/OPT(M(v_max)).
Thus SENTINEL yields the optimal competitive ratio for any given set of predictions. It can be easily verified that the classical adversarial online knapsack problem is equivalent to the special case with predictions ℓ(v) = 0 and u(v) = ∞ for all v ∈ [v_min, v_max]. For this special case, we recover the competitive ratio 1/(1 + ln(v_max/v_min)) of ZCL. For more informative predictions, we obtain much tighter competitive ratios. Indeed, if ℓ(v) = u(v) for all v ∈ [v_min, v_max], then α = 1 and SENTINEL obtains the optimum solution. Figure 1 illustrates the optimum competitive ratio for a family of predictions.

## 3.2 Need for Sentinel

Prior algorithms for (variants of) the online knapsack problem [31, 30] have all been threshold algorithms: the algorithm chooses a threshold that is a function of the amount of residual capacity left in the knapsack and accepts an item if and only if its value is higher than the threshold. On the other hand, the SENTINEL algorithm does not fit into this framework, since the threshold for choosing an item depends on the values of items already chosen by the algorithm. In this section we show that there are predictions {ℓ_v, u_v}_{v∈V} such that no pure threshold-based algorithm can achieve the best competitive ratio. In fact, these predictions use only two values. Let I denote the family of instances that respect the following predictions. There are two distinct values, V = {1, 2}. The predicted bounds are given by ℓ_1 = ℓ_2 = 0 and u_1 = u_2 = 2/3. As discussed in Section 3.1.1, since randomization has no benefits, we only consider deterministic algorithms.

Claim 9. No deterministic online algorithm that maintains a non-decreasing threshold function ϕ(z), where z is the knapsack capacity used, and accepts an item if and only if its value is higher than ϕ(z), has a competitive ratio better than 4/5 for the family of instances I.

On the other hand, with these predictions there is a simple algorithm that obtains a competitive ratio of 6/7.
Let A_sentinel denote the following algorithm: accept all items of value 2 and accept at most 4/7 units of items of value 1. The proof of the following claim follows from a simple case analysis.

Claim 10. For the instances in I, the algorithm A_sentinel yields a competitive ratio of 6/7.

## 4 Generalized One-Way Trading

In the classical one-way trading problem [13], a trader needs to convert a unit of money from one currency to another. Faced with a sequence of (unknown) exchange rates that are presented online, the goal is to maximize the amount of money obtained in the target currency. In this section we consider the following generalized one-way trading problem, also considered by [30]: There are T different types of items, and each type t is associated with a concave value function f_t. For convenience, we consider the setting where each item has the same infinitesimal size λ. Items arrive online, and the total value earned by the algorithm is Σ_t f_t(x_t), where x_t denotes the total size of items of type t accepted by the algorithm. The goal is to design an algorithm that maximizes the total value such that the total size of accepted items is at most one. As in the online knapsack problem, we assume that the algorithm has access to a lower bound ℓ_t and an upper bound u_t on the total size of items of type t.

We now show that we can reduce this problem to the online knapsack problem with predictions. For each type t, we create ℓ_t/λ mandatory items with values f_t(λ), f_t(2λ) − f_t(λ), …, f_t(ℓ_t) − f_t(ℓ_t − λ), respectively. Additionally, we create (u_t − ℓ_t)/λ optional items with values f_t(ℓ_t + λ) − f_t(ℓ_t), …, f_t(u_t) − f_t(u_t − λ), respectively. Note that many created items belonging to different types may have the same value. For each value v, let ℓ(v) be the total size of mandatory items of value v, and similarly let u(v) be the total size of all mandatory and optional items of value v. Let A be an online algorithm for the knapsack problem using predictions {ℓ(v), u(v)}.
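The item construction in this reduction is just a table of marginal gains of f_t at granularity λ; concavity guarantees these marginals are non-increasing, which is what makes the accounting below work. A small sketch (names ours; ℓ_t and u_t are assumed to be multiples of λ):

```python
def trading_items(f, ell_t, u_t, lam):
    """Return (mandatory, optional) item values for one type t:
    the j-th created item has value f(j*lam) - f((j-1)*lam)."""
    k_all = round(u_t / lam)
    k_mand = round(ell_t / lam)
    marginals = [f(j * lam) - f((j - 1) * lam) for j in range(1, k_all + 1)]
    return marginals[:k_mand], marginals[k_mand:]
```

Because f is concave, accepting any k of a type's items is worth at most the k largest marginals, whose sum telescopes to f(kλ).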
For the generalized one-way trading problem, as items arrive online, we construct an instance of the knapsack problem online such that the kth item of type t has value f_t(kλ) − f_t((k − 1)λ). Finally, we accept an item (in the generalized one-way trading problem) iff the corresponding item is accepted by algorithm A in the knapsack instance. It is easy to verify that the optimum solution attains the same value, say OPT, in both problem instances. Let α denote the competitive ratio attained by A; hence the total value obtained by A on the knapsack instance is at least α·OPT. On the other hand, if A accepts k items of type t, then, due to the concavity of f_t, it attains a value of at most Σ_{j=1}^{k} (f_t(jλ) − f_t((j − 1)λ)) = f_t(kλ). Thus, the total value obtained by the online algorithm for the generalized one-way trading problem is at least that obtained by A, and hence it is also α-competitive.

## 5 Two-Stage Knapsack

Consider the following two-stage version of the typical fractional online knapsack problem. An instance here is parameterized by a discount factor 0 ≤ λ ≤ 1. Once again, for convenience, we work in the continuous model, so there is a continuum of items, each having infinitesimal size.^iv We assume that all items have a value in [v_min, v_max]. When an item of value v arrives, an algorithm can either pick the item and receive a value of v, or it may choose to reject the item. After all items have arrived, the second stage starts and the algorithm can go back to previously rejected items and choose some of them, subject to the knapsack capacity. However, an item originally of value v is only worth λv in the second stage. Note that an item cannot be rejected once it gets accepted in either stage.
We note that this two-stage knapsack formulation naturally interpolates between the online and offline versions of the standard knapsack problem: when λ = 0, an algorithm cannot obtain any value from the second stage, and the problem reduces to the regular online knapsack problem, whereas when λ = 1, an algorithm can commit to accepting items only in the second stage, after all items have been revealed, and thus the problem reduces to the offline version. To the best of our knowledge this problem has not been considered in the literature. Thus we first show how to obtain the optimal competitive ratio for the problem in the absence of predictions. We then show how to incorporate predictions on item frequencies for each value. We present the algorithm without predictions here and defer the one with predictions to the Supplementary Material.

## 5.1 Algorithm for Two-Stage Knapsack without Frequency Predictions

As before, let α be the optimal competitive ratio for the problem, to be determined later. In the absence of predictions, the optimal competitive ratio can be attained by a threshold algorithm. In the first stage, the algorithm maintains a threshold function depending on the knapsack capacity used and accepts an item if and only if its value is higher than the current threshold. In the second stage, it is easy to observe that any reasonable algorithm accepts the highest-value items that can still be accommodated in the knapsack. We now show how to systematically derive such an algorithm. For any value v, let z(v) be the fraction of the knapsack capacity for which the algorithm maintains the threshold at value v. To derive z(v) we will consider the following family of adversarial instances: (i) items arrive in increasing order of value; (ii) there are enough items of each value to fill the entire knapsack; and (iii) the adversary can stop releasing items at any point in time.
^iv As in Lemma 2, when items have sufficiently small sizes, this assumption only leads to a small additive loss in the competitive ratio.

Figure 2: Illustration of the empirical and theoretical competitive ratios of our algorithm SENTINEL and ZCL. On the left, v_min = 1, v_max = 100, and δ ∈ [0, 2]; on the right, v_min ∈ [1, 100], v_max − v_min = 99, and δ = 2. In both, u_v = ⌈(1 + δ)ℓ_v⌉. All items have size 1/10000.

We construct z(v) to be continuous at all v ∈ (v_min, v_max] as follows. First we set z(v_min) = (α − λ)/(1 − λ). This is because if the adversary only gives items of value v_min and they are enough to fill up the knapsack, then by accepting z(v_min) of them in the first stage and more, up to the full capacity, in the second stage, we obtain profit (z(v_min) + λ(1 − z(v_min)))·v_min = α·v_min. Now consider any value v > v_min. If the adversary stops the instance at value v, the profit obtained by the algorithm is

ALG(v) := v_min·z(v_min) + ∫_{v_min}^{v} x·z(x)dx + λ·v·(1 − z(v_min) − ∫_{v_min}^{v} z(x)dx).

Here, the first two terms denote the value earned by the algorithm in the first stage, while the last term denotes the profit gained in the second stage. In particular, (1 − z(v_min) − ∫_{v_min}^{v} z(x)dx) is the amount of capacity left in the knapsack after the first stage, and the algorithm will fill it with the best items, of value v, receiving a value of λv per unit. To maintain a competitive ratio of α, we need ALG(v) = α·v for all v ∈ (v_min, v_max]. By solving this, together with z(v_min) + ∫_{v_min}^{v_max} z(x)dx = 1, we can find α and z(v). Then, set the threshold function ϕ(t) as follows:

ϕ(t) := v_min if 0 ≤ t ≤ (α − λ)/(1 − λ), and ϕ(t) := Z^{−1}(t) otherwise, where Z(v) := z(v_min) + ∫_{v_min}^{v} z(x)dx; note that Z includes the initial z(v_min).

If we see an item of value v in the first phase when the knapsack is t-full, we accept it iff v ≥ ϕ(t). In other words, we keep the threshold at v_min until the knapsack gets (α − λ)/(1 − λ)-full; afterwards, we keep the threshold at v for z(v) units. In the second phase, we accept the highest-valued remaining items up to the full capacity.
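The two conditions above (ALG(v) = α·v for all v, and z(v_min) + ∫ z = 1) pin down α and z numerically: differentiating ALG(v) = α·v gives z(v) = (α − λ·(1 − capacity used so far))/(v·(1 − λ)), so for a candidate α one can integrate z forward and then binary-search the α at which the knapsack is exactly filled. A sketch under our own discretization choices (λ < 1 assumed):

```python
import math

def filled_at(alpha, lam, v_min, v_max, n=20000):
    """Integrate the z(v) implied by ALG(v) = alpha*v; return z(v_min) + integral of z."""
    used = (alpha - lam) / (1.0 - lam)  # z(v_min)
    dv = (v_max - v_min) / n
    for i in range(n):
        v = v_min + (i + 0.5) * dv      # midpoint rule
        z = (alpha - lam * (1.0 - used)) / (v * (1.0 - lam))
        used += z * dv
    return used

def two_stage_alpha(lam, v_min, v_max, iters=50):
    """Binary-search the alpha at which the first stage exactly fills the knapsack."""
    lo, hi = lam, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if filled_at(mid, lam, v_min, v_max) < 1.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

With λ = 0 this recovers the classical ratio 1/(1 + ln(v_max/v_min)), and as λ grows toward 1 the computed ratio increases toward the offline limit.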
Further details are in the Supplementary Material.

6 Experiments

We design experiments to show that our algorithm, SENTINEL, achieves a considerably larger profit than ZCL [31] even when the frequency bounds given as predictions are loose. Specifically, we consider the following setup. The knapsack capacity is set to 1. Each item has size 10⁻⁴ and an integer value in the range [v_min = 1, v_max = 100]. For each value v, its frequency lower bound ℓ_v is sampled from [50, 150] independently and uniformly at random; therefore, in expectation, at least 10⁴ items arrive in an instance. We use a control parameter δ to derive the upper bounds from the lower bounds, setting u_v = ⌈(1 + δ)ℓ_v⌉. For each instance, the number s_v of items of value v is sampled from [ℓ_v, u_v] independently and uniformly at random, and items arrive in random order. The results are the geometric average over 10 instances.

In the first experiment, we observe how the competitive ratios of SENTINEL and ZCL change as we vary δ from 0 to 2; intuitively, a smaller δ means better predictions. We confirm that SENTINEL outperforms ZCL for all values of δ, in terms of both empirical and theoretical competitive ratios. The actual competitive ratios degrade as δ grows, but SENTINEL continues to dominate ZCL.

In the second experiment, we consider different values of v_min in the range {1, . . . , 100}, fixing the difference v_max − v_min = 99 and δ = 2. The goal of this experiment is to see how the competitive ratios change as the ratio v_max/v_min changes; note that this ratio decreases from 100 to nearly 2. Recall that ZCL has competitive ratio 1/(1 + ln(v_max/v_min)). Again, SENTINEL consistently outperforms ZCL, although the gap shrinks as v_max/v_min decreases.

7 Conclusions

In this paper, we studied the online knapsack problem and its extensions when given, as predictions, a range for the number of items of each value. We showed that even such weak predictions can be exploited to give provably better competitive ratios.
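To make the experimental setup of Section 6 concrete, the following sketch (our own code, not the authors') generates instances as described there and runs the ZCL baseline, using the threshold rule obtained by specializing the Section 5.1 derivation to λ = 0; we do not reproduce SENTINEL itself, which additionally uses the frequency predictions:

```python
import math
import random

def make_instance(rng, v_min=1, v_max=100, delta=2.0):
    """Instance generator from the experimental setup: for each integer value v,
    lower bound l_v ~ U{50..150}, upper bound u_v = ceil((1+delta)*l_v),
    actual frequency s_v ~ U{l_v..u_v}; items arrive in random order."""
    items = []
    for v in range(v_min, v_max + 1):
        l_v = rng.randint(50, 150)
        u_v = math.ceil((1 + delta) * l_v)
        items += [v] * rng.randint(l_v, u_v)
    rng.shuffle(items)
    return items

def zcl_profit(items, size=1e-4, v_min=1, v_max=100):
    """ZCL-style threshold rule (the lambda = 0 case of the derivation):
    alpha = 1/(1 + ln(v_max/v_min)); threshold v_min while t <= alpha,
    then v_min * exp(t/alpha - 1), where t is the capacity used so far."""
    alpha = 1.0 / (1.0 + math.log(v_max / v_min))
    t = profit = 0.0
    for v in items:
        threshold = v_min if t <= alpha else v_min * math.exp(t / alpha - 1.0)
        if v >= threshold and t + size <= 1.0:
            t += size
            profit += v * size
    return profit

def opt_profit(items, size=1e-4):
    """Offline optimum: greedily take the highest-valued items up to capacity."""
    cap = profit = 0.0
    for v in sorted(items, reverse=True):
        if cap + size > 1.0:
            break
        cap += size
        profit += v * size
    return profit
```

On instances of this shape, the empirical ratio zcl_profit/opt_profit should stay above ZCL's worst-case guarantee of 1/(1 + ln 100) ≈ 0.178, up to a small additive loss from the item size.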
It would be interesting to revisit other problems with such weak predictions. Another fruitful direction is to explore other types of weak predictions.

References

[1] S. Albers, A. Khan, and L. Ladewig. Improved online algorithms for knapsack and GAP in the random order model. Algorithmica, pages 1–36, 2021.
[2] K. Anand, R. Ge, and D. Panigrahi. Customizing ML predictions for online algorithms. In ICML, pages 303–313, 2020.
[3] S. Angelopoulos, S. Kamali, and K. Shadkami. Online bin packing with predictions, 2021. arXiv:2102.03311.
[4] A. Antoniadis, C. Coester, M. Elias, A. Polak, and B. Simon. Online metric algorithms with untrusted predictions. In ICML, pages 345–355, 2020.
[5] A. Antoniadis, T. Gouleakis, P. Kleer, and P. Kolev. Secretary and online matching problems with machine learned advice. In NeurIPS, 2020.
[6] É. Bamas, A. Maggiori, L. Rohwedder, and O. Svensson. Learning augmented energy minimization via speed scaling. In NeurIPS, 2020.
[7] É. Bamas, A. Maggiori, and O. Svensson. The primal-dual method for learning augmented algorithms. In NeurIPS, 2020.
[8] H. Böckenhauer, E. Burjons, J. Hromkovic, H. Lotze, and P. Rossmanith. Online simple knapsack with reservation costs. In STACS, 2021.
[9] C. Borgs, J. Chayes, N. Immorlica, K. Jain, O. Etesami, and M. Mahdian. Dynamics of bid optimization in online advertisement auctions. In WWW, pages 531–540, 2007.
[10] N. Buchbinder and J. Naor. Online primal-dual algorithms for covering and packing problems. In ESA, pages 689–701. Springer, 2005.
[11] N. Buchbinder and J. Naor. Improved bounds for online routing and packing via a primal-dual approach. In FOCS, pages 293–304, 2006.
[12] N. Buchbinder and J. Naor. The design of competitive online algorithms via a primal-dual approach. Found. Trends Theor. Comput. Sci., 3(2-3):93–263, 2009.
[13] R. El-Yaniv, A. Fiat, R. M. Karp, and G. Turpin. Optimal search and one-way trading online algorithms. Algorithmica, 30(1):101–139, 2001.
[14] S. Gollapudi and D. Panigrahi.
Online algorithms for rent-or-buy with expert advice. In ICML, pages 2319–2327, 2019.
[15] S. Im, R. Kumar, M. Montazer Qaem, and M. Purohit. Non-clairvoyant scheduling with predictions. In SPAA, pages 285–294, 2021.
[16] K. Iwama and S. Taketomi. Removable online knapsack problems. In ICALP, pages 293–305, 2002.
[17] Z. Jiang, D. Panigrahi, and K. Sun. Online algorithms for weighted paging with predictions. In ICALP, pages 69:1–69:18, 2020.
[18] H. Kellerer, U. Pferschy, and D. Pisinger. Knapsack Problems. Springer, Berlin, Germany, 2004.
[19] T. Kesselheim, K. Radke, A. Tönnis, and B. Vöcking. Primal beats dual on online packing LPs in the random-order model. SIAM Journal on Computing, 47(5):1939–1964, 2018.
[20] A. J. Kleywegt and J. D. Papastavrou. The dynamic and stochastic knapsack problem. Operations Research, 46(1):17–35, 1998.
[21] W. Kong, C. Liaw, A. Mehta, and D. Sivakumar. A new dog learns old tricks: RL finds classic optimization algorithms. In ICLR, 2018.
[22] R. Kumar, M. Purohit, and Z. Svitkina. Improving online algorithms using ML predictions. In NeurIPS, pages 9661–9670, 2018.
[23] S. Lattanzi, T. Lavastida, B. Moseley, and S. Vassilvitskii. Online scheduling via learned weights. In SODA, pages 1859–1877, 2020.
[24] T. Lavastida, B. Moseley, R. Ravi, and C. Xu. Learnable and instance-robust predictions for online matching, flows and load balancing. In ESA, pages 59:1–59:17, 2021.
[25] G. S. Lueker. Average-case analysis of off-line and on-line knapsack problems. Journal of Algorithms, 29(2):277–305, 1998.
[26] T. Lykouris and S. Vassilvitskii. Competitive caching with machine learned advice. In ICML, pages 3296–3305, 2018.
[27] A. Marchetti-Spaccamela and C. Vercellis. Stochastic on-line knapsack problems. Mathematical Programming, 68(1):73–104, 1995.
[28] M. Mitzenmacher. Scheduling with predictions and the price of misprediction. In ITCS, pages 14:1–14:18, 2020.
[29] D. Rohatgi. Near-optimal bounds for online caching with machine learned advice.
In SODA, pages 1834–1845, 2020.
[30] B. Sun, A. Zeynali, T. Li, M. H. Hajiesmaili, A. Wierman, and D. H. K. Tsang. Competitive algorithms for the online multiple knapsack problem with application to electric vehicle charging. Proc. ACM Meas. Anal. Comput. Syst., 4(3):51:1–51:32, 2020.
[31] Y. Zhou, D. Chakrabarty, and R. M. Lukose. Budget constrained bidding in keyword auctions and online knapsack problems. In WINE, pages 566–576, 2008.