# Selective Response Strategies for Gen AI

Boaz Taitler¹  Omer Ben-Porat¹

¹Technion – Israel Institute of Technology, Israel. Correspondence to: Boaz Taitler.

Proceedings of the 42nd International Conference on Machine Learning, Vancouver, Canada. PMLR 267, 2025. Copyright 2025 by the author(s).

Abstract

The rise of Generative AI (Gen AI) has significantly impacted human-based forums like Stack Overflow, which are essential for generating high-quality data. This creates a negative feedback loop, hindering the development of Gen AI systems, which rely on such data to provide accurate responses. In this paper, we provide a possible remedy: a novel strategy we call selective response. Selective response implies that Gen AI could strategically provide inaccurate (or conservative) responses to queries involving emerging topics and novel technologies, thereby driving users to use human-based forums. We show that selective response can potentially have a compounding effect on the data generation process, increasing both Gen AI's revenue and user welfare in the long term. From an algorithmic perspective, we propose an approximately optimal approach to maximize Gen AI's revenue under social welfare constraints. From a regulatory perspective, we derive sufficient and necessary conditions for selective response to improve welfare.

1. Introduction

The maxim, "Better to remain silent and be thought a fool than to speak and to remove all doubt," offers a compelling perspective on the strategic value of withholding information. While often invoked in interpersonal contexts, it resonates surprisingly well in the context of Generative AI (Gen AI) systems like ChatGPT. These systems are designed to answer user queries immediately, yet one might wonder: are there situations where the system should remain silent? One such scenario arises when the system hallucinates. Hallucinations, defined as the generation of incorrect or fabricated information, are an intrinsic property of generative models that cannot be entirely avoided (Kalai & Vempala, 2024). Another scenario involves questions concerning safety and ethics, with potentially life-threatening consequences (Shin, 2023; Mello & Guha, 2023; Li et al., 2024). However, as we argue in this paper, it can be advantageous for both Gen AI operators and users if the system avoids responding indiscriminately to every prompt, especially when addressing emerging technologies and novel content.

To illustrate, consider Gen AI's competitive relationship with a human-driven platform like Stack Overflow. Users may direct their questions to either Gen AI or Stack Overflow, seeking solutions to their problems. Posting a code-related question on Stack Overflow generates clarification questions in the comments, solutions offered by experts, feedback from other users (upvotes) and the original poster (acceptance flag), etc. Such valuable data could significantly enhance Gen AI, improving its performance. In contrast, querying Gen AI can lead to quicker user satisfaction and increased engagement with Gen AI, potentially enhancing its revenue streams. On the downside, the lack of community interaction may result in less comprehensive solutions and reduce the opportunity for generating the rich, labeled data that community-driven platforms like Stack Overflow thrive on (del Rio-Chanona et al., 2024; Burtch et al., 2024; Li & Kim, 2024).
This absence of dynamic, user-generated content and in-depth discussions can be detrimental to user welfare in the long term, as Gen AI's ability to provide high-quality answers depends on such data.

Motivated by the issue above, this paper pioneers the framework of selective response: strategically choosing when, if, and how to engage with user queries, particularly those involving emerging topics and novel technologies. We explicitly suggest that when a new topic emerges, Gen AI could strategically decide to provide lower-quality answers than it can, or even disclaim that it lacks enough data to respond. We represent such behavior abstractly by modeling Gen AI as not responding or remaining "silent." Clearly, selective response has a short-term negative impact; however, as we show, an appropriate selective response leads to an improved data generation process, benefiting both Gen AI's revenue and user social welfare in the long term.

Our contribution. Our contribution is two-fold. The first is conceptual: our paper is the first to explore selective response for Gen AI. We present a stylized model of an ecosystem that evolves sequentially, featuring two platforms: a generative AI-based platform called Gen AI and a human-driven Q&A platform named Forum. Gen AI generates revenue by engaging with users and can adopt a selective response strategy: determining the proportion of users it responds to in each round. Here, not responding represents a broad spectrum of possible behaviors, such as strategically withholding data, providing lower-quality answers than Gen AI can produce, or claiming insufficient data, ultimately driving users to seek answers on Forum.¹ We treat these behaviors collectively as selective response, which abstracts them for conceptual clarity. In contrast, Forum operates as a non-strategic player. Users decide between Gen AI and Forum based on the utility they derive from each platform. Those who choose Forum contribute to the creation of new data, which Gen AI can later incorporate during retraining. Crucially, Gen AI's quality in each round depends on the cumulative data generated since the beginning of the interaction. Our novel model allows us to explore the dynamics of content creation, welfare, and revenue through a game-theoretic lens.

Our second contribution is technical: we begin by demonstrating that selective response can Pareto-dominate the always-responding approach. Specifically, we establish the following result.

Theorem 1.1 (Informal statement of Observation 3.1). Compared to the case where Gen AI always answers, selective response strategies can improve user welfare, Gen AI's revenue, and even both.

We also quantify the extent to which selective response can improve revenue and welfare w.r.t. the always-responding approach. Next, we analyze the long-term effects of selective response, revealing that it leads to higher proportions of users choosing Gen AI and increased data generation (Theorem 4.1). Building on this result, we devise an approximately optimal solution to Gen AI's revenue maximization problem.

Theorem 1.2 (Informal statement of Theorem 4.4). Let $\varepsilon$ be a small positive constant and let $A$ be a finite set of selective responses. There exists an algorithm guaranteeing an additive $O(\varepsilon T^2)$ approximation of Gen AI's optimal revenue, and its runtime is $O\!\left(\frac{T^2 |A|}{\varepsilon}\right)$.

We extend this result to the case where Gen AI is constrained to meet an exogenously given social welfare threshold.
Finally, we analyze the impact of selective response on social welfare. We provide valuable insights into how a one-round intervention affects the data generation process and its implications for welfare. We leverage these insights to demonstrate how regulators that aim to enhance social welfare can make successful one-round interventions, improving user welfare while ensuring a bounded impact on Gen AI's revenue.

¹In real-world scenarios, multiple Gen AI systems vie for user traffic, making the analysis of such competition significantly more complex. We address this complexity in Section 7.

Altogether, our work challenges the conventional notion that Gen AI should always provide answers. Despite its theoretical nature, the messages our paper conveys can translate into practical considerations for both Gen AI companies and regulators, and influence how Forum–Gen AI collaborations should form.

1.1. Related Work

The literature on generative AI is growing at an immense pace. Most research focuses on mitigating hallucinations (Ji et al., 2023), performance (Frieder et al., 2024; Kocoń et al., 2023; junyou li et al., 2024; Chow et al., 2025), and expanding applications (Kasneci et al., 2023; Liu et al., 2024). Our work connects to the emerging body of research on foundation models and game theory (Raghavan, 2024; Laufer et al., 2024; Conitzer et al., 2024; Dean et al., 2024). This literature studies competition between generative AI models and human content creators (Yao et al., 2024; Esmaeili et al., 2024; Keinan & Ben-Porat, 2025), the impact of generative AI on content diversity (Raghavan, 2024), and works motivated by social choice and mechanism design (Conitzer et al., 2024; Sun et al., 2024).

The most closely related work to ours is that of Taitler & Ben-Porat (2025), which examines whether the existence of generative AI is beneficial to users. In their model, the generative AI platform decides when to train, and they propose a regulatory approach to ensure social welfare for users. In contrast, our model introduces a different approach, where the generative AI chooses a portion of queries to answer, demonstrating that responding selectively can benefit both the generative AI platform and its users.

Our notion of selective response is also inspired by the economic literature on information design (Bergemann & Morris, 2019; Bergemann et al., 2015), which explores how the strategic disclosure and withholding of information can influence agents' behavior within a system. Another related concept is signaling (Crawford & Sobel, 1982; Milgrom, 1981), referring to strategic communication used by agents to potentially improve outcomes (Babichenko et al., 2024; Lu et al., 2023). Similarly, cheap talk (Lo et al., 2023; Crandall et al., 2018) can be used for fostering cooperation. In that sense, selective response can be viewed as an information design problem, where Gen AI strategically manages information disclosure to influence user behavior and ultimately optimize its revenue. Also related is the strand of literature on algorithmic deferring (Hemmer et al., 2023; Mozannar & Sontag, 2020), where the algorithm can defer questions and tasks to other experts. Finally, since our model includes an ecosystem with two platforms (Gen AI and Forum), it relates to a growing body of work on competition between platforms (Rietveld & Schilling, 2021; Karle et al., 2020; Bergemann & Bonatti, 2024; Tullock, 1980; McIntyre & Srinivasan, 2017).
Previous works explore the effects of competition in marketplaces on users' social welfare (Jagadeesan et al., 2023; Feldman et al., 2013), as we do in this paper.

2. Model

We consider a sequential setting over $T$ discrete rounds, where in each round, users interact either with Generative AI (Gen AI) or a complementary human-driven platform, Forum. An instance of our problem is represented by the tuple $\langle a, \gamma, r, \beta, w^s \rangle$, and we now elaborate on the components of the model.

Gen AI. Gen AI adopts a selective response strategy $x = (x_1, x_2, \dots, x_T)$, where $x_t \in [0, 1]$ represents the proportion of users who receive answers in round $t$ among those who have already chosen Gen AI. For example, $x_t = 1$ means that Gen AI answers all users who selected it in round $t$, whereas $x_t = 0$ means it answers none. The performance of Gen AI depends on the cumulative amount of data it has collected and trained on at the start of each round $t$, denoted $D_t(x)$. The quality of Gen AI is represented by the accuracy function $a(D_t(x))$, a strictly increasing function $a : [0, T] \to [0, 1]$ satisfying $\frac{da(D)}{dD} > 0$ for all $D \in \mathbb{R}_{\ge 0}$.² We use superscripts $g$ and $s$ to denote the utility users receive from Gen AI and Forum, respectively. The (expected) utility users derive from Gen AI in round $t$, denoted $w^g_t(x)$, reflects the average quality $a(D_t(x))$ that users obtain from Gen AI. It is given by

$$w^g_t(x) = a(D_t(x)) \cdot x_t. \tag{1}$$

Crucially, Gen AI can intentionally respond less accurately than its maximum capability. In each round $t$, the proportion of users who choose Gen AI is denoted by $p_t(x)$. This fraction is determined by the selective response strategy $x$ and user decisions, which will be discussed shortly. The (time-discounted) revenue of Gen AI over $T$ rounds, $U(x)$, is defined by

$$U(x) = \sum_{t=1}^{T} \gamma^{t-1} r(p_t(x)),$$

where $\gamma \in (0, 1]$ is a discount factor, reflecting the decreasing value of future revenue. The function $r : [0, 1] \to \mathbb{R}_{\ge 0}$ maps the proportion of users $p_t(x)$ in round $t$ to revenue, and is assumed to be both non-decreasing and $L_r$-Lipschitz. For instance, a superlinear $r$ captures the compounding market effects of Gen AI, where revenue grows at an accelerating rate as the proportion of users increases (Katz & Shapiro, 1985; Bailey et al., 2022; McIntyre & Srinivasan, 2017). Indeed, this is the case if a higher user base attracts disproportionately more offers for collaborations and investment opportunities (rich getting richer).

²We use the term accuracy for simplicity, allowing us to address user satisfaction abstractly. Evaluating the performance of Gen AI is significantly more complex.

Data Accumulation. The cumulative data available to Gen AI evolves as users interact with Forum. At the start of round $t$, the cumulative data $D_t(x)$ is defined recursively as $D_t(x) = D_{t-1}(x) + (1 - p_{t-1}(x))$, with the initial condition $D_1(x) = 0$. This initial condition represents the emergence of a new topic, where Gen AI has not acquired any relevant data from previous training sets.

Forum. Forum provides a human-driven platform where users can post and answer questions. The utility users derive from Forum, $w^s$, is constant across rounds and satisfies $w^s \in [0, 1]$.

Users. Users decide between Gen AI and Forum by comparing the expected utility they derive from each platform. We model user decisions using a softmax function:

$$\sigma_t(x) = \frac{e^{\beta w^g_t(x)}}{e^{\beta w^g_t(x)} + e^{\beta w^s}},$$

where $\beta > 0$ is a sensitivity parameter that captures users' responsiveness to utility differences.
Recall that $x_t$ represents the proportion of users in $\sigma_t(x)$ who receive an answer from Gen AI. The remaining users, who do not receive an answer, can either post their question on Forum or leave it unanswered. We assume the former, meaning that $p_t(x) = x_t \sigma_t(x)$ is the proportion of users who receive an answer from Gen AI, while the rest contribute to data generation by posting their questions on Forum.

User Welfare. The instantaneous user welfare $w_t(x)$ accounts for the utilities derived from both platforms in round $t$. It is defined by

$$w_t(x) = p_t(x) \cdot a(D_t(x)) + (1 - p_t(x)) w^s. \tag{2}$$

The cumulative user welfare $W$ is therefore the sum of the instantaneous welfare over all rounds, $W(x) = \sum_{t=1}^{T} w_t(x)$.

Assumptions and Useful Notations. As we explain later, the following assumption on the structure of the accuracy function is crucial for analyzing the dynamics of the data generation process.

Assumption 2.1. The accuracy function $a(D)$ is $L_a$-Lipschitz with constant $L_a \le \frac{4}{\beta}$.

We further discuss this assumption in Section 7. Additionally, we use the following notions throughout the paper. Given an arbitrary strategy $x$, any strategy $x^\tau$ obtained by reducing the response level in round $\tau$ while maintaining the other entries of $x$ is called a $\tau$-selective modification of $x$. That is, $x^\tau$ is any strategy that is identical to $x$ except for round $\tau$, in which it answers less than $x_\tau$. Formally, $x^\tau_\tau \in [0, x_\tau)$ and $x^\tau_t = x_t$ for every $t \neq \tau$. For brevity, if $x$ is clear from the context, we use $x^\tau$ to denote an arbitrary $\tau$-selective modification. Another useful notation is $\bar{x} = (1, 1, \dots, 1)$, the full response or always-responding strategy; we use these terms interchangeably. We use this strategy as a point of comparison, establishing a baseline against which to test other strategies.

Example 2.2. Consider the instance $T = 10$, $a(D) = 1 - e^{-0.3D}$, $\gamma = 0.9$, $r(p) = p^2$, $\beta = 10$, and $w^s = 0.5$. Consider the full response strategy $\bar{x} = (1, \dots, 1)$ and the selective response strategy $x$ defined by

$$x_t = \begin{cases} 0 & t \le 4 \\ 1 & \text{otherwise} \end{cases}.$$

At $t = 1$ it holds that $D_1(x) = D_1(\bar{x}) = 0$. Notice that $a(0) = 0$ and therefore $p_1(\bar{x}) = 1 \cdot \frac{1}{1 + e^{\beta w^s}} \approx 0.0067$. Similarly, for $x$ it is $p_1(x) = 0 \cdot \frac{1}{1 + e^{\beta w^s}} = 0$. Thus, the generated data is $D_2(\bar{x}) \approx 1 - 0.0067 = 0.9933$ and $D_2(x) = 1$. With that, we have the ingredients to calculate the instantaneous welfare at time $t = 1$:

$$w_1(\bar{x}) = p_1(\bar{x}) a(D_1(\bar{x})) + (1 - p_1(\bar{x})) w^s \approx 0.0067 \cdot 0 + 0.9933 \cdot 0.5 \approx 0.4966,$$
$$w_1(x) = 0 \cdot 0 + 1 \cdot w^s = 0.5.$$

Figure 1 demonstrates the proportions of the strategies $x$ and $\bar{x}$ as a function of the round for $t \in [T]$. Notice that the selective response $x$ induces lower user proportions in the earlier rounds, but it eventually surpasses the full response strategy $\bar{x}$. Finally, the revenue is attained by calculating $U(x) = \sum_{t=1}^{10} \gamma^{t-1} (p_t(x))^2$. Computing this for the two strategies, we see that $U(x)$ is roughly 5% higher than $U(\bar{x})$. Similarly, the welfare $W(x)$ is about 7.6% higher than $W(\bar{x})$. As this example suggests, selective response can improve both revenue and welfare. Indeed, this is the focus of the next section.

Figure 1: A visualization for Example 2.2. The blue (circle) curve shows the proportion of users $p_t(x)$ for the selective response strategy $x$ at each round $t$. The red (square) curve depicts the corresponding proportion for the full response.
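To make the model concrete, the following minimal Python sketch rolls out the dynamics of this section and recomputes the quantities of Example 2.2 for both strategies. The helper name `simulate` and the code organization are ours, not the paper's; it is an illustration of the stated dynamics, not the authors' implementation.

```python
import math

def simulate(x, T, a, r, gamma, beta, ws):
    """Roll out the Section 2 dynamics for a strategy x = (x_1, ..., x_T).
    Returns the cumulative revenue U(x) and welfare W(x)."""
    D, U, W = 0.0, 0.0, 0.0
    for t in range(T):
        wg = a(D) * x[t]                                  # Eq. (1): expected utility from Gen AI
        sigma = math.exp(beta * wg) / (math.exp(beta * wg) + math.exp(beta * ws))
        p = x[t] * sigma                                  # proportion answered by Gen AI
        U += gamma ** t * r(p)                            # discounted revenue, gamma^{t-1} for 1-indexed t
        W += p * a(D) + (1 - p) * ws                      # Eq. (2): instantaneous welfare
        D += 1 - p                                        # data generated on Forum
    return U, W

# Example 2.2: T = 10, a(D) = 1 - e^{-0.3 D}, gamma = 0.9, r(p) = p^2, beta = 10, ws = 0.5.
# The paper reports U and W of the selective strategy exceeding the full response
# by roughly 5% and 7.6%, respectively.
a = lambda D: 1 - math.exp(-0.3 * D)
r = lambda p: p ** 2
full = [1.0] * 10                     # the always-responding strategy
selective = [0.0] * 4 + [1.0] * 6     # x: silent for t <= 4
print(simulate(full, 10, a, r, 0.9, 10, 0.5))
print(simulate(selective, 10, a, r, 0.9, 10, 0.5))
```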
3. The Benefits of Selective Response

This section motivates our work by showing that selective response may benefit both Gen AI and its users. We first demonstrate a qualitative result: selective response can improve revenue, welfare, or both. Then, in Subsection 3.1, we quantify the extent of these improvements. Recall the definition of the full response strategy $\bar{x}$. We use it as a benchmark in evaluating the potential impact of adopting a selective response strategy on Gen AI and its users.

Observation 3.1. There exist instances and a selective response strategy $x$ that satisfy each one of the following inequalities:

1. $U(x) > U(\bar{x})$ and $W(x) > W(\bar{x})$;
2. $U(x) < U(\bar{x})$ and $W(x) > W(\bar{x})$;
3. $U(x) > U(\bar{x})$ and $W(x) < W(\bar{x})$.

The first inequality in Observation 3.1 indicates that there exists a selective response strategy that Pareto-dominates the always-responding strategy. The subsequent two inequalities imply that increasing either Gen AI's revenue or the users' social welfare may come at the expense of the other.

3.1. Price of Always Responding

In this subsection, we quantify the negative impact of always answering users' queries. We introduce two indices: RPAR, an abbreviation for Revenue's Price of Always Response, and WPAR, which stands for Welfare's Price of Always Response. Formally,

$$\mathrm{RPAR} = \frac{\max_x U(x)}{U(\bar{x})}, \qquad \mathrm{WPAR} = \frac{\max_x W(x)}{W(\bar{x})}$$

are the prices of always answering with respect to revenue and social welfare, respectively. These metrics capture the inefficiencies in revenue and welfare that arise when Gen AI always responds to all user queries. Our next result demonstrates that the revenue inefficiency is unbounded.

Proposition 3.2. For every $M \in \mathbb{R}_{>0}$ there exists an instance $I$ with $L_r = \Theta(\ln(M))$ such that $\mathrm{RPAR}(I) > M$.

Proposition 3.2 relies on the revenue scaling function $r(p)$, which can bias Gen AI's incentives toward data generation rather than immediate revenue. For example, when $r(p)$ takes the form of a sigmoid function, the parameter $L_r$ controls the steepness of the curve. If the sigmoid is sufficiently steep, $r(p)$ approximates a step function, requiring Gen AI to surpass a certain user proportion threshold to generate revenue. This mirrors threshold-based incentives, where substantial rewards are only provided once a predefined threshold is met.

Our next proposition shows that there exist instances where selective responses can result in social welfare nearly twice as large as that of the always-responding strategy.

Proposition 3.3. For every $\varepsilon > 0$ there exists an instance $I$ with $\mathrm{WPAR}(I) > 2 - \varepsilon$.

We end this section by analyzing the Price of Anarchy (Koutsoupias & Papadimitriou, 1999; Roughgarden, 2005), a standard economic concept that measures the harm due to the strategic behavior of Gen AI. Formally, $\mathrm{PoA} = \frac{\max_x W(x)}{\min_{x \in \mathcal{R}} W(x)}$, where $\mathcal{R}$ is the set of revenue-maximizing strategies. We show that it can increase with the smoothness parameter of the reward function $L_r$. Since this analysis depends on the revenue-optimal strategy of Gen AI, which we only examine in later sections, we defer this analysis to Appendix B.1.
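The two indices are easy to estimate numerically on small instances. The sketch below (our code; `sim` and `estimate_pars` are hypothetical helper names) lower-bounds RPAR and WPAR by brute force over a discrete strategy grid. This is exponential in $T$ and only meant to exhibit the gaps on toy instances.

```python
import math
from itertools import product

def sim(x, T, a, r, gamma, beta, ws):
    # Section 2 dynamics (same sketch as above); returns (U(x), W(x)).
    D, U, W = 0.0, 0.0, 0.0
    for t in range(T):
        s = math.exp(beta * a(D) * x[t]) / (math.exp(beta * a(D) * x[t]) + math.exp(beta * ws))
        p = x[t] * s
        U += gamma ** t * r(p)
        W += p * a(D) + (1 - p) * ws
        D += 1 - p
    return U, W

def estimate_pars(T, A, a, r, gamma, beta, ws):
    """Brute-force lower bounds on RPAR and WPAR over the grid A^T.
    Feasible only for tiny T and |A|."""
    U_bar, W_bar = sim([1.0] * T, T, a, r, gamma, beta, ws)
    grid = [sim(list(x), T, a, r, gamma, beta, ws) for x in product(A, repeat=T)]
    return max(u for u, _ in grid) / U_bar, max(w for _, w in grid) / W_bar

# e.g., on an instance in the spirit of Observation 3.1:
# estimate_pars(6, [0.0, 0.5, 1.0], lambda D: 1 - math.exp(-0.4 * D), lambda p: p, 1.0, 3, 0.7)
```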
4. The Impact of Selective Response on Gen AI's Revenue

In this section, we analyze the revenue-maximization problem faced by Gen AI. Subsection 4.1 examines the impact of using selective responses on both user proportions and generated data. We show that any $\tau$-selective modification of any strategy, for any $\tau$, generates more future data and attracts more users to Gen AI from round $\tau + 1$ onward. Subsequently, we develop two approaches for maximizing Gen AI's revenue. In Subsection 4.2, we develop an approximately optimal algorithm for maximizing Gen AI's revenue. In Subsection 4.3, we focus on undiscounted settings, i.e., $\gamma = 1$, and consider welfare-constrained revenue maximization: maximizing revenue under a minimal social welfare level constraint. We emphasize the trade-off between our approaches: the first approach cannot handle welfare constraints, while the second is restricted to undiscounted revenue.

4.1. Selective Response Implies Increased User Proportions

Next, we analyze the impact of using a $\tau$-selective modification of any base strategy on the proportions and data generation. At first glance, using selective responses harms immediate revenue, as it encourages users to turn to Forum. However, as suggested by Observation 3.1, lower response levels can ultimately prove beneficial. But why is this the case? The answer lies in the dynamics of data generation. By employing a more selective response, Gen AI incentivizes users to engage with Forum, which results in the creation of more data. This additional data becomes crucial in future rounds, enabling Gen AI to attract a more significant user proportion in subsequent interactions. While this reasoning is intuitive, its application over time presents a technical challenge: as the proportion of users choosing Gen AI increases, the marginal data generated per round may decrease, potentially leading to less data than under a strategy where Gen AI answers every query. However, the theorem below demonstrates the compounding effect of selective response, guaranteeing consistently higher user proportions in future rounds.

Theorem 4.1. Fix any strategy $x$. For every $t > \tau$ it holds that $D_t(x^\tau) > D_t(x)$ and $p_t(x^\tau) \ge p_t(x)$, where $p_t(x^\tau) = p_t(x)$ if and only if $x_t = x^\tau_t = 0$.

Proof sketch of Theorem 4.1. To prove this theorem, we first show that $D_t(x^\tau) - D_t(x) > 0$ for every $t > \tau$. To do so, we introduce some additional notation. First, we define $Q(D, x) = x \cdot \frac{e^{\beta a(D)x}}{e^{\beta a(D)x} + e^{\beta w^s}}$, which represents the resulting proportion when using a selective response $x$ with data $D$. Next, we define $f(D, x) = D + (1 - Q(D, x))$ as the total data generated when choosing $x$ with initial data $D$. Note that for every $t \in [T]$, we have $f(D_t(x), x_t) = D_{t+1}(x)$ and $Q(D_t(x), x_t) = p_t(x)$. Following, we prove that $f(D, x)$ is monotonically increasing with respect to $D$.

Proposition 4.2. For every $x \in [0, 1]$ and $D \in \mathbb{R}_{\ge 0}$ it holds that $\frac{df(D,x)}{dD} > 0$.

Proposition 4.2, combined with Assumption 2.1, implies that for every $t > \tau$, if $D_t(x^\tau) > D_t(x)$, then it follows that $D_{t+1}(x^\tau) > D_{t+1}(x)$. Iterating Proposition 4.2 leads to $D_t(x^\tau) > D_t(x)$ for every $t > \tau$. Finally, since $Q(D, x)$ is monotonically increasing with respect to $D$, we conclude that $p_t(x^\tau) \ge p_t(x)$ for every $t > \tau$, thus completing the proof of Theorem 4.1.
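The compounding effect of Theorem 4.1 can be checked numerically. The sketch below (our code; the helper `rollout` is a hypothetical name) compares a base strategy with a $\tau$-selective modification on an instance satisfying Assumption 2.1 ($L_a = 0.3 \le 4/\beta = 0.4$) and asserts the theorem's conclusions for all rounds after $\tau$.

```python
import math

def rollout(x, T, a, beta, ws):
    """Per-round data levels (D_1..D_T) and proportions (p_1..p_T)
    under strategy x, following the Section 2 dynamics."""
    D, Ds, ps = 0.0, [], []
    for t in range(T):
        sigma = math.exp(beta * a(D) * x[t]) / (math.exp(beta * a(D) * x[t]) + math.exp(beta * ws))
        p = x[t] * sigma
        Ds.append(D); ps.append(p)
        D += 1 - p
    return Ds, ps

# Base strategy vs. a tau-selective modification with tau = 3 (response lowered to 0).
a = lambda D: 1 - math.exp(-0.3 * D)
base = [1.0] * 10
mod = base.copy(); mod[2] = 0.0          # round tau = 3 is position 2 (0-indexed)
D1, p1 = rollout(base, 10, a, 10, 0.5)
D2, p2 = rollout(mod, 10, a, 10, 0.5)
assert all(d2 > d1 for d1, d2 in zip(D1[3:], D2[3:]))   # D_t(x^tau) > D_t(x) for t > tau
assert all(q2 >= q1 for q1, q2 in zip(p1[3:], p2[3:]))  # p_t(x^tau) >= p_t(x) for t > tau
```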
4.2. Revenue Maximization

In this subsection, we develop an approximately optimal algorithm for maximizing Gen AI's revenue. We begin by noting the challenges of the problem, emphasizing why identifying the optimal strategy is nontrivial. Recall that Theorem 4.1 demonstrates that employing a selective response increases future proportions. This argument can be applied iteratively by employing selective responses in different rounds, further enhancing the future proportions. This intuition hints that a step-function-based strategy could be optimal: Gen AI should answer no queries in early rounds and then answer all queries. In such a case, the effective space of optimal strategies reduces to the $T$ step-function-based strategies. Unfortunately, this intuition is misleading.

Observation 4.3. There exist instances where the optimal strategy $x^* \notin \{0, 1\}^T$.

Due to Observation 4.3, the search for optimal strategies spans the continuous domain $[0, 1]^T$. This observation motivates us to adopt an approximation-based approach to identify near-optimal strategies efficiently. To that end, we devise the ASR algorithm, which stands for Approximately optimal Selective Response. ASR follows a standard dynamic programming structure, but its approximation analysis is nontrivial, as we elaborate below. Therefore, we introduce it in Appendix C.2 and provide an informal description here, along with key insights from its analysis.

Overview of the ASR algorithm. Fix any finite set $A \subseteq [0, 1]$. Naively, if we wish to find $\arg\max_{x \in A^T} U(x)$, we could exhaustively search over all $|A|^T$ strategies via inefficient dynamic programming. However, we show how to design a small-size state representation and execute dynamic programming effectively. The challenge is ensuring that any strategy's revenue within the small state representation approximates the actual revenue of that strategy. To achieve this, we discretize the amount of data $D$. Recall that in each round, the amount of generated data is at most 1, meaning that for any strategy $x$, the total data up to round $t$ is $D_t(x) \in [0, t-1]$. Consequently, we define states by the round $t$ and the discretized data value within $[0, t-1]$. At the heart of our dynamic programming approach is the calculation of the expected revenue for each state and action $y \in A$, based on the induced proportions, generated data, and the anticipated next state. The next theorem provides the guarantees of ASR.

Theorem 4.4. Fix any instance and let $\varepsilon > 0$. The ASR algorithm outputs a strategy $x$ such that

$$U(x) > \max_{x' \in A^T} U(x') - \varepsilon L_r T^2, \tag{3}$$

and its runtime is $O\!\left(\frac{T^2 |A|}{\varepsilon}\right)$.

Proof sketch of Theorem 4.4. To prove the theorem, there are two key elements we need to establish. First, for any two similar data quantities under any selective response strategy, the resulting revenues are similar as well. Imagine that $D_1$ is the actual data quantity generated by some strategy up to some arbitrary round, and $D_2$ is the discretized data quantity of the same strategy in our succinct representation. If Gen AI plays $x \in A$ in the next round, how different do we expect the data quantities to be in the next round? In other words, we need to bound the difference $|f(D_1, x) - f(D_2, x)|$, where $f$ follows the definition from the proof of Theorem 4.1. To that end, we prove the following lemma.

Lemma 4.5. For any $D_1, D_2 \in \mathbb{R}_{\ge 0}$ and $x \in A$, it holds that $|f(D_1, x) - f(D_2, x)| \le |D_1 - D_2|$.

We further leverage this lemma in proving the second key element: the discrepancy of the induced proportions is bounded by the discrepancy in the data quantities, i.e., $|Q(D_1, x) - Q(D_2, x)| < |D_1 - D_2|$. Equipped with Lemma 4.5 and the former inequality, we bound the discrepancy the dynamic programming process propagates throughout its execution.

Observe that Theorem 4.4 guarantees approximation with respect to the best strategy that chooses actions from $A$ only. Indeed, the right-hand side of Inequality (3) includes $\max_{x' \in A^T} U(x')$. In fact, by taking $A$ to be the $\delta$-uniform discretization of the $[0, 1]$ interval for a small enough $\delta > 0$, we can extend our approximation guarantees to the best continuous strategy at the expense of a slightly larger approximation factor.

Theorem 4.6. Let $\delta \in (0, \frac{1}{\beta}]$ and let $A_\delta = \{0, \delta, 2\delta, \dots, 1\}$. Let $x$ be the solution of ASR with parameters $\varepsilon > 0$ and $A_\delta$. Then,

$$U(x) \ge \max_{x'} U(x') - \left(\frac{7\beta}{4} + 1\right)\frac{L_r \delta}{(1-\gamma)^2} - \varepsilon L_r T^2,$$

and the runtime of ASR is $O\!\left(\frac{T^2}{\varepsilon \delta}\right)$.
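The following Python sketch is a direct transcription of Algorithm 1 (Appendix C.2) under the dynamics of Section 2. The grid handling, variable names, and the usage example are ours; it is meant as an illustration of the dynamic program, not the authors' implementation.

```python
import math

def asr(T, A, eps, a, r, gamma, beta, ws):
    """ASR: dynamic programming over rounds and a discretized data grid.
    `a` is the accuracy function and `r` the revenue function of Section 2."""
    n = round(T / eps) + 2                         # size of the data grid
    V = [[0.0] * n for _ in range(T + 2)]          # V[t][k]: value-to-go at (t, d = k*eps)
    pi = [[0.0] * n for _ in range(T + 2)]         # greedy action per state
    for t in range(T, 0, -1):
        for k in range(round((t - 1) / eps) + 1):  # d in {0, eps, ..., t - 1}
            d = k * eps
            best_v, best_y = -math.inf, 0.0
            for y in A:
                q = math.exp(beta * a(d) * y) / (math.exp(beta * a(d) * y) + math.exp(beta * ws))
                p = y * q                          # induced proportion
                kn = int((d + (1 - p)) / eps)      # discretized next data level
                v = r(p) + gamma * V[t + 1][kn]
                if v > best_v:
                    best_v, best_y = v, y
            V[t][k], pi[t][k] = best_v, best_y
    x, d = [], 0.0                                 # extract the strategy from pi at (t=1, d=0)
    for t in range(1, T + 1):
        y = pi[t][round(d / eps)]
        x.append(y)
        q = math.exp(beta * a(d) * y) / (math.exp(beta * a(d) * y) + math.exp(beta * ws))
        d = int((d + (1 - y * q)) / eps) * eps
    return x

# e.g., on the Example 2.2 instance with an 11-point action grid:
# x = asr(10, [i / 10 for i in range(11)], 0.01,
#         lambda D: 1 - math.exp(-0.3 * D), lambda p: p ** 2, 0.9, 10, 0.5)
```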
4.3. Welfare-Constrained Revenue Maximization

While the ASR algorithm we developed in the previous subsection guarantees approximately optimal revenue, it might harm user welfare. Indeed, Observation 3.1 implies that selective response can improve revenue but decrease welfare. This motivates the need for a welfare-constrained revenue maximization framework, where the objective is to maximize Gen AI's revenue while ensuring that the social welfare remains above a predefined threshold $\bar{W}$. Formally,

$$\max_{x \in A^T} U(x) \quad \text{s.t.} \quad W(x) \ge \bar{W}. \tag{P1}$$

Noticeably, if the constant $\bar{W}$ is too large, that is, $\bar{W} > \max_x W(x)$, Problem (P1) has no feasible solutions; hence, we assume $\bar{W} \le \max_x W(x)$. Our approach to this constrained optimization problem is inspired by the PARS-MDP problem (Ben-Porat et al., 2024). We reduce it to a graph search problem, where we iteratively discover the Pareto frontier of feasible revenue and welfare pairs, propagating optimal solutions of sub-problems. Due to space constraints, we defer its description to Appendix C.3 and present here its formal guarantees.

Theorem 4.7. Fix an instance such that $\gamma = 1$. Let $\varepsilon > 0$ and let $x^*$ be the optimal solution for Problem (P1). There exists an algorithm with output $x$ that guarantees

1. $U(x) > U(x^*) - \varepsilon T^2 \max\{1, L_r\}$,
2. $W(x) > \bar{W} - 2\varepsilon T^2 (L_a + 1)$,

and its running time is $O\!\left(\frac{T^2 |A|}{\varepsilon^2} \log\left(\frac{T|A|}{\varepsilon}\right)\right)$.

Unfortunately, the technique we employed in the previous subsection for extending the approximation from the optimal discrete strategy to the optimal continuous strategy is ineffective in the constrained variant; see Section 7.

5. The Impact of Selective Response on Social Welfare

In this section, we flesh out the impact of implementing $\tau$-selective modifications on social welfare. Specifically, we focus on modifying an arbitrary initial strategy $x$ by applying a selective response in a single round $\tau$. The next theorem provides a powerful tool for characterizing the change in the instantaneous user welfare under $\tau$-selective modifications. We first present the theorem and then analyze its consequences.

Theorem 5.1. Fix any instance and a strategy $x$. There exist thresholds $B$ and $C$, $B \le C < w^s$, such that for any $\tau$-selective modification $x^\tau$ it holds that:

1. In round $\tau$, if $w^g_\tau(x) < B$ then $w_\tau(x^\tau) > w_\tau(x)$;
2. For every round $t > \tau$, if $w^g_t(x^\tau) < B$, then it holds that $w_t(x^\tau) < w_t(x)$;
3. For every round $t > \tau$ such that $w^g_t(x) > C$, it holds that $w_t(x^\tau) > w_t(x)$.
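The three regimes of Theorem 5.1 can be observed on a concrete instance by comparing the per-round welfare of a base strategy and a $\tau$-selective modification. The sketch below (our code; `welfare_path` is a hypothetical helper) prints which strategy yields higher instantaneous welfare in each round.

```python
import math

def welfare_path(x, T, a, beta, ws):
    """Per-round instantaneous welfare w_t(x), Eq. (2), under strategy x."""
    D, w_list = 0.0, []
    for t in range(T):
        sigma = math.exp(beta * a(D) * x[t]) / (math.exp(beta * a(D) * x[t]) + math.exp(beta * ws))
        p = x[t] * sigma
        w_list.append(p * a(D) + (1 - p) * ws)
        D += 1 - p
    return w_list

a = lambda D: 1 - math.exp(-0.3 * D)
base = [1.0] * 10
mod = base.copy(); mod[0] = 0.0          # tau = 1: answer nobody in round 1
w_base = welfare_path(base, 10, a, 10, 0.5)
w_mod = welfare_path(mod, 10, a, 10, 0.5)
for t, (wb, wm) in enumerate(zip(w_base, w_mod), start=1):
    print(t, "modification better" if wm > wb else "base better")
```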
We interpret the theorem using the illustrations in Figure 2. In Figure 2a, the horizontal axis is the round number and the vertical axis is the expected utility users obtain from Gen AI, $w^g_t$. There are four curves: the red (circle) is the base strategy $x$; the blue (triangle) represents a $\tau$-selective modification $x^\tau$; the black (dotted) line is the threshold $B$; and the orange (dashed) line is the threshold $C$. Figure 2b also uses the round number as the horizontal axis and includes both strategies $x$ and $x^\tau$, but its vertical axis shows the instantaneous welfare $w_t$. Before round $\tau$, the strategies agree on the response levels; thus, the utilities are identical, and the blue and red curves intersect in both figures.

Next, we focus on round $\tau$. Recall that the $\tau$-selective modification has a lower response level in round $\tau$, i.e., $x^\tau_\tau < x_\tau$. Consequently, Figure 2a demonstrates that the utility Gen AI induces is lower. Part 1 of the theorem implies that the $\tau$-selective modification obtains a higher instantaneous welfare, as shown in Figure 2b. To see why, notice that both $w^g_\tau(x^\tau)$ and $w^g_\tau(x)$ are less than $B < w^s$; thus, any user that is directed to Forum under the modification obtains a higher utility.

For any round $t > \tau$, the blue curve is above the red curve in Figure 2a. Namely, Gen AI's quality under the $\tau$-selective modification $x^\tau$ is greater than that of the base strategy $x$. This is a direct corollary of Theorem 4.1: more data is created ($D_t(x^\tau) > D_t(x)$) and more users choose Gen AI ($p_t(x^\tau) \ge p_t(x)$); hence $w^g_t(x^\tau) > w^g_t(x)$.

Next, we focus on Part 2 of the theorem, which is demonstrated by the shaded gray area (featuring horizontal lines) in the two figures. In Figure 2a, the gray area represents rounds with $t > \tau$ and $w^g_t(x^\tau) < B$. Consequently, Part 2 of Theorem 5.1 implies that the instantaneous welfare of $x^\tau$ is lower than that of $x$ (shaded area in Figure 2b). We can reformulate the instantaneous welfare from Equation (2) to include $\sigma_t(\cdot)$, namely,

$$w_t(x) = \sigma_t(x) w^g_t(x) + (1 - x_t \sigma_t(x)) w^s. \tag{4}$$

On the one hand, Gen AI's expected utility increases under the $\tau$-selective modification ($w^g_t(x^\tau) > w^g_t(x)$), while both utilities are under $B$ and thus under $w^s$. On the other hand, the proportion of users switching to Gen AI grows: $\sigma_t(x^\tau) > \sigma_t(x)$. Therefore, the first product on the right-hand side of Equation (4) increases for the $\tau$-selective modification, while the second product decreases. Part 2 quantifies this tradeoff, implying that the instantaneous welfare decreases overall. This is illustrated in the gray area in Figure 2b, as the red curve is above the blue curve.

Similarly, Part 3 of the theorem, represented by the green shaded area (with vertical lines), corresponds to rounds in which the red curve in Figure 2a exceeds the threshold $C$, that is, $w^g_t(x) > C$. In these rounds, Part 3 asserts that the instantaneous welfare of $x^\tau$ exceeds that of $x$ (green area in Figure 2b).

Figure 2: Illustrating Theorem 5.1. The left figure illustrates Gen AI's expected utility vs. the round index, and the right figure illustrates instantaneous welfare vs. the round index. The red (circle) and blue (triangle) curves represent the base strategy $x$ and a $\tau$-selective modification $x^\tau$, respectively. The orange (dashed) and black (dotted) lines represent the thresholds $C$ and $B$, respectively. The gray and green shaded areas highlight different parts of the theorem. The gray area indicates rounds where the condition of Part 2 is met (Figure 2a), and the resulting lower instantaneous welfare is illustrated in Figure 2b. The green region denotes rounds where $x^\tau$ leads to higher instantaneous welfare (Part 3).

Finally, we discuss the thresholds $B$ and $C$. For the latter, the theorem holds trivially for $C = w^s$. However, as we show in the proof of the theorem, we have a tighter threshold of $C = w^s - \frac{W(e^{-1}) + 1}{\beta}$, where $W$ is the Lambert function. As for $B$, Theorem 4.1 implies its existence, yet finding a closed-form expression remains an open question.

6. Regulating Selective Response for Improved Social Welfare with Minimal Intervention

In this section, we adopt the perspective of a regulator aiming to benefit users through interventions. We show how to use the results from the previous section to ensure that the intervention will be beneficial from a welfare perspective. Additionally, we bound the revenue gap that such an intervention may create. A crucial part of our approach is that the regulator can see previous actions, but not future actions, making it closer to real-world scenarios.
Specifically, for any arbitrary round $\tau$, we assume the regulator observes $x_1, \dots, x_\tau$, but has no access to Gen AI's future strategy $(x_t)_{t=\tau+1}^{T}$.

6.1. Sufficient Conditions for Increasing Social Welfare

We focus on $\tau$-selective modifications that are guaranteed to increase welfare w.r.t. a base strategy $x$. We further assume Gen AI commits to a 0 response level as long as its quality is below $C$, where $C$ is the threshold from Theorem 5.1. This commitment, formally given by $\min_{t > \tau}\{w^g_t(x) \mid w^g_t(x) > 0\} > C$, represents the minimum utility required from Gen AI for rounds $t > \tau$.

Corollary 6.1. Let $B$ and $C$ be the thresholds from Theorem 5.1. Assume that $w^g_\tau(x) < B$ and that Gen AI commits, i.e., $\min_{t > \tau}\{w^g_t(x) \mid w^g_t(x) > 0\} > C$ holds. Then, $W(x^\tau) \ge W(x)$.

Intuitively, Corollary 6.1 ensures that the welfare improvement due to this intervention (the green shaded region in Figure 2) surpasses the welfare reduction (the gray region).

6.2. Bounding Gen AI's Revenue Gap

A complementary question is to what extent forcing a $\tau$-selective response can harm Gen AI's revenue. Our goal is to establish a bound on the revenue gap between the base strategy $x$ and the modified strategy $x^\tau$, where the selective response occurs in round $\tau$. We stress that incomplete information about future actions makes this analysis challenging. By definition, $x$ and $x^\tau$ are identical except for round $\tau$. Consequently, they generate the same amount of data in all rounds before $\tau$. Using a $\tau$-selective response reduces the proportion of answers in that round, which in turn increases the accumulated data available in round $\tau + 1$. Therefore, the revenue gap can be decomposed into two components: (1) the immediate effect of the proportion change in round $\tau$, $r(p_\tau(x^\tau)) - r(p_\tau(x))$; and (2) the downstream effects on subsequent rounds due to the change in the data generation process. Using several technical lemmas that we prove in Appendix E, we show that:

Corollary 6.2. It holds that

$$\left|U(x) - U(x^\tau)\right| \le \gamma^{\tau-1} \left| r(p_\tau(x^\tau)) - r(p_\tau(x)) \right| + \frac{L_r \gamma^{\tau}}{1 - \gamma} \left| p_\tau(x^\tau) - p_\tau(x) \right|.$$

The above bound is less informative as $\gamma$ approaches 1. In Theorem E.1, we obtain a tighter bound under some additional assumptions.
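As a sanity check, the bound can be evaluated numerically. The sketch below (our code; `revenue` is a hypothetical helper, and the bound expression follows the statement of Corollary 6.2 as given above) compares the revenue gap of a $\tau$-selective modification against the bound on the Example 2.2 instance.

```python
import math

def revenue(x, T, a, r, gamma, beta, ws):
    """Discounted revenue U(x) and the per-round proportions p_t(x)."""
    D, U, ps = 0.0, 0.0, []
    for t in range(T):
        sigma = math.exp(beta * a(D) * x[t]) / (math.exp(beta * a(D) * x[t]) + math.exp(beta * ws))
        p = x[t] * sigma
        ps.append(p)
        U += gamma ** t * r(p)
        D += 1 - p
    return U, ps

a = lambda D: 1 - math.exp(-0.3 * D)
r = lambda p: p ** 2                      # L_r = 2 on [0, 1]
gamma, tau, Lr = 0.9, 3, 2.0
base = [1.0] * 10
mod = base.copy(); mod[tau - 1] = 0.5     # a tau-selective modification at round tau = 3
U1, p1 = revenue(base, 10, a, r, gamma, 10, 0.5)
U2, p2 = revenue(mod, 10, a, r, gamma, 10, 0.5)
bound = gamma ** (tau - 1) * abs(r(p2[tau - 1]) - r(p1[tau - 1])) \
        + Lr * gamma ** tau / (1 - gamma) * abs(p2[tau - 1] - p1[tau - 1])
assert abs(U1 - U2) <= bound              # the realized gap respects the bound
```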
7. Discussion and Future Work

This paper pioneers the novel approach of selective response, showing that withholding responses can be a powerful tool for Gen AI systems. By opting not to answer every query as accurately as it can, particularly when new or complex topics emerge, Gen AI can encourage user participation on community-driven platforms and thereby generate more high-quality data for future training. This mechanism ultimately enhances Gen AI's long-term performance and revenue. Therefore, selective response is related to active learning, nudging users to generate more data. This mirrors the exploration-exploitation tradeoff from the multi-armed bandit literature: Gen AI forgoes immediate gains and risks user dissatisfaction in the pursuit of better long-term revenue. From a welfare perspective, our results indicate that selective response can benefit users, leading to better solutions and increased overall satisfaction.

Since this work is the first to address selective response strategies for Gen AI, numerous promising directions remain for future research; we highlight some of them below. First, from a technical standpoint, all of the results in this paper rely on Assumption 2.1, involving the Lipschitz condition on the accuracy function and the sensitivity parameter $\beta$. Future work could seek to relax this assumption. Furthermore, our constrained optimization approach in Subsection 4.3 could be extended to approximate the optimal (continuous) strategy instead of the optimal discrete strategy.

Second, our stylized model adopts the simplifying, though unrealistic, assumption that only a single Gen AI platform exists. Admittedly, this makes it easier to focus on the idea of selective responses, and indeed, this assumption is pivotal in keeping our analysis tractable. Future research could explore scenarios with multiple Gen AI platforms and human-centered forums. In such settings, one platform's selective response might redirect users to competing Gen AI platforms, leading to the tragedy of the commons (Hardin, 1968): although all Gen AI platforms benefit from fresh data generation, none may choose to respond selectively if it means losing users to competitors.

Third, we assumed Forum behaves non-strategically. In reality, human-centered platforms often monetize their data by selling it to Gen AI platforms, adding a further layer of strategic interaction for Gen AI. Moreover, data transfer between the platforms can form the basis for collaboration: Gen AI could employ selective response to bolster Forum content creation, and Forum could, in turn, attribute that content to Gen AI for subsequent use in retraining.

Acknowledgments

We thank the anonymous reviewers for their helpful comments. This research was supported by the Israel Science Foundation (ISF; Grant No. 3079/24).

Impact Statement

Our work aims to benefit both Gen AI and its users by introducing a new action: allowing Gen AI to choose whether to answer or to adjust the quality of its answer. As shown in this work, a wise use of selective response can be beneficial for both Gen AI and its users. From an ethical perspective, using a selective response can be seen as a responsible decision, as it allows Gen AI to redirect users instead of providing low-quality answers. Furthermore, if Gen AI is transparent about its need for data, it can notify users about its lack of confidence and explain that it may be able to assist them better in the future by choosing not to answer in the present.

References

Babichenko, Y., Talgam-Cohen, I., Xu, H., and Zabarnyi, K. Algorithmic cheap talk. In Proceedings of the 25th ACM Conference on Economics and Computation, EC '24, pp. 5–6. Association for Computing Machinery, 2024.

Bailey, M., Johnston, D., Kuchler, T., Stroebel, J., and Wong, A. Peer effects in product adoption. American Economic Journal: Applied Economics, 14(3):488–526, 2022.

Ben-Porat, O., Mansour, Y., Moshkovitz, M., and Taitler, B. Principal-agent reward shaping in MDPs. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pp. 9502–9510, 2024.

Bergemann, D. and Bonatti, A. Data, competition, and digital platforms. American Economic Review, 114(8):2553–2595, 2024.

Bergemann, D. and Morris, S. Information design: A unified perspective. Journal of Economic Literature, 57(1):44–95, 2019.

Bergemann, D., Brooks, B., and Morris, S. The limits of price discrimination. American Economic Review, 105(3):921–957, 2015.

Burtch, G., Lee, D., and Chen, Z. The consequences of generative AI for online knowledge communities. Scientific Reports, 14(1):10413, 2024.

Chow, Y., Tennenholtz, G., Gur, I., Zhuang, V., Dai, B., Kumar, A., Agarwal, R., Thiagarajan, S., Boutilier, C., and Faust, A. Inference-aware fine-tuning for best-of-n sampling in large language models. In The Thirteenth International Conference on Learning Representations, 2025.
Conitzer, V., Freedman, R., Heitzig, J., Holliday, W. H., Jacobs, B. M., Lambert, N., Mossé, M., Pacuit, E., Russell, S., Schoelkopf, H., Tewolde, E., and Zwicker, W. S. Social choice for AI alignment: Dealing with diverse human feedback. CoRR, abs/2404.10271, 2024. doi: 10.48550/ARXIV.2404.10271. URL https://doi.org/10.48550/arXiv.2404.10271.

Crandall, J. W., Oudah, M., Tennom, Ishowo-Oloko, F., Abdallah, S., Bonnefon, J.-F., Cebrian, M., Shariff, A., Goodrich, M. A., and Rahwan, I. Cooperating with machines. Nature Communications, 9(1):233, 2018.

Crawford, V. P. and Sobel, J. Strategic information transmission. Econometrica: Journal of the Econometric Society, pp. 1431–1451, 1982.

Dean, S., Dong, E., Jagadeesan, M., and Leqi, L. Accounting for AI and users shaping one another: The role of mathematical models. arXiv preprint arXiv:2404.12366, 2024.

del Rio-Chanona, R. M., Laurentsyeva, N., and Wachs, J. Large language models reduce public knowledge sharing on online Q&A platforms. PNAS Nexus, 3(9):pgae400, 2024.

Esmaeili, S. A., Bhawalkar, K., Feng, Z., Wang, D., and Xu, H. How to strategize human content creation in the era of GenAI? arXiv preprint arXiv:2406.05187, 2024.

Feldman, M., Meir, R., and Tennenholtz, M. Competition in the presence of social networks: How many service providers maximize welfare? In International Conference on Web and Internet Economics, pp. 174–187. Springer, 2013.

Frieder, S., Pinchetti, L., Griffiths, R.-R., Salvatori, T., Lukasiewicz, T., Petersen, P., and Berner, J. Mathematical capabilities of ChatGPT. Advances in Neural Information Processing Systems, 36, 2024.

Garey, M. R. and Johnson, D. S. Computers and Intractability, volume 174. Freeman, San Francisco, 1979.

Hardin, G. The tragedy of the commons: The population problem has no technical solution; it requires a fundamental extension in morality. Science, 162(3859):1243–1248, 1968.

Hemmer, P., Thede, L., Vössing, M., Jakubik, J., and Kühl, N. Learning to defer with limited expert predictions. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 37, pp. 6002–6011, 2023.

Jagadeesan, M., Jordan, M. I., and Haghtalab, N. Competition, alignment, and equilibria in digital marketplaces. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 37, pp. 5689–5696, 2023.

Ji, Z., Lee, N., Frieske, R., Yu, T., Su, D., Xu, Y., Ishii, E., Bang, Y. J., Madotto, A., and Fung, P. Survey of hallucination in natural language generation. ACM Computing Surveys, 55(12):1–38, 2023.

junyou li, Zhang, Q., Yu, Y., Fu, Q., and Ye, D. More agents is all you need. Transactions on Machine Learning Research, 2024. ISSN 2835-8856. URL https://openreview.net/forum?id=bgzUSZ8aeg.

Kalai, A. T. and Vempala, S. S. Calibrated language models must hallucinate. In Proceedings of the 56th Annual ACM Symposium on Theory of Computing, pp. 160–171, 2024.

Karle, H., Peitz, M., and Reisinger, M. Segmentation versus agglomeration: Competition between platforms with competitive sellers. Journal of Political Economy, 128(6):2329–2374, 2020.

Kasneci, E., Seßler, K., Küchemann, S., Bannert, M., Dementieva, D., Fischer, F., Gasser, U., Groh, G., Günnemann, S., Hüllermeier, E., et al. ChatGPT for good? On opportunities and challenges of large language models for education. Learning and Individual Differences, 103:102274, 2023.

Katz, M. L. and Shapiro, C. Network externalities, competition, and compatibility. The American Economic Review, 75(3):424–440, 1985.

Keinan, G. and Ben-Porat, O. Strategic content creation in the age of GenAI: To share or not to share? arXiv preprint arXiv:2505.16358, 2025.

Kocoń, J., Cichecki, I., Kaszyca, O., Kochanek, M., Szydło, D., Baran, J., Bielaniewicz, J., Gruza, M., Janz, A., Kanclerz, K., et al. ChatGPT: Jack of all trades, master of none. Information Fusion, pp. 101861, 2023.

Koutsoupias, E. and Papadimitriou, C. Worst-case equilibria. In Annual Symposium on Theoretical Aspects of Computer Science, pp. 404–413. Springer, 1999.

Laufer, B., Kleinberg, J., and Heidari, H. Fine-tuning games: Bargaining and adaptation for general-purpose models. In Proceedings of the ACM on Web Conference 2024, pp. 66–76, 2024.

Li, K., Patel, O., Viégas, F., Pfister, H., and Wattenberg, M. Inference-time intervention: Eliciting truthful answers from a language model. Advances in Neural Information Processing Systems, 36, 2024.

Li, X. and Kim, K. Impacts of generative AI on user contributions: Evidence from a coding Q&A platform. Marketing Letters, pp. 1–15, 2024.

Liu, J., Xia, C. S., Wang, Y., and Zhang, L. Is your code generated by ChatGPT really correct? Rigorous evaluation of large language models for code generation. Advances in Neural Information Processing Systems, 36, 2024.

Lo, Y. L., de Witt, C. S., Sokota, S., Foerster, J. N., and Whiteson, S. Cheap talk discovery and utilization in multi-agent reinforcement learning. In The Eleventh International Conference on Learning Representations, 2023.

Lu, C., Willi, T., Letcher, A., and Foerster, J. N. Adversarial cheap talk. In International Conference on Machine Learning, pp. 22917–22941. PMLR, 2023.

McIntyre, D. P. and Srinivasan, A. Networks, platforms, and strategy: Emerging views and next steps. Strategic Management Journal, 38(1):141–160, 2017.

Mello, M. M. and Guha, N. ChatGPT and physicians' malpractice risk. JAMA Health Forum, 4(5):e231938, May 2023. doi: 10.1001/jamahealthforum.2023.1938. URL https://jamanetwork.com/journals/jama-health-forum/fullarticle/2805334.

Milgrom, P. R. Good news and bad news: Representation theorems and applications. The Bell Journal of Economics, pp. 380–391, 1981.

Mozannar, H. and Sontag, D. Consistent estimators for learning to defer to an expert. In International Conference on Machine Learning, pp. 7076–7087. PMLR, 2020.

Raghavan, M. Competition and diversity in generative AI. arXiv preprint arXiv:2412.08610, 2024.

Rietveld, J. and Schilling, M. A. Platform competition: A systematic and interdisciplinary review of the literature. Journal of Management, 47(6):1528–1563, 2021.

Roughgarden, T. Selfish Routing and the Price of Anarchy. MIT Press, 2005.

Shin, R. Humiliated lawyers fined $5,000 for submitting ChatGPT hallucinations in court: 'I heard about this new site, which I falsely assumed was, like, a super search engine', June 2023. URL https://fortune.com/2023/06/23/lawyers-fined-filing-chatgpt-hallucinations-in-court/.

Sun, H., Chen, Y., Wang, S., Chen, W., and Deng, X. Mechanism design for LLM fine-tuning with multiple reward models. In Pluralistic Alignment Workshop at NeurIPS 2024, 2024.

Taitler, B. and Ben-Porat, O. Braess's paradox of generative AI. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 39, pp. 14139–14147, 2025.

Tullock, G. Efficient rent seeking. In Buchanan, J. M., Tollison, R. D., and Tullock, G. (eds.), Toward a Theory of the Rent-Seeking Society, pp. 97–112. Texas A&M University Press, College Station, 1980.
Yao, F., Li, C., Nekipelov, D., Wang, H., and Xu, H. Human vs. generative AI in content creation competition: Symbiosis or conflict? In Proceedings of the 41st International Conference on Machine Learning, ICML '24. JMLR.org, 2024.

A. Definitions and Notations

We first define the following function:

$$f(D, x) = D + \left(1 - x \cdot \frac{e^{\beta a(D)x}}{e^{\beta a(D)x} + e^{\beta w^s}}\right).$$

Denote $q(D, x) = \frac{e^{\beta a(D)x}}{e^{\beta a(D)x} + e^{\beta w^s}}$; therefore, $f(D, x)$ can be expressed as $f(D, x) = D + (1 - x\,q(D, x))$.

Next, we define $\lfloor \cdot \rfloor_\varepsilon$ as the discretization operator with respect to $\varepsilon \in \mathbb{R}$. Formally, for any $x \in \mathbb{R}$, the discretization operator is given by $\lfloor x \rfloor_\varepsilon = \varepsilon \left\lfloor \frac{x}{\varepsilon} \right\rfloor$.

B. Proofs Omitted from Section 3

Proof of Observation 3.1. We prove each clause separately.

1. Pareto dominance. This is shown in Example 2.2, for which it holds that $U(\bar{x}) < 2.356$, $U(x) > 2.483$, $W(\bar{x}) < 5.73$, and $W(x) > 6.2$.

2. Decreases revenue and increases welfare. Let $T = 5$ and consider the instance $a(D) = 1 - e^{-0.4D}$, $\gamma = 1$, $\beta = 3$, $w^s = 0.7$, and $r(p) = p$. We calculate the revenue and the social welfare induced by $\bar{x}$ by calculating the proportions for every $t \in [T]$. The induced revenue is $U(\bar{x}) > 1.6$ and the social welfare is $W(\bar{x}) < 3.3$. Next, we denote by $x = (0, 0, \dots, 0)$ the strategy under which Gen AI never answers. By definition, we have $U(x) = 0 < U(\bar{x})$ and $W(x) = T w^s = 3.5 > W(\bar{x})$.

3. Increases revenue and decreases welfare. Let $T = 5$ and consider the instance $a(D) = 1 - e^{-0.4D}$, $\gamma = 1$, $\beta = 3$, $w^s = 0.1$, and $r(p)$ the step function defined as

$$r(p) = \begin{cases} 1 & p \ge q(4, 1) \\ 0 & \text{otherwise} \end{cases}.$$

We denote by $x$ the strategy that satisfies

$$x_t = \begin{cases} 1 & t = T \\ 0 & \text{otherwise} \end{cases}.$$

Notice that $p_1(\bar{x}) > 0$ and therefore for every $t \in [5]$ it holds that $D_t(\bar{x}) < 4$. Thus, $U(\bar{x}) = 0$. The revenue induced by $x$ is equal to the revenue induced at round $T$. This is true since $x_t = 0$ for every $t < T$, and therefore $p_t(x) = 0$. At round $T$, the total generated data is $D_T(x) = T - 1 = 4$; thus $p_T(x) = q(4, 1) > 0.89$ and $U(x) = r(p_T(x)) = 1 > 0 = U(\bar{x})$.

Calculating the welfare induced by $\bar{x}$ can be done by calculating $p_t(\bar{x})$, resulting in $W(\bar{x}) > 1.17$. Similarly, we can calculate the welfare induced by strategy $x$; repeating the same calculation leads to $W(x) < 1.122$. Thus, we conclude that $U(\bar{x}) < U(x)$ and $W(\bar{x}) > W(x)$.

This completes the proof of Observation 3.1.

B.1. Proofs Omitted from Subsection 3.1

Proof of Proposition 3.2. Consider the instance $a(D) = \frac{1+D}{T}$, $\gamma = 1$, $\beta = 1$, $w^s = \frac{1}{T}$, and $r(p)$ the sigmoid function defined as $r(p) = \frac{1}{1 + e^{\xi(q(T-1,1) - p)}}$, with

$$\xi = \frac{\ln(2TM)}{q(T-1, 1) - q\!\left(\frac{T-1}{2}, 1\right)}.$$

Notice that for every $t \le T$ it holds that $w^g_t(\bar{x}) = a(D_t(\bar{x})) = \frac{1 + D_t(\bar{x})}{T} \ge w^s$. Therefore, we get that $p_t(\bar{x}) > 0.5$ and $D_t(\bar{x}) < \frac{t-1}{2}$. We now bound the revenue induced by $\bar{x}$:

$$U(\bar{x}) = \sum_{t=1}^{T} r(p_t(\bar{x})) < T \cdot r\!\left(q\!\left(\tfrac{T-1}{2}, 1\right)\right).$$

Next, we define the scheme that answers only in the last round, $x' = (0, 0, \dots, 0, 1)$. Notice that the revenue induced by $x'$ is $U(x') = r(q(T-1, 1)) = 0.5$. Therefore,

$$\mathrm{RPAR} = \frac{\max_x U(x)}{U(\bar{x})} > \frac{0.5}{T\, r\!\left(q\!\left(\frac{T-1}{2}, 1\right)\right)} = \frac{1 + e^{\xi\left(q(T-1,1) - q\left(\frac{T-1}{2},1\right)\right)}}{2T} = \frac{1 + e^{\ln(2TM)}}{2T} > M.$$

Notice that it holds that

$$L_r = \max_{p \in [0,1]} \frac{dr}{dp} = \max_{p \in [0,1]} r(p)(1 - r(p))\,\xi \le \frac{\xi}{4}.$$

For $T = 10$, we get that $L_r \le 15.26 \ln(M)$. This completes the proof of Proposition 3.2.

Proof of Proposition 3.3. Let $T \in \mathbb{R}_{>0}$ and consider the instance $a(D) = \frac{D}{T^3}$, $\gamma = 1$, $\beta = 1$, $w^s = \frac{1}{T}$, and $r(p) = p$. Notice that the utility users derive from Gen AI is bounded by $w^g_t(x) = a(D_t)x_t \le \frac{T}{T^3} = \frac{1}{T^2}$. Furthermore, we can bound the proportions by

$$p_t(\bar{x}) = \frac{1}{1 + e^{\beta(w^s - w^g_t)}} > \frac{1}{1 + e^{\beta w^s}}.$$
Therefore, the users' social welfare satisfies

$$w_t(\bar{x}) = w^g_t(\bar{x}) p_t(\bar{x}) + (1 - p_t(\bar{x})) w^s \le \frac{1}{T^2} p_t(\bar{x}) + (1 - p_t(\bar{x})) w^s \le \frac{1}{T^2} \cdot \frac{1}{1 + e^{\beta/T}} + \left(1 - \frac{1}{1 + e^{\beta/T}}\right) w^s.$$

Next, denote by $x = (0, 0, \dots, 0)$ the strategy under which Gen AI does not answer any query. By definition, it holds that $w_t(x) = w^s$. We now bound WPAR:

$$\mathrm{WPAR} = \frac{\max_{x'} W(x')}{W(\bar{x})} \ge \frac{W(x)}{W(\bar{x})} \ge \frac{w^s}{\frac{1}{T^2} \cdot \frac{1}{1+e^{\beta/T}} + \left(1 - \frac{1}{1+e^{\beta/T}}\right) w^s}.$$

With $w^s = \frac{1}{T}$, denote

$$h(T) = \frac{1}{T(1+e^{\beta/T})} + 1 - \frac{1}{1+e^{\beta/T}},$$

so that $\mathrm{WPAR} \ge \frac{1}{h(T)}$. Notice that $\frac{1}{2-\varepsilon} > 0.5$. Observe that $h(T)$ is continuous in $T$ and satisfies the following properties: (1) $h(1) = 1$; (2) $\lim_{T \to \infty} h(T) = 0.5$. Therefore, by the intermediate value theorem, there exists $T_0$ such that $h(T_0) = \frac{1}{2-\varepsilon}$. Furthermore, $h(T)$ is monotonically decreasing in $T$; hence, for every $T \ge T_0$ it holds that

$$\mathrm{WPAR} \ge \frac{1}{h(T)} \ge \frac{1}{h(T_0)} = 2 - \varepsilon.$$

This completes the proof of Proposition 3.3.

Theorem B.1. For every $M \in \mathbb{R}_{\ge 0}$ there exists an instance $I$ with $\mathrm{PoA}(I) > M$.

Proof of Theorem B.1. Let $T \in \mathbb{R}_{>0}$ and consider the instance $a(D) = \frac{D}{T}$, $\gamma = 1$, $\beta = 3$, $w^s = \frac{1}{T}$. We let $r(p)$ be the step function

$$r(p) = \begin{cases} 1 & p \ge q(T-1, 1) \\ 0 & \text{otherwise} \end{cases}.$$

The purpose of choosing $r(p)$ as a step function is to show that Gen AI's revenue-maximizing strategy is $(0, \dots, 0, 1)$. Notice that we can also represent this function as a sigmoid, $r(p) \approx \frac{1}{1 + e^{\xi(q(T-1,1) - p)}}$ for $\xi \to \infty$. Notice that in each round, the maximal amount of data that can be generated is 1, which occurs for $x_t = 0$. Therefore, over $T - 1$ rounds, the maximum amount of data that can be generated is $T - 1$, which is induced by the strategy that uses $x_t = 0$ for every $t \le T - 1$. Answering any query before round $T$ results in $r(p_t) = 0$ for every $t \in [T]$. Therefore, Gen AI's optimal strategy is

$$x^*_t = \begin{cases} 0 & t < T \\ 1 & \text{otherwise} \end{cases}.$$

We now evaluate the welfare of the schemes $x^*$ and $\bar{x}$. We start with $x^*$:

$$W(x^*) = \sum_{t=1}^{T-1} w_t(x^*) + w_T(x^*) = (T-1) w_1(x^*) + w_T(x^*) \le (T-1)w^s + 1 \le T w^s + 1 = 2.$$

We move on to evaluate the social welfare induced by $\bar{x}$. First, notice that for every $t \ge 1$ it holds that

$$p_t(\bar{x}) \ge p_1(\bar{x}) = q(0, 1) = \frac{1}{1 + e^{\beta(w^s - a(0))}} = \frac{1}{1 + e^{\beta/T}} \ge \frac{1}{1 + e^{\beta}} > 0.04.$$

Similarly, we develop an upper bound on the proportions:

$$p_t(\bar{x}) = \frac{1}{1 + e^{\beta(w^s - a(D_t(\bar{x})))}} = \frac{1}{1 + e^{\beta\left(\frac{1}{T} - \frac{D_t(\bar{x})}{T}\right)}} \le \frac{1}{1 + e^{-\beta}} < 0.96.$$

Using the bound on the proportions, we can get a lower bound on the total amount of data at each round:

$$D_t(\bar{x}) = \sum_{t'=1}^{t-1} \left(1 - p_{t'}(\bar{x})\right) > 0.04(t-1).$$

This allows us to evaluate the minimal welfare induced by strategy $\bar{x}$:

$$W(\bar{x}) = \sum_{t=1}^{T} p_t(\bar{x}) w^g_t(\bar{x}) + (1 - p_t(\bar{x})) w^s > 0.04 \sum_{t=1}^{T} w^g_t(\bar{x}) = 0.04 \sum_{t=1}^{T} a(D_t(\bar{x})) > 0.04 \sum_{t=1}^{T} \frac{0.04(t-1)}{T} = \frac{0.04^2 (T-1)}{2}.$$

We are now ready to plug everything we calculated so far into the definition of the PoA:

$$\mathrm{PoA} = \frac{\max_x W(x)}{\min_{x' \in \mathcal{R}} W(x')} \ge \frac{W(\bar{x})}{W(x^*)} \ge \frac{0.04^2 (T-1)}{4}.$$

Therefore, for every $T > \frac{4M}{0.04^2} + 1$, it holds that

$$\mathrm{PoA} > \frac{0.04^2}{4}\left(\frac{4M}{0.04^2} + 1 - 1\right) = M.$$

This completes the proof of Theorem B.1.

C. Proofs Omitted from Section 4

C.1. Proofs Omitted from Subsection 4.1

Proof of Proposition 4.2. We take the derivative of $f(D, x)$:

$$\frac{df(D,x)}{dD} = 1 - x \frac{dq(D,x)}{dD} \cdot x = 1 - \beta x^2 q(D, x)(1 - q(D, x)) \frac{da(D)}{dD}.$$

Notice that $q(D, x) \in [0, 1]$ for every $D \in \mathbb{R}_{\ge 0}$ and $x \in [0, 1]$. Furthermore, the expression $q(1-q)$ has a single maximum point at $q = 0.5$, where it equals $\frac{1}{4}$. Therefore,

$$\frac{df(D,x)}{dD} = 1 - \beta x^2 q(D, x)(1 - q(D, x)) \frac{da(D)}{dD} \ge 1 - \frac{\beta x^2}{4} \cdot \frac{da(D)}{dD} > 0,$$

where the last inequality follows from Assumption 2.1. This completes the proof of Proposition 4.2.

Proof of Theorem 4.1. We first show that if $y := x^\tau_\tau < x_\tau$ then $D_{\tau+1}(x^\tau) > D_{\tau+1}(x)$. By the definitions of $x^\tau$ and $x$, it holds that $D_t(x^\tau) = D_t(x)$ for every $t \le \tau$.
Next, notice that if $y < x_\tau$ then

$$p_\tau(x^\tau) = x^\tau_\tau \frac{e^{\beta a(D_\tau(x^\tau)) x^\tau_\tau}}{e^{\beta a(D_\tau(x^\tau)) x^\tau_\tau} + e^{\beta w^s}} = y \frac{e^{\beta a(D_\tau(x^\tau)) y}}{e^{\beta a(D_\tau(x^\tau)) y} + e^{\beta w^s}} < x_\tau \frac{e^{\beta a(D_\tau(x)) x_\tau}}{e^{\beta a(D_\tau(x)) x_\tau} + e^{\beta w^s}} = p_\tau(x);$$

therefore, it holds that

$$D_{\tau+1}(x^\tau) = D_\tau(x^\tau) + (1 - p_\tau(x^\tau)) > D_\tau(x) + (1 - p_\tau(x)) = D_{\tau+1}(x).$$

Next, we use the following proposition to extend this gap to every round $t > \tau$.

Proposition C.1. Let $\tau \in [T]$ and let $x, \tilde{x}$ be two selective response strategies such that $x_t = \tilde{x}_t$ for every $t \ge \tau$. If $D_\tau(x) > D_\tau(\tilde{x})$ and $\frac{da(D)}{dD} \le \frac{4}{\beta}$, then for every $t \ge \tau$ it holds that $D_t(x) > D_t(\tilde{x})$ and $p_t(x) \ge p_t(\tilde{x})$, where equality holds only if $x_t = 0$.

Thus, by Proposition C.1 it holds that $p_t(x^\tau) \ge p_t(x)$ for every $t > \tau$. This completes the proof of Theorem 4.1.

Proof of Proposition C.1. We prove our claim by proving a slightly stronger version using induction over the rounds. In addition to the original claim, we also prove that $D_{t+1}(x) > D_{t+1}(\tilde{x})$ for every $t \ge \tau$. We start with the base case $t = \tau$. Notice that $p_\tau(x) = x_\tau q(D_\tau(x), x_\tau)$ and $p_\tau(\tilde{x}) = x_\tau q(D_\tau(\tilde{x}), x_\tau)$. We now use the following lemma:

Lemma C.2. For every $x \in [0, 1]$ and $D \in \mathbb{R}_{\ge 0}$, it holds that $q(D, x)$ satisfies $\frac{dq(D,x)}{dD} \ge 0$.

Since $D_\tau(x) > D_\tau(\tilde{x})$, Lemma C.2 implies that $p_\tau(x) \ge p_\tau(\tilde{x})$. Next, we show that $D_{\tau+1}(x) > D_{\tau+1}(\tilde{x})$. Notice that $D_{\tau+1}(\tilde{x}) = D_\tau(\tilde{x}) + (1 - p_\tau(\tilde{x})) = f(D_\tau(\tilde{x}), x_\tau)$; similarly, $D_{\tau+1}(x) = f(D_\tau(x), x_\tau)$. By Proposition 4.2, $f(D, x_\tau)$ is monotonically increasing in $D$. Therefore, $D_\tau(x) > D_\tau(\tilde{x})$ leads to $f(D_\tau(x), x_\tau) > f(D_\tau(\tilde{x}), x_\tau)$, and thus $D_{\tau+1}(x) > D_{\tau+1}(\tilde{x})$.

Assume the claim holds for $t - 1 \ge \tau$; we prove it holds for $t$. Since it holds for $t - 1$, we have $D_t(x) > D_t(\tilde{x})$. Therefore, by Lemma C.2 it holds that $p_t(x) \ge p_t(\tilde{x})$. Lastly, by Proposition 4.2 it holds that $D_{t+1}(x) = f(D_t(x), x_t) > f(D_t(\tilde{x}), x_t) = D_{t+1}(\tilde{x})$. This completes the proof of Proposition C.1.

Proof of Lemma C.2. We take the derivative of $q(D, x)$:

$$\frac{dq(D,x)}{dD} = \beta x\, q(D, x)(1 - q(D, x)) \frac{da(D)}{dD}.$$

As we assume in the model, $\frac{da(D)}{dD} \ge 0$. Furthermore, $q(D, x) \in [0, 1]$ for every $x \in [0, 1]$, and therefore $\frac{dq(D,x)}{dD} \ge 0$. This completes the proof of Lemma C.2.

Algorithm 1: Approximately optimal Selective Response (ASR)

Input: $T$, $A$, $\varepsilon$. Output: $x$.
1: $V(t, d) \leftarrow 0$, $\pi(t, d) \leftarrow 0$ for every $t \in [T+1]$ and $d \in \{0, \varepsilon, \dots, T\}$
2: for $t = T, \dots, 1$ do
3:   for $d \in \{0, \varepsilon, \dots, t-1\}$ do
4:     $v_d(y) \leftarrow 0$ for every $y \in A$
5:     for $y \in A$ do
6:       $p \leftarrow y \cdot \frac{e^{\beta a(d) y}}{e^{\beta a(d) y} + e^{\beta w^s}}$
7:       $d' \leftarrow \lfloor d + (1 - p) \rfloor_\varepsilon$
8:       $v_d(y) \leftarrow r(p) + \gamma V(t+1, d')$
9:     end for
10:    $V(t, d) \leftarrow \max_y v_d(y)$
11:    $\pi(t, d) \leftarrow \arg\max_y v_d(y)$
12:  end for
13: end for
14: extract $x$ from $\pi$ starting at $t = 1$, $d = 0$
15: return $x$

C.2. Proofs Omitted from Subsection 4.2

Proof of Observation 4.3. Consider the instance $a(D) = 0.7(1 - e^{-0.4D}) + 0.3$, $\gamma = 1$, $r(p) = p$, $\beta = 41$, and $w^s = 0.66$. Let $T = 3$ and observe the revenue for the following schemes:

1. $\bar{x} = (1, 1, 1)$;
2. $x^1 = (0, 1, 1)$;
3. $x^2 = (0, 0, 1)$;
4. $x^3 = (1, 0, 1)$;
5. $x = (0.04, 0.97, 1)$.

Notice that we do not consider schemes where $x_3 = 0$, since for any such scheme, the scheme that is identical at rounds $t = 1$ and $t = 2$ and plays $x_3 = 1$ induces higher revenue. The revenue differences between $x$ and the other schemes are as follows: $U(x) - U(\bar{x}) > 9.71 \times 10^{-6}$; $U(x) - U(x^1) > 9.71 \times 10^{-6}$; $U(x) - U(x^2) > 7.9 \times 10^{-6}$; $U(x) - U(x^3) > 7.89 \times 10^{-6}$. This completes the proof of Observation 4.3.

Proof of Theorem 4.4. We denote $\Delta_t = (t-1)\varepsilon$, and by $U_t(x)$ the accumulated revenue from round $t$ until $T$ following scheme $x$; formally, $U_t(x) = \sum_{i=t}^{T} \gamma^{i-t} r(p_i(x))$. We use the following lemma to relate $V(t, \lfloor d \rfloor_\varepsilon)$ and $U_t(x^*)$.

Lemma C.3. Fix round $t \in [T]$. For every $d \in \{0, \varepsilon, \dots, T\}$ such that $|d - D_t(x^*)| \le \Delta_t$, it holds that $V(t, d) > U_t(x^*) - L_r \sum_{i=t}^{T} \Delta_i \gamma^{i-t}$.
C.2. Proofs Omitted from Subsection 4.2

Proof of Observation 4.3. Consider the instance $a(D) = 0.7(1 - e^{-0.4D}) + 0.3$, $\gamma = 1$, $r(p) = p$, $\beta = 41$, and $w_s = 0.66$. Let $T = 3$ and observe the revenue for the following schemes:

1. $\bar{x} = (1, 1, 1)$.
2. $x^1 = (0, 1, 1)$.
3. $x^2 = (0, 0, 1)$.
4. $x^3 = (1, 0, 1)$.
5. $x^\star = (0.04, 0.97, 1)$.

Notice that we do not consider schemes where $x_3 = 0$, since for any such scheme, the scheme that is identical at rounds $t = 1, t = 2$ and plays $x_3 = 1$ induces higher revenue. The revenue differences between $x^\star$ and the other schemes are as follows:

$$U(x^\star) - U(\bar{x}) > 9.71\cdot 10^{-6}, \quad U(x^\star) - U(x^1) > 9.71\cdot 10^{-6}, \quad U(x^\star) - U(x^2) > 7.9\cdot 10^{-6}, \quad U(x^\star) - U(x^3) > 7.89\cdot 10^{-6}.$$

This completes the proof of Observation 4.3.

Proof of Theorem 4.4. We denote $\Delta_t = (t-1)\varepsilon$, and let $U_t(x)$ be the accumulated revenue from round $t$ until $T$ following scheme $x$; formally, $U_t(x) = \sum_{i=t}^{T} \gamma^{i-t} r(p_i(x))$. We use the following lemma to relate $V(t, d)$ and $U_t(x^\star)$.

Lemma C.3. Fix a round $t \in [T]$. For every $d \in \{0, \varepsilon, \dots, T\}$ such that $|d - D_t(x^\star)| \le \Delta_t$, it holds that

$$V(t, d) > U_t(x^\star) - L_r\sum_{i=t}^{T}\Delta_i\gamma^{i-t}.$$

Notice that $D_1 = 0$ by definition, and thus $U_1(x^\star) = U(x^\star)$. Therefore, Lemma C.3 suggests that

$$V(1, 0) > U(x^\star) - L_r\sum_{i=1}^{T}\Delta_i\gamma^{i-1}.$$

We use the following lemma to evaluate the difference between $U(x)$ and $V(1, 0)$, where $x$ is the strategy returned by the algorithm.

Lemma C.4. Let $(d_t)_{t=1}^{T}$ be the sequence defined by $d_1 = 0$ and $d_{t+1} = \lfloor d_t + (1 - x_tq(d_t, x_t)) \rfloor_\varepsilon$. Then, for every $t \in [T]$ it holds that $d_t \le D_t(x)$.

Therefore, by Lemma C.4, Lemma C.2, and the monotonicity of $r$, it holds that

$$V(1, 0) = \sum_{t=1}^{T}\gamma^{t-1} r\big(x_tq(d_t, x_t)\big) \le \sum_{t=1}^{T}\gamma^{t-1} r\big(x_tq(D_t(x), x_t)\big) = \sum_{t=1}^{T}\gamma^{t-1} r(p_t(x)) = U(x).$$

Thus, we can write:

$$U(x) \ge V(1, 0) > U(x^\star) - L_r\sum_{i=1}^{T}\Delta_i\gamma^{i-1}.$$

To complete the proof of Theorem 4.4, we prove the following lemma.

Lemma C.5. It holds that $\sum_{i=1}^{T}\Delta_i\gamma^{i-1} < \varepsilon T^2$.

This completes the proof of Theorem 4.4.

Proof of Lemma C.3. We prove this lemma using backward induction, starting with the base case at round $T$. For that, we first bound the difference in proportions using the following lemma.

Lemma C.6. Let $d_1, d_2 \in \mathbb{R}_{\ge 0}$ and $y \in [0,1]$. If $d_1 < d_2$ then $0 \le y\big(q(d_2, y) - q(d_1, y)\big) < d_2 - d_1$.

In round $T$ it holds that $|d - D_T(x^\star)| \le \Delta_T$. Therefore, for every $y \in A$ it holds that

$$y\,\big|q(d, y) - q(D_T(x^\star), y)\big| < |d - D_T(x^\star)| \le \Delta_T.$$

Let $d'(y) = \lfloor d + (1 - yq(d,y)) \rfloor_\varepsilon$. Consequently, since $V(T+1, \cdot) = 0$ and $r$ is monotone and $L_r$-Lipschitz,

$$\big|V(T, d) - U_T(x^\star)\big| = \Big|\max_{y \in A}\{r(yq(d,y)) + \gamma V(T+1, d'(y))\} - U_T(x^\star)\Big| = \Big|\max_{y\in A} r(yq(d,y)) - r\big(x^\star_T q(D_T(x^\star), x^\star_T)\big)\Big|$$
$$= \Big|r\big(\max_{y\in A} yq(d,y)\big) - r\big(x^\star_T q(D_T(x^\star), x^\star_T)\big)\Big| \le L_r\Big|\max_{y\in A} yq(d,y) - x^\star_T q(D_T(x^\star), x^\star_T)\Big| \le L_r\Delta_T.$$

We are done with the base case and move on to the induction step. Assume the lemma is true for $t+1$; we show it holds for round $t$. According to the assumptions of the lemma, $|d - D_t(x^\star)| \le \Delta_t$; therefore, according to Lemma C.6, for every $y \in A$ it holds that

$$\big|r(yq(d,y)) - r(yq(D_t(x^\star), y))\big| \le L_r\,\big|yq(d,y) - yq(D_t(x^\star), y)\big| < L_r\,|d - D_t(x^\star)| \le L_r\Delta_t.$$

We use the next lemma to bound the difference in data at step $t+1$.

Lemma C.7. It holds that $\big|f(D_t(x^\star), y) - \lfloor f(d, y)\rfloor_\varepsilon\big| < \Delta_{t+1}$.

Lemma C.7 suggests that the condition of the induction step holds; therefore, by the induction hypothesis,

$$V\big(t+1, \lfloor f(d, x^\star_t)\rfloor_\varepsilon\big) > U_{t+1}(x^\star) - L_r\sum_{i=t+1}^{T}\Delta_i\gamma^{i-(t+1)};$$

hence,

$$v_d(x^\star_t) = r\big(x^\star_tq(d, x^\star_t)\big) + \gamma V\big(t+1, \lfloor f(d, x^\star_t)\rfloor_\varepsilon\big) > r\big(x^\star_tq(D_t(x^\star), x^\star_t)\big) - L_r\Delta_t + \gamma\left(U_{t+1}(x^\star) - L_r\sum_{i=t+1}^{T}\Delta_i\gamma^{i-(t+1)}\right)$$
$$= r\big(x^\star_tq(D_t(x^\star), x^\star_t)\big) + \gamma U_{t+1}(x^\star) - L_r\Delta_t - L_r\sum_{i=t+1}^{T}\Delta_i\gamma^{i-t} = U_t(x^\star) - L_r\sum_{i=t}^{T}\Delta_i\gamma^{i-t}.$$

Finally, it holds that

$$V(t, d) = \max_{y\in A} v_d(y) \ge v_d(x^\star_t) > U_t(x^\star) - L_r\sum_{i=t}^{T}\Delta_i\gamma^{i-t}.$$

This completes the proof of Lemma C.3.

Proof of Lemma C.6. Since $d_2 > d_1$, according to Proposition 4.2 it holds that $f(d_2, y) - f(d_1, y) > 0$; expanding,

$$f(d_2, y) - f(d_1, y) = d_2 + \big(1 - yq(d_2, y)\big) - d_1 - \big(1 - yq(d_1, y)\big) = d_2 - d_1 - yq(d_2,y) + yq(d_1,y) > 0.$$

Rearranging the above inequality, we get that $y\big(q(d_2,y) - q(d_1,y)\big) < d_2 - d_1$. Furthermore, from Lemma C.2 it holds that $q(d_2, y) \ge q(d_1, y)$, and therefore we can summarize:

$$0 \le y\big(q(d_2,y) - q(d_1,y)\big) < d_2 - d_1.$$

This completes the proof of Lemma C.6.

Proof of Lemma C.7. We prove that for any $D \in \mathbb{R}_{\ge 0}$ with $|D - d| \le \Delta_t$, it holds that $\big|f(D, y) - \lfloor f(d, y)\rfloor_\varepsilon\big| < \Delta_{t+1}$. First, we use the following lemma.

Lemma C.8. Let $d_1, d_2 \in [0, T]$. Then it holds that $\big|f(d_1, y) - f(d_2, y)\big| \le |d_1 - d_2|$.

Therefore, using Lemma C.8, we get

$$\big|f(D,y) - \lfloor f(d,y)\rfloor_\varepsilon\big| \le \big|f(D,y) - f(d,y)\big| + \varepsilon \le |D - d| + \varepsilon \le (t-1)\varepsilon + \varepsilon = t\varepsilon = \Delta_{t+1}.$$

This completes the proof of Lemma C.7.
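The discretization argument above is easy to probe numerically. The following sketch (same illustrative primitives as before; all parameter choices are ours) runs the exact data recursion $D_{t+1} = f(D_t, x_t)$ next to the $\varepsilon$-grid recursion $d_{t+1} = \lfloor f(d_t, x_t)\rfloor_\varepsilon$, and confirms that $d_t \le D_t$ with $|D_t - d_t| \le (t-1)\varepsilon$, as Lemmas C.4 and C.7 assert.

```python
import numpy as np

beta, w_s, eps, T = 3.0, 0.5, 0.05, 40          # illustrative parameters
a = lambda D: 1 - np.exp(-0.5 * D)              # hypothetical accuracy curve

q = lambda D, x: 1.0 / (1.0 + np.exp(beta * (w_s - x * a(D))))
f = lambda D, x: D + 1.0 - x * q(D, x)

rng = np.random.default_rng(0)
xs = rng.uniform(0.0, 1.0, T)                   # an arbitrary selective response strategy

D, d = 0.0, 0.0
for t, x in enumerate(xs, start=1):
    assert d <= D + 1e-12                        # Lemma C.4: grid trajectory never overshoots
    assert abs(D - d) <= (t - 1) * eps + 1e-12   # error grows by at most eps per round
    D = f(D, x)                                  # exact update
    d = np.floor(f(d, x) / eps) * eps            # grid update with floor rounding
print("Grid trajectory stays within (t-1)*eps below the exact one.")
```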
Proof of Lemma C.8. Assume without loss of generality that $d_1 < d_2$. Then, according to Proposition 4.2, it holds that $f(d_1, y) \le f(d_2, y)$ for every $y \in A$. Furthermore, from Lemma C.2 it holds that $q(d_1, y) \le q(d_2, y)$. Thus, we can write:

$$\big|f(d_2, y) - f(d_1, y)\big| = f(d_2, y) - f(d_1, y) = d_2 + \big(1 - yq(d_2, y)\big) - d_1 - \big(1 - yq(d_1, y)\big)$$
$$= d_2 - d_1 - yq(d_2,y) + yq(d_1,y) \le d_2 - d_1 - yq(d_1,y) + yq(d_1,y) = d_2 - d_1.$$

This completes the proof of Lemma C.8.

Proof of Lemma C.4. By definition and by Proposition 4.2, it holds that

$$d_{t+1} = \lfloor d_t + (1 - x_tq(d_t, x_t))\rfloor_\varepsilon = \lfloor f(d_t, x_t)\rfloor_\varepsilon \le f(d_t, x_t) \le f(D_t(x), x_t) = D_{t+1}(x).$$

This completes the proof of Lemma C.4.

Proof of Lemma C.5. Since $\gamma \le 1$, it holds that $\sum_{i=1}^{T}\Delta_i\gamma^{i-1} \le \sum_{i=1}^{T}\Delta_i$. Notice that we now have the sum of an arithmetic series, and therefore

$$\sum_{i=1}^{T}\Delta_i = \varepsilon\sum_{i=1}^{T}(i-1) = \varepsilon\sum_{i=0}^{T-1} i = \frac{\varepsilon T(T-1)}{2} < \varepsilon T^2.$$

This completes the proof of Lemma C.5.

Proof of Theorem 4.6. Denote $x^\star \in \arg\max_x U(x)$, and define the following $T+1$ strategies $\{x(i)\}_{i=1}^{T+1}$:

$$x(i)_t = \begin{cases} \lfloor x^\star_t\rfloor_\delta & t \ge i \\ x^\star_t & \text{otherwise.} \end{cases}$$

Notice that, by definition, $x(T+1) = x^\star$. Furthermore, observe that the strategies $x(i)$ and $x(i+1)$ differ only in round $i$, for every $i \in [T]$. The following lemma bounds the difference between strategy $x(i)$ and strategy $x(i+1)$.

Lemma C.9. For every $i \in [T]$ it holds that

$$\big|U(x(i)) - U(x(i+1))\big| \le \frac{\gamma^{i-1}}{1-\gamma}\left(\frac{7\beta}{4} + 1\right)L_r\delta.$$

Observe that

$$\big|U(x(1)) - U(x(T+1))\big| \le \sum_{i=1}^{T}\big|U(x(i)) - U(x(i+1))\big|.$$

Therefore, by Lemma C.9, we get that

$$\big|U(x(1)) - U(x(T+1))\big| \le \sum_{i=1}^{T}\frac{\gamma^{i-1}}{1-\gamma}\left(\frac{7\beta}{4}+1\right)L_r\delta \le \frac{7\beta + 4}{4(1-\gamma)^2}L_r\delta.$$

Lastly, notice that $U(x^\star) \ge \max_{x'\in A_\delta^T}U(x') \ge U(x(1))$. Therefore, we can write:

$$\big|U(x^\star) - U(x)\big| = U(x^\star) - \max_{x'\in A_\delta^T}U(x') + \max_{x'\in A_\delta^T}U(x') - U(x)$$
$$\le \big|U(x^\star) - U(x(1))\big| + \left(\max_{x'\in A_\delta^T}U(x') - U(x)\right) \le \frac{7\beta+4}{4(1-\gamma)^2}L_r\delta + \varepsilon L_r T^2.$$

This completes the proof of Theorem 4.6.

Proof of Lemma C.9. By definition, $x(i)_t = x(i+1)_t$ for every $t < i$, and therefore $D_t(x(i)) = D_t(x(i+1))$ for every $t \le i$. Next, we use the following lemma:

Lemma C.10. For every $D \in [0, T]$ and $x, x' \in [0,1]$ it holds that

$$\big|q(D, x) - q(D, x')\big| = q(D, x)\big(1 - q(D, x')\big)\left|1 - e^{\beta(x'a(D) - xa(D))}\right|.$$

Notice that in our case, $|x(i)_i - x(i+1)_i| \le \delta$. Therefore,

$$\big|q(D_i(x(i)), x(i)_i) - q(D_i(x(i+1)), x(i+1)_i)\big| \le \left|1 - e^{\beta a(D_i(x(i)))(x(i+1)_i - x(i)_i)}\right| \le \frac{7\beta\delta}{4},$$

where the last inequality follows from $|1 - e^{z}| \le \frac{7|z|}{4}$ for every $|z| < 1$, together with $a(D) \le 1$. Next, notice that for every $D \in \mathbb{R}_{\ge 0}$ and $x, x' \in [0,1]$ such that $|x - x'| \le \delta$, it holds that

$$\big|xq(D,x) - x'q(D,x')\big| = \big|xq(D,x) - (x' - x + x)q(D,x')\big| = \big|xq(D,x) - xq(D,x') - (x'-x)q(D,x')\big|$$
$$\le x\big|q(D,x) - q(D,x')\big| + |x - x'|\,q(D,x') \le \big|q(D,x) - q(D,x')\big| + |x - x'| \le \left(\frac{7\beta}{4}+1\right)\delta.$$

Therefore, by Corollary 6.2, it holds that

$$\big|U(x(i)) - U(x(i+1))\big| \le \gamma^{i-1}\big|r(p_i(x(i))) - r(p_i(x(i+1)))\big| + \frac{L_r\gamma^{i}}{1-\gamma}\big|p_i(x(i)) - p_i(x(i+1))\big|$$
$$\le \gamma^{i-1}L_r\big|p_i(x(i)) - p_i(x(i+1))\big| + \frac{L_r\gamma^{i}}{1-\gamma}\big|p_i(x(i)) - p_i(x(i+1))\big| = \frac{\gamma^{i-1}L_r}{1-\gamma}\big|p_i(x(i)) - p_i(x(i+1))\big| \le \frac{\gamma^{i-1}}{1-\gamma}\left(\frac{7\beta}{4}+1\right)L_r\delta.$$

This completes the proof of Lemma C.9.

Proof of Lemma C.10. This lemma is a special case of Lemma E.7 and is hence omitted.
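The closed form in Lemma C.10 (and its generalization, Lemma E.7) is easy to sanity-check numerically. The sketch below, with illustrative parameters of our choosing, compares both sides of the identity on random inputs.

```python
import numpy as np

beta, w_s = 3.0, 0.5
a = lambda D: 1 - np.exp(-0.5 * D)               # hypothetical accuracy curve
q = lambda D, x: np.exp(beta * x * a(D)) / (np.exp(beta * x * a(D)) + np.exp(beta * w_s))

rng = np.random.default_rng(1)
for _ in range(10_000):
    D = rng.uniform(0, 10)
    x1, x2 = rng.uniform(0, 1, 2)
    lhs = q(D, x1) - q(D, x2)
    rhs = q(D, x1) * (1 - q(D, x2)) * (1 - np.exp(beta * (x2 * a(D) - x1 * a(D))))
    assert abs(lhs - rhs) < 1e-10
print("Identity of Lemma C.10 / Lemma E.7 verified on random inputs.")
```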
C.3. Proofs Omitted from Subsection 4.3

Proof of Theorem 4.7. The proof is constructed in five parts. First, we simplify and write our problem explicitly. Then, we define an approximation of our problem and build an MDP to describe it. The third step is to show that our approximation can be viewed as an instance of the problem in (Ben-Porat et al., 2024), and thus has an optimal solution. In the last steps, we calculate the gap between the optimal solution of the approximated problem and the optimal solution of our original problem.

Step 1. We start by rewriting Problem P1. Notice that the welfare at each round can be written as

$$w_t(x) = p_t(x)a(D_t(x)) + \big(1-p_t(x)\big)w_s = p_t(x)\big(a(D_t(x)) - w_s\big) + w_s.$$

Therefore, the social welfare can be expressed as $W(x) = Tw_s + \sum_{t=1}^{T}p_t(x)\big(a(D_t(x)) - w_s\big)$. By denoting $W^1 = W - Tw_s$, we can rewrite our problem as

$$\max_x \sum_{t=1}^{T} r(p_t(x)) \quad \text{s.t.} \quad \sum_{t=1}^{T}p_t(x)\big(a(D_t(x)) - w_s\big) \ge W^1.$$

Step 2. We now build a graph to represent an approximation of our problem. Notice that the maximal amount of data that can be generated in each round is $1$, and therefore $D_t(x) < T$ for every $t \in [T]$ and scheme $x$. Therefore, given $\varepsilon > 0$, we discretize the available data values by increments of $\varepsilon$, yielding the grid $\{0, \varepsilon, \dots, T\}$. We now describe the components of our graph. Our graph is a deterministic MDP with an underlying layered structure, as follows. Let $S = \{S_1, \dots, S_{T+1}\}$ be the set of all states, where $S_t = \{s^0_t, s^\varepsilon_t, \dots, s^{T}_t\}$ denotes the states of the $t$-th layer, and $s^d_t$ represents the state in which Gen AI is in round $t$ with $d$ data. The set of actions is $A$, and there are two reward functions defined for each state-action pair. The first is defined by $R(s^d_t, y) = \lfloor r(yq(d,y))\rfloor_\varepsilon$, while the second is $W(s^d_t, y) = \lfloor yq(d,y)\,(a(d) - w_s)\rfloor_\varepsilon$. Next, we let $\mathcal{T}(s, y, s')$ denote the transition function, namely the probability of reaching state $s'$ by playing $y$ in state $s$. The transition function in our MDP is deterministic and defined by

$$\mathcal{T}(s^d_t, y, s^{d'}_{t'}) = \begin{cases} 1 & t' = t+1 \text{ and } d' = \lfloor d + 1 - yq(d,y)\rfloor_\varepsilon \\ 0 & \text{otherwise.} \end{cases}$$

In terms of graphs, the states are analogous to vertices, and $\mathcal{T}(s, y, s') = 1$ specifies an edge from state $s$ to state $s'$. An illustration of this graph for $\varepsilon = 0.5$ is presented in Figure 3.

[Figure 3: Example of a constructed graph with a discretization factor of $\varepsilon = 0.5$.]
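A compact way to see this construction is to materialize the layered graph in code. The sketch below builds, for each layer $t$ and grid level $d$, the two $\varepsilon$-rounded rewards and the deterministic successor, mirroring the definitions of $R$, $W$, and $\mathcal{T}$ above; the primitives and parameters are illustrative assumptions of ours.

```python
import numpy as np

beta, w_s, eps, T = 3.0, 0.5, 0.5, 4             # tiny instance for illustration
A = [0.0, 0.5, 1.0]                              # finite action set
a = lambda D: 1 - np.exp(-0.5 * D)               # hypothetical accuracy curve
r = lambda p: p                                  # hypothetical revenue function
q = lambda d, y: 1.0 / (1.0 + np.exp(beta * (w_s - y * a(d))))
floor_eps = lambda v: np.floor(v / eps) * eps

# edges[(t, d, y)] = (R, W, d_next): the deterministic layered MDP
edges = {}
grid = [i * eps for i in range(int(T / eps) + 1)]
for t in range(1, T + 1):
    for d in grid:
        for y in A:
            p = y * q(d, y)
            R = floor_eps(r(p))                  # revenue reward, rounded to the eps-grid
            W = floor_eps(p * (a(d) - w_s))      # welfare reward, rounded to the eps-grid
            d_next = min(floor_eps(d + 1.0 - p), grid[-1])
            edges[(t, d, y)] = (R, W, d_next)

print(f"{len(edges)} state-action edges over {T} layers and {len(grid)} data levels")
```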
By the construction of the layered graph, the horizon is $T+1$, and Gen AI starts at state $s^0_1$. We define a policy $\pi: S \to A$ to be the mapping between each state and the action Gen AI should take in that state. For a deterministic MDP, a policy is equivalent to a path $\tau$, which in our case is a sequence of $T$ edges starting from state $s^0_1$ and leading to a state in $S_{T+1}$. Notice that each edge represents a state and an action from that state; therefore, a path $\tau$ can also be defined as a sequence of state-action pairs. The problem we aim to solve using the graph is the following:

$$\max_\tau \sum_{(s,y)\in\tau} R(s, y) \quad \text{s.t.} \quad \sum_{(s,y)\in\tau} W(s, y) \ge W^1. \tag{P3}$$

Step 3. Recall that a selective response strategy is a vector that specifies the portion of queries Gen AI should answer in each round. We denote by $\pi_x$ the policy that follows scheme $x$; that is, $\pi_x$ assigns the action $x_t$ to all states at round $t$. Formally, $\pi_x(s^d_t) = x_t$ for every $t \in [T]$ and $d \in \{0, \varepsilon, \dots, T\}$.

We now introduce some notation that we use in this step. First, we denote by $U_t(x)$ and $W_t(x)$ the accumulated revenue and welfare from round $t$ until $T$, following scheme $x$. Formally,

$$U_t(x) = r\big(x_tq(D_t(x), x_t)\big) + U_{t+1}(x), \qquad W_t(x) = x_tq(D_t(x), x_t)\big(a(D_t(x)) - w_s\big) + W_{t+1}(x).$$

We now define the analogues of $U_t$ and $W_t$ in our MDP. Let $V^G(\pi, s)$ denote the sum of rewards with respect to reward function $R$, following policy $\pi$ and starting at state $s$ in our MDP. Similarly, denote by $V^W(\pi, s)$ the sum of rewards with respect to reward function $W$. Formally,

$$V^G(\pi, s^d_t) = R(s^d_t, \pi(s^d_t)) + V^G(\pi, s^{d'}_{t+1}), \qquad V^W(\pi, s^d_t) = W(s^d_t, \pi(s^d_t)) + V^W(\pi, s^{d'}_{t+1}),$$

where $s^{d'}_{t+1}$ is the (deterministic) successor state. We are now ready to compare the values of the revenue and social welfare following a given selective response strategy to those from the MDP. Let $x$ be an arbitrary selective response strategy, and let $M = \max\{1, L_r\}$. We use the following lemma.

Lemma C.11. Fix a round $t \in [T]$. Then, for every $d \in \{0, \varepsilon, \dots, T\}$ such that $|d - D_t(x)| < (t-1)\varepsilon$, it holds that

$$\big|V^G(\pi_x, s^d_t) - U_t(x)\big| \le \varepsilon M\sum_{i=t}^{T} i, \qquad \big|V^W(\pi_x, s^d_t) - W_t(x)\big| \le \varepsilon(L_a + 1)\sum_{i=t}^{T} i.$$

Notice that $\sum_{i=1}^{T} i < T^2$, and therefore we can simplify the summations in Lemma C.11. Given the optimal selective response strategy $x^\star$, Lemma C.11 suggests that

$$\big|V^G(\pi_{x^\star}, s^0_1) - U_1(x^\star)\big| \le \varepsilon MT^2, \qquad \big|V^W(\pi_{x^\star}, s^0_1) - W_1(x^\star)\big| \le \varepsilon(L_a+1)T^2.$$

Having finished Step 3, we move on to develop the machinery to find the selective response strategy that gives the guarantees of our theorem.

Step 4. We recall the Weight-Constrained Shortest Path problem (WCSPP) (Garey & Johnson, 1979). Given a weighted graph $G = (V, E)$ with weights $\{w_e\}_{e\in E}$, costs $\{c_e\}_{e\in E}$, and a maximal weight $\mathcal{W} \in \mathbb{R}$, the problem is to find the path with the least cost while keeping the total weight below $\mathcal{W}$. Letting $\tau$ denote a path, the WCSPP is defined as

$$\min_\tau \sum_{e\in\tau} c_e \quad \text{s.t.} \quad \sum_{e\in\tau} w_e \le \mathcal{W}. \tag{P4}$$

Problem (P3) can be seen as an instance of Problem (P4) by setting

$$c(s, y) = R(s, y), \qquad w(s, y) = W(s, y).$$

To account for the approximation error in the welfare due to calculating it using the MDP, we choose $\mathcal{W} = W^1 - \varepsilon T^2(L_a + 1)$. Problem (P4) is a known NP-hard problem, with a reduction to the PARS-MDP problem (Ben-Porat et al., 2024) with a deterministic transition function. The PARS-MDP problem is defined over an MDP with two reward functions $R_A, R_P: S \times A \to \mathbb{R}_{\ge 0}$ and a budget $B \in \mathbb{R}_{\ge 0}$. The goal is to construct a new reward function $R_B: S \times A \to \mathbb{R}_{\ge 0}$ such that the total added reward over the whole MDP is at most $B$, and the induced policy that maximizes $R_A + R_B$ also maximizes $R_P$ under the constraint. Formally, the PARS-MDP problem is defined as follows:

$$\max_{R_B}\; V(\pi, R_P) \quad \text{s.t.} \quad \sum_{s\in S,\, y\in A} R_B(s, y) \le B, \quad R_B(s,y) \ge 0 \;\; \forall s\in S,\, y\in A(s), \quad \pi \in \arg\max_{\pi'} V(\pi', R_A + R_B), \tag{P5}$$

where $V(\pi, R_P)$ is the total sum of rewards from $R_P$ following policy $\pi$. Therefore, we make the following definitions to represent Problem (P3) as an instance of Problem (P5). First, denote by $\tau^A$ the path that maximizes $R_A$, i.e., $\tau^A \in \arg\max_\tau \sum_{(s,y)\in\tau} R_A(s,y)$. Notice that $\tau^A$ can be computed using standard methods, which run in polynomial time with respect to the problem's parameters. Thus, we refer to $\tau^A$ as a known parameter and define the parameters of the PARS-MDP as follows:

$$R_P(s,y) = R(s,y), \qquad R_A(s,y) = W(s,y), \qquad B = \sum_{(s,y)\in\tau^A} R_A(s,y) - \big(W^1 - \varepsilon T^2(L_a+1)\big).$$

Notice that $W$ and $R$ are in increments of $\varepsilon$ by the construction of our MDP. Therefore, we can use Theorem 5 of (Ben-Porat et al., 2024) to show that the optimal path of Problem (P3) can be found in polynomial time with respect to the problem's parameters.

Theorem C.12. There is a known algorithm that computes a path $\tau'$ which induces

$$\sum_{(s,y)\in\tau'} R_P(s,y) = \max_\tau \sum_{(s,y)\in\tau} R_P(s,y) \quad \text{s.t.} \quad \sum_{(s,y)\in\tau'} R_A(s,y) \ge \sum_{(s,y)\in\tau^A} R_A(s,y) - B,$$

in time $O\!\left(\frac{|S||A|T}{\varepsilon}\log\frac{|A|T}{\varepsilon}\right)$.

Using the terms of our MDP, the solution from the algorithm in Theorem C.12 guarantees

$$\sum_{(s,y)\in\tau'} R(s,y) = \max_\tau \sum_{(s,y)\in\tau} R(s,y) \quad \text{s.t.} \quad \sum_{(s,y)\in\tau'} W(s,y) \ge W^1 - \varepsilon T^2(L_a+1).$$

Let $\tau^\star$ be the path corresponding to $x^\star$. Notice that $\tau^\star$ guarantees

$$\sum_{(s,y)\in\tau^\star} W(s,y) \ge W_1(x^\star) - \varepsilon T^2(L_a+1) \ge W^1 - \varepsilon T^2(L_a+1).$$

The path $\tau^\star$ is thus a feasible solution of the PARS-MDP, and therefore, by Theorem C.12, the path $\tau'$ guarantees

$$\sum_{(s,y)\in\tau'} R(s,y) \ge \sum_{(s,y)\in\tau^\star} R(s,y) \ge U(x^\star) - \varepsilon MT^2, \qquad \sum_{(s,y)\in\tau'} W(s,y) \ge W^1 - \varepsilon T^2(L_a+1).$$

Step 5. Let $\tilde{x}$ be the selective response strategy corresponding to the path $\tau'$; we compare the revenue and welfare when playing $\tilde{x}$. We begin with the following lemma.

Lemma C.13. Fix a scheme $x$, and let $(d_t)_{t=1}^{T}$ be the sequence defined by $d_1 = 0$ and $d_{t+1} = f^\varepsilon(d_t, x_t)$. Then, for every $t \in [T]$ it holds that $d_t \le D_t(x)$.
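Theorem C.12 invokes the polynomial-time algorithm of Ben-Porat et al. (2024), which we do not reproduce here. For small instances, however, the constrained problem (P3) can be solved by direct enumeration of paths, which is useful as a reference implementation. The sketch below is such a brute-force stand-in, ours rather than the cited algorithm, using the same illustrative primitives as the previous snippet.

```python
import numpy as np
from itertools import product

beta, w_s, eps, T = 3.0, 0.5, 0.5, 4
A = [0.0, 0.5, 1.0]
a = lambda D: 1 - np.exp(-0.5 * D)
r = lambda p: p
q = lambda d, y: 1.0 / (1.0 + np.exp(beta * (w_s - y * a(d))))
floor_eps = lambda v: np.floor(v / eps) * eps

def constrained_best(W1):
    """Brute force over all |A|^T paths: max total R subject to total W >= W1."""
    best, best_x = -np.inf, None
    for x in product(A, repeat=T):
        d, tot_R, tot_W = 0.0, 0.0, 0.0
        for y in x:
            p = y * q(d, y)
            tot_R += floor_eps(r(p))             # reward R(s, y)
            tot_W += floor_eps(p * (a(d) - w_s)) # reward W(s, y)
            d = floor_eps(d + 1.0 - p)           # deterministic transition
        if tot_W >= W1 and tot_R > best:
            best, best_x = tot_R, x
    return best, best_x

print(constrained_best(W1=-0.5))
```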
Let $(d_t)_{t=1}^{T}$ be the sequence defined by $d_1 = 0$ and $d_{t+1} = f^\varepsilon(d_t, \tilde{x}_t)$. Then, by Lemmas C.13 and C.2 and the monotonicity of $r$, we get that

$$V^G(\pi_{\tilde{x}}, s^0_1) = \sum_{t=1}^{T} \big\lfloor r\big(\tilde{x}_tq(d_t, \tilde{x}_t)\big)\big\rfloor_\varepsilon \le \sum_{t=1}^{T} r\big(\tilde{x}_tq(d_t, \tilde{x}_t)\big) \le \sum_{t=1}^{T} r\big(\tilde{x}_tq(D_t(\tilde{x}), \tilde{x}_t)\big) = \sum_{t=1}^{T} r(p_t(\tilde{x})) = U(\tilde{x}).$$

Therefore, it holds that $U(\tilde{x}) \ge V^G(\pi_{\tilde{x}}, s^0_1) \ge U(x^\star) - \varepsilon MT^2$. We move on to evaluate the welfare. Notice that the welfare is not monotonic in $p_t$; hence, instead of our previous technique, we use Lemma C.11 and get that

$$W_1(\tilde{x}) \ge V^W(\pi_{\tilde{x}}, s^0_1) - \varepsilon T^2(L_a+1) \ge W^1 - \varepsilon T^2(L_a+1) - \varepsilon T^2(L_a+1) = W^1 - 2\varepsilon T^2(L_a+1).$$

This completes the proof of Theorem 4.7.

Proof of Lemma C.11. We begin by showing that there cannot be a large gap between the data accumulated in the original problem and the data according to our MDP when following the same scheme. Let $\Delta_t = (t-1)\varepsilon$, and let

$$f^\varepsilon(d, y) = \lfloor f(d,y)\rfloor_\varepsilon = \big\lfloor d + 1 - yq(d,y)\big\rfloor_\varepsilon$$

be the data in the next round, given that in the current round Gen AI started with $d$ data and played $y$. We use the following lemma to show that the accumulated data in the MDP cannot be too far from that in our original problem.

Lemma C.14. Let $d \in \{0, \varepsilon, \dots, T\}$ and fix a round $t \in [T]$. If $|d - D_t(x)| < \Delta_t$ then $|f^\varepsilon(d, x_t) - D_{t+1}(x)| < \Delta_{t+1}$.

We now use backward induction to prove our lemma, starting at round $T$. Let $d \in \{0, \varepsilon, \dots, T\}$ such that $|d - D_T(x)| < \Delta_T$, and we use the following lemma:

Lemma C.15. Let $d \in \{0, \varepsilon, \dots, T\}$ and fix a round $t \in [T]$. If $|d - D_t(x)| < \Delta_t$ then it holds that $\big|\lfloor r(x_tq(d, x_t))\rfloor_\varepsilon - r(x_tq(D_t(x), x_t))\big| \le M\Delta_{t+1}$.

Therefore, by Lemma C.15 we get that

$$\big|V^G(\pi_x, s^d_T) - U_T(x)\big| = \big|\lfloor r(x_Tq(d, x_T))\rfloor_\varepsilon - r(x_Tq(D_T(x), x_T))\big| \le M\Delta_{T+1}.$$

The base case for $V^W$ and $W_T$ is similar; for it, we use the following lemma.

Lemma C.16. Let $d_1, d_2 \in \mathbb{R}_{\ge 0}$ and $y \in [0,1]$. Then it holds that

$$\big|yq(d_1,y)\big(a(d_1) - w_s\big) - yq(d_2,y)\big(a(d_2) - w_s\big)\big| \le |d_1 - d_2|(L_a + 1).$$

Therefore, it holds that

$$\big|V^W(\pi_x, s^d_T) - W_T(x)\big| = \big|\lfloor x_Tq(d, x_T)(a(d) - w_s)\rfloor_\varepsilon - x_Tq(D_T(x), x_T)(a(D_T(x)) - w_s)\big|$$
$$\le \big|x_Tq(d, x_T)(a(d) - w_s) - x_Tq(D_T(x), x_T)(a(D_T(x)) - w_s)\big| + \varepsilon \le |d - D_T(x)|(L_a+1) + \varepsilon \le \Delta_T(L_a+1) + \varepsilon \le \Delta_{T+1}(L_a+1).$$

We are done with the base case and continue to the induction step. Assume the lemma holds for round $t+1$; we prove it for round $t$. We start with the revenue at round $t$. Let $d \in \{0, \varepsilon, \dots, T\}$ and denote $d' = f^\varepsilon(d, x_t)$. Then, for every $d$ such that $|d - D_t(x)| < \Delta_t$, it holds that

$$\big|V^G(\pi_x, s^d_t) - U_t(x)\big| = \big|\lfloor r(x_tq(d, x_t))\rfloor_\varepsilon + V^G(\pi_x, s^{d'}_{t+1}) - r(x_tq(D_t(x), x_t)) - U_{t+1}(x)\big|$$
$$\le \big|\lfloor r(x_tq(d, x_t))\rfloor_\varepsilon - r(x_tq(D_t(x), x_t))\big| + \big|V^G(\pi_x, s^{d'}_{t+1}) - U_{t+1}(x)\big|.$$

We use Lemma C.15 to bound the first term. Furthermore, notice that, according to Lemma C.14, the state $s^{d'}_{t+1}$ satisfies the condition of the induction step. Therefore,

$$\big|V^G(\pi_x, s^d_t) - U_t(x)\big| \le M\Delta_{t+1} + \sum_{i=t+2}^{T+1} M\Delta_i = M\sum_{i=t+1}^{T+1}\Delta_i = \varepsilon M\sum_{i=t}^{T} i.$$

We perform a similar calculation for the welfare:

$$\big|V^W(\pi_x, s^d_t) - W_t(x)\big| \le \big|\lfloor x_tq(d,x_t)(a(d)-w_s)\rfloor_\varepsilon - x_tq(D_t(x),x_t)(a(D_t(x))-w_s)\big| + \big|V^W(\pi_x, s^{d'}_{t+1}) - W_{t+1}(x)\big|$$
$$\le \Delta_{t+1}(L_a+1) + (L_a+1)\sum_{i=t+2}^{T+1}\Delta_i = (L_a+1)\varepsilon\sum_{i=t}^{T} i.$$

This completes the proof of Lemma C.11.

Proof of Lemma C.14. This is a special case of Lemma C.7 and is hence omitted.

Proof of Lemma C.15. By Lemma C.6, it holds that

$$\big|\lfloor r(x_tq(d, x_t))\rfloor_\varepsilon - r(x_tq(D_t(x), x_t))\big| \le \big|r(x_tq(d, x_t)) - r(x_tq(D_t(x), x_t))\big| + \varepsilon$$
$$\le L_r\big|x_tq(d, x_t) - x_tq(D_t(x), x_t)\big| + \varepsilon \le L_r|d - D_t(x)| + \varepsilon \le \Delta_{t+1}\max\{L_r, 1\}.$$

This completes the proof of Lemma C.15.
Proof of Lemma C.16. We start from the definition:

$$\big|yq(d_1,y)(a(d_1)-w_s) - yq(d_2,y)(a(d_2)-w_s)\big| = \big|y\big(q(d_1,y) - q(d_2,y) + q(d_2,y)\big)(a(d_1)-w_s) - yq(d_2,y)(a(d_2)-w_s)\big|$$
$$\le \big|yq(d_2,y)(a(d_1)-w_s) - yq(d_2,y)(a(d_2)-w_s)\big| + y\big|q(d_1,y)-q(d_2,y)\big|\,\big|a(d_1)-w_s\big|$$
$$= yq(d_2,y)\big|a(d_1) - a(d_2)\big| + y\big|q(d_1,y)-q(d_2,y)\big|\,\big|a(d_1)-w_s\big|.$$

We use Lemma C.6, and therefore

$$\big|yq(d_1,y)(a(d_1)-w_s) - yq(d_2,y)(a(d_2)-w_s)\big| \le yq(d_2,y)\big|a(d_1)-a(d_2)\big| + |d_1-d_2|\,\big|a(d_1)-w_s\big| \le yq(d_2,y)L_a|d_1-d_2| + |d_1-d_2|\,\big|a(d_1)-w_s\big|.$$

Notice that $y, q(d_2,y), a(d_1), w_s \le 1$, and thus we get that

$$\big|yq(d_1,y)(a(d_1)-w_s) - yq(d_2,y)(a(d_2)-w_s)\big| \le L_a|d_1-d_2| + |d_1-d_2| = (L_a+1)|d_1-d_2|.$$

This completes the proof of Lemma C.16.

Proof of Lemma C.13. This is a special case of Lemma C.4 and is hence omitted.

D. Proofs Omitted from Section 5

Proof of Theorem 5.1. We define

$$h(y, x) = x\frac{e^{\beta xy}}{e^{\beta xy} + e^{\beta w_s}}\,y + \left(1 - x\frac{e^{\beta xy}}{e^{\beta xy} + e^{\beta w_s}}\right)w_s,$$

and observe that $w_t(x) = h(a(D_t(x)), x_t)$. Throughout, we write $\sigma = \frac{1}{1+e^{\beta(w_s - xy)}}$, so that $1-\sigma = \frac{1}{1+e^{-\beta(w_s-xy)}}$. We analyze each property separately.

1. Suppose $w^g_t(x) \ge C$. From Proposition C.1, for every $t > \tau$ it holds that $D_t(x^\tau) > D_t(x)$, and therefore $w^g_t(x^\tau) \ge w^g_t(x) \ge C$. Notice that

$$\frac{\partial h(y,x)}{\partial y} = x^2\beta\,\sigma(1-\sigma)(y - w_s) + x\sigma \ge x^2\beta\,\sigma(1-\sigma)(xy - w_s) + x^2\sigma = x^2\big(\beta\sigma(1-\sigma)(xy - w_s) + \sigma\big).$$

Therefore, for every $x > 0$, if $\beta\sigma(1-\sigma)(xy - w_s) + \sigma > 0$ then $\frac{\partial h(y,x)}{\partial y} > 0$. Next, we define the auxiliary function

$$g(y, x) = x\frac{e^{\beta y}}{e^{\beta y} + e^{\beta w_s}}\,y + \left(1 - x\frac{e^{\beta y}}{e^{\beta y} + e^{\beta w_s}}\right)w_s,$$

and notice that

$$\frac{\partial g(y,x)}{\partial y} = x\beta\frac{1}{1+e^{\beta(w_s-y)}}\frac{1}{1+e^{-\beta(w_s-y)}}(y-w_s) + x\frac{1}{1+e^{\beta(w_s-y)}}.$$

Therefore, for every $x > 0$ it holds that

$$\mathrm{sign}\!\left(\frac{\partial g(y,x)}{\partial y}\right) = \mathrm{sign}\!\left(\beta\frac{1}{1+e^{\beta(w_s-y)}}\frac{1}{1+e^{-\beta(w_s-y)}}(y-w_s) + \frac{1}{1+e^{\beta(w_s-y)}}\right),$$

and, by the definition of $g(y,x)$, for every $x, y$ with $y' = xy$: if $\frac{\partial g(y',x)}{\partial y} > 0$ then $\frac{\partial h(y,x)}{\partial y} > 0$. Now, we use the following lemma:

Lemma D.1. For every $x > 0$ it holds that $\mathrm{sign}\!\left(\frac{\partial g(y,x)}{\partial y}\right) = \mathrm{sign}(y - C)$.

Thus, according to Lemma D.1, if $w^g_t(x) > C$ then $h(a(D_t(x^\tau)), x^\tau_t) > h(a(D_t(x)), x_t)$, and equivalently $w_t(x^\tau) > w_t(x)$.

2. For the next two properties, we show that there exists $z_0 \in \mathbb{R}$ such that for every $x > 0$, the derivative $\frac{\partial h(y,x)}{\partial y}$ is negative whenever $xy < z_0$. First, we rewrite $\frac{\partial h(y,x)}{\partial y}$:

$$\frac{\partial h(y,x)}{\partial y} = x\sigma\big(1 + x\beta(1-\sigma)(y - w_s)\big) < x\sigma\big(1 + x\beta(1-\sigma)\,y\big) = x\sigma\big(1 + \beta(1-\sigma)\,xy\big).$$

Notice that $x\sigma > 0$, and therefore, if $1 + \beta(1-\sigma)\,xy < 0$ then $\frac{\partial h(y,x)}{\partial y} < 0$. Next, observe that

$$\lim_{y\to-\infty}\big(1 + \beta(1-\sigma)\,xy\big) = \lim_{y\to-\infty}\big(1 + \beta xy\big) = -\infty.$$

Since $\frac{\partial h(y,x)}{\partial y}$ is continuous in $y$, and the bounding expression above is continuous and depends on $x$ and $y$ only through the product $xy$, there exists $z_0$ such that the expression is negative whenever $xy \le z_0$. Next, we use the following observation:

Observation D.2. For every $x > 0$ and $y < \frac{z_0}{x}$ it holds that $\frac{\partial h(y,x)}{\partial y} < 0$.

We denote $\tilde{C} = \frac{z_0}{x}$. Using Observation D.2, if $w^g_t(x^\tau) = a(D_t(x^\tau)) < \tilde{C}$, then $h(a(D_t(x^\tau)), x^\tau_t) < h(a(D_t(x)), x_t)$.

3. The result $h(a(D_t(x^\tau)), x^\tau_\tau) > h(a(D_t(x)), x_\tau)$ follows immediately from the previous argument.

This completes the proof of Theorem 5.1.

Proof of Lemma D.1. Fix $x \in [0,1]$ and denote $\tilde{q}(y) = \frac{e^{\beta y}}{e^{\beta y} + e^{\beta w_s}}$. Therefore, $g(y)$ can be written as $g(y) = x\tilde{q}(y)\,y + \big(1 - x\tilde{q}(y)\big)w_s$. The derivative of $g(y)$ is

$$\frac{dg}{dy} = x\frac{d\tilde{q}(y)}{dy}\,y + x\tilde{q}(y) - x\frac{d\tilde{q}(y)}{dy}\,w_s = x\frac{d\tilde{q}(y)}{dy}(y - w_s) + x\tilde{q}(y). \tag{5}$$
Notice that $\tilde{q}(y)$ is a sigmoid function, and therefore $\frac{d\tilde{q}(y)}{dy} = \beta\tilde{q}(y)\big(1-\tilde{q}(y)\big)$. Plugging this result into Equation (5) gives

$$\frac{dg}{dy} = x\beta\tilde{q}(y)\big(1-\tilde{q}(y)\big)(y - w_s) + x\tilde{q}(y).$$

Next, notice that $\tilde{q}(y) = \frac{e^{\beta y}}{e^{\beta y}+e^{\beta w_s}} = \frac{1}{1+e^{\beta(w_s-y)}}$. We denote $z = \beta(y - w_s)$ and get

$$\frac{dg}{dy} = x\frac{1}{1+e^{-z}}\left(\frac{z}{1+e^{z}} + 1\right) = x\frac{1}{1+e^{-z}}\cdot\frac{z + 1 + e^{z}}{1+e^{z}} = x\frac{1}{1+e^{-z}}\cdot\frac{e^{z+1}}{1+e^{z}}\Big((z+1)e^{-(z+1)} + e^{-1}\Big).$$

Therefore, finding the $y_0$ for which $\frac{dg}{dy}\big|_{y=y_0} = 0$ is equivalent to solving $(z+1)e^{-(z+1)} + e^{-1} = 0$. Denote $\tilde{z} = -(z+1)$; then the equation becomes $\tilde{z}e^{\tilde{z}} = e^{-1}$, i.e., the defining relation of the Lambert $W$ function. Therefore $\tilde{z} = W(e^{-1})$, which leads to $z_0 = -W(e^{-1}) - 1$ and

$$y_0 = w_s - \frac{W(e^{-1})+1}{\beta} = C.$$

Next, denote $\tilde{h}(z) = (z+1)e^{-(z+1)} + e^{-1}$, and notice that the sign of $\frac{dg}{dy}$ is determined by the sign of $\tilde{h}(z)$; that is, $\mathrm{sign}\big(\frac{dg}{dy}\big) = \mathrm{sign}(\tilde{h}(z))$. The derivative of $\tilde{h}(z)$ is given by

$$\frac{d\tilde{h}}{dz} = e^{-(z+1)} - (z+1)e^{-(z+1)} = \big(1 - (z+1)\big)e^{-(z+1)} = -ze^{-(z+1)}.$$

Therefore, $\tilde{h}(z)$ is an increasing function for $z < 0$ and a decreasing function for $z > 0$. Recall that $\tilde{h}(z_0) = 0$ and $z_0 < -1 < 0$; thus $\tilde{h}(z) < 0$ for every $z < z_0$. Furthermore, $\tilde{h}(z)$ is increasing on $[z_0, 0)$, and therefore $\tilde{h}(z) > 0$ for every $z \in (z_0, 0)$. Lastly, notice that for every $z > 0$ it holds that $z+1 > 0$ and $e^{-(z+1)} > 0$, so $\tilde{h}(z) > 0$; as such, we can summarize that $\tilde{h}(z) > 0$ for every $z > z_0$. This completes the proof of Lemma D.1.
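The threshold $C$ thus admits a closed form via the Lambert $W$ function, which is easy to evaluate numerically. The sketch below (ours; `scipy.special.lambertw` is the standard implementation) computes $z_0$ and $C$ for illustrative parameter values, and verifies both the root equation and the sign pattern of $\tilde{h}$ established above.

```python
import numpy as np
from scipy.special import lambertw

beta, w_s = 3.0, 0.5                      # illustrative parameters
w = lambertw(np.exp(-1)).real             # W(e^{-1}) ~ 0.2785
z0 = -w - 1                               # root of (z+1)e^{-(z+1)} + e^{-1} = 0
C = w_s - (w + 1) / beta                  # C = w_s - (W(e^{-1}) + 1)/beta

h_tilde = lambda z: (z + 1) * np.exp(-(z + 1)) + np.exp(-1)
assert abs(h_tilde(z0)) < 1e-12           # z0 is indeed the root
zs = np.linspace(z0 - 5, z0 + 5, 10_001)
assert np.all(h_tilde(zs[zs < z0 - 1e-6]) < 0)   # h_tilde negative below z0
assert np.all(h_tilde(zs[zs > z0 + 1e-6]) > 0)   # h_tilde positive above z0
print(f"z0 = {z0:.4f}, C = {C:.4f}")
```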
Proof of Observation D.2. Let $y' = \frac{z_0}{x}$. First, observe that if $\frac{\partial h(y',x)}{\partial y} < 0$, then it must hold that $y' < w_s$. Furthermore, for any $y_1 < y' < w_s$, we get that

$$0 > 1 + x\beta\frac{1}{1+e^{-\beta(w_s-xy')}}(y'-w_s) > 1 + x\beta\frac{1}{1+e^{-\beta(w_s-xy')}}(y_1-w_s) > 1 + x\beta\frac{1}{1+e^{-\beta(w_s-xy_1)}}(y_1-w_s);$$

hence $\frac{\partial h(y_1,x)}{\partial y} < 0$ as well. This completes the proof of Observation D.2.

E. Proofs Omitted from Section 6

E.1. Proofs Omitted from Subsection 6.2

Proof of Corollary 6.2. First, notice that for every $t \le \tau$ it holds that $D_t(x) = D_t(x^\tau)$. Next, from Lemma C.8 it holds that for every $t > \tau$, the data satisfies

$$\big|D_t(x) - D_t(x^\tau)\big| \le \big|D_{\tau+1}(x) - D_{\tau+1}(x^\tau)\big| = \big|p_\tau(x) - p_\tau(x^\tau)\big|.$$

Therefore, we can bound the revenue:

$$U(x^\tau) - U(x) \le \gamma^{\tau-1}\big(r(p_\tau(x^\tau)) - r(p_\tau(x))\big) + L_r\sum_{t=\tau+1}^{T}\gamma^{t-1}\big(p_t(x^\tau) - p_t(x)\big).$$

By Lemma C.6, we get that

$$U(x^\tau) - U(x) \le \gamma^{\tau-1}\big(r(p_\tau(x^\tau)) - r(p_\tau(x))\big) + L_r\sum_{t=\tau+1}^{T}\gamma^{t-1}\big(D_t(x^\tau) - D_t(x)\big)$$
$$\le \gamma^{\tau-1}\big(r(p_\tau(x^\tau)) - r(p_\tau(x))\big) + \frac{L_r\gamma^{\tau}}{1-\gamma}\big|p_\tau(x^\tau) - p_\tau(x)\big|.$$

This completes the proof of Corollary 6.2.

Theorem E.1. Let $x' = \min\{x_t \mid t > \tau,\, x_t > 0\}$ and $k = \frac{\beta\min_{D\in[0,T]}\frac{da(D)}{dD}}{4(1+e^{\beta w_s})^2}$. If $\beta L_a \le 1$, then

$$U(x^\tau) - U(x) < \gamma^{\tau-1}\big(r(p_\tau(x^\tau)) - r(p_\tau(x))\big) + \frac{L_r\gamma^{\tau}}{1-\gamma}\big|p_\tau(x^\tau) - p_\tau(x)\big|\big(1 - kx'^2\big).$$

Proof of Theorem E.1. By definition, we get that

$$U(x^\tau) - U(x) = \sum_{t=1}^{T}\gamma^{t-1}\big(r(p_t(x^\tau)) - r(p_t(x))\big) = \gamma^{\tau-1}\big(r(p_\tau(x^\tau)) - r(p_\tau(x))\big) + \sum_{t=\tau+1}^{T}\gamma^{t-1}\big(r(p_t(x^\tau)) - r(p_t(x))\big)$$
$$\le \gamma^{\tau-1}\big(r(p_\tau(x^\tau)) - r(p_\tau(x))\big) + L_r\sum_{t=\tau+1}^{T}\gamma^{t-1}\big(p_t(x^\tau) - p_t(x)\big).$$

Next, we use the following lemma to obtain an upper bound on $p_t(x^\tau) - p_t(x)$.

Lemma E.2. For every $t > \tau$ it holds that

$$0 \le p_t(x^\tau) - p_t(x) \le \big|p_\tau(x) - p_\tau(x^\tau)\big|\prod_{t'=\tau+1}^{t-1}\big(1 - kx_{t'}^2\big).$$

Therefore, according to Lemma E.2, it holds that

$$U(x^\tau) - U(x) \le \gamma^{\tau-1}\big(r(p_\tau(x^\tau)) - r(p_\tau(x))\big) + \big|p_\tau(x) - p_\tau(x^\tau)\big|\,L_r\sum_{t=\tau+1}^{T}\gamma^{t-1}\prod_{t'=\tau+1}^{t-1}\big(1 - kx_{t'}^2\big).$$

We now simplify the second term using the following lemma.

Lemma E.3. It holds that

$$\sum_{t=\tau+1}^{T}\gamma^{t-1}\prod_{t'=\tau+1}^{t-1}\big(1 - kx_{t'}^2\big) \le \frac{\gamma^{\tau}}{1-\gamma}\big(1 - kx'^2\big).$$

Therefore, we conclude that

$$U(x^\tau) - U(x) \le \gamma^{\tau-1}\big(r(p_\tau(x^\tau)) - r(p_\tau(x))\big) + \big|p_\tau(x) - p_\tau(x^\tau)\big|\,\frac{L_r\gamma^{\tau}}{1-\gamma}\big(1 - kx'^2\big).$$

This completes the proof of Theorem E.1.

Proof of Lemma E.2. We start with the left inequality. From Theorem 4.1, it holds that $p_t(x^\tau) \ge p_t(x)$ for every $t > \tau$. We move on to the right inequality. For that, we use Lemma C.6 and get that $p_t(x^\tau) - p_t(x) < D_t(x^\tau) - D_t(x)$. Next, we couple it with the following lemma.

Lemma E.4. For every $t > \tau$ it holds that

$$0 < D_t(x^\tau) - D_t(x) \le \big|p_\tau(x) - p_\tau(x^\tau)\big|\prod_{t'=\tau+1}^{t-1}\big(1 - kx_{t'}^2\big).$$

Therefore, we conclude that

$$p_t(x^\tau) - p_t(x) < D_t(x^\tau) - D_t(x) \le \big|p_\tau(x) - p_\tau(x^\tau)\big|\prod_{t'=\tau+1}^{t-1}\big(1 - kx_{t'}^2\big).$$

This completes the proof of Lemma E.2.

Proof of Lemma E.7. We expand according to the definition:

$$q(D_1,x_1) - q(D_2,x_2) = \frac{e^{\beta a(D_1)x_1}}{e^{\beta a(D_1)x_1}+e^{\beta w_s}} - \frac{e^{\beta a(D_2)x_2}}{e^{\beta a(D_2)x_2}+e^{\beta w_s}}$$
$$= \frac{e^{\beta a(D_1)x_1}\big(e^{\beta a(D_2)x_2}+e^{\beta w_s}\big) - e^{\beta a(D_2)x_2}\big(e^{\beta a(D_1)x_1}+e^{\beta w_s}\big)}{\big(e^{\beta a(D_1)x_1}+e^{\beta w_s}\big)\big(e^{\beta a(D_2)x_2}+e^{\beta w_s}\big)} = \frac{e^{\beta w_s}\big(e^{\beta a(D_1)x_1} - e^{\beta a(D_2)x_2}\big)}{\big(e^{\beta a(D_1)x_1}+e^{\beta w_s}\big)\big(e^{\beta a(D_2)x_2}+e^{\beta w_s}\big)}$$
$$= \frac{e^{\beta w_s}e^{\beta a(D_1)x_1}\big(1 - e^{\beta(x_2a(D_2) - x_1a(D_1))}\big)}{\big(e^{\beta a(D_1)x_1}+e^{\beta w_s}\big)\big(e^{\beta a(D_2)x_2}+e^{\beta w_s}\big)} = q(D_1,x_1)\big(1 - q(D_2,x_2)\big)\big(1 - e^{\beta(x_2a(D_2)-x_1a(D_1))}\big).$$

This completes the proof of Lemma E.7.

Proof of Lemma E.4. We prove the claim by induction, starting with the base case at $t = \tau+1$. By definition,

$$\big|D_{\tau+1}(x) - D_{\tau+1}(x^\tau)\big| = \big|D_\tau(x) - p_\tau(x) - D_\tau(x^\tau) + p_\tau(x^\tau)\big|.$$

Since $D_\tau(x) = D_\tau(x^\tau)$, we get that $\big|D_{\tau+1}(x) - D_{\tau+1}(x^\tau)\big| = \big|p_\tau(x) - p_\tau(x^\tau)\big|$, which concludes the base case. Next, assume that the inequality holds for $t > \tau+1$, and we prove it for $t+1$. We use the following lemma:

Lemma E.5. For every $t > \tau+1$ it holds that

$$\big|D_{t+1}(x) - D_{t+1}(x^\tau)\big| \le \big(1 - kx_t^2\big)\big|D_t(x) - D_t(x^\tau)\big|.$$

We plug the induction hypothesis into the inequality of Lemma E.5 and get

$$\big|D_{t+1}(x) - D_{t+1}(x^\tau)\big| \le \big(1 - kx_t^2\big)\big|D_t(x) - D_t(x^\tau)\big| \le \big|p_\tau(x) - p_\tau(x^\tau)\big|\prod_{t'=\tau+1}^{t}\big(1 - kx_{t'}^2\big).$$

This completes the proof of Lemma E.4.

Proof of Lemma E.5. By definition,

$$\big|D_{t+1}(x) - D_{t+1}(x^\tau)\big| = \big|D_t(x) - D_t(x^\tau) + p_t(x^\tau) - p_t(x)\big|.$$

Since $y < x_\tau$, from Theorem 4.1 it holds that $D_t(x^\tau) > D_t(x)$ and $p_t(x^\tau) \ge p_t(x)$ for every $t > \tau$. Next, we get an upper bound using the following lemma, which provides a lower bound on the difference in proportions.

Lemma E.6. For every $t > \tau$, it holds that

$$q(D_t(x^\tau), x_t) - q(D_t(x), x_t) \ge \frac{\bar{q}^2}{4}x_t\beta L_a\big(D_t(x^\tau) - D_t(x)\big),$$

where $\bar{q}$ is a lower bound satisfying $q(D,x),\, 1-q(D,x) \ge \bar{q}$ for every $D \in [0,T]$ and $x \in [0,1]$.

Using Lemma E.6 and $p_t = x_t q(D_t, x_t)$, we get that

$$\big|D_{t+1}(x) - D_{t+1}(x^\tau)\big| = D_t(x^\tau) - D_t(x) + p_t(x) - p_t(x^\tau) \le D_t(x^\tau) - D_t(x) - \frac{\bar{q}^2}{4}x_t^2\beta L_a\big(D_t(x^\tau) - D_t(x)\big)$$
$$\le \big(1 - kx_t^2\big)\big|D_t(x) - D_t(x^\tau)\big|.$$

This completes the proof of Lemma E.5.

Proof of Lemma E.6. From Theorem 4.1, for every $t > \tau$ it holds that $D_t(x^\tau) > D_t(x)$; therefore, we get that $a(D_t(x^\tau)) \ge a(D_t(x))$. Furthermore, from Lemma C.2 it holds that $q(D_t(x^\tau), x_t) \ge q(D_t(x), x_t)$. We use the following lemma to rewrite $q(D_t(x^\tau), x_t) - q(D_t(x), x_t)$:

Lemma E.7. For every $D_1, D_2 \in [0,T]$ and $x_1, x_2 \in [0,1]$ it holds that

$$q(D_1,x_1) - q(D_2,x_2) = q(D_1,x_1)\big(1 - q(D_2,x_2)\big)\big(1 - e^{\beta(x_2a(D_2)-x_1a(D_1))}\big).$$

Thus,

$$q(D_t(x^\tau), x_t) - q(D_t(x), x_t) = \big|q(D_t(x^\tau), x_t) - q(D_t(x), x_t)\big| = q(D_t(x^\tau), x_t)\big(1 - q(D_t(x), x_t)\big)\left|1 - e^{x_t\beta(a(D_t(x)) - a(D_t(x^\tau)))}\right|. \tag{6}$$

Notice that $q(D,x),\, 1-q(D,x) \ge \bar{q}$ for every $D \in [0,T]$ and $x \in [0,1]$. Furthermore, it holds that $|a(D_2) - a(D_1)| \le L_a|D_2 - D_1|$; therefore,

$$a(D_t(x)) - a(D_t(x^\tau)) = -\big|a(D_t(x)) - a(D_t(x^\tau))\big| \ge -L_a\big(D_t(x^\tau) - D_t(x)\big).$$

Since $-L_a\big(D_t(x^\tau) - D_t(x)\big) \le 0$, it follows that

$$\left|1 - e^{x_t\beta(a(D_t(x)) - a(D_t(x^\tau)))}\right| \ge \left|1 - e^{-x_t\beta L_a(D_t(x^\tau) - D_t(x))}\right|.$$
Plugging everything into Equation (6) results in the following inequality:

$$q(D_t(x^\tau), x_t) - q(D_t(x), x_t) \ge \bar{q}^2\left|1 - e^{-x_t\beta L_a(D_t(x^\tau) - D_t(x))}\right|.$$

Next, we show that $x_t\beta L_a\big|D_t(x) - D_t(x^\tau)\big| \le 1$. For that, we use the following lemma.

Lemma E.8. For every $t > \tau$ it holds that $\big|D_t(x) - D_t(x^\tau)\big| \le 1$.

Therefore, we get that $x_t\beta L_a\big|D_t(x) - D_t(x^\tau)\big| \le x_t\beta L_a \le \beta L_a \le 1$. Thus, we can use the inequality $|1 - e^{-\alpha}| \ge \frac{|\alpha|}{4}$ for $|\alpha| \le 1$ and conclude that

$$q(D_t(x^\tau), x_t) - q(D_t(x), x_t) \ge \frac{\bar{q}^2}{4}x_t\beta L_a\big|D_t(x) - D_t(x^\tau)\big| = \frac{\bar{q}^2}{4}x_t\beta L_a\big(D_t(x^\tau) - D_t(x)\big).$$

This completes the proof of Lemma E.6.

Proof of Lemma E.8. By definition, we get that

$$\big|D_t(x) - D_t(x^\tau)\big| = \big|D_{t-1}(x) + \big(1 - p_{t-1}(x)\big) - D_{t-1}(x^\tau) - \big(1 - p_{t-1}(x^\tau)\big)\big| = D_{t-1}(x^\tau) - D_{t-1}(x) + p_{t-1}(x) - p_{t-1}(x^\tau).$$

Observe that the proportions satisfy $p_{t-1}(x) - p_{t-1}(x^\tau) \le 0$. Therefore,

$$\big|D_t(x) - D_t(x^\tau)\big| \le \big|D_{t-1}(x) - D_{t-1}(x^\tau)\big|.$$

Thus, by induction, it follows that

$$\big|D_t(x) - D_t(x^\tau)\big| \le \big|D_{\tau+1}(x) - D_{\tau+1}(x^\tau)\big| = \big|D_\tau(x) + (1 - p_\tau(x)) - D_\tau(x^\tau) - (1 - p_\tau(x^\tau))\big| = \big|p_\tau(x^\tau) - p_\tau(x)\big| \le 1.$$

This completes the proof of Lemma E.8.

Proof of Lemma E.3. Let $t^\star > \tau$ be the maximal round $t \in [T]$ such that $x_t = 0$. We split the sum at $t^\star$:

$$\sum_{t=\tau+1}^{T}\gamma^{t-1}\prod_{t'=\tau+1}^{t-1}\big(1-kx_{t'}^2\big) = \sum_{t=\tau+1}^{t^\star}\gamma^{t-1}\prod_{t'=\tau+1}^{t-1}\big(1-kx_{t'}^2\big) + \sum_{t=t^\star+1}^{T}\gamma^{t-1}\prod_{t'=\tau+1}^{t-1}\big(1-kx_{t'}^2\big).$$

We now focus on the second term: for every $t > t^\star + 1$, the product contains at least one factor with $x_{t'} > 0$, and every such factor satisfies $1 - kx_{t'}^2 \le 1 - kx'^2$. At this point, we iteratively apply the same argument while going backward: in each step, we take the latest remaining round $t$ such that $x_t = 0$ and apply the bound above to the first term. Ultimately, summing the resulting geometric series, we get that

$$\sum_{t=\tau+1}^{T}\gamma^{t-1}\prod_{t'=\tau+1}^{t-1}\big(1-kx_{t'}^2\big) \le \frac{\gamma^{\tau}}{1-\gamma}\big(1 - kx'^2\big).$$

This completes the proof of Lemma E.3.

F. Simulations

In this section, we provide an empirical demonstration of the effect of selective responses.

Experimental Model. We set our model with the following parameters: $T = 50$, $w_s = 0.5$, $a(D) = 1 - e^{-4D}$, and $r(p) = p^\alpha$ for $\alpha \in \mathbb{R}$. We show our results for varying values of $\beta$, $\gamma$, and $\alpha$. Ideally, we would compute the optimal selective response strategy for each set of model parameters, but doing so requires a fine discretization of the range $[0,1]$ for each selection of $x_t$. To ease the computation for large values of $T$, we choose the optimal selective response strategy within the set of cutoff strategies. We denote by $X_c$ the set of cutoff strategies, namely, the set of all strategies in which Gen AI does not answer until a specific round and fully responds thereafter. Formally, for every strategy $x \in X_c$ there exists $\tau \le T$ such that $x_t = 0$ for all $t < \tau$ and $x_t = 1$ for all $t \ge \tau$. For each set of model parameters, we calculate the optimal cutoff strategy that maximizes Gen AI's revenue, and plot the differences in Gen AI's revenue and the users' social welfare relative to the revenue and welfare induced by the full-response strategy.

Experiment Setup. We report the induced revenues and welfare from 60 instances. We used a standard PC with an Intel Core i7-9700K CPU and 16GB RAM for running the simulations. The entire execution took roughly one hour.
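The experiment is straightforward to re-implement. The following sketch (ours) scans all $T+1$ cutoff strategies for one parameter setting of the model described above and reports the revenue and welfare gaps relative to the full-response strategy; the single printed instance is illustrative and does not reproduce the paper's 60 instances.

```python
import numpy as np

def simulate(x, T, beta, gamma, alpha, w_s):
    """Roll out the dynamics for strategy x; return (revenue, welfare)."""
    a = lambda D: 1 - np.exp(-4 * D)
    D, rev, wel = 0.0, 0.0, 0.0
    for t in range(T):
        q = 1.0 / (1.0 + np.exp(beta * (w_s - x[t] * a(D))))
        p = x[t] * q
        rev += gamma**t * p**alpha                  # r(p) = p^alpha
        wel += p * a(D) + (1 - p) * w_s
        D += 1 - p                                  # Forum users generate data
    return rev, wel

T, beta, gamma, alpha, w_s = 50, 2.0, 0.95, 2.0, 0.5
full = [1.0] * T
cutoffs = [[0.0] * tau + [1.0] * (T - tau) for tau in range(T + 1)]
best = max(cutoffs, key=lambda x: simulate(x, T, beta, gamma, alpha, w_s)[0])
tau_best = sum(1 for v in best if v == 0.0)         # rounds of silence before responding
u_full, w_full = simulate(full, T, beta, gamma, alpha, w_s)
u_best, w_best = simulate(best, T, beta, gamma, alpha, w_s)
print(f"optimal cutoff: respond from round {tau_best + 1}")
print(f"revenue gain: {u_best - u_full:.4f}, welfare gain: {w_best - w_full:.4f}")
```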
[Figure 4: Revenue and welfare differences between Gen AI's optimal cutoff selective response strategy and the full-response strategy, as a function of the revenue scaling power $\alpha$ and the temperature $\beta$. Panels: (a) revenue difference, (b) welfare difference, (c) revenue difference, (d) welfare difference.]

F.1. Results

Figure 4 illustrates how Gen AI's revenue and the users' social welfare change with respect to the discount parameter $\gamma$, the revenue scaling power $\alpha$, and the sensitivity parameter $\beta$. Light colors indicate large differences in revenue and social welfare, while darker colors indicate smaller differences. Figures 4a and 4c show the difference between the optimal cutoff strategy and the full-response strategy in log scale; formally, $\max_{x\in X_c} U(x) - U(\bar{x})$. Figures 4b and 4d illustrate the difference in the users' social welfare between Gen AI's optimal cutoff strategy and the full-response strategy; formally, we compute $W(x) - W(\bar{x})$ such that $x \in \arg\max_{x'\in X_c} U(x')$.

In our first experiment, we set $\beta = 5$ and computed the revenue and social welfare for various combinations of $\alpha$ and $\gamma$. As shown in Figure 4a, revenue increases with both $\alpha$ and $\gamma$. Observe that decreasing the discount factor $\gamma$ and increasing the power $\alpha$ can have opposing effects on revenue. A lower $\gamma$ makes Gen AI more myopic, favoring strategies that maximize immediate gains from the current proportions; consequently, it tends to adopt strategies close to full response, resulting in minor revenue differences. In contrast, a higher $\alpha$ amplifies the influence of high proportions on revenue. This drives Gen AI to prioritize increasing the proportions, even at the cost of short-term revenue, leading to a more long-term approach.

Figure 4b presents the social welfare as a function of $\alpha$ and $\gamma$. Although users' welfare does not explicitly depend on Gen AI's discount factor $\gamma$, we observe six distinct regions of uniform color. This dependence arises through Gen AI's strategy, which is sensitive to $\gamma$: within each region, Gen AI adopts the same selective response strategy across all combinations of $\alpha$ and $\gamma$. As discussed in Section 5, using a selective response when Gen AI is inaccurate improves social welfare. This typically occurs for small values of $t$, when Gen AI prioritizes future proportions, which is the same underlying reason behind the observed increase in revenue as $\alpha$ and $\gamma$ grow.

Figures 4c and 4d show the differences in revenue and welfare as functions of the temperature $\beta$ and the power parameter $\alpha$. When $\beta = 0$, users are indifferent between the platforms, regardless of the utilities they receive. In this case, the accuracy of Gen AI, and therefore the data it accumulates, has no impact on user decisions. As a result, choosing a selective response can only reduce both Gen AI's revenue and the social welfare. As $\beta$ increases, users become more sensitive to utility differences, and the value of the data Gen AI accumulates becomes more apparent. This leads to two opposing effects. First, selective response allows Gen AI to influence the amount of data generated in each round, which in turn affects how many users choose Gen AI. Second, when $\beta$ becomes very large, the effectiveness of selective response decreases. In this case, user behavior resembles best-response dynamics: as long as Gen AI's utility is lower than Forum's, users prefer Forum and generate data there, regardless of Gen AI's strategy; once Gen AI's utility exceeds Forum's, the optimal strategy is to always respond fully, since users will consistently choose Gen AI. Notably, the welfare in Figure 4d exhibits the same pattern as the revenue, but in a more nuanced fashion, because welfare is influenced by both Gen AI's response strategy and $\beta$.
From Theorem 5.1, it follows that the threshold $C$ increases with $\beta$; therefore, using a selective response when the utility from Gen AI is below $C$ may decrease welfare. This results in a cyclic pattern in the welfare when $\alpha$ is held constant.