# Navigating Towards Fairness with Data Selection

Yixuan Zhang1, Zhidong Li2, Yang Wang2, Fang Chen2, Xuhui Fan3, Feng Zhou4,5*
1School of Statistics and Data Science, Southeast University, China
2Data Science Institute, University of Technology Sydney, Australia
3School of Computing, Macquarie University, Australia
4Center for Applied Statistics and School of Statistics, Renmin University of China, China
5Beijing Advanced Innovation Center for Future Blockchain and Privacy Computing, China
zh1xuan@hotmail.com, {zhidong.li, yang.wang, fang.chen}@uts.edu.au, xuhui.fan@mq.edu.au, feng.zhou@ruc.edu.cn

Abstract

Machine learning algorithms often struggle to eliminate inherent data biases, particularly those arising from unreliable labels, which poses a significant challenge in ensuring fairness. Existing fairness techniques that address label bias typically involve modifying models and intervening in the training process, but these lack flexibility for large-scale datasets. To address this limitation, we introduce a data selection method designed to efficiently and flexibly mitigate label bias, tailored to more practical needs. Our approach utilizes a zero-shot predictor as a proxy model that simulates training on a clean holdout set. This strategy, supported by peer predictions, ensures the fairness of the proxy model and eliminates the need for an additional holdout set, which is a common requirement in previous methods. Without altering the classifier's architecture, our modality-agnostic method effectively selects appropriate training data and has proven efficient and effective in handling label bias and improving fairness across diverse datasets in experimental evaluations.

Introduction

Fairness is a critical and essential problem in real-world applications.
In recent years, it has attracted great attention, especially in high-stakes domains such as finance (Khandani, Kim, and Lo 2010; Yeh and Lien 2009; Mukerjee et al. 2002), law (Brennan, Dieterich, and Ehret 2009; Lin et al. 2020), recruiting (Faliagka et al. 2012; Bertrand and Mullainathan 2004), school admissions (Moore 1998), and medicine (Kim et al. 2015). Although many fairness-aware learning methods have been proposed recently, they often assume that the data collected for training is representative of the true data distribution. However, these methods are still validated on so-called clean data, which neglects the impact of label bias. Recent studies (Wang, Liu, and Levy 2021; Dai 2020; Zhang et al. 2023; Konstantinov and Lampert 2022) have increasingly focused on the adverse impacts of label bias and have proposed methodologies to improve fairness in this setting. These efforts involve adjusting models and intervening in the training process by accounting for the amount of bias present in labels during fair learning. The objective of these methods is to develop a fair labeling function that reflects the true underlying distribution, thereby improving the robustness of the fairness techniques. However, when handling large-scale datasets, these methods often lack flexibility due to the necessity of training on the complete dataset, which can slow down model convergence and substantially increase training costs. To meet the needs of complex machine learning systems in practical applications, employing data selection methods to filter out a subset of useful data for training proves to be an effective solution. Apart from improving efficiency, data selection techniques have been very effective in mitigating the impact of noisy data.

*Corresponding author.
Copyright 2025, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
These methods typically prioritize training with either difficult (Loshchilov and Hutter 2015; Katharopoulos and Fleuret 2018; Jiang et al. 2019) or easy samples (Bengio et al. 2009) based on the training loss. Nevertheless, such a singular filtering strategy limits the ability to handle the diversity of real-world situations, since the difficulty of samples often arises from incorrect annotations, inherent ambiguity, or atypical patterns (Mindermann et al. 2022; Deng, Cui, and Zhu 2023). To overcome this limitation, Mindermann et al. (2022) introduce a new data selection criterion, reducible holdout loss selection (RHO-LOSS), based on the impact on the generalization loss, which further prevents the selection of redundant or noisy samples. Inspired by this selection criterion, this paper aims to extend the method to the field of fairness. We revisit the derivation of RHO-LOSS, aligning the predicted posterior distribution of selected samples with the fair data distribution. Instead of relying on a holdout set and training an auxiliary validation model, we establish a more accurate approximation by incorporating zero-shot predictors from pre-trained models. This approach eliminates the need for an additional holdout set. To further prevent discrimination leakage in pre-trained models, we implement a peer prediction mechanism during training. This ensures fairness when using the zero-shot predictor to evaluate the generalization loss. We conduct comprehensive empirical evaluations on several benchmark datasets. The experiments on image classification tasks demonstrate the effectiveness of our proposed data selection principle, which can adaptively select fair instances that are less impacted by label bias. Although not explicitly emphasized in this paper, we also offer a solution to reduce the impact of selection bias by resampling the selected data.

The Thirty-Ninth AAAI Conference on Artificial Intelligence (AAAI-25)
It is worth noting that the proposed method is modality-agnostic: it only provides a principle of data selection and is compatible with any log-likelihood- or cross-entropy-based classifier. Meanwhile, it achieves faster convergence, as we only select good data points for model training. Our method is robust and achieves superior performance with respect to accuracy and fairness metrics under different bias settings. Our contributions are threefold:

- We propose a data selection principle that enables the learning of a fair labeling function by selecting fair and balanced instances in the training set to fix both label and selection bias. Notably, unlike most noisy-label learning methods, our method does not require noise-rate estimation.
- Our method is general, modality-agnostic, and compatible with any classifier or neural network based on a log-likelihood or cross-entropy loss.
- Our method converges faster and achieves better test accuracy than alternative baselines.

Preliminaries

In this section, we briefly introduce the background of online batch selection and present the data selection principle based on a data point's impact on the generalization loss, as proposed in Mindermann et al. (2022). Consider a dataset $\mathcal{D} = \{(x_i, y_i, s_i)\}_{i=1}^{n}$, where $x$ represents the non-sensitive features, $y \in \{0, 1\}$ is the binary label, and $s \in \{0, 1\}$ denotes the sensitive variable. Given a model parameterized by $\theta$, $f_\theta : \mathcal{X} \to \mathbb{R}^K$ ($K$ is the number of classes), in online batch learning, a batch $B_t$ of size $N_B$ is drawn from the training dataset $\mathcal{D}$ at each training step $t$. The objective of online batch selection is to pick samples from $B_t$ based on specific ranking criteria and use them to construct a smaller batch $b_t$ of size $N_b$ for model updates. The ranking function encodes the criteria by which data should be selected; previous methods usually pick hard data points by ranking the training loss from highest to lowest.
Then, common gradient descent is performed to minimize the loss over $b_t$, i.e., $\sum_{i=1}^{N_b} L(y_i, f(x_i))$. By iterating this procedure, we select data points that minimize the training-set loss. However, this kind of selection tends to pick redundant data points or outliers, since it focuses on the highest training loss, and it lacks flexibility due to its simplistic selection criterion. In contrast, Mindermann et al. (2022) established selection criteria based on a data point's impact on the model's generalization loss, thus effectively addressing the limitations of previous methods. To revisit this approach, we follow the framework of online batch selection. Let $\mathcal{D}_t$ represent the observed data prior to training step $t + 1$. Given a sample $(x, y)$ drawn from batch $B_{t+1}$, we assume, for simplicity, that only one data point is selected at a time. If this data point is chosen, the updated predictive distribution, in a Bayesian view, is $p(y' \mid x', \mathcal{D}_t \cup (x, y))$. This distribution should ideally align with the true data-generating distribution $p(x', y')$; one natural way to achieve this is to minimize the KL divergence between the predictive distribution and the data-generating distribution, $\mathbb{E}_{p(x')} \mathrm{KL}[p(y' \mid x') \,\|\, p(y' \mid x', \mathcal{D}_t \cup (x, y))]$, which can be equivalently expressed as:

$$-\mathbb{E}_{p(x', y')}[\log p(y' \mid x', \mathcal{D}_t \cup (x, y))] + \text{const.}, \tag{1}$$

where the const. denotes the negative entropy of the data-generating distribution and is agnostic to the optimization. To evaluate the impact of the selected data point on the generalization loss, we utilize holdout samples $\mathcal{D}' = \{(x'_i, y'_i)\}_{i=1}^{m}$ from the data-generating distribution $p(x', y')$. By leveraging Monte Carlo to approximate expectations with empirical averages, the aim becomes:

$$\underset{(x, y) \in B_{t+1}}{\arg\max} \sum_{i=1}^{m} \log p(y'_i \mid x'_i, \mathcal{D}_t \cup (x, y)), \tag{2}$$

and this optimization is equivalent to selecting the data point $(x, y) \in B_{t+1}$ that maximizes $\log p(\mathcal{D}' \mid \mathcal{D}_t \cup (x, y))$, which corresponds to the generalization loss.
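For completeness, the step behind Eq. (1) can be made explicit by expanding the KL divergence in the paper's notation:

```latex
\begin{aligned}
\mathbb{E}_{p(x')}\,\mathrm{KL}\!\left[p(y' \mid x')\,\middle\|\,p(y' \mid x', \mathcal{D}_t \cup (x, y))\right]
&= \mathbb{E}_{p(x', y')}\!\left[\log p(y' \mid x') - \log p(y' \mid x', \mathcal{D}_t \cup (x, y))\right] \\
&= -\,\mathbb{E}_{p(x', y')}\!\left[\log p(y' \mid x', \mathcal{D}_t \cup (x, y))\right] + \mathrm{const.},
\end{aligned}
```

where $\mathrm{const.} = \mathbb{E}_{p(x', y')}[\log p(y' \mid x')]$ is the negative entropy of the data-generating distribution. Minimizing the KL is therefore equivalent to maximizing the expected predictive log-likelihood, which the Monte Carlo approximation over the holdout samples turns into Eq. (2).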
By applying Bayes' rule, we obtain:

$$p(\mathcal{D}' \mid \mathcal{D}_t \cup (x, y)) = \frac{p(y \mid x, \mathcal{D}', \mathcal{D}_t)}{p(y \mid x, \mathcal{D}_t)}\, p(\mathcal{D}' \mid \mathcal{D}_t). \tag{3}$$

The term $p(\mathcal{D}' \mid \mathcal{D}_t, x)$ reduces to $p(\mathcal{D}' \mid \mathcal{D}_t)$ because a single unlabeled data point $x$ does not influence the model's update. Approximating $p(y \mid x, \mathcal{D}', \mathcal{D}_t)$ with $p(y \mid x, \mathcal{D}')$ and dropping the term $p(\mathcal{D}' \mid \mathcal{D}_t)$ (independent of $(x, y)$), the final approximated tractable selection function of RHO-LOSS is given by:

$$\underset{(x, y) \in B_{t+1}}{\arg\max}\; \log p(y \mid x, \mathcal{D}') - \log p(y \mid x, \mathcal{D}_t) = \underset{(x, y) \in B_{t+1}}{\arg\max}\; L[y \mid x, \mathcal{D}_t] - L[y \mid x, \mathcal{D}']. \tag{4}$$

The first term, $L[y \mid x, \mathcal{D}_t]$, is the training loss under the model trained on the training set $\mathcal{D}_t$, while $L[y \mid x, \mathcal{D}']$ is the irreducible holdout loss under the model trained on the holdout set $\mathcal{D}'$. Thus, the aim of selecting a data point $(x, y) \in B_{t+1}$ that maximizes the log-likelihood on the holdout set (Eq. (2)) can be approximated by Eq. (4).

Method

Building on the concept of RHO-LOSS, we now turn our attention to how it connects with fairness and how label bias impacts data selection.

Refining Selection Principle for Fairness

We begin by revisiting the derivation of RHO-LOSS and incorporating fairness principles (demographic parity1) to refine its formulation. Consider a holdout dataset $\mathcal{D}' = \{(x'_i, y'_i, s'_i)\}_{i=1}^{m}$, where all samples are generated from a fair distribution $p(x, y, s) = p(y \mid x)\, p(x)\, p(s)$, meaning labels are not influenced by sensitive attributes. While fairness has various interpretations (Barocas, Hardt, and Narayanan 2018), we follow a common assumption in the fairness literature; it is worth noting that such a fair distribution is an idealized condition and is not observed in practice. Our objective now is to select a sample $(x, y)$ from batch $B_{t+1}$, belonging to the demographic group defined by $s$, ensuring that the updated predictive distribution closely aligns with the fair data distribution.

1We use demographic parity as an illustration here; other fairness metrics are introduced in the experiments.
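As a concrete illustration, once per-sample losses are available, the selection rule in Eq. (4) reduces to a few lines. The sketch below is ours, not the authors' code; the function and variable names are hypothetical, and it assumes per-sample cross-entropy losses have already been computed under the current model and a holdout-trained model:

```python
import numpy as np

def rho_loss_select(train_losses, holdout_losses, num_select):
    """Rank candidates by reducible holdout loss (Eq. (4)):
    L[y | x, D_t] - L[y | x, D'], and keep the top scorers."""
    scores = np.asarray(train_losses) - np.asarray(holdout_losses)
    # A high score means high training loss NOT explained by irreducible
    # noise (low holdout loss): the point is learnable and non-redundant.
    return np.argsort(scores)[::-1][:num_select]

# Toy batch of 4 candidates: index 1 (high train loss, low holdout loss)
# is the most informative; index 2 (low on both) is redundant.
chosen = rho_loss_select(train_losses=[2.0, 2.0, 0.1, 1.0],
                         holdout_losses=[1.8, 0.2, 0.1, 0.5],
                         num_select=2)  # selects indices 1 and 3
```

Note that a point with high training loss but equally high holdout loss (index 0) scores near zero: its loss is irreducible noise, which is exactly what this criterion avoids.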
Similar to the derivation in RHO-LOSS, by expanding the KL expression and applying Monte Carlo estimation using the additional holdout samples, the optimization problem becomes:

$$\underset{(x, y) \in B_{t+1}}{\arg\max} \sum_{s} p(s)\, [\log p(\mathcal{D}' \mid \mathcal{D}_t \cup (x, y), s)]. \tag{5}$$

We estimate $p(s)$ by $\frac{C_s}{m}$, where $C_s$ represents the number of samples belonging to demographic group $s$ under the fair distribution. For demographic group $s$, using Bayes' rule, the updated predictive distribution is:

$$p(\mathcal{D}' \mid \mathcal{D}_t \cup (x, y), s) = \frac{p(y \mid x, \mathcal{D}', \mathcal{D}_t, s)}{p(y \mid x, \mathcal{D}_t, s)}\, p(\mathcal{D}' \mid \mathcal{D}_t). \tag{6}$$

By plugging Eq. (6) into Eq. (5) and omitting the term that is irrelevant to the selected sample, we reformulate the objective as:

$$\underset{(x, y) \in B_{t+1}}{\arg\max}\; \frac{C_s}{m}\, [\log p(y \mid x, \mathcal{D}', \mathcal{D}_t, s) - \log p(y \mid x, \mathcal{D}_t, s)]. \tag{7}$$

The second term in Eq. (7) is straightforward to compute, but the first term is challenging to estimate because it involves both the training and holdout data. To address this, RHO-LOSS approximates the term with $\log p(y \mid x, \mathcal{D}')$, while Deng, Cui, and Zhu (2023) approximate it by finding a lower bound. We adopt the lower-bound approximation, which is expressed as:

$$\log p(y \mid x, \mathcal{D}', \mathcal{D}_t, s) \geq \mathbb{E}_{p(\theta \mid \mathcal{D}')} \log p(y \mid x, \theta, s), \tag{8}$$

where $\theta$ is the model parameter. For completeness, we include the full derivation in the Appendix. Substituting Eq. (8) into Eq. (7), the resulting objective is:

$$\underset{(x, y) \in B_{t+1}}{\arg\max}\; \frac{C_s}{m} \big[\alpha\, \mathbb{E}_{p(\theta \mid \mathcal{D}')} \log p(y \mid x, \theta, s) - \log p(y \mid x, \mathcal{D}_t, s)\big], \tag{9}$$

where $\alpha \in [0, 1]$ is a scaling factor that determines the trade-off between the original inequality and the lower bound. Although we obtain this lower bound to approximate the first term in Eq. (7), it cannot be directly calculated. To make it tractable, we build on the framework outlined by Deng, Cui, and Zhu (2023) and approximate this lower bound using zero-shot predictors as a proxy for the validation model in Mindermann et al. (2022), thereby eliminating the need to gather additional holdout data.
Zero-shot predictors have demonstrated promising transfer performance across a wide range of downstream tasks due to training on extensive datasets. Consequently, we use the following approximation:

$$\mathbb{E}_{p(\theta \mid \mathcal{D}')} \log p(y \mid x, \theta, s) \approx \log p(y \mid f(x), s), \tag{10}$$

where $f$ represents a zero-shot predictor derived from a pre-trained model, used as a validation model. The approximation is considered reasonable because the expectation is taken over the posterior of $\theta$, and the dataset used to train the zero-shot predictor is assumed to be sufficiently large, resulting in a very narrow posterior distribution. Consequently, we can directly extract the posterior mean, represented by $f$, and substitute it into $\log p(y \mid x, \theta, s)$ as an effective approximation of $\mathbb{E}_{p(\theta \mid \mathcal{D}')} \log p(y \mid x, \theta, s)$. Substituting Eq. (10) into Eq. (9), the selection function with fairness considerations can be reformulated as:

$$\underset{(x, y) \in B_{t+1}}{\arg\max}\; \big[ L[y \mid x, \mathcal{D}_t, s] - \alpha L[y, f(x), s] \big]. \tag{11}$$

The final objective is structured similarly to RHO-LOSS: the first term denotes the training loss, while the second term approximates the holdout loss via a zero-shot predictor.

Fair Data Selection with Peer Prediction Mechanism

In this section, we present our revised selection approach to enhance fairness. Previous methods rely on a clean holdout set for training validation models, which is not always feasible in practice. Instead, we use a zero-shot predictor as a proxy model, eliminating the need for a holdout set (see Eq. (11)). Despite the pre-trained model's robust ability to extract and utilize fundamental patterns from large datasets, it may still inherit and propagate label bias, failing to adequately reflect the data-generating distribution of the current task, especially in terms of fairness. This misalignment can lead to persistent discrimination in the proxy model's approximations, mainly because most pre-trained models neglect to emphasize fairness during training, resulting in inherent biases.
Even models developed with fairness considerations can introduce biases in downstream tasks due to distribution shifts (Jiang et al. 2023; Schrouff et al. 2022; Chowdhury and Chaturvedi 2023). To address these issues and tackle inherent label bias, we incorporate a peer prediction mechanism into our zero-shot predictor.

Peer Prediction Mechanism

The correlated-agreement type of peer prediction mechanism (Dasgupta and Ghosh 2013; Shnayder et al. 2016) involves two agents: one that provides noisy labels and another that mimics the Bayes-optimal classifier. Liu and Guo (2020) crafted a scoring function that encourages truthful reporting by ensuring that the optimal classifier maximizes its score with accurate predictions. By minimizing the negative scoring function as a loss (referred to as peer loss), the resulting classifier closely approximates the Bayes-optimal classifier, effectively addressing challenges posed by noisy labels. The loss function is formulated as $L[y_i \mid x_i] - \gamma L[y_{i_2} \mid x_{i_1}]$, where $x_{i_1}$ and $y_{i_2}$ are independently sampled from the training set, excluding the $(x_i, y_i)$ pair, and $\gamma \in [0, 1]$ is a parameter that makes the loss robust to imbalanced labels.

Final Fair Data Selection Objective

Drawing on the principles of the peer prediction mechanism, we implement a similar strategy to eliminate label bias and hence ensure fairness. We define $\mathcal{D}_s$ and $\mathcal{D}_{s'}$ as the subsets containing sensitive information for $S = s$ and $S = s'$, respectively. In a fair setting, predictions should be independent of $s$. To achieve this, we modify the original peer loss by sampling $x_{i_1}$ from $\mathcal{D}_s$ and $y_{i_2}$ from $\mathcal{D}_{s'}$; the corresponding random variables are the pair $(X_{i_1}, Y_{i_2})$. The intuition behind this design is that we assume the bias exists, and we construct such cross-group pairings to create a biased version. This setup allows us to estimate the bias rate conditioned on different demographic groups using the peer loss function.
Since this loss function involves pairing two randomly selected instances, it inherently contains some randomness. To stabilize the results, we take the expectation across these randomly selected instances; the equivalent expectation version is (details in the Appendix):

$$L[y_i, f(x_i), s] - \gamma\, \mathbb{E}_{Y \mid \mathcal{D}_{s'}}[L[Y, f(x_i), s]]. \tag{12}$$

Combining Eq. (12) with Eq. (11) and simplifying, with the weights $\frac{C_s}{m}$ summing to 1, we obtain the final selection function:

$$\underset{(x, y) \in B_{t+1}}{\arg\max}\; L[y \mid x, \mathcal{D}_t, s] + (1 - \alpha) L[y, f(x), s] - \gamma\, \mathbb{E}_{Y \mid \mathcal{D}_{s'}}[L[Y, f(x), s]]. \tag{13}$$

Resampling to Deal with Selection Bias

In addition to addressing label bias, our implementation also tackles selection bias, which contributes to label imbalance. Assuming selection bias impacts the statistical independence between demographic groups $S$ and target labels $Z$ (Kamiran and Calders 2012), we resample based on the discrepancy between the actual count of individuals in demographic group $s = i$ with clean label $z = j$, denoted $C_{s,z}$, and its expected count $\mathbb{E}[C_{s,z}]$. We estimate $\mathbb{E}[C_{s,z}]$ as $p(s)\, p(z)\, N$, where $p(s)$ and $p(z)$ are the empirically measured probabilities. To correct imbalances, we upsample subgroups where $C_{s,z} < \mathbb{E}[C_{s,z}]$ and downsample those where $C_{s,z} > \mathbb{E}[C_{s,z}]$. Although the clean label $Z$ is unobserved, we assume that the labels of selected instances are fair and treat them accordingly. This resampling strategy is applied to each selected batch to ensure data balance.

Implementation

The training process is outlined in Algorithm 1. We employ a zero-shot predictor with the derived expectation version of the adapted peer loss as a surrogate model to simulate the estimated loss on the holdout set. For each instance in batch $B_{t+1}$, we evaluate Eq. (13) and select the top-$N_b$ instances to form a smaller batch $b_{t+1}$. To address selection bias, we further pick samples based on the severity of selection bias to ensure balance among different demographic groups.
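The resampling rule can be sketched in a few lines, treating the selected labels as clean per the assumption above. This is an illustrative reading of the rule with hypothetical names, not the authors' code:

```python
import numpy as np
from collections import Counter

def resample_batch(indices, s, z, seed=0):
    """Resample a selected batch so each subgroup count C_{s,z}
    approaches E[C_{s,z}] = p(s) * p(z) * N (independence of S and Z)."""
    rng = np.random.default_rng(seed)
    n = len(indices)
    count_s, count_z = Counter(s), Counter(z)
    resampled = []
    for si, zi in {(a, b) for a, b in zip(s, z)}:
        members = [k for k, (a, b) in enumerate(zip(s, z)) if (a, b) == (si, zi)]
        expected = round(count_s[si] * count_z[zi] / n)  # = p(s) * p(z) * N
        # Downsample over-represented subgroups (C_{s,z} > E[C_{s,z}]);
        # bootstrap (sample with replacement) under-represented ones.
        chosen = rng.choice(members, size=expected, replace=expected > len(members))
        resampled.extend(int(k) for k in chosen)
    return [indices[k] for k in resampled]
```

For a batch whose subgroups already match the independence counts, the function returns the same multiset of indices; otherwise it shrinks or bootstraps each subgroup toward its expected size.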
We then update the parameters $\theta$ of the target model using the selected data.

Algorithm 1: Fair data selection to address the label bias issue
1: Input: training set $\mathcal{D}$, $N_b$, $N_B$, $T$, $\alpha$, $\gamma$, zero-shot predictor $f$, and a target model $f_\theta$.
2: Initialize $\theta_0$.
3: for $t$ in $0, \dots, T$ do
4:   Randomly select $N_B$ instances to construct $B_{t+1}$;
5:   For each sample $(x_i, y_i, s_i)$ in $B_{t+1}$, estimate and compute the objective in Eq. (13);
6:   Select the top-$N_b$ samples to construct $b_{t+1}$;
7:   Drop instances from $b_{t+1}$ if $C_{s,z} > \mathbb{E}[C_{s,z}]$; otherwise, bootstrap;
8:   Perform gradient descent and update $\theta$ with the resampled data.
9: end for

Why does the fair selection principle pick instances less influenced by label bias? Our data selection method integrates a zero-shot predictor with a peer prediction mechanism to address label bias, using the derived loss as a validation measure against a fair distribution. By evaluating each data point's impact on this validation loss, we avoid instances likely discriminated by sensitive information and exclude irrelevant outliers. This approach also conceptually relates to the Query By Committee (QBC) algorithm (Seung, Opper, and Sompolinsky 1992; Freund et al. 1997), which selects examples based on classifier disagreement (here, between the current model and $f$). Following Cheng et al. (2021) and Zhang et al. (2021) (detailed in the Appendix), we demonstrate that our method effectively promotes fairness and mitigates label bias through a breakdown of Eq. (12):

$$
\mathbb{E}_{\tilde{\mathcal{D}}}\big[ L[Y, f(X), S] - \gamma\, \mathbb{E}_{Y \mid \mathcal{D}_{s'}} L[Y, f(x), s] \big]
= \underbrace{\mathbb{E}_{\mathcal{D}}\, \Delta_s\, L[Y, f(X), S]}_{\text{fair model}}
+ \underbrace{\sum_{s \in [S]} \sum_{i, j} P(S = s) P(Z = i) \big[ \Delta_u\, L[j, f(X), s] \big]}_{\text{penalty on noisy loss}}
+ \underbrace{\sum_{j \in [K]} p^s_j \big[ \mathbb{E}_{\mathcal{D}_{X \mid S=1}} L[j, f(X)] - \mathbb{E}_{\mathcal{D}_{X \mid S=0}} L[j, f(X)] \big]}_{\text{penalty for the disagreement between demographic groups}} \tag{14}
$$

where $\tilde{\mathcal{D}}$ denotes the observed distribution, which contains label bias, while $\mathcal{D}$ represents the underlying clean, fair distribution.
Here $\Delta_s = 1 - P(Y = 1 \mid Z = 0, S = s) - P(Y = 0 \mid Z = 1, S = s)$, $p^s_j = P(Y = j, S = 0) - P(Y = j, S = 1)$, $T^s_{ij} = P(Y = j \mid Z = i, S = s)$, $\Delta_u = \mathbb{E}_{\mathcal{D} \mid Z = i, S = s}(U^s_{ij} - \gamma P(Y = j \mid S = s'))$, with $U^s_{ij} = T^s_{ij}$ if $i \neq j$, and $U^s_{ij} = T^s_{jj} - \mathbb{E}_{\mathcal{D} \mid Z = j, S = s}\, T^s_{jj}$ if $i = j$. The equation's first term captures the model's clean loss on the fair distribution. The second term adds penalties for noisy losses, adjusting labels to account for observed demographic disparities. The third term imposes penalties for demographic discrepancies, ensuring fair performance across all groups.

Experiment

In the subsequent sections, we first describe our experimental setup, covering datasets, baselines, and evaluation metrics. Next, we compare our method against existing state-of-the-art data selection techniques on various image classification tasks (CelebFaces Attributes (CelebA) (Liu et al. 2015) and a modified Labeled Faces in the Wild (LFW+a) (Wolf, Hassner, and Taigman 2011)), considering different amounts of label bias. We then examine our selection criteria through detailed ablation studies.

| Method | LFW+a(0.2) ACC | ΔDP | LFW+a(0.4) ACC | ΔDP | CelebA(0.2) ACC | ΔDP | CelebA(0.4) ACC | ΔDP |
|---|---|---|---|---|---|---|---|---|
| CLIP | 90.8 | 0.11 | 83.2 | 0.21 | 62.2 | 0.60 | 61.6 | 0.61 |
| GradNorm | 77.5 ± 0.7 | 0.02 ± 0.01 | 75.6 ± 0.8 | 0.07 ± 0.01 | 76.4 ± 2.8 | 0.28 ± 0.07 | 74.0 ± 1.9 | 0.26 ± 0.02 |
| GradNorm IS | 82.6 ± 1.4 | 0.43 ± 0.13 | 75.4 ± 0.2 | 0.78 ± 0.01 | 82.6 ± 2.1 | 0.40 ± 0.14 | 75.5 ± 3.4 | 0.42 ± 0.07 |
| Uniform | 89.0 ± 0.9 | 0.03 ± 0.01 | 79.8 ± 5.1 | 0.07 ± 0.04 | 82.0 ± 1.3 | 0.29 ± 0.08 | 81.0 ± 3.1 | 0.31 ± 0.09 |
| RHO-LOSS | 89.5 ± 0.5 | 0.06 ± 0.02 | 83.9 ± 1.8 | 0.08 ± 0.20 | 80.5 ± 0.9 | 0.22 ± 0.07 | 79.7 ± 1.7 | 0.28 ± 0.04 |
| Ours-s | 90.0 ± 1.3 | 0.02 ± 0.01 | 86.2 ± 0.7 | 0.08 ± 0.05 | 85.5 ± 0.6 | 0.21 ± 0.02 | 84.5 ± 0.6 | 0.21 ± 0.01 |
| Ours | 90.9 ± 0.6 | 0.01 ± 0.00 | 88.7 ± 0.7 | 0.04 ± 0.01 | 86.5 ± 0.6 | 0.21 ± 0.02 | 85.2 ± 1.7 | 0.20 ± 0.01 |

Table 1: Test accuracy (%) and fairness violation (ΔDP) on CelebA and LFW+a with symmetrical label bias amounts of 20% and 40%. We report results as mean ± standard deviation.

| Method | LFW+a(0.2) p% | ΔDEO | LFW+a(0.4) p% | ΔDEO | CelebA(0.2) p% | ΔDEO | CelebA(0.4) p% | ΔDEO |
|---|---|---|---|---|---|---|---|---|
| CLIP | 88.9 | 0.03 | 77.4 | 0.07 | 38.3 | 0.32 | 37.3 | 0.33 |
| GradNorm | 92.7 ± 1.5 | 0.03 ± 0.01 | 91.1 ± 0.6 | - | 77.6 ± 7.3 | 0.39 ± 0.02 | 76.0 ± 8.9 | 0.55 ± 0.23 |
| GradNorm IS | 90.4 ± 2.6 | 0.01 ± 0.01 | 90.2 ± 0.5 | - | 31.9 ± 0.0 | - | 32.5 ± 0.0 | - |
| Uniform | 97.9 ± 1.9 | 0.01 ± 0.01 | 90.1 ± 5.2 | 0.14 ± 0.09 | 76.3 ± 3.9 | 0.52 ± 0.33 | 68.6 ± 5.9 | 0.77 ± 0.10 |
| RHO-LOSS | 92.3 ± 2.8 | 0.03 ± 0.01 | 90.4 ± 2.4 | - | 80.8 ± 5.9 | 0.44 ± 0.14 | 81.0 ± 0.1 | 0.42 ± 0.01 |
| Ours | 98.3 ± 0.4 | 0.01 ± 0.00 | 94.6 ± 3.7 | 0.14 ± 0.08 | 84.1 ± 5.3 | 0.38 ± 0.15 | 77.2 ± 1.1 | 0.35 ± 0.02 |

Table 2: Additional fairness measures, the p%-rule and ΔDEO, on CelebA and LFW+a with symmetrical label bias amounts of 20% and 40%. "-" denotes an invalid DEO measure due to low accuracy.

Benchmark Datasets. We evaluate the performance of our proposed method on two image datasets: CelebA and LFW+a. The CelebA dataset is used to predict the label "Heavy Makeup", with gender ("Female") as the sensitive variable, where biases toward females have been noted. In the LFW+a dataset, each image is augmented with additional attributes such as gender and race (as in CelebA), and the task is to classify the identity's gender. The sensitive variable here is "Wavy Hair", which the literature has shown to be strongly correlated with males. Each dataset is divided into training, validation, and test sets.

Baselines. To evaluate our method's effectiveness and robustness, we compare it with several selection methods on the image tasks.
These include uniform sampling (Uniform); gradient-norm selection (GradNorm), which selects data points with high gradient norms (Katharopoulos and Fleuret 2018); and gradient norm with importance sampling (GradNorm IS) (Katharopoulos and Fleuret 2018). Additionally, we compare with RHO-LOSS (Mindermann et al. 2022). We implement two variants of our method: one includes resampling of $b_{t+1}$ (Ours), and the other does not (Ours-s).

Evaluation Metrics. We use accuracy to evaluate prediction performance and measure fairness violation with $\Delta DP = |P(\hat{Y} = 1 \mid S = 1) - P(\hat{Y} = 1 \mid S = 0)|$. A lower $\Delta DP$ indicates less fairness violation. We also report the difference of equal opportunity (DEO) (Hardt, Price, and Srebro 2016), defined as $\Delta DEO = |P(\hat{Y} = 1 \mid Y = 1, S = 1) - P(\hat{Y} = 1 \mid Y = 1, S = 0)|$, and the p%-rule, $p\% = \min\big(\frac{P(\hat{Y} = 1 \mid S = 0)}{P(\hat{Y} = 1 \mid S = 1)}, \frac{P(\hat{Y} = 1 \mid S = 1)}{P(\hat{Y} = 1 \mid S = 0)}\big)$. A lower $\Delta DEO$ suggests less fairness violation, while a lower p%-rule indicates a higher fairness violation.

Setup. In our experiments addressing label bias, we introduce symmetrical label biases of 20% and 40%. We use the AdamW optimizer (learning rate 0.001, weight decay 0.01) with a batch size of $N_b = 32$ and a batch ratio $N_b / N_B = 0.1$, consistent with the RHO-LOSS setup. For the LFW+a dataset, we employ ResNet-18 (He et al. 2016), and for CelebA, we use ResNet-50 across all methods, along with a zero-shot predictor based on CLIP-RN50. We vary $\alpha$ and $\gamma$ within the set $\{0.1, 0.3, 0.5, 0.7, 0.9\}$. Results are averaged over three random trials. All experiments are performed on NVIDIA GeForce RTX 3090 GPUs.

Comparison Results

Results are displayed in Table 1 for the 20% and 40% bias amounts. We also report the fairness measures using the p%-rule and DEO in Table 2. Our proposed method consistently demonstrates the highest accuracy and minimal fairness violation as bias increases.
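The fairness metrics used in this evaluation (ΔDP, ΔDEO, and the p%-rule) can be computed directly from binary predictions. The sketch below uses our own (hypothetical) names and assumes binary arrays for $\hat{Y}$, $Y$, and $S$:

```python
import numpy as np

def fairness_metrics(y_true, y_pred, s):
    """Compute the demographic-parity gap, the difference of equal
    opportunity (DEO), and the p%-rule from binary arrays."""
    y_true, y_pred, s = (np.asarray(a) for a in (y_true, y_pred, s))
    rate = lambda m: y_pred[m].mean()      # P(Yhat = 1 | condition)
    r0, r1 = rate(s == 0), rate(s == 1)
    delta_dp = abs(r1 - r0)
    deo = abs(rate((s == 1) & (y_true == 1)) - rate((s == 0) & (y_true == 1)))
    p_rule = min(r0 / r1, r1 / r0)         # in [0, 1]; 1 means exact parity
    return delta_dp, deo, p_rule

# Toy example: group 0 receives positive predictions twice as often.
dp, deo, p = fairness_metrics(
    y_true=[1, 1, 0, 0, 1, 1, 0, 0],
    y_pred=[1, 1, 0, 0, 1, 0, 0, 0],
    s     =[0, 0, 0, 0, 1, 1, 1, 1])
```

In this toy example the positive-prediction rates are 0.5 for group 0 and 0.25 for group 1, giving ΔDP = 0.25, ΔDEO = 0.5, and a p%-rule of 0.5.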
Figure 1: Proportion of selected instances discriminated by label bias using the proposed method (Ours), RHO-LOSS, and uniform sampling. The left plot corresponds to the LFW+a dataset and the right plot to the CelebA dataset. Overall, the proposed method has the lowest rate of discriminated-sample selection.

The outcomes illustrate the importance of addressing label bias to prevent heightened bias in the output. While the other baselines work well at low bias, they are not robust when the bias amount increases. Interestingly, we find that gradient-norm selection has the worst performance, even worse than uniform sampling, especially for large label bias amounts. This phenomenon shows that selecting data by high variance tends to pick dirty points that are affected by the sensitive information.

Analysis of Properties of Selected Data

In this section, we analyze our proposed method by evaluating the proportion of fair instances among the selected data. Fig. 1 reveals a pattern: as the amount of bias increases, the proportion of selected instances that are fair starts to decline due to the increased difficulty in distinguishing fair instances from unfair ones. However, our proposed selection method still maintains the highest ratio of selected fair instances, significantly exceeding that of uniform sampling. This observation reinforces our method's superiority over other data selection methods under different label bias settings.

Ablation Studies

In this section, we conduct ablation studies on the zero-shot predictor, the model architecture, and important hyperparameters.

Zero-shot Predictor. In our experiments, we initially used CLIP-RN50 as the proxy model and have now extended testing to include ViT-B/16 (Dosovitskiy et al. 2021) and the validation model from RHO-LOSS.
Results in Table 3 show consistent performance across different zero-shot backbones. For the ViT-B/16 and RHO-LOSS validation models, baseline accuracies are 55% and 65% under 20% label bias, and 52% and 64% under 40% label bias, respectively. ViT-B/16 accuracies are slightly lower than RN50's, but our data selection method with peer prediction maintains comparable performance on CelebA at a 0.4 bias rate. The RHO-LOSS validation model's performance is slightly higher than RN50's, but the results are similar. This confirms the robustness of our method, which performs effectively regardless of the zero-shot predictor used.

| Component | | CelebA(0.2) ACC | ΔDP | CelebA(0.4) ACC | ΔDP |
|---|---|---|---|---|---|
| Zero-shot Predictor | CLIP-RN50 | 85.5 ± 0.6 | 0.21 ± 0.02 | 85.2 ± 1.7 | 0.23 ± 0.06 |
| | ViT-B/16 | 86.4 ± 0.2 | 0.20 ± 0.03 | 84.5 ± 1.2 | 0.19 ± 0.02 |
| | R-V | 85.6 ± 0.4 | 0.23 ± 0.05 | 85.5 ± 0.9 | 0.23 ± 0.04 |
| Backbone Model | ResNet-18 | 84.8 ± 1.4 | 0.22 ± 0.02 | 84.1 ± 0.2 | 0.22 ± 0.01 |
| | ResNet-50 | 86.5 ± 0.6 | 0.21 ± 0.02 | 85.2 ± 1.7 | 0.20 ± 0.01 |
| | DenseNet-121 | 85.3 ± 0.2 | 0.21 ± 0.01 | 84.3 ± 0.4 | 0.19 ± 0.02 |

Table 3: Test accuracy (%) and fairness violation (ΔDP) for variant zero-shot predictors (R-V denotes the validation model used in RHO-LOSS) and backbones of our method.

Backbone. We test our proposed method with different backbone architectures on the CelebA dataset, including variants of ResNet (ResNet-18 and ResNet-50) and DenseNet-121 (Huang et al. 2017). The results are displayed in Table 3. We can see that increasing model complexity generally improves both predictive performance and fairness, though these improvements are not significant. Overall, the impact of the model's structure on its performance is limited, which implies our proposed method is robust to the backbone architecture.

Hyperparameters. We then analyze the effects of three hyperparameters: $\gamma$, $\alpha$, and the selection ratio. In the first plot in Fig. 2, we plot test accuracy and fairness violation as $\gamma$ varies, with $\alpha = 0.1$.
We can see that accuracy slightly improves as $\gamma$ increases, and the fairness violation shows a downward trend as $\gamma$ increases. This aligns with the role of $\gamma$ in controlling the fairness level (fairness and accuracy should improve simultaneously when the data is unbiased (Wick, Panda, and Tristan 2019)). In the second plot in Fig. 2, we set $\gamma = 0.3$ and test different values of $\alpha$; we see a similar trend for accuracy, while the fairness violation remains fairly stable. The selection ratio is 10% by default in the experiments. In the third plot in Fig. 2, we plot the change in accuracy and fairness violation with respect to the selection ratio. Accuracy increases as the selection ratio grows from 0.1 to 0.3 and then begins to drop. This demonstrates that increasing the selection ratio also increases the likelihood of picking data points that are not fair enough. However, because the second term in Eq. (14) acts as a fairness regularizer, the fairness violation shows minimal variation.

Convergence and Accuracy

We conduct an experiment on convergence speed by measuring the epochs required to reach a target test accuracy on the two image datasets. Owing to the proposed data selection procedure, our method converges faster. As Table 4 shows, the proposed method converges much faster than uniform sampling and RHO-LOSS, reaching the target accuracy in fewer epochs. Meanwhile, as shown in Table 1, the proposed method attains the highest test accuracy of the three. These results also align with Fig. 1 and can be explained by the proposed method's ability to pick the fairest instances, thereby improving both accuracy and fairness at the same time.
Figure 2: Ablation studies on critical hyperparameters, including γ, α, and the selection ratio, on the CelebA dataset with a 40% label bias amount. We use blue to denote accuracy (left axis) and purple to denote fairness violation (right axis).

Dataset        ACC   Uniform   RHO-LOSS   Ours
LFW+a(0.2)     70%   41        32         21
               80%   73        65         55
LFW+a(0.4)     70%   77        69         23
               80%   -         87         78
CelebA(0.2)    70%   65        33         25
               80%   117       57         42
CelebA(0.4)    70%   94        46         29
               80%   125       -          85

Table 4: Epochs required to reach target test accuracy.

Related Work

Fair Learning with Label Bias

Fairness remains a critical and essential concern in real-world applications. A primary source of unfairness is label bias, which is typically modeled as fair (clean) labels being systematically flipped for individuals from certain demographic groups (Wick, Panda, and Tristan 2019): if S is a sensitive attribute and Z a fair label, the observed label Y satisfies P(Y = i | Z = j, S) > 0 with i ≠ j and i, j ∈ {0, 1}. A growing number of studies explore fair learning in settings with noisy labels to address this issue. For example, Wang, Liu, and Levy (2021) apply group-dependent label noise and derive fairness constraints on corrupted data. Jiang and Nachum (2020) propose a reweighting method to correct instances affected by label bias. Building on these, Dai (2020) presents a framework for understanding the combined effects of label bias and data distribution shift on fairness from a fundamental perspective. These works share a common framework that accounts for the label noise rate to improve the robustness of fair learning methods. However, these approaches involve modifying models and intervening in the training process, which limits their flexibility for complex systems and large-scale datasets.
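The group-dependent flipping model P(Y = i | Z = j, S) above can be sketched in a few lines. This is a hypothetical helper for illustration only; the function name `inject_label_bias` and the per-group flip rates are our assumptions.

```python
import numpy as np

def inject_label_bias(z, s, flip_rates, rng=None):
    """Corrupt fair binary labels z group-dependently: the label of an
    individual in sensitive group g is flipped with probability
    flip_rates[g], i.e. P(Y != Z | S = g) = flip_rates[g]."""
    rng = np.random.default_rng(rng)
    y = z.copy()
    for g, rate in flip_rates.items():
        # flip each label in group g independently with the given rate
        mask = (s == g) & (rng.random(len(z)) < rate)
        y[mask] = 1 - y[mask]
    return y
```

A dataset corrupted this way has observed labels Y that disagree with the fair labels Z at group-specific rates, which is the setting the cited fairness-under-label-bias methods target.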
To overcome this limitation, we adopt a data selection framework that enhances both flexibility and efficiency, making it suitable for practical applications.

Data Selection Methods

As data sizes increase, using all available data for training becomes inefficient. Data selection methods have therefore been developed to train on only the most useful data, enhancing efficiency and reducing computational costs. Earlier approaches such as curriculum learning (Bengio et al. 2009) typically choose data points from easy to hard. This can lead to redundancy: once such data points have been learned, they need not be learned again. Other methods select data with high training loss or high prediction uncertainty (Loshchilov and Hutter 2017; Kawaguchi and Lu 2019; Jiang et al. 2019; Coleman et al. 2019; Loshchilov and Hutter 2015). A common issue with these approaches is their tendency to pick outliers or noisy points when focusing on high loss or uncertainty. To address this, newer selection methods assess a data point's impact on the generalization loss, estimated from a holdout set (Killamsetty et al. 2020; Mindermann et al. 2022). Addressing the less principled approximations in these methods and circumventing the need for a holdout set, Deng, Cui, and Zhu (2023) adopt a Bayesian approach, improving the validity of the approximations and replacing the holdout set with a zero-shot predictor. Inspired by this work, we further employ a peer prediction mechanism to ensure the fairness of the zero-shot predictor.

Conclusions

This paper addresses label bias in fairness for large-scale datasets by proposing a fair data selection strategy that aligns the RHO-LOSS criterion with a fair distribution and uses a zero-shot predictor to eliminate the need for a clean holdout set.
The approach enhances model fairness and accuracy while ensuring, through a peer prediction mechanism, that the selected data better represents a fair distribution, making it suitable for training robust models across various real-world applications. The proposed selection method is constructed from three aspects: (1) deriving a tractable selection function to pick data less affected by label bias, (2) eliminating the need for an additional holdout set previously required for validation, and (3) incorporating a peer prediction mechanism to ensure the fairness of the validation model. Experiments demonstrate the method's effectiveness in mitigating label bias, achieving faster convergence and higher test accuracy than the alternatives. While this work primarily focuses on fairness within the defined scope, future research will explore more complex settings, including out-of-distribution (OOD) scenarios and the role of confounders in fairness.

Acknowledgments

We sincerely thank the reviewers for their careful reading of the manuscript and their valuable suggestions. This work was supported by the NSFC Project (No. 62106121), the MOE Project of Key Research Institute of Humanities and Social Sciences (22JJD110001), the Fundamental Research Funds for the Central Universities, and the research funds of Renmin University of China (24XNKJ13).

References

Barocas, S.; Hardt, M.; and Narayanan, A. 2018. Fairness and Machine Learning: Limitations and Opportunities.
Bengio, Y.; Louradour, J.; Collobert, R.; and Weston, J. 2009. Curriculum Learning. In Proceedings of the 26th Annual International Conference on Machine Learning, ICML '09, 41–48. New York, NY, USA: Association for Computing Machinery. ISBN 9781605585161.
Bertrand, M.; and Mullainathan, S. 2004. Are Emily and Greg More Employable Than Lakisha and Jamal? A Field Experiment on Labor Market Discrimination. American Economic Review, 94(4): 991–1013.
Brennan, T.; Dieterich, W.; and Ehret, B. 2009.
Evaluating the Predictive Validity of the COMPAS Risk and Needs Assessment System. Criminal Justice and Behavior, 36(1): 21–40.
Cheng, H.; Zhu, Z.; Li, X.; Gong, Y.; Sun, X.; and Liu, Y. 2021. Learning with Instance-Dependent Label Noise: A Sample Sieve Approach. In International Conference on Learning Representations.
Chowdhury, S. B. R.; and Chaturvedi, S. 2023. Sustaining Fairness via Incremental Learning. In Proceedings of the Thirty-Seventh AAAI Conference on Artificial Intelligence and Thirty-Fifth Conference on Innovative Applications of Artificial Intelligence and Thirteenth Symposium on Educational Advances in Artificial Intelligence, AAAI'23/IAAI'23/EAAI'23. AAAI Press. ISBN 978-1-57735-880-0.
Coleman, C.; Yeh, C.; Mussmann, S.; Mirzasoleiman, B.; Bailis, P.; Liang, P.; Leskovec, J.; and Zaharia, M. 2019. Selection via Proxy: Efficient Data Selection for Deep Learning. CoRR, abs/1906.11829.
Dai, J. 2020. Label Bias, Label Shift: Fair Machine Learning with Unreliable Labels.
Dasgupta, A.; and Ghosh, A. 2013. Crowdsourced Judgement Elicitation with Endogenous Proficiency. In Proceedings of the 22nd International Conference on World Wide Web, WWW '13, 319–330. New York, NY, USA: Association for Computing Machinery. ISBN 9781450320351.
Deng, Z.; Cui, P.; and Zhu, J. 2023. Towards Accelerated Model Training via Bayesian Data Selection. In Thirty-seventh Conference on Neural Information Processing Systems.
Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; Uszkoreit, J.; and Houlsby, N. 2021. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. In International Conference on Learning Representations.
Faliagka, E.; Ramantas, K.; Tsakalidis, A.; and Tzimas, G. 2012. Application of Machine Learning Algorithms to an Online Recruitment System.
Freund, Y.; Seung, H. S.; Shamir, E.; and Tishby, N. 1997.
Selective Sampling Using the Query by Committee Algorithm. Machine Learning, 28(2–3): 133–168.
Hardt, M.; Price, E.; and Srebro, N. 2016. Equality of Opportunity in Supervised Learning. CoRR, abs/1610.02413.
He, K.; Zhang, X.; Ren, S.; and Sun, J. 2016. Deep Residual Learning for Image Recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition, CVPR 2016, Las Vegas, NV, USA, June 27-30, 2016, 770–778. IEEE Computer Society.
Huang, G.; Liu, Z.; van der Maaten, L.; and Weinberger, K. Q. 2017. Densely Connected Convolutional Networks. In CVPR, 2261–2269. IEEE Computer Society. ISBN 978-1-5386-0457-1.
Jiang, A. H.; Wong, D. L.; Zhou, G.; Andersen, D. G.; Dean, J.; Ganger, G. R.; Joshi, G.; Kaminsky, M.; Kozuch, M.; Lipton, Z. C.; and Pillai, P. 2019. Accelerating Deep Learning by Focusing on the Biggest Losers. CoRR, abs/1910.00762.
Jiang, H.; and Nachum, O. 2020. Identifying and Correcting Label Bias in Machine Learning. In Chiappa, S.; and Calandra, R., eds., Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, volume 108 of Proceedings of Machine Learning Research, 702–712. PMLR.
Jiang, Z.; Han, X.; Jin, H.; Wang, G.; Chen, R.; Zou, N.; and Hu, X. 2023. Chasing Fairness Under Distribution Shift: A Model Weight Perturbation Approach. In Thirty-seventh Conference on Neural Information Processing Systems.
Kamiran, F.; and Calders, T. 2012. Data Preprocessing Techniques for Classification without Discrimination. Knowledge and Information Systems, 33(1): 1–33.
Katharopoulos, A.; and Fleuret, F. 2018. Not All Samples Are Created Equal: Deep Learning with Importance Sampling. In Dy, J.; and Krause, A., eds., Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, 2525–2534. PMLR.
Kawaguchi, K.; and Lu, H. 2019. Ordered SGD: A New Stochastic Optimization Framework for Empirical Risk Minimization.
In International Conference on Artificial Intelligence and Statistics.
Khandani, A. E.; Kim, A. J.; and Lo, A. W. 2010. Consumer Credit-Risk Models via Machine-Learning Algorithms. Journal of Banking & Finance, 34(11): 2767–2787.
Killamsetty, K.; Sivasubramanian, D.; Ramakrishnan, G.; and Iyer, R. 2020. GLISTER: Generalization Based Data Subset Selection for Efficient and Robust Learning. In AAAI Conference on Artificial Intelligence.
Kim, S.-E.; Paik, H. Y.; Yoon, H.; Lee, J.; Kim, N.; and Sung, M.-K. 2015. Sex- and Gender-Specific Disparities in Colorectal Cancer Risk. World Journal of Gastroenterology, 21: 5167–5175.
Konstantinov, N.; and Lampert, C. H. 2022. Fairness-Aware PAC Learning from Corrupted Data. Journal of Machine Learning Research, 23(160): 1–60.
Lin, Z. J.; Jung, J.; Goel, S.; and Skeem, J. 2020. The Limits of Human Predictions of Recidivism. Science Advances, 6(7): eaaz0652.
Liu, Y.; and Guo, H. 2020. Peer Loss Functions: Learning from Noisy Labels without Knowing Noise Rates. In Proceedings of the 37th International Conference on Machine Learning, ICML 2020, 13-18 July 2020, Virtual Event, volume 119 of Proceedings of Machine Learning Research, 6226–6236. PMLR.
Liu, Z.; Luo, P.; Wang, X.; and Tang, X. 2015. Deep Learning Face Attributes in the Wild. In 2015 IEEE International Conference on Computer Vision (ICCV), 3730–3738.
Loshchilov, I.; and Hutter, F. 2015. Online Batch Selection for Faster Training of Neural Networks. CoRR, abs/1511.06343.
Loshchilov, I.; and Hutter, F. 2017. Decoupled Weight Decay Regularization. In International Conference on Learning Representations.
Mindermann, S.; Brauner, J. M.; Razzak, M. T.; Sharma, M.; Kirsch, A.; Xu, W.; Höltgen, B.; Gomez, A. N.; Morisot, A.; Farquhar, S.; and Gal, Y. 2022. Prioritized Training on Points that are Learnable, Worth Learning, and not yet Learnt.
In Chaudhuri, K.; Jegelka, S.; Song, L.; Szepesvari, C.; Niu, G.; and Sabato, S., eds., Proceedings of the 39th International Conference on Machine Learning, volume 162 of Proceedings of Machine Learning Research, 15630–15649. PMLR.
Moore, J. S. 1998. An Expert System Approach to Graduate School Admission Decisions and Academic Performance Prediction. Omega, 26(5): 659–670.
Mukerjee, A.; Biswas, R.; Deb, K.; and Mathur, A. P. 2002. Multi-objective Evolutionary Algorithms for the Risk-Return Trade-off in Bank Loan Management. International Transactions in Operational Research, 9.
Schrouff, J.; Harris, N.; Koyejo, O. O.; Alabdulmohsin, I.; Schnider, E.; Opsahl-Ong, K.; Brown, A.; Roy, S.; Mincu, D.; Chen, C.; Dieng, A.; Liu, Y.; Natarajan, V.; Karthikesalingam, A.; Heller, K. A.; Chiappa, S.; and D'Amour, A. 2022. Diagnosing Failures of Fairness Transfer across Distribution Shift in Real-World Medical Settings. In Oh, A. H.; Agarwal, A.; Belgrave, D.; and Cho, K., eds., Advances in Neural Information Processing Systems.
Seung, H. S.; Opper, M.; and Sompolinsky, H. 1992. Query by Committee. In Proceedings of the Fifth Annual Workshop on Computational Learning Theory, COLT '92, 287–294. New York, NY, USA: Association for Computing Machinery. ISBN 089791497X.
Shnayder, V.; Agarwal, A.; Frongillo, R.; and Parkes, D. C. 2016. Informed Truthfulness in Multi-Task Peer Prediction. In Proceedings of the 2016 ACM Conference on Economics and Computation, EC '16, 179–196. New York, NY, USA: Association for Computing Machinery. ISBN 9781450339360.
Wang, J.; Liu, Y.; and Levy, C. 2021. Fair Classification with Group-Dependent Label Noise. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency, FAccT '21, 526–536. New York, NY, USA: Association for Computing Machinery. ISBN 9781450383097.
Wick, M.; Panda, S.; and Tristan, J.-B. 2019. Unlocking Fairness: A Trade-off Revisited.
In Wallach, H.; Larochelle, H.; Beygelzimer, A.; d'Alché-Buc, F.; Fox, E.; and Garnett, R., eds., Advances in Neural Information Processing Systems, volume 32, 8783–8792. Curran Associates, Inc.
Wolf, L.; Hassner, T.; and Taigman, Y. 2011. Effective Unconstrained Face Recognition by Combining Multiple Descriptors and Learned Background Statistics. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(10): 1978–1990.
Yeh, I.; and Lien, C.-H. 2009. The Comparisons of Data Mining Techniques for the Predictive Accuracy of Probability of Default of Credit Card Clients. Expert Systems with Applications, 36: 2473–2480.
Zhang, Y.; Zhou, F.; Li, Z.; Wang, Y.; and Chen, F. 2021. Bias-Tolerant Fair Classification. In Balasubramanian, V. N.; and Tsang, I., eds., Proceedings of The 13th Asian Conference on Machine Learning, volume 157 of Proceedings of Machine Learning Research, 840–855. PMLR.
Zhang, Y.; Zhou, F.; Li, Z.; Wang, Y.; and Chen, F. 2023. Fair Representation Learning with Unreliable Labels. In Ruiz, F.; Dy, J.; and van de Meent, J.-W., eds., Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, volume 206 of Proceedings of Machine Learning Research, 4655–4667. PMLR.