
Generative Well-intentioned Networks

Justin Cosentino, Jun Zhu Dept. of Comp. Sci. & Tech., Institute for AI, THBI Lab, BNRist Center, State Key Lab for Intell. Tech. & Sys., Tsinghua University, Beijing, China justin@cosentino.io, dcszj@mail.tsinghua.edu.cn

We propose Generative Well-intentioned Networks (GWINs), a novel framework for increasing the accuracy of certainty-based, closed-world classifiers. A conditional generative network recovers the distribution of observations that the classifier labels correctly with high certainty. We introduce a reject option to the classifier during inference, allowing the classifier to reject an observation instance rather than predict an uncertain label. These rejected observations are translated by the generative network to high-certainty representations, which are then relabeled by the classifier. This architecture allows for any certainty-based classifier or rejection function and is not limited to multilayer perceptrons. We assess the capability of this framework using benchmark classification datasets and show that GWINs significantly improve accuracy on uncertain observations.

1 Introduction

An essential aspect of any machine learning system is understanding what the model does not know. Despite achieving state-of-the-art performance across a wide array of problem domains, current deep learning techniques do not actually capture model uncertainty. Core settings in which standard deep learning approaches have been deployed, such as medical diagnoses, autonomous vehicles, and critical systems, rely on accurate estimates of uncertainty [16, 10]. Though traditional Bayesian probability theory offers mathematical tools to reason about model uncertainty, such approaches do not scale to the high dimensional feature spaces found in many deep learning tasks. The need for principled uncertainty estimates from deep learning architectures has given rise to the ﬁeld of Bayesian deep learning (see e.g., [35]) and many deep learning techniques have been interpreted through a Bayesian lens with the development of advanced inference algorithms [36, 39], providing novel methods for obtaining uncertainty estimates from deep learning models [21, 11, 12, 13, 22].

One may be able to measure epistemic uncertainty (uncertainty in model prediction due to a lack of knowledge) using Bayesian neural networks [25, 29], but the question of how best to utilize uncertainty estimates remains. In this paper, we propose Generative Well-intentioned Networks (GWINs), a novel framework that leverages these uncertainty estimates to increase the generalizability and accuracy of certainty-based classifiers. Rather than make low-certainty predictions, a model can reject an observation to achieve an arbitrarily high accuracy [5]. However, a model that refuses to classify is not particularly useful. Borrowing ideas from the fields of classification with rejection and generative networks, we allow a classifier to reject uncertain observations and then, using a generative network, transform them into representations that the classifier labels correctly with high certainty. Informally, one can view the classifier as intuition and the generative network as critical thinking: given a new observation that we cannot quickly reason about with prior knowledge, we apply critical thinking to reformulate the problem by relating it to information we already know to be true. We show that the generative network G is able to recover the distribution of observations that classifier C labels correctly with high certainty and that this reformulation process significantly increases classifier accuracy on the rejected observation subset.

Corresponding author.

33rd Conference on Neural Information Processing Systems (NeurIPS 2019), Vancouver, Canada.

The rest of this paper is organized as follows. We introduce the necessary background regarding Generative Adversarial Networks (GANs) and rejection-based classiﬁcation in Section 2. Our proposed GWIN framework is formally deﬁned in Section 3 and a sample GWIN implementation is detailed in Section 4. We then empirically evaluate the effectiveness of the proposed framework in Section 5. Lastly, we discuss related works in Section 6.

2 Preliminaries

2.1 Generative Adversarial Networks

Generative Adversarial Networks (GANs) [17] are generative models that make use of an adversarial process between two networks to learn a distribution: a generator network G produces synthetic data given some noise vector z while a discriminator network D discriminates between the generator's output and samples from the true data distribution. The goal of the generator is to produce samples that fool the discriminator. Formally, this adversarial game results in the following minimax objective:

$$\min_G \max_D \; \mathbb{E}_{x \sim P_r}[\log(D(x))] + \mathbb{E}_{\tilde{x} \sim P_g}[\log(1 - D(\tilde{x}))], \tag{1}$$

where $P_r$ is the real data distribution and $P_g$ is the generated distribution implicitly defined by $\tilde{x} = G(z)$. Here z is a random noise vector sampled from a simple noise distribution p, i.e., $z \sim p(z)$. With enough capacity, the discriminator will reach an optimum given G so that $P_r = P_g$ [17].

It is well known that GANs suffer from training instability [33], and it has been suggested that the divergences which GANs usually minimize are the cause of such training difficulties [2]. The Wasserstein GAN (WGAN) proposes the use of the Earth-Mover distance to define its objective function:

$$\min_G \max_{D \in \mathcal{D}} \; \mathbb{E}_{x \sim P_r}[D(x)] - \mathbb{E}_{\tilde{x} \sim P_g}[D(\tilde{x})], \tag{2}$$

where $\mathcal{D}$ is the set of 1-Lipschitz functions. The Wasserstein GAN with gradient penalty (WGAN-GP) [19] further builds on this work, providing a final objective function with desirable properties:

$$\min_G \max_D \; \mathbb{E}_{\tilde{x} \sim P_g}[D(\tilde{x})] - \mathbb{E}_{x \sim P_r}[D(x)] + \lambda \, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}\left[(\|\nabla_{\hat{x}} D(\hat{x})\|_2 - 1)^2\right]. \tag{3}$$

Lastly, GANs can be extended to conditional models by conditioning both the discriminator and generator on auxiliary information y [27]. By providing y as additional input to each network, the original GAN objective function presented in Equation (1) becomes:

$$\min_G \max_D \; \mathbb{E}_{x \sim P_r}[\log(D(x, y))] + \mathbb{E}_{z \sim P_z}[\log(1 - D(G(z, y), y))]. \tag{4}$$

In this work, we build upon a conditional implementation of the WGAN with gradient penalty.
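As a small numerical illustration of the gradient penalty term in Equation (3), the sketch below assumes a linear critic D(x) = w · x. For such a critic the gradient with respect to the input is w at every point, so the penalty can be evaluated without automatic differentiation. The function names and toy values are illustrative assumptions and are not taken from the paper.

```python
import math

def interpolate(x_real, x_fake, eps):
    """x_hat = eps * x_real + (1 - eps) * x_fake, the WGAN-GP sampling scheme."""
    return [eps * r + (1.0 - eps) * f for r, f in zip(x_real, x_fake)]

def linear_critic(w, x):
    """Toy critic D(x) = w . x (an assumption made only for this sketch)."""
    return sum(wi * xi for wi, xi in zip(w, x))

def gradient_penalty_linear(w):
    """(||grad_x D(x_hat)||_2 - 1)^2; for a linear critic the gradient is w."""
    grad_norm = math.sqrt(sum(wi * wi for wi in w))
    return (grad_norm - 1.0) ** 2

def critic_loss(w, x_real, x_fake, lam=10.0):
    """Single-pair value of D(x_fake) - D(x_real) + lam * gradient penalty."""
    wgan_term = linear_critic(w, x_fake) - linear_critic(w, x_real)
    return wgan_term + lam * gradient_penalty_linear(w)

w = [3.0, 4.0]                       # ||w||_2 = 5, far from 1-Lipschitz
x_real, x_fake = [1.0, 0.0], [0.0, 1.0]
x_hat = interpolate(x_real, x_fake, eps=0.25)
penalty = gradient_penalty_linear(w)
loss = critic_loss(w, x_real, x_fake)
```

With a gradient norm of 5 the penalty dominates the loss, which is the mechanism that pushes the critic toward unit gradient norm; a critic with weights of norm 1 incurs no penalty.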

2.2 Classiﬁcation with Reject

Entirely orthogonal to the ﬁeld of generative networks is the study of classiﬁcation with rejection. The problem of classiﬁcation with rejection can be informally deﬁned as giving the classiﬁer the option to reject an observation instance instead of predicting its label. Depending on the setting, the classiﬁer may incur some small cost for rejection, though this cost is typically less than that of a random prediction. The motivation behind rejection-based classiﬁcation is to avoid misclassiﬁcation in high risk situations, such as medical diagnoses, when the classiﬁer has low certainty that its prediction will be correct. Early works explored the inherent tradeoff between error rate and rejection rate [4, 5], while more recent works have explored the binary classiﬁcation setting [37, 3, 6]. We borrow the basic idea of threshold rejection from these works: given some threshold τ, one rejects an observation instance if certainty in correct prediction is less than τ.

Recent work also explored the reject option in the context of deep learning [14, 15]. Though we opt for the simplicity of the thresholded reject option described above, it is worth noting that these methodologies could also be used within the Generative Well-intentioned Network framework.
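The tradeoff between error rate and rejection rate can be made concrete with a few lines of code. The following is a minimal sketch on made-up certainty values (not data from our experiments): raising the threshold lowers coverage while raising accuracy on the accepted observations.

```python
def selective_accuracy(certainties, correct, tau):
    """Return (coverage, accuracy on accepted) when rejecting certainty < tau."""
    accepted = [ok for c, ok in zip(certainties, correct) if c >= tau]
    coverage = len(accepted) / len(certainties)
    # Accuracy is undefined when every observation is rejected.
    accuracy = sum(accepted) / len(accepted) if accepted else float("nan")
    return coverage, accuracy

# Hypothetical certainties and correctness flags for five observations.
certainties = [0.99, 0.95, 0.80, 0.60, 0.55]
correct = [True, True, True, False, False]

no_reject = selective_accuracy(certainties, correct, tau=0.0)
with_reject = selective_accuracy(certainties, correct, tau=0.7)
```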

Figure 1: The inference process for some new observation $x_i$. If classifier C labels the input as $y'_i$ with certainty $c_i$ and rejects the query, the conditional GWIN translates the given query to the classifier's confident distribution. The transformed query $x'_i$ is then relabeled by the classifier, i.e., $C(G(x_i, z))$. The variable z denotes a random noise vector. The top half of this figure outlines the expected interface of the rejection-based classifier. Aside from requiring the model to emit a certainty metric $c_i$ and label $y'_i$, no strong assumptions are made about the classifier. Since the classifier is fixed during generative training, it need not be a perceptron-based model. The rejection function $r : \{(c, y')\} \to \{\text{reject}, y'\}$ determines if the given observation is rejected or labeled.

3 Generative Well-intentioned Network Framework

We propose a novel framework that leverages uncertainty estimates and generative networks to increase the accuracy of certainty-based models during inference. The framework consists of three core components:

1. A pretrained, certainty-based classifier C that emits a prediction $y'_i$ with certainty $c_i$ when labeling a new observation $x_i$, i.e., $(y'_i, c_i) = C(x_i)$

2. A rejection function $r : \{(c, y')\} \to \{\text{reject}, y'\}$ that allows the classifier to reject an uncertain instance rather than predicting its label

3. A conditional generative network G that transforms an observation $x_i$ and noise vector z to a new representation $x'_i$, i.e., $x'_i = G(x_i, z)$

A key feature of this framework is that it can be used together with any certainty-based classiﬁer and does not modify the classiﬁer structure at any point during the generative training process. Assuming that the classiﬁer and rejection function provide the interface illustrated in Figure 1, any classiﬁer or rejection function can be used within this framework.

Given this fixed, certainty-based classifier C, the conditional GWIN G learns the distribution $P_c$, where $P_c$ represents the distribution of observations from the original data distribution $P_r$ that C labels correctly with high certainty. The goal of G is to generate a new observation $x' \sim P_c$ from $(x, y) \sim P_r$ that the classifier will label as ground truth y with high certainty. During inference, the classifier can choose to reject observation x if uncertain that it will label x correctly. This observation is then passed to G, along with a noise vector z, to generate a transformed sample for reclassification. The inference process is illustrated in Figure 1 and examples of the transformation process using a Wasserstein GWIN are shown in Figure 2.
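The inference procedure described above can be sketched as a short function over three callables. This is a minimal sketch under stated assumptions: the scalar toy classifier, rejection rule, and generator below are hypothetical stand-ins chosen so the example runs without any trained model, and the third return value (whether the observation was transformed) is added only for illustration.

```python
REJECT = "reject"

def gwin_inference(x, classifier, reject_fn, generator, sample_noise):
    """Label x; if the classifier rejects it, transform once with G and relabel."""
    label, certainty = classifier(x)
    if reject_fn(certainty, label) != REJECT:
        return label, certainty, False
    x_new = generator(x, sample_noise())          # x' = G(x, z)
    new_label, new_certainty = classifier(x_new)  # C(G(x, z))
    return new_label, new_certainty, True

# Hypothetical toy stand-ins on scalar inputs (not the models used in the paper).
def toy_classifier(x):
    label = 1 if x >= 0 else 0
    certainty = min(1.0, 0.5 + abs(x))  # more certain far from the boundary
    return label, certainty

def toy_reject(certainty, label, tau=0.8):
    return label if certainty >= tau else REJECT

def toy_generator(x, z):
    # Pushes x away from the decision boundary; z is ignored in this toy.
    return x + 1.0 if x >= 0 else x - 1.0

result_uncertain = gwin_inference(0.1, toy_classifier, toy_reject, toy_generator, lambda: 0.0)
result_confident = gwin_inference(-0.9, toy_classifier, toy_reject, toy_generator, lambda: 0.0)
```

Only rejected observations pass through the generator, and each does so exactly once.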

Similarly to the classifier and the rejection function, we do not place any strong restrictions on the generative framework. We propose a Wasserstein GWIN in Section 4 as one potential approach. Though the Wasserstein network makes use of an adversarial procedure, we refer to these generative networks as well-intentioned since they aim to maximize the accuracy and certainty of the provided classifier.

[Figure 2 panels, original → transformed: Old Label 9 (certainty 60.19%) → New Label 5 (96.60%); Old Label 9 (67.98%) → New Label 7 (99.86%); Old Label 5 (78.60%) → New Label 0 (99.95%).]

Figure 2: A visual representation of the GWIN transformation using example images from the MNIST Digits dataset. With a certainty threshold of τ = 0.8, the classifier rejects the observations on the left, which would have been labeled incorrectly were the classifier forced to predict. These observations are then transformed into the representations on the right using the Wasserstein GWIN described in Section 4.3. When relabeling the generated images, i.e., $C(G(x, z))$, the classifier labels them correctly with high certainty.

4 Wasserstein Generative Well-intentioned Network

We outline a sample GWIN implementation, as deﬁned in Section 3, based on the Wasserstein GAN [2]. We utilize a Bayesian Neural Network classiﬁer and a simple τ-threshold rejection function. Section 5 evaluates this proposed implementation.

4.1 Classiﬁer

The GWIN is paired with a Bayesian neural network [29] using a LeNet-5 architecture [23]. A detailed description of the classifier's architecture is in the appendix. The network is implemented using TensorFlow Probability [7], which provides clean abstractions for Bayesian variational inference. The model uses the Flipout estimator [40] to minimize the Kullback-Leibler divergence up to a constant, also known as the negative Evidence Lower Bound (ELBO).

We approximate prediction certainty using Monte Carlo sampling to draw class probabilities from the model. We treat the median prediction of these draws as the certainty metric for each class and the mean prediction value as the prediction score. The class with the highest prediction value and its certainty metric are then provided to the rejection function.
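The certainty computation described above can be sketched as follows, assuming the Monte Carlo draws are already available as a list of class-probability vectors. The function name and the toy draws are illustrative assumptions; in our setup the draws would come from repeated stochastic forward passes of the Bayesian neural network.

```python
import statistics

def predict_with_certainty(draws):
    """draws: one class-probability vector per Monte Carlo draw."""
    n_classes = len(draws[0])
    per_class = [[d[k] for d in draws] for k in range(n_classes)]
    scores = [statistics.mean(p) for p in per_class]         # mean -> prediction score
    certainties = [statistics.median(p) for p in per_class]  # median -> certainty metric
    label = max(range(n_classes), key=lambda k: scores[k])
    return label, certainties[label]

# Three hypothetical draws over three classes.
draws = [
    [0.70, 0.20, 0.10],
    [0.60, 0.30, 0.10],
    [0.20, 0.70, 0.10],
]
label, certainty = predict_with_certainty(draws)
```

Here class 0 has the highest mean score (0.5), and its certainty is the median of its three draws.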

Recall from Section 3 that the GWIN Framework is model-agnostic for certainty-based classiﬁers. Thus, experiments do not focus on improving the classiﬁer or rejection function, but rather analyze how the GWIN improves accuracy for a ﬁxed classiﬁer. In the appendix, we show that the GWIN still improves classiﬁer performance for a stronger Bayesian neural network.

4.2 Rejection Function

We use a simple τ-threshold rejection rule, where τ ∈ [0, 1]:

$$r(c_i, y'_i) = \begin{cases} y'_i, & \text{if } c_i \geq \tau \\ \text{reject}, & \text{otherwise.} \end{cases} \tag{5}$$

The choice of τ is made at inference time, meaning that this rejection function can be tuned for optimal accuracy after the generative network has been trained. Setting τ = 0 rejects no values and is equivalent to using only the base classifier, while setting τ = 1 rejects all values and is equivalent to preprocessing all input with the GWIN.
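A direct transcription of this rejection rule is shown below as a minimal sketch. The `REJECT` sentinel and the function name are illustrative choices; the example certainties are the values reported for the first panel of Figure 2.

```python
REJECT = "reject"

def threshold_reject(certainty, label, tau):
    """Return the label if certainty >= tau, otherwise the REJECT sentinel."""
    if not 0.0 <= tau <= 1.0:
        raise ValueError("tau must lie in [0, 1]")
    return label if certainty >= tau else REJECT

# First example of Figure 2 with tau = 0.8: rejected before, accepted after.
before = threshold_reject(0.6019, 9, tau=0.8)
after = threshold_reject(0.9660, 5, tau=0.8)
# tau = 0 never rejects, so the base classifier's label is always returned.
baseline = threshold_reject(0.6019, 9, tau=0.0)
```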

4.3 Wasserstein GWIN with Gradient Penalty

The Wasserstein GWIN with gradient penalty (WGWIN-GP) is based on the Wasserstein GAN with gradient penalty [19]. The architectures of both the critic and generator closely follow the original WGAN-GP models and a detailed description of these architectures is in the appendix. In this subsection, we detail core modiﬁcations to the original model.

Loss with Transformation Penalty. The WGWIN-GP introduces a new loss function with a transformation penalty that encourages the conditional generator to produce images that the classifier will label correctly. Given some training observation $(x_i, y_i)$, the generator should produce $x'_i$ that the classifier labels as $y_i$. This penalty is the loss of the classifier when labeling the transformed observations in the current training batch, denoted $\text{Loss}(C(x'))$. We include a penalty coefficient $\lambda_{Loss}$. All experiments in this paper use $\lambda_{Loss} = 10$, which we found to work well across experiments. Equation (6) shows the loss function for the GWIN:

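$$L = \underbrace{\mathbb{E}_{x' \sim P_g}[D(x', y)] - \mathbb{E}_{x \sim P_c}[D(x, y)]}_{\text{WGAN Loss}} + \underbrace{\lambda_{GP} \, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}\left[(\|\nabla_{\hat{x}} D(\hat{x}, y)\|_2 - 1)^2\right]}_{\text{WGAN-GP Penalty}} + \underbrace{\lambda_{Loss} \, \mathbb{E}_{x' \sim P_g}[\text{Loss}(C(x'))]}_{\text{Transformation Penalty}} \tag{6}$$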
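To make the three terms of the loss concrete, the sketch below evaluates them on a toy batch. It assumes that the critic outputs, gradient norms, and per-example classifier losses have already been computed elsewhere; the function name, argument names, and numbers are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def wgwin_gp_loss(d_fake, d_real, grad_norms, clf_losses, lam_gp=10.0, lam_loss=10.0):
    """Batch estimate of the three loss terms.

    d_fake:     critic outputs D(x', y) on transformed observations
    d_real:     critic outputs D(x, y) on confident observations
    grad_norms: ||grad D(x_hat, y)||_2 at the interpolated points
    clf_losses: Loss(C(x')) for each transformed observation
    """
    wgan = mean(d_fake) - mean(d_real)
    gradient_penalty = lam_gp * mean([(g - 1.0) ** 2 for g in grad_norms])
    transformation_penalty = lam_loss * mean(clf_losses)
    total = wgan + gradient_penalty + transformation_penalty
    return total, (wgan, gradient_penalty, transformation_penalty)

# Hypothetical batch of two observations.
total, (wgan, gp, tp) = wgwin_gp_loss(
    d_fake=[0.2, 0.4],
    d_real=[1.0, 1.2],
    grad_norms=[1.0, 1.5],
    clf_losses=[0.1, 0.3],
)
```

In this toy batch the transformation penalty is the largest term, so reducing the classifier's loss on transformed observations would reduce the total the most.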

Critic Training on Confident Subset. The WGAN-GP critic is typically trained on both generated data $x' \sim P_g$ and real data $x \sim P_r$. However, we want the GWIN to generate images from the classifier's confident distribution. Thus, we prefilter the training data to create a confident distribution $P_c$ containing all images that the classifier labels correctly with certainty of at least τ. The critic is then trained exclusively on samples drawn from $P_c$ and $P_g$. Note that this preprocessing threshold τ is not necessarily the same as the certainty threshold used in the rejection function. We set τ to some arbitrarily high certainty, e.g., 0.95, so that the rejection function can be tuned without needing to retrain the generative model.

Since the WGWIN-GP will encounter observations from Pr during inference, only the critic samples from Pc. During training, the generator samples from the entire real distribution Pr.
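The prefiltering step can be sketched as a simple pass over the training data. The lookup-table classifier and the string-valued observations below are hypothetical placeholders so that the example is self-contained.

```python
def build_confident_subset(dataset, classifier, tau=0.95):
    """Keep (x, y) pairs that the classifier labels correctly with certainty >= tau."""
    confident = []
    for x, y in dataset:
        label, certainty = classifier(x)
        if label == y and certainty >= tau:
            confident.append((x, y))
    return confident

# Hypothetical classifier outputs (label, certainty) keyed by observation id.
toy_predictions = {"a": (0, 0.99), "b": (1, 0.97), "c": (1, 0.60), "d": (0, 0.98)}
toy_dataset = [("a", 0), ("b", 1), ("c", 1), ("d", 1)]

confident = build_confident_subset(toy_dataset, toy_predictions.get)
```

Observation "c" is dropped for low certainty and "d" for being confidently wrong, so only correct, high-certainty examples reach the critic.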

A Conditional Generative Model. The WGWIN-GP is trained as a conditional GAN. Conditional generative networks are often class conditioned to generate an example of a specific class, and the same conditioning information is given to both the critic and generator. However, as the WGWIN-GP will not have access to the ground truth label during inference, the generator is conditioned on the entire observation x. We want the critic to discriminate between certain and uncertain observations. Since x is not guaranteed to be from $P_c$, we condition the critic on a one-hot representation of the ground truth label y in an effort to generate images that are representative of the original observation's class. Thus the generator is tasked with translating observations to new images that are from the given class in the confident distribution.

One can achieve conditioning by concatenating the conditional information with the input [27] or with a feature vector at some hidden layer within the network [32, 42]. Though other conditioning methods exist, such as modifying the discriminator's loss function to also maximize the log likelihood of the correct class [30] or projection-based approaches [28], we opted to condition the generator using input-based concatenation and to condition the critic using hidden-layer concatenation for simplicity.
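The two concatenation schemes amount to joining vectors at different points in each network. The sketch below shows only that joining step on plain Python lists; the helper names and the flattened representations are assumptions for illustration and do not reflect the actual layer shapes, which are described in the appendix.

```python
def one_hot(label, num_classes):
    return [1.0 if k == label else 0.0 for k in range(num_classes)]

def generator_input(x_flat, z):
    """Input-based concatenation: flattened observation x joined with noise z."""
    return list(x_flat) + list(z)

def critic_hidden(features, label, num_classes):
    """Hidden-layer concatenation: critic features joined with one-hot label y."""
    return list(features) + one_hot(label, num_classes)

g_in = generator_input([0.1, 0.9], [0.5])
c_hidden = critic_hidden([0.5, -1.0], label=2, num_classes=4)
```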

Algorithm 1 shows the new WGWIN-GP training algorithm.

5 Evaluation

We evaluate the WGWIN-GP using the training procedure outlined in Section 4 and the inference method illustrated in Figure 1. We compare test accuracy of the base Bayesian neural network, denoted BNN, the Bayesian neural network with reject, denoted BNN w/Reject, and the Bayesian neural network when paired with the WGWIN-GP, denoted BNN+GWIN. BNN w/Reject allows the classiﬁer to reject observations without needing to relabel while the BNN+GWIN uses the WGWIN-GP to transform and relabel the rejected subset.

The BNN was trained for 30 epochs using a learning rate of 0.001 and batch size of 128. The GWIN was trained for 200,000 iterations using the default hyperparameters listed in Algorithm 1. Both the generator and critic used a learning rate of 0.0001 and batch size of 128. We perform inference using various certainty thresholds τ ∈ {0.10, 0.30, 0.50, 0.70, 0.80, 0.90, 0.95, 0.99}. The BNN uses 10 Monte Carlo samples to determine prediction certainty.

Given the non-deterministic nature of both the Bayesian neural network and the generative network, all experimental results are averaged over 10 runs. We trained and evaluated the models using NVIDIA GeForce GTX TITAN X GPUs.

Algorithm 1: WGWIN with gradient and transformation penalty. We use default values of $\lambda_{GP} = 10$, $\lambda_{Loss} = 10$, $n_{critic} = 5$, $\alpha = 0.0001$, $\beta_1 = 0.5$, $\beta_2 = 0.9$, certainty preprocessing threshold τ = 0.95, and the fixed classifier C described in Section 4.1.

Require: The penalty coefficients $\lambda_{GP}$ and $\lambda_{Loss}$, the number of critic iterations per generator iteration $n_{critic}$, the batch size m, Adam hyperparameters $\alpha$, $\beta_1$, $\beta_2$, certainty preprocessing threshold τ, and classifier C.

Require: Initial critic parameters $w_0$, initial generator parameters $\theta_0$.

1: Build confident data distribution $P_c$ from training data $P_r$ using classifier C and threshold τ

2: while $\theta$ has not converged do

3: for $t = 1, \ldots, n_{critic}$ do

4: for $i = 1, \ldots, m$ do

5: Sample confident data $(x, y) \sim P_c$, latent variable $z \sim p(z)$, and a random number $\epsilon \sim U[0, 1]$.

6: $x' \leftarrow G_\theta(x, z)$

7: $\hat{x} \leftarrow \epsilon x + (1 - \epsilon) x'$

8: $L^{(i)} \leftarrow D_w(x', y) - D_w(x, y) + \lambda_{GP} (\|\nabla_{\hat{x}} D_w(\hat{x}, y)\|_2 - 1)^2$

9: end for

10: $w \leftarrow \text{Adam}(\nabla_w \frac{1}{m} \sum_{i=1}^{m} L^{(i)}, w, \alpha, \beta_1, \beta_2)$

11: end for

12: Sample a batch of training data $\{(x, y)^{(i)}\}_{i=1}^{m} \sim P_r$ and latent variables $\{z^{(i)}\}_{i=1}^{m} \sim p(z)$

13: $\theta \leftarrow \text{Adam}(\nabla_\theta \frac{1}{m} \sum_{i=1}^{m} \left[-D_w(G_\theta(x, z), y) + \lambda_{Loss} \, \text{Loss}(C(G_\theta(x, z)))\right], \theta, \alpha, \beta_1, \beta_2)$

14: end while

5.1 Datasets

We use two different datasets in our experiments: the MNIST handwritten digits [23] dataset and the Fashion-MNIST clothing dataset [41]. Both datasets consist of 60,000 training images and 10,000 test images. We further split both training sets into a 50,000 image training set and 10,000 image validation set. Each example is a 28x28x1 grayscale image associated with a label from one of ten classes. Images are preprocessed by normalizing grayscale values to [0, 1].

Building the confident distribution $P_c$ filters each dataset by a varying amount. The average size of the high-certainty training dataset is 47,948 for MNIST Digits and 31,760 for MNIST Fashion.

5.2 Results

Figure 3 and Figure 4 illustrate the mean accuracy for varying certainty rejection thresholds on each dataset while Table 1 and Table 2 present exact accuracy values on the rejected subset. At every certainty threshold, the BNN+GWIN outperforms the BNN on uncertain observations by up to 35% on MNIST Digits and 20% on MNIST Fashion. As the certainty threshold increases, we see the size of the rejected subset increase and the relative gains from the GWIN transformation decrease. However, this is expected as we begin to reject observations that the BNN already labels correctly with higher certainty. Figure 5 shows the change in certainty of the ground truth label at varying certainty rejection thresholds. Though the GWIN increases certainty in the ground truth label in the majority of observations, it is possible for the GWIN to map an observation to a lower-certainty representation. This suggests that one must carefully tune the rejection function and certainty metrics to minimize the number of correct instances that are mistranslated.
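As a quick consistency check on the columns of Table 1 and Table 2: since only rejected observations are relabeled, the change in overall accuracy should be approximately the rejected fraction multiplied by the accuracy gain on the rejected subset. The sketch below applies this relation to reported rows. Because the tables report means over 10 runs, only approximate agreement is expected; the function name is an illustrative choice.

```python
def overall_gain(pct_reject, rejected_gain):
    """Change in overall accuracy (percentage points) implied by the rejected-subset gain."""
    return pct_reject / 100.0 * rejected_gain

# (tau, % Reject, gain on rejected subset, reported overall gain)
mnist_rows = [(0.50, 0.39, 35.36, 0.14), (0.80, 2.74, 27.39, 0.75), (0.99, 11.00, 9.02, 0.99)]
fashion_rows = [(0.80, 21.21, 11.29, 2.39)]

implied_mnist = [round(overall_gain(r, g), 2) for _, r, g, _ in mnist_rows]
implied_fashion = [round(overall_gain(r, g), 2) for _, r, g, _ in fashion_rows]
```

The implied values agree with the reported overall gains to within rounding, which also shows why the overall gain can keep growing while the rejected-subset gain shrinks: the rejected fraction grows faster.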

6 Related Work

Classifiers and inference networks have been paired with generative adversarial networks in the past, but the goal of these models has been to either learn a mapping from data to latent representations or improve class-conditional generation [8, 9, 24]. Though GWINs also contain an additional classification network, the objective of the generative network is not solely image synthesis or uncovering latent factors, but rather is to reprocess observations in order to increase the classifier's generalizability and accuracy.

Table 1: Test set accuracy for MNIST Digits on rejected observations using GWIN transformation for the given certainty threshold τ. BNN and BNN+GWIN denote accuracy for the rejected subset using only the BNN and the BNN with GWIN reformulation, respectively. With no rejections (τ = 0), the BNN had an accuracy of 98.0%. Δ Overall Acc. is the change in accuracy while Δ % Error denotes the percent change in error rate for the entire test set when the GWIN is applied to rejected queries. All results are presented as the mean over 10 runs.

| τ | % Reject | BNN Acc. | BNN+GWIN Acc. | Δ Rejected Acc. | Δ Overall Acc. | Δ % Error |
|---|---|---|---|---|---|---|
| 0.50 | 0.39 | 40.23 ± 8.51 | 75.59 ± 4.22 | 35.36 ± 8.66 | 0.14 ± 0.04 | 6.98 ± 2.08 |
| 0.70 | 1.83 | 54.48 ± 2.21 | 85.07 ± 2.63 | 30.59 ± 2.64 | 0.56 ± 0.06 | 27.55 ± 2.66 |
| 0.80 | 2.74 | 58.91 ± 1.49 | 86.30 ± 1.85 | 27.39 ± 2.03 | 0.75 ± 0.06 | 36.36 ± 1.93 |
| 0.90 | 4.39 | 68.79 ± 2.38 | 86.95 ± 0.97 | 18.16 ± 2.55 | 0.80 ± 0.13 | 40.26 ± 4.19 |
| 0.95 | 6.04 | 73.48 ± 1.66 | 89.34 ± 0.85 | 15.86 ± 2.07 | 0.96 ± 0.13 | 47.45 ± 4.09 |
| 0.99 | 11.00 | 83.54 ± 0.88 | 92.55 ± 0.49 | 9.02 ± 0.94 | 0.99 ± 0.10 | 49.45 ± 3.16 |

Figure 3: Test set accuracy for MNIST Digits using GWIN transformation for the given certainty threshold τ. Figure 3a (rejected subset accuracy) shows BNN and BNN+GWIN accuracy on the rejected subset. % Reject represents the percent of the 10,000 observations rejected by the classifier for the current certainty threshold. Figure 3b (overall test set accuracy) shows the accuracy of the BNN and BNN+GWIN on the entire test set. All results are presented as the mean over 10 runs and error bars show standard deviation.

To the best of our knowledge, Defense-GAN is the only other instance of pairing a GAN with a classification network to increase performance during inference [34]. Defense-GAN serves as a defense against adversarial examples by using a GAN to denoise perturbed images prior to classification. A WGAN is first trained to capture the unperturbed training distribution. Prior to labeling a new observation x, the image is projected onto the range of the generator by minimizing the reconstruction error,

$$\min_z \|G(z) - x\|_2^2,$$

using L steps of gradient descent for R different samples of z.
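The projection step can be illustrated with a toy example. The sketch below assumes a linear generator G(z) = w z with scalar z, so that the gradient of the reconstruction error has a closed form; this is a deliberate simplification and not Defense-GAN's actual generator. The names, step size, and values are hypothetical. It makes the R restarts and L gradient steps explicit, for a total of R × L generator evaluations per observation.

```python
import random

def generator(w, z):
    """Toy linear generator G(z) = w * z (an assumption for this sketch)."""
    return [wi * z for wi in w]

def recon_error(w, z, x):
    return sum((g - xi) ** 2 for g, xi in zip(generator(w, z), x))

def recon_grad(w, z, x):
    # d/dz sum_i (w_i z - x_i)^2 = 2 sum_i w_i (w_i z - x_i)
    return 2.0 * sum(wi * (wi * z - xi) for wi, xi in zip(w, x))

def project(x, w, num_restarts, num_steps, lr=0.05, seed=0):
    """Minimize ||G(z) - x||^2 with L gradient steps from each of R random starts."""
    rng = random.Random(seed)
    best_z, best_err = None, float("inf")
    for _ in range(num_restarts):        # R different samples of z
        z = rng.gauss(0.0, 1.0)
        for _ in range(num_steps):       # L steps of gradient descent
            z -= lr * recon_grad(w, z, x)
        err = recon_error(w, z, x)
        if err < best_err:
            best_z, best_err = z, err
    return best_z, best_err

w = [1.0, 2.0]
x = [3.0, 6.0]                           # lies exactly in the range of G at z = 3
z_star, err = project(x, w, num_restarts=3, num_steps=50)
```

For this convex toy problem every restart reaches the same minimizer; restarts matter for non-convex generators, where different initial z values can converge to different local minima.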

Though both Defense-GAN and GWINs use WGAN-based implementations to improve classifier inference, there are a number of differences between these two generative models that stem from the differences in the problems they attempt to solve:

Defense-GAN aims to denoise adversarial examples by projecting images back to the real data set while minimizing reconstruction loss. However, this assumes that there exists a denoised equivalent of each observation in the real dataset. GWINs, on the other hand, use a conditional WGAN in order to create high-certainty representations of the same class as the original observation.

Table 2: Test set accuracy for MNIST Fashion on rejected observations using GWIN transformation for the given certainty threshold τ. BNN and BNN+GWIN denote accuracy for the rejected subset using only the BNN and the BNN with GWIN reformulation, respectively. With no rejections (τ = 0), the BNN had an accuracy of 87.4%. Δ Overall Acc. denotes the change in accuracy while Δ % Error denotes the percent change in error rate for the entire test set when the GWIN is applied to rejected queries. All results are presented as the mean over 10 runs.

| τ | % Reject | BNN Acc. | BNN+GWIN Acc. | Δ Rejected Acc. | Δ Overall Acc. | Δ % Error |
|---|---|---|---|---|---|---|
| 0.50 | 4.18 | 40.52 ± 2.36 | 59.43 ± 2.30 | 18.91 ± 3.61 | 0.79 ± 0.17 | 6.22 ± 1.24 |
| 0.70 | 15.25 | 52.08 ± 1.55 | 66.95 ± 0.67 | 14.87 ± 1.78 | 2.27 ± 0.30 | 18.08 ± 1.98 |
| 0.80 | 21.21 | 57.87 ± 0.89 | 69.16 ± 0.47 | 11.29 ± 0.87 | 2.39 ± 0.19 | 19.25 ± 1.32 |
| 0.90 | 30.29 | 64.14 ± 0.66 | 73.18 ± 0.73 | 9.04 ± 0.83 | 2.74 ± 0.29 | 21.63 ± 1.85 |
| 0.95 | 37.30 | 68.93 ± 0.49 | 76.06 ± 0.43 | 7.14 ± 0.61 | 2.66 ± 0.25 | 21.15 ± 1.61 |
| 0.99 | 51.97 | 76.55 ± 0.30 | 81.34 ± 0.26 | 4.79 ± 0.34 | 2.49 ± 0.19 | 19.94 ± 1.30 |

Figure 4: Test set accuracy for MNIST Fashion using GWIN transformation for the given certainty threshold τ. Figure 4a (rejected subset accuracy) shows BNN and BNN+GWIN accuracy on the rejected subset. % Reject represents the percent of the 10,000 observations rejected by the classifier for the current certainty threshold. Figure 4b (overall test set accuracy) shows the accuracy of the BNN and BNN+GWIN on the entire test set. All results are presented as the mean over 10 runs and error bars show standard deviation.

Defense-GAN preprocesses all input to the classifier, incurring the cost of the R × L generations to label each observation. GWINs only transform rejected observations and require at most a single pass through the generator. We include notes on transformation latency for MNIST experiments in the appendix.

GWINs make stronger assumptions about the classiﬁer than Defense-GAN, requiring a certainty metric and reject function, but can be used for any classiﬁcation task and are not limited to adversarial robustness.

GWINs use the ﬁxed classiﬁer during training, while Defense-GAN is trained independently.

The novel contribution of GWINs is using the generative network to learn Pc of a certainty-based classiﬁer. The WGWIN-GP is just one possible implementation of this idea; though Defense-GAN is structured differently to address adversarial examples, one could imagine a similar method being applied as a new GWIN implementation. We leave this for future work.

Similarly to both Defense-GAN and GWINs, MagNet [26] is a framework that contains a detector network that learns to differentiate between normal and adversarial examples and a reformer network that moves adversarial examples towards the manifold of normal examples in order to protect against adversarial examples with small perturbations. Though this seems to be the second closest model to GWINs, MagNet relies on auto-encoders and also focuses on increasing a model's robustness to adversarial examples rather than making use of classifier certainty to label novel examples from the normal manifold.

Figure 5: Change in rejected sample certainty of the ground truth label for varying certainty rejection thresholds τ, for (a) MNIST Digits and (b) MNIST Fashion. Outliers are those values that fall outside of 1.5 × IQR and are denoted with diamonds.

Other common strategies for denoising adversarial examples do not translate well to the uncertainty-rejection paradigm. Network distillation [31] trains a classifier such that it is nearly impossible to generate adversarial examples using gradient-based attacks. However, novel observations that might make a classifier uncertain in its predictions are not necessarily generated in an adversarial manner and thus we have no need to mask the network's gradients. Adversarial training [18] is specific to the attack generating the adversarial examples and does not necessarily generalize well to other attacks. Methods that generate additional training data, similarly to hallucination methods in the few-shot learning domain [1, 20, 38], aim to increase the robustness of a classifier during training by generating out-of-distribution training data, while our method assumes a fixed, pretrained classifier and uses generative methods to translate novel, out-of-distribution examples to the confident distribution during inference. Since the GWIN framework learns representations that the classifier labels correctly with high confidence, these generative denoising methods can easily be paired with our framework: a classifier is trained using the aforementioned techniques and the GWIN is then used to transform any novel examples that the new classifier is not entirely robust to. Similarly to Defense-GAN and MagNet, the flexibility and additive nature of our framework means that we can easily build atop these existing denoising methodologies. Since noise only represents a subset of out-of-distribution observations, we cannot rely entirely on denoising techniques to address classifier robustness. GWINs take a step towards a generalizable, principled framework for rethinking uncertain examples and leveraging classifier uncertainty.

7 Conclusion

In this work, we outlined Generative Well-intentioned Networks (GWINs), a novel framework leveraging uncertainty and generative networks to increase classiﬁer accuracy. We proposed a high level architecture making use of certainty-based classiﬁers, a rejection function, and a generative network. We deﬁned a baseline implementation, the Wasserstein GWIN with gradient penalty (WGWIN-GP), and empirically showed that the WGWIN-GP outperforms the base Bayesian neural network at all certainty thresholds. This paper has demonstrated the viability of the GWIN framework and we hope that our work leads to further study of the use of generative networks to aid classiﬁer inference.

Acknowledgements

This work was supported by the National Key Research and Development Program of China (No. 2017YFA0700904), NSFC Projects (Nos. 61620106010, 61621136008, 61571261), Beijing NSF Project (No. L172037), Beijing Academy of Artiﬁcial Intelligence (BAAI), Tiangong Institute for Intelligent Computing, the JP Morgan Faculty Research Program, and the NVIDIA NVAIL Program with GPU/DGX Acceleration.

[1] Antreas Antoniou, Amos Storkey, and Harrison Edwards. Data augmentation generative adversarial networks, 2017.

[2] Martin Arjovsky, Soumith Chintala, and Léon Bottou. Wasserstein gan, 2017.

[3] Peter Bartlett and Marten Wegkamp. Classification with a reject option using a hinge loss. Journal of Machine Learning Research, 9(8):1823–1840, 2008.

[4] Chi-Keung Chow. An optimum character recognition system using decision functions. IRE Transactions on Electronic Computers, (4):247–254, 1957.

[5] Chi-Keung Chow. On optimum recognition error and reject tradeoff. IEEE Transactions on Information Theory, 16(1):41–46, 1970.

[6] Corinna Cortes, Giulia DeSalvo, and Mehryar Mohri. Learning with rejection. In International Conference on Algorithmic Learning Theory, pages 67–82. Springer, 2016.

[7] Joshua V. Dillon, Ian Langmore, Dustin Tran, Eugene Brevdo, Srinivas Vasudevan, Dave Moore, Brian Patton, Alex Alemi, Matt Hoffman, and Rif A. Saurous. Tensorﬂow distributions, 2017.

[8] Jeff Donahue, Philipp Krähenbühl, and Trevor Darrell. Adversarial feature learning. In International Conference on Learning Representations, 2017.

[9] Vincent Dumoulin, Ishmael Belghazi, Ben Poole, Olivier Mastropietro, Alex Lamb, Martin Arjovsky, and Aaron Courville. Adversarially learned inference. In International Conference on Learning Representations, 2017.

[10] Yarin Gal. Uncertainty in Deep Learning. PhD thesis, University of Cambridge, 2016.

[11] Yarin Gal and Zoubin Ghahramani. Bayesian convolutional neural networks with bernoulli approximate variational inference, 2015.

[12] Yarin Gal and Zoubin Ghahramani. Dropout as a bayesian approximation: Representing model uncertainty in deep learning. In International Conference on Machine Learning, pages 1050 1059, 2016.

[13] Yarin Gal and Zoubin Ghahramani. A theoretically grounded application of dropout in recurrent neural networks. In Advances in Neural Information Processing Systems, pages 1019–1027, 2016.

[14] Yonatan Geifman and Ran El-Yaniv. Selective classification for deep neural networks. In Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS '17, pages 4885–4894, USA, 2017. Curran Associates Inc.

[15] Yonatan Geifman and Ran El-Yaniv. SelectiveNet: A deep neural network with an integrated reject option. In Kamalika Chaudhuri and Ruslan Salakhutdinov, editors, Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pages 2151–2159, Long Beach, California, USA, 09–15 Jun 2019. PMLR.

[16] Zoubin Ghahramani. Probabilistic machine learning and artificial intelligence. Nature, 521(7553):452, 2015.

[17] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. In Advances in Neural Information Processing Systems, pages 2672–2680, 2014.

[18] Ian J. Goodfellow, Jonathon Shlens, and Christian Szegedy. Explaining and harnessing adversarial examples, 2014.

[19] Ishaan Gulrajani, Faruk Ahmed, Martin Arjovsky, Vincent Dumoulin, and Aaron C Courville. Improved training of wasserstein gans. In Advances in Neural Information Processing Systems, pages 5767–5777, 2017.

[20] Bharath Hariharan and Ross Girshick. Low-shot visual recognition by shrinking and hallucinating features. In Proceedings of the IEEE International Conference on Computer Vision, pages 3018–3027, 2017.

[21] Durk P Kingma, Tim Salimans, and Max Welling. Variational dropout and the local reparameterization trick. In Advances in Neural Information Processing Systems, pages 2575–2583, 2015.

[22] Balaji Lakshminarayanan, Alexander Pritzel, and Charles Blundell. Simple and scalable predictive uncertainty estimation using deep ensembles. In Advances in Neural Information Processing Systems, pages 6402–6413, 2017.

[23] Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.

[24] Chongxuan Li, Taufik Xu, Jun Zhu, and Bo Zhang. Triple generative adversarial nets. In Advances in Neural Information Processing Systems, pages 4088–4098, 2017.

[25] David JC MacKay. A practical bayesian framework for backpropagation networks. Neural Computation, 4(3):448–472, 1992.

[26] Dongyu Meng and Hao Chen. Magnet: A two-pronged defense against adversarial examples. In Proceedings of the 2017 ACM SIGSAC Conference on Computer and Communications Security, CCS '17, pages 135–147, New York, NY, USA, 2017. ACM.

[27] Mehdi Mirza and Simon Osindero. Conditional generative adversarial nets, 2014.

[28] Takeru Miyato and Masanori Koyama. cGANs with projection discriminator. In International Conference on Learning Representations, 2018.

[29] Radford M Neal. Bayesian learning for neural networks. PhD thesis, University of Toronto, 1995.

[30] Augustus Odena, Christopher Olah, and Jonathon Shlens. Conditional image synthesis with auxiliary classifier gans. In Proceedings of the 34th International Conference on Machine Learning - Volume 70, pages 2642–2651. JMLR.org, 2017.

[31] Nicolas Papernot, Patrick McDaniel, Xi Wu, Somesh Jha, and Ananthram Swami. Distillation as a defense to adversarial perturbations against deep neural networks. In 2016 IEEE Symposium on Security and Privacy (SP), pages 582–597. IEEE, 2016.

[32] Scott Reed, Zeynep Akata, Xinchen Yan, Lajanugen Logeswaran, Bernt Schiele, and Honglak Lee. Generative adversarial text to image synthesis. In Proceedings of the 33rd International Conference on Machine Learning - Volume 48, ICML '16, pages 1060–1069. JMLR.org, 2016.

[33] Tim Salimans, Ian Goodfellow, Wojciech Zaremba, Vicki Cheung, Alec Radford, and Xi Chen. Improved techniques for training gans. In Advances in Neural Information Processing Systems, pages 2234–2242, 2016.

[34] Pouya Samangouei, Maya Kabkab, and Rama Chellappa. Defense-gan: Protecting classifiers against adversarial attacks using generative models. In International Conference on Learning Representations, 2018.

[35] Jiaxin Shi, Jianfei Chen, Jun Zhu, Shengyang Sun, Yucen Luo, Yihong Gu, and Yuhao Zhou. ZhuSuan: A library for Bayesian deep learning. arXiv:1709.05870, 2017.

[36] Jiaxin Shi, Shengyang Sun, and Jun Zhu. A spectral approach to gradient estimation for implicit distributions. In International Conference on Machine Learning (ICML), 2018.

[37] Francesco Tortorella. An optimal reject rule for binary classifiers. In Joint IAPR International Workshops on Statistical Techniques in Pattern Recognition (SPR) and Structural and Syntactic Pattern Recognition (SSPR), pages 611–620. Springer, 2000.

[38] Yu-Xiong Wang, Ross Girshick, Martial Hebert, and Bharath Hariharan. Low-shot learning from imaginary data. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 7278–7286, 2018.

[39] Ziyu Wang, Tongzheng Ren, Jun Zhu, and Bo Zhang. Function space particle optimization for bayesian neural networks. In International Conference on Learning Representations (ICLR 2019), 2019.

[40] Yeming Wen, Paul Vicol, Jimmy Ba, Dustin Tran, and Roger Grosse. Flipout: Efficient pseudo-independent weight perturbations on mini-batches. In International Conference on Learning Representations, 2018.

[41] Han Xiao, Kashif Rasul, and Roland Vollgraf. Fashion-mnist: a novel image dataset for benchmarking machine learning algorithms, 2017.

[42] Han Zhang, Tao Xu, Hongsheng Li, Shaoting Zhang, Xiaogang Wang, Xiaolei Huang, and Dimitris N Metaxas. Stackgan: Text to photo-realistic image synthesis with stacked generative adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision, pages 5907–5915, 2017.