# CircleGAN: Generative Adversarial Learning across Spherical Circles

Woohyeon Shim (POSTECH CiTE, wh.shim@postech.ac.kr) · Minsu Cho (POSTECH CSE & GSAI, mscho@postech.ac.kr)

**Abstract.** We present a novel discriminator for GANs that improves realness and diversity of generated samples by learning a structured hypersphere embedding space using spherical circles. The proposed discriminator learns to populate realistic samples around the longest spherical circle, i.e., a great circle, while pushing unrealistic samples toward the poles perpendicular to the great circle. Since longer circles occupy a larger area on the hypersphere, they encourage more diversity in representation learning, and vice versa. Discriminating samples based on their corresponding spherical circles can thus naturally induce diversity in generated samples. We also extend the proposed method to conditional settings with class labels by creating a hypersphere for each category and performing class-wise discrimination and update. In experiments, we validate the effectiveness for both unconditional and conditional generation on standard benchmarks, achieving the state of the art.

## 1 Introduction

Generative Adversarial Networks (GANs) [9] aim at learning to produce high-quality data samples, as observed from a target dataset. To this end, a GAN trains a generator, which synthesizes samples, adversarially against a discriminator, which classifies the samples as real or fake; the discriminator's gradient in backpropagation guides the generator to improve the quality of the generated samples. As the gradient is difficult to stabilize, recent methods [2, 10, 18, 31] have suggested using a Lipschitz-continuous space for the discriminator so that its gradient is bounded by some constant with respect to the input. In a similar spirit, recent work [22] introduces a hypersphere as an embedding space, which enjoys boundedness of distances between samples and of their gradients, showing superior performance against precedent models.

The most important qualities of generated samples are realness and diversity. In conventional GAN frameworks, including [22], the discriminator can be viewed as evaluating realness based on a prototype representation: the closer a generated sample is to the prototype of real samples, the more realistic it is evaluated to be. The single prototype, however, may not be sufficient for capturing all modes of the real samples, and thus recent work [1, 7, 19, 26] attempts to tackle this issue by employing multiple discriminators, i.e., multiple prototypes. But training with multiple discriminators requires a higher memory footprint and computation cost, and also introduces another hyperparameter, the number of discriminators. Conditional GANs [17, 21] use additional information in the form of class labels to increase the coverage of modes across different categories. However, the class label does not ensure intra-class diversity [21], although it is beneficial for learning representative features of each class. Hence, the problem of mode collapse remains even in conditional models.

In this paper, we address the sample diversity issue by learning a structured hypersphere embedding space for the discriminator in GANs. Our discriminator learns to populate realistic samples around a great circle, which is the largest spherical circle, while pushing unrealistic samples toward the poles perpendicular to the great circle.
In doing so, since longer circles occupy a larger area on the hypersphere, they encourage more diversity in the representation learning of realistic samples. As a result, multiple modes of the real data distribution are effectively represented to guide the generator in GAN training. We also extend the proposed method to conditional settings with class labels by creating a hypersphere for each category and performing class-wise sample discrimination and update. In experimental evaluation, we validate the effectiveness of our approach for both unconditional and conditional generation tasks on standard benchmarks, including STL10, CIFAR10, CIFAR100, and Tiny ImageNet, achieving the state of the art.

## 2 Related Work

### 2.1 Generative Adversarial Networks

Previous work on improving GANs concentrates on addressing the difficulty of training. These studies have been conducted from different angles, such as network architectures [10, 13, 14, 23], objective functions [12, 20], and regularization techniques [2, 10, 18, 31] that impose a Lipschitz constraint on the discriminator. SphereGAN [22] has shown that using a hypersphere as an embedding space affords stability in GAN training through the boundedness of distances between samples and of their gradients. Our work also adopts a hypersphere embedding space, but proposes a different strategy for structuring and learning the hypersphere, which will be discussed in detail.

The line of GAN research most relevant to ours concerns the lack of sample diversity. In many cases, GAN objectives can be satisfied by samples of limited diversity, and no guarantee exists that a model in training learns to generate diverse samples. To tackle the lack of diversity, a.k.a. mode collapse, several approaches have been proposed from different perspectives. Chen et al. [5] and Karras et al. [14] modulate normalization layers using a noise vector that is transformed through a sequence of layers. Yang et al. [28] penalize mode-collapsing behavior directly in the generator by maximizing the gradient norm with respect to the noise vector. Yamaguchi and Koyama [27] regularize the discriminator to have local concavity on the support of the generator function, increasing its entropy monotonically at every stage of training. Liu et al. [16] propose a spectral regularization to prevent spectral collapse when training with spectral normalization, which is shown to be closely linked to mode collapse. Our approach to combating mode collapse is very different from these.

### 2.2 Conditional GANs

The most straightforward way to improve the performance of GANs is to incorporate side information about the samples, typically class labels. Depending on how the class information is used, the models fall into two types: projection-based models and classifier-based models.

Projection-based models [17] discriminate samples by projecting them onto two embeddings, meaning "real" and "fake", for the corresponding class and measuring the discrepancy of the projection values. As the network architectures are combined with attention modules [29] and scaled up to high capacity [4], these models improve the synthesis of large-scale and high-fidelity images, achieving state-of-the-art performance. However, since they do not learn class embeddings through an explicit classifier, it may not be easy for these methods to learn class-specific features.
In contrast, classifier-based models [21, 24] use an auxiliary classifier with the discriminator to explicitly learn class-specific features for generation. However, when incorporating the auxiliary classifier, the diversity of generated samples often degrades severely because the generator focuses on generating easily classifiable samples [8, 30]. To prevent the generator from becoming over-confident on some samples, Zhou et al. [30] introduce an adversarial loss on the classifier. But this loss often deteriorates the classifier and hinders the model from scaling up to large datasets (e.g., ImageNet). While Gong et al. [8] propose a scalable model using twin auxiliary classifiers, they require one more classifier to prevent the other from degenerating. Unlike these methods, we address the mode-collapse issue with hypersphere-based discriminators by integrating only a single auxiliary classifier without any adversarial loss.

Figure 1: Overall architecture of CircleGAN. Given a randomly sampled noise vector z, the generator synthesizes fake samples and the discriminator discriminates real samples from the fake samples. The discriminator produces the embeddings $v^{emb}$, projects them onto the hypersphere by centering and ℓ2-normalization ($v$), and discriminates them according to a score function based on their corresponding spherical circles. In training, CircleGAN performs adversarial learning based on the proposed score functions ($s_{real}$ and $s_{div}$). See text for details.

## 3 Proposed Approach

In the framework of GANs, our method, CircleGAN, learns a discriminator that projects samples onto a hypersphere and then scores them using their corresponding spherical circles of the hypersphere. The overall architecture is illustrated in Fig. 1. The discriminator of CircleGAN is composed of a feature extractor f and a score function s: D(x) = s(f(x)) = s(v). The key idea is to leverage the geometric characteristics of the hypersphere in scoring the quality of samples. In the following, we first introduce CircleGAN for unconditional settings and then extend it to conditional settings. We also discuss our method in comparison to the recent hypersphere-based method [22].

### 3.1 CircleGAN

Let samples $X = \{x_i\}_{i=1}^{n}$ be drawn from both the data distribution $P_{data}$ and the generator distribution $P_{gen}$. The discriminator $D$ first embeds the samples $X$ into $V^{emb} = \{v^{emb}_i\}_{i=1}^{n}$ and then projects them onto a unit hypersphere with a learnable center $c$: $v_i = (v^{emb}_i - c)\,/\,\|v^{emb}_i - c\|$. We denote the set of sample embeddings on the hypersphere by $V = \{v_i\}_{i=1}^{n}$. Given a unit pivotal vector $p$, which is learnable, we can define the set of spherical circles $\Omega_p$ perpendicular to $p$. Each point $v_i$ (i.e., sample embedding) on the hypersphere is then assigned to a corresponding circle $\omega_{m(i)} \in \Omega_p$, where the function $m$ represents an injective mapping from $V$ to $\Omega_p$. As shown on the right of Fig. 1, using the sample embedding $v_i$ and the pivotal vector $p$, we define its projected vector $v^{proj}_i$ and its rejected vector $v^{rej}_i$:

$$v^{proj}_i = \langle v_i, p \rangle\, p, \qquad v^{rej}_i = v_i - v^{proj}_i, \tag{1}$$

where $\langle \cdot, \cdot \rangle$ indicates the inner product between two vectors. In a nutshell, each $v_i$ corresponds to a spherical circle $\omega_{m(i)} \in \Omega_p$ that is identified by $v^{proj}_i$. Note that both the center $c$ and the pivotal vector $p$ are learned in training to adapt the hypersphere to the sample embeddings.
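To make the geometry concrete, the following minimal PyTorch sketch (our own illustration, not the authors' released code; the function name and the `eps` guard are assumptions) computes the centered ℓ2 projection onto the hypersphere and the decomposition of Eq. (1):

```python
import torch

def sphere_decompose(v_emb, center, pivot, eps=1e-8):
    """Center embeddings, project onto the unit hypersphere, and split each
    point into its component along the pivot axis (v_proj) and the residual
    identifying its spherical circle (v_rej), as in Eq. (1)."""
    v = v_emb - center                            # translate by learnable center c
    v = v / (v.norm(dim=1, keepdim=True) + eps)   # l2-normalize onto the sphere
    p = pivot / (pivot.norm() + eps)              # keep the pivotal vector unit-length
    coeff = v @ p                                 # <v_i, p>, shape (n,)
    v_proj = coeff.unsqueeze(1) * p               # component along the pivot axis
    v_rej = v - v_proj                            # component spanning the circle
    return v, v_proj, v_rej

# e.g. for a batch of 64 embeddings of dimension 128:
# v, v_proj, v_rej = sphere_decompose(torch.randn(64, 128),
#                                     torch.zeros(128), torch.randn(128))
```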
Our strategy in learning the discriminator is to populate real samples around the longest circle in $\Omega_p$, i.e., the great circle, while pushing fake samples toward the shortest circles in $\Omega_p$, i.e., the top or bottom pole. Since longer circles occupy a larger area on the hypersphere, they may allow more diversity in representation learning, and vice versa. In this sense, discriminating real and fake samples based on their corresponding circles can naturally induce diversity in generated samples.

We propose to measure the realness score for sample embedding $v_i$ based on the proximity of its corresponding circle to the great circle, which is computed by

$$s_{real}(v_i) = -\|v^{proj}_i\|_2 / \sigma_{proj}, \tag{2}$$

where the score is normalized by its standard deviation $\sigma_{proj}$ to fix the scale consistently through the course of training, and $\sigma_{proj}$ is computed as $\sqrt{\sum_{i=1}^{|V|} \|v^{proj}_i\|_2^2 \,/\, |V|}$. On the other hand, we define the diversifiability score for $v_i$ by the radius of the corresponding circle, which can be quantified by

$$s_{div}(v_i) = \|v^{rej}_i\|_2 / \sigma_{rej}, \tag{3}$$

where $\sigma_{rej}$ is computed as $\sqrt{\sum_{i=1}^{|V|} \|v^{rej}_i\|_2^2 \,/\, |V|}$. Note that the diversifiability score increases along with the realness score. We thus use the realness score function $s_{real}$ as the discriminator output so that it guides the generator in CircleGAN training. The discrimination based on spherical circles increases the diversity of realistic samples while suppressing the diversity of unrealistic samples, which improves training of the generator. This is supported by the experimental results in Secs. 4.2 and 4.3.

We also design two other variants for scoring that explicitly combine the realness score and the diversifiability score:

$$s_{add}(v_i) = -\|v^{proj}_i\|_2 / \sigma_{proj} + \|v^{rej}_i\|_2 / \sigma_{rej}, \tag{4a}$$

$$s_{mult}(v_i) = \arctan\!\left( \frac{\sigma_{proj}}{\sigma_{rej}} \cdot \frac{\|v^{rej}_i\|_2}{\|v^{proj}_i\|_2} \right). \tag{4b}$$

The former performs a simple addition of the realness and diversifiability scores, and the latter measures an angle between the pivotal vector and the sample embedding. The effects of these two variants are demonstrated in our experiments.

To train the model based on the proposed score functions, we adopt the relativistic averaged loss [12], considering its robustness and simplicity:

$$\mathcal{L}^{adv}_D = -\,\mathbb{E}_{x \sim P_{data}}\!\big[\log \mathrm{sigmoid}\big(\tau \,(D(x) - \mathbb{E}_{y \sim P_{gen}}[D(y)])\big)\big] \;-\; \mathbb{E}_{y \sim P_{gen}}\!\big[\log \mathrm{sigmoid}\big(\tau \,(\mathbb{E}_{x \sim P_{data}}[D(x)] - D(y))\big)\big], \tag{5}$$

where $\tau$ adjusts the range of the score difference for the sigmoid functions and is set to 5 for $s_{add}$ and $s_{real}$ and to 10 for $s_{mult}$. For the adversarial loss of the generator, $\mathcal{L}^{adv}_G$, we take the counterpart of the discriminator loss obtained by swapping the sources of the samples. In the following, we introduce two additional losses to improve the training dynamics: a center estimation loss, which improves adaptation of the hypersphere to the sample embeddings ($v^{emb}_i$), and a radius equalization loss, which increases the discriminative power of the hypersphere by enforcing a one-to-one mapping from the centered embeddings ($v^{emb}_i - c$) to the spherical embeddings ($v_i$).

**Center estimation loss.** The center $c$ is defined as the point that minimizes the sum of squared distances to all the sample points. We optimize this directly by measuring the norms of the centered embeddings, but proceed separately from the original objectives so as not to disrupt the adversarial learning. Here we use the Huber loss rather than the ℓ2 loss:

$$\mathcal{L}_{center} = \frac{1}{|V|} \sum_{i=1}^{|V|} L_{huber}\big(\|v^{emb}_i - c\|_2\big), \tag{6}$$

where $L_{huber}(x)$ is defined as $0.5x^2$ if $x \le 1$, and $x - 0.5$ otherwise.
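Under the same assumptions as the previous sketch, the score functions (Eqs. 2-4), the relativistic averaged discriminator loss (Eq. 5), and the center estimation loss (Eq. 6) can be written compactly as below. The batch-level standard deviations and `eps` guards are our choices; `F.huber_loss` with `delta=1.0` matches the paper's Huber definition, and `-log sigmoid(x)` is implemented as `softplus(-x)`:

```python
import torch
import torch.nn.functional as F

def circle_scores(v_proj, v_rej, eps=1e-8):
    """Realness (Eq. 2), additive (Eq. 4a), and angular (Eq. 4b) scores,
    normalized by batch-level standard deviations."""
    proj_norm = v_proj.norm(dim=1)
    rej_norm = v_rej.norm(dim=1)
    sigma_proj = proj_norm.pow(2).mean().sqrt() + eps
    sigma_rej = rej_norm.pow(2).mean().sqrt() + eps
    s_real = -proj_norm / sigma_proj
    s_add = -proj_norm / sigma_proj + rej_norm / sigma_rej
    s_mult = torch.atan((sigma_proj / sigma_rej) * rej_norm / (proj_norm + eps))
    return s_real, s_add, s_mult

def relativistic_avg_loss_d(d_real, d_fake, tau=5.0):
    """Relativistic averaged discriminator loss of Eq. (5); the generator
    loss swaps the roles of the real and fake scores."""
    return (F.softplus(-tau * (d_real - d_fake.mean())).mean()
            + F.softplus(-tau * (d_real.mean() - d_fake)).mean())

def center_loss(v_emb, center):
    """Huber penalty on the norms of the centered embeddings (Eq. 6)."""
    norms = (v_emb - center).norm(dim=1)
    return F.huber_loss(norms, torch.zeros_like(norms), delta=1.0)
```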
**Radius equalization loss.** We equalize the radii of the centered embeddings through the discriminator. First, we compute the target radius $R$ by taking the square root of the averaged squared norm of the centered embeddings. Then, we penalize the differences in radii using the Huber loss:

$$\mathcal{L}^{reg}_D = \frac{1}{|V|} \sum_{i=1}^{|V|} L_{huber}\big(R - \|v^{emb}_i - c\|_2\big), \tag{7}$$

where $R$ is computed by $\sqrt{\frac{1}{|V|} \sum_{i=1}^{|V|} \|v^{emb}_i - c\|_2^2}$. The total losses for the unconditional setting are $\mathcal{L}_D = \mathcal{L}^{adv}_D + \mathcal{L}^{reg}_D$ and $\mathcal{L}_G = \mathcal{L}^{adv}_G$.

### 3.2 Extension to Conditional GANs

We extend our method to conditional settings and improve the sample diversity of conditional GANs within each class. The key idea is to create multiple hyperspheres for the target categories and perform adversarial learning in a class-wise manner. To be concrete, we create a center and a pivotal vector for each class, $\{c_l\}_{l=1}^{L}$ and $\{p_l\}_{l=1}^{L}$, where $L$ is the number of categories. The embeddings are translated by their corresponding center vectors, and the scores for adversarial learning are measured with respect to their corresponding pivotal vectors. For the rest, in computing non-trainable parameters (e.g., the standard deviations in Eqs. 2, 3, 4 and the target radius in Eq. 7) and comparing scores in the loss of Eq. 5, we take all samples and associate them together irrespective of class labels, due to the limited batch sizes.

We incorporate an auxiliary classifier $C$ to learn class-specific features. The auxiliary classifier is a simple linear layer attached to the discriminator and is trained without any adversarial loss to predict the class of a sample $x$. With this classifier, the generator makes the sample class-specific by maximizing the probability of its corresponding label $y$:

$$\mathcal{L}^{cls}_D = \mathbb{E}_{(x,y) \sim P_{data}}[-\log C(x)_y], \qquad \mathcal{L}^{cls}_G = \mathbb{E}_{(x,y) \sim P_{gen}}[-\log C(x)_y]. \tag{8}$$

The total losses for the conditional setting are $\mathcal{L}_D = \mathcal{L}^{adv}_D + \mathcal{L}^{reg}_D + \mathcal{L}^{cls}_D$ and $\mathcal{L}_G = \mathcal{L}^{adv}_G + \mathcal{L}^{cls}_G$. The overall procedure with the proposed components is presented in Algorithm 1.

Algorithm 1: Training CircleGAN

- Given: generator parameters $\psi_g$, discriminator parameters $\psi_d$, pivotal vectors $\{p_l\}_{l=1}^{L}$, center vectors $\{c_l\}_{l=1}^{L}$
- while $\psi_g$ not converged do
  - Compute embeddings $V^{emb}$, $V$ using $X$ sampled from $P_{data}$ and $P_{gen}$
  - Compute $\mathcal{L}_{center}$ by Eq. 6
  - Update centers: $c_l \leftarrow \mathrm{Adam}(\nabla_{c_l} \mathcal{L}_{center},\, c_l)$
  - Compute realness $s_{real}(v)$ and diversifiability $s_{div}(v)$ by Eqs. 2, 3
  - Compute scores $s_{real}(v)$, $s_{add}(v)$, or $s_{mult}(v)$ by Eqs. 2, 4
  - Compute discriminator loss $\mathcal{L}_D$ by Eqs. 5, 7, 8
  - Update discriminator and pivots: $[\psi_d, p_l] \leftarrow \mathrm{Adam}(\nabla_{\psi_d, p_l} \mathcal{L}_D,\, [\psi_d, p_l])$
  - Compute embeddings $V^{emb}$, $V$ using $X$ sampled from $P_{data}$ and $P_{gen}$
  - Compute realness $s_{real}(v)$ and diversifiability $s_{div}(v)$ by Eqs. 2, 3
  - Compute scores $s_{real}(v)$, $s_{add}(v)$, or $s_{mult}(v)$ by Eqs. 2, 4
  - Compute generator loss $\mathcal{L}_G$ by Eqs. 5, 8
  - Update generator: $\psi_g \leftarrow \mathrm{Adam}(\nabla_{\psi_g} \mathcal{L}_G,\, \psi_g)$
- end

### 3.3 Comparison to SphereGAN

Figure 2: CircleGAN (ours) vs. SphereGAN [22].

Like ours, SphereGAN [22] also employs a hypersphere as the embedding space for the discriminator, but constructs it with a different projection function and learns with a different objective, as shown in Fig. 2. For the hypersphere projection, CircleGAN uses translation and ℓ2-normalization, while SphereGAN uses inverse stereographic projection (ISP). Translation and ℓ2-normalization tend to distribute samples evenly over the hypersphere, whereas ISP concentrates a large portion of samples around the north pole and maps only a small portion around the south pole. This biased projection may prevent the discriminator from fully exploiting the space during learning. Furthermore, while CircleGAN learns the center and the pivotal point adaptively to the sample embeddings, SphereGAN uses a coordinate system fixed by the north pole N and the origin O. As demonstrated in Sec. 4.2, our projection method performs better than ISP in practice.
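The contrast between the two projections can be illustrated in a few lines. The sketch below uses the standard formula for inverse stereographic projection; the toy check at the end is our own illustration of the bias argument, not an experiment from the paper:

```python
import torch

def isp(x):
    """Inverse stereographic projection (standard formula): maps x in R^d to
    a point on the unit sphere S^d; embeddings with large norm all land near
    the north pole (last coordinate -> 1)."""
    sq = x.pow(2).sum(dim=1, keepdim=True)
    return torch.cat([2 * x / (sq + 1), (sq - 1) / (sq + 1)], dim=1)

def center_l2(x, c):
    """CircleGAN projection: translate by a learnable center, then l2-normalize."""
    v = x - c
    return v / v.norm(dim=1, keepdim=True).clamp_min(1e-8)

# Toy check: scale up the embeddings and watch ISP pile up at the north pole
# while the centered l2 projection stays spread over the sphere.
x = 5.0 * torch.randn(1000, 8)
print(isp(x)[:, -1].mean())                       # close to 1: concentrated at the pole
print(center_l2(x, x.mean(0)).std(dim=0).mean())  # roughly uniform spread
```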
As for the training objective, the score function $s_{mult}$, which measures the angle $\phi_i$ between the pivotal vector and a sample embedding, is analogous to the SphereGAN objective [22]; both maximize angles between a reference vector and sample embeddings. However, while CircleGAN maximizes the angles to the great circle, SphereGAN maximizes them to the point opposite the reference vector. The use of circles turns out to make a significant difference in sample diversity and quality, as shown in our experiments. Finally, CircleGAN easily extends to conditional settings by creating multiple hyperspheres, whereas SphereGAN is less flexible due to its specialized projection with the fixed reference vector on the hypersphere.

## 4 Experiments

We conduct experiments in both unconditional and conditional settings of GANs to demonstrate the effectiveness of the proposed methods. In all experiments, we use a ResNet-based architecture for both the discriminator and the generator, with some modifications from the original model [10]. The modifications and the training details are presented in supplementary A. In Section 4.1, we describe the metrics used to evaluate the realness and diversity of generated samples. Then, in Sections 4.2 and 4.3, we validate our approach by performing unconditional and conditional generation tasks on standard benchmark datasets. To further investigate the scalability of our model, we provide more results on the large-scale dataset ImageNet in supplementary B.

### 4.1 Evaluation Metrics

The common evaluation metrics for image generation are Inception Score (IS) [24] and Fréchet Inception Distance (FID) [11]. IS measures how distinctively each sample is classified as a certain class and how similar the class distribution of generated samples is to that of the target dataset. FID measures a distance between the distribution of real data and that of generated samples in an embedding space, where the embeddings are assumed to be Gaussian distributed. While these metrics are easy to calculate and correlate well with human assessment of generated samples, there are some concerns about them. First, IS is computed based on class probabilities over ImageNet classes, so evaluation on other datasets cannot be accurate, since their class distributions differ from that of ImageNet [30]. Second, IS is highly sensitive to small changes in the weights of classifiers [3] and in the image samples, as shown in our STL10 experiments of Sec. 4.2. Third, FID does not penalize a sample with a similar but different identity, e.g., a clear cat image generated for a dog label, which is problematic particularly for conditional generation. Hence, in addition to IS and FID, we use two other metrics [25], GAN-train and GAN-test, for evaluation in conditional settings.

GAN-train and GAN-test evaluate the diversity and quality of images, respectively. They can easily adapt to each target dataset and also account for mislabeled images in evaluation. GAN-train trains a classifier using generated images and then measures its accuracy on a validation set of the target dataset. This score is analogous to recall (the diversity of images), since the score increases if the generated samples cover different modes of the target dataset. In contrast, GAN-test trains a classifier on the training set of the target dataset and then measures its accuracy on the generated images. This measure is not related to diversity but to precision, since high-quality samples even from a single mode can achieve a high score.
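The shape of the two protocols can be sketched in a few lines. Note that [25] trains a convolutional image classifier; a logistic regression on flattened features stands in here purely as a minimal, runnable illustration, and the function names are our own:

```python
from sklearn.linear_model import LogisticRegression

def gan_train_score(fake_X, fake_y, real_val_X, real_val_y):
    """GAN-train (recall-like): fit a classifier on generated data,
    then test it on real validation data."""
    clf = LogisticRegression(max_iter=1000).fit(fake_X, fake_y)
    return clf.score(real_val_X, real_val_y)

def gan_test_score(real_X, real_y, fake_X, fake_y):
    """GAN-test (precision-like): fit a classifier on real training data,
    then test it on generated data."""
    clf = LogisticRegression(max_iter=1000).fit(real_X, real_y)
    return clf.score(fake_X, fake_y)
```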
To sum up, we evaluate unconditional GAN models with IS and FID, and along with these scores, we use GAN-train and GAN-test for conditional GANs. To measure IS and FID, we use 50K images in all experiments, following the original implementations.¹

### 4.2 Unconditional GANs

For the unconditional generation task, we evaluate our methods on CIFAR10 and STL10 [6]. CIFAR10 and STL10 consist of 50K 32×32 and 100K 96×96 images of 10 classes, respectively. We resize the STL10 images to 48×48 before training, following the experimental protocol of [18, 22]. We compare our methods with two Lipschitz-based models [10, 18] and one hypersphere-based model [22] (Table 1a). The best and second-best results are highlighted in red and blue, respectively. CircleGAN models achieve the best and second-best performance in terms of all metrics on both datasets, except for IS on STL10, where SphereGAN performs best. We suspect that this exception is due to the sensitivity of IS, as also reported in [3]: IS on the test set of STL10 is 14.8, significantly lower than 26.1 on the training set, whereas for other datasets such as CIFAR10 and CIFAR100, IS values are similar between the train and test sets (CIFAR10: 11.2 vs. 11.3, CIFAR100: 14.8 vs. 14.7).

Table 1: Unconditional GAN results on CIFAR10 and STL10.

(a) Comparison on CIFAR10 and STL10.

| Model | CIFAR10 IS(↑) | CIFAR10 FID(↓) | STL10 IS(↑) | STL10 FID(↓) |
|---|---|---|---|---|
| real images | 11.2 | 3.43 | 26.1 | 17.9 |
| WGAN-GP [10] | 7.76 | 22.2 | 9.06 | 42.6 |
| SNGAN [18] | 8.22 | 21.7 | 9.10 | 40.1 |
| SphereGAN [22] | 8.39 | 17.1 | 9.55 | 31.4 |
| CircleGAN ($s_{real}$) | 8.54 | 12.2 | 9.18 | 27.0 |
| CircleGAN ($s_{add}$) | 8.55 | 12.3 | 9.24 | 27.5 |
| CircleGAN ($s_{mult}$) | 8.47 | 12.9 | 8.82 | 30.1 |

(b) Ablation study on CIFAR10.

| Method | FID(↓) |
|---|---|
| CircleGAN ($s_{mult}$) | 12.9 |
| − radius equalization | 14.6 |
| − center estimation | 15.2 |
| − circle learning | 15.8 |
| − score normalization | 16.8 |
| − ℓ2-projection | 20.4 |

To further compare CircleGAN to SphereGAN, we conduct ablation studies on CIFAR10 with the model using the angle-based score $s_{mult}$ (Table 1b). Each component in the table is subsequently removed from the full CircleGAN model to see its effect. Here we use the FID metric, which is more stable. First and second, we remove the radius equalization and center estimation losses, respectively. Third, we replace the CircleGAN objective with that of SphereGAN, which maximizes the angle to the point opposite the pivot. Fourth, we remove the score normalization. Fifth, we replace ℓ2-normalization with the ISP of SphereGAN. The results show that the proposed components consistently improve FID. In particular, replacing ℓ2-normalization with ISP significantly deteriorates FID, which implies that the ISP of SphereGAN may be problematic due to the embedding bias of the samples. CircleGAN not only outperforms SphereGAN by a large margin, but also easily extends to conditional settings, as demonstrated in the next experiment.

¹IS: https://github.com/openai/improved-gan, FID: https://github.com/bioinf-jku/TTUR

### 4.3 Conditional GANs

We conduct conditional generation experiments on CIFAR10, CIFAR100 [15], and Tiny ImageNet.² CIFAR100 consists of 50K 32×32 images of 100 classes, and Tiny ImageNet consists of 100K 64×64 images of 200 classes. We compare our models with a projection-based model [17] and with two classifier-based models [21, 30], one with adversarial losses on class probability [30] and the other without them [21]. We present the results in Tables 2, 3, and 4 for CIFAR10, CIFAR100, and Tiny ImageNet, respectively.
The numbers inside parentheses indicate the results of our models without the auxiliary classifier C. All CircleGAN models outperform the other models in terms of all metrics across all datasets. On datasets with more classes and more diverse samples at higher resolution, the performance gains over the other models become greater. This demonstrates the advantage of the class-wise hyperspheres, which place realistic samples around the great circle in a class-wise manner.

Table 2: Conditional GAN results on CIFAR10.

| Model | IS(↑) | FID(↓) | GAN-train(↑) | GAN-test(↑) |
|---|---|---|---|---|
| real images | 11.2 | 3.43 | 92.8 | 100 |
| AC+WGAN-GP [10] | 8.27 | 13.7 | 79.5 | 85.0 |
| Proj. SNGAN [17] | 8.47 | 10.4 | 82.2 | 87.3 |
| AMGAN [30] | 8.79 | 7.62 | 81.0 | 94.5 |
| CircleGAN ($s_{real}$) | 9.08 (8.91) | 5.72 (7.47) | 87.0 (84.0) | 96.1 (84.5) |
| CircleGAN ($s_{add}$) | 9.01 (8.80) | 5.90 (8.09) | 86.8 (82.6) | 96.6 (82.9) |
| CircleGAN ($s_{mult}$) | 9.22 (8.83) | 5.83 (12.2) | 86.3 (83.5) | 96.8 (83.5) |

Table 3: Conditional GAN results on CIFAR100.

| Model | IS(↑) | FID(↓) | GAN-train(↑) | GAN-test(↑) |
|---|---|---|---|---|
| real images | 14.8 | 3.92 | 69.4 | 100.0 |
| AC+WGAN-GP [10] | 9.10 | 15.6 | 26.7 | 40.4 |
| Proj. SNGAN [17] | 9.30 | 15.6 | 45.0 | 59.4 |
| AMGAN [30] | 10.2 | 16.5 | 23.2 | 70.8 |
| CircleGAN ($s_{real}$) | 11.8 (9.93) | 7.43 (9.45) | 54.7 (48.6) | 93.9 (58.5) |
| CircleGAN ($s_{add}$) | 11.9 (10.13) | 7.35 (8.99) | 55.6 (49.9) | 92.5 (57.7) |
| CircleGAN ($s_{mult}$) | 11.9 (9.98) | 8.62 (9.10) | 54.0 (47.4) | 91.0 (58.0) |

The GAN-train and GAN-test results show that CircleGAN produces higher-quality and more diverse samples than the other models. In particular, GAN-test remains almost unchanged at the highest level regardless of the dataset, which implies that every sample captures the important features of its corresponding class well enough to be classified correctly by the pre-trained classifier. We attribute this to the auxiliary classifier, since all the scores drop significantly on all datasets when the classifier is detached. The overall scores without the auxiliary classifier are similar to those of the projection-based models, which have no classifier. Hence, this supports the use of the auxiliary classifier for learning class-specific features in the discriminator and, correspondingly, for updating samples in the generator.

However, the auxiliary classifier does not benefit the other classifier-based models [10, 30]; instead, they perform significantly worse as the number of classes or the resolution increases. Specifically, AMGAN [30], whose discriminator is trained with an adversarial loss on class probability, shows decent performance on CIFAR10, but degrades dramatically on CIFAR100 and fails to train on Tiny ImageNet. The other classifier-based model, which has no adversarial loss [10, 21], also shows competitive results on CIFAR10 and CIFAR100, but poor GAN-train and GAN-test scores on Tiny ImageNet. These results suggest that utilizing spherical circles is highly effective and flexible for integrating an auxiliary classifier in conditional GANs.

Table 4: Conditional GAN results on Tiny ImageNet.

| Model | IS(↑) | FID(↓) | GAN-train(↑) | GAN-test(↑) |
|---|---|---|---|---|
| real images | 33.3 | 4.59 | 59.2 | 100.0 |
| AC+WGAN-GP [10] | 10.3 | 32.5 | 0.4 | 0.7 |
| Proj. SNGAN [17] | 9.38 | 42.0 | 22.6 | 19.3 |
| CircleGAN ($s_{real}$) | 21.6 | 15.5 | 30.4 | 94.6 |
| CircleGAN ($s_{add}$) | 20.8 | 15.6 | 28.8 | 94.9 |
| CircleGAN ($s_{mult}$) | 20.8 | 17.5 | 30.4 | 93.4 |

For qualitative evaluation, we sample images and obtain t-SNE embeddings using a pre-trained classifier for CircleGAN and Proj. SNGAN [17], the most competitive algorithm on Tiny ImageNet.

²https://tiny-imagenet.herokuapp.com/
We use images from 5 classes, selected by the authors so as not to overlap in nature: goldfish, yorkshire terrier, academic gown, birdhouse, and school bus (Fig. 3). The results demonstrate that the images synthesized by CircleGAN correspond to their classes and almost overlap with the train and validation sets of the dataset in the 2D t-SNE space, whereas the images from Proj. SNGAN are hard to perceive as instances of their corresponding classes. Moreover, the distribution of the t-SNE embeddings itself shows that the samples from Proj. SNGAN lie far from the support of the dataset. More qualitative results can be found in supplementary material C.

## 5 Conclusion

In this paper, we have demonstrated that learning and discriminating sample embeddings via their corresponding spherical circles on a hypersphere is highly effective for generating diverse samples of high quality. The proposed method provides state-of-the-art performance in unconditional generation and also extends to conditional setups with class labels by creating hyperspheres for the classes. The impressive performance gain over recent methods on standard benchmarks demonstrates the effectiveness of the proposed approach.

Figure 3: CircleGAN vs. Projection-SNGAN on Tiny ImageNet. (a) CircleGAN (ours) images and embeddings; (b) Proj. SNGAN [17] images and embeddings. For 5 classes (goldfish, yorkshire terrier, academic gown, birdhouse, school bus), generated images and their 2D embeddings from t-SNE are visualized. For t-SNE, we train a classifier using the training set and use it for embedding the generated images. For comparison, we also use 250 images randomly taken from the train and validation sets, respectively.

## Broader Impact

a) This work addresses the problem of generative modeling and adversarial learning, which is a crucial topic in machine learning and artificial intelligence; b) the proposed technique is generic and does not have any direct negative impact on society; c) the proposed model improves sample diversity, thus contributing to reducing biases in generated data samples.

## Acknowledgments

This research was supported by the Basic Science Research Program (NRF-2017R1E1A1A01077999) and the Next-Generation Information Computing Development Program (NRF-2017M3C4A7069369) through the National Research Foundation of Korea (NRF) funded by the Ministry of Science and ICT (MSIT), and also by an Institute for Information & communications Technology Promotion (IITP) grant funded by the Korea government (MSIP) (No. 2019-0-01906, Artificial Intelligence Graduate School Program (POSTECH)).

## References

[1] I. Albuquerque, J. Monteiro, T. Doan, B. Considine, T. Falk, and I. Mitliagkas. Multi-objective training of generative adversarial networks with multiple discriminators. In Proceedings of the 36th International Conference on Machine Learning (ICML), 2019.
[2] M. Arjovsky, S. Chintala, and L. Bottou. Wasserstein GAN. In Proceedings of the 34th International Conference on Machine Learning (ICML), 2017.
[3] S. Barratt and R. Sharma. A note on the inception score. arXiv preprint arXiv:1801.01973, 2018.
[4] A. Brock, J. Donahue, and K. Simonyan. Large scale GAN training for high fidelity natural image synthesis. In Proceedings of International Conference on Learning Representations (ICLR), 2019.
[5] T. Chen, M. Lucic, N. Houlsby, and S. Gelly. On self modulation for generative adversarial networks. In Proceedings of International Conference on Learning Representations (ICLR), 2019.
[6] A. Coates, A. Ng, and H. Lee.
An analysis of single-layer networks in unsupervised feature learning. In Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics (AISTATS), pages 215-223, 2011.
[7] T. Doan, J. Monteiro, I. Albuquerque, B. Mazoure, A. Durand, J. Pineau, and R. D. Hjelm. On-line adaptative curriculum learning for GANs. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 33, pages 3470-3477, 2019.
[8] M. Gong, Y. Xu, C. Li, K. Zhang, and K. Batmanghelich. Twin auxiliary classifiers GAN. In Advances in Neural Information Processing Systems (NeurIPS), 2019.
[9] I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio. Generative adversarial nets. In Advances in Neural Information Processing Systems (NeurIPS), pages 2672-2680, 2014.
[10] I. Gulrajani, F. Ahmed, M. Arjovsky, V. Dumoulin, and A. C. Courville. Improved training of Wasserstein GANs. In Advances in Neural Information Processing Systems (NeurIPS), pages 5767-5777, 2017.
[11] M. Heusel, H. Ramsauer, T. Unterthiner, B. Nessler, and S. Hochreiter. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In Advances in Neural Information Processing Systems (NeurIPS), pages 6626-6637, 2017.
[12] A. Jolicoeur-Martineau. The relativistic discriminator: a key element missing from standard GAN. In Proceedings of International Conference on Learning Representations (ICLR), 2019.
[13] T. Karras, T. Aila, S. Laine, and J. Lehtinen. Progressive growing of GANs for improved quality, stability, and variation. In Proceedings of International Conference on Learning Representations (ICLR), 2018.
[14] T. Karras, S. Laine, and T. Aila. A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 4401-4410, 2019.
[15] A. Krizhevsky and G. Hinton. Learning multiple layers of features from tiny images. Technical report, Citeseer, 2009.
[16] K. Liu, W. Tang, F. Zhou, and G. Qiu. Spectral regularization for combating mode collapse in GANs. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), pages 6382-6390, 2019.
[17] T. Miyato and M. Koyama. cGANs with projection discriminator. In Proceedings of International Conference on Learning Representations (ICLR), 2018.
[18] T. Miyato, T. Kataoka, M. Koyama, and Y. Yoshida. Spectral normalization for generative adversarial networks. In Proceedings of International Conference on Learning Representations (ICLR), 2018.
[19] B. Neyshabur, S. Bhojanapalli, and A. Chakrabarti. Stabilizing GAN training with multiple random projections. arXiv preprint arXiv:1705.07831, 2017.
[20] S. Nowozin, B. Cseke, and R. Tomioka. f-GAN: Training generative neural samplers using variational divergence minimization. In Advances in Neural Information Processing Systems (NeurIPS), pages 271-279, 2016.
[21] A. Odena, C. Olah, and J. Shlens. Conditional image synthesis with auxiliary classifier GANs. In Proceedings of the 34th International Conference on Machine Learning (ICML), pages 2642-2651, 2017.
[22] S. W. Park and J. Kwon. Sphere generative adversarial network based on geometric moment matching. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 4292-4301, 2019.
[23] A. Radford, L. Metz, and S. Chintala. Unsupervised representation learning with deep convolutional generative adversarial networks.
In Proceedings of International Conference on Learning Representations (ICLR), 2016.
[24] T. Salimans, I. Goodfellow, W. Zaremba, V. Cheung, A. Radford, and X. Chen. Improved techniques for training GANs. In Advances in Neural Information Processing Systems (NeurIPS), pages 2234-2242, 2016.
[25] K. Shmelkov, C. Schmid, and K. Alahari. How good is my GAN? In Proceedings of the European Conference on Computer Vision (ECCV), pages 213-229, 2018.
[26] J. Wu, Z. Huang, D. Acharya, W. Li, J. Thoma, D. P. Paudel, and L. V. Gool. Sliced Wasserstein generative models. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 3713-3722, 2019.
[27] S. Yamaguchi and M. Koyama. Distributional concavity regularization for GANs. In Proceedings of International Conference on Learning Representations (ICLR), 2019.
[28] D. Yang, S. Hong, Y. Jang, T. Zhao, and H. Lee. Diversity-sensitive conditional generative adversarial networks. In Proceedings of International Conference on Learning Representations (ICLR), 2019.
[29] H. Zhang, I. Goodfellow, D. Metaxas, and A. Odena. Self-attention generative adversarial networks. In Proceedings of the 36th International Conference on Machine Learning (ICML), 2019.
[30] Z. Zhou, H. Cai, S. Rong, Y. Song, K. Ren, W. Zhang, Y. Yu, and J. Wang. Activation maximization generative adversarial nets. In Proceedings of International Conference on Learning Representations (ICLR), 2018.
[31] Z. Zhou, J. Liang, Y. Song, L. Yu, H. Wang, W. Zhang, Y. Yu, and Z. Zhang. Lipschitz generative adversarial nets. In Proceedings of the 36th International Conference on Machine Learning (ICML), 2019.