# GANs for Semi-Supervised Opinion Spam Detection

Gray Stanton¹ and Athirai A. Irissappane²

¹Department of Statistics, Colorado State University

²School of Engineering and Technology, University of Washington, Tacoma

gray.stanton@colostate.edu, athirai@uw.edu

Online reviews have become a vital source of information when purchasing a service or product. Opinion spammers manipulate reviews, affecting the overall perception of the service. A key challenge in detecting opinion spam is obtaining ground truth: although a large number of reviews exist, only a few of them have been labeled spam or non-spam. We propose spamGAN, a generative adversarial network that relies on limited labeled data as well as unlabeled data for opinion spam detection. spamGAN improves on state-of-the-art GAN-based techniques for text classification. Experiments on TripAdvisor data show that spamGAN outperforms existing techniques when labeled data is limited. spamGAN can also generate reviews with reasonable perplexity.

1 Introduction

Opinion spam is a widespread problem in e-commerce, social media, travel sites, movie review sites, etc. [Jindal et al., 2010]. Statistics show that more than 90% of consumers read reviews before making a purchase¹. The likelihood of purchase is also reported to increase with more reviews. Opinion spammers exploit these financial incentives by posting spam reviews that influence readers and thereby affect sales. We treat the identification of spam reviews as a classification problem, i.e., a review is classified as spam or non-spam.

¹https://learn.g2crowd.com/customer-reviews-statistics.

One of the main challenges in identifying spam reviews is the lack of labeled data, i.e., spam and non-spam labels [Rayana and Akoglu, 2015]. While a large corpus of online reviews exists, only a few of them are labeled, mainly because manual labeling is time-consuming, costly and subjective [Li et al., 2018]. Research shows that unlabeled data, when used in conjunction with small amounts of labeled data, can produce considerable improvement in learning accuracy [Ott et al., 2011]. There is very limited research on using semi-supervised learning techniques for opinion spam detection [Crawford et al., 2015]. The existing semi-supervised learning approaches [Li et al., 2011; Hernández et al., 2013; Li et al., 2014] for identifying opinion spam use a pre-defined set of features for training their classifier. In this paper, we use deep neural networks, which automatically discover the features needed for spam classification [LeCun et al., 2015].

Deep generative models have shown promising results for semi-supervised learning [Kumar et al., 2017]. Specifically, Generative Adversarial Networks (GANs), which can generate samples very close to real data, have achieved state-of-the-art results. However, most research on GANs is for images (continuous values) and not text data (discrete values) [Fedus et al., 2018]. GANs operate by training two neural networks that play a min-max game: the discriminator tries to discriminate real training samples from fake ones, and the generator tries to generate fake training samples to fool the discriminator. The drawbacks of GANs are: 1) when data is discrete, the gradient from the discriminator may not be useful for improving the generator, because the slight change in weights brought about by the gradients may not correspond to a suitable discrete mapping in the dictionary [Huszár, 2015]; 2) the discrimination is based on the entire sentence, not parts of it, leading to the sparse rewards problem [Yu et al., 2017].

Very few GAN-based methods (SeqGAN [Yu et al., 2017], StepGAN [Tuan and Lee, 2018], MaskGAN [Fedus et al., 2018]) exist for text generation (not classification), and they are limited by the length of the sentence that can be generated, e.g., MaskGAN considers 40 words per sentence. These approaches are unsuitable for most online reviews, which are relatively lengthy, e.g., the TripAdvisor review dataset used in our experiments has reviews with a median length of 132 words. The only existing GAN-based approach for text classification, CS-GAN [Li et al., 2018], is not optimal for spam detection due to the length of reviews, subtlety of classification, lack of labeled data (CS-GAN is supervised) and computation time. In this paper, we propose spamGAN, a semi-supervised GAN-based approach for classifying opinion spam.
spamGAN uses both labeled instances and unlabeled data to learn the input distribution, resulting in better prediction accuracy for comparatively longer reviews. spamGAN consists of three components (generator, discriminator and classifier), which work together not only to classify spam reviews but also to generate samples close to the training set. We conduct experiments on the TripAdvisor dataset and show that spamGAN outperforms existing work when using limited labeled data.

The main contributions of this paper are: 1) to the best of our knowledge, we are the first to explore the potential of GANs for spam detection; 2) spamGAN improves on state-of-the-art GAN-based models for text classification, as it leverages both labeled and unlabeled data in a semi-supervised manner (see Sec. 2 for details); 3) most existing research (non-deep-learning methods) on opinion spam manually identifies heuristics/features for classifying spamming behavior, whereas in our GAN-based approach the features are learned by the neural network; 4) experiments show that spamGAN outperforms state-of-the-art methods in classifying spam when limited labeled data is used; 5) spamGAN can generate spam/non-spam reviews very similar to the training set, which can be used for synthetic data generation in cases with limited ground truth.

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI-19)

2 Related Work

Existing opinion spam detection techniques are mostly supervised methods based on pre-defined features. [Jindal and Liu, 2008] used logistic regression with product, review and reviewer-centric features. [Ott et al., 2011] used n-gram features to train Naive Bayes and SVM classifiers. [Feng et al., 2012], [Mukherjee et al., 2013] and [Li et al., 2015] used part-of-speech tags and context-free grammar parse trees, behavioral features, and spatio-temporal features, respectively.

Neural network methods for spam detection take the reviews as input without specific feature extraction. GRNN [Ren and Ji, 2017] used a gated recurrent neural network to study the contextual information of review sentences. DRI-RCNN [Zhang et al., 2018] used a recurrent network to learn the contextual information of the words in the reviews; it extends RCNN [Lai et al., 2015] by learning embedding vectors with respect to both spam and non-spam labels. As RCNN and DRI-RCNN use neural networks, we compare with these supervised methods in our experiments.

Few semi-supervised methods for opinion spam detection exist. [Li et al., 2011] used co-training with a Naive Bayes classifier on reviewer, product and review features. [Hernández et al., 2013; Li et al., 2014] used only positively labeled samples along with unlabeled data. [Rayana and Akoglu, 2015] used review features, timestamps and ratings, as well as a pairwise Markov random field network of reviewers and products, to build a supervised algorithm along with semi-supervised extensions. Unsupervised methods for spam detection [Xu et al., 2015] also exist, but they are outside the scope of this work.

With respect to GANs for text classification, SeqGAN [Yu et al., 2017] addresses the problem of sparse rewards by treating sequence generation as a Reinforcement Learning (RL) problem. Monte Carlo Tree Search (MCTS) is used to overcome the issue of sparse rewards; however, it is computationally intractable.
StepGAN [Tuan and Lee, 2018] and MaskGAN [Fedus et al., 2018] use the actor-critic method [Konda and Tsitsiklis, 2000] to learn the rewards, but they are limited by the length of the sequence. Further, all of them focus on sentence generation. CS-GAN [Li et al., 2018] deals with sentence classification and incorporates a classifier in its architecture, but its performance degrades significantly with sentence length as it uses MCTS and character-level embeddings. spamGAN differs from CS-GAN in using the actor-critic reinforcement learning method for sequence generation and word-level embeddings, which are suitable for longer sentences. The RL architecture in spamGAN helps to mutually bootstrap the generator and classifier while the discriminator and generator are competing. To handle longer sentences, our RL architecture (inspired by StepGAN) has the advantage of requiring only a single pass of the generated sentence through the discriminator and classifier per example, reducing training time.

Figure 1: spamGAN Architecture

In this section, we present the problem set-up, the three components of spamGAN, and their interactions through a sequential decision-making framework.

3.1 Problem Set-up

Let $D_L$ be the set of reviews labeled spam or non-spam. Given the cost of labeling, we hope to improve classification performance by also using $D_U$, a significantly larger set of unlabeled reviews². Let $D = D_L \cup D_U$ be the combination of labeled and unlabeled sentences for training³. Each training sentence $y_{1:T} = \{y_1, y_2, \ldots, y_t, \ldots, y_T\}$ consists of a sequence of $T$ word tokens, where $y_t \in Y$ represents the $t$-th token in the sentence and $Y$ is the corpus of tokens used. For sentences belonging to $D_L$, we also include a class label belonging to one of the two classes $c \in C = \{\text{spam}, \text{non-spam}\}$.

To leverage both the labeled and unlabeled data, we include three components in spamGAN: the generator $G$, the discriminator $D$, and the classifier $C$, as shown in Fig. 1. The generator, for a given class label, learns to generate new sentences (we call them fake⁴ sentences) similar to the real sentences in the training set belonging to the same class. The discriminator learns to differentiate between real and fake sentences, and informs the generator (via rewards) if the generated sentences are unrealistic. This competition between the generator and discriminator improves the quality of the generated sentences. We know the class labels of the fake sentences produced by the generator, as they are controlled [Hu et al., 2017], i.e., constrained by the class labels {spam, non-spam}. The classifier is trained using real labeled sentences from $D_L$ and fake sentences produced by the generator, thus improving its ability to generalize beyond the small set of labeled sentences. The classifier's performance on fake sentences is also used as feedback to improve the generator: better classification accuracy results in more rewards. While the discriminator and generator are competing, the classifier and generator are mutually bootstrapping. As the three components of spamGAN are trained, the generator produces sentences very similar to the training set, while the classifier learns the characteristics of spam and non-spam sentences in order to identify them correctly.

²$D_U$ includes both spam and non-spam reviews.

³Training (see Alg. 1) can use only $D_L$ or both $D_L$ and $D_U$.

⁴Fake sentences are those produced by the generator. Spam sentences are deceptive sentences with class label spam. The generator can generate fake sentences belonging to either the spam or the non-spam class.

3.2 Generator

If $P_R(y_{1:T}, c)$ is the true joint distribution of sentences $y_{1:T}$ and classes $c \in C$ from the real training set, the generator aims to find a parameterized conditional distribution $G(y_{1:T} \mid z, c, \theta_g)$ that best approximates the true distribution. The generated fake sentence is conditioned on the network parameters $\theta_g$, the noise vector $z$ and the class label $c$, where $z$ and $c$ are sampled from the priors $P_z$ and $P_c$, respectively. The context vector (consisting of $z$ and $c$) is concatenated to the generated sentence at every timestep [Tuan and Lee, 2018], so that the class label of each generated fake sentence is retained. While sampling from $G(y_{1:T} \mid z, c, \theta_g)$, the word tokens are generated auto-regressively, decomposing the distribution over token sequences into the ordered conditional sequence

$$G(y_{1:T} \mid z, c, \theta_g) = \prod_{t=1}^{T} G(y_t \mid y_{1:t-1}, z, c, \theta_g) \tag{1}$$

During pre-training, we use batches of real sentences from $D$ and minimize the cross-entropy of the next token conditioned on the preceding ones. Specifically, we minimize the loss (Eqn. 2) over real sentence-class pairs $(y_{1:T}, c)$ from $D_L$ as well as unlabeled real sentences from $D_U$ with randomly assigned class labels drawn from the class prior distribution.

$$-\sum_{t=1}^{T} \log G(y_t \mid y_{1:t-1}, z, c, \theta_g) \tag{2}$$
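As a minimal sketch of Eqns. 1 and 2, assume the generator's per-token probabilities for one real sentence are already available as a plain list. The function names and the example numbers are illustrative and not taken from the paper.

```python
import math

def sequence_probability(token_probs):
    """Eqn. 1: product of the per-token conditional probabilities."""
    prob = 1.0
    for p in token_probs:
        prob *= p
    return prob

def mle_loss(token_probs):
    """Eqn. 2: negative log-likelihood (cross-entropy) of the real tokens."""
    return -sum(math.log(p) for p in token_probs)

# Hypothetical values of G(y_t | y_{1:t-1}, z, c) for a 3-token sentence.
probs = [0.5, 0.25, 0.8]
likelihood = sequence_probability(probs)
loss = mle_loss(probs)
```

Minimizing `loss` is equivalent to maximizing `likelihood`, since `exp(-loss)` recovers the product in Eqn. 1.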

During adversarial training, we treat sequence generation as a sequential decision-making problem [Yu et al., 2017]. The generator acts as a reinforcement learning agent, trained to maximize the expected rewards using policy gradients, where the rewards are feedback obtained from the discriminator and classifier for the generated sentences (see Sec. 3.5). To implement the generator, we use a unidirectional multi-layer recurrent neural network with gated recurrent units as the base cell.

3.3 Discriminator

The discriminator $D$, with parameters $\theta_d$, predicts whether a sentence is real (sampled from $P_R$) or fake (produced by the generator) by computing a probability score $D(y_{1:T} \mid \theta_d)$ that the sentence is real. Like [Tuan and Lee, 2018], instead of computing the score only at the end of the sentence, the discriminator produces scores $Q_D(y_{1:t-1}, y_t)$ for every timestep, which are then averaged to produce the overall score:

$$D(y_{1:T} \mid \theta_d) = \frac{1}{T} \sum_{t=1}^{T} Q_D(y_{1:t-1}, y_t) \tag{3}$$

$Q_D(y_{1:t-1}, y_t)$ is the intermediate score for timestep $t$ and is based solely on the preceding partial sentence $y_{1:t}$. In a setup reminiscent of Q-learning, we consider $Q_D(y_{1:t-1}, y_t)$ to be the estimated value for the state $s = y_{1:t-1}$ and action $a = y_t$. Thus, the discriminator provides estimates of the true state-action values without the additional computational overhead of MCTS rollouts. We train the discriminator as in traditional GANs, by maximizing the score $D(y_{1:T} \mid \theta_d)$ for real sentences and minimizing it for fake ones. This is achieved by minimizing the loss $\mathcal{L}(D)$:

$$\mathcal{L}(D) = \mathbb{E}_{y_{1:T} \sim P_R}\left[-\log D(y_{1:T} \mid \theta_d)\right] + \mathbb{E}_{y_{1:T} \sim G}\left[-\log\left(1 - D(y_{1:T} \mid \theta_d)\right)\right] \tag{4}$$
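The averaging in Eqn. 3 and the loss in Eqn. 4 can be illustrated with a small sketch. It assumes the per-timestep scores $Q_D$ are given as lists of probabilities; the function names and the example scores are hypothetical, and the expectations are replaced by batch averages.

```python
import math

def sentence_score(step_scores):
    """Eqn. 3: average the per-timestep scores Q_D into one sentence score."""
    return sum(step_scores) / len(step_scores)

def discriminator_loss(real_batch, fake_batch):
    """Eqn. 4: batch estimate of the discriminator loss."""
    real_term = sum(-math.log(sentence_score(s)) for s in real_batch) / len(real_batch)
    fake_term = sum(-math.log(1.0 - sentence_score(s)) for s in fake_batch) / len(fake_batch)
    return real_term + fake_term

# Hypothetical per-timestep scores for one real and one fake sentence.
confident = discriminator_loss([[0.9, 0.8, 0.95]], [[0.1, 0.2, 0.05]])
undecided = discriminator_loss([[0.5, 0.5, 0.5]], [[0.5, 0.5, 0.5]])
```

A discriminator that separates real from fake sentences (`confident`) obtains a lower loss than one that outputs 0.5 everywhere (`undecided`).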

We also include a discrimination critic $D_{crit}$ [Konda and Tsitsiklis, 2000], which is trained to approximate the score $Q_D(y_{1:t-1}, y_t)$ from the discriminator network for the next token $y_t$, based on the preceding partial sentence $y_{1:t-1}$. The approximated score $V_D(y_{1:t-1})$ is used to stabilize the policy gradient updates for the generator during adversarial training.

$$V_D(y_{1:t-1}) = \mathbb{E}_{y_t}\left[Q_D(y_{1:t-1}, y_t)\right] \tag{5}$$

$D_{crit}$ is trained to minimize the sequence mean-squared error between $V_D(y_{1:t-1})$ and the actual score $Q_D(y_{1:t-1}, y_t)$:

$$\mathcal{L}(D_{crit}) = \mathbb{E}_{y_{1:T}} \left\lVert Q_D(y_{1:t-1}, y_t) - V_D(y_{1:t-1}) \right\rVert^2 \tag{6}$$
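To make Eqns. 5 and 6 concrete, the following sketch computes the expectation exactly over a toy two-token vocabulary. In the paper the critic is a learned network head that only approximates this expectation; the assumption that $y_t$ is drawn from the generator's next-token distribution, the function names and the numbers are all illustrative.

```python
def state_value(policy_probs, q_values):
    """Eqn. 5: V_D as the expectation of Q_D over candidate next tokens."""
    return sum(p * q for p, q in zip(policy_probs, q_values))

def critic_loss(q_scores, v_estimates):
    """Eqn. 6: mean-squared error between Q_D and V_D over the timesteps."""
    squared = [(q - v) ** 2 for q, v in zip(q_scores, v_estimates)]
    return sum(squared) / len(squared)

# Hypothetical next-token probabilities and the scores Q_D of each candidate.
v = state_value([0.5, 0.5], [0.2, 0.8])
# Hypothetical Q_D scores and critic estimates for a 2-token sentence.
loss = critic_loss([0.5, 0.7], [0.5, 0.5])
```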

The discriminator network is implemented as a unidirectional Recurrent Neural Network (RNN) with one dense output layer, which produces the probability that a sentence is real at each timestep, i.e., $Q_D(y_{1:t-1}, y_t)$. For the discrimination critic, we attach an additional dense output layer (different from the one that computes $Q_D(y_{1:t-1}, y_t)$) to the discriminator RNN, which estimates $V_D(y_{1:t-1})$ for each timestep.

3.4 Classifier

Given a sentence $y_{1:T}$, the classifier $C$ with parameters $\theta_c$ predicts whether the sentence belongs to class $c \in C$. Like the discriminator, it assigns a prediction score $Q_C(y_{1:t-1}, y_t, c)$ at each timestep for the partial sentence $y_{1:t}$, which gives the probability that the sentence belongs to class $c$. The intermediate scores are then averaged to produce the overall score:

$$C(y_{1:T}, c \mid \theta_c) = \frac{1}{T} \sum_{t=1}^{T} Q_C(y_{1:t-1}, y_t, c) \tag{7}$$

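The classifier loss $\mathcal{L}_C$ is based on: 1) $\mathcal{L}(C_R)$, the cross-entropy loss on true labeled sentences, computed using the overall classifier sentence score; 2) $\mathcal{L}(C_G)$, the loss for the fake sentences. Fake sentences are considered potentially noisy training examples, so we not only minimize the cross-entropy loss but also include the Shannon entropy $H(C(c \mid y_{1:T}, \theta_c))$.

$$\begin{aligned}
\mathcal{L}_C &= \mathcal{L}(C_R) + \mathcal{L}(C_G) \\
\mathcal{L}(C_R) &= \mathbb{E}_{(y_{1:T}, c) \sim P_R(y, c)}\left[-\log C(c \mid y_{1:T}, \theta_c)\right] \\
\mathcal{L}(C_G) &= \mathbb{E}_{c \sim P_c,\, y_{1:T} \sim G}\left[-\log C(c \mid y_{1:T}, \theta_c) - \beta H(C(c \mid y_{1:T}, \theta_c))\right]
\end{aligned} \tag{8}$$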
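The two quantities that enter $\mathcal{L}(C_G)$ in Eqn. 8 can be evaluated for a single sentence as in the sketch below. It only computes the cross-entropy and Shannon-entropy terms; how they are weighted and combined follows Eqn. 8. The dictionary representation of the class distribution and the example probabilities are assumptions made for illustration.

```python
import math

def cross_entropy(class_probs, label):
    """Cross-entropy term: negative log-probability of the given class."""
    return -math.log(class_probs[label])

def shannon_entropy(class_probs):
    """Shannon entropy H of the predicted class distribution."""
    return -sum(p * math.log(p) for p in class_probs.values() if p > 0.0)

# Hypothetical classifier outputs for two generated sentences.
confident = {"spam": 0.9, "non-spam": 0.1}
uncertain = {"spam": 0.5, "non-spam": 0.5}

ce = cross_entropy(confident, "spam")
h_confident = shannon_entropy(confident)
h_uncertain = shannon_entropy(uncertain)
```

The entropy is largest for the uniform prediction and shrinks as the classifier becomes more confident, which is the quantity the entropy regularization acts on.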


In $\mathcal{L}(C_G)$, the balancing parameter $\beta$ controls the impact of the Shannon entropy. Including $H(C(c \mid y_{1:T}, \theta_c))$ for minimum entropy regularization [Hu et al., 2017] allows the classifier to predict classes for generated fake sentences more confidently. This is crucial in reinforcing the generator to produce sentences of the given class during adversarial training. As for the discriminator, we include a classification critic $C_{crit}$ to estimate the classifier score $Q_C(y_{1:t-1}, y_t, c)$ for $y_t$ based on the preceding partial sentence $y_{1:t-1}$:

$$V_C(y_{1:t-1}, c) = \mathbb{E}_{y_t}\left[Q_C(y_{1:t-1}, y_t, c)\right] \tag{9}$$

The implementation of the classifier is similar to that of the discriminator. We use a unidirectional recurrent neural network with a dense output layer producing the predicted probability distribution over classes $c \in C$. The classification critic is likewise an alternative head off the classifier RNN, with an additional dense layer estimating $V_C(y_{1:t-1}, c)$ for each timestep. We train this classification critic by minimizing $\mathcal{L}(C_{crit})$:

$$\mathcal{L}(C_{crit}) = \mathbb{E}_{y_{1:T}} \left\lVert Q_C(y_{1:t-1}, y_t, c) - V_C(y_{1:t-1}, c) \right\rVert^2 \tag{10}$$

3.5 Reinforcement Learning Component

We consider a sequential decision-making framework in which the generator acts as a reinforcement learning agent. The current state of the agent is the tokens generated so far, $s_t = y_{1:t-1}$. The action $y_t$ is the next token to be generated, which is selected according to the stochastic policy $G(y_t \mid y_{1:t-1}, z, c, \theta_g)$. The reward the agent receives for the generated sentence $y_{1:T}$ of a given class $c$ is determined by the discriminator and classifier. Specifically, we take the overall scores $D(y_{1:T} \mid \theta_d)$ (Eqn. 3) and $C(y_{1:T}, c \mid \theta_c)$ (Eqn. 7) and blend them in a manner reminiscent of the F1 score, producing the sentence reward

$$R(y_{1:T}) = 2 \cdot \frac{D(y_{1:T} \mid \theta_d) \cdot C(y_{1:T}, c \mid \theta_c)}{D(y_{1:T} \mid \theta_d) + C(y_{1:T}, c \mid \theta_c)} \tag{11}$$

This reward $R(y_{1:T})$ is for the entire sentence and is delivered at the final timestep, with the reward for every other timestep being zero [Tuan and Lee, 2018]. Thus, the generator agent seeks to maximize the expected reward, given by

$$\mathcal{L}(G) = \mathbb{E}_{y_{1:T} \sim G}\left[R(y_{1:T})\right] \tag{12}$$
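The F1-like blend in Eqn. 11 is a harmonic mean of the two sentence scores. A minimal sketch, with a hypothetical function name and example scores, shows how it behaves when the discriminator and classifier disagree:

```python
def sentence_reward(d_score, c_score):
    """Eqn. 11: harmonic-mean blend of discriminator and classifier scores."""
    total = d_score + c_score
    if total == 0.0:
        return 0.0  # guard added for this sketch; not discussed in the paper
    return 2.0 * d_score * c_score / total

balanced = sentence_reward(0.8, 0.8)   # both components agree
lopsided = sentence_reward(0.9, 0.1)   # realistic, but wrong class
```

Unlike an arithmetic mean, the harmonic mean stays low when either score is low, so a sentence must be both realistic and of the requested class to earn a high reward.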

Algorithm 1: spamGAN

     1  Input: labeled dataset D_L, unlabeled dataset D_U
     2  Parameters: network parameters θ_g, θ_d, θ_c, θ_dcrit, θ_ccrit
     3  Perform pre-training as described in Sec. 3.6
     4  for Training-epochs do
     5      for G-Adv-epochs do
     6          sample batch of classes c from P(c)
     7          generate batch of fake sentences y_{1:T} ∼ G given c
     8          for t ∈ 1:T do
     9              compute Q(y_{1:t}, c), V(y_{1:t-1}, c) using Eqn. 13
    10          update θ_g using policy gradient ∇_θg L(G) in Eqn. 14
    11      for G-MLE-epochs do
    12          sample batch of real sentences from D_L, D_U
    13          update θ_g using MLE in Eqn. 2
    14      for D-epochs do
    15          sample batch of real sentences from D_L, D_U
    16          sample batch of fake sentences from G
    17          update discriminator using ∇_θd L(D) from Eqn. 4
    18          compute Q_D(y_{1:t-1}, y_t), V_D(y_{1:t-1}) for fake sentences
    19          update D_crit using ∇_θdcrit L(D_crit) from Eqn. 6
    20      for C-epochs do
    21          sample batch of real sentence-class pairs from D_L
    22          sample batch of fake sentence-class pairs from G
    23          update classifier using ∇_θc L(C) from Eqn. 8
    24          compute Q_C(y_{1:t-1}, y_t, c), V_C(y_{1:t-1}, c) on fake sentences
    25          update C_crit using ∇_θccrit L(C_crit) from Eqn. 10

To maximize $\mathcal{L}(G)$, the generator parameters $\theta_g$ are updated via policy gradients [Sutton et al., 2000]. Specifically, we use the advantage actor-critic method to solve for the optimal policy [Konda and Tsitsiklis, 2000]. The expectation in Eqn. 12 can be re-written using rewards for intermediate timesteps from the discriminator and classifier. The intermediate scores from the discriminator, $Q_D(y_{1:t-1}, y_t)$, and the classifier, $Q_C(y_{1:t-1}, y_t, c)$, are combined as shown in Eqn. 13, and the combined values serve as estimators for $Q(y_{1:t}, c)$, the expected reward for the sentence $y_{1:t}$. To reduce variance in the gradient estimates, we replace $Q(y_{1:t}, c)$ by the advantage function $Q(y_{1:t}, c) - V(y_{1:t-1}, c)$, where $V(y_{1:t-1}, c)$ is given by Eqn. 13. We use $\alpha = T - t$ in Eqn. 14 to increase the importance of initially generated tokens while updating $\theta_g$. $\alpha$ is a linearly decreasing factor which corrects the relative lack of confidence in the initial intermediate scores from the discriminator and classifier.

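$$\begin{aligned}
Q(y_{1:t}, c) &= 2 \cdot \frac{Q_D(y_{1:t-1}, y_t) \cdot Q_C(y_{1:t-1}, y_t, c)}{Q_D(y_{1:t-1}, y_t) + Q_C(y_{1:t-1}, y_t, c)} \\
V(y_{1:t-1}, c) &= 2 \cdot \frac{V_D(y_{1:t-1}) \cdot V_C(y_{1:t-1}, c)}{V_D(y_{1:t-1}) + V_C(y_{1:t-1}, c)}
\end{aligned} \tag{13}$$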

During adversarial training, we perform gradient ascent to update the generator using the gradient

$$\nabla_{\theta_g} \mathcal{L}(G) = \mathbb{E}_{y_{1:T}}\left[\sum_{t} \alpha \left(Q(y_{1:t}, c) - V(y_{1:t-1}, c)\right) \nabla_{\theta_g} \log G(y_t \mid y_{1:t-1}, z, c, \theta_g)\right] \tag{14}$$
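The scalar weight that multiplies each $\nabla_{\theta_g} \log G(y_t \mid \cdot)$ term in Eqn. 14 can be sketched as follows. The code combines the intermediate scores as in Eqn. 13 and applies $\alpha = T - t$; treating $t$ as 1-indexed is an assumption of this sketch, and the function names and scores are hypothetical.

```python
def harmonic(a, b):
    """Harmonic-mean blend used for both Q and V in Eqn. 13."""
    return 2.0 * a * b / (a + b)

def advantage_weights(q_d, q_c, v_d, v_c):
    """Per-timestep weights alpha * (Q - V) appearing in Eqn. 14."""
    T = len(q_d)
    weights = []
    for t in range(1, T + 1):  # 1-indexed timesteps (assumption)
        q = harmonic(q_d[t - 1], q_c[t - 1])
        v = harmonic(v_d[t - 1], v_c[t - 1])
        alpha = T - t  # linearly decreasing factor
        weights.append(alpha * (q - v))
    return weights

# Hypothetical intermediate scores for a 3-token generated sentence.
weights = advantage_weights([0.6] * 3, [0.6] * 3, [0.5] * 3, [0.5] * 3)
```

With equal advantages at every step, the weights decrease linearly along the sentence, so earlier tokens contribute more to the update; under the 1-indexed convention assumed here the final token receives zero weight.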

3.6 Pre-Training

Before beginning adversarial training, we pre-train the different components of spamGAN. The generator $G$ is pre-trained using maximum likelihood estimation (MLE) [Grover et al., 2018] by updating the parameters via Eqn. 2. Once the generator is pre-trained, we take batches of real sentences from the labeled dataset $D_L$ and the unlabeled dataset $D_U$, together with fake sentences sampled from $G(y_{1:T} \mid z, c, \theta_g)$, to pre-train the discriminator by minimizing the loss $\mathcal{L}(D)$ in Eqn. 4. The classifier $C$ is pre-trained solely on real sentences from the labeled dataset $D_L$. It is trained to minimize the cross-entropy loss $\mathcal{L}(C_R)$ on real sentences and their labels. The critic networks $D_{crit}$ and $C_{crit}$ are trained by minimizing their losses $\mathcal{L}(D_{crit})$ (Eqn. 6) and $\mathcal{L}(C_{crit})$ (Eqn. 10). Such pre-training addresses the problem of mode collapse [Guo et al., 2018] to a satisfactory extent.

| Method | 10% Labeled | 30% | 50% | 70% | 90% | 100% |
|---|---|---|---|---|---|---|
| spamGAN-0% | 0.700 ± 0.02 | 0.811 ± 0.02 | 0.838 ± 0.01 | 0.845 ± 0.01 | 0.852 ± 0.02 | 0.862 ± 0.01 |
| spamGAN-50% | 0.678 ± 0.03 | 0.797 ± 0.03 | 0.839 ± 0.02 | 0.845 ± 0.02 | 0.857 ± 0.02 | 0.856 ± 0.01 |
| spamGAN-70% | 0.695 ± 0.05 | 0.780 ± 0.03 | 0.828 ± 0.02 | 0.850 ± 0.01 | 0.841 ± 0.02 | 0.844 ± 0.02 |
| spamGAN-100% | 0.681 ± 0.02 | 0.783 ± 0.02 | 0.831 ± 0.01 | 0.837 ± 0.01 | 0.843 ± 0.02 | 0.845 ± 0.01 |
| Base classifier | 0.722 ± 0.03 | 0.786 ± 0.02 | 0.791 ± 0.02 | 0.829 ± 0.01 | 0.824 ± 0.02 | 0.827 ± 0.02 |
| DRI-RCNN | 0.647 ± 0.10 | 0.757 ± 0.01 | 0.796 ± 0.01 | 0.834 ± 0.18 | 0.835 ± 0.02 | 0.846 ± 0.01 |
| RCNN | 0.538 ± 0.09 | 0.665 ± 0.14 | 0.733 ± 0.09 | 0.811 ± 0.03 | 0.834 ± 0.02 | 0.825 ± 0.02 |
| Co-Train (Naive Bayes) | 0.655 ± 0.01 | 0.740 ± 0.01 | 0.738 ± 0.02 | 0.743 ± 0.01 | 0.754 ± 0.01 | 0.774 ± 0.01 |
| PU Learn (Naive Bayes) | 0.508 ± 0.02 | 0.713 ± 0.03 | 0.816 ± 0.01 | 0.826 ± 0.01 | 0.838 ± 0.02 | 0.843 ± 0.02 |

Table 1: Accuracy (Mean ± Std) for Different % Labeled Data

3.7 spamGAN Algorithm

Alg. 1 describes spamGAN in detail. After pre-training, we perform adversarial training for Training-epochs (Lines 4-25). We create a batch of fake sentences using the generator $G$ by sampling classes $c$ from the prior $P_c$ (Lines 6-7). We compute $Q(y_{1:t}, c)$ and $V(y_{1:t-1}, c)$ using Eqn. 13 for every timestep (Line 9). The generator is then updated using the policy gradient in Eqn. 14 (Line 10). This process is repeated for G-Adv-epochs. As in [Li et al., 2017], training robustness is greatly improved when the generator is also updated using MLE via Eqn. 2 on sentences from $D$ (Lines 11-13). We then train the discriminator using real sentences from $D_L$ and $D_U$ as well as fake sentences from the generator (Lines 15-16). The discriminator is updated using Eqn. 4 (Line 17). We also train the discrimination critic by computing $Q_D(y_{1:t-1}, y_t)$ and $V_D(y_{1:t-1})$ for the fake sentences and updating its parameters using the gradients of Eqn. 6 (Lines 18-19). This process is repeated for D-epochs. We perform a similar set of operations for the classifier (Lines 20-25).
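The ordering of updates within one pass of Alg. 1 can be made explicit with a small sketch that only records which update happens when. The step labels and the epoch counts passed in are hypothetical; the paper does not report the values used for Training-epochs, G-Adv-epochs, G-MLE-epochs, D-epochs or C-epochs in this section.

```python
def training_schedule(training_epochs, g_adv_epochs, g_mle_epochs, d_epochs, c_epochs):
    """Return the ordered list of update steps performed by Alg. 1."""
    steps = []
    for _ in range(training_epochs):
        for _ in range(g_adv_epochs):
            steps.append("G-adv")      # policy-gradient update, Eqn. 14
        for _ in range(g_mle_epochs):
            steps.append("G-mle")      # MLE update on real sentences, Eqn. 2
        for _ in range(d_epochs):
            steps.append("D")          # discriminator update, Eqn. 4
            steps.append("D-crit")     # discrimination critic update, Eqn. 6
        for _ in range(c_epochs):
            steps.append("C")          # classifier update, Eqn. 8
            steps.append("C-crit")     # classification critic update, Eqn. 10
    return steps

one_pass = training_schedule(1, 1, 1, 1, 1)
```

In an actual implementation each label would be replaced by the corresponding sampling and gradient step from Alg. 1.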

4 Experiments

We use the TripAdvisor labeled dataset [Ott et al., 2011]⁵, consisting of 800 truthful reviews of Chicago hotels and 800 deceptive reviews obtained from Amazon Mechanical Turk. We remove a small number of duplicate truthful reviews to get a balanced labeled dataset of 1596 reviews. We augment the labeled set with 32,297 unlabeled TripAdvisor reviews for Chicago hotels⁶. All reviews are converted to lower-case and tokenized at the word level, with a vocabulary $Y$ of 10,000 tokens⁷. The maximum sequence length is $T = 128$ words, close to the median review length of the full dataset. $Y$ also includes the special tokens start and end, which are added to the beginning and end of each sentence, respectively; pad, for padding sentences shorter than $T$ (longer sentences are truncated, ensuring a consistent sentence length); and unk, which replaces out-of-vocabulary words.

In spamGAN, the generator consists of 2 GRU layers of 1024 units each and an output dense layer providing logits for the 10,000 tokens. The generator, discriminator and classifier are trained using the ADAM optimizer. All use variational dropout of 0.5 between recurrent layers and word embeddings of dimension 50. For the generator, the learning rate is 0.001 and the weight decay is $1 \times 10^{-7}$. Gradient clipping is set to a maximum global norm of 5. The discriminator contains 2 GRU layers of 512 units each and a dense layer with a single scalar output and sigmoid activation. The discrimination critic is implemented as an alternative dense layer. The discriminator uses a learning rate of 0.0001 and a weight decay of $1 \times 10^{-4}$. The classifier is similar to the discriminator. We set the balancing coefficient $\beta = 1$. The training time of spamGAN on a Tesla P4 GPU was 1.5 hrs. We use an 80/20 train-test split on the labeled data.

We compare spamGAN with two supervised methods, 1) DRI-RCNN [Zhang et al., 2018] and 2) RCNN [Lai et al., 2015], and two semi-supervised methods, 3) Co-Training [Li et al., 2011] with Naive Bayes and 4) PU Learning [Hernández et al., 2013] with Naive Bayes (SVM performed poorly) using only spam and unlabeled reviews. We conduct experiments with 10, 30, 50, 70, 90 and 100% of the labeled data. To analyze the impact of unlabeled data, we also adjoin differing amounts of unlabeled data to the labeled data: spamGAN-0 (no unlabeled data), spamGAN-50 (50% unlabeled data), spamGAN-70 (70% unlabeled) and spamGAN-100. Co-Train and PU-Learn results are for 50% unlabeled data. We also show the performance of our base classifier (without generator or discriminator, trained on real labeled data to minimize $\mathcal{L}(C_R)$). All experiments are repeated 10 times, and the mean and standard deviation are reported.

⁵http://myleott.com/op-spam.html

⁶http://times.cs.uiuc.edu/ wang296/Data/index.html

⁷Vocabulary includes all words from labeled data and most frequently occurring words from unlabeled data.

Figure 2: Comparison of spamGAN-50 with Other Approaches
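The preprocessing described in this section (lower-casing, word-level tokenization, start/end markers, padding or truncation to a fixed length, and an unknown-word token) can be sketched as below. Whitespace tokenization, the spelling of the special tokens, counting the start/end markers towards the maximum length, and building the vocabulary purely by frequency are all simplifying assumptions of this sketch, not details confirmed by the paper.

```python
from collections import Counter

# Hypothetical spellings of the four special tokens.
START, END, PAD, UNK = "<start>", "<end>", "<pad>", "<unk>"

def build_vocab(token_lists, max_size):
    """Map the special tokens and the most frequent words to integer ids."""
    counts = Counter(tok for toks in token_lists for tok in toks)
    specials = [START, END, PAD, UNK]
    words = [w for w, _ in counts.most_common(max_size - len(specials))]
    return {tok: i for i, tok in enumerate(specials + words)}

def encode(review, vocab, max_len):
    """Lower-case, tokenize, truncate or pad to max_len, and map to ids."""
    tokens = review.lower().split()
    tokens = [START] + tokens[: max_len - 2] + [END]
    tokens += [PAD] * (max_len - len(tokens))
    return [vocab.get(t, vocab[UNK]) for t in tokens]

# Toy corpus; the paper uses a vocabulary of 10,000 and max_len T = 128.
vocab = build_vocab([["great", "hotel"], ["great", "stay"]], max_size=7)
ids = encode("Great hotel overall", vocab, max_len=6)
long_ids = encode("a b c d e f g h", vocab, max_len=6)
```

Here the out-of-vocabulary word "overall" maps to the unknown-word id, the short review is padded, and the long review is truncated so that every encoded sentence has the same length.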

Influence of Labeled Data

Table 1 shows the classification accuracy on the test data. spamGAN models, in general, outperform other approaches, especially when the % of labeled data is limited. When we use merely 10% of the labeled data, spamGAN-0, spamGAN-50, spamGAN-70 and spamGAN-100 achieve accuracies of 0.700, 0.678, 0.695 and 0.681, respectively, higher than the supervised approaches DRI-RCNN (0.647) and RCNN (0.538) and the semi-supervised approaches Co-Train (0.655) and PU-Learn (0.508). Even without unlabeled data, spamGAN-0 gets good results because the mutual bootstrapping between generator and classifier allows the classifier to explore beyond the small labeled training set using the fake sentences produced by the generator. Our base classifier has a higher value (0.722) than the spamGAN models, as GANs in general need more samples to train. The accuracy of all approaches increases with the % of labeled data. We select spamGAN-50 as a representative for comparison in Fig. 2. Though the difference in accuracy between spamGAN-50 and the others reduces as the % of labeled data increases, spamGAN-50 still performs better than the others, with an accuracy of 0.856 when all labeled data are considered. Table 2 shows the F1-score. We can again see that spamGAN-0, spamGAN-50 and spamGAN-70 perform better than the others, especially when the % of labeled data is small.

| Method | 10% Labeled | 30% | 50% | 70% | 90% | 100% |
|---|---|---|---|---|---|---|
| spamGAN-0% | 0.718 ± 0.02 | 0.812 ± 0.02 | 0.840 ± 0.01 | 0.848 ± 0.02 | 0.854 ± 0.02 | 0.868 ± 0.01 |
| spamGAN-50% | 0.674 ± 0.05 | 0.797 ± 0.03 | 0.843 ± 0.01 | 0.848 ± 0.02 | 0.860 ± 0.02 | 0.863 ± 0.01 |
| spamGAN-70% | 0.702 ± 0.05 | 0.784 ± 0.03 | 0.830 ± 0.02 | 0.856 ± 0.01 | 0.848 ± 0.02 | 0.854 ± 0.01 |
| spamGAN-100% | 0.684 ± 0.03 | 0.788 ± 0.03 | 0.839 ± 0.02 | 0.844 ± 0.01 | 0.846 ± 0.02 | 0.850 ± 0.01 |
| Base classifier | 0.731 ± 0.03 | 0.795 ± 0.03 | 0.803 ± 0.02 | 0.829 ± 0.01 | 0.832 ± 0.02 | 0.838 ± 0.02 |
| DRI-RCNN | 0.632 ± 0.07 | 0.754 ± 0.02 | 0.779 ± 0.00 | 0.812 ± 0.03 | 0.817 ± 0.03 | 0.833 ± 0.02 |
| RCNN | 0.638 ± 0.01 | 0.715 ± 0.01 | 0.754 ± 0.02 | 0.776 ± 0.05 | 0.820 ± 0.03 | 0.833 ± 0.02 |
| Co-Train (Naive Bayes) | 0.637 ± 0.02 | 0.698 ± 0.01 | 0.680 ± 0.02 | 0.677 ± 0.01 | 0.712 ± 0.01 | 0.726 ± 0.01 |
| PU-Learn (Naive Bayes) | 0.050 ± 0.02 | 0.636 ± 0.05 | 0.815 ± 0.02 | 0.837 ± 0.02 | 0.844 ± 0.02 | 0.852 ± 0.01 |

Table 2: F1-Score (Mean ± Std) for Different % Labeled Data

Influence of Unlabeled Data

While unlabeled data is used to augment the classifier's performance, Fig. 3 shows that the F1-score decreases slightly as the % of unlabeled data increases, especially for spamGAN-100. In our case, as the unlabeled data is much larger than the labeled data, the generator does not entirely learn the importance of the sentence classes during pre-training (when the unlabeled sentence classes are randomly assigned), which causes problems for the classifier during adversarial training. However, when no unlabeled data is used, the generator easily learns to generate sentences conditioned on classes, paving the way for mutual bootstrapping between classifier and generator. We also attribute the drop in performance to the difference in data distribution between the unlabeled TripAdvisor reviews and the handcrafted reviews from Amazon Mechanical Turk (unlabeled data can improve performance only under assumptions about data distributions [Wasserman and Lafferty, 2008]).

Perplexity of Generated Sentence

We also compute the perplexity of the sentences produced by the generator (the lower the value, the better). Fig. 4 shows that as the % of unlabeled data increases (spamGAN-0 to spamGAN-100), the perplexity of the sentences decreases. spamGAN-100 and spamGAN-70 achieve perplexities of 76.4 and 76.5, respectively. Fig. 3 and Fig. 4 show that using unlabeled data improves the generator in producing realistic sentences but does not fully help to differentiate between the classes, which, again, can be attributed to the difference in data distribution between the labeled and unlabeled data.

The following is a sample (partial) spam sentence produced by the generator: "Loved this hotel but i decided to the hotel in a establishment didnt look bad ...the palmer house was anyplace that others said in the reviews..". We notice that spam sentences use a more conservative choice of words, focusing on adjectives, the reviewer, and attributes of the hotel, while non-spam sentences speak more about the trip in general.

Figure 3: Influence of Unlabeled Data on F1-Score

Figure 4: Influence of Unlabeled Data on Perplexity
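For reference, perplexity under its standard definition is the exponential of the average per-token negative log-likelihood. The paper does not spell out how its perplexity values were computed, so the sketch below is only an illustration of that standard definition, with a hypothetical function name and toy probabilities.

```python
import math

def perplexity(token_probs):
    """Exponential of the mean negative log-likelihood per token."""
    nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(nll)

# A model that assigns probability 0.25 to every token is as uncertain
# as a uniform choice among four tokens.
uniform_four = perplexity([0.25, 0.25, 0.25, 0.25])
```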

5 Conclusion and Future Work

We propose spamGAN, an approach for detecting opinion spam with limited labeled data. spamGAN also helps to generate reviews similar to the training set. Experiments show that spamGAN outperforms state-of-the-art supervised and semi-supervised techniques when labeled data is limited. We further plan to conduct experiments on the YelpZip data (overcoming the data distribution issue of Mechanical Turk reviews). As the overall spamGAN architecture is agnostic to the implementation details of the classifier, we plan to use a more sophisticated design for the classifier than a simple recurrent network.


[Crawford et al., 2015] Michael Crawford, Taghi M Khoshgoftaar, Joseph D Prusa, Aaron N Richter, and Hamzah Al Najada. Survey of review spam detection using machine learning techniques. Journal of Big Data, 2(1):23, 2015.

[Fedus et al., 2018] William Fedus, Ian Goodfellow, and Andrew M Dai. Maskgan: Better text generation via filling in the ______. In ICLR, 2018.

[Feng et al., 2012] Song Feng, Ritwik Banerjee, and Yejin Choi. Syntactic stylometry for deception detection. In ACL, 2012.

[Grover et al., 2018] Aditya Grover, Manik Dhar, and Stefano Ermon. Flow-gan: Combining maximum likelihood and adversarial learning in generative models. In AAAI, 2018.

[Guo et al., 2018] Jiaxian Guo, Sidi Lu, Han Cai, Weinan Zhang, Yong Yu, and Jun Wang. Long text generation via adversarial training with leaked information. In AAAI, 2018.

[Hernández et al., 2013] Donato Hernández, Rafael Guzmán, Manuel Montes y Gómez, and Paolo Rosso. Using pu-learning to detect deceptive opinion spam. In Workshop on computational approaches to subjectivity, sentiment and social media analysis, pages 38-45, 2013.

[Hu et al., 2017] Zhiting Hu, Zichao Yang, Xiaodan Liang, Ruslan Salakhutdinov, and Eric P Xing. Toward controlled generation of text. arXiv preprint arXiv:1703.00955, 2017.

[Huszár, 2015] Ferenc Huszár. How (not) to train your generative model: Scheduled sampling, likelihood, adversary? arXiv preprint arXiv:1511.05101, 2015.

[Jindal and Liu, 2008] Nitin Jindal and Bing Liu. Opinion spam and analysis. In WSDM, 2008.

[Jindal et al., 2010] Nitin Jindal, Bing Liu, and Ee-Peng Lim. Finding unusual review patterns using unexpected rules. In CIKM, 2010.

[Konda and Tsitsiklis, 2000] Vijay R Konda and John N Tsitsiklis. Actor-critic algorithms. In NIPS, 2000.

[Kumar et al., 2017] Abhishek Kumar, Prasanna Sattigeri, and Tom Fletcher. Semi-supervised learning with gans: manifold invariance with improved inference. In NIPS, 2017.

[Lai et al., 2015] Siwei Lai, Liheng Xu, Kang Liu, and Jun Zhao. Recurrent convolutional neural networks for text classification. In AAAI, 2015.

[LeCun et al., 2015] Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. Nature, 521(7553):436, 2015.

[Li et al., 2011] Fangtao Li, Minlie Huang, Yi Yang, and Xiaoyan Zhu. Learning to identify review spam. In IJCAI, 2011.

[Li et al., 2014] Huayi Li, Bing Liu, Arjun Mukherjee, and Jidong Shao. Spotting fake reviews using positive-unlabeled learning. Computación y Sistemas, 18(3):467-475, 2014.

[Li et al., 2015] Huayi Li, Zhiyuan Chen, Arjun Mukherjee, Bing Liu, and Jidong Shao. Analyzing and detecting opinion spam on a large-scale dataset via temporal and spatial patterns. In AAAI-ICWSM, 2015.

[Li et al., 2017] Jiwei Li, Will Monroe, Tianlin Shi, Sébastien Jean, Alan Ritter, and Dan Jurafsky. Adversarial learning for neural dialogue generation. 2017.

[Li et al., 2018] Yang Li, Quan Pan, Suhang Wang, Tao Yang, and Erik Cambria. A generative model for category text generation. Information Sciences, 450:301-315, 2018.

[Mukherjee et al., 2013] Arjun Mukherjee, Vivek Venkataraman, Bing Liu, and Natalie Glance. What yelp fake review filter might be doing? In AAAI-ICWSM, 2013.

[Ott et al., 2011] Myle Ott, Yejin Choi, Claire Cardie, and Jeffrey T. Hancock. Finding deceptive opinion spam by any stretch of the imagination. In ACL, 2011.

[Rayana and Akoglu, 2015] Shebuti Rayana and Leman Akoglu. Collective opinion spam detection: Bridging review networks and metadata. In KDD, 2015.

[Ren and Ji, 2017] Yafeng Ren and Donghong Ji. Neural networks for deceptive opinion spam detection: An empirical study. Information Sciences, 385:213-224, 2017.

[Sutton et al., 2000] Richard S Sutton, David A McAllester, Satinder P Singh, and Yishay Mansour. Policy gradient methods for reinforcement learning with function approximation. In NIPS, 2000.

[Tuan and Lee, 2018] Yi-Lin Tuan and Hung-Yi Lee. Improving conditional sequence generative adversarial networks by stepwise evaluation. arXiv:1808.05599, 2018.

[Wasserman and Lafferty, 2008] Larry Wasserman and John D Lafferty. Statistical analysis of semi-supervised regression. In NIPS, 2008.

[Xu et al., 2015] Yinqing Xu, Bei Shi, Wentao Tian, and Wai Lam. A unified model for unsupervised opinion spamming detection incorporating text generality. In AAAI, 2015.

[Yu et al., 2017] Lantao Yu, Weinan Zhang, Jun Wang, and Yong Yu. Seqgan: Sequence generative adversarial nets with policy gradient. In AAAI, 2017.

[Zhang et al., 2018] Wen Zhang, Yuhang Du, Taketoshi Yoshida, and Qing Wang. Dri-rcnn: An approach to deceptive review identification using recurrent convolutional neural network. Information Processing & Management, 54(4):576-592, 2018.
