Aspect-based Sentiment Analysis with Opinion Tree Generation

Xiaoyi Bao1, Zhongqing Wang1,∗, Xiaotong Jiang1, Rong Xiao2, and Shoushan Li1
1 Natural Language Processing Lab, Soochow University, Suzhou, China
2 Alibaba Group, Hangzhou, China
baoxiaoyiharris@foxmail.com, devjiang@outlook.com, xiaorong.xr@taobao.com, {wangzq,lishoushan}@suda.edu.cn
∗ Corresponding author

Abstract

Existing studies usually extract the sentiment elements by decomposing the complex structure prediction task into multiple subtasks. Despite their effectiveness, these methods ignore the semantic structure in ABSA problems and require extensive task-specific designs. In this study, we introduce a new Opinion Tree Generation model, which aims to jointly detect all sentiment elements in a tree. The opinion tree can reveal a more comprehensive and complete aspect-level sentiment structure. Furthermore, we employ a pre-trained model to integrate both syntax and semantic features for opinion tree generation. On one hand, a pre-trained model trained on large-scale unlabeled data is important for the tree generation model. On the other hand, the syntax and semantic features are very effective for forming the opinion tree structure. Extensive experiments show the superiority of our proposed method. The results also validate that the tree structure is effective for generating sentiment elements.

1 Introduction

As a fine-grained sentiment analysis task, aspect-based sentiment analysis (ABSA) has received continuous attention. Multiple fundamental sentiment elements are involved in ABSA, including the aspect term, opinion term, aspect category, and sentiment polarity. Given the simple example sentence "The surface is smooth.", the corresponding elements are "surface", "smooth", Design, and Positive, respectively.

In the literature, the main research line of ABSA focuses on the identification of those sentiment elements, such as extracting the aspect term [Qiu et al., 2011], classifying the sentiment polarity for a given aspect [Tang et al., 2016], or jointly predicting multiple elements simultaneously [Peng et al., 2020; Cai et al., 2021]. In general, most ABSA tasks are formulated as either sequence-level or token-level classification problems. However, these methods can suffer severely from error propagation, because the overall prediction performance hinges on the accuracy of every step [Peng et al., 2020].

[Figure 1: Example of opinion tree generation for the review sentence "The surface is smooth, but the apps are hard to use.", yielding the quads (surface, Design, smooth, Positive) and (apps, Software, hard, Negative).]

Besides, these methods ignore the label semantics, since they treat the labels as number indices during training [Yan et al., 2021; Zhang et al., 2021b]. Therefore, recent studies tackle the ABSA problem in a unified generative approach. For example, they treat the class index [Yan et al., 2021] or the desired sentiment element sequence [Zhang et al., 2021b] as the target of the generation model. [Zhang et al., 2021a] further rewrite sentiment quads into paraphrases and employ a paraphrase generation model to generate the sentiment-aware paraphrase. These studies can fully utilize the rich label semantics by encoding the natural language labels into the target output.
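To make the quadruplet format concrete before turning to its structural limitations, here is a minimal sketch of one possible representation; the class and field names are illustrative choices of ours, not from the paper:

```python
from typing import NamedTuple

class SentimentQuad(NamedTuple):
    """One aspect-level sentiment quadruplet; field names are illustrative."""
    aspect_term: str      # text span from the review, e.g. "surface"
    aspect_category: str  # predefined category, e.g. "Design"
    opinion_term: str     # text span from the review, e.g. "smooth"
    polarity: str         # "Positive" / "Negative" / "Neutral"

# Quads for "The surface is smooth, but the apps are hard to use."
quads = [
    SentimentQuad("surface", "Design", "smooth", "Positive"),
    SentimentQuad("apps", "Software", "hard", "Negative"),
]
```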
Despite giving strong empirical results, the generative approaches lack structural guarantees in their neural semantic representations [Lu et al., 2021; Zhou et al., 2021], i.e., they cannot capture the semantic structure between aspect terms and opinion words. Intuitively, such issues can be alleviated by a structural representation of the semantic information, which treats aspect terms and opinion words as nodes and builds structural relations between the nodes. Explicit structures are also more interpretable than neural representations and have been shown useful for many NLP applications [Xue and Li, 2018; Zhang and Qian, 2020; Wang et al., 2020].

In this study, we introduce a new Opinion Tree Generation model, which aims to jointly detect all sentiment elements in a tree for a given review sentence. The opinion tree can be considered a semantic representation that better captures the structure of sentiment elements. As shown in Figure 1, the opinion tree models a sentence as a rooted directed acyclic graph, highlighting its main elements (e.g., aspect terms, opinion words) and semantic relations. It can thus potentially reveal a more comprehensive and complete aspect-level semantic structure for extracting sentiment elements.

Meanwhile, the opinion tree structure is hard to capture, since it consists of rich semantic relations and multiple interactions between sentiment elements. To this end, we design two strategies for effectively forming the opinion tree structure. First, we propose a constrained decoding algorithm, which guides the generation process using opinion schemas; in this way, opinion knowledge can be injected and exploited during inference. Second, we explore sequence-to-sequence joint learning of several pre-training tasks to integrate syntax and semantic features and optimize the performance of opinion tree generation.

Detailed evaluation shows that our model significantly advances the state-of-the-art performance on several benchmark datasets. The results also show the effect of the proposed opinion tree architecture, and that our proposed sequence-to-sequence pre-training is necessary to achieve strong performance.

2 Related Work

Aspect-based sentiment analysis (ABSA) has drawn wide attention during the last decade. Early studies focus on the prediction of a single element, such as extracting the aspect term [Qiu et al., 2011], detecting the mentioned aspect category [Bu et al., 2021], and predicting the sentiment polarity for a given aspect [Tang et al., 2016]. Some works further consider the joint detection of two sentiment elements, including the pairwise extraction of aspect and opinion terms [Xu et al., 2020b], the prediction of an aspect term and its corresponding sentiment polarity [Zhang and Qian, 2020], and the co-extraction of aspect category and sentiment polarity [Cai et al., 2020].

Recently, aspect sentiment triplet and quadruple prediction tasks have been proposed in ABSA; they employ end-to-end models to predict the sentiment elements in triplet or quadruple format [Peng et al., 2020; Wan et al., 2020; Cai et al., 2021; Zhang et al., 2021a]. More recently, there have been some attempts to tackle the ABSA problem in a sequence-to-sequence manner [Zhang et al., 2021a], treating either the class index [Yan et al., 2021] or the desired sentiment element sequence [Zhang et al., 2021b] as the target of the generation model.
For example, [Yan et al., 2021] treated ABSA as a text generation problem and employed a sequence-to-sequence pre-trained model to directly generate the sequence of aspect terms and opinion words. Meanwhile, [Zhang et al., 2021a] proposed a paraphrase model that utilizes the knowledge of a pre-trained model by casting the original task as a paraphrase generation process; they employed the paraphrase to represent aspect-based quads.

Different from previous studies, we propose a new task called opinion tree generation, which aims to jointly detect all sentiment elements in a tree for a given review sentence. The opinion tree can reveal a more comprehensive and complete aspect-level sentiment structure for generating sentiment elements.

3 Opinion Tree Generation

Given a review sentence, we aim to predict all aspect-level sentiment quadruplets, whose elements correspond to the aspect category, aspect term, opinion term, and sentiment polarity, respectively [Cai et al., 2021; Zhang et al., 2021a]. In this study, we propose an opinion tree generation model to jointly detect all sentiment quadruplets in an opinion tree.

[Figure 2: Overview of the proposed model: (a) the review sentence "The surface is smooth, but the apps are hard to use."; (b) the opinion tree and its linearization (Root, (Quad, (Design, surface), (Positive, smooth)), (Quad, (Software, apps), (Negative, hard))); (c) the aspect sentiment quads. Joint pre-training combines syntax parsing, semantic parsing, and opinion tree generation.]

As shown in Figure 2, we first propose a tree generation model that generates an opinion tree with a constrained decoding algorithm, which guides the generation process using opinion schemas. We then explore joint learning of several pre-training tasks to integrate syntax and semantic features for forming the opinion tree structure. Afterward, the sentiment quadruplets can easily be recovered from the opinion tree. In this section, we first introduce how to reformulate ABSA as opinion tree generation via structure linearization, and then describe the tree generation model and the constrained decoding algorithm. The joint pre-trained model is described in the next section.

3.1 Opinion Tree Construction and Linearization

As shown in Figure 2, the opinion tree models a sentence as a rooted directed acyclic graph, highlighting its main elements (e.g., aspect terms, opinion words) and semantic relations. Given a review sentence, we convert the aspect sentiment quads (Figure 2c) into an opinion tree (Figure 2b) as follows. We first create a quad node to denote each aspect sentiment quad, and all quad nodes are connected to a virtual root node. The aspect node and opinion node are connected to the corresponding quad node. The aspect category is connected to an aspect node, and the sentiment polarity is connected to an opinion node. Finally, the text spans (i.e., the aspect term and opinion word) from the review sentence are linked to the corresponding nodes (i.e., aspect category and sentiment polarity) as leaves.

From Figure 2b, we can see that the connections between an aspect term and its aspect category can be used to identify the category of the aspect term, and the connection between an opinion word and a polarity is very helpful for predicting the polarity. In addition, the aspect and opinion nodes are used to separate the aspect term from the opinion word, and the quad node is used to denote each aspect sentiment quad with its corresponding elements.

Since it is much easier to generate a sequence than a tree [Xu et al., 2020a; Lu et al., 2021], we linearize the opinion tree into the target sequence. Given the converted opinion tree, we linearize it into a token sequence (Figure 2b) via depth-first traversal, where "(" and ")" are structure indicators used to represent the structure of the linear expression [Vinyals et al., 2015; van Noord and Bos, 2017]. The traversal order within the same depth follows the order in which the text spans appear in the text, e.g., first "surface" and then "smooth" in Figure 2b.
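To make the conversion and linearization concrete, here is a minimal sketch; the nested-list tree encoding and the helper names build_opinion_tree and linearize are our illustrative choices, not the paper's implementation:

```python
# A minimal sketch of opinion tree construction and linearization,
# following the description above. Per the linearized form in Figure 2,
# the intermediate Aspect/Opinion node labels are folded into the
# (category, aspect term) and (polarity, opinion word) pairs.

def build_opinion_tree(quads):
    """Convert aspect sentiment quads into a nested-list opinion tree.

    Each quad is (aspect_term, category, opinion_term, polarity).
    """
    tree = ["Root"]
    for aspect_term, category, opinion_term, polarity in quads:
        tree.append(["Quad",
                     [category, aspect_term],    # aspect branch
                     [polarity, opinion_term]])  # opinion branch
    return tree

def linearize(node):
    """Depth-first traversal with '(' and ')' as structure indicators."""
    if isinstance(node, str):
        return node
    return "(" + ", ".join(linearize(child) for child in node) + ")"

quads = [("surface", "Design", "smooth", "Positive"),
         ("apps", "Software", "hard", "Negative")]
print(linearize(build_opinion_tree(quads)))
# -> (Root, (Quad, (Design, surface), (Positive, smooth)),
#           (Quad, (Software, apps), (Negative, hard)))
```

Recovering the quads from a generated sequence is the inverse operation: match the balanced parentheses and read off the (category, term) and (polarity, word) pairs under each Quad node.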
3.2 Tree Generation Model

We then employ a tree generation model to generate the linearized opinion tree from the review sentence. In this study, we employ a sequence-to-sequence model that generates the opinion tree via a transformer-based encoder-decoder architecture [Vaswani et al., 2017]. Given the token sequence $x = x_1, \ldots, x_{|x|}$ as input, the sequence-to-sequence model outputs the linearized representation $y = y_1, \ldots, y_{|y|}$. To this end, the model first computes the hidden vector representation $H = h_1, \ldots, h_{|x|}$ of the input via a multi-layer transformer encoder:

$$H = \mathrm{Encoder}(x_1, \ldots, x_{|x|}) \quad (1)$$

where each layer of $\mathrm{Encoder}$ is a transformer block with the multi-head attention mechanism.

After the input token sequence is encoded, the decoder predicts the output structure token by token, conditioned on the hidden vectors of the input tokens. At the $i$-th step of generation, the self-attention decoder predicts the $i$-th token $y_i$ of the linearized form and the decoder state $h^d_i$ as:

$$y_i, h^d_i = \mathrm{Decoder}([H; h^d_1, \ldots, h^d_{i-1}], y_{i-1}) \quad (2)$$

where each layer of $\mathrm{Decoder}$ is a transformer block that contains self-attention over the decoder states and cross-attention over the encoder states $H$. The generated output sequence starts with the start token $\langle bos \rangle$ and ends with the end token $\langle eos \rangle$. The conditional probability of the whole output sequence $p(y \mid x)$ is progressively combined from the probability of each step $p(y_i \mid y_{<i}, x)$:

$$p(y \mid x) = \prod_{i=1}^{|y|} p(y_i \mid y_{<i}, x) \quad (3)$$

where $y_{<i} = y_1, \ldots, y_{i-1}$.
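As a concrete illustration of this decoding process, the following sketch runs a review sentence through an off-the-shelf encoder-decoder from the transformers library. The t5-base checkpoint and the bare input format are our assumptions; without fine-tuning on linearized opinion trees, the output will of course not be a valid tree.

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

# "t5-base" is a stand-in backbone (our assumption); the paper's exact
# pre-trained model and input format may differ.
tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

sentence = "The surface is smooth, but the apps are hard to use."
inputs = tokenizer(sentence, return_tensors="pt")

# generate() realizes the autoregressive factorization of Eq. (3): the
# decoder starts from the start token, samples or greedily picks one
# token per step from p(y_i | y_<i, x), and stops at the end token.
output_ids = model.generate(**inputs, max_length=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

One way to realize the schema-guided constrained decoding described above within this API is generate()'s prefix_allowed_tokens_fn argument, which restricts the candidate tokens at each decoding step to those permitted by the opinion schema; the paper's own algorithm may be implemented differently.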