Weakly-supervised Deep Embedding for Product Review Sentiment Analysis
Product reviews are valuable to prospective buyers, helping them make purchase decisions. To this end, various opinion mining techniques have been proposed, where judging a review sentence's orientation (e.g., positive or negative) is one of the key challenges. Recently, deep learning has emerged as an effective means of solving sentiment classification problems. A neural network intrinsically learns a useful representation automatically, without human effort. However, the success of deep learning highly relies on the availability of large-scale training data. We propose a novel deep learning framework for product review sentiment classification which employs prevalently available ratings as weak supervision signals. The framework consists of two steps: (1) learning a high-level representation (an embedding space) which captures the general sentiment distribution of sentences through rating information; (2) adding a classification layer on top of the embedding layer and using labeled sentences for supervised fine-tuning. We explore two kinds of low-level network structures for modeling review sentences, namely convolutional feature extractors and long short-term memory. To evaluate the proposed framework, we construct a dataset containing 1.1M weakly labeled review sentences and 11,754 labeled review sentences from Amazon. Experimental results show the efficacy of the proposed framework and its superiority over baselines.
Sentiment analysis is a long-standing research topic; readers can refer to recent surveys for an overview. Sentiment classification is one of the key tasks in sentiment analysis and can be categorized as document-level, sentence-level, and aspect-level. Traditional machine learning methods for sentiment classification can generally be applied to all three levels. Our work falls into the last category since we consider aspect information. In what follows we review two subtopics closely related to our work. In recent years, deep learning has emerged as an effective means of solving sentiment classification problems. A deep neural network intrinsically learns a high-level representation of the data, thus avoiding laborious work such as feature engineering. A second advantage is that deep models have exponentially stronger expressive power than shallow models. However, the success of deep learning heavily relies on the availability of large-scale training data. Fortunately, most merchant/review websites allow customers to summarize their opinions with an overall rating score (typically on a 5-star scale). Ratings reflect the overall sentiment of customer reviews and have already been exploited for sentiment analysis.
To reduce the impact of sentences whose orientation is inconsistent with their rating (hereafter called wrongly labeled sentences), we propose to penalize the relative distances among sentences in the embedding space through a ranking loss. In the second step, a classification layer is added on top of the embedding layer, and we use labeled sentences to fine-tune the deep network. The framework is dubbed Weakly-supervised Deep Embedding (WDE). Regarding network structure, two popular schemes are adopted to learn to extract fixed-length feature vectors from review sentences, namely convolutional feature extractors and Long Short-Term Memory (LSTM). With a slight abuse of terminology, we will refer to the former model as Convolutional Neural Network based WDE (WDE-CNN); the latter is called LSTM based WDE (WDE-LSTM). We then compute high-level features (the embedding) by synthesizing the extracted features as well as the contextual aspect information (e.g., the screen of a cell phone) of the product. The aspect input represents prior knowledge regarding the sentence's orientation.
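Penalizing relative distances can be sketched as a triplet-style hinge loss: a sentence should lie closer, in the embedding space, to a sentence with the same weak (rating-derived) label than to one with a different label, by at least a margin. The exact formulation is not given in this excerpt, so the Euclidean distance and margin below are illustrative assumptions, not the paper's definitive loss:

```python
import numpy as np

def ranking_loss(anchor, same_label, diff_label, margin=1.0):
    """Hinge-style ranking loss on one (anchor, same, different) triplet.

    Incurs a penalty whenever the anchor sentence's embedding is not at
    least `margin` closer to a same-weak-label sentence than to a
    different-weak-label one. Wrongly labeled sentences only shift
    relative distances, which softens their impact compared to a hard
    per-sentence classification loss.
    """
    d_same = np.linalg.norm(anchor - same_label)   # distance to agreeing sentence
    d_diff = np.linalg.norm(anchor - diff_label)   # distance to disagreeing sentence
    return max(0.0, margin + d_same - d_diff)
```

In training, such triplet terms would be summed over sampled sentence triplets and backpropagated through the CNN or LSTM feature extractor.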
In this work we proposed a novel deep learning framework named Weakly-supervised Deep Embedding for review sentence sentiment classification. WDE trains deep neural networks by exploiting the rating information of reviews, which is prevalently available on many merchant/review websites. Training is a two-step procedure: first we learn an embedding space which tries to capture the sentiment distribution of sentences by penalizing relative distances among sentences according to weak labels inferred from ratings; then a softmax classifier is added on top of the embedding layer and we fine-tune the network with labeled data. Experiments on reviews collected from Amazon.com show that WDE is effective and outperforms baseline methods. Two specific instantiations of the framework, WDE-CNN and WDE-LSTM, are proposed. Compared to WDE-LSTM, WDE-CNN has fewer model parameters, and its computation is more easily parallelized on GPUs. Nevertheless, WDE-CNN cannot handle long-term dependencies in sentences well. WDE-LSTM is more capable of modeling such long-term dependencies, but it is less efficient than WDE-CNN and needs more training data. For future work, we plan to investigate how to combine different methods to achieve better prediction performance. We will also try to apply WDE to other problems involving weak labels.
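The second step of the procedure above can be sketched as ordinary cross-entropy training of a softmax layer placed on top of the pre-trained sentence embeddings. This is a minimal numpy sketch; the function names, learning rate, and the simplification of updating only the classifier (rather than backpropagating into the embedding layers, as full fine-tuning would) are illustrative assumptions:

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the class dimension.
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def finetune_step(W, b, emb, labels, lr=0.5):
    """One gradient step on the softmax classifier (W, b) that sits on top
    of pre-trained sentence embeddings `emb`, using labeled sentences."""
    n = emb.shape[0]
    probs = softmax(emb @ W + b)                       # (n, n_classes)
    grad_logits = (probs - np.eye(W.shape[1])[labels]) / n
    W -= lr * emb.T @ grad_logits                      # cross-entropy gradients
    b -= lr * grad_logits.sum(axis=0)
    return W, b

def cross_entropy(W, b, emb, labels):
    """Mean negative log-likelihood of the true labels."""
    probs = softmax(emb @ W + b)
    return -np.mean(np.log(probs[np.arange(len(labels)), labels]))
```

Because the embedding space already reflects the sentiment distribution learned from ratings, this supervised stage can work with comparatively few labeled sentences.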
Y. Bengio. Learning deep architectures for AI. Foundations and Trends in Machine Learning, 2(1):1–127, 2009.
 Y. Bengio, A. Courville, and P. Vincent. Representation learning: A review and new perspectives. IEEE TPAMI, 35(8):1798–1828, 2013.
C. M. Bishop. Pattern recognition and machine learning. Springer, 2006.
 L. Chen, J. Martineau, D. Cheng, and A. Sheth. Clustering for simultaneous extraction of aspects and features from reviews. In NAACL-HLT, pages 789–799, 2016.
 R. Collobert, J. Weston, L. Bottou, M. Karlen, K. Kavukcuoglu, and P. Kuksa. Natural language processing (almost) from scratch. JMLR, 12:2493–2537, 2011.
 K. Dave, S. Lawrence, and D. M. Pennock. Mining the peanut gallery: Opinion extraction and semantic classification of product reviews. In WWW, pages 519–528, 2003.
S. Deerwester, S. T. Dumais, G. W. Furnas, T. K. Landauer, and R. Harshman. Indexing by latent semantic analysis. Journal of the American Society for Information Science, 41(6):391, 1990.
 X. Ding, B. Liu, and P. S. Yu. A holistic lexicon-based approach to opinion mining. In WSDM, pages 231–240, 2008.
 L. Dong, F. Wei, C. Tan, D. Tang, M. Zhou, and K. Xu. Adaptive recursive neural network for target-dependent twitter sentiment classification. In ACL, pages 49–54, 2014.
 J. Duchi, E. Hazan, and Y. Singer. Adaptive subgradient methods for online learning and stochastic optimization. JMLR, 12:2121–2159, 2011.
 R.-E. Fan, K.-W. Chang, C.-J. Hsieh, X.-R. Wang, and C.-J. Lin. Liblinear: A library for large linear classification. JMLR, 9:1871–1874, 2008.
 R. Feldman. Techniques and applications for sentiment analysis. Communications of the ACM, 56(4):82–89, 2013.
 J. L. Fleiss. Measuring nominal scale agreement among many raters. Psychological bulletin, 76(5):378, 1971.
X. Glorot, A. Bordes, and Y. Bengio. Domain adaptation for large-scale sentiment classification: A deep learning approach. In ICML, pages 513–520, 2011.
 A. Graves and J. Schmidhuber. Framewise phoneme classification with bidirectional lstm and other neural network architectures. Neural Networks, 18(5):602–610, 2005.