Learning sentence classification: using deep-learning methods to classify sentence datasets.
Problem
Sentence classification assigns a given sentence to one of several predefined categories. It includes tasks such as sentiment analysis and question classification. Sentiment analysis, also called opinion extraction, opinion mining, sentiment mining, or subjectivity analysis, is the process of analyzing, summarizing, and drawing inferences from subjective text that carries sentiment. For example, from review text one can analyze a user's sentiment toward attributes of a digital camera such as zoom, price, size, weight, flash, and ease of use.
Applications
Understanding positive and negative opinions about movies, products, tweets, and so on, in order to improve products and services, identify competitors' strengths and weaknesses, or predict stock movements.
Datasets
Column key: c = number of classes, l = average sentence length, N = dataset size, |V| = vocabulary size, |V_pre| = number of words also present in the pre-trained word vectors, Test = test-set size (CV means there is no standard train/test split, so 10-fold cross-validation is used).

Data | c | l | N | |V| | |V_pre| | Test |
---|---|---|---|---|---|---|
MR | 2 | 20 | 10662 | 18765 | 16448 | CV |
SST-1 | 5 | 18 | 11855 | 17836 | 16262 | 2210 |
SST-2 | 2 | 19 | 9613 | 16185 | 14838 | 1821 |
Subj | 2 | 23 | 10000 | 21323 | 17913 | CV |
TREC | 6 | 10 | 5952 | 9592 | 9125 | 500 |
CR | 2 | 19 | 3775 | 5340 | 5046 | CV |
MPQA | 2 | 3 | 10606 | 6246 | 6083 | CV |
- MR: Movie reviews, with one sentence per review. 1
- SST-1: Stanford Sentiment Treebank, an extension of MR with train/dev/test splits and five fine-grained labels (very positive, positive, neutral, negative, very negative).
- SST-2: Same as SST-1, but with neutral reviews removed and binary labels.
- Subj: Subjectivity dataset; the task is to classify a sentence as subjective or objective. 3
- TREC: TREC question dataset; the task is to classify a question into one of six types (about a person, a location, numeric information, etc.). 4
- CR: Customer reviews of various products; the task is to predict positive/negative reviews. 5
- MPQA: Opinion-polarity detection subtask of the MPQA dataset. 6
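For the datasets marked CV, accuracy is averaged over 10 cross-validation folds rather than measured on a fixed test split. A minimal sketch of producing such folds (plain Python over indices; the helper name is illustrative, and real experiments would also shuffle and possibly stratify):

```python
def k_fold_splits(n_examples, k=10):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation.

    The first `n_examples % k` folds get one extra example so that
    every example lands in exactly one test fold.
    """
    indices = list(range(n_examples))
    fold_size, remainder = divmod(n_examples, k)
    start = 0
    for i in range(k):
        size = fold_size + (1 if i < remainder else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size

# MR has 10662 sentences and no held-out test set,
# so results are averaged over 10 folds like these.
folds = list(k_fold_splits(10662, k=10))
```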
Methods
The task is usually broken into several subtasks:
Tokenization
Split the sentence into words; this step may also involve removing stopwords, tagging parts of speech, or converting words into word vectors.
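As a minimal sketch of this step, a regex word tokenizer with a toy stopword list (the `STOPWORDS` set and `preprocess` helper are illustrative; real pipelines use a proper tokenizer and a full stopword list, e.g. from NLTK):

```python
import re

# A tiny illustrative stopword list; real pipelines would use a
# much fuller one.
STOPWORDS = {"the", "a", "an", "is", "to", "of"}

def tokenize(sentence):
    """Lowercase a sentence and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", sentence.lower())

def preprocess(sentence, remove_stopwords=True):
    """Tokenize and optionally drop stopwords."""
    tokens = tokenize(sentence)
    if remove_stopwords:
        tokens = [t for t in tokens if t not in STOPWORDS]
    return tokens

print(preprocess("The zoom of the camera is easy to use."))
# → ['zoom', 'camera', 'easy', 'use']
```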
Feature extraction
Often the tokens are not fed to the classifier directly; instead, features are extracted from them to make classification easier.
Common features: TF-IDF, LDA, LSI
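TF-IDF, the most common of these, weights a term by its frequency within a document while down-weighting terms that occur in many documents. A self-contained sketch using the plain log(N/df) variant of IDF (libraries such as scikit-learn use smoothed variants):

```python
import math
from collections import Counter

def tf_idf(docs):
    """Compute TF-IDF vectors for a list of tokenized documents.

    tf(t, d) = count of t in d / length of d
    idf(t)   = log(N / df(t)), where df(t) = number of docs containing t
    """
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))          # count each term once per document
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({
            t: (c / len(doc)) * math.log(n / df[t])
            for t, c in counts.items()
        })
    return vectors

docs = [["good", "movie"], ["bad", "movie"], ["good", "plot"]]
vecs = tf_idf(docs)
# "movie" appears in 2 of 3 documents, so it gets a lower weight
# than "bad", which appears in only 1 of 3.
```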
Building a classifier
Feed the features or word vectors into a model that assigns the sentence a class.
Naive Bayes
- NBSVM: Naive Bayes SVM
- MNB: Multinomial Naive Bayes 7
- combine-skip
- combine-skip + NB 8
Model | MR | SST-1 | SST-2 | Subj | TREC | CR | MPQA |
---|---|---|---|---|---|---|---|
NBSVM | 79.4 | - | - | 93.2 | - | 81.8 | 86.3 |
MNB | 79.0 | - | - | 93.6 | - | 80.0 | 86.3 |
combine-skip | 76.5 | - | - | 93.6 | 92.2 | 80.1 | 87.1 |
combine-skip+NB | 80.4 | - | - | 93.6 | - | 81.3 | 87.5 |
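The MNB baseline above can be sketched from scratch. This toy implementation uses add-one (Laplace) smoothing; the class and the example data are illustrative, not the paper's exact setup:

```python
import math
from collections import Counter

class MultinomialNB:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, docs, labels):
        self.classes = set(labels)
        self.priors = {c: math.log(labels.count(c) / len(labels))
                       for c in self.classes}
        self.counts = {c: Counter() for c in self.classes}
        self.vocab = set()
        for doc, y in zip(docs, labels):
            self.counts[y].update(doc)
            self.vocab.update(doc)
        self.totals = {c: sum(self.counts[c].values()) for c in self.classes}
        return self

    def predict(self, doc):
        def log_prob(c):
            denom = self.totals[c] + len(self.vocab)   # smoothed denominator
            return self.priors[c] + sum(
                math.log((self.counts[c][t] + 1) / denom) for t in doc)
        return max(self.classes, key=log_prob)

docs = [["good", "fun"], ["great", "fun"], ["bad", "boring"], ["awful", "bad"]]
labels = ["pos", "pos", "neg", "neg"]
clf = MultinomialNB().fit(docs, labels)
print(clf.predict(["good", "great"]))  # → pos
```

NBSVM interpolates this model with an SVM over the same log-count-ratio features, which is why the two rows above track each other closely.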
RNN
- RCNN: Recurrent Convolutional Neural Networks 9
- S-LSTM: Long Short-Term Memory Over Recursive Structures 10
- LSTM: Long Short-Term Memory
- BLSTM: Bidirectional Long Short-Term Memory
- Tree-LSTM: Tree-structured Long Short-Term Memory 11
- LSTMN: Long Short-Term Memory-Network 12
- Multi-Task: Recurrent Neural Network for Text Classification with Multi-Task Learning 13
- BLSTM-Att: Bidirectional Long Short-Term Memory, attention-based model
- BLSTM-2DPooling: Bidirectional Long Short-Term Memory Networks with Two-Dimensional Max Pooling
- BLSTM-2DCNN: Bidirectional Long Short-Term Memory Networks with 2D convolution 14
Model | MR | SST-1 | SST-2 | Subj | TREC | CR | MPQA |
---|---|---|---|---|---|---|---|
RCNN | - | 47.21 | - | - | - | - | - |
S-LSTM | - | - | 81.9 | - | - | - | - |
LSTM | - | 46.4 | 84.9 | - | - | - | - |
BLSTM | - | 49.1 | 87.5 | - | - | - | - |
Tree-LSTM | - | 51.0 | 88.0 | - | - | - | - |
LSTMN | - | 49.3 | 87.3 | - | - | - | - |
Multi-Task | - | 49.6 | 87.9 | 94.1 | - | - | - |
BLSTM | 80.0 | 49.1 | 87.6 | 92.1 | 93.0 | - | - |
BLSTM-Att | 81.0 | 49.8 | 88.2 | 93.5 | 93.8 | - | - |
BLSTM-2DPooling | 81.5 | 50.5 | 88.3 | 93.7 | 94.8 | - | - |
BLSTM-2DCNN | 82.3 | 52.4 | 89.5 | 94.0 | 96.1 | - | - |
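The 2D-pooling idea behind BLSTM-2DPooling treats the sequence of hidden states as a matrix (time steps by hidden features) and max-pools over both dimensions, not only over time. A plain-Python sketch of that pooling step (shapes and values are toy examples):

```python
def max_pool_2d(matrix, pool_h, pool_w):
    """2D max pooling with non-overlapping pool_h x pool_w windows.

    `matrix` is a list of rows, e.g. one BLSTM hidden state per row.
    Edge windows that are not full are still pooled over the cells
    that exist.
    """
    rows, cols = len(matrix), len(matrix[0])
    pooled = []
    for i in range(0, rows, pool_h):
        row = []
        for j in range(0, cols, pool_w):
            window = [matrix[r][c]
                      for r in range(i, min(i + pool_h, rows))
                      for c in range(j, min(j + pool_w, cols))]
            row.append(max(window))
        pooled.append(row)
    return pooled

# 4 time steps x 4 features, pooled with 2x2 windows -> a 2x2 map.
states = [[0.1, 0.5, 0.2, 0.0],
          [0.7, 0.3, 0.4, 0.6],
          [0.2, 0.9, 0.1, 0.3],
          [0.0, 0.4, 0.8, 0.5]]
print(max_pool_2d(states, 2, 2))  # → [[0.7, 0.6], [0.9, 0.8]]
```

Pooling over the feature dimension as well lets the model keep some structure of the hidden states instead of collapsing each time step to a single max, which is the gain BLSTM-2DPooling shows over plain BLSTM in the table above.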
CNN
- DCNN: Dynamic Convolutional Neural Network 15
- CNN-non-static: Convolutional Neural Networks, the pretrained vectors are fine-tuned for each task
- CNN-multichannel: Convolutional Neural Networks with two sets of word vectors 16
- TBCNN: Tree-based Convolutional Neural Network 17
- Molding-CNN: Molding Convolutional Neural Networks 18
- CNN-Ana: Non-static GloVe+word2vec CNN 19
- MVCNN: Multichannel Variable-Size Convolution 20
- DSCNN: Dependency Sensitive Convolutional Neural Networks 21
Model | MR | SST-1 | SST-2 | Subj | TREC | CR | MPQA |
---|---|---|---|---|---|---|---|
DCNN | - | 48.5 | 86.8 | - | 93.0 | - | - |
CNN-non-static | 81.5 | 48.0 | 87.2 | 93.4 | 93.6 | 84.3 | 89.5 |
CNN-multichannel | 81.1 | 47.4 | 88.1 | 93.2 | 92.2 | 85.0 | 89.4 |
TBCNN | - | 51.4 | 87.9 | - | 96.0 | - | - |
Molding-CNN | - | 51.2 | 88.6 | - | - | - | - |
CNN-Ana | 81.02 | 45.98 | 85.45 | 93.66 | 91.37 | 84.65 | 89.55 |
MVCNN | - | 49.6 | 89.4 | - | - | - | - |
DSCNN | 81.5 | 49.7 | 89.1 | 93.2 | 95.4 | - | - |
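The core operation shared by these CNN models is a filter slid over windows of consecutive word vectors, followed by max-over-time pooling. A plain-Python sketch for a single filter (the embeddings and filter values are toy examples; real models learn many filters of several window sizes):

```python
def conv_max_over_time(embeddings, filt, bias=0.0):
    """Apply one convolutional filter over a sentence and max-pool.

    `embeddings`: list of n word vectors of dimension d.
    `filt`: a filter of window size h, i.e. h weight vectors of dim d.
    Returns the single max-over-time feature for this filter.
    """
    n, h = len(embeddings), len(filt)
    feature_map = []
    for i in range(n - h + 1):
        s = bias
        for j in range(h):  # dot product of filter row j with word i+j
            s += sum(w * x for w, x in zip(filt[j], embeddings[i + j]))
        feature_map.append(max(0.0, s))  # ReLU nonlinearity
    return max(feature_map)

# Toy sentence: 4 words with 2-dimensional embeddings, window size 2.
sent = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
filt = [[1.0, 0.0], [0.0, 1.0]]
print(conv_max_over_time(sent, filt))  # → 2.0
```

Max-over-time pooling keeps only the strongest match per filter, which makes the sentence representation length-independent; the filters' concatenated maxima feed a softmax classifier.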
Others
- RAE: Recursive Autoencoders with pre-trained word vectors from Wikipedia 22
- AdaSent: self-adaptive hierarchical sentence model 23
- RNTN: Recursive Neural Tensor Network 24
- DRNN: Deep Recursive Neural Networks 25
Model | MR | SST-1 | SST-2 | Subj | TREC | CR | MPQA |
---|---|---|---|---|---|---|---|
RAE | 77.7 | 43.2 | 82.4 | - | - | - | 86.4 |
AdaSent | 83.1 | - | - | 95.5 | 92.4 | 86.3 | 93.3 |
RNTN | - | 45.7 | 85.4 | - | - | - | - |
DRNN | - | 49.8 | 86.6 | - | - | - | - |
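The recursive models above (RAE, RNTN, DRNN) compose word vectors bottom-up along a parse tree. A minimal sketch of the basic composition p = tanh(W [c1; c2] + b) (the weights and tree are toy examples; RNTN additionally adds a tensor term to the composition):

```python
import math

def compose(left, right, W, b):
    """Compose two child vectors: p = tanh(W [left; right] + b)."""
    concat = left + right
    return [math.tanh(sum(w * x for w, x in zip(row, concat)) + bi)
            for row, bi in zip(W, b)]

def encode(tree, embeddings, W, b):
    """Recursively encode a binary parse tree.

    A tree is either a word (str) or a (left, right) pair; leaves are
    looked up in `embeddings`, internal nodes are composed bottom-up.
    The topmost vector is the sentence representation fed to the
    classifier.
    """
    if isinstance(tree, str):
        return embeddings[tree]
    left, right = tree
    return compose(encode(left, embeddings, W, b),
                   encode(right, embeddings, W, b), W, b)

emb = {"not": [1.0, 0.0], "good": [0.0, 1.0]}
W = [[0.5, 0.0, 0.5, 0.0],   # toy 2x4 composition weights
     [0.0, 0.5, 0.0, 0.5]]
b = [0.0, 0.0]
vec = encode(("not", "good"), emb, W, b)
```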
References
- (ACL 2005) Seeing Stars: Exploiting Class Relationships For Sentiment Categorization With Respect To Rating Scales. https://www.cs.cornell.edu/people/pabo/movie-review-data/
- (EMNLP 2013) Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank. https://nlp.stanford.edu/sentiment/
- (ACL 2004) A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts. http://www.cs.cornell.edu/people/pabo/movie-review-data
- (Language Resources and Evaluation 2005) Annotating Expressions of Opinions and Emotions in Language. http://mpqa.cs.pitt.edu/
- (ACL 2012) Baselines and Bigrams: Simple, Good Sentiment and Topic Classification
- (NIPS 2015) Skip-Thought Vectors
- (AAAI 2015) Recurrent Convolutional Neural Networks for Text Classification
- (ICML 2015) Long Short-Term Memory Over Recursive Structures
- (ACL 2015) Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks
- (EMNLP 2016) Long Short-Term Memory-Networks for Machine Reading
- (IJCAI 2016) Recurrent Neural Network for Text Classification with Multi-Task Learning
- (COLING 2016) Text Classification Improved by Integrating Bidirectional LSTM with Two-dimensional Max Pooling
- (ACL 2014) A Convolutional Neural Network for Modelling Sentences
- (EMNLP 2014) Convolutional Neural Networks for Sentence Classification
- (EMNLP 2015) Discriminative Neural Sentence Modeling by Tree-Based Convolution
- (EMNLP 2015) Molding CNNs for Text: Non-linear, Non-consecutive Convolutions
- (IJCNLP 2017) A Sensitivity Analysis of (and Practitioners' Guide to) Convolutional Neural Networks for Sentence Classification
- (CoNLL 2015) Multichannel Variable-Size Convolution for Sentence Classification
- (NAACL 2016) Dependency Sensitive Convolutional Neural Networks for Modeling Sentences and Documents
- (EMNLP 2011) Semi-Supervised Recursive Autoencoders for Predicting Sentiment Distributions
- (IJCAI 2015) Self-Adaptive Hierarchical Sentence Model
- (EMNLP 2013) Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank
- (NIPS 2014) Deep Recursive Neural Networks for Compositionality in Language