Reviewed one paper on text classification. The paper compares classification accuracy when Word2Vec, Doc2Vec, and bag-of-words representations are used as features. The results show that the Word2Vec and Doc2Vec models offer higher accuracy than bag of words.
Document -> Text processing -> Word2Vec/Doc2Vec feature generation -> (IDF/TF-IDF weight adjustment) -> Train classifier -> Evaluate performance
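The feature generation and weight adjustment stages of this pipeline can be sketched as an IDF-weighted average of word vectors. This is only an illustration under stated assumptions: the smoothed IDF formula, the function names `compute_idf` and `weighted_doc_vector`, and the 2-dimensional toy embeddings are all hypothetical, and the paper's exact weighting scheme may differ. In practice the embeddings would come from a trained Word2Vec model.

```python
import math
from collections import Counter


def compute_idf(tokenized_docs):
    """Smoothed inverse document frequency: log((1 + N) / (1 + df)) + 1."""
    n_docs = len(tokenized_docs)
    doc_freq = Counter()
    for tokens in tokenized_docs:
        doc_freq.update(set(tokens))  # count each word once per document
    return {w: math.log((1 + n_docs) / (1 + df)) + 1.0 for w, df in doc_freq.items()}


def weighted_doc_vector(tokens, embeddings, idf, dim):
    """IDF-weighted average of the word vectors of one document's tokens."""
    total = [0.0] * dim
    weight_sum = 0.0
    for tok in tokens:
        if tok not in embeddings or tok not in idf:
            continue  # skip words without a vector or an IDF weight
        w = idf[tok]
        for i, x in enumerate(embeddings[tok]):
            total[i] += w * x
        weight_sum += w
    if weight_sum == 0.0:
        return total  # all-zero vector when no token is known
    return [x / weight_sum for x in total]


# Toy corpus (hypothetical). The embeddings stand in for a trained Word2Vec model.
docs = [
    ["cheap", "flights", "to", "paris"],
    ["cheap", "hotel", "in", "paris"],
    ["python", "code", "review"],
]
toy_embeddings = {
    "cheap": [1.0, 0.0],
    "flights": [0.0, 1.0],
    "paris": [1.0, 1.0],
    "python": [-1.0, 0.0],
}

idf = compute_idf(docs)
features = [weighted_doc_vector(d, toy_embeddings, idf, dim=2) for d in docs]
```

Each row of `features` is a fixed-length document vector that could be passed to a classifier such as logistic regression. Because a repeated token contributes once per occurrence, the average is effectively weighted by term frequency as well as IDF.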
- Introduced neural-network-based Word2Vec and Doc2Vec models
- Coupled word vectors with weighting strategies such as IDF and TF-IDF
- The training set and testing set used the same dataset, so the reported accuracy is not persuasive
- Only logistic regression is used; more classifiers could be tried
- Only bag of words is used as the baseline; more models could be tried, e.g. LDA
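If the concern in the first weakness is that the evaluation documents were also seen during training, a held-out split is one possible remedy. The sketch below is a minimal illustration, not the paper's protocol: the helper name `holdout_split`, the split fraction, and the toy documents are assumptions.

```python
import random


def holdout_split(items, labels, test_fraction=0.2, seed=0):
    """Shuffle indices with a fixed seed and split into disjoint train and test parts."""
    if len(items) != len(labels):
        raise ValueError("items and labels must have the same length")
    indices = list(range(len(items)))
    random.Random(seed).shuffle(indices)  # seeded, so the split is reproducible
    n_test = max(1, int(round(len(indices) * test_fraction)))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    train = [(items[i], labels[i]) for i in train_idx]
    test = [(items[i], labels[i]) for i in test_idx]
    return train, test


# Toy data (hypothetical): ten placeholder documents with alternating labels.
documents = [f"doc{i}" for i in range(10)]
labels = [i % 2 for i in range(10)]
train, test = holdout_split(documents, labels, test_fraction=0.3, seed=42)
```

Reusing the same split for every classifier and every baseline would make the comparisons suggested above depend on the model rather than on how the data happened to be partitioned.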
Jiang, Suqi, et al. “Integrating rich document representations for text classification.” 2016 IEEE Systems and Information Engineering Design Symposium (SIEDS). IEEE, 2016.