NLP_Book-Classification/Clustering

Background of this project: Apply different text transformation methods (BOW, TF-IDF, Doc2Vec) and algorithms to classify and cluster five books: chesterton-brown, austen-emma, edgeworth-parents, milton-paradise, and bible-kjv.

Data preprocessing:

- Convert all letters to lower case
- Remove punctuation
- Tokenize the documents and remove stopwords (nltk library)
- Lemmatize the tokens
- Transform the text into vectors
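The preprocessing steps above can be sketched in plain Python. This is a minimal illustration, not the notebooks' actual code: the `STOPWORDS` set is a tiny made-up stand-in for nltk's `nltk.corpus.stopwords.words("english")`, the `lemmatize` parameter is a hypothetical hook where something like `nltk.stem.WordNetLemmatizer().lemmatize` could be passed, and the two sample sentences are invented.

```python
import string
from collections import Counter

# Tiny illustrative stopword set (assumption); the project uses nltk's list.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "was", "it"}


def preprocess(text, lemmatize=lambda token: token):
    """Lowercase, strip punctuation, tokenize, drop stopwords, lemmatize."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    tokens = text.split()  # simple whitespace tokenization
    return [lemmatize(tok) for tok in tokens if tok not in STOPWORDS]


def bag_of_words(token_lists):
    """Turn tokenized documents into count vectors over a shared sorted vocabulary."""
    vocab = sorted({tok for tokens in token_lists for tok in tokens})
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


# Invented sample sentences, used only to exercise the functions.
docs = [
    "The serpent was in the garden.",
    "Emma walked to the garden, and the garden was quiet.",
]
tokens = [preprocess(doc) for doc in docs]
vocab, vectors = bag_of_words(tokens)
```

With the default identity lemmatizer, `tokens[0]` is `["serpent", "garden"]`, and each row of `vectors` counts how often each vocabulary term appears in the corresponding document. TF-IDF and Doc2Vec would replace the `bag_of_words` step with a different vectorizer.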

Classification:

- Support Vector Machines (SVM)
- K-Nearest Neighbors (KNN)
- Decision Tree
- Random Forest
- Logistic Regression
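As a rough sketch of how the five classifiers listed above might be compared, the snippet below fits each one on TF-IDF features with scikit-learn. The use of scikit-learn, the hyperparameters, and the short passages are all assumptions for illustration; the passages are made up and are not taken from the books.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Placeholder passages and labels (invented, not real book partitions).
texts = [
    "the lady visited her neighbour and spoke of marriage",
    "her friend praised the match and the fine estate",
    "and the lord spake unto the people of the land",
    "the lord commanded the children to keep his covenant",
    "the fallen angel rose from the burning lake of fire",
    "the angel rebelled in heaven and fell into darkness",
]
labels = [
    "austen-emma", "austen-emma",
    "bible-kjv", "bible-kjv",
    "milton-paradise", "milton-paradise",
]

# Hyperparameters here are illustrative choices, not tuned values.
models = {
    "SVM": SVC(kernel="linear"),
    "KNN": KNeighborsClassifier(n_neighbors=3),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}

fitted = {}
for name, model in models.items():
    # Each pipeline vectorizes raw text with TF-IDF, then fits the classifier.
    clf = make_pipeline(TfidfVectorizer(), model)
    clf.fit(texts, labels)
    fitted[name] = clf

new_passage = ["the lord spake to the children of the land"]
predictions = {name: clf.predict(new_passage)[0] for name, clf in fitted.items()}
```

Swapping `TfidfVectorizer` for `CountVectorizer` would give the BOW variant. On real data, a held-out test split or cross-validation would be needed to compare the models fairly.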

Clustering:

- K-means
- Hierarchical
- Expectation-Maximization (EM)
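The three clustering methods above could be run side by side roughly as follows. This sketch assumes scikit-learn and, to stay lightweight, swaps in TF-IDF features reduced with truncated SVD in place of the Doc2Vec embeddings named in the project; EM is represented by a Gaussian mixture model. The passages and the choice of two clusters are invented for illustration.

```python
from sklearn.cluster import AgglomerativeClustering, KMeans
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.mixture import GaussianMixture

# Placeholder passages (invented, not real book partitions).
texts = [
    "the lord spake unto the people of the land",
    "the lord commanded the children to keep his covenant",
    "and the lord blessed the people and their children",
    "the lady visited her neighbour and spoke of marriage",
    "her friend praised the match and the fine estate",
    "the lady and her friend walked to the estate",
]

# Vectorize, then reduce to a small dense matrix that all three methods accept.
tfidf = TfidfVectorizer().fit_transform(texts)
X = TruncatedSVD(n_components=2, random_state=0).fit_transform(tfidf)

n_clusters = 2  # illustrative; the project clusters five books

cluster_labels = {
    "K-means": KMeans(
        n_clusters=n_clusters, n_init=10, random_state=0
    ).fit_predict(X),
    "Hierarchical": AgglomerativeClustering(
        n_clusters=n_clusters
    ).fit_predict(X),
    # GaussianMixture is fitted with the EM algorithm.
    "EM": GaussianMixture(
        n_components=n_clusters, covariance_type="diag", random_state=0
    ).fit_predict(X),
}
```

Each entry in `cluster_labels` is an array with one cluster index per passage. Cluster indices are arbitrary, so comparing methods requires a label-invariant measure rather than direct equality.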

Files:

- Book Classification-report.pptx
- Book Classification_BOW&TFIDF.ipynb
- Book Classification_errorAnalysis.ipynb
- Book Clustering-Doc2Vec.ipynb
- Book Clustering-report.pptx
- README.md

liuyuanyue185/NLP_Book-Classification-Clustering
