NLP_Book-Classification/Clustering

Background of this project: Apply different text transformation methods (BOW, TF-IDF, Doc2Vec) and algorithms to classify and cluster five books: chesterton-brown, austen-emma, edgeworth-parents, milton-paradise, and bible-kjv.

Data preprocessing:

- Convert all letters to lower case
- Remove punctuation
- Tokenize the documents and remove stopwords (nltk library)
- Lemmatize the tokens
- Transform the text into vectors
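The preprocessing and vectorization steps can be sketched in plain Python as follows. This is a minimal illustration, not the project's code: the `STOPWORDS` set is a tiny stand-in for the NLTK stop-word list, lemmatization is skipped because it needs NLTK's WordNet data, the helper names (`preprocess`, `bag_of_words`, `tfidf`) are hypothetical, and the two sample sentences are invented.

```python
import math
import string
from collections import Counter

# Tiny stand-in stop-word list (assumption); the project uses NLTK's list.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "was", "on"}


def preprocess(text):
    """Lowercase, strip punctuation, tokenize on whitespace, drop stop words."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return [tok for tok in text.split() if tok not in STOPWORDS]


def bag_of_words(docs):
    """Return the sorted vocabulary and one term-count vector per document."""
    vocab = sorted({tok for doc in docs for tok in doc})
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tfidf(vectors):
    """Weight raw counts by idf = ln(N / df), one common TF-IDF variant."""
    n_docs = len(vectors)
    n_terms = len(vectors[0])
    # Document frequency: number of documents containing each term.
    df = [sum(1 for vec in vectors if vec[j] > 0) for j in range(n_terms)]
    return [
        [vec[j] * math.log(n_docs / df[j]) for j in range(n_terms)]
        for vec in vectors
    ]


# Invented sample sentences, for illustration only.
docs = [
    "The serpent was in the garden.",
    "Emma walked in the garden, and the garden was quiet.",
]
tokens = [preprocess(d) for d in docs]
vocab, bow = bag_of_words(tokens)
weights = tfidf(bow)
```

With this idf variant, a term that appears in every document (here `garden`) gets weight zero, which is why TF-IDF tends to emphasize words that distinguish one book from another.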

Classification:

- Support Vector Machines (SVM)
- K-Nearest Neighbors (KNN)
- Decision Tree
- Random Forest
- Logistic Regression
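As one example from the classifiers listed above, K-Nearest Neighbors can be sketched in plain Python. This is an illustrative sketch under assumptions, not the project's implementation: it uses cosine similarity as the distance measure, the function names are hypothetical, and the vectors are made-up counts over a hypothetical four-word vocabulary rather than real features from the books.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_predict(train_vectors, train_labels, query, k=3):
    """Label the query by majority vote among its k most similar neighbors."""
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: cosine(pair[0], query),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Made-up count vectors over a hypothetical vocabulary
# ["garden", "heaven", "lord", "miss"]; not real data from the books.
train_vectors = [
    [3, 0, 0, 4],
    [2, 0, 1, 5],
    [0, 4, 5, 0],
    [1, 3, 4, 0],
]
train_labels = ["austen-emma", "austen-emma", "bible-kjv", "bible-kjv"]

prediction = knn_predict(train_vectors, train_labels, [0, 2, 3, 0], k=3)
```

The same `train_vectors` and `train_labels` layout (one vector and one book label per text sample) is what the other listed classifiers would consume as well.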

Clustering:

- K-means
- Hierarchical clustering
- Expectation-Maximization (EM)
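To illustrate the first clustering method, here is a minimal K-means sketch in plain Python. It is a simplified illustration rather than the project's code: the centroids are initialised from the first `k` points to keep the run deterministic (an assumption; random or k-means++ initialisation is more usual), the function names are hypothetical, and the 2-D points are toy values standing in for document vectors.

```python
def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(col) / n for col in zip(*points)]


def kmeans(points, k, max_iter=100):
    """Alternate assignment and centroid update until assignments stop changing."""
    # Deterministic initialisation from the first k points (assumption).
    centroids = [list(p) for p in points[:k]]
    assignments = None
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = mean(members)
    return assignments, centroids


# Toy 2-D points standing in for document vectors.
points = [[1.0, 1.0], [1.2, 0.8], [5.0, 5.0], [5.2, 4.8], [0.9, 1.1]]
assignments, centroids = kmeans(points, k=2)
```

For the five-book task, `k` would be set to 5 and the resulting cluster assignments compared against the known book labels.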
