For the machine learning course project, I worked on the publicly available Quora Insincere Questions Classification dataset. I used the NLTK library to clean and preprocess the data: removing stop words, removing punctuation, and stemming the words.
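The preprocessing steps above can be sketched roughly as follows. This is an illustration, not the exact project code: the function name `preprocess`, the whitespace tokenization, the choice of the Porter stemmer, and the small fallback stop word set (used only if the NLTK stop word corpus has not been downloaded) are all assumptions.

```python
import string

from nltk.stem import PorterStemmer

# Try the NLTK English stop word list; it needs a one-time
# nltk.download("stopwords"). If the corpus is missing, fall back to a
# tiny illustrative set (an assumption, not a complete list).
try:
    from nltk.corpus import stopwords
    STOP_WORDS = set(stopwords.words("english"))
except LookupError:
    STOP_WORDS = {"the", "a", "an", "is", "are", "of", "to", "in", "and", "why", "do"}

STEMMER = PorterStemmer()
# Translation table that deletes every ASCII punctuation character.
PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def preprocess(text):
    """Lowercase, strip punctuation, drop stop words, and stem each word."""
    cleaned = text.lower().translate(PUNCT_TABLE)
    tokens = cleaned.split()  # simple whitespace tokenization (assumption)
    return [STEMMER.stem(tok) for tok in tokens if tok not in STOP_WORDS]


tokens = preprocess("Why are the questions running?")
```

The cleaned tokens can then be joined back into a string and passed to a vectorizer before training a classifier.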
I evaluated three popular machine learning algorithms (logistic regression, the perceptron, and the naive Bayes classifier) using the scikit-learn (sklearn) library.
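A comparison like this could be set up along the following lines. This is a hedged sketch under stated assumptions: the toy questions and labels are made up for illustration (they are not from the Quora dataset), TF-IDF features and the multinomial variant of naive Bayes are assumed choices, and the score is accuracy on the training data itself, which only checks that the pipeline runs. A real evaluation would use a held-out split or cross-validation.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression, Perceptron
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Made-up toy examples (NOT from the Quora dataset): 1 = insincere, 0 = sincere.
texts = [
    "how do i learn python quickly",
    "what is the best way to study math",
    "how does a neural network work",
    "why are those people so stupid",
    "why is everyone from that city an idiot",
    "why do losers always complain",
]
labels = [0, 0, 0, 1, 1, 1]

# The three classifiers named in the text; hyperparameters are assumptions.
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "perceptron": Perceptron(random_state=0),
    "naive_bayes": MultinomialNB(),
}


def evaluate(models, texts, labels):
    """Fit each model on TF-IDF features and return its accuracy on the same data."""
    scores = {}
    for name, clf in models.items():
        pipeline = make_pipeline(TfidfVectorizer(), clf)
        pipeline.fit(texts, labels)
        scores[name] = pipeline.score(texts, labels)
    return scores


scores = evaluate(models, texts, labels)
for name, acc in scores.items():
    print(f"{name}: training accuracy = {acc:.2f}")
```

Wrapping the vectorizer and classifier in one pipeline keeps the feature extraction identical across the three models, so differences in score come from the classifier alone.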