For #2: Logistic Regression on winequality.csv #37

Merged
4 commits merged into mozilla:master on Mar 20, 2020

SanchiMittal (Contributor) commented Mar 10, 2020

I have used a Logistic Regression classifier to perform binary classification on winequality.csv, classifying the test data into recommended and not recommended wines.
@dzeber and @mlopatka Please let me know what improvements I need to make.
Thanks

SanchiMittal (Contributor, Author) commented Mar 11, 2020

I will add other ML models such as KNN, SVM, and Naive Bayes in the next step and present a comparative study. @dzeber @mlopatka Kindly review my work so far and let me know if I need to change anything in my workflow.
Regards.

dzeber (Contributor) left a comment

Thanks for this PR. The notebook is well-documented and easy to follow, as are your modules. This is sufficient to satisfy the startup task #2, but you're welcome to dig into the modeling further.

Please add a comment indicating how you decided on this train-test split ratio. We are asking for this input at this point because one of the goals of the project is to better understand how this choice influences the outcomes of the model.
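
One way to ground that comment is to look at what each candidate ratio leaves in the test set. The following is a minimal sketch using only the standard library. The label vector (200 recommended, 800 not recommended) and the helper `stratified_split` are made up for illustration and are not taken from the notebook.

```python
import random
from collections import Counter


def stratified_split(labels, test_fraction, seed=0):
    """Return (train_idx, test_idx), keeping class proportions roughly equal."""
    rng = random.Random(seed)
    # Group row indices by class label.
    by_class = {}
    for i, label in enumerate(labels):
        by_class.setdefault(label, []).append(i)
    train_idx, test_idx = [], []
    for idx in by_class.values():
        rng.shuffle(idx)
        n_test = round(len(idx) * test_fraction)
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return train_idx, test_idx


# Hypothetical labels: 1 = recommended, 0 = not recommended.
labels = [1] * 200 + [0] * 800

for test_fraction in (0.1, 0.2, 0.3, 0.4):
    train_idx, test_idx = stratified_split(labels, test_fraction)
    test_counts = Counter(labels[i] for i in test_idx)
    print(
        f"test={test_fraction:.0%}  train size={len(train_idx)}  "
        f"test size={len(test_idx)}  positives in test={test_counts[1]}"
    )
```

A smaller test fraction leaves more data for training but fewer minority-class examples to evaluate on, so per-class metrics become noisier. Reporting these counts next to the chosen ratio makes the trade-off explicit.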

Also, do you have any other thoughts on the results from your classification report and confusion matrix? Notice that the model seems to assign most wines to the non-recommended category regardless, which might be inflating the overall accuracy. It would be interesting to see the results on an undersampled training set.
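
To make the accuracy concern concrete, the sketch below computes the accuracy of always predicting the majority class and then undersamples a training set so both classes are the same size. It uses only the standard library. The data and the helper names `majority_baseline_accuracy` and `undersample` are hypothetical, not part of the PR.

```python
import random
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most common class."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


def undersample(rows, labels, seed=0):
    """Randomly drop rows from larger classes until all classes match the smallest."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)
    n_min = min(len(group) for group in by_class.values())
    balanced = []
    for label, group in by_class.items():
        balanced.extend((row, label) for row in rng.sample(group, n_min))
    rng.shuffle(balanced)
    new_rows = [row for row, _ in balanced]
    new_labels = [label for _, label in balanced]
    return new_rows, new_labels


# Hypothetical training data: 800 not recommended (0), 200 recommended (1).
train_labels = [0] * 800 + [1] * 200
train_rows = [[float(i)] for i in range(len(train_labels))]

print("majority baseline accuracy:", majority_baseline_accuracy(train_labels))

balanced_rows, balanced_labels = undersample(train_rows, train_labels)
print("class counts after undersampling:", Counter(balanced_labels))
```

If the model's accuracy is close to the majority baseline, it is adding little beyond predicting "not recommended" every time. Undersampling would normally be applied to the training split only, with the test split left at its original class balance so the reported metrics reflect realistic data.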

# Correlation matrix
corr = d.corr()
print("Correlation Matrix:")
print(corr)

I would not bother printing this, since you are also displaying the visualizations.

SanchiMittal (Contributor, Author)

Thank you for the review. I will implement the above suggestions in my work. I would also like to dig further into modelling and study the performance of different models.

mlopatka (Contributor) left a comment

Excellent work! Thank you for addressing earlier feedback.

@mlopatka mlopatka merged commit 6543432 into mozilla:master Mar 20, 2020