Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Point WordEmbeddingSimilarityIndex documentation to gensim.similarities #3003

Merged

Conversation

Witiko
Copy link
Contributor

@Witiko Witiko commented Nov 21, 2020

Currently, all documentation, Jupyter notebooks and code examples assume that WordEmbeddingSimilarityIndex is located in gensim.models, although it lives in gensim.similarities since 4.0.0. This pull request fixes the documentation.

@piskvorky piskvorky added bug Issue described a bug documentation Current issue related to documentation labels Nov 21, 2020
@piskvorky
Copy link
Owner

piskvorky commented Nov 21, 2020

Thanks @Witiko ! The main place for documentation is the gallery. I fear the notebooks directory is hard to discover, and not really maintained / tested.

Would it make sense to convert this into a gallery tutorial / howto script? With the notebook and html formats auto-generated from the converted source, instructions.

@Witiko
Copy link
Contributor Author

Witiko commented Nov 21, 2020

@piskvorky I like the idea behind the gallery. I don't have the time to convert now, but I will take a look during the next week.

@mpenkov mpenkov added this to the 4.0.0 milestone Feb 25, 2021
@piskvorky
Copy link
Owner

@Witiko we're planning the 4.0 release now. Can you please finish this? Thanks.

@Witiko
Copy link
Contributor Author

Witiko commented Feb 25, 2021

@piskvorky Thanks for the reminder. I will give it a look this weekend.

@Witiko
Copy link
Contributor Author

Witiko commented Feb 27, 2021

@piskvorky Would it be OK to merge the SCM notebook with the WMD autoexample? This would rid us of a lot of duplication and there would exist a single lean and mean example for word-embedding-based document similarity measures.

@piskvorky
Copy link
Owner

Anything that cleans up & simplifies documentation is highly desirable. @mpenkov WDYT?

@mpenkov
Copy link
Collaborator

mpenkov commented Feb 28, 2021

No objections from me.

@Witiko Witiko force-pushed the redirect-keyedvectorssimilarityindex branch from d4aa31c to 15f9282 Compare February 28, 2021 17:03
@Witiko
Copy link
Contributor Author

Witiko commented Feb 28, 2021

In the end, I kept the tutorials separate. Merging the tutorials seemed as a good idea at first, but the preprocessing pipelines for the SCM and the WMD are quite different and the merged tutorial would become significantly more complex as a result.

Perhaps I can give it another stab if we are ever adding sections about WmdSimilarity and SoftCosineSimilarity M:N document similarities, since this will make the autoexamples complex anyways. For now, I added some fixes for the WMD autoexample, such as removing a section about deprecated init_sims method, and redirecting the documentation to the Gallery.

@piskvorky
Copy link
Owner

Thanks so much @Witiko . Is the failing test related?

@Witiko
Copy link
Contributor Author

Witiko commented Feb 28, 2021

@piskvorky I don't think it is. The failing test is for Doc2Vec, which this PR does not touch.

@mpenkov
Copy link
Collaborator

mpenkov commented Mar 1, 2021

I re-ran the tests, they all passed.

@mpenkov mpenkov merged commit ddeeb12 into piskvorky:develop Mar 1, 2021
@mpenkov
Copy link
Collaborator

mpenkov commented Mar 1, 2021

Merged, thank you @Witiko !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Issue described a bug documentation Current issue related to documentation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants