forked from piskvorky/gensim
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Implement Soft Cosine Measure (piskvorky#1827)
* Implement Soft Cosine Similarity
* Added numpy-style documentation for Soft Cosine Similarity
* Added unit tests for Soft Cosine Similarity
* Make WmdSimilarity and SoftCosineSimilarity handle empty queries
* Rename Soft Cosine Similarity to Soft Cosine Measure
* Add links to Soft Cosine Measure papers
* Remove unused variables and parameters for Soft Cosine Measure
* Replace explicit timers with magic %time in Soft Cosine Measure notebook
* Rename var in term similarity matrix construction to reflect symmetry
* Update SoftCosineSimilarity class example to define all variables
* Make the code in Soft Cosine Measure notebook more compact
* Use hanging indents in EuclideanKeyedVectors.similarity_matrix
* Simplified expressions in WmdSimilarity and SoftCosineSimilarity
* Extract the sparse2coo function to the global scope
* Fix __str__ of SoftCosineSimilarity
* Use hanging indents in SoftCossim.__init__
* Fix formatting of the matutils module
* Make similarity matrix info messages appear at fixed frequency
* Construct term similarity matrix rows for important terms first
* Optimize softcossim for an estimated 100-fold constant speed increase
* Remove unused import in gensim.similarities.docsim
* Fix imports in gensim.models.keyedvectors
* replace reference to anonymous link
* Update "See Also" references to new *2vec implementation
* Fix formatting error in gensim.models.keyedvectors
* Update Soft Cosine Measure tutorial notebook
* Update Soft Cosine Measure tutorial notebook
* Use smaller glove-wiki-gigaword-50 model in Soft Cosine Measure notebook
* Use gensim-data to load SemEval datasets in Soft Cosine Measure notebook
* Use backwards-compatible syntax in Soft Cosine Similarity notebook
* Remove unnecessary package requirements in Soft Cosine Measure notebook
* Fix Soft Cosine Measure notebook to use true gensim-data dataset names
* fix docs[1]
* fix docs[2]
* fix docs[3]
* small fixes
* small fixes[2]
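For readers unfamiliar with the measure this commit adds, here is a minimal, dependency-free sketch of the standard soft cosine formula, `a^T S b / sqrt((a^T S a) * (b^T S b))`, where `S` is a symmetric term similarity matrix. It is not the code from this commit: the function name `soft_cosine`, the dense list-of-lists representation, and the choice to return `0.0` for an empty vector are all assumptions made for illustration.

```python
import math


def soft_cosine(a, b, S):
    """Soft cosine measure between two dense bag-of-words vectors.

    a, b: term-weight vectors of equal length (lists of floats).
    S: symmetric term similarity matrix (list of lists), S[i][i] == 1.
    Hypothetical helper for illustration; not gensim's implementation.
    """
    def bilinear(x, y):
        # Computes x^T S y with explicit loops.
        return sum(
            x[i] * S[i][j] * y[j]
            for i in range(len(x))
            for j in range(len(y))
        )

    norm_a = bilinear(a, a)
    norm_b = bilinear(b, b)
    if norm_a <= 0 or norm_b <= 0:
        # Assumption: an empty (all-zero) vector has similarity 0.
        return 0.0
    return bilinear(a, b) / math.sqrt(norm_a * norm_b)


# Two documents with no terms in common, over the vocabulary ["play", "game"].
doc_a = [1.0, 0.0]
doc_b = [0.0, 1.0]

identity = [[1.0, 0.0], [0.0, 1.0]]  # no cross-term similarity
related = [[1.0, 0.5], [0.5, 1.0]]   # "play" and "game" are half-similar

print(soft_cosine(doc_a, doc_b, identity))
print(soft_cosine(doc_a, doc_b, related))
```

With the identity matrix the measure reduces to the ordinary cosine similarity, so documents with disjoint vocabularies score zero; a non-trivial `S` lets related terms contribute to the score.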
1 parent 420b989, commit 804a6cc
Showing 9 changed files with 1,041 additions and 48 deletions.