Add smart information retrieval system for TFIDF #1785
Labels
difficulty medium
Medium issue: required good gensim understanding & python skills
feature
Issue described a new feature
wishlist
Feature request
https://en.wikipedia.org/wiki/SMART_Information_Retrieval_System
The current TFIDF model uses natural TF and IDF when computing TFIDF weights. The idea is to support the other transformations described by the SMART notation, such as logarithmic, augmented, and boolean term frequency, and apply them before computing the vectors.
More about this - http://www.cs.odu.edu/~jbollen/IR04/readings/article1-29-03.pdf and https://nlp.stanford.edu/IR-book/pdf/06vect.pdf
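To make the request concrete, here is a rough sketch of a few SMART variants in plain Python. It follows the weighting table in the Stanford IR book chapter linked above: term frequency `n` (natural), `l` (logarithmic), `a` (augmented), `b` (boolean); document frequency `n` (none), `t` (idf); normalization `n` (none), `c` (cosine). The function names and the three-letter `scheme` argument are made up for illustration and are not existing gensim API. Natural log is used here, which is an assumption; the log base in gensim may differ.

```python
import math
from collections import Counter


def tf_weight(tf, max_tf, letter):
    """Term-frequency component (hypothetical helper, not gensim API)."""
    if tf <= 0:
        return 0.0
    if letter == "n":          # natural: raw count
        return float(tf)
    if letter == "l":          # logarithmic: 1 + log(tf)
        return 1.0 + math.log(tf)
    if letter == "a":          # augmented: 0.5 + 0.5 * tf / max tf in the document
        return 0.5 + 0.5 * tf / max_tf
    if letter == "b":          # boolean: 1 if the term occurs
        return 1.0
    raise ValueError("unknown tf scheme: %r" % letter)


def idf_weight(df, n_docs, letter):
    """Document-frequency component (hypothetical helper, not gensim API)."""
    if letter == "n":          # no idf weighting
        return 1.0
    if letter == "t":          # idf: log(N / df), natural log assumed here
        return math.log(n_docs / df)
    raise ValueError("unknown df scheme: %r" % letter)


def smart_tfidf(docs, scheme="ntc"):
    """Weight tokenized documents using a three-letter SMART scheme.

    docs is a list of token lists; returns one {term: weight} dict per document.
    """
    tf_letter, df_letter, norm_letter = scheme
    n_docs = len(docs)

    # Document frequency: number of documents each term appears in.
    dfs = Counter(term for doc in docs for term in set(doc))

    vectors = []
    for doc in docs:
        counts = Counter(doc)
        max_tf = max(counts.values()) if counts else 0
        vec = {
            term: tf_weight(tf, max_tf, tf_letter)
            * idf_weight(dfs[term], n_docs, df_letter)
            for term, tf in counts.items()
        }
        if norm_letter == "c":  # cosine normalization to unit length
            norm = math.sqrt(sum(w * w for w in vec.values()))
            if norm > 0:
                vec = {term: w / norm for term, w in vec.items()}
        elif norm_letter != "n":
            raise ValueError("unknown normalization: %r" % norm_letter)
        vectors.append(vec)
    return vectors
```

With this sketch, `smart_tfidf(docs, "ntn")` would correspond to the natural TF and IDF described above, while something like `"lnc"` or `"atc"` selects one of the alternative transformations. How this should be exposed in gensim (a constructor argument, separate callables, etc.) is left open for the PR.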
Will send a PR tomorrow.