Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use CoherenceModel in top_topics in LDAModel to avoid code duplication #1128

Closed
tmylk opened this issue Jan 31, 2017 · 3 comments · Fixed by #1427
Closed

Use CoherenceModel in top_topics in LDAModel to avoid code duplication #1128

tmylk opened this issue Jan 31, 2017 · 3 comments · Fixed by #1427
Labels
difficulty easy Easy issue: required small fix wishlist Feature request

Comments

@tmylk
Copy link
Contributor

tmylk commented Jan 31, 2017

'u_mass' coherence is currently implemented in two places in gensim: CoherenceModel and LDAModel.

Use CoherenceModel(coherence='u_mass', topn=num_words) in LDAModel top_topics

Will require changes to CoherenceModel to return non-aggregated coherence scores, topic words and avoid importing LDAModel.

@tmylk tmylk added difficulty easy Easy issue: required small fix wishlist Feature request labels Jan 31, 2017
@greninja
Copy link
Contributor

greninja commented Feb 14, 2017

I have a doubt regarding calculating the coherence of each individual topic.

Is it like:
We make a 2D list of confirmation measures for each individual topic, in direct_confirmation_measure.py's log_conditional_probability function, instead of creating one whole big list for all the topics combined, and then calculate the arithmetic mean of each topic's confirmation measures to give its coherence?

This will I believe generate non-aggregated coherence scores. I will make a PR accordingly.

menshikh-iv pushed a commit that referenced this issue Aug 18, 2017
* Add a `get_topics` method to all topic models, add test coverage for this, and update the `CoherenceModel` to use this for getting topics from models.

* Require topics returned from `get_topics` to be probability distributions for the probabilistic topic models.

* Replace code in `LdaModel.top_topics` with use of `CoherenceModel`.

* Fix docstrings to use Google style throughout PR changes and various LdaModel methods.
fabriciorsf pushed a commit to LINE-PESC/gensim that referenced this issue Aug 23, 2017
…iskvorky#1427)

* Add a `get_topics` method to all topic models, add test coverage for this, and update the `CoherenceModel` to use this for getting topics from models.

* Require topics returned from `get_topics` to be probability distributions for the probabilistic topic models.

* Replace code in `LdaModel.top_topics` with use of `CoherenceModel`.

* Fix docstrings to use Google style throughout PR changes and various LdaModel methods.
@ismskd
Copy link

ismskd commented Dec 3, 2018

@tmylk dear
my spyder hangs whenever i use coherence='c_v' in CoherenceModel

`
Lda_Model = LdaModel(corpus=corpus, id2word=dictionary, iterations=50, num_topics=2)

cm = CoherenceModel(model=Lda_Model, texts=texts, dictionary=dictionary, coherence='c_v')
print( cm.get_coherence())`

plz guide me if u have any idea about this issue

@menshikh-iv
Copy link
Contributor

@ismskd this related to Windows + Multiprocessing issue, see #2291

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
difficulty easy Easy issue: required small fix wishlist Feature request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants