Releases: bab2min/tomotopy
0.11.0
- A new topic model for short texts, tomotopy.PTModel, was added to the package.
- An issue was fixed where tomotopy.HDPModel.infer sometimes caused a segmentation fault.
- A mismatch of the NumPy API version was fixed.
- Asymmetric document-topic priors are now supported.
- Serializing topic models to bytes in memory is now supported.
- An argument normalize was added to get_topic_dist(), get_topic_word_dist() and get_sub_topic_dist() for controlling normalization of results.
- tomotopy.DMRModel.lambdas and tomotopy.DMRModel.alpha now give correct values.
- Support for categorical metadata was added to tomotopy.GDMRModel (see https://github.com/bab2min/tomotopy/blob/main/examples/gdmr_both_categorical_and_numerical.py).
- Python 3.5 support was dropped.
0.10.2
0.10.1
- An issue was fixed where tomotopy.utils.Corpus.extract_ngrams crashed with empty input.
- An issue was fixed where tomotopy.LDAModel.infer raised an exception with valid input.
- An issue was fixed where tomotopy.HLDAModel.infer generated a wrong tomotopy.Document.path.
- A new parameter freeze_topics was added to tomotopy.HLDAModel.train, so you can control whether new topics are created during training.
0.10.0
- The interfaces of tomotopy.utils.Corpus and tomotopy.LDAModel.docs were unified. Now you can access documents in a corpus in the same manner.
- __getitem__ of tomotopy.utils.Corpus was improved. In addition to indexing by int, indexing by Iterable[int] and slicing are supported. Indexing by uid is also supported.
- New methods tomotopy.utils.Corpus.extract_ngrams and tomotopy.utils.Corpus.concat_ngrams were added. They extract n-gram collocations using PMI and concatenate them into single words.
- A new method tomotopy.LDAModel.add_corpus was added, and tomotopy.LDAModel.infer can receive a corpus as input.
- A new module tomotopy.coherence was added. It provides a way to calculate the coherence of a model.
- A parameter window_size was added to tomotopy.label.FoRelevance.
- An issue was fixed where NaN often occurred when training tomotopy.HDPModel.
- Python 3.9 is now supported.
- The dependency on py-cpuinfo was removed and the initialization of the module was improved.
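To make the idea behind PMI-based collocation extraction concrete, here is a minimal pure-Python sketch for bigrams. It is not tomotopy's implementation: the function names bigram_pmi and concat_bigrams and the min_count parameter are hypothetical, and the scoring is the plain PMI formula log(p(a, b) / (p(a) p(b))) over adjacent word pairs.

```python
import math
from collections import Counter

def bigram_pmi(docs, min_count=2):
    """Return {(w1, w2): PMI} for adjacent pairs seen at least min_count times."""
    unigrams = Counter()
    bigrams = Counter()
    for words in docs:
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    scores = {}
    for (a, b), count in bigrams.items():
        if count < min_count:
            continue  # rare pairs get inflated PMI, so filter them out
        p_ab = count / n_bi
        p_a = unigrams[a] / n_uni
        p_b = unigrams[b] / n_uni
        scores[(a, b)] = math.log(p_ab / (p_a * p_b))
    return scores

def concat_bigrams(words, pairs, delimiter="_"):
    """Merge each adjacent pair listed in pairs into a single token."""
    out, i = [], 0
    while i < len(words):
        if i + 1 < len(words) and (words[i], words[i + 1]) in pairs:
            out.append(words[i] + delimiter + words[i + 1])
            i += 2
        else:
            out.append(words[i])
            i += 1
    return out

docs = [
    "new york is big".split(),
    "i love new york".split(),
    "the big city is new".split(),
]
scores = bigram_pmi(docs)
merged = [concat_bigrams(d, set(scores)) for d in docs]
print(scores)
print(merged)
```

The frequency filter matters because raw PMI favors pairs of rare words; without it, a pair seen once between two words that each occur once would outrank a genuinely recurring collocation.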
0.9.1
0.9.0
- The tomotopy.LDAModel.summary() method, which prints a human-readable summary of the model, has been added.
- The random number generator of the package has been replaced with EigenRand. It speeds up random number generation and resolves differences in results between platforms.
- Due to the above, even if seed is the same, the model training result may differ from versions before 0.9.0.
- Fixed a training error in tomotopy.HDPModel.
- tomotopy.DMRModel.alpha now shows the Dirichlet prior of the per-document topic distribution by metadata.
- tomotopy.DTModel.get_count_by_topics() has been modified to return a 2-dimensional ndarray.
- tomotopy.DTModel.alpha has been modified to return the same value as tomotopy.DTModel.get_alpha().
- Fixed an issue where the metadata value could not be obtained for documents of tomotopy.GDMRModel.
- tomotopy.HLDAModel.alpha now shows the Dirichlet prior of the per-document depth distribution.
- tomotopy.LDAModel.global_step has been added.
- tomotopy.MGLDAModel.get_count_by_topics() now returns the word count for both global and local topics.
- tomotopy.PAModel.alpha, tomotopy.PAModel.subalpha, and tomotopy.PAModel.get_count_by_super_topic() have been added.
0.8.2
- New properties tomotopy.DTModel.num_timepoints and tomotopy.DTModel.num_docs_by_timepoint have been added.
- A bug that caused different results on different platforms even when seeds were the same was partially fixed. As a result of this fix, tomotopy on 32-bit platforms now yields different training results from earlier versions.
0.8.1
- A bug where tomotopy.LDAModel.used_vocabs returned an incorrect value was fixed.
- tomotopy.CTModel.prior_cov now returns a covariance matrix with shape [k, k].
- tomotopy.CTModel.get_correlations with empty arguments now returns a correlation matrix with shape [k, k].
0.8.0
- Since NumPy was introduced in tomotopy, many methods and properties of tomotopy return not just list, but numpy.ndarray now.
- Tomotopy has a new dependency NumPy >= 1.10.0.
- A wrong estimation of tomotopy.HDPModel.infer was fixed.
- A new method for converting an HDPModel to an LDAModel was added.
- New properties including tomotopy.LDAModel.used_vocabs, tomotopy.LDAModel.used_vocab_freq and tomotopy.LDAModel.used_vocab_df were added into topic models.
- A new g-DMR topic model (tomotopy.GDMRModel) was added.
- An error when initializing tomotopy.label.FoRelevance on macOS was fixed.
- An error that occurred when using a tomotopy.utils.Corpus created without raw parameters was fixed.