Adding document metadata into the DMR-Model #107
Comments
Hello @hhagedorn Fortunately, it doesn't seem difficult to extend the |
Hello @bab2min, may I ask whether you already have a rough plan for when you might release the next update? Will it be within a couple of weeks, or more like a couple of months? I'm asking because in the first case I might wait for it before I conclude my current work. Thank you very much anyway!
@hhagedorn |
Hi, |
improved DMR & GDMR (#107): improved GDMR's performance; fixed wrong topic_id for excluded words; copy() method for topic models; typed Python exceptions & warnings; refactored code based on C++14
@hhagedorn, sorry for the slightly later update than scheduled. A test version of DMRModel with multiple metadata labels has just been uploaded.
$ pip install -U --index-url https://test.pypi.org/simple/ tomotopy==0.12.0
Multiple labels are supported by Also a new method See more detail in https://github.com/bab2min/tomotopy/blob/main/examples/dmr_multi_label.py
No worries, thank you very much for all of your efforts. It looks great and I will try it out within the next couple of days! |
Hi, in the meantime I have trained all the models, and the new functionality in DMR works great, thank you! I just have one small remark, and I don't even know if it is worth mentioning. When I want to inspect priors for given metadata in DMR, everything works exactly as described in the documentation. However, the method is not "known" to the Python implementation, i.e. my IDE tells me it doesn't exist.
@hhagedorn, I'm glad the new functionality works well. |
I am using PyCharm (Community Edition). But I am not sure whether the problem is linked to the IDE. When I inspect the DMRModel Class, all the methods and parameters are there - except the newly added ones. I.e. there is no |
Hi everybody,
in a current project, I am using various models from this great package. One model that seemed particularly interesting is the DMR topic model, but when I tried to use it I failed to properly include the metadata of the given documents.
From my understanding of the paper on which the model is based, each document can be linked to an arbitrary number of metadata labels from a given set, e.g. a list of authors. For example, on page two of the Mimno and McCallum paper it says:
Now, looking at the signature of the DMR model's add_doc() method, it seems only possible to add a single metadata string per document. Do I understand correctly that only single-label documents can be included in this version of the DMR model, e.g. that only single-author documents can be considered and no lists or binary vectors of metadata values can be passed in?
Thank you in advance!
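To make the question concrete, here is a minimal pure-Python sketch of what a multi-label (binary-vector) input would mean for the DMR prior, assuming the usual formulation alpha_{d,k} = exp(x_d · lambda_k) from the Mimno and McCallum paper. It does not use tomotopy and is not its implementation; the author names, the weight values, and the helpers `multi_hot` and `dmr_prior` are all hypothetical and only for illustration.

```python
import math


def multi_hot(labels, vocabulary):
    """Encode a document's labels as a binary vector plus a constant bias feature."""
    present = set(labels)
    return [1.0 if name in present else 0.0 for name in vocabulary] + [1.0]


def dmr_prior(features, lambdas):
    """Return alpha_k = exp(x . lambda_k) for every topic k."""
    return [
        math.exp(sum(x * w for x, w in zip(features, topic_weights)))
        for topic_weights in lambdas
    ]


# Hypothetical metadata vocabulary (assumption, not from tomotopy).
authors = ["alice", "bob", "carol"]

# One made-up weight vector per topic, ordered as [alice, bob, carol, bias].
lambdas = [
    [0.5, 0.0, -0.5, 0.0],  # topic 0
    [0.0, 1.0, 0.0, -1.0],  # topic 1
]

single = multi_hot(["alice"], authors)        # single-author document
joint = multi_hot(["alice", "bob"], authors)  # two-author document

alpha_single = dmr_prior(single, lambdas)
alpha_joint = dmr_prior(joint, lambdas)

print("single-author prior:", alpha_single)
print("two-author prior:   ", alpha_joint)
```

In this sketch, adding "bob" as a second author changes only the prior of topic 1, because only that topic has a non-zero weight for him. This is the kind of per-document combination that a single metadata string cannot express.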