Architecture prototype: Explore partial dataset loading for NLU graph #8407
Comments
> What happens currently in `prepare_partial_training`

Overall it seems the components do not collect highly specialized data in `prepare_partial_training`.
> How much control do the components need over chunked data

The chunked data is currently balanced once, featurized, and persisted to disk upfront. So far only two components have an implementation for training on chunks.
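The "balanced once upfront" step could look roughly like the following sketch (a hypothetical helper, not the actual Rasa code): it deals the examples of each label round-robin across the chunks, so every chunk keeps roughly the label distribution of the full dataset.

```python
import random
from collections import defaultdict


def make_balanced_chunks(examples, label_fn, num_chunks):
    """Split labeled examples into num_chunks chunks such that each chunk
    has roughly the same label distribution as the full dataset."""
    by_label = defaultdict(list)
    for ex in examples:
        by_label[label_fn(ex)].append(ex)

    chunks = [[] for _ in range(num_chunks)]
    for group in by_label.values():
        random.shuffle(group)
        # deal the examples of this label round-robin across the chunks
        for i, ex in enumerate(group):
            chunks[i % num_chunks].append(ex)
    return chunks
```

Each chunk could then be featurized and written to disk before the actual training starts.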
I guess the entity synonym wrapper and other extractors like it.
> Have an answer whether it's possible to abstract the current training (`chunks==1`) as a subset of training with multiple chunks

I can't think of any reason why it wouldn't be possible. I just think the current design is suboptimal. By exposing the chunks to the components directly, you add unnecessary complexity to the components if you want to handle the chunks in a somewhat smart way. Probably you would just have the components create some sort of data handler on the chunks to prevent code duplication. And then why not just hand them some data handler right away that abstracts away the details? What I think we should do is take the good parts of the existing implementation, such as the logic for creating balanced chunks upfront, rearrange things a bit, and arrive at a more general and cleaner solution. Will think about this a bit more in any case.
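The data-handler idea above could be sketched like this (all names are hypothetical, not an actual Rasa API): components iterate over whatever the handler yields, and the current single-pass training is just the `num_chunks == 1` case.

```python
from typing import Any, Callable, Iterator, List


class ChunkedTrainingData:
    """Hypothetical abstraction: components iterate over chunks without
    knowing whether they come from memory or from files on disk."""

    def __init__(self, load_chunk: Callable[[int], Any], num_chunks: int):
        self._load_chunk = load_chunk  # e.g. reads one featurized chunk from disk
        self.num_chunks = num_chunks

    @classmethod
    def from_memory(cls, dataset: Any) -> "ChunkedTrainingData":
        # current behaviour: the whole dataset is a single in-memory chunk
        return cls(lambda _index: dataset, num_chunks=1)

    def __iter__(self) -> Iterator[Any]:
        for index in range(self.num_chunks):
            yield self._load_chunk(index)


class ExampleComponent:
    """A component trains the same way for one chunk or for many."""

    def __init__(self) -> None:
        self.seen: List[Any] = []

    def train(self, data: ChunkedTrainingData) -> None:
        for chunk in data:
            self.seen.append(chunk)  # stand-in for an incremental update
```

With this shape, a component's `train` method has a single code path, which is exactly the "current training as a subset of chunked training" question from the checklist.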
Further notes (work in progress)
Thanks for the investigation!
There are no components consuming their outputs during training anyway, are there? Where do we do the balancing?
Description of Problem:
Research already did a PR for training the NLU model in chunks, and any new model architecture will have to support this use case to enable training large-scale models on huge datasets. This issue is limited to the NLU part of partial training dataset loading; partial loading / processing of other (non-NLU) training data doesn't need to be explored.
Overview of the Solution:
No implementation is needed here. However, we need a clear picture of what an implementation with the proposed graph architecture could look like.
Blockers (if relevant):
None
Definition of Done:
- How much control do Components need over the chunked training data (e.g. whether the components need intermediate write operations)
- Whether Components have to be able to train on chunks (e.g. what kind of summary data they need before - these things are currently done in `prepare_partial_training`)
- Have an answer whether it's possible to abstract the current training (`chunks==1`) as a subset of training with multiple chunks
- A proposal for how Components access the data that abstracts away the details (e.g. if the data is on disk or in memory) and integrates into the graph structure
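For the last point, one way the "on disk or in memory" detail could be hidden is to persist each chunk to its own file and load them back lazily, so at most one chunk is in memory at a time. A minimal sketch with hypothetical helpers (assuming JSON-serializable chunks for illustration):

```python
import json
from pathlib import Path
from typing import Any, Iterator, Sequence


def persist_chunks(chunks: Sequence[Any], directory: Path) -> int:
    """Write each chunk to its own file so training can load one at a time."""
    directory.mkdir(parents=True, exist_ok=True)
    for index, chunk in enumerate(chunks):
        # zero-pad the index so lexicographic file order equals chunk order
        (directory / f"chunk_{index:05d}.json").write_text(json.dumps(chunk))
    return len(chunks)


def load_chunks(directory: Path) -> Iterator[Any]:
    """Lazily yield the persisted chunks in order."""
    for path in sorted(directory.glob("chunk_*.json")):
        yield json.loads(path.read_text())
```

A graph node could then pass `load_chunks(...)` (or a handler wrapping it) to downstream components instead of a fully materialized dataset.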