Changing duckling url shouldn't require a model retrain #3389
Comments
Thanks for raising this issue, @MetcalfeTom will get back to you about it soon. |
Hey @cmcc13, does this help?
See relevant issue here |
This might be a workaround, but I think the URL should not be put into the model in the first place. The URL should be read from a configuration file (config, .env or endpoints). We'll give the environment variable a go. Thanks. |
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. |
This issue has been automatically closed due to inactivity. Please create a new issue if you need more help. |
I found the same problem: if I change the Duckling HTTP URL in the config file, it requires a complete retrain. Please consider fixing this as it's very unintuitive. I spent a lot of time trying to figure out why the URL change was not getting picked up before stumbling on this. |
I agree. Not sure how best to handle this: either the URL should be part of the endpoints.yml (but it would still need to be definable in the config for NLU-only models? 🤔 ) or its value shouldn't influence the fingerprinting. |
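To make the second option concrete, here is a minimal sketch of a fingerprint that ignores deployment-only keys. It is not Rasa's actual fingerprinting code: the function name `pipeline_fingerprint`, the `DEPLOYMENT_ONLY_KEYS` set and the example pipeline entries are all assumptions made up for illustration.

```python
import hashlib
import json
from typing import Any, Dict, FrozenSet, List

# Hypothetical set of keys that describe where a service lives,
# not how the model is trained.
DEPLOYMENT_ONLY_KEYS: FrozenSet[str] = frozenset({"url"})


def pipeline_fingerprint(
    pipeline: List[Dict[str, Any]],
    ignored_keys: FrozenSet[str] = DEPLOYMENT_ONLY_KEYS,
) -> str:
    """Hash a pipeline config while skipping deployment-only keys."""
    stripped = [
        {key: value for key, value in component.items() if key not in ignored_keys}
        for component in pipeline
    ]
    # sort_keys gives a stable serialization regardless of key order.
    payload = json.dumps(stripped, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


# Illustrative component entries; only the URL differs between them.
dev = [{"name": "ner_duckling_http", "url": "http://duckling:8000"}]
prod = [{"name": "ner_duckling_http", "url": "http://localhost:8000"}]
print(pipeline_fingerprint(dev) == pipeline_fingerprint(prod))
```

With something like this, a URL change alone would leave the fingerprint untouched, while a change to any other component setting would still trigger a retrain.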
@wochinge in progress but no assignee? |
Thanks, fixed it :-) |
@wochinge why are we not fixing that? |
Because
So basically the ratio of benefit to effort is very bad. |
We have to change the Duckling URL regularly because the dev and prod environments are different, so we need to change it daily. |
Actually @cmcc13 the prod url is now "duckling.default.svc.cluster.local.:8000" and that could change later if we do more advanced GKE service stuff. But in dev we just want to docker-compose up and let docker sort out all the networking. So @wochinge it's a bit of a headache for our team to not have this all in a config file. |
As far as I understand it, you have two setups, correct? One docker-compose and one K8s? And are they completely separate or are you sharing the trained models between them? Because if you are not sharing the models between these two deployments, then you have to retrain either way. |
For each chatbot that we have (I believe it's 5), we have two model files: model-dev and model-prod. These models are identical except that they were trained with different Duckling URLs. Depending on our environment, we then build a Rasa docker container with one of these files. The Duckling URL is the only thing necessitating two training runs and managing two model files for each bot. We have a different action URL for dev and prod as well, but this is easily changed in the endpoints.yml file. |
Ah, I think I'm getting it now.
Would an easy workaround to add an alias for |
This problem still exists and is very unintuitive. Every endpoint can be configured in the endpoints.yml except the Duckling part. This makes automated deployment, e.g. via Helm, very messy. |
@wochinge, is there any chance this will be changed at some point? In order to avoid messing up our /etc/hosts file, we ended up budgeting for a separate, global Duckling server, used for both dev and prod. We are not completely satisfied, but it was the simplest way to avoid multiple trainings of the same model. |
@s-montes-majorel This is currently not super high on our priority list 😬 How about using environment variables for this and setting the env variable depending on the context? |
We use an env variable for the Duckling URL. However, changing the value while leaving the config.yml unchanged is still detected as a change that requires retraining. Is the Duckling component used during training? If not, maybe it would make sense to only substitute the env variable at inference time. I do understand that this may be low on the priority list, though :) For us, it meant budgeting the Duckling component in a different way. This could also be explained in the documentation for people deploying Rasa using separate microservices instead of one big Kubernetes. |
Thanks for the explanation!
It actually isn't 👍🏻 We are currently working on some changes for 3.0 where we could consider this 🤔
What do you mean by "one big Kubernetes"? |
Thanks for the reply!
I meant that the documentation (for Rasa X, in particular) assumes one big deployment (using Helm, Docker Compose, etc.) that takes care of all the microservices. Our current implementation is more modular. Most of the components (the tracker DB, the Duckling API, a custom event broker using Pub/Sub) are handled on different machines. We also have different environments (dev, pre, pro). We managed to make everything work using environment variables, except for the Duckling URL. |
Rasa version:
Rasa core 0.14.0
Rasa nlu 0.14.4
Python version:
3.6.8
Operating system (windows, osx, ...):
osx
Issue:
Rasa NLU model creation takes the Duckling URL from the config.yml file and puts it into the metadata.json file of the trained model.
We use docker-compose for local testing and k8s for cloud test/prod.
Docker and k8s network between containers in different ways: docker uses named containers, e.g. duckling, and k8s uses localhost. So we need a different Duckling URL for local vs. cloud testing.
We've separated the URLs into environment files, but the Rasa training puts the URL into the metadata.json file of the model. This means that the model has to be retrained between local (docker-compose) and cloud (k8s-docker) testing. It makes more sense to keep the URL outside of the model, in a config file that can be controlled by environment and build processes, so that the trained model can be copied rather than retrained (for no reason other than a URL change due to environment).
eg.
for docker-compose "url": "http://duckling:8000",
for k8s "url": "http://localhost:8000",
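A rough sketch of the behaviour being asked for, where the URL baked into the model is only a fallback and the environment decides at load time. This is not how Rasa resolves the URL; the variable name `DUCKLING_HTTP_URL` and the function `resolve_duckling_url` are hypothetical, and the metadata is reduced to a flat dict for illustration.

```python
import os
from typing import Mapping, Optional

# Hypothetical environment variable name, chosen for this sketch only.
ENV_VAR = "DUCKLING_HTTP_URL"


def resolve_duckling_url(
    component_metadata: Mapping[str, str],
    environ: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Prefer the environment's URL; fall back to the one stored at training time."""
    env = os.environ if environ is None else environ
    override = env.get(ENV_VAR)
    if override:
        return override
    return component_metadata.get("url")


# The same trained model, loaded in two environments.
trained = {"url": "http://duckling:8000"}
print(resolve_duckling_url(trained, {}))
print(resolve_duckling_url(trained, {ENV_VAR: "http://localhost:8000"}))
```

The point is that one model file could then be copied between docker-compose and k8s, with only the environment differing.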
Content of configuration file (config.yml):
Content of domain file (domain.yml) (if used & relevant):
not relevant