Changing duckling url shouldn't require a model retrain #3389

c12k · 2019-05-06T23:07:22Z

Rasa version:
Rasa core 0.14.0
Rasa nlu 0.14.4
Python version:
3.6.8
Operating system (windows, osx, ...):
osx
Issue:
Rasa NLU model creation takes the Duckling URL from the config.yml file and writes it into the metadata.json file of the trained model.
We use docker-compose for local testing and k8s for cloud test/prod.
Docker and k8s network between containers in different ways: docker uses named containers (e.g. duckling) and k8s uses localhost. So we need a different Duckling URL for local vs. cloud testing.
We've separated the URLs into environment files, but Rasa training puts the URL into the metadata.json file of the model. This means that the model has to be retrained between local (docker-compose) and cloud (k8s-docker) testing. It makes more sense to keep the URL outside of the model, in a config file that can be controlled by environment and build processes, so that the trained model can be copied rather than retrained (for no reason other than a URL change due to environment).
e.g.
for docker-compose "url": "http://duckling:8000",
for k8s "url": "http://localhost:8000",

Content of configuration file (config.yml):

for docker-compose:

pipeline:
# other stuff
  - name: ner_duckling_http
    url: http://duckling:8000

for cloud k8s:
pipeline:
# other stuff
  - name: ner_duckling_http
    url: http://localhost:8000

Content of domain file (domain.yml) (if used & relevant):

not relevant


akelad · 2019-05-08T13:36:08Z

Thanks for raising this issue, @MetcalfeTom will get back to you about it soon.

erohmensing · 2019-05-10T04:16:17Z

Hey @cmcc13, does this help?

In addition to setting the default ``url`` of your duckling server in the
configuration, you can also change the url of your duckling server (without
needing to re-train your model) by setting the ``RASA_DUCKLING_HTTP_URL``
environment variable.

See relevant issue here
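
To make the lookup order in the quoted docs concrete, here is a minimal sketch. It is not Rasa's actual implementation: the function name `resolve_duckling_url` and the `component_config` dict are hypothetical, and only the `RASA_DUCKLING_HTTP_URL` variable name comes from the docs above. The idea is simply that the environment variable, when set, wins over the `url` stored with the trained component.

```python
import os
from typing import Mapping, Optional

ENV_VAR = "RASA_DUCKLING_HTTP_URL"  # name taken from the quoted docs


def resolve_duckling_url(
    component_config: Mapping[str, str],
    environ: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the env override if set and non-empty, else the configured url.

    `component_config` stands in for whatever was stored with the model
    (an assumption for illustration, not Rasa's real data structure).
    """
    env = os.environ if environ is None else environ
    override = env.get(ENV_VAR)
    if override:
        return override
    return component_config.get("url")
```

With this kind of lookup, the same trained model could run under docker-compose with no variable set and under k8s with `RASA_DUCKLING_HTTP_URL=http://localhost:8000`.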

c12k · 2019-05-12T23:51:05Z

This might be a workaround. But I think the URL should not be put into the model in the first place. The URL should be read from a YAML or env file (config, .env or endpoints). We'll give the environment variable a go. Thanks.

stale · 2019-08-11T00:24:27Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

stale · 2019-08-18T00:27:36Z

This issue has been automatically closed due to inactivity. Please create a new issue if you need more help.

sankaran45 · 2019-11-14T10:10:24Z

I found the same problem: if I change the duckling http url in the config file, it requires a complete retrain. Please consider fixing this as it's very unintuitive. I spent a lot of time trying to figure out why the url change was not getting picked up before stumbling on this.

erohmensing · 2019-11-14T13:38:05Z

I agree. Not sure how best to handle this: either the URL should be part of the endpoints.yml (but it would still need to be definable in the config for NLU-only models? 🤔) or its value shouldn't influence the fingerprinting.
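
For illustration, the second option (a fingerprint that ignores the URL) could look roughly like the sketch below. This is not how Rasa computes its fingerprints; `fingerprint_pipeline` and `RUNTIME_ONLY_KEYS` are hypothetical names, and the assumption is that the pipeline is available as a list of dicts shaped like the `config.yml` entries above.

```python
import hashlib
import json
from typing import Any, Dict, List

# Keys assumed to affect only where a service is reached, not what is trained.
RUNTIME_ONLY_KEYS = {"url"}


def fingerprint_pipeline(pipeline: List[Dict[str, Any]]) -> str:
    """Hash the pipeline config while ignoring runtime-only keys."""
    stripped = [
        {k: v for k, v in component.items() if k not in RUNTIME_ONLY_KEYS}
        for component in pipeline
    ]
    # sort_keys makes the hash independent of key order within a component.
    payload = json.dumps(stripped, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()
```

Two configs that differ only in the duckling `url` would then produce the same fingerprint and would not trigger a retrain. Dropping `url` from every component is deliberately coarse; a real change would more likely let each component declare which of its keys are runtime-only.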

erohmensing · 2019-11-18T10:55:46Z

@wochinge in progress but no assignee?

wochinge · 2019-11-18T11:19:56Z

Thanks, fixed it :-)

akelad · 2020-01-27T09:39:26Z

@wochinge why are we not fixing that?

wochinge · 2020-01-27T11:03:01Z

Because

- it's messy in the code
- it's only a tiny, tiny improvement if we don't retrain when the duckling url is changed (how often are you changing your duckling url?)

So basically the ratio of benefit to effort is very bad.

c12k · 2020-01-27T21:25:11Z

We have to change the duckling url regularly because dev and prod environments are different. So frequency of needing to change this is daily.
for docker-compose "url": "http://duckling:8000",
for k8s "url": "http://localhost:8000"

Yoomtah · 2020-01-28T01:11:14Z

Actually @cmcc13 the prod url is now "duckling.default.svc.cluster.local.:8000" and that could change later if we do more advanced GKE service stuff. But in dev we just want to docker-compose up and let docker sort out all the networking. So @wochinge it's a bit of a headache for our team to not have this all in a config file.

wochinge · 2020-01-28T08:56:39Z

@Yoomtah

As far as I understand it, you have two setups, correct? One docker-compose and one K8s? And are they completely separate or are you sharing the trained models between them? Because if you are not sharing the models between these two deployments, then you have to retrain either way.

Yoomtah · 2020-01-29T02:10:22Z

For each chatbot that we have (I believe it's 5), we have two model files: model-dev and model-prod. These models are identical except that they were trained with different duckling URLs. Depending on our environment we then build a Rasa docker container with one of these files.

The duckling URL is the only thing necessitating two training runs and managing two model files for each bot. We have a different action URL for dev and prod as well but this is easily changed in the endpoints.yml file.

wochinge · 2020-01-30T16:44:11Z

Ah, I think I'm getting it now.

1. You train a model in the dev environment and decide it's worth promoting to the prod environment.
2. You can't promote it to the prod environment, because the duckling url is different, and then you have to retrain it, right?

Would an easy workaround be to add an alias for duckling to your hosts file? (https://en.wikipedia.org/wiki/Hosts_(file))

nmelche · 2020-11-14T10:43:07Z

This problem still exists and is very unintuitive. Every endpoint can be configured in the endpoints.yml except the duckling part. It makes automated deployment, e.g. via helm, very messy.

s-montes-majorel · 2021-07-23T11:53:37Z

@wochinge, is there any chance this will be changed at some point? In order to avoid messing up our /etc/hosts file, we ended up budgeting for a separate, global Duckling server, used both for dev and prod. We are not completely satisfied but it was the simplest way to avoid multiple trainings of the same model.

wochinge · 2021-07-26T13:03:54Z

@s-montes-majorel This is currently not super high on our priority list 😬 How about using environment variables for this and setting the env variable depending on the context?

s-montes-majorel · 2021-07-27T08:15:03Z

We use an env variable for the Duckling URL. However, changing its value while leaving the config.yml unchanged is still detected as a change that requires retraining. Is the Duckling component used during training? If it isn't, maybe it would make sense to only substitute the env variable at inference time.

I do understand that this may be low in the priority list, though :) For us, it meant budgeting the Duckling component in a different way. It could also be explained in the documentation for people deploying Rasa using separate microservices instead of one big Kubernetes.

wochinge · 2021-08-09T07:59:35Z

Thanks for the explanation!

Is the Duckling component used during training?

It actually isn't 👍🏻 We are currently working on some changes for 3.0 where we could consider this 🤔

It could also be explained in the documentation for people deploying Rasa using separate microservices instead of one big Kubernetes.

What do you mean by "one big Kubernetes"?

s-montes-majorel · 2021-08-16T09:58:50Z

Thanks for the reply!

What do you mean by "one big Kubernetes"?

I meant that the documentation (for Rasa X, in particular) assumes one big deployment (using Helm, Docker Compose, etc.) that takes care of all the microservices. Our current implementation is more modular. Most of the components (the tracker DB, the Duckling API, a custom event broker using Pub/Sub) run on different machines. We also have different environments (dev, pre, pro). We managed to make everything work using environment variables, except for the Duckling URL.

akelad added the status:more-details-needed Waiting for the user to provide more details / stacktraces / answer a question label May 10, 2019

no-response bot removed the status:more-details-needed Waiting for the user to provide more details / stacktraces / answer a question label May 12, 2019

stale bot added the status:stale label Aug 11, 2019

stale bot closed this as completed Aug 18, 2019

erohmensing changed the title ~~nlu duckling url in model metadata.json creates problem with docker vs k8s~~ Changing duckling url shouldn't require a model retrain Nov 14, 2019

erohmensing added type:enhancement ✨ Additions of new features or changes to existing ones, should be doable in a single PR and removed status:stale labels Nov 14, 2019

wochinge reopened this Nov 18, 2019

wochinge added the area:rasa-oss 🎡 Anything related to the open source Rasa framework label Nov 18, 2019

wochinge added the resolution:wontfix issue is acknowledged but we will not work on this (nor will we accept contributions) label Jan 24, 2020

wochinge closed this as completed Jan 24, 2020

wochinge mentioned this issue Aug 9, 2021

Implement Recipe to convert current config to GraphSchema #9277

Closed

4 tasks
