
Including RulePolicy leads to infinite max_history being used to deduplicate training trackers #8295

Closed
1 task
indam23 opened this issue Mar 25, 2021 · 21 comments · Fixed by #8851
Assignees
Labels
  • area:rasa-oss 🎡: Anything related to the open source Rasa framework
  • area:rasa-oss/training-data: Issues focused around Rasa training data (stories, NLU, domain, etc.)
  • effort:atom-squad/2: Label which is used by the Rasa Atom squad to do internal estimation of task sizes.
  • type:bug 🐛: Inconsistencies or issues which will cause an issue or problem for users or implementors.

Comments

@indam23
Contributor

indam23 commented Mar 25, 2021

Rasa version: 2.4.1

Issue:

When rules/RulePolicy are included in dialogue management, an effectively infinite max_history is used to check for conflicts with stories. This greatly increases the time it takes to load and process dialogue data.

Command or request that led to error (train, not validate):

rasa train core

Definition of Done

  • Set max_history to the lowest possible value during rasa train core, based on the training data
@indam23 indam23 added type:bug 🐛 Inconsistencies or issues which will cause an issue or problem for users or implementors. area:rasa-oss 🎡 Anything related to the open source Rasa framework labels Mar 25, 2021
@Ghostvv
Contributor

Ghostvv commented Mar 25, 2021

One can read the rules first, calculate the length of the longest rule, and then use that as RulePolicy's effective max_history, which can be compared with the max_history of the other policies to calculate the unique_last_num_states limit for deduplication of generated trackers.
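The suggestion above can be sketched as follows. This is only an illustration; `rule_lengths` and `policy_max_histories` are hypothetical inputs, not actual Rasa APIs.

```python
# Hypothetical sketch: derive the dedup horizon from rule lengths and the
# max_history settings of the other policies. Not Rasa's real implementation.
from typing import List, Optional


def unique_last_num_states(
    rule_lengths: List[int],
    policy_max_histories: List[Optional[int]],
) -> Optional[int]:
    """Finite dedup horizon, or None if some policy needs unlimited history."""
    if any(h is None for h in policy_max_histories):
        return None  # a policy with unlimited max_history forces full-history dedup
    rule_history = max(rule_lengths, default=0)  # effective RulePolicy max_history
    return max([rule_history, *policy_max_histories], default=0)
```

With this, two rules of length 3 and 5 plus policies configured with max_history 5 and 2 would give a dedup horizon of 5 instead of "infinite".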

@indam23 indam23 assigned Ghostvv and unassigned Ghostvv Mar 25, 2021
@Ghostvv Ghostvv changed the title Including RulePolicy leads to infinite max_history being used to determine story conflicts Including RulePolicy leads to infinite max_history being used to deduplicate training trackers Mar 25, 2021
@TyDunn TyDunn added the area:rasa-oss/training-data Issues focused around Rasa training data (stories, NLU, domain, etc.) label Apr 22, 2021
@wochinge wochinge added the effort:atom-squad/2 Label which is used by the Rasa Atom squad to do internal estimation of task sizes. label Apr 23, 2021
@TyDunn TyDunn assigned wochinge and unassigned Imod7 May 25, 2021
@wochinge
Contributor

@melindaloubser1 I tried to reproduce this on Sara, but the validation is super quick. Note that max_history currently has to be set manually (I tried both with it set and with the default/none set), and we don't distinguish between story and rule validation.

I also don't see why max_history should be a big deal, as rules are more or less enforced to be very short anyway. cc @JEM-Mosig

@wochinge
Contributor

And from what I can see in the code, the included policies shouldn't matter for rasa data validate stories at all.

@indam23
Contributor Author

indam23 commented May 31, 2021

No, you're right; I think I miswrote this and it applies only to loading during rasa train. Would that make more sense?

@wochinge
Contributor

That makes more sense, I think 🤔 Are you then suggesting to set max_history here in an automated fashion? What do you think of that @samsucik?

@samsucik
Contributor

samsucik commented Jun 1, 2021

@wochinge I explored the relevant code parts and I agree with automatically setting max_history to a value as low as possible. However, we can't compute this value before training the policy, right? Because a rule can lead to a long sequence of states even if it has just a single user intent (the intent can be followed by many actions), and so we need to see the actual rule data to determine the minimal max_history.

@wochinge
Contributor

wochinge commented Jun 2, 2021

Can't we loop over the rule_trackers which we receive for training and count the numbers of actions for each rule tracker to determine the longest one and use that in turn to set max_history?
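This proposal can be sketched in a few lines: scan the rule trackers once, count the action events per rule, and take the maximum. Trackers are simplified here to plain lists of event-type strings; none of these names are real Rasa identifiers.

```python
# Sketch under assumptions: a "rule tracker" is a list of event-type strings,
# and the rule's length is its number of "action" events.
from typing import List


def max_history_from_rules(rule_trackers: List[List[str]]) -> int:
    """Length of the longest rule, measured in action events."""
    return max(
        (sum(1 for event in tracker if event == "action") for tracker in rule_trackers),
        default=0,
    )
```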

@samsucik
Contributor

samsucik commented Jun 2, 2021

Yeah, fair, that's what would need to happen... Like this, right?

  1. initialise all policies
  2. load the data and determine max_history
  3. update rule policy's max_history

What about batches, though? I think we'd like to use the same number of turns for tracker deduplication across batches. And perhaps we can say that rule data won't come in batches, but story data will... Maybe we'd need an "initial" pass through the data to determine things like this unique_last_num_states value -- kind of similar to an initial pass in the NLU pipeline for the purposes of constructing the tokeniser's vocab? 🤔 I guess I'm less worried about it for now, but more in the context of Rasa 3.0 🙂

@wochinge
Contributor

wochinge commented Jun 2, 2021

load the data and determine max_history

We can even do this during train within the RulePolicy, no? In my opinion we shouldn't leak such a RulePolicy-specific thing to code outside of the RulePolicy.

I don't understand the part about the batches. The RulePolicy has no concept of batches or whatsoever. Do you mean training in chunks?

@samsucik
Contributor

samsucik commented Jun 2, 2021

We can even do this during train within the RulePolicy, no?

I think this doesn't work because we use the determined max_history value in loading the training stories for other policies, or not? 🤔

I don't understand the part about the batches. The RulePolicy has no concept of batches or whatsoever. Do you mean training in chunks?

Fair, we call these "chunks", not "batches". I know that RulePolicy currently doesn't do anything with chunks, but I was worried about this in terms of future versions of Rasa where, if we have some general way of loading data from disk and providing it to different policies, I don't see a good reason for saying "any policy can consume data in chunks, but RulePolicy can't". But maybe I'm misunderstanding the extent to which Rasa 3.0 will be flexible regarding this. If there's still the guarantee that all rule data will be loaded at once (and before any story data), then I'm not worried at all.

@wochinge
Contributor

wochinge commented Jun 2, 2021

I think this doesn't work because we use the determined max_history value in loading the training stories for other policies, or not? 🤔

We do but that they can just use whatever max_history they want to use, right? The policies are not dependent on each other.

I know that RulePolicy currently doesn't do anything with chunks, but I was worried about this in terms of future versions of Rasa where

I think you made an excellent point here.

@samsucik
Contributor

samsucik commented Jun 3, 2021

The policies are not dependent on each other.

Hm, fair point, I didn't think about it this way. I worried about it because right now we load & de-duplicate all the Core data at once, for all policies, and so all policies receive trackers that have been de-duplicated using the same unique_last_num_states value. If each policy receives data de-duplicated based on that concrete policy's max_history, then what you're saying sounds perfect and my worry vanishes 🙂
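The de-duplication being discussed here can be sketched as follows. Trackers are simplified to lists of hashable states, and the function name is illustrative rather than Rasa's actual implementation.

```python
# Sketch: de-duplicate trackers on their last `unique_last_num_states` states.
# None means "compare full history" (the effectively infinite max_history case).
from typing import List, Optional, Sequence


def dedupe(
    trackers: List[Sequence[str]],
    unique_last_num_states: Optional[int],
) -> List[Sequence[str]]:
    seen = set()
    unique = []
    for states in trackers:
        if unique_last_num_states is None:
            key = tuple(states)
        else:
            key = tuple(states[-unique_last_num_states:])
        if key not in seen:
            seen.add(key)
            unique.append(states)
    return unique
```

This makes the trade-off visible: a smaller horizon merges more trackers (faster training, less data), while None keeps every distinct full history.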

@wochinge
Contributor

wochinge commented Jun 3, 2021

If each policy receives data de-duplicated based on that concrete policy's max_history, then what you're saying sounds perfect and my worry vanishes 🙂

I'm not sure that I understand this. Are you saying that all policies must have the same max_history because things would go wrong otherwise? I was under the impression that every policy gets full trackers (deduplicated and augmented but nothing else) and then the featurizer in the policy handles featurization of the trackers based on the max_history param which we pass into it.

@samsucik
Contributor

samsucik commented Jun 4, 2021

I'm getting lost in my own thoughts and am making mistakes along the way, sorry. I'll try to backtrack and clarify:

  • I'm not against the current approach where we take the maximum max_history, use it for de-duplication of all trackers, and pass these trackers to all policies.
  • The same way, I'm not against each policy using its own max_history (it can even do its own de-duplication of trackers, whatever)

I'll go back to:

The policies are not dependent on each other.

I agree with this. I guess what I had in mind is the fact that we load and de-duplicate all trackers (rules, stories) together. If we can process rules vs stories a bit separately, then I can totally imagine a scenario where we:

  • let the non-rule policies do their things as they do today (determine the maximum max_history value > load all stories > de-duplicate using that value)
  • do a separate thing for RulePolicy where we load the rules > determine max_history from them > de-duplicate the rule trackers

I hope I made things a bit clearer now, sorry for the convoluted thoughts in my earlier replies...

@wochinge
Contributor

wochinge commented Jun 4, 2021

Yes, you did! Rule trackers are already deduplicated separately.
So your 2nd approach would be completely feasible.

I'll do some debugging later to find out where we spend so much time exactly (during deduplication or featurization) and then maybe a PR so we have something more tangible to discuss.

@samsucik
Contributor

samsucik commented Jun 4, 2021

Oh, no, I completely overlooked the separation in the code. I learned something new today...

@wochinge
Contributor

wochinge commented Jun 4, 2021

Investigation results on rasa-demo:

  • The contradiction check takes the bulk of the time (1.2 seconds for the actual training vs. 13 seconds for the contradiction check)
  • Setting a max_history for the RulePolicy doesn't change this
  • Configuring a max_history for the rule generation leads to contradictions 🤔
  • I also profiled the RulePolicy training (see results here, viewable with snakeviz). The results stress that the contradiction check takes the longest. In my opinion this clearly shows that it's not related to max_history but rather to the number of predictions. The biggest part seems to be caused by the json.loads calls. If we actually want to do something, we need to revisit the mechanics of the policy itself (e.g. how data is serialized) and how we can speed up predictions. In my opinion this should be done in a separate issue.

A quick fix for this issue would be to use lru_cache (with maxsize ≥ 1000) on _rule_key_to_state. By doing this I lowered the contradiction check time from 13 seconds to ~4 seconds. The caveat is that this addresses symptoms rather than causes.

@samsucik
Contributor

samsucik commented Jun 4, 2021

@wochinge wow, these are very interesting findings! Btw have you uploaded the right profiling results? In these ones, I don't see json.loads or any other expected bits anywhere...

I agree with creating a separate issue for the root cause.

As for the quick fix using caching, do you think it's safe with regards to Rasa's usage of RAM? If the usage stays below the kind of magnitudes we require for actually training the policies, then it sounds good to me and it would definitely make people's lives (mine included) easier -- way before we get to address the actual cause.

Aaand I'm especially curious about the details around this:

Configuring a max_history for the rule generation leads to contradictions 🤔

@wochinge
Contributor

wochinge commented Jun 7, 2021

Sorry, it was the wrong link 🙈 Updated it with the correct one 👍🏻

As for the quick fix using caching, do you think it's safe with regards to Rasa's usage of RAM?

We can set a maxsize for the cache 👍🏻 It's just a couple of JSONs and their dict representations, so I'd expect it to be tiny.

So next steps:

  • create PR with cache (I include details about memory usage)
  • separate issue for overall performance?

I didn't really dive deeper into it, tbh. I used the max_history which was specified in the config of the AugmentedMemoizationPolicy for Sara. I guess 5 might be too short for some rules?

@samsucik
Contributor

samsucik commented Jun 7, 2021

I used the max_history which was specific in the config of the AugmentedMemoizationPolicy for Sara. I guess 5 might be too short for some rules?

Ah, okay, makes sense. I'm also getting such errors when I use only RulePolicy and hard-code its max_history to anything smaller than 4 🙂

@wochinge
Contributor

wochinge commented Jun 8, 2021

Posting the details for the memory usage here as I haven't created the PR yet and need to put the results somewhere 😄

  • With Sara I get 248 cached items with 1,424,684 hits after training.
  • The cache stores a mapping of call params to output. 248 items have a total size of 862,920 (88,992 + 773,928) bytes. With a maxsize of 1000 items for the cache, the maximum memory usage is 3,479,516 bytes (3.32 MiB), which seems okay to me 👍🏻
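The projection above is straightforward to verify: take the per-item size from the measured 248 items and extrapolate to a full cache of 1000.

```python
# Back-of-the-envelope check of the memory figures reported above.
total_bytes = 88_992 + 773_928       # keys + values for the 248 cached items
per_item = total_bytes / 248         # ~3479.5 bytes per cached item
worst_case = round(per_item * 1000)  # cache capped at 1000 items
print(total_bytes, worst_case)       # 862920, 3479516 (~3.32 MiB)
```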
