Implement train_step instead of overwriting fit #7438

tabergma · 2020-12-02T16:04:53Z

Proposed changes:

Related to https://github.com/RasaHQ/research/issues/97
(Training is currently not working when `fit` takes a dataset with increasing batch size, see tensorflow/tensorflow#41019. We have an open PR against TensorFlow and will probably copy over some code in the meantime.)
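For readers unfamiliar with the approach named in the title, here is a minimal sketch of overriding `train_step` on a `tf.keras.Model` so that the stock `fit` loop can be reused instead of being reimplemented. It is not the code from this PR: `LinearModel`, the toy data, and the hand-written mean squared error are illustrative assumptions only.

```python
import numpy as np
import tensorflow as tf


class LinearModel(tf.keras.Model):
    """Hypothetical toy model; stands in for a real model such as TED."""

    def __init__(self) -> None:
        super().__init__()
        self.dense = tf.keras.layers.Dense(1)
        # Track the loss ourselves so `fit` can report and reset it per epoch.
        self.loss_tracker = tf.keras.metrics.Mean(name="loss")

    def call(self, inputs, training=False):
        return self.dense(inputs)

    @property
    def metrics(self):
        # Metrics listed here are reset by Keras at the start of each epoch.
        return [self.loss_tracker]

    def train_step(self, data):
        # `fit` calls this once per batch; batching, epochs and callbacks
        # stay under the control of Keras.
        x, y = data
        with tf.GradientTape() as tape:
            y_pred = self(x, training=True)
            loss = tf.reduce_mean(tf.square(y - y_pred))
        gradients = tape.gradient(loss, self.trainable_variables)
        self.optimizer.apply_gradients(zip(gradients, self.trainable_variables))
        self.loss_tracker.update_state(loss)
        return {"loss": self.loss_tracker.result()}


# Toy regression data: y = 3x + 2.
x = np.random.uniform(-1.0, 1.0, size=(64, 1)).astype("float32")
y = 3.0 * x + 2.0

model = LinearModel()
model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=0.1))
history = model.fit(x, y, batch_size=16, epochs=20, verbose=0)
```

Because only the per-batch logic is customized, callbacks and the `History` object returned by `fit` keep working as usual.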

Status (please check what you already did):

added some tests for the functionality
updated the documentation
updated the changelog (please check changelog for instructions)
reformat files using black (please check Readme for instructions)

rasa/core/policies/ted_policy.py

rasa/utils/tensorflow/models.py

rasa/utils/train_utils.py

rasa/utils/tensorflow/callback.py

rasa/utils/tensorflow/data_generator.py

tabergma · 2020-12-11T07:56:02Z

The results of the model regression tests do not match the results I get when testing locally. I need to investigate that further.

joejuzl · 2021-01-20T15:56:17Z

I profiled the memory usage while running tests/core/test_policies.py on master, and on this branch.

Memory profiles (plots not preserved):

master
train_step (commit 4ea8d8, current head)
train_step (commit 07d63c, before Keras clear_session)

Observations:

The tests take much longer to run on train_step.
Memory usage is much higher on train_step, even at the same relative point in time.
The Keras clear_session call seemed to help somewhat.
All three runs show a continuous increase in memory usage, suggesting a memory leak.
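The comment does not say which profiler produced the plots. As a rough sketch of how a "continuous increase" can be checked, the snippet below uses the standard-library `tracemalloc` module to record how much Python-allocated memory is still held after each repeated call. The helper `memory_after_each_run` and the two toy functions are hypothetical, and `tracemalloc` only sees allocations made through Python's allocator, so native TensorFlow memory would need a process-level tool instead.

```python
import gc
import tracemalloc
from typing import Callable, List


def memory_after_each_run(fn: Callable[[], None], runs: int = 5) -> List[int]:
    """Return the traced memory (in bytes) still allocated after each call."""
    tracemalloc.start()
    usage: List[int] = []
    try:
        for _ in range(runs):
            fn()
            gc.collect()  # drop collectable garbage before measuring
            current, _peak = tracemalloc.get_traced_memory()
            usage.append(current)
    finally:
        tracemalloc.stop()
    return usage


_retained: List[bytes] = []


def leaky() -> None:
    # Keeps a reference alive on every call, so memory grows steadily.
    _retained.append(bytes(100_000))


def clean() -> None:
    # The buffer is released as soon as the function returns.
    _ = bytes(100_000)


leaky_usage = memory_after_each_run(leaky)
clean_usage = memory_after_each_run(clean)
leaky_growth = leaky_usage[-1] - leaky_usage[0]
clean_growth = clean_usage[-1] - clean_usage[0]
```

A steadily growing series, as for `leaky`, is the pattern described in the observations above; a flat series, as for `clean`, is what a leak-free run would look like.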

Ghostvv · 2021-02-02T11:36:56Z

@joejuzl could you please give it a final review?

joejuzl · 2021-03-03T08:15:11Z

#8085 should help to reduce the memory usage of the policy tests and stop the CI tests from failing.

Ghostvv · 2021-03-09T09:55:51Z

@joejuzl thank you for updating the tests. Could you please merge your changes into this branch? There are some merge conflicts.

Ghostvv reviewed Dec 2, 2020

rasa/core/policies/ted_policy.py (outdated, resolved)

tabergma marked this pull request as ready for review December 7, 2020 09:45

tabergma added status:model-regression-tests runner:gpu labels Dec 7, 2020

tabergma requested a review from Ghostvv December 7, 2020 09:48

tabergma added status:model-regression-tests and removed status:model-regression-tests labels Dec 7, 2020

github-actions bot deleted a comment from tabergma Dec 7, 2020

Ghostvv reviewed Dec 7, 2020

tabergma removed runner:gpu status:model-regression-tests labels Dec 8, 2020

Ghostvv reviewed Dec 8, 2020

rasa/utils/tensorflow/models.py (outdated, resolved)

tabergma added status:model-regression-tests runner:gpu labels Dec 8, 2020

tabergma requested a review from Ghostvv December 8, 2020 14:18

github-actions bot removed status:model-regression-tests runner:gpu labels Dec 8, 2020

Ghostvv mentioned this pull request Dec 9, 2020

Attention weight logging #5673

Merged

4 tasks

tabergma added status:model-regression-tests runner:gpu labels Dec 10, 2020

tabergma mentioned this pull request Dec 10, 2020

Implement train_in_chunks on classifiers #7394

Closed

github-actions bot removed status:model-regression-tests runner:gpu labels Dec 10, 2020

tabergma added the status:model-regression-tests label Dec 11, 2020

github-actions bot deleted a comment from tabergma Dec 11, 2020

github-actions bot removed the status:model-regression-tests label Dec 11, 2020

tabergma added the status:model-regression-tests label Dec 11, 2020

github-actions bot removed the status:model-regression-tests label Dec 11, 2020

tabergma added the status:model-regression-tests label Dec 11, 2020

Base automatically changed from master to main January 22, 2021 11:15

joejuzl mentioned this pull request Jan 27, 2021

Create CI check to detect training memory leaks #7827

Closed

3 tasks

Ghostvv and others added 6 commits January 27, 2021 14:48

merge master

483fbdb

Merge branch 'main' into train_step

4d50ccd

implement custom rasa predict, because keras one is leaky

58f6a92

Merge branch 'main' into train_step

a57b42a

add docstrings

7d6e7e7

Disable verbose mode for pytest

4a9a596

Ghostvv requested a review from joejuzl February 2, 2021 12:13

joejuzl reviewed Feb 2, 2021

.github/workflows/continous-integration.yml (outdated, resolved)

rasa/core/policies/ted_policy.py (outdated, resolved)

add windows tests back

327318a

joejuzl approved these changes Feb 2, 2021

Ghostvv and others added 3 commits February 2, 2021 18:26

Merge branch 'main' into train_step

0470773

Merge branch 'main' into train_step

e9eb4fc

merge master

80c7219

Ghostvv mentioned this pull request Feb 12, 2021

reduce amount of memory required for our unit tests that are run on PRs #7947

Closed

Ghostvv and others added 2 commits February 12, 2021 17:12

Merge branch 'main' into train_step

c8aa42d

Merge branch 'main' into train_step

0ffdb1b

joejuzl and others added 4 commits March 9, 2021 13:51

Merge remote-tracking branch 'origin/main' into train_step

ba47c96

fix merge conflicts

6463251

Merge branch 'main' into train_step

4cd5548

Merge branch 'main' into train_step

4054bf7

Ghostvv enabled auto-merge (squash) March 10, 2021 13:06

Ghostvv merged commit 46c1a97 into main Mar 10, 2021

Ghostvv deleted the train_step branch March 10, 2021 13:48

dakshvar22 mentioned this pull request Apr 30, 2021

proof of concept for streaming messages in nlu training #8518

Closed
