
Incremental training regression tests #7544

Closed
wants to merge 76 commits

Conversation

dakshvar22
Contributor

Proposed changes:

  • ...

Status (please check what you already did):

  • added some tests for the functionality
  • updated the documentation
  • updated the changelog (please check changelog for instructions)
  • reformat files using black (please check Readme for instructions)

dakshvar22 and others added 17 commits December 11, 2020 10:24
…7504)

* Use fingerprinting for finetuning and add more tests

* Use all training labels for fingerprinting

* rename to action_names
* doc strings and changes needed to cvf

* added tests, small refactoring in cvf

* refactor regex featurizers and fix tests

* added tests for regex featurizer, comments and doc strings

* rename 'finetune_mode' parameter inside load

* address review comments, make ML components inside NLU loadable in finetune mode.

* try resetting default additional slots in cvf to 0, see if results go back to normal

* revert default in regex also, to see if model regression tests pass

* rectify how regex featurizer is loaded

* revert back defaults for additional vocab params in cvf and regex

* add default minimum for cvf as well

* Load core model in fine-tuning mode

* Core finetune loading test

* Test and PR comments

* Fallback to default epochs

* Test policy and ensemble fine-tuning exception cases

* Remove epoch_override from Policy.load

* Apply suggestions from code review

Co-authored-by: Tobias Wochinger <[email protected]>

* review comments and add tests for loaded diet and rs

* fix regex tests

* use kwargs

* fix

* fix train tests

* More test fixes

* Apply suggestions from code review

Co-authored-by: Daksh Varshneya <[email protected]>

* remove unneeded sklearn epochs

* Apply suggestions from code review

Co-authored-by: Tobias Wochinger <[email protected]>

* PR comments for warning strings

* Add typing

* add back invalid model tests

* handle empty sections in config

* review comments

* make core models finetunable

* add tests finetuning core policies

* add print for loaded model

* add vocabulary stats logging for cvf

* code quality

* review comments

* reduce number of finetuning epochs in tests

* Use fingerprinting for finetuning and add more tests

* review comments

* review comments

* fix tests

* Use all training labels for fingerprinting

* rename to action_names

Co-authored-by: Joseph Juzl <[email protected]>
Co-authored-by: Tobias Wochinger <[email protected]>
* Add migration guide for policies

* spelling fix

* changelog
@github-actions
Contributor

Hey @dakshvar22! 👋 To run model regression tests, comment with the /modeltest command and a configuration.

Tip 💡: The model regression tests run on push events. You can re-run them by re-adding the status:model-regression-tests label or by using the Re-run jobs button in the GitHub Actions workflow.

Tip 💡: Whenever you want to change the configuration, edit the comment that contains the previous configuration.

You can copy the following into your comment and customize it:

/modeltest

```yml
##########
## Available datasets
##########
# - "Carbon Bot"
# - "Hermit"
# - "Private 1"
# - "Private 2"
# - "Private 3"
# - "Sara"

##########
## Available configurations
##########
# - "BERT + DIET(bow) + ResponseSelector(bow)"
# - "BERT + DIET(seq) + ResponseSelector(t2t)"
# - "Spacy + DIET(bow) + ResponseSelector(bow)"
# - "Spacy + DIET(seq) + ResponseSelector(t2t)"
# - "Sparse + BERT + DIET(bow) + ResponseSelector(bow)"
# - "Sparse + BERT + DIET(seq) + ResponseSelector(t2t)"
# - "Sparse + DIET(bow) + ResponseSelector(bow)"
# - "Sparse + DIET(seq) + ResponseSelector(t2t)"
# - "Sparse + Spacy + DIET(bow) + ResponseSelector(bow)"
# - "Sparse + Spacy + DIET(seq) + ResponseSelector(t2t)"

## Example configuration
#################### syntax #################
## include:
##   - dataset: ["<dataset_name>"]
##     config: ["<configuration_name>"]
#
## Example:
## include:
##  - dataset: ["Carbon Bot"]
##    config: ["Sparse + DIET(bow) + ResponseSelector(bow)"]
#
## Shortcut:
## You can use the "all" shortcut to include all available configurations or datasets
#
## Example: Use the "Sparse + DIET(bow) + ResponseSelector(bow)" configuration
## for all available datasets
## include:
##  - dataset: ["all"]
##    config: ["Sparse + DIET(bow) + ResponseSelector(bow)"]
#
## Example: Use all available configurations for the "Carbon Bot" and "Sara" datasets
## and for the "Hermit" dataset use the "Sparse + DIET + ResponseSelector(T2T)" and
## "BERT + DIET + ResponseSelector(T2T)" configurations:
## include:
##  - dataset: ["Carbon Bot", "Sara"]
##    config: ["all"]
##  - dataset: ["Hermit"]
##    config: ["Sparse + DIET(seq) + ResponseSelector(t2t)", "BERT + DIET(seq) + ResponseSelector(t2t)"]
#
## Example: Define a branch name to check-out for a dataset repository. Default branch is 'master'
## dataset_branch: "test-branch"
## include:
##  - dataset: ["Carbon Bot", "Sara"]
##    config: ["all"]


include:
 - dataset: ["Carbon Bot"]
   config: ["Sparse + DIET(bow) + ResponseSelector(bow)"]

```
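The `include` matrix above is a cross-product: each entry pairs every listed dataset with every listed configuration, and `"all"` expands to the full list. A minimal sketch of that expansion, where `expand_include` and the hard-coded lists are illustrative assumptions rather than the actual workflow code:

```python
from itertools import product

# Datasets and configurations as listed in the comment above.
DATASETS = ["Carbon Bot", "Hermit", "Private 1", "Private 2", "Private 3", "Sara"]
CONFIGS = [
    "Sparse + DIET(bow) + ResponseSelector(bow)",
    "Sparse + DIET(seq) + ResponseSelector(t2t)",
    # ... remaining configurations from the list above
]

def expand_include(include):
    """Expand each include entry into (dataset, config) pairs,
    resolving the "all" shortcut against the known lists."""
    jobs = []
    for entry in include:
        datasets = DATASETS if entry["dataset"] == ["all"] else entry["dataset"]
        configs = CONFIGS if entry["config"] == ["all"] else entry["config"]
        jobs.extend(product(datasets, configs))
    return jobs

jobs = expand_include([
    {"dataset": ["Carbon Bot"], "config": ["Sparse + DIET(bow) + ResponseSelector(bow)"]}
])
print(jobs)  # [('Carbon Bot', 'Sparse + DIET(bow) + ResponseSelector(bow)')]
```

With `"all"` in both fields, a single entry fans out into one CI job per dataset/configuration pair, which matches the result tables further down.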

@github-actions
Contributor

/modeltest

```yml
include:
 - dataset: ["all"]
   config: ["Sparse + BERT + DIET(bow) + ResponseSelector(bow)", "Sparse + BERT + DIET(seq) + ResponseSelector(t2t)", "Sparse + DIET(bow) + ResponseSelector(bow)", "Sparse + DIET(seq) + ResponseSelector(t2t)", "Sparse + Spacy + DIET(bow) + ResponseSelector(bow)", "Sparse + Spacy + DIET(seq) + ResponseSelector(t2t)"]
```

@github-actions github-actions bot deleted a comment from dakshvar22 Dec 14, 2020
@github-actions
Contributor

The model regression tests have started. This may take a while, so please be patient.
As soon as the results are ready, you'll see a new comment with them.

The configuration used can be found in the comment above.

@github-actions
Contributor

Commit: 212eff2. The full report is available as an artifact.

Dataset: Carbon Bot, Dataset repository branch: master

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
| --- | --- | --- | --- |
| Sparse + BERT + DIET(bow) + ResponseSelector(bow)<br>test: 1m28s, train: 4m10s, total: 5m37s | 0.8097 (0.02) | 0.7529 (0.00) | 0.5581 (no data) |
| Sparse + BERT + DIET(seq) + ResponseSelector(t2t)<br>test: 1m36s, train: 4m53s, total: 6m28s | 0.8039 (0.01) | 0.7925 (-0.00) | 0.5533 (0.04) |
| Sparse + DIET(bow) + ResponseSelector(bow)<br>test: 27s, train: 2m56s, total: 3m23s | 0.7359 (0.02) | 0.7529 (0.00) | 0.5232 (no data) |
| Sparse + DIET(seq) + ResponseSelector(t2t)<br>test: 40s, train: 4m6s, total: 4m45s | 0.7437 (0.01) | 0.7079 (0.01) | 0.5099 (0.01) |

Dataset: Hermit, Dataset repository branch: master

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
| --- | --- | --- | --- |
| Sparse + BERT + DIET(bow) + ResponseSelector(bow)<br>test: 3m18s, train: 20m58s, total: 24m15s | 0.8690 (0.00) | 0.7504 (0.00) | no data |
| Sparse + BERT + DIET(seq) + ResponseSelector(t2t)<br>test: 2m42s, train: 13m26s, total: 16m7s | 0.8643 (0.00) | 0.7919 (-0.00) | no data |
| Sparse + DIET(bow) + ResponseSelector(bow)<br>test: 57s, train: 20m24s, total: 21m21s | 0.8318 (-0.01) | 0.7504 (0.00) | no data |
| Sparse + DIET(seq) + ResponseSelector(t2t)<br>test: 1m9s, train: 12m35s, total: 13m44s | 0.8346 (-0.00) | 0.7503 (-0.01) | no data |

Dataset: Private 1, Dataset repository branch: master

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
| --- | --- | --- | --- |
| Sparse + DIET(bow) + ResponseSelector(bow)<br>test: 20s, train: 3m52s, total: 4m11s | 0.9033 (-0.00) | 0.9612 (0.00) | no data |
| Sparse + DIET(seq) + ResponseSelector(t2t)<br>test: 37s, train: 3m22s, total: 3m58s | 0.8992 (-0.01) | 0.9745 (-0.00) | no data |
| Sparse + Spacy + DIET(bow) + ResponseSelector(bow)<br>test: 1m21s, train: 5m11s, total: 6m32s | 0.8929 (0.00) | 0.9574 (0.00) | no data |
| Sparse + Spacy + DIET(seq) + ResponseSelector(t2t)<br>test: 1m29s, train: 4m31s, total: 5m59s | 0.9064 (0.00) | 0.9698 (-0.00) | no data |

Dataset: Private 2, Dataset repository branch: master

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
| --- | --- | --- | --- |
| Sparse + DIET(bow) + ResponseSelector(bow)<br>test: 28s, train: 4m1s, total: 4m28s | 0.8519 (0.01) | no data | no data |
| Sparse + DIET(seq) + ResponseSelector(t2t)<br>test: 35s, train: 5m15s, total: 5m49s | 0.8552 (0.01) | no data | no data |
| Sparse + Spacy + DIET(bow) + ResponseSelector(bow)<br>test: 1m18s, train: 5m3s, total: 6m21s | 0.8412 (-0.02) | no data | no data |
| Sparse + Spacy + DIET(seq) + ResponseSelector(t2t)<br>test: 1m22s, train: 7m12s, total: 8m33s | 0.8594 (0.00) | no data | no data |

Dataset: Private 3, Dataset repository branch: master

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
| --- | --- | --- | --- |
| Sparse + DIET(seq) + ResponseSelector(t2t)<br>test: 28s, train: 45s, total: 1m12s | 0.8189 (-0.01) | no data | no data |
| Sparse + Spacy + DIET(bow) + ResponseSelector(bow)<br>test: 1m16s, train: 1m55s, total: 3m10s | 0.8107 (-0.02) | no data | no data |
| Sparse + Spacy + DIET(seq) + ResponseSelector(t2t)<br>test: 1m16s, train: 1m32s, total: 2m48s | 0.8642 (0.01) | no data | no data |

Dataset: Sara, Dataset repository branch: master

| Configuration | Intent Classification Micro F1 | Entity Recognition Micro F1 | Response Selection Micro F1 |
| --- | --- | --- | --- |
| Sparse + BERT + DIET(bow) + ResponseSelector(bow)<br>test: 2m22s, train: 8m5s, total: 10m27s | 0.8697 (-0.00) | 0.8683 (0.00) | 0.8957 (0.01) |
| Sparse + BERT + DIET(seq) + ResponseSelector(t2t)<br>test: 2m28s, train: 4m48s, total: 7m16s | 0.8756 (0.00) | 0.8944 (-0.00) | 0.9000 (0.01) |
| Sparse + DIET(bow) + ResponseSelector(bow)<br>test: 37s, train: 5m50s, total: 6m27s | 0.8374 (0.01) | 0.8683 (0.00) | 0.8630 (0.01) |
| Sparse + DIET(seq) + ResponseSelector(t2t)<br>test: 52s, train: 4m5s, total: 4m57s | 0.8433 (-0.01) | 0.8523 (0.02) | 0.8761 (-0.00) |
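The parenthesized values in the results above are deltas against the previous run's micro F1 scores. A minimal sketch of how such cells could be parsed and gated against a tolerance; the `flag_regressions` helper, the cell format, and the 0.02 tolerance are illustrative assumptions, not the actual workflow code:

```python
import re

# Matches cells like "0.8433 (-0.01)"; "no data" cells do not match.
CELL = re.compile(r"(?P<score>\d\.\d+)\s+\((?P<delta>-?\d+\.\d+)\)")

def flag_regressions(cells, tolerance=0.02):
    """Return the cells whose delta drops more than `tolerance`
    below the previous run's score."""
    flagged = []
    for cell in cells:
        m = CELL.match(cell)
        if not m:  # "no data" cells carry no comparable delta
            continue
        if float(m.group("delta")) < -tolerance:
            flagged.append(cell)
    return flagged

print(flag_regressions(["0.8433 (-0.01)", "0.8523 (0.02)", "no data"]))  # []
print(flag_regressions(["0.8107 (-0.05)"]))  # ['0.8107 (-0.05)']
```

Under such a rule, every row above is within tolerance, which is consistent with this PR's tests passing.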
