
Fix DeepARModel and TFTModel to work with changed prediction_size #1251

Merged
merged 4 commits into master from issue-802
May 3, 2023

Conversation


@Mr-Geekman Mr-Geekman commented Apr 28, 2023

Before submitting (must do checklist)

  • Did you read the contribution guide?
  • Did you update the docs? We use Numpy format for all the methods and classes.
  • Did you write any new necessary tests?
  • Did you update the CHANGELOG?

Proposed Changes

  1. Keep the non-deterministic behavior of DeepARModel. If someone wants deterministic behavior, they can use seed_everything (see the sketch after this list).
  2. Fix some inference tests to work with changed prediction horizon.
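
A minimal sketch of that option (the seed value is illustrative; the full setup appears in the scripts later in this thread):

from lightning_fabric.utilities.seed import seed_everything

# call once before fit/forecast; seeds python's `random`, numpy and torch
seed_everything(0)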

Closing issues

Closes #802.

@Mr-Geekman Mr-Geekman self-assigned this Apr 28, 2023
@Mr-Geekman Mr-Geekman changed the title from "Fix inference tests for DeepARModel and TFTModel" to "Fix DeepARModel and TFTModel to work with changed prediction_size" Apr 28, 2023

@github-actions github-actions bot temporarily deployed to pull request April 28, 2023 15:40
codecov-commenter commented Apr 28, 2023

Codecov Report

Merging #1251 (59353fc) into master (7cbc065) will increase coverage by 0.35%.
The diff coverage is 85.71%.

@@            Coverage Diff             @@
##           master    #1251      +/-   ##
==========================================
+ Coverage   87.31%   87.67%   +0.35%     
==========================================
  Files         175      175              
  Lines       10330    10330              
==========================================
+ Hits         9020     9057      +37     
+ Misses       1310     1273      -37     
Impacted Files             Coverage Δ
etna/models/nn/utils.py    85.10% <85.71%> (+3.94%) ⬆️

... and 5 files with indirect coverage changes


@@ -139,19 +140,17 @@ def test_forecast_model_equals_pipeline(example_tsds):
horizon = 10
pfdb = _get_default_dataset_builder(horizon)

import torch # TODO: remove after fix at issue-802
Contributor

We don't make a fix, do we?

We use the seed as in the past anyway.

Contributor Author

Yes, I thought that it isn't really a problem that it isn't deterministic. If someone needs it to be deterministic, they can fix the seeds.

@@ -2,6 +2,7 @@

import pandas as pd
import pytest
from lightning_fabric.utilities.seed import seed_everything
Contributor

It seems like we don't have that package in pyproject.toml

Contributor Author

As I understand, it is part of the pytorch_lightning package (source). I first tried to use pytorch_lightning.utilities.seed, but it is deprecated in favor of lightning_fabric.utilities.seed.
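
For reference, a sketch of the import migration described above (the deprecated path is kept as a comment):

# deprecated import path:
# from pytorch_lightning.utilities.seed import seed_everything
# preferred import path, shipped together with pytorch_lightning:
from lightning_fabric.utilities.seed import seed_everything

seed_everything(0)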

# `TimeSeriesDataSet.from_parameters` in predict mode ignores `min_prediction_length`,
# and we can change prediction size only by changing `max_prediction_length`
dataset_params = deepcopy(self.pf_dataset_params)
dataset_params["max_prediction_length"] = horizon
Contributor

It seems this could change the behaviour.
Have you checked both results, before and after the change?

Contributor Author

I'll explain the core of the problem. The problem is that max_prediction_length is set during training. You can set min_prediction_length, but it is ignored during forecasting and set equal to max_prediction_length. That is how pytorch_forecasting works.

It leads to a situation where you can't make a forecast on a dataset with a smaller horizon than was used during training: the model expects to forecast max_prediction_length points. A sketch of the workaround is below.
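
This is a minimal sketch of the workaround, mirroring the diff above; the function name and the pf_dataset_params/df_flat arguments are illustrative, not the actual etna internals:

from copy import deepcopy

from pytorch_forecasting import TimeSeriesDataSet

def build_prediction_dataset(pf_dataset_params, df_flat, horizon):
    # `min_prediction_length` is ignored in predict mode, so shrink the max instead
    dataset_params = deepcopy(pf_dataset_params)
    dataset_params["max_prediction_length"] = horizon
    # predict=True builds one decoder window per series, anchored at the series end
    return TimeSeriesDataSet.from_parameters(dataset_params, df_flat, predict=True)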

I'll write a report about the identity of the results below in this discussion.

Contributor Author

I have a script:

from pytorch_forecasting.data import GroupNormalizer
from lightning_fabric.utilities.seed import seed_everything

from etna.datasets import TSDataset
from etna.datasets import generate_ar_df
from etna.models.nn import DeepARModel, PytorchForecastingDatasetBuilder
from etna.pipeline import Pipeline


def main():
    # load data
    df = generate_ar_df(periods=100, n_segments=3, start_time="2020-01-01", freq="D", random_seed=0)
    ts = TSDataset(df=TSDataset.to_dataset(df), freq="D")

    # fit pipeline
    builder = PytorchForecastingDatasetBuilder(
        max_encoder_length=5,
        max_prediction_length=5,
        time_varying_known_reals=["time_idx"],
        time_varying_unknown_reals=["target"],
        target_normalizer=GroupNormalizer(groups=["segment"]),
    )
    model = DeepARModel(dataset_builder=builder, trainer_params=dict(max_epochs=5), lr=0.01)
    pipeline = Pipeline(model=model, horizon=5)
    seed_everything(0)
    pipeline.fit(ts)

    # forecast
    ts_forecast_1 = pipeline.forecast()
    print(ts_forecast_1.to_pandas(flatten=True))


if __name__ == "__main__":
    main()

On this branch the result is:

    timestamp    segment    target
0  2020-04-10  segment_0  5.302200
1  2020-04-11  segment_0  5.245512
2  2020-04-12  segment_0  5.072897
3  2020-04-13  segment_0  4.956703
4  2020-04-14  segment_0  4.994010
5  2020-04-10  segment_1  8.561027
6  2020-04-11  segment_1  8.674180
7  2020-04-12  segment_1  8.962281
8  2020-04-13  segment_1  8.377350
9  2020-04-14  segment_1  8.384933
10 2020-04-10  segment_2 -6.026866
11 2020-04-11  segment_2 -6.008684
12 2020-04-12  segment_2 -5.824715
13 2020-04-13  segment_2 -6.108476
14 2020-04-14  segment_2 -5.785951

The result is the same on the current master branch.

Code behavior isn't the same for all cases, because the new version fixes the inference tests. But I think this example shows that in the normal scenario we haven't changed the logic of DeepARModel.

Contributor

In this example both max_prediction_length and horizon are the same.
I guess a difference could arise when the horizon is smaller than max_prediction_length in the case of transformers (maybe if there is a difference it's a bug in the source library; we shouldn't have a bug in the case of causal transformers).

Contributor Author

Another experiment:

from pytorch_forecasting.data import GroupNormalizer
from lightning_fabric.utilities.seed import seed_everything

from etna.datasets import TSDataset
from etna.datasets import generate_ar_df
from etna.models.nn import DeepARModel, PytorchForecastingDatasetBuilder


def main():
    # load data
    df = generate_ar_df(periods=100, n_segments=3, start_time="2020-01-01", freq="D", random_seed=0)
    ts = TSDataset(df=TSDataset.to_dataset(df), freq="D")

    # fit pipeline
    builder = PytorchForecastingDatasetBuilder(
        max_encoder_length=5,
        max_prediction_length=5,
        time_varying_known_reals=["time_idx"],
        time_varying_unknown_reals=["target"],
        target_normalizer=GroupNormalizer(groups=["segment"]),
    )
    model = DeepARModel(dataset_builder=builder, trainer_params=dict(max_epochs=5), lr=0.01)
    seed_everything(0)
    model.fit(ts)

    # forecast
    future = ts.make_future(future_steps=3, tail_steps=model.context_size)
    result = model.forecast(future, prediction_size=3)
    print(result.to_pandas(flatten=True))


if __name__ == "__main__":
    main()

On the current master branch the script fails with the error AssertionError: filters should not remove entries all entries - check encoder/decoder lengths and lags.

On this branch it works fine, with the result:

   timestamp    segment    target
0 2020-04-10  segment_0  5.302200
1 2020-04-11  segment_0  5.245512
2 2020-04-12  segment_0  5.072897
3 2020-04-10  segment_1  8.561027
4 2020-04-11  segment_1  8.674180
5 2020-04-12  segment_1  8.962281
6 2020-04-10  segment_2 -6.026866
7 2020-04-11  segment_2 -6.008684
8 2020-04-12  segment_2 -5.824716

Contributor Author

@Mr-Geekman Mr-Geekman May 2, 2023


The goal of this change was exactly to make it possible to forecast with a smaller horizon. We have some inference tests that worked in a similar scenario before tsdataset-2.0 and stopped working after it. So I thought that we should make it work as before.

@Mr-Geekman Mr-Geekman requested a review from martins0n May 2, 2023 15:02
@github-actions github-actions bot temporarily deployed to pull request May 2, 2023 16:44
@Mr-Geekman Mr-Geekman merged commit 6396851 into master May 3, 2023
@Mr-Geekman Mr-Geekman deleted the issue-802 branch May 3, 2023 06:55
Linked issue: [BUG] Make DeepARModel deterministic (#802)