Optimize TSDataset.describe #1341

Closed
Mr-Geekman opened this issue Aug 1, 2023 · 0 comments · Fixed by #1344
Labels
enhancement New feature or request

Mr-Geekman commented Aug 1, 2023

🚀 Feature Request

We should optimize the TSDataset.describe method: it can consume up to 30% of the total computation time during a backtest with NaiveModel on 10k segments.

Proposal

In the current implementation, the bottleneck is TSDataset._gather_segments_data, which should be optimized. The problem lies in its per-segment iteration.

Possible solutions:

  • Vectorization
  • Optimization of one iteration
  • Rewriting the loop using numba
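
To illustrate the vectorization option, here is a minimal sketch and not the actual etna implementation. It assumes a wide DataFrame whose columns are a MultiIndex with levels named "segment" and "feature", that every segment has at least one non-missing target value, and that only a few illustrative statistics are needed. The function names describe_loop and describe_vectorized are hypothetical.

```python
import numpy as np
import pandas as pd


def describe_loop(df: pd.DataFrame) -> pd.DataFrame:
    """Baseline: gather statistics one segment at a time (slow for many segments)."""
    rows = {}
    for segment in df.columns.get_level_values("segment").unique():
        target = df[(segment, "target")]
        start = target.first_valid_index()
        end = target.last_valid_index()
        window = target.loc[start:end]  # label slicing is inclusive on both ends
        rows[segment] = {
            "start_timestamp": start,
            "end_timestamp": end,
            "length": len(window),
            "num_missing": int(window.isna().sum()),
        }
    return pd.DataFrame.from_dict(rows, orient="index")


def describe_vectorized(df: pd.DataFrame) -> pd.DataFrame:
    """Same statistics computed for all segments at once with NumPy."""
    target = df.xs("target", axis=1, level="feature")
    valid = target.notna().to_numpy()  # shape: (n_timestamps, n_segments)
    n_timestamps = valid.shape[0]
    # Position of the first and last non-missing value in each column.
    first = valid.argmax(axis=0)
    last = n_timestamps - 1 - valid[::-1].argmax(axis=0)
    length = last - first + 1
    return pd.DataFrame(
        {
            "start_timestamp": df.index[first].to_numpy(),
            "end_timestamp": df.index[last].to_numpy(),
            "length": length,
            "num_missing": length - valid.sum(axis=0),
        },
        index=target.columns,
    )


# Tiny example with two segments and one feature.
index = pd.date_range("2023-01-01", periods=5, freq="D")
columns = pd.MultiIndex.from_product(
    [["a", "b"], ["target"]], names=["segment", "feature"]
)
values = np.array(
    [[np.nan, 1.0], [1.0, 2.0], [np.nan, np.nan], [3.0, 4.0], [4.0, np.nan]]
)
df = pd.DataFrame(values, index=index, columns=columns)

loop_result = describe_loop(df)
vectorized_result = describe_vectorized(df)
```

Both functions should return the same statistics. The vectorized version replaces the Python-level loop over segments with a handful of array operations, which is where most of the gain would be expected on 10k segments.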

As an alternative we could optimize the places where TSDataset.describe is used:

Test cases

Make sure current tests pass.

Additional context

Connected issues: #1336.

@Mr-Geekman Mr-Geekman added the enhancement New feature or request label Aug 1, 2023
@github-project-automation github-project-automation bot moved this to Specification in etna board Aug 1, 2023
@Mr-Geekman Mr-Geekman moved this from Specification to Todo in etna board Aug 1, 2023
@Mr-Geekman Mr-Geekman self-assigned this Aug 2, 2023
@Mr-Geekman Mr-Geekman moved this from Todo to In Progress in etna board Aug 2, 2023
@Mr-Geekman Mr-Geekman moved this from In Progress to In Review in etna board Aug 2, 2023
@github-project-automation github-project-automation bot moved this from In Review to Done in etna board Aug 4, 2023