🚀 Feature Request
We should optimize the method TSDataset.describe, because it can consume up to 30% of the total computation time during a backtest of NaiveModel with 10k segments.
Proposal
In the current implementation the bottleneck is TSDataset._gather_segments_data, and that method should be optimized. The problem lies in its per-segment iteration.
Possible solutions (sketches for the first and last options follow this list):
Vectorization
Optimization of a single iteration
Rewriting the loop using numba
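A minimal sketch of the vectorized option, assuming the wide frame has a timestamp index and MultiIndex columns (segment, feature) with a "target" feature, and that _gather_segments_data reports per-segment start/end timestamps, length, and missing-value counts — assumptions about TSDataset internals, not confirmed API. The function name and returned columns are hypothetical:

```python
import numpy as np
import pandas as pd


def gather_segments_data_vectorized(df: pd.DataFrame) -> pd.DataFrame:
    """Compute per-segment stats for all segments at once (no Python loop).

    Assumes ``df`` is the wide frame: timestamp index, MultiIndex columns
    ``(segment, feature)`` containing a ``"target"`` feature.
    """
    target = df.loc[:, pd.IndexSlice[:, "target"]]
    segments = target.columns.get_level_values(0)
    values = target.to_numpy(dtype=float)

    notna = ~np.isnan(values)
    # Position of the first and last non-NaN row in every column at once.
    # Caveat: an all-NaN segment would need special handling (argmax -> 0).
    start_pos = notna.argmax(axis=0)
    end_pos = len(values) - 1 - notna[::-1].argmax(axis=0)

    length = end_pos - start_pos + 1
    num_missing = length - notna.sum(axis=0)

    return pd.DataFrame(
        {
            "start_timestamp": df.index[start_pos],
            "end_timestamp": df.index[end_pos],
            "length": length,
            "num_missing": num_missing,
        },
        index=pd.Index(segments, name="segment"),
    )
```

And a sketch of the numba option under the same assumptions; the helper name is hypothetical, and the inner loops are just the naive per-column scan that numba compiles to machine code:

```python
import numpy as np
from numba import njit


@njit(cache=True)
def _segment_bounds_and_gaps(values):
    """For every column: first/last non-NaN row and the NaN count in between."""
    n_rows, n_cols = values.shape
    start = np.full(n_cols, -1, dtype=np.int64)
    end = np.full(n_cols, -1, dtype=np.int64)
    missing = np.zeros(n_cols, dtype=np.int64)
    for j in range(n_cols):
        # Scan forward for the first observed value.
        for i in range(n_rows):
            if not np.isnan(values[i, j]):
                start[j] = i
                break
        # Scan backward for the last observed value.
        for i in range(n_rows - 1, -1, -1):
            if not np.isnan(values[i, j]):
                end[j] = i
                break
        # Count gaps strictly inside the observed range.
        if start[j] >= 0:
            for i in range(start[j], end[j] + 1):
                if np.isnan(values[i, j]):
                    missing[j] += 1
    return start, end, missing
```

The vectorized version trades memory (it materializes a full boolean mask) for speed; the numba version keeps memory flat but adds a compilation dependency.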
As an alternative, we could optimize the places where TSDataset.describe is used (a sketch follows the list):
BasePipeline._make_predict_timestamps
FoldMask.validate_on_dataset
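If BasePipeline._make_predict_timestamps and FoldMask.validate_on_dataset only need per-segment boundaries (an assumption about those call sites), a narrow helper could replace the full describe call there. The name segment_time_bounds and the column layout are the same hypothetical assumptions as above:

```python
import pandas as pd


def segment_time_bounds(df: pd.DataFrame) -> pd.DataFrame:
    """Only the start/end timestamp per segment, skipping the rest of describe.

    Same layout assumption as above: MultiIndex columns ``(segment, feature)``
    with a ``"target"`` feature.
    """
    notna = df.loc[:, pd.IndexSlice[:, "target"]].notna()
    return pd.DataFrame(
        {
            # idxmax returns the label of the first True, i.e. the first/last
            # timestamp with an observed value (all-NaN segments need care).
            "start_timestamp": notna.idxmax().values,
            "end_timestamp": notna[::-1].idxmax().values,
        },
        index=pd.Index(notna.columns.get_level_values(0), name="segment"),
    )
```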
Test cases
Make sure the current tests pass.
Additional context
Connected issues: #1336.