
forecast about autoregression #9

Open · reoml opened this issue Jul 31, 2024 · 5 comments

Comments
@reoml commented Jul 31, 2024

What I did: I trained on one data file and obtained a Timer model for the MS task. I then wanted to test the resulting model on another file, so I changed the dataset to read two files, changed the index for the test file, and set is_finetuning to zero.

What I expected: I set all the targets to zero, since I assumed masked autoregression would be used during test prediction. However, I found that the predictions still depend on the labels. Is there a mistake in my understanding of transformer decoding? How can I make Timer autoregress without using the labels?

@ZDandsomSP (Collaborator) commented

Hello, in the inference phase of Timer's autoregression, only the lookback window is needed. During inference, Timer produces the rolling prediction for every segment of the lookback window in parallel, and we take only the prediction for the last segment as the final result. This process may differ from autoregressive rolling in language models, but the overall logic is consistent.
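In other words (a minimal sketch of this parallel, decoder-style scheme; `model` here is a dummy stand-in, not the actual Timer implementation):

```python
import torch

patch_len = 96
n_patches = 7
lookback = torch.randn(1, n_patches * patch_len)    # [B, L] lookback series
patches = lookback.unfold(1, patch_len, patch_len)  # [B, n_patches, patch_len]

def model(x):
    # Stand-in for a causal, patch-level transformer: input patch i
    # maps to a prediction of patch i + 1.
    return torch.zeros_like(x)

out = model(patches)   # [B, n_patches, patch_len], all segments in one pass
forecast = out[:, -1]  # only the last segment's prediction is kept
```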

@reoml (Author) commented Aug 6, 2024

> Hello, in the inference phase of Timer's autoregression, only the lookback window is needed. During inference, Timer produces the rolling prediction for every segment of the lookback window in parallel, and we take only the prediction for the last segment as the final result. This process may differ from autoregressive rolling in language models, but the overall logic is consistent.

My understanding of autoregression is as follows: I have initial features and a label sequence. I use the initial data to predict the next label, then shift the time step forward, using the newly predicted label together with the previous features and labels (dropping the oldest one, since a new label has been predicted) to predict the next label. This operates like a sliding window, as you mentioned; see the sketch below.

However, after reviewing the source code, I noticed the following: the batch size is set to 1, and the code loops over batches, updating the predicted labels within each batch and then embedding them into X. This means the predictions from the current batch cannot be used for the next batch. In addition, each batch feeds the labels from X into the model. If masking were applied inside the model, the model would not see the labels, and this would not be a big issue. But when I set the label data to all zeros, the predictions collapsed into a straight line. This indicates that the model is using label data during test-time inference.

I am not sure where my understanding is wrong, and I don't understand why the inference code is written this way, which is why I am asking this question.
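Concretely, the rolling procedure I have in mind looks like this (a minimal sketch with a hypothetical one-step `model`, not the repository code):

```python
import torch

patch_len = 96
window = torch.randn(1, 7 * patch_len)  # initial lookback window

def model(x):
    # Hypothetical one-step predictor: returns the next patch_len values.
    return torch.zeros(x.shape[0], patch_len)

preds = []
for _ in range(4):             # roll out 4 future patches
    next_patch = model(window)
    preds.append(next_patch)
    # Slide the window: drop the oldest patch, append the new prediction.
    window = torch.cat([window[:, patch_len:], next_patch], dim=1)

forecast = torch.cat(preds, dim=1)  # [1, 4 * patch_len]
```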
[Two screenshots of the repository's inference code were attached.]

@WenWeiTHU (Collaborator) commented

Sorry, I'm a bit confused. Why would autoregressive predictions need to use data from other batches in the scenario you describe?

@JamesGOAT commented

I think the confusion stems from the fact that pred_len and patch_len are essentially redundant parameters. In the model's implementation, each forward pass predicts the next patch_len timestamps, whereas in the dataloader, batch_x and batch_y are offset by pred_len timestamps. If pred_len and patch_len differ, this introduces a logical inconsistency. Note that in the forecasting scripts, both parameters are set to the same value.
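To illustrate with concrete numbers (a toy sketch; the exact index conventions in the repository may differ):

```python
seq_len, patch_len = 672, 96

for pred_len in (96, 192):
    # batch_x covers timestamps [0, seq_len); the dataloader places the
    # target window pred_len steps ahead: [seq_len, seq_len + pred_len).
    target_end = seq_len + pred_len
    # One forward pass only predicts the next patch_len timestamps.
    model_horizon = seq_len + patch_len
    print(f"pred_len={pred_len}: target ends at {target_end}, "
          f"model horizon ends at {model_horizon}, "
          f"aligned={target_end == model_horizon}")
```

With pred_len=96 the two line up; with pred_len=192 the target extends past what a single forward pass can produce.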

@WenWeiTHU (Collaborator) commented

We released a new codebase, OpenLTM, which contains a more detailed pipeline for pre-training and running inference with large time-series models. A refined autoregression pipeline is also provided there.
