fix: Chronos inference in foundation ts arena #382

abdulfatir · 2024-06-03T18:58:34Z

Thank you for evaluating Chronos again. It's great to see it performing accurately on this benchmark as well.

We found some problems with the way inference is being done for Chronos:

Excess NaN padding was applied to short time series. This padding is not required and slows the model down significantly.
The original time series were cast to bfloat16, which loses information and may hurt accuracy.
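To illustrate the second point, here is a minimal sketch of what rounding to bfloat16 does to values on the scale of a typical series. It is not the code from this PR: the helper `to_bfloat16` is a hypothetical, standard-library-only emulation of the cast (round to nearest, ties to even), and the sample values are made up for illustration.

```python
import struct


def to_bfloat16(x: float) -> float:
    """Emulate a cast to bfloat16 for a finite value and return it as a float.

    Hypothetical helper for illustration only; NaN and infinity are not handled.
    """
    # Reinterpret the float32 encoding of x as a 32-bit unsigned integer.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # bfloat16 keeps only the top 16 bits of float32 (8 exponent bits,
    # 7 stored significand bits). Round to nearest, ties to even.
    rounding_bias = 0x7FFF + ((bits >> 16) & 1)
    bits = ((bits + rounding_bias) >> 16) << 16
    return struct.unpack("<f", struct.pack("<I", bits))[0]


# Made-up series values; between 512 and 1024 bfloat16 can only
# represent multiples of 4, so neighbouring observations collapse.
series = [1001.0, 1002.0, 1003.0, 1004.0]
print([to_bfloat16(v) for v in series])  # [1000.0, 1000.0, 1004.0, 1004.0]
```

With only 8 bits of significand precision, a steadily increasing series turns into a step function, which is the kind of information loss described above.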

This PR fixes these issues. The following table shows a comparison of Chronos (Large)'s performance before (taken from the original table in this repo) and after these fixes, and also reports the performance of other variants of Chronos. These experiments were performed on a g5.4xlarge instance, as in the original study.

	Accuracy (MASE, lower is better)				Inference Time (minutes)
	Monthly	Weekly	Daily	Hourly	Monthly	Weekly	Daily	Hourly
Chronos-Large (Before)	0.960	0.709	0.652	0.735	38.581	5.081	7.908	11.662
Chronos-Large	0.950	0.704	0.652	0.654	5.402	5.054	7.882	11.500
Chronos-Base	0.966	0.709	0.663	0.646	1.966	1.712	2.940	4.714
Chronos-Small	0.982	0.724	0.669	0.671	0.689	0.550	0.986	1.818
Chronos-Mini	0.968	0.736	0.682	0.729	0.476	0.356	0.688	1.371
Chronos-Tiny	0.976	0.765	0.686	0.799	0.316	0.212	0.427	0.965

We observe:

Improvements in MASE on the Monthly (~1%) and Hourly (~11%) datasets.
A significant reduction in inference time (~38 min to ~5 min) on the Monthly subset, which has many very short time series.
Smaller Chronos models offer a quality-speed trade-off: the Base model performs almost as well as Large while being much faster, and even the Mini model performs better than most baselines in the original study.
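The percentages quoted above can be reproduced from the Chronos-Large rows of the table. The snippet below is only a sanity-check sketch; `relative_improvement` is a hypothetical helper name, and the inputs are copied from the table.

```python
def relative_improvement(before: float, after: float) -> float:
    """Percentage reduction from `before` to `after` (positive means better for MASE)."""
    return (before - after) / before * 100


# Chronos-Large MASE, before and after the fix (values from the table).
monthly_gain = relative_improvement(0.960, 0.950)
hourly_gain = relative_improvement(0.735, 0.654)

# Chronos-Large inference time on the Monthly subset, in minutes.
monthly_speedup = 38.581 / 5.402

print(f"Monthly MASE improvement: {monthly_gain:.1f}%")
print(f"Hourly MASE improvement: {hourly_gain:.1f}%")
print(f"Monthly inference speed-up: {monthly_speedup:.1f}x")
```

The Monthly speed-up works out to roughly 7x, while the Weekly, Daily, and Hourly inference times are essentially unchanged.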

Here is how the average MASE ranking plots look before and after the fix:

After the fix, Chronos-Large achieves the best overall rank (center plot). Chronos-Base obtains the same overall ranking as TimesFM and TimeGPT (right plot).

For the fidelity of the study, we recommend that the authors update their results and discussions accordingly, ideally after an independent verification with the latest code change (see usage below). Thank you again for your effort!

Usage

Download the data and set up the environment as described here.
Run python eval-chronos.py to re-evaluate (only) Chronos.

CLAassistant · 2024-06-03T18:58:43Z

All committers have signed the CLA.

abdulfatir · 2024-06-07T16:43:56Z

@AzulGarza @cchallu @mergenthaler Did you get a chance to take a look at this? I hope the main results in the repo can be updated soon so people do not get an inaccurate impression.

AzulGarza · 2024-07-04T14:55:43Z

hey @abdulfatir! thank you. could you please sign the CLA?

abdulfatir · 2024-07-04T15:11:03Z

@AzulGarza thanks for your reply. Signed.

abdulfatir · 2024-07-16T08:56:36Z

@AzulGarza @mergenthaler @cchallu Any update on this?

fix: Chronos inference in foundation ts arena

6c645aa
