Chronos: add [how to tune a forecaster model] notebook (intel-analytics#5399)

* add how_to_tune_a_forecaster_model notebook

* modify step1: data preparation

* update based how to train

* modify a bug of summary

* update based on comment

* move to howto

* fix typo

* add file

* update based on comments

* rename the tile
rnwang04 authored and ForJadeForest committed Sep 20, 2022
1 parent 0da86d0 commit eda0a35
Showing 3 changed files with 308 additions and 2 deletions.
@@ -0,0 +1,300 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/intel-analytics/BigDL/blob/main/docs/readthedocs/source/doc/Chronos/Howto/how_to_tune_forecaster_model.ipynb)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Tune forecaster on single node\n",
"\n",
"\n",
"## Introduction\n",
"\n",
"In this guide, we demonstrate **how to tune a forecaster on a single node**. During tuning, the forecaster searches a user-defined space for the best hyperparameter combination, a common step when users pursue higher accuracy.\n",
"\n",
"Chronos supports hyperparameter tuning for forecasting models through 2 separate APIs (i.e. `Forecaster.tune` and `AutoTSEstimator`) for users with different demands:\n",
"\n",
"| |`Forecaster.tune`|`AutoTSEstimator`|\n",
"|-------------------|:---------------:|:---------------:|\n",
"|Single Node |✓ |✓ |\n",
"|Cluster |X |✓ |\n",
"|Performance-aware Tuning|✓ |X |\n",
"|Feature Selection |X |✓ |\n",
"|Customized Model |X |✓ |\n",
"\n",
"`Forecaster.tune` provides an easier and more straightforward API for users who are familiar with Chronos forecasters; it is recommended to try this method first.\n",
"\n",
"We will use `AutoformerForecaster` and the nyc_taxi dataset as an example in this guide."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setup\n",
"\n",
"Before we begin, we need to install Chronos if it isn't already available; we choose PyTorch as the deep learning backend."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"!pip install --pre --upgrade bigdl-chronos[pytorch,automl]\n",
"!pip uninstall -y torchtext # uninstall torchtext to avoid version conflict\n",
"exit()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Data preparation\n",
"\n",
"First, we load the nyc taxi dataset.\n",
"\n",
"Currently, the `tune` function only supports **NumPy ndarray input**."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.preprocessing import StandardScaler\n",
"from bigdl.chronos.data.repo_dataset import get_public_dataset\n",
"\n",
"def get_tsdata():\n",
" name = 'nyc_taxi'\n",
" tsdata_train, tsdata_valid, _ = get_public_dataset(name)\n",
" stand_scaler = StandardScaler()\n",
" for tsdata in [tsdata_train, tsdata_valid]:\n",
" tsdata.impute(mode=\"linear\")\\\n",
" .scale(stand_scaler, fit=(tsdata is tsdata_train))\n",
" return tsdata_train, tsdata_valid\n",
"\n",
"tsdata_train, tsdata_valid = get_tsdata()\n",
"\n",
"input_feature_num = 1\n",
"output_feature_num = 1\n",
"lookback = 20\n",
"horizon = 1\n",
"label_len = 10\n",
"\n",
"train_data = tsdata_train.roll(lookback=lookback, horizon=horizon, label_len=label_len, time_enc=True).to_numpy()\n",
"val_data = tsdata_valid.roll(lookback=lookback, horizon=horizon, label_len=label_len, time_enc=True).to_numpy()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`train_data` and `val_data` are composed of `(x, y, x_enc, y_enc)` since we set `time_enc=True`, which is only necessary for Autoformer."
]
},
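The rolling step above can be sketched in plain Python. This is a conceptual illustration only, not the Chronos implementation, and it ignores the time-encoding arrays: each sample's `x` holds `lookback` past steps, while `y` holds the last `label_len` steps of `x` followed by the `horizon` future steps (the Autoformer-style decoder input).

```python
def roll_windows(series, lookback, horizon, label_len):
    """Conceptual sketch of rolling a 1-D series into (x, y) samples.

    x: `lookback` past steps; y: the last `label_len` steps of x
    followed by the `horizon` future steps.
    """
    xs, ys = [], []
    for i in range(len(series) - lookback - horizon + 1):
        xs.append(series[i : i + lookback])
        ys.append(series[i + lookback - label_len : i + lookback + horizon])
    return xs, ys

series = list(range(100))  # toy stand-in for the scaled nyc_taxi values
xs, ys = roll_windows(series, lookback=20, horizon=1, label_len=10)
print(len(xs), len(xs[0]), len(ys[0]))  # 80 20 11
```

Note that each `y` has length `label_len + horizon`, and its first `label_len` values overlap the tail of the corresponding `x`.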
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Tuning\n",
"\n",
"The first step of tuning a forecaster is to define the forecaster with search-space parameters.\n",
"\n",
"There are several common space choices:\n",
"\n",
"`space.Categorical` : search space for hyperparameters which are categorical, e.g. a = space.Categorical('a', 'b', 'c', 'd')\n",
"\n",
"`space.Real` : search space for numeric hyperparameter that takes continuous values, e.g. learning_rate = space.Real(0.01, 0.1, log=True)\n",
"\n",
"`space.Int` : search space for numeric hyperparameter that takes integer values, e.g. range = space.Int(0, 100)\n",
"\n",
"Choosing these hyperparameters can be tricky and is largely experience-based, but `lr`, `d_model`, `d_ff`, the number of layers, and similar parameters usually have a great impact on performance."
]
},
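Why `log=True` matters for a learning-rate space: sampling uniformly in log space spreads trials evenly across orders of magnitude instead of concentrating them near the upper bound. A minimal stdlib sketch of the idea (this is an illustration, not the `bigdl.nano` implementation):

```python
import math
import random

def sample_log_uniform(low, high, rng=random.Random(0)):
    """Sample uniformly in log space, as space.Real(low, high, log=True) does conceptually."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))

# With a plain uniform draw over [0.0001, 0.1], ~90% of samples would exceed 0.01;
# log-uniform sampling gives each decade (1e-4..1e-3, 1e-3..1e-2, 1e-2..1e-1) equal weight.
samples = [sample_log_uniform(0.0001, 0.1) for _ in range(5)]
assert all(0.0001 <= s <= 0.1 for s in samples)
```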
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import bigdl.nano.automl.hpo.space as space\n",
"from bigdl.chronos.forecaster.autoformer_forecaster import AutoformerForecaster\n",
"\n",
"autoformer = AutoformerForecaster(input_feature_num=input_feature_num,\n",
" output_feature_num=output_feature_num,\n",
" past_seq_len=lookback,\n",
" future_seq_len=horizon,\n",
" label_len=label_len,\n",
" seed=1024,\n",
" freq='t',\n",
" loss=\"mse\",\n",
" metrics=['mae', 'mse', 'mape'],\n",
" lr = space.Real(0.0001, 0.1, log=True),\n",
" d_model=space.Categorical(32, 64, 128, 256),\n",
" d_ff=space.Categorical(32, 64, 128, 256),\n",
" e_layers=space.Categorical(1,2),\n",
" n_head=space.Categorical(1,8))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Then just call `tune` on the training data and validation data!\n",
"\n",
"In addition to the data, there are three parameters which **need** to be specified: `n_trials`, `target_metric` and `direction` (or `directions` for multi-objective HPO).\n",
"\n",
"`n_trials`: the number of trials to run. More trials take longer but generally yield better results.\n",
"\n",
"`target_metric`: the target metric to optimize, a string or an instance of torchmetrics.metric.Metric, defaults to 'mse'. For multi-objective HPO, pass a list instead, e.g. ['mse', 'latency'], where latency is a built-in metric for performance.\n",
"\n",
"`direction`: the direction in which to optimize the target metric, \"maximize\" or \"minimize\", defaults to \"minimize\". For multi-objective HPO, set direction=None and specify `directions`, a list containing a direction for each metric, e.g. ['minimize', 'minimize'].\n",
"\n",
"There are two other parameters whose default values you **may** change: `epochs` and `batch_size`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"autoformer.tune(train_data, validation_data=val_data,\n",
" n_trials=10, target_metric='mse', direction=\"minimize\")"
]
},
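What `direction` controls can be illustrated with a tiny stand-alone sketch (conceptual only; the real search is driven by the HPO backend): each trial produces a value of the target metric, and the best trial is the one that extremizes it in the given direction.

```python
def best_trial(trials, direction="minimize"):
    """Pick the best trial by its target-metric value.

    trials: list of (params, metric_value) pairs, e.g. one pair per HPO trial.
    """
    pick = min if direction == "minimize" else max
    return pick(trials, key=lambda t: t[1])

# Hypothetical trial results (params, mse) for illustration only.
trials = [({"lr": 0.01}, 0.32), ({"lr": 0.001}, 0.18), ({"lr": 0.05}, 0.45)]
params, mse = best_trial(trials, direction="minimize")
print(params, mse)  # {'lr': 0.001} 0.18
```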
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Then, you can see the whole trial history by calling `search_summary()`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"autoformer.search_summary()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"After `tune`, the model parameters of the autoformer are **initialized** according to the best trial's parameters. You need to fit the model again."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"autoformer.fit(train_data, epochs=4, batch_size=32)\n",
"# evaluate on val set\n",
"evaluate = autoformer.evaluate(val_data)\n",
"print(evaluate)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Save and load (optional)\n",
"\n",
"After tuning and fitting, you can save your model by calling `save` with a filename."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"autoformer.save(checkpoint_file=\"best.ckpt\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Then, when you need to load the model weights, just call `load` with the corresponding filename."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"autoformer.load(checkpoint_file=\"best.ckpt\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Or, in a new session, just define a new forecaster with the **six necessary parameters: input_feature_num, output_feature_num, past_seq_len, future_seq_len, label_len, and freq**, then `load` with the corresponding filename."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"new_autoformer = AutoformerForecaster(input_feature_num=input_feature_num,\n",
" output_feature_num=output_feature_num,\n",
" past_seq_len=lookback,\n",
" future_seq_len=horizon,\n",
" label_len=label_len,\n",
" freq='t')\n",
"new_autoformer.load(checkpoint_file=\"best.ckpt\")"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.8.10 64-bit",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.10"
},
"vscode": {
"interpreter": {
"hash": "31f2aee4e71d21fbe5cf8b01ff0e069b9275f58929596ceb00d14d90e3e16cd6"
}
}
},
"nbformat": 4,
"nbformat_minor": 2
}
8 changes: 7 additions & 1 deletion docs/readthedocs/source/doc/Chronos/Howto/index.rst
@@ -4,13 +4,19 @@ How-to guides are bite-sized, executable examples where users could check when m

Forecasting
-------------------------
-* `Train forcaster on single node <how_to_train_forecaster_on_one_node.html>`__
+* `Train forecaster on single node <how_to_train_forecaster_on_one_node.html>`__

In this guide, **we demonstrate how to train forecasters on one node**. During training, the forecaster learns the patterns (like the period, scale...) in the history data. Although Chronos supports training on a cluster, it's highly recommended to try one node first before allocating a cluster, to make life easier.

* `Tune forecaster on single node <how_to_tune_forecaster_model.html>`__

In this guide, we demonstrate **how to tune a forecaster on a single node**. During tuning, the forecaster searches a user-defined space for the best hyperparameter combination, a common step when users pursue higher accuracy.


.. toctree::
    :maxdepth: 1
    :hidden:

    how_to_train_forecaster_on_one_node
    how_to_tune_forecaster_model
@@ -209,7 +209,7 @@ def tune(self,
    def search_summary(self):
        # add tuning check
        invalidOperationError(self.use_hpo, "No search summary when HPO is disabled.")
-        return self.trainer.search_summary()
+        return self.tune_trainer.search_summary()

    def fit(self, data, validation_data=None, epochs=1, batch_size=32, validation_mode='output',
            earlystop_patience=1, use_trial_id=None):
