Commit b5005ac
Signed-off-by: Niels Bantilan <[email protected]>
1 parent a1dde19
Showing 7 changed files with 194 additions and 0 deletions.
@@ -0,0 +1,31 @@
FROM python:3.8-slim-buster
LABEL org.opencontainers.image.source https://github.com/flyteorg/flytesnacks

WORKDIR /root
ENV VENV /opt/venv
ENV LANG C.UTF-8
ENV LC_ALL C.UTF-8
ENV PYTHONPATH /root

# This is necessary for opencv to work
RUN apt-get update && apt-get install -y libsm6 libxext6 libxrender-dev ffmpeg build-essential curl

WORKDIR /root

ENV VENV /opt/venv
# Virtual environment
RUN python3 -m venv ${VENV}
ENV PATH="${VENV}/bin:$PATH"

# Install Python dependencies
COPY requirements.in /root
RUN pip install -r /root/requirements.in
RUN pip freeze

# Copy the actual code
COPY . /root

# This tag is supplied by the build script and will be used to determine the version
# when registering tasks, workflows, and launch plans
ARG tag
ENV FLYTE_INTERNAL_IMAGE $tag
@@ -0,0 +1,45 @@
(time_series_modeling)=

# Time Series Modeling

```{eval-rst}
.. tags:: Advanced, MachineLearning
```

Time series data is fundamentally different from Independent and Identically
Distributed (IID) data, which is commonly assumed in many machine learning tasks.
Here are a few key differences:

1. **Temporal Dependency**: In time series data, observations are ordered
   chronologically and exhibit temporal dependencies. Each data point is related
   to its past and future values. This sequential nature is crucial for
   forecasting and trend analysis. In contrast, IID data assumes that each
   observation is independent of the others.
2. **Non-stationarity**: Time series often display trends, seasonality, or cyclic
   patterns that evolve over time. This non-stationarity means that statistical
   properties like the mean and variance can change, making analysis more complex.
   IID data, by definition, maintains constant statistical properties.
3. **Autocorrelation**: Time series data frequently shows autocorrelation, where
   an observation is correlated with its own past values. This property is central
   to many time series models but is absent from IID data (see the short sketch
   after this list).
4. **Importance of Order**: The sequence of observations in time series data is
   critical and cannot be shuffled without losing information. In IID data, the
   order of observations is assumed to be irrelevant.
5. **Inference is Focused on Forecasting**: Time series analysis often aims to
   predict future values based on historical patterns, whereas many machine
   learning tasks with IID data focus on classification or regression without
   a temporal component.
6. **Specific Modeling Techniques**: Time series data requires specialized
   modeling techniques like ARIMA, Prophet, or RNNs that can capture temporal
   dynamics. These models are not typically used with IID data.

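
To make the autocorrelation and ordering points concrete, here is a minimal,
illustrative sketch using `pandas` on a small synthetic series (it is not part
of the Flyte examples below): a steadily increasing series has a lag-1
autocorrelation of essentially 1, and shuffling the same values destroys it.

```python
import pandas as pd

# An upward-trending series: each value is close to its predecessor,
# so the lag-1 autocorrelation is essentially 1.
trend = pd.Series(range(100), dtype="float64")
print(trend.autocorr(lag=1))  # ~1.0

# Shuffling the same values removes the temporal structure,
# and the lag-1 autocorrelation drops to roughly 0.
shuffled = trend.sample(frac=1, random_state=0).reset_index(drop=True)
print(shuffled.autocorr(lag=1))  # close to 0
```
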
Understanding these differences is crucial for selecting appropriate analysis
methods and interpreting results in time series modeling tasks.

Below are examples demonstrating how to use Flyte to train time series models.

## Examples

```{auto-examples-toc}
neural_prophet
```
@@ -0,0 +1,4 @@
flytekit>=1.7.0
wheel
matplotlib
flytekitplugins-deck-standard
Empty file.
examples/time_series_modeling/time_series_modeling/neural_prophet.py
109 changes: 109 additions & 0 deletions
@@ -0,0 +1,109 @@
# %% [markdown]
# # Train a Neural Prophet Model
#
# This script demonstrates how to train a model for time series forecasting
# using the [neural prophet](https://neuralprophet.com/) library.

# %% [markdown]
# ## Imports and Setup
#
# First, we import necessary libraries to run the training workflow.

import pandas as pd
from flytekit import current_context, task, workflow, Deck, ImageSpec
from flytekit.types.file import FlyteFile

# %% [markdown]
# ## Define an ImageSpec
#
# For reproducibility, we create an `ImageSpec` object with required packages
# for our tasks.

image = ImageSpec(
    name="neuralprophet",
    packages=[
        "neuralprophet",
        "matplotlib",
        "ipython",
        "pandas",
        "pyarrow",
    ],
    # This registry is for a local flyte demo cluster. Replace this with your
    # own registry, e.g. `docker.io/<username>/<imagename>`
    registry="localhost:30000",
)

# %% [markdown]
# ## Data Loading Task
#
# This task loads the time series data from the specified URL. In this case,
# we use a hard-coded URL for a sample dataset that ships with the NeuralProphet
# library.

URL = "https://github.com/ourownstory/neuralprophet-data/raw/main/kaggle-energy/datasets/tutorial01.csv"

@task(container_image=image)
def load_data() -> pd.DataFrame:
    return pd.read_csv(URL)

# %% [markdown]
# ## Model Training Task
#
# This task trains the Neural Prophet model on the loaded data.
# We train the model at an hourly frequency for ten epochs.

@task(container_image=image)
def train_model(df: pd.DataFrame) -> FlyteFile:
    from neuralprophet import NeuralProphet, save

    working_dir = current_context().working_directory
    model = NeuralProphet()
    model.fit(df, freq="H", epochs=10)
    model_fp = f"{working_dir}/model.np"
    save(model, model_fp)
    return FlyteFile(model_fp)

# %% [markdown]
# ## Forecasting Task
#
# This task loads the trained model, makes predictions, and visualizes the
# results using a Flyte Deck.

@task(
    container_image=image,
    enable_deck=True,
)
def make_forecast(df: pd.DataFrame, model_file: FlyteFile) -> pd.DataFrame:
    from neuralprophet import load

    model_file.download()
    model = load(model_file.path)

    # Create a new dataframe reaching 365 periods into the future for our
    # forecast; n_historic_predictions=True also includes the historic data
    df_future = model.make_future_dataframe(
        df,
        n_historic_predictions=True,
        periods=365,
    )

    # Predict the future
    forecast = model.predict(df_future)

    # Plot on a Flyte Deck
    fig = model.plot(forecast)
    Deck("Forecast", fig.to_html())

    return forecast

# %% [markdown]
# ## Main Workflow
#
# Finally, this workflow orchestrates the entire process: loading data,
# training the model, and making forecasts.

@workflow
def main() -> pd.DataFrame:
    df = load_data()
    model_file = train_model(df)
    forecast = make_forecast(df, model_file)
    return forecast
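
To try the workflow end to end, note that a `@workflow`-decorated function can
be called like a regular Python function for local execution. Below is a minimal
sketch, assuming `flytekit`, `neuralprophet`, and the other packages listed in
the `ImageSpec` are installed in the local environment; running on a Flyte
cluster (for example with `pyflyte run --remote neural_prophet.py main`)
additionally requires an image registry the cluster can pull from.

```python
# Hypothetical addition to the bottom of neural_prophet.py, where `main` is defined.
if __name__ == "__main__":
    # Calling the @workflow function directly runs it locally and returns the
    # forecast dataframe produced by make_forecast.
    forecast_df = main()
    print(forecast_df.tail())
```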