
Add in_column to plot_anomalies, plot_anomalies_interactive #618

Merged
Mr-Geekman merged 3 commits into master from issue-298
Mar 23, 2022

Conversation

Mr-Geekman
Contributor

@Mr-Geekman Mr-Geekman commented Mar 22, 2022

IMPORTANT: Please do not create a Pull Request without creating an issue first.

Before submitting (must do checklist)

  • Did you read the contribution guide?
  • Did you update the docs? We use Numpy format for all the methods and classes.
  • Did you write any new necessary tests?
  • Did you update the CHANGELOG?

Type of Change

  • Examples / docs / tutorials / contributors update
  • Bug fix (non-breaking change which fixes an issue)
  • Improvement (non-breaking change which improves an existing feature)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Proposed Changes

See #298.

Related Issue

#298.

Closing issues

Closes #298.

@Mr-Geekman Mr-Geekman added the "enhancement" (New feature or request) label Mar 22, 2022
@Mr-Geekman Mr-Geekman self-assigned this Mar 22, 2022
@Mr-Geekman
Contributor Author

Script for testing plot_anomalies:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

from etna.analysis import get_anomalies_density
from etna.analysis import plot_anomalies
from etna.datasets import TSDataset


def main():
    df = pd.read_csv("examples/data/example_dataset.csv", parse_dates=["timestamp"])
    df_exog = df.copy()
    df_exog.rename(columns={"target": "leak"}, inplace=True)
    df_exog["leak"] = np.log1p(df_exog["leak"])
    timestamp = df["timestamp"].sort_values().unique()
    timestamp_train = timestamp[10:-10]
    df = df[df["timestamp"].isin(timestamp_train)]
    ts = TSDataset(df=TSDataset.to_dataset(df), df_exog=TSDataset.to_dataset(df_exog), freq="D", known_future="all")

    # plot with target
    outliers = get_anomalies_density(ts=ts, distance_coef=1.0, in_column="target")
    plot_anomalies(ts=ts, in_column="target", anomaly_dict=outliers)
    plt.savefig("anomaly_target")

    # plot with not target
    outliers = get_anomalies_density(ts=ts, distance_coef=1.0, in_column="leak")
    plot_anomalies(ts=ts, in_column="leak", anomaly_dict=outliers)
    plt.savefig("anomaly_exog")


if __name__ == "__main__":
    main()

anomaly_target: [image: plot saved by the script as anomaly_target]

anomaly_exog: [image: plot saved by the script as anomaly_exog]

@Mr-Geekman
Contributor Author

Utility for testing plot_anomalies_interactive:

import pathlib

import numpy as np
import pandas as pd

from etna.analysis import get_anomalies_density
from etna.analysis import plot_anomalies_interactive
from etna.datasets import TSDataset

ROOT_PATH = pathlib.Path(__file__).parent


def load_data():
    df = pd.read_csv(ROOT_PATH.joinpath("examples/data/example_dataset.csv"), parse_dates=["timestamp"])
    df_exog = df.copy()
    df_exog.rename(columns={"target": "leak"}, inplace=True)
    df_exog["leak"] = np.log1p(df_exog["leak"])
    timestamp = df["timestamp"].sort_values().unique()
    timestamp_train = timestamp[10:-10]
    df = df[df["timestamp"].isin(timestamp_train)]
    ts = TSDataset(df=TSDataset.to_dataset(df), df_exog=TSDataset.to_dataset(df_exog), freq="D", known_future="all")
    return ts


def plot_target():
    ts = load_data()
    params_bounds = {"window_size": (5, 20, 1), "distance_coef": (0.1, 3, 0.25)}
    method = get_anomalies_density
    plot_anomalies_interactive(
        ts=ts, in_column="target", segment="segment_a", method=method, params_bounds=params_bounds
    )


def plot_leak():
    ts = load_data()
    params_bounds = {"window_size": (5, 20, 1), "distance_coef": (0.1, 3, 0.25)}
    method = get_anomalies_density
    plot_anomalies_interactive(ts=ts, in_column="leak", segment="segment_a", method=method, params_bounds=params_bounds)

To use it, import plot_target and plot_leak in a Jupyter notebook and call them.
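For anyone adjusting params_bounds in the utility above: each tuple is read here as (min, max, step) for the corresponding slider. The sketch below only illustrates which values such a slider could take; slider_values is a hypothetical helper, not part of etna, and the assumption that the widget steps from min in increments of step without passing max is mine.

```python
import math
from typing import List, Tuple


def slider_values(bounds: Tuple[float, float, float]) -> List[float]:
    """Enumerate the values reachable from (min, max, step) bounds.

    Hypothetical helper for illustration only; etna does not expose it.
    """
    low, high, step = bounds
    if step <= 0:
        raise ValueError("step must be positive")
    if high < low:
        raise ValueError("max must not be smaller than min")
    # Number of whole steps that fit between min and max (small epsilon
    # guards against floating-point error at the upper edge).
    n_steps = math.floor((high - low) / step + 1e-9)
    # Round to suppress float noise such as 0.35000000000000003.
    return [round(low + i * step, 10) for i in range(n_steps + 1)]


# Same bounds as in plot_target and plot_leak.
params_bounds = {"window_size": (5, 20, 1), "distance_coef": (0.1, 3, 0.25)}
grid = {name: slider_values(bounds) for name, bounds in params_bounds.items()}
```

With these bounds, window_size covers the integers 5 through 20, and distance_coef starts at 0.1 and stops at 2.85, the last value that does not pass 3.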

@codecov-commenter

codecov-commenter commented Mar 23, 2022

Codecov Report

Merging #618 (5b33ca4) into master (f6de083) will not change coverage.
The diff coverage is 0.00%.

@@           Coverage Diff           @@
##           master     #618   +/-   ##
=======================================
  Coverage   84.54%   84.54%           
=======================================
  Files         118      118           
  Lines        5973     5973           
=======================================
  Hits         5050     5050           
  Misses        923      923           
Impacted Files               Coverage Δ
etna/analysis/plotters.py    22.14% <0.00%> (ø)
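
For readers checking the numbers: the project coverage in the report above is consistent with hits divided by lines, rounded down to two decimals. The snippet below is a small sketch of that arithmetic. The function name is hypothetical, and rounding down is an assumption inferred from the reported value (5050 / 5973 is roughly 84.547%), not something the report states.

```python
def coverage_percent(hits: int, lines: int) -> float:
    """Coverage as a percentage, rounded down to two decimals.

    Hypothetical helper; the round-down behaviour is an assumption.
    """
    if lines <= 0:
        raise ValueError("lines must be positive")
    # Work in hundredths of a percent with integer division so the
    # result is truncated rather than rounded to nearest.
    hundredths = hits * 10_000 // lines
    return hundredths / 100


hits, misses, lines = 5050, 923, 5973  # figures from the report
assert hits + misses == lines
print(coverage_percent(hits, lines))
```

Because the PR changes neither hits nor lines, the value is the same before and after the merge, which matches the "will not change coverage" line.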


@Mr-Geekman Mr-Geekman merged commit 1bacb46 into master Mar 23, 2022
@Mr-Geekman Mr-Geekman deleted the issue-298 branch March 23, 2022 10:53
Development

Successfully merging this pull request may close these issues.

in_column -> plot_anomalies