K-Means

| Property | Value |
| :--- | :--- |
| Citekey | YairiEtAl2001Fault |
| Source | own |
| Learning type | unsupervised |
| Input dimensionality | multivariate |

Dependencies

  • Python 3

Hyper Parameters

k (n_clusters)

k is the number of clusters fitted to the data. Larger values of k produce less noisy anomaly scores.

Small k (k == 2): [plot: small k]

Big k (k == 20): [plot: big k]
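To make the role of k concrete, here is a minimal pure-Python sketch of k-means-based anomaly scoring. It assumes that the anomaly score of a vector is its distance to the nearest centroid and uses a simple deterministic initialisation. Both are illustrative assumptions, and the function names are hypothetical rather than taken from this repository.

```python
import math


def nearest(point, centroids):
    """Return (index, distance) of the centroid closest to point."""
    distances = [math.dist(point, c) for c in centroids]
    idx = min(range(len(centroids)), key=distances.__getitem__)
    return idx, distances[idx]


def kmeans(points, k, n_iter=50):
    """Plain Lloyd's algorithm; returns the k fitted centroids."""
    # Deterministic initialisation (assumption, for reproducibility):
    # pick k evenly spaced data points as starting centroids.
    centroids = [points[i * len(points) // k] for i in range(k)]
    for _ in range(n_iter):
        # Assignment step: group points by their nearest centroid.
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)[0]].append(p)
        # Update step: move each centroid to the mean of its group.
        updated = [
            [sum(dim) / len(g) for dim in zip(*g)] if g else c
            for g, c in zip(groups, centroids)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def anomaly_scores(points, centroids):
    """Assumed scoring rule: distance of each point to its nearest centroid."""
    return [nearest(p, centroids)[1] for p in points]


# Toy data: two tight groups plus one point that fits neither group well.
data = [
    [0.0, 0.0], [0.2, 0.0], [0.0, 0.2],
    [5.0, 5.0], [5.2, 5.0], [5.0, 5.2],
    [3.0, 3.0],
]
centroids = kmeans(data, k=2)
scores = anomaly_scores(data, centroids)
most_anomalous = max(range(len(scores)), key=scores.__getitem__)
print(most_anomalous)  # index of the point with the highest score
```

In this toy example the last point receives the highest score because it lies far from both centroids. Note that an extreme outlier can attract a centroid of its own and then score close to zero, which is one reason the choice of k matters.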

window_size

This parameter defines the number of data points chunked into one window. The larger window_size is, the larger the anomaly context. If it is too large, normal behaviour may appear anomalous. If it is too small, the algorithm loses its temporal context and cannot detect anomalous windows. If window_size (anomaly_window_size) is smaller than the anomaly itself, the algorithm might detect only the transitions between normal data and the anomaly.

Small window_size (window_size == 5): [plot: small window_size]

Big window_size (window_size == 50): [plot: big window_size]
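The chunking described above could look roughly like the following sketch for multivariate input. It assumes that each window is flattened into a single feature vector before clustering; `make_windows` is a hypothetical helper, not a function from this repository.

```python
def make_windows(series, window_size):
    """Chunk a multivariate series into overlapping, flattened windows (stride 1)."""
    windows = []
    for start in range(len(series) - window_size + 1):
        chunk = series[start:start + window_size]
        # Concatenate the window's data points into one feature vector.
        windows.append([value for point in chunk for value in point])
    return windows


# Six time steps with two channels each.
series = [[0, 10], [1, 11], [2, 12], [3, 13], [4, 14], [5, 15]]
windows = make_windows(series, window_size=3)
print(len(windows), len(windows[0]))  # 4 6
```

Each window holds window_size * n_channels values, so a larger window_size increases the dimensionality of the vectors that are clustered.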

stride

stride is the step size between the starts of consecutive windows. The larger stride is, the noisier the anomaly scores become.

Small stride (stride == 1): [plot: small stride]

Big stride (stride == 20): [plot: big stride]

(Plots were made after post-processing)
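As a rough illustration of what stride changes, the sketch below lists the window start positions for a small and a large stride. The helper `window_starts` and the series length of 100 are assumptions made for this example only.

```python
def window_starts(n_points, window_size, stride):
    """Start index of every full window that fits into n_points data points."""
    return list(range(0, n_points - window_size + 1, stride))


starts_dense = window_starts(n_points=100, window_size=5, stride=1)
starts_sparse = window_starts(n_points=100, window_size=5, stride=20)

# Data points covered by at least one window when stride == 20.
covered = {i for s in starts_sparse for i in range(s, s + 5)}

print(len(starts_dense))  # 96
print(starts_sparse)      # [0, 20, 40, 60, 80]
print(len(covered))       # 25
```

With a stride larger than window_size, many data points fall into no window at all, and far fewer windows contribute to the result. This may be one reason why the scores look noisier for large strides.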

Notes

KMeans automatically computes point-wise anomaly scores.
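One common way to turn per-window scores into point-wise scores is to average the scores of all windows that cover a data point. The sketch below shows this idea under that assumption. The actual aggregation used by this implementation may differ, and `pointwise_scores` is a hypothetical name.

```python
def pointwise_scores(window_scores, n_points, window_size, stride=1):
    """Average, for each data point, the scores of all windows that cover it."""
    sums = [0.0] * n_points
    counts = [0] * n_points
    for w, score in enumerate(window_scores):
        start = w * stride
        for i in range(start, start + window_size):
            sums[i] += score
            counts[i] += 1
    # Points not covered by any window (possible when stride > window_size) get 0.0.
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


# One score per window for a series of 6 points, window_size=3, stride=1.
window_scores = [0.0, 0.0, 3.0, 0.0]
point_scores = pointwise_scores(window_scores, n_points=6, window_size=3)
print(point_scores)  # [0.0, 0.0, 1.0, 1.0, 1.5, 0.0]
```

The single high window score is spread over the three points that the window covers and is damped by the neighbouring windows that overlap them.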