Unsupervised Models

All unsupervised detectors inherit from BaseDetector and accept 3D input of shape (n_windows, window_size, n_features), which is the direct output of windowify(). Classical models flatten this input internally via features_stat(), so the caller never manages the conversion.
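As a rough illustration of the flattening step, the sketch below reduces each window to per-channel summary statistics. The helper name flatten_windows and the statistics chosen (mean, standard deviation, minimum, maximum) are assumptions for illustration only; the actual output of features_stat() may differ.

```python
import numpy as np

def flatten_windows(X: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for features_stat(): 3D windows -> 2D feature matrix."""
    if X.ndim != 3:
        raise ValueError(f"expected 3D input, got shape {X.shape}")
    # Summarise each channel over the time axis (axis=1) of every window.
    stats = [X.mean(axis=1), X.std(axis=1), X.min(axis=1), X.max(axis=1)]
    # Concatenate along the feature axis: (n_windows, 4 * n_features).
    return np.concatenate(stats, axis=1)

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 32, 3))   # 10 windows, 32 samples each, 3 channels
F = flatten_windows(X)
print(F.shape)  # (10, 12)
```

Whatever statistics are used, the result is one row per window, which is the 2D shape that classical estimators such as PCA and K-Means expect.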

PCAAnomaly

PCA-based anomaly detection.

This model learns a low-dimensional subspace of nominal telemetry features using Principal Component Analysis (PCA). Anomaly scores are computed as the reconstruction error when projecting samples into the PCA subspace and back into the original space.
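The reconstruction-error idea can be sketched with plain NumPy. This is not the PCAAnomaly implementation: the function name is hypothetical, PCA is computed directly from an SVD, and the squared Euclidean norm is assumed as the error measure.

```python
import numpy as np

def pca_reconstruction_error(train, test, n_components):
    """Squared reconstruction error of `test` rows under a PCA fit on `train`."""
    mean = train.mean(axis=0)
    # Rows of Vt are the principal directions of the centred training data.
    _, _, Vt = np.linalg.svd(train - mean, full_matrices=False)
    W = Vt[:n_components]                      # (n_components, n_features)
    centred = test - mean
    reconstructed = centred @ W.T @ W          # project down, then back up
    return np.sum((centred - reconstructed) ** 2, axis=1)

rng = np.random.default_rng(0)
# Nominal data lies close to a 1D line inside a 3D feature space.
t = rng.normal(size=(200, 1))
nominal = t @ np.array([[1.0, 2.0, -1.0]]) + 0.01 * rng.normal(size=(200, 3))
outlier = np.array([[5.0, -5.0, 5.0]])         # far from that line

nominal_err = pca_reconstruction_error(nominal, nominal, n_components=1)
outlier_err = pca_reconstruction_error(nominal, outlier, n_components=1)
print(outlier_err[0] > nominal_err.max())
```

Samples that follow the structure learned from nominal data are reconstructed almost exactly, while a sample that leaves the subspace loses most of its energy in the projection and receives a large score.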

class telemetry_anomdet.models.unsupervised.pca.PCAAnomaly(n_components: int | None = None, scale: bool = True, percentile: float = 95.0)

Bases: BaseDetector

PCA-based anomaly detector.

Accepts 3D windowed input (n_windows, window_size, n_features) and flattens internally via features_stat() before fitting PCA. The caller never needs to manage this conversion.

Parameters:
  • n_components (int or None, default=None) – Number of principal components to retain. If None, all components are kept. To choose a value that retains a desired fraction of variance, inspect model.explained_variance_ratio_ after fitting.

  • scale (bool, default=True) – Apply StandardScaler before PCA. Recommended when telemetry channels differ significantly in scale (e.g. voltage vs. temperature).

  • percentile (float, default=95.0) – Percentile of training reconstruction errors used to set the default anomaly threshold. 95.0 means the top 5% most anomalous training windows are labelled as anomalies.

Attributes (set after fit):

  • decision_scores (np.ndarray, shape (n_windows,)) – Reconstruction errors on training data.

  • threshold (float) – Default anomaly cutoff derived from training scores at percentile.

  • labels (np.ndarray, shape (n_windows,)) – Binary anomaly labels on training data. 0 = normal, 1 = anomaly.

  • model (sklearn.decomposition.PCA) – Fitted PCA instance.

  • scaler (sklearn.preprocessing.StandardScaler or None) – Fitted scaler when scale=True, otherwise None.
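The relationship between the training scores, the threshold, and the labels can be sketched as follows. The scores here are synthetic, and the strict "greater than" comparison is an assumption; the library may treat scores equal to the threshold differently.

```python
import numpy as np

rng = np.random.default_rng(0)
train_scores = rng.exponential(size=1000)   # stand-in for training reconstruction errors

percentile = 95.0
threshold = np.percentile(train_scores, percentile)
# Assumption: scores strictly above the threshold are labelled anomalous (1).
labels = (train_scores > threshold).astype(int)

print(labels.mean())  # fraction of training windows flagged
```

Because the threshold is taken from the training scores themselves, roughly (100 - percentile)% of training windows are flagged by construction, even when the training data is entirely nominal.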

fit(X: ndarray, y: ndarray | None = None) -> PCAAnomaly

Fit PCA on nominal telemetry windows.

Parameters:
  • X (np.ndarray, shape (n_windows, window_size, n_features)) – Windowed telemetry tensor from windowify().

  • y (ignored) – Present for API consistency.

Returns:

self

Return type:

PCAAnomaly

decision_function(X: ndarray) -> ndarray

Compute reconstruction error for each window.

Parameters:

X (np.ndarray, shape (n_windows, window_size, n_features))

Returns:

scores – Reconstruction errors. Higher = more anomalous.

Return type:

np.ndarray, shape (n_windows,)

KMeansAnomaly

K-Means clustering anomaly detection.

Each telemetry window is assigned to its nearest cluster centroid. Anomaly scores are distances to the nearest centroid. Windows far from any learned nominal operating mode score higher and are flagged as anomalies.
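The scoring rule described above can be sketched with NumPy. The centroids and feature rows are made up, and Euclidean distance is assumed; this is not the KMeansAnomaly source.

```python
import numpy as np

def nearest_centroid_distance(F, centroids):
    """Euclidean distance from each row of F to its closest centroid."""
    # Pairwise differences via broadcasting: (n_samples, n_clusters, n_features).
    diffs = F[:, None, :] - centroids[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)       # (n_samples, n_clusters)
    return dists.min(axis=1)

centroids = np.array([[0.0, 0.0], [10.0, 10.0]])   # two made-up operating modes
F = np.array([[0.0, 1.0], [10.0, 10.0], [5.0, 5.0]])
scores = nearest_centroid_distance(F, centroids)
print(scores)
```

The third row sits halfway between the two modes, so it is far from both and receives the highest score.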

class telemetry_anomdet.models.unsupervised.kmeans.KMeansAnomaly(n_clusters: int = 5, scale: bool = False, percentile: float = 95.0)

Bases: BaseDetector

K-Means clustering-based anomaly detector.

Accepts 3D windowed input (n_windows, window_size, n_features) and flattens internally via features_stat() before clustering. The caller never needs to manage this conversion.

Parameters:
  • n_clusters (int, default=5) – Number of clusters (nominal operating modes) to learn. Each cluster represents a recurring pattern in the telemetry.

  • scale (bool, default=False) – Apply StandardScaler before clustering. Enable when channels differ significantly in scale so distance calculations are not dominated by high-magnitude channels.

  • percentile (float, default=95.0) – Percentile of training centroid distances used to set the default anomaly threshold. 95.0 means the top 5% most distant training windows are labelled as anomalies.

Attributes (set after fit):

  • decision_scores (np.ndarray, shape (n_windows,)) – Distance-to-nearest-centroid scores on training data.

  • threshold (float) – Default anomaly cutoff derived from training scores at percentile.

  • labels (np.ndarray, shape (n_windows,)) – Binary anomaly labels on training data. 0 = normal, 1 = anomaly.

  • model (sklearn.cluster.KMeans) – Fitted KMeans instance.

  • centroids (np.ndarray, shape (n_clusters, n_features)) – Learned cluster centers in the (optionally scaled) feature space.

  • scaler (sklearn.preprocessing.StandardScaler or None) – Fitted scaler when scale=True, otherwise None.
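The effect of the scale option on distance calculations can be seen in a small NumPy sketch. The channel values are invented, and the standardisation is written out by hand (zero mean, unit variance per column, as StandardScaler does) rather than taken from the library.

```python
import numpy as np

# Two channels on very different scales, e.g. volts (~28) and a raw count (~10000).
F = np.array([[28.0, 10000.0],
              [28.5, 10400.0],    # modest change in the count channel
              [20.0, 10010.0]])   # large drop in the voltage channel

def dist(a, b):
    return np.linalg.norm(a - b)

# Unscaled: the count channel dominates and the voltage drop barely registers.
raw_01, raw_02 = dist(F[0], F[1]), dist(F[0], F[2])

# Standardise each column to zero mean and unit variance.
Z = (F - F.mean(axis=0)) / F.std(axis=0)
std_01, std_02 = dist(Z[0], Z[1]), dist(Z[0], Z[2])

print(raw_01 / raw_02, std_01 / std_02)
```

Before scaling, the two distances differ by more than an order of magnitude purely because of units; after scaling, both channels contribute on a comparable footing.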

fit(X: ndarray, y: ndarray | None = None) -> KMeansAnomaly

Fit K-Means on nominal telemetry windows.

Parameters:
  • X (np.ndarray, shape (n_windows, window_size, n_features)) – Windowed telemetry tensor from windowify().

  • y (ignored) – Present for API consistency.

Returns:

self

Return type:

KMeansAnomaly

Raises:

ValueError – If n_clusters < 1 or n_clusters > n_windows.

decision_function(X: ndarray) -> ndarray

Compute distance-to-nearest-centroid scores for each window.

Parameters:

X (np.ndarray, shape (n_windows, window_size, n_features))

Returns:

scores – Centroid distances. Higher = more anomalous.

Return type:

np.ndarray, shape (n_windows,)

predict_clusters(X: ndarray) -> ndarray

Assign each window to its nearest cluster.

This method is unique to KMeansAnomaly and is not part of the BaseDetector interface. It is useful for understanding which nominal operating mode each window belongs to, independent of whether the window is flagged as an anomaly.

Parameters:

X (np.ndarray, shape (n_windows, window_size, n_features))

Returns:

cluster_labels – Cluster index in [0, n_clusters - 1] for each window.

Return type:

np.ndarray of int, shape (n_windows,)
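The distinction between cluster membership and anomaly status can be sketched as follows. The centroids, feature rows, and cutoff are all made up for illustration, and Euclidean distance is assumed.

```python
import numpy as np

centroids = np.array([[0.0, 0.0], [10.0, 10.0]])
F = np.array([[0.5, 0.0], [9.0, 10.0], [30.0, 30.0]])

dists = np.linalg.norm(F[:, None, :] - centroids[None, :, :], axis=2)
cluster_labels = dists.argmin(axis=1)   # index of the closest centroid
scores = dists.min(axis=1)              # distance to that centroid

threshold = 5.0                         # made-up cutoff for illustration
is_anomaly = (scores > threshold).astype(int)

print(cluster_labels, is_anomaly)
```

The last row is assigned to cluster 1 because that centroid is the closest one, yet it is still anomalous because even the closest centroid is far away. Every window receives a cluster index regardless of its score.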

IsolationForestModel

Note

IsolationForestModel is under active development and will be completed in Phase 1 (SMAP Integration). It will wrap sklearn.ensemble.IsolationForest as a full BaseDetector subclass and be registered in AnomalyEnsemble alongside PCAAnomaly and KMeansAnomaly.

Isolation Forest unsupervised anomaly detector wrapper.

This module exposes a simple class with a consistent API:

  • fit(X)

  • predict(X) -> anomaly scores (higher = more anomalous)

  • save / load via BaseModel

class telemetry_anomdet.models.unsupervised.isolation_forest.IsolationForestModel

Bases: BaseDetector

Wrapper for sklearn IsolationForest.

Parameters:

config – Optional dict to pass to sklearn IsolationForest.