Unsupervised Models
All unsupervised detectors inherit from BaseDetector and accept 3D input of shape (n_windows, window_size, n_features), which is the direct output of windowify(). Classical models flatten this input internally via features_stat(), so the caller never manages the conversion.
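As a rough illustration of the shape change involved, the sketch below collapses a 3D window tensor into a 2D feature matrix with NumPy. The helper `flatten_windows` and its choice of statistics (per-channel mean and standard deviation) are assumptions made for this example only; the statistics actually computed by features_stat() may differ.

```python
import numpy as np

def flatten_windows(X: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for features_stat(): one row of summary stats per window."""
    if X.ndim != 3:
        raise ValueError("expected shape (n_windows, window_size, n_features)")
    # Reduce over the time axis (axis=1) so each window becomes a single row.
    means = X.mean(axis=1)  # (n_windows, n_features)
    stds = X.std(axis=1)    # (n_windows, n_features)
    return np.concatenate([means, stds], axis=1)  # (n_windows, 2 * n_features)

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 16, 3))  # 8 windows, 16 time steps, 3 channels
flat = flatten_windows(X)        # 2D matrix suitable for a classical model
```

The point is only that a classical estimator sees one row per window, whatever statistics the library chooses to compute.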
PCAAnomaly
PCA-based anomaly detection.
This model learns a low-dimensional subspace of nominal telemetry features using Principal Component Analysis (PCA). The anomaly score of a sample is its reconstruction error: the sample is projected into the PCA subspace, mapped back into the original space, and compared with the original.
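The reconstruction-error idea can be sketched with plain NumPy as follows. This is not the library's implementation (PCAAnomaly wraps sklearn.decomposition.PCA); the function name `pca_reconstruction_error`, the SVD-based fit, and the use of the squared Euclidean norm as the error are assumptions chosen to keep the example self-contained.

```python
import numpy as np

def pca_reconstruction_error(X_train, X, n_components):
    """Per-sample squared reconstruction error under a PCA fit on X_train."""
    mean = X_train.mean(axis=0)
    # Rows of Vt are the principal axes of the centered training data.
    _, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
    W = Vt[:n_components]   # (n_components, n_features)
    Z = (X - mean) @ W.T    # project into the subspace
    X_hat = Z @ W + mean    # map back to the original space
    return np.sum((X - X_hat) ** 2, axis=1)

rng = np.random.default_rng(0)
# Illustrative nominal data lying close to a 1-D line inside a 3-D feature space.
t = rng.normal(size=(200, 1))
X_train = t @ np.array([[1.0, 2.0, -1.0]]) + 0.01 * rng.normal(size=(200, 3))

on_line = np.array([[0.5, 1.0, -0.5]])   # consistent with the nominal pattern
off_line = np.array([[0.5, 1.0, 2.0]])   # third channel deviates from it
scores = pca_reconstruction_error(
    X_train, np.vstack([on_line, off_line]), n_components=1
)
```

The sample that follows the nominal pattern reconstructs almost perfectly, while the deviating sample leaves a large residual and therefore scores higher.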
- class telemetry_anomdet.models.unsupervised.pca.PCAAnomaly(n_components: int | None = None, scale: bool = True, percentile: float = 95.0)
Bases: BaseDetector
PCA-based anomaly detector.
Accepts 3D windowed input (n_windows, window_size, n_features) and flattens internally via features_stat() before fitting PCA. The caller never needs to manage this conversion.
- Parameters:
n_components (int or None, default=None) – Number of principal components to retain. If None, all components are kept. Choose a value that retains your desired fraction of variance; inspect model.explained_variance_ratio_ after fitting.
scale (bool, default=True) – Apply StandardScaler before PCA. Recommended when telemetry channels differ significantly in scale (e.g. voltage vs. temperature).
percentile (float, default = 95.0) – Percentile of training reconstruction errors used to set the default anomaly threshold. 95.0 means the top 5% most anomalous training windows are labelled as anomalies.
- Attributes (set after fit):
decision_scores (np.ndarray, shape (n_windows,)) – Reconstruction errors on training data.
threshold (float) – Default anomaly cutoff derived from training scores at percentile.
labels (np.ndarray, shape (n_windows,)) – Binary anomaly labels on training data. 0 = normal, 1 = anomaly.
model (sklearn.decomposition.PCA) – Fitted PCA instance.
scaler (sklearn.preprocessing.StandardScaler or None) – Fitted scaler when scale=True, otherwise None.
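The relationship between percentile, threshold, and labels can be sketched as below. The helper `threshold_and_labels` is hypothetical, and treating scores strictly above the cutoff as anomalies is an assumption; the library may handle ties at the threshold differently.

```python
import numpy as np

def threshold_and_labels(train_scores, percentile=95.0):
    """Derive a cutoff from training scores and label scores above it as anomalies."""
    threshold = float(np.percentile(train_scores, percentile))
    # Assumption: strictly greater than the cutoff means anomaly (1), otherwise normal (0).
    labels = (train_scores > threshold).astype(int)
    return threshold, labels

# 100 illustrative training scores: 1.0, 2.0, ..., 100.0
train_scores = np.arange(1.0, 101.0)
threshold, labels = threshold_and_labels(train_scores, percentile=95.0)
```

With percentile=95.0, the five highest of the 100 illustrative scores end up labelled 1, matching the "top 5%" description above.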
- fit(X: ndarray, y: ndarray | None = None) → PCAAnomaly
Fit PCA on nominal telemetry windows.
- Parameters:
X (np.ndarray, shape (n_windows, window_size, n_features)) – Windowed telemetry tensor from windowify().
y (ignored) – Present for API consistency.
- Returns:
self
- Return type:
PCAAnomaly
KMeansAnomaly
K-Means clustering anomaly detection.
Each telemetry window is assigned to its nearest cluster centroid. Anomaly scores are distances to the nearest centroid. Windows far from any learned nominal operating mode score higher and are flagged as anomalies.
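A minimal NumPy sketch of the distance-to-nearest-centroid score follows. It assumes Euclidean distance and takes already-learned centroids as input; the function name `nearest_centroid_distance` and the example values are illustrative and not part of the library.

```python
import numpy as np

def nearest_centroid_distance(X, centroids):
    """Euclidean distance from each row of X to its closest centroid, plus its index."""
    # Broadcast to (n_samples, n_clusters, n_features), then reduce over features.
    diffs = X[:, None, :] - centroids[None, :, :]
    dists = np.linalg.norm(diffs, axis=2)  # (n_samples, n_clusters)
    return dists.min(axis=1), dists.argmin(axis=1)

# Two illustrative operating modes in a 2-D feature space.
centroids = np.array([[0.0, 0.0], [10.0, 10.0]])
X = np.array([
    [0.0, 1.0],    # close to the first mode
    [10.0, 9.0],   # close to the second mode
    [5.0, 5.0],    # far from both modes
])
scores, nearest = nearest_centroid_distance(X, centroids)
```

The third sample sits between the two modes, so its distance to the nearest centroid is the largest of the three and it would be the first to cross a percentile-based threshold.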
- class telemetry_anomdet.models.unsupervised.kmeans.KMeansAnomaly(n_clusters: int = 5, scale: bool = False, percentile: float = 95.0)
Bases: BaseDetector
K-Means clustering based anomaly detector.
Accepts 3D windowed input (n_windows, window_size, n_features) and flattens internally via features_stat() before clustering. The caller never needs to manage this conversion.
- Parameters:
n_clusters (int, default = 5) – Number of clusters (nominal operating modes) to learn. Each cluster represents a recurring pattern in the telemetry.
scale (bool, default = False) – Apply StandardScaler before clustering. Enable when channels differ significantly in scale so distance calculations are not dominated by high-magnitude channels.
percentile (float, default = 95.0) – Percentile of training centroid distances used to set the default anomaly threshold. 95.0 means the top 5% most distant training windows are labelled as anomalies.
- Attributes (set after fit):
decision_scores (np.ndarray, shape (n_windows,)) – Distance-to-nearest-centroid scores on training data.
threshold (float) – Default anomaly cutoff derived from training scores at percentile.
labels (np.ndarray, shape (n_windows,)) – Binary anomaly labels on training data. 0 = normal, 1 = anomaly.
model (sklearn.cluster.KMeans) – Fitted KMeans instance.
centroids (np.ndarray, shape (n_clusters, n_features)) – Learned cluster centers in the (optionally scaled) feature space.
scaler (sklearn.preprocessing.StandardScaler or None) – Fitted scaler when scale=True, otherwise None.
- fit(X: ndarray, y: ndarray | None = None) → KMeansAnomaly
Fit K-Means on nominal telemetry windows.
- Parameters:
X (np.ndarray, shape (n_windows, window_size, n_features)) – Windowed telemetry tensor from windowify().
y (ignored) – Present for API consistency.
- Returns:
self
- Return type:
KMeansAnomaly
- Raises:
ValueError – If n_clusters < 1 or n_clusters > n_windows.
- decision_function(X: ndarray) → ndarray
Compute distance-to-nearest-centroid scores for each window.
- Parameters:
X (np.ndarray, shape (n_windows, window_size, n_features))
- Returns:
scores – Centroid distances. Higher = more anomalous.
- Return type:
np.ndarray, shape (n_windows,)
- predict_clusters(X: ndarray) → ndarray
Assign each window to its nearest cluster.
Unique to KMeansAnomaly. Not part of the BaseDetector interface. Useful for understanding which nominal operating mode each window belongs to, independent of whether it is flagged as an anomaly.
- Parameters:
X (np.ndarray, shape (n_windows, window_size, n_features))
- Returns:
cluster_labels – Cluster index in [0, n_clusters - 1] for each window.
- Return type:
np.ndarray of int, shape (n_windows,)
IsolationForestModel
Note
IsolationForestModel is under active development and will be completed in Phase 1
(SMAP Integration). It will wrap sklearn.ensemble.IsolationForest as a full
BaseDetector subclass and be registered in AnomalyEnsemble alongside
PCAAnomaly and KMeansAnomaly.
Isolation Forest unsupervised anomaly detector wrapper.
This module exposes a simple class with a consistent API:
- fit(X)
- predict(X) -> anomaly scores (higher = more anomalous)
- save / load via BaseModel
- class telemetry_anomdet.models.unsupervised.isolation_forest.IsolationForestModel
Bases: BaseDetector
Wrapper for sklearn IsolationForest.
- Parameters:
config – Optional dict to pass to sklearn IsolationForest.