How do I do multivariate time series anomaly detection in Python? I found code for a few models but couldn't follow it.
Answer drafted with reference to Tongyi Qianwen and compiled by Teacher Guo.
Multivariate time series anomaly detection (Python)
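Below are several approaches to multivariate time series anomaly detection in Python, each with a code example.
1. Statistical methods
a. Multivariate Gaussian distribution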
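import numpy as np
from scipy.spatial.distance import mahalanobis

# Training data (rows are time steps, columns are variables)
X_train = np.array([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
# Estimate the mean and covariance matrix
mean = np.mean(X_train, axis=0)
cov = np.cov(X_train.T)
# This toy training set is perfectly collinear, so its covariance is singular;
# a small ridge term (an assumption, tune it for real data) makes it invertible
cov_inv = np.linalg.inv(cov + 1e-3 * np.eye(cov.shape[0]))
# Anomaly threshold on the Mahalanobis distance (e.g. 3)
threshold = 3
# Test data
X_test = np.array([[10, 2, 3], [2, 3, 4], [3, 4, 10]])
# Mahalanobis distance of each test row from the training distribution
mahalanobis_distance = np.array([mahalanobis(x, mean, cov_inv) for x in X_test])
# Flag anomalies
anomalies = np.where(mahalanobis_distance > threshold)[0]
print("Anomaly indices:", anomalies)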
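With more than a handful of rows, the same idea can be vectorized and the threshold taken from the training data itself. The following is a minimal sketch under stated assumptions, not a definitive implementation: the helper names `fit_gaussian` and `mahalanobis_scores`, the `ridge` value, the synthetic data, and the 99% quantile are all illustrative choices that do not come from the original answer.

```python
import numpy as np

def fit_gaussian(X_train, ridge=1e-6):
    """Estimate the mean and a (ridge-regularized) inverse covariance."""
    mean = X_train.mean(axis=0)
    cov = np.cov(X_train, rowvar=False)
    cov_inv = np.linalg.inv(cov + ridge * np.eye(cov.shape[0]))
    return mean, cov_inv

def mahalanobis_scores(X, mean, cov_inv):
    """Mahalanobis distance of every row of X, computed in one pass."""
    diff = X - mean
    return np.sqrt(np.einsum("ij,jk,ik->i", diff, cov_inv, diff))

# Synthetic "normal" data standing in for real training rows (assumption)
rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 3))
mean, cov_inv = fit_gaussian(X_train)

# Threshold: the 99% quantile of the training distances (assumed cutoff)
threshold = np.quantile(mahalanobis_scores(X_train, mean, cov_inv), 0.99)

X_test = np.array([[0.1, -0.2, 0.0], [8.0, 8.0, 8.0]])
scores = mahalanobis_scores(X_test, mean, cov_inv)
anomalies = np.where(scores > threshold)[0]
print("Anomaly indices:", anomalies)
```

If the data really is close to Gaussian, a chi-square quantile with as many degrees of freedom as there are variables is a common alternative cutoff for the squared distance.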
b. Sliding window + z-score
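import pandas as pd

# Example data: a steady trend with a spike in the last row
df = pd.DataFrame({'value1': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 100],
                   'value2': [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 101]})
# Sliding window size
window_size = 3
# Rolling mean and standard deviation of the previous window_size rows.
# shift(1) keeps the current point out of its own statistics; with it included,
# |z| can never reach 3 for a window of 3
rolling_mean = df.rolling(window=window_size).mean().shift(1)
rolling_std = df.rolling(window=window_size).std().shift(1)
# z-score of each value relative to the preceding window
z_scores = (df - rolling_mean) / rolling_std
# Anomaly threshold
threshold = 3
# Flag rows where any variable exceeds the threshold
anomalies = df.index[(z_scores.abs() > threshold).any(axis=1)]
print("Anomaly indices:", anomalies.tolist())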
2. Machine learning methods
a. Isolation Forest
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.preprocessing import StandardScaler

# Training data
X_train = np.array([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
# Standardize the data
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
# Train the Isolation Forest model (random_state fixed for repeatable results)
model = IsolationForest(random_state=0)
model.fit(X_train)
# Test data, scaled with the statistics learned from the training data
X_test = np.array([[10, 2, 3], [2, 3, 4], [3, 4, 10]])
X_test = scaler.transform(X_test)
# Lower scores are more anomalous; scores below offset_ match predict() == -1
anomaly_scores = model.score_samples(X_test)
anomalies = np.where(anomaly_scores < model.offset_)[0]
print("Anomaly indices:", anomalies)
b. One-Class SVM
import numpy as np
from sklearn.svm import OneClassSVM

# Training data
X_train = np.array([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
# Train the One-Class SVM model (nu bounds the fraction of training outliers)
model = OneClassSVM(nu=0.1)
model.fit(X_train)
# Test data
X_test = np.array([[10, 2, 3], [2, 3, 4], [3, 4, 10]])
# predict() returns -1 for anomalies and 1 for normal points
predictions = model.predict(X_test)
anomalies = np.where(predictions == -1)[0]
print("Anomaly indices:", anomalies)
3. Deep learning methods
a. LSTM Autoencoder
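import numpy as np
from keras.layers import Input, LSTM, RepeatVector
from keras.models import Model

# Model parameters
timesteps = 10
n_features = 3
# Encoder: compress each window into a single vector
inputs = Input(shape=(timesteps, n_features))
encoded = LSTM(10, activation='relu')(inputs)
# Decoder: repeat the vector and unroll it back into a sequence
decoded = RepeatVector(timesteps)(encoded)
decoded = LSTM(n_features, activation='linear', return_sequences=True)(decoded)
# Build the model
autoencoder = Model(inputs, decoded)
autoencoder.compile(optimizer='adam', loss='mse')
# Training data, shaped (windows, timesteps, features); real data must be cut into windows first
X_train = np.random.rand(100, timesteps, n_features)
# Train the model to reconstruct its own input
autoencoder.fit(X_train, X_train, epochs=10)
# Threshold: 95th percentile of the reconstruction error on the training windows
train_errors = np.mean(np.power(X_train - autoencoder.predict(X_train), 2), axis=(1, 2))
threshold = np.percentile(train_errors, 95)
# Test data
X_test = np.random.rand(10, timesteps, n_features)
X_test[2, :, :] = 10  # inject an anomalous window
# Reconstruction error per window, averaged over time steps and features
reconstructions = autoencoder.predict(X_test)
reconstruction_errors = np.mean(np.power(X_test - reconstructions, 2), axis=(1, 2))
# Flag anomalies
anomalies = np.where(reconstruction_errors > threshold)[0]
print("Anomaly indices:", anomalies)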
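The LSTM autoencoder expects input shaped (windows, timesteps, features), while a real recording is usually one long (n_samples, n_features) array. One way to bridge the two, sketched here under assumptions, is to cut the series into overlapping windows. The helper name `make_windows` and its `stride` parameter are hypothetical and not part of the original answer.

```python
import numpy as np

def make_windows(series, timesteps, stride=1):
    """Cut a (n_samples, n_features) array into overlapping windows.

    Returns an array shaped (n_windows, timesteps, n_features).
    """
    series = np.asarray(series)
    if len(series) < timesteps:
        raise ValueError("series is shorter than one window")
    starts = range(0, len(series) - timesteps + 1, stride)
    return np.stack([series[s:s + timesteps] for s in starts])

# Toy series: 12 time steps, 3 variables
series = np.arange(36, dtype=float).reshape(12, 3)
windows = make_windows(series, timesteps=10)
print(windows.shape)

# Window i covers time steps i * stride through i * stride + timesteps - 1,
# so a flagged window index maps back to a time range in the original series.
```

A flagged window only says that something in that span looks unusual; with stride 1 each time step belongs to several windows, so per-point scores need some aggregation across them.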
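The snippets above are only examples and need to be adapted to your data and problem. In practice you also need to consider data preprocessing, feature engineering, model selection, and evaluation.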
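For the evaluation step, if some labeled anomalies are available, the detectors can be compared with precision, recall, and F1. This is a minimal sketch with a hypothetical helper `precision_recall_f1` and made-up labels, shown only to illustrate the calculation.

```python
import numpy as np

def precision_recall_f1(y_true, y_pred):
    """Precision, recall and F1 for binary labels (1 = anomaly, 0 = normal)."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    tp = np.sum(y_true & y_pred)    # anomalies correctly flagged
    fp = np.sum(~y_true & y_pred)   # normal points wrongly flagged
    fn = np.sum(y_true & ~y_pred)   # anomalies missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return float(precision), float(recall), float(f1)

# Made-up labels and detector output for six time steps (assumption)
y_true = [0, 0, 1, 0, 1, 0]
y_pred = [0, 1, 1, 0, 0, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(precision, recall, f1)
```

Because anomalies are rare, plain accuracy is usually uninformative here; a detector that flags nothing can still score very high on it.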