# Machine Learning Algorithm Test Cases

**Test objective:** verify the functionality and performance of machine learning algorithms
**Test types:** unit tests, integration tests, performance tests
**Components covered:** classification algorithms, regression algorithms, clustering algorithms, evaluation metrics
## 📋 Test Overview

### Test Objectives

- Functional testing: verify algorithm correctness
- Performance testing: measure computational efficiency
- Robustness testing: check algorithm stability on abnormal data
- Comparative testing: compare the performance of different algorithms

### Test Environment

- Python version: 3.8+
- Machine learning framework: scikit-learn
- Test framework: pytest
- Numerical computing: NumPy
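The version floor above can be enforced once per session rather than rediscovered through confusing failures. A minimal sketch, assuming a hypothetical `conftest.py` helper (the function name and return convention are illustrative, not part of the suite):

```python
import sys


# Hypothetical conftest.py helper: report environment problems before the
# suite runs. The version floor mirrors the "Python 3.8+" requirement above.
def environment_problems(min_version=(3, 8)):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info[:2] < min_version:
        found = ".".join(map(str, sys.version_info[:2]))
        wanted = ".".join(map(str, min_version))
        problems.append(f"Python {found} is older than the required {wanted}")
    return problems


print(environment_problems())  # on a 3.8+ interpreter: []
```

A session-scoped pytest fixture could call this and `pytest.skip` the suite when the list is non-empty.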
## 🧪 Test Cases

### 1. Classification Algorithm Tests

#### Test Case 1.1: Logistic Regression

**Objective:** verify the logistic regression algorithm

**Test code:**

```python
import pytest
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split


def test_logistic_regression():
    """Test logistic regression."""
    # Create test data
    X, y = make_classification(
        n_samples=100,
        n_features=10,
        n_classes=2,
        random_state=42,
    )
    # Split the data
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )
    # Train the model
    model = LogisticRegression(random_state=42)
    model.fit(X_train, y_train)
    # Predict
    y_pred = model.predict(X_test)
    # Verify accuracy
    accuracy = accuracy_score(y_test, y_pred)
    assert accuracy > 0.7, f"Accuracy too low: {accuracy}"
    # Verify the probability output
    y_proba = model.predict_proba(X_test)
    assert y_proba.shape == (len(y_test), 2)
    assert np.allclose(y_proba.sum(axis=1), 1.0), "Probabilities do not sum to 1"
    print(f"✓ Logistic regression test passed (accuracy: {accuracy:.2%})")
```

**Expected result:** accuracy > 70%, probabilities sum to 1
#### Test Case 1.2: Decision Tree

**Objective:** verify the decision tree algorithm

**Test code:**

```python
def test_decision_tree():
    """Test a decision tree classifier."""
    from sklearn.tree import DecisionTreeClassifier
    from sklearn.model_selection import train_test_split

    # Create test data. n_informative is raised from the default of 2 because
    # make_classification requires n_classes * n_clusters_per_class <= 2**n_informative;
    # with 3 classes the default would raise a ValueError.
    X, y = make_classification(
        n_samples=100,
        n_features=10,
        n_informative=5,
        n_classes=3,
        random_state=42,
    )
    # Split the data
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )
    # Train the model
    model = DecisionTreeClassifier(random_state=42, max_depth=5)
    model.fit(X_train, y_train)
    # Predict
    y_pred = model.predict(X_test)
    # Verify accuracy
    accuracy = accuracy_score(y_test, y_pred)
    assert accuracy > 0.6, f"Accuracy too low: {accuracy}"
    # Verify the tree depth
    assert model.get_depth() <= 5, "Tree depth exceeds the limit"
    # Verify feature importances
    importance = model.feature_importances_
    assert len(importance) == 10
    assert np.allclose(importance.sum(), 1.0), "Feature importances do not sum to 1"
    print(f"✓ Decision tree test passed (accuracy: {accuracy:.2%}, depth: {model.get_depth()})")
```

**Expected result:** accuracy > 60%, tree depth ≤ 5
#### Test Case 1.3: Random Forest

**Objective:** verify the random forest algorithm

**Test code:**

```python
def test_random_forest():
    """Test a random forest classifier."""
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    # Create test data
    X, y = make_classification(
        n_samples=200,
        n_features=20,
        n_classes=2,
        random_state=42,
    )
    # Split the data
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )
    # Train the model
    model = RandomForestClassifier(
        n_estimators=100,
        random_state=42,
        n_jobs=-1,
    )
    model.fit(X_train, y_train)
    # Predict
    y_pred = model.predict(X_test)
    # Verify accuracy
    accuracy = accuracy_score(y_test, y_pred)
    assert accuracy > 0.8, f"Accuracy too low: {accuracy}"
    # Verify the number of trees
    assert len(model.estimators_) == 100, "Incorrect number of trees"
    # Verify feature importances
    importance = model.feature_importances_
    assert len(importance) == 20
    assert np.allclose(importance.sum(), 1.0), "Feature importances do not sum to 1"
    print(f"✓ Random forest test passed (accuracy: {accuracy:.2%})")
```

**Expected result:** accuracy > 80%
#### Test Case 1.4: Support Vector Machine

**Objective:** verify the SVM algorithm

**Test code:**

```python
def test_svm():
    """Test a support vector machine."""
    from sklearn.svm import SVC
    from sklearn.model_selection import train_test_split

    # Create test data
    X, y = make_classification(
        n_samples=100,
        n_features=10,
        n_classes=2,
        random_state=42,
    )
    # Split the data
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )
    # Train the model
    model = SVC(kernel='rbf', random_state=42)
    model.fit(X_train, y_train)
    # Predict
    y_pred = model.predict(X_test)
    # Verify accuracy
    accuracy = accuracy_score(y_test, y_pred)
    assert accuracy > 0.7, f"Accuracy too low: {accuracy}"
    # Verify the support vectors
    assert len(model.support_) > 0, "No support vectors found"
    assert len(model.support_) <= len(X_train), "More support vectors than training samples"
    print(f"✓ SVM test passed (accuracy: {accuracy:.2%}, support vectors: {len(model.support_)})")
```

**Expected result:** accuracy > 70%
### 2. Regression Algorithm Tests

#### Test Case 2.1: Linear Regression

**Objective:** verify the linear regression algorithm

**Test code:**

```python
def test_linear_regression():
    """Test linear regression."""
    from sklearn.linear_model import LinearRegression
    from sklearn.datasets import make_regression
    from sklearn.metrics import mean_squared_error, r2_score
    from sklearn.model_selection import train_test_split

    # Create test data
    X, y = make_regression(
        n_samples=100,
        n_features=5,
        noise=0.1,
        random_state=42,
    )
    # Split the data
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )
    # Train the model
    model = LinearRegression()
    model.fit(X_train, y_train)
    # Predict
    y_pred = model.predict(X_test)
    # Verify fit quality
    mse = mean_squared_error(y_test, y_pred)
    r2 = r2_score(y_test, y_pred)
    assert r2 > 0.5, f"R² too low: {r2}"
    # Verify the coefficients
    assert len(model.coef_) == 5, "Incorrect number of coefficients"
    assert model.intercept_ is not None, "Missing intercept"
    print(f"✓ Linear regression test passed (R²: {r2:.4f}, MSE: {mse:.4f})")
```

**Expected result:** R² > 0.5
#### Test Case 2.2: Ridge Regression

**Objective:** verify Ridge regression

**Test code:**

```python
def test_ridge_regression():
    """Test Ridge regression."""
    from sklearn.linear_model import LinearRegression, Ridge
    from sklearn.datasets import make_regression
    from sklearn.metrics import mean_squared_error, r2_score
    from sklearn.model_selection import train_test_split

    # Create test data
    X, y = make_regression(
        n_samples=100,
        n_features=10,
        noise=0.1,
        random_state=42,
    )
    # Split the data
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )
    # Train the model
    model = Ridge(alpha=1.0, random_state=42)
    model.fit(X_train, y_train)
    # Predict
    y_pred = model.predict(X_test)
    # Verify fit quality
    mse = mean_squared_error(y_test, y_pred)
    r2 = r2_score(y_test, y_pred)
    assert r2 > 0.5, f"R² too low: {r2}"
    # Verify the regularization effect: Ridge coefficients should have a
    # smaller norm than those of plain linear regression
    lr = LinearRegression()
    lr.fit(X_train, y_train)
    ridge_norm = np.linalg.norm(model.coef_)  # np.linalg: linear-algebra routines
    lr_norm = np.linalg.norm(lr.coef_)
    assert ridge_norm < lr_norm, "Ridge regularization had no effect"
    print(f"✓ Ridge regression test passed (R²: {r2:.4f}, coefficient norm: {ridge_norm:.4f})")
```

**Expected result:** R² > 0.5, coefficient norm smaller than plain linear regression
#### Test Case 2.3: Lasso Regression

**Objective:** verify Lasso regression

**Test code:**

```python
def test_lasso_regression():
    """Test Lasso regression."""
    from sklearn.linear_model import Lasso
    from sklearn.datasets import make_regression
    from sklearn.metrics import mean_squared_error, r2_score
    from sklearn.model_selection import train_test_split

    # Create test data
    X, y = make_regression(
        n_samples=100,
        n_features=10,
        n_informative=5,
        noise=0.1,
        random_state=42,
    )
    # Split the data
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )
    # Train the model
    model = Lasso(alpha=0.1, random_state=42)
    model.fit(X_train, y_train)
    # Predict
    y_pred = model.predict(X_test)
    # Verify fit quality
    mse = mean_squared_error(y_test, y_pred)
    r2 = r2_score(y_test, y_pred)
    assert r2 > 0.5, f"R² too low: {r2}"
    # Verify sparsity: Lasso should drive some coefficients to exactly zero
    zero_coef_count = np.sum(np.abs(model.coef_) < 1e-10)
    assert zero_coef_count > 0, "Lasso did not produce sparse coefficients"
    print(f"✓ Lasso regression test passed (R²: {r2:.4f}, zero coefficients: {zero_coef_count})")
```

**Expected result:** R² > 0.5, sparse coefficients produced
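The sparsity check can be pushed further: strengthening the L1 penalty (`alpha`) typically zeroes out more coefficients. A small sweep using the same data-generation settings as the test case; the exact zero counts depend on the solver and data, so the assertion below is an expectation for this seed, not a general guarantee:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# Same settings as Test Case 2.3: 5 of the 10 features carry signal.
X, y = make_regression(
    n_samples=100, n_features=10, n_informative=5, noise=0.1, random_state=42
)

# Count exact zeros in the fitted coefficients for a few penalty strengths.
zeros = {}
for alpha in (0.1, 1.0, 10.0):
    coef = Lasso(alpha=alpha, random_state=42).fit(X, y).coef_
    zeros[alpha] = int(np.sum(np.abs(coef) < 1e-10))

# The weakest penalty should already zero out some uninformative features,
# and the strongest penalty should zero out at least as many.
assert zeros[0.1] >= 1
assert zeros[10.0] >= zeros[0.1]
```

This is why `n_informative=5` matters in the test data: the five uninformative features give Lasso coefficients that it can plausibly set to zero.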
### 3. Clustering Algorithm Tests

#### Test Case 3.1: K-Means

**Objective:** verify K-Means clustering

**Test code:**

```python
def test_kmeans():
    """Test K-Means clustering."""
    from sklearn.cluster import KMeans
    from sklearn.datasets import make_blobs
    from sklearn.metrics import adjusted_rand_score

    # Create test data
    X, y_true = make_blobs(
        n_samples=300,
        centers=4,
        cluster_std=0.60,
        random_state=42,
    )
    # Fit the model and predict cluster labels
    model = KMeans(n_clusters=4, random_state=42, n_init=10)
    y_pred = model.fit_predict(X)
    # Verify agreement with the true grouping
    ari = adjusted_rand_score(y_true, y_pred)
    assert ari > 0.8, f"ARI too low: {ari}"
    # Verify the cluster centers
    assert model.cluster_centers_.shape == (4, 2), "Wrong cluster-center dimensions"
    # Verify the labels
    assert len(np.unique(y_pred)) == 4, "Incorrect number of clusters"
    print(f"✓ K-Means test passed (ARI: {ari:.4f})")
```

**Expected result:** ARI > 0.8
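ARI, rather than plain accuracy, is the right score here because cluster ids are arbitrary: K-Means may call the same group 0 in one run and 3 in another. A short sanity check (not part of the original suite) showing that `adjusted_rand_score` is invariant to relabeling while element-wise comparison is not:

```python
import numpy as np
from sklearn.metrics import adjusted_rand_score

y_true = np.array([0, 0, 1, 1, 2, 2])
# The identical partition, but every cluster has been given a different id.
y_relabelled = np.array([2, 2, 0, 0, 1, 1])

# ARI sees the same grouping and returns a perfect score...
assert adjusted_rand_score(y_true, y_relabelled) == 1.0
# ...while naive element-wise "accuracy" would be 0 here.
assert np.mean(y_true == y_relabelled) == 0.0
```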
#### Test Case 3.2: DBSCAN

**Objective:** verify DBSCAN clustering

**Test code:**

```python
def test_dbscan():
    """Test DBSCAN clustering."""
    from sklearn.cluster import DBSCAN
    from sklearn.datasets import make_moons
    from sklearn.metrics import adjusted_rand_score

    # Create test data
    X, y_true = make_moons(n_samples=300, noise=0.05, random_state=42)
    # Fit the model and predict cluster labels
    model = DBSCAN(eps=0.3, min_samples=5)
    y_pred = model.fit_predict(X)
    # Verify agreement with the true grouping
    ari = adjusted_rand_score(y_true, y_pred)
    assert ari > 0.8, f"ARI too low: {ari}"
    # Count noise points (DBSCAN labels them -1)
    n_noise = np.sum(y_pred == -1)
    assert n_noise >= 0, "Abnormal noise-point count"
    # Verify the number of clusters, excluding the noise label
    n_clusters = len(set(y_pred)) - (1 if -1 in y_pred else 0)
    assert n_clusters == 2, f"Incorrect number of clusters: {n_clusters}"
    print(f"✓ DBSCAN test passed (ARI: {ari:.4f}, noise points: {n_noise})")
```

**Expected result:** ARI > 0.8, 2 clusters identified
### 4. Evaluation Metric Tests

#### Test Case 4.1: Classification Metrics

**Objective:** verify classification evaluation metrics

**Test code:**

```python
def test_classification_metrics():
    """Test classification evaluation metrics."""
    from sklearn.metrics import (
        accuracy_score, precision_score, recall_score,
        f1_score, confusion_matrix, roc_auc_score,
    )

    # Create test data
    y_true = np.array([0, 1, 0, 1, 0, 1, 0, 1])
    y_pred = np.array([0, 1, 0, 0, 0, 1, 1, 1])
    y_proba = np.array([0.1, 0.9, 0.2, 0.4, 0.3, 0.8, 0.7, 0.6])
    # Compute the metrics
    accuracy = accuracy_score(y_true, y_pred)
    precision = precision_score(y_true, y_pred)
    recall = recall_score(y_true, y_pred)
    f1 = f1_score(y_true, y_pred)
    cm = confusion_matrix(y_true, y_pred)
    auc = roc_auc_score(y_true, y_proba)
    # Verify metric ranges
    assert 0 <= accuracy <= 1, "Accuracy out of range"
    assert 0 <= precision <= 1, "Precision out of range"
    assert 0 <= recall <= 1, "Recall out of range"
    assert 0 <= f1 <= 1, "F1 score out of range"
    assert 0 <= auc <= 1, "AUC out of range"
    # Verify the confusion matrix
    assert cm.shape == (2, 2), "Wrong confusion-matrix dimensions"
    print("✓ Classification metrics test passed")
    print(f"  Accuracy:  {accuracy:.4f}")
    print(f"  Precision: {precision:.4f}")
    print(f"  Recall:    {recall:.4f}")
    print(f"  F1 score:  {f1:.4f}")
    print(f"  AUC:       {auc:.4f}")
```

**Expected result:** all metrics within their valid ranges
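As a cross-check on the metric definitions, precision and recall can be re-derived by hand from the confusion-matrix cells. This sketch reuses the same `y_true`/`y_pred` arrays as the test case above:

```python
import numpy as np
from sklearn.metrics import confusion_matrix, precision_score, recall_score

y_true = np.array([0, 1, 0, 1, 0, 1, 0, 1])
y_pred = np.array([0, 1, 0, 0, 0, 1, 1, 1])

# For binary labels, ravel() yields the cells in (tn, fp, fn, tp) order.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

# precision = TP / (TP + FP); recall = TP / (TP + FN)
assert precision_score(y_true, y_pred) == tp / (tp + fp)  # 3 / 4 = 0.75
assert recall_score(y_true, y_pred) == tp / (tp + fn)     # 3 / 4 = 0.75
```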
#### Test Case 4.2: Regression Metrics

**Objective:** verify regression evaluation metrics

**Test code:**

```python
def test_regression_metrics():
    """Test regression evaluation metrics."""
    from sklearn.metrics import (
        mean_squared_error, mean_absolute_error,
        r2_score, mean_absolute_percentage_error,
    )

    # Create test data
    y_true = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y_pred = np.array([1.1, 1.9, 3.2, 3.8, 5.1])
    # Compute the metrics
    mse = mean_squared_error(y_true, y_pred)
    mae = mean_absolute_error(y_true, y_pred)
    r2 = r2_score(y_true, y_pred)
    mape = mean_absolute_percentage_error(y_true, y_pred)
    # Verify the metrics
    assert mse >= 0, "MSE must be non-negative"
    assert mae >= 0, "MAE must be non-negative"
    assert r2 <= 1.0, "R² must be <= 1"
    assert mape >= 0, "MAPE must be non-negative"
    # Verify a perfect prediction
    y_pred_perfect = y_true.copy()
    mse_perfect = mean_squared_error(y_true, y_pred_perfect)
    r2_perfect = r2_score(y_true, y_pred_perfect)
    assert mse_perfect == 0, "MSE of a perfect prediction must be 0"
    assert r2_perfect == 1.0, "R² of a perfect prediction must be 1"
    print("✓ Regression metrics test passed")
    print(f"  MSE:  {mse:.4f}")
    print(f"  MAE:  {mae:.4f}")
    print(f"  R²:   {r2:.4f}")
    print(f"  MAPE: {mape:.4f}")
```

**Expected result:** all metrics within their valid ranges
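The same definitions can be verified by computing MSE and MAE manually with NumPy on the test arrays above, which also documents what the metrics actually are:

```python
import numpy as np
from sklearn.metrics import mean_squared_error, mean_absolute_error

y_true = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_pred = np.array([1.1, 1.9, 3.2, 3.8, 5.1])

# MSE = mean((y - ŷ)²); MAE = mean(|y - ŷ|)
mse_manual = np.mean((y_true - y_pred) ** 2)
mae_manual = np.mean(np.abs(y_true - y_pred))

assert np.isclose(mse_manual, mean_squared_error(y_true, y_pred))
assert np.isclose(mae_manual, mean_absolute_error(y_true, y_pred))
print(round(float(mse_manual), 3), round(float(mae_manual), 3))  # 0.022 0.14
```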
### 5. Performance Tests

#### Test Case 5.1: Training Speed

**Objective:** measure algorithm training speed

**Test code:**

```python
import time


def test_training_speed():
    """Test training speed."""
    from sklearn.ensemble import RandomForestClassifier

    # Create test data
    X, y = make_classification(
        n_samples=10000,
        n_features=20,
        n_classes=2,
        random_state=42,
    )
    # Train the model and time the fit
    model = RandomForestClassifier(
        n_estimators=100,
        random_state=42,
        n_jobs=-1,
    )
    start_time = time.time()
    model.fit(X, y)
    end_time = time.time()
    training_time = end_time - start_time
    print("✓ Training speed test passed")
    print(f"  Training time:      {training_time:.2f}s")
    print(f"  Samples:            {len(X)}")
    print(f"  Samples per second: {len(X)/training_time:.2f}")
    # Verify the training time is within a reasonable bound
    assert training_time < 30, "Training took too long"
```

**Expected result:** training time < 30 s
#### Test Case 5.2: Prediction Speed

**Objective:** measure algorithm prediction speed

**Test code:**

```python
def test_prediction_speed():
    """Test prediction speed."""
    from sklearn.ensemble import RandomForestClassifier

    # Create test data
    X_train, y_train = make_classification(
        n_samples=1000,
        n_features=20,
        random_state=42,
    )
    X_test, _ = make_classification(
        n_samples=10000,
        n_features=20,
        random_state=42,
    )
    # Train the model
    model = RandomForestClassifier(
        n_estimators=100,
        random_state=42,
    )
    model.fit(X_train, y_train)
    # Time the prediction
    start_time = time.time()
    y_pred = model.predict(X_test)
    end_time = time.time()
    prediction_time = end_time - start_time
    throughput = len(X_test) / prediction_time
    print("✓ Prediction speed test passed")
    print(f"  Prediction time:    {prediction_time:.2f}s")
    print(f"  Samples:            {len(X_test)}")
    print(f"  Samples per second: {throughput:.2f}")
    # Verify the prediction speed
    assert prediction_time < 5, "Prediction took too long"
    assert throughput > 1000, "Throughput too low"
```

**Expected result:** prediction time < 5 s, throughput > 1000 samples/s
## 📊 Running the Tests

### Run All Tests

```bash
# Run all tests
pytest tests/test_ml_algorithms.py -v

# Run a specific test
pytest tests/test_ml_algorithms.py::test_logistic_regression -v

# Generate a coverage report
pytest tests/test_ml_algorithms.py --cov=sklearn --cov-report=html
```
## ✅ Verification Methods

### 1. Automated Verification

- Run all test cases
- Check that every assertion passes
- Record the test results

### 2. Performance Baselines

- Establish performance baselines
- Monitor changes in algorithm performance
- Tune algorithm parameters
### 3. Comparative Analysis

- Compare the performance of different algorithms
- Analyze the strengths and weaknesses of each
- Select the best algorithm
## 📝 Test Report

The test report should include:

- Test overview
  - Number of test cases
  - Pass/fail statistics
  - Algorithm performance comparison
- Detailed results
  - Test results for each algorithm
  - Performance metrics
  - Recommended algorithm
- Problem analysis
  - Root-cause analysis of failures
  - Improvement suggestions
  - Follow-up plan
**Completion criteria:** all test cases pass
**Recommended test frequency:** on every algorithm update
**Maintenance cycle:** weekly