
Machine Learning Algorithm Test Cases

Test objective: verify the functionality and performance of machine learning algorithms. Test types: unit, integration, and performance tests. Components covered: classification algorithms, regression algorithms, clustering algorithms, evaluation metrics


📋 Test Overview

Test Objectives

  1. Functional testing: verify algorithm correctness
  2. Performance testing: measure computational efficiency
  3. Robustness testing: check algorithm stability on abnormal data
  4. Comparative testing: compare the performance of different algorithms
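
Objective 3 has no dedicated case in the list below; a minimal robustness sketch, assuming scikit-learn's documented behavior of rejecting NaN input with a ValueError (`check_rejects_nan` is a hypothetical helper, not part of the suite):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def check_rejects_nan():
    """Hypothetical robustness check: the model should fail loudly on NaN input."""
    X = np.array([[1.0, 2.0], [3.0, np.nan], [5.0, 6.0], [7.0, 8.0]])
    y = np.array([0, 1, 0, 1])
    try:
        LogisticRegression().fit(X, y)
    except ValueError:
        return True  # expected: scikit-learn's input validation rejects NaN
    return False

nan_rejected = check_rejects_nan()
```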

Test Environment

  • Python version: 3.8+
  • Machine learning framework: scikit-learn
  • Test framework: pytest
  • Numerical computing: NumPy
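
A quick preflight check of this environment can be scripted; a minimal stdlib sketch (the `missing_packages` helper is an illustrative convenience, not part of the suite):

```python
import importlib.util
import sys

REQUIRED = ("numpy", "sklearn", "pytest")  # packages the suite assumes

def missing_packages(required=REQUIRED):
    """Return the names of required packages that cannot be imported."""
    return [name for name in required if importlib.util.find_spec(name) is None]

python_ok = sys.version_info >= (3, 8)
missing = missing_packages()
print(f"Python >= 3.8: {python_ok}, missing packages: {missing or 'none'}")
```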

🧪 Test Case List

1. Classification Algorithm Tests

Test Case 1.1: Logistic Regression

Test objective: verify the logistic regression algorithm

Test code:

Python
import pytest
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

def test_logistic_regression():
    """Test logistic regression."""
    # Create test data
    X, y = make_classification(
        n_samples=100,
        n_features=10,
        n_classes=2,
        random_state=42,
    )

    # Split the data
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )

    # Train the model
    model = LogisticRegression(random_state=42)
    model.fit(X_train, y_train)

    # Predict
    y_pred = model.predict(X_test)

    # Verify accuracy
    accuracy = accuracy_score(y_test, y_pred)
    assert accuracy > 0.7, f"Accuracy too low: {accuracy}"

    # Verify the probability output
    y_proba = model.predict_proba(X_test)
    assert y_proba.shape == (len(y_test), 2)
    assert np.allclose(y_proba.sum(axis=1), 1.0), "Probabilities do not sum to 1"

    print(f"✓ Logistic regression test passed (accuracy: {accuracy:.2%})")

Expected result: accuracy > 70%, probabilities sum to 1


Test Case 1.2: Decision Tree

Test objective: verify the decision tree algorithm

Test code:

Python
def test_decision_tree():
    """Test the decision tree."""
    from sklearn.tree import DecisionTreeClassifier

    # Create test data (n_informative must satisfy
    # n_classes * n_clusters_per_class <= 2**n_informative)
    X, y = make_classification(
        n_samples=100,
        n_features=10,
        n_informative=5,
        n_classes=3,
        random_state=42,
    )

    # Split the data
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )

    # Train the model
    model = DecisionTreeClassifier(random_state=42, max_depth=5)
    model.fit(X_train, y_train)

    # Predict
    y_pred = model.predict(X_test)

    # Verify accuracy
    accuracy = accuracy_score(y_test, y_pred)
    assert accuracy > 0.6, f"Accuracy too low: {accuracy}"

    # Verify the tree depth
    assert model.get_depth() <= 5, "Tree depth exceeds the limit"

    # Verify feature importances
    importance = model.feature_importances_
    assert len(importance) == 10
    assert np.allclose(importance.sum(), 1.0), "Feature importances do not sum to 1"

    print(f"✓ Decision tree test passed (accuracy: {accuracy:.2%}, depth: {model.get_depth()})")

Expected result: accuracy > 60%, tree depth ≤ 5


Test Case 1.3: Random Forest

Test objective: verify the random forest algorithm

Test code:

Python
def test_random_forest():
    """Test the random forest."""
    from sklearn.ensemble import RandomForestClassifier

    # Create test data
    X, y = make_classification(
        n_samples=200,
        n_features=20,
        n_classes=2,
        random_state=42,
    )

    # Split the data
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )

    # Train the model
    model = RandomForestClassifier(
        n_estimators=100,
        random_state=42,
        n_jobs=-1,
    )
    model.fit(X_train, y_train)

    # Predict
    y_pred = model.predict(X_test)

    # Verify accuracy
    accuracy = accuracy_score(y_test, y_pred)
    assert accuracy > 0.8, f"Accuracy too low: {accuracy}"

    # Verify the number of trees
    assert len(model.estimators_) == 100, "Unexpected number of trees"

    # Verify feature importances
    importance = model.feature_importances_
    assert len(importance) == 20
    assert np.allclose(importance.sum(), 1.0), "Feature importances do not sum to 1"

    print(f"✓ Random forest test passed (accuracy: {accuracy:.2%})")

Expected result: accuracy > 80%


Test Case 1.4: Support Vector Machine

Test objective: verify the SVM algorithm

Test code:

Python
def test_svm():
    """Test the support vector machine."""
    from sklearn.svm import SVC

    # Create test data
    X, y = make_classification(
        n_samples=100,
        n_features=10,
        n_classes=2,
        random_state=42,
    )

    # Split the data
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )

    # Train the model
    model = SVC(kernel='rbf', random_state=42)
    model.fit(X_train, y_train)

    # Predict
    y_pred = model.predict(X_test)

    # Verify accuracy
    accuracy = accuracy_score(y_test, y_pred)
    assert accuracy > 0.7, f"Accuracy too low: {accuracy}"

    # Verify the support vectors
    assert len(model.support_) > 0, "No support vectors found"
    assert len(model.support_) <= len(X_train), "More support vectors than training samples"

    print(f"✓ SVM test passed (accuracy: {accuracy:.2%}, support vectors: {len(model.support_)})")

Expected result: accuracy > 70%


2. Regression Algorithm Tests

Test Case 2.1: Linear Regression

Test objective: verify the linear regression algorithm

Test code:

Python
def test_linear_regression():
    """Test linear regression."""
    from sklearn.linear_model import LinearRegression
    from sklearn.datasets import make_regression
    from sklearn.metrics import mean_squared_error, r2_score

    # Create test data
    X, y = make_regression(
        n_samples=100,
        n_features=5,
        noise=0.1,
        random_state=42,
    )

    # Split the data
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )

    # Train the model
    model = LinearRegression()
    model.fit(X_train, y_train)

    # Predict
    y_pred = model.predict(X_test)

    # Verify goodness of fit
    mse = mean_squared_error(y_test, y_pred)
    r2 = r2_score(y_test, y_pred)

    assert r2 > 0.5, f"R² too low: {r2}"

    # Verify the coefficients
    assert len(model.coef_) == 5, "Unexpected number of coefficients"
    assert model.intercept_ is not None, "Missing intercept"

    print(f"✓ Linear regression test passed (R²: {r2:.4f}, MSE: {mse:.4f})")

Expected result: R² > 0.5


Test Case 2.2: Ridge Regression

Test objective: verify ridge regression

Test code:

Python
def test_ridge_regression():
    """Test ridge regression."""
    from sklearn.linear_model import Ridge
    from sklearn.datasets import make_regression
    from sklearn.metrics import mean_squared_error, r2_score

    # Create test data
    X, y = make_regression(
        n_samples=100,
        n_features=10,
        noise=0.1,
        random_state=42,
    )

    # Split the data
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )

    # Train the model
    model = Ridge(alpha=1.0, random_state=42)
    model.fit(X_train, y_train)

    # Predict
    y_pred = model.predict(X_test)

    # Verify goodness of fit
    mse = mean_squared_error(y_test, y_pred)
    r2 = r2_score(y_test, y_pred)

    assert r2 > 0.5, f"R² too low: {r2}"

    # Verify the regularization effect:
    # ridge coefficients should have a smaller norm than plain linear regression's
    from sklearn.linear_model import LinearRegression
    lr = LinearRegression()
    lr.fit(X_train, y_train)

    ridge_norm = np.linalg.norm(model.coef_)  # np.linalg: linear algebra routines
    lr_norm = np.linalg.norm(lr.coef_)

    assert ridge_norm < lr_norm, "Ridge regularization had no effect"

    print(f"✓ Ridge regression test passed (R²: {r2:.4f}, coefficient norm: {ridge_norm:.4f})")

Expected result: R² > 0.5, coefficient norm smaller than plain linear regression's


Test Case 2.3: Lasso Regression

Test objective: verify lasso regression

Test code:

Python
def test_lasso_regression():
    """Test lasso regression."""
    from sklearn.linear_model import Lasso
    from sklearn.datasets import make_regression
    from sklearn.metrics import mean_squared_error, r2_score

    # Create test data
    X, y = make_regression(
        n_samples=100,
        n_features=10,
        n_informative=5,
        noise=0.1,
        random_state=42,
    )

    # Split the data
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )

    # Train the model
    model = Lasso(alpha=0.1, random_state=42)
    model.fit(X_train, y_train)

    # Predict
    y_pred = model.predict(X_test)

    # Verify goodness of fit
    mse = mean_squared_error(y_test, y_pred)
    r2 = r2_score(y_test, y_pred)

    assert r2 > 0.5, f"R² too low: {r2}"

    # Verify sparsity: lasso should drive some coefficients to zero
    zero_coef_count = np.sum(np.abs(model.coef_) < 1e-10)
    assert zero_coef_count > 0, "Lasso produced no sparse coefficients"

    print(f"✓ Lasso regression test passed (R²: {r2:.4f}, zero coefficients: {zero_coef_count})")

Expected result: R² > 0.5, sparse coefficients produced


3. Clustering Algorithm Tests

Test Case 3.1: K-Means

Test objective: verify K-Means clustering

Test code:

Python
def test_kmeans():
    """Test K-Means clustering."""
    from sklearn.cluster import KMeans
    from sklearn.datasets import make_blobs
    from sklearn.metrics import adjusted_rand_score

    # Create test data
    X, y_true = make_blobs(
        n_samples=300,
        centers=4,
        cluster_std=0.60,
        random_state=42,
    )

    # Train the model
    model = KMeans(n_clusters=4, random_state=42, n_init=10)
    y_pred = model.fit_predict(X)

    # Verify agreement with the ground-truth labels
    ari = adjusted_rand_score(y_true, y_pred)
    assert ari > 0.8, f"ARI too low: {ari}"

    # Verify the cluster centers (make_blobs defaults to 2 features)
    assert model.cluster_centers_.shape == (4, 2), "Unexpected cluster center shape"

    # Verify the labels
    assert len(np.unique(y_pred)) == 4, "Unexpected number of clusters"

    print(f"✓ K-Means test passed (ARI: {ari:.4f})")

Expected result: ARI > 0.8


Test Case 3.2: DBSCAN

Test objective: verify DBSCAN clustering

Test code:

Python
def test_dbscan():
    """Test DBSCAN clustering."""
    from sklearn.cluster import DBSCAN
    from sklearn.datasets import make_moons
    from sklearn.metrics import adjusted_rand_score

    # Create test data
    X, y_true = make_moons(n_samples=300, noise=0.05, random_state=42)

    # Train the model
    model = DBSCAN(eps=0.3, min_samples=5)
    y_pred = model.fit_predict(X)

    # Verify agreement with the ground-truth labels
    ari = adjusted_rand_score(y_true, y_pred)
    assert ari > 0.8, f"ARI too low: {ari}"

    # Verify the noise points (DBSCAN labels noise as -1)
    n_noise = np.sum(y_pred == -1)
    assert n_noise >= 0, "Unexpected noise point count"

    # Verify the number of clusters (excluding the noise label)
    n_clusters = len(set(y_pred)) - (1 if -1 in y_pred else 0)
    assert n_clusters == 2, f"Unexpected number of clusters: {n_clusters}"

    print(f"✓ DBSCAN test passed (ARI: {ari:.4f}, noise points: {n_noise})")

Expected result: ARI > 0.8, 2 clusters identified
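
Both clustering cases assert against ground-truth labels via ARI. When labels are unavailable, an internal metric such as the silhouette score can serve as the assertion target instead; a minimal sketch using the same blob data (the 0.5 threshold is an illustrative assumption):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Well-separated blobs: a reasonable clustering should score well
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=0.60, random_state=42)
labels = KMeans(n_clusters=4, random_state=42, n_init=10).fit_predict(X)

# Silhouette ranges from -1 (wrong assignment) to 1 (dense, well-separated
# clusters) and needs no ground-truth labels
score = silhouette_score(X, labels)
print(f"Silhouette: {score:.4f}")
```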


4. Evaluation Metric Tests

Test Case 4.1: Classification Metrics

Test objective: verify the classification evaluation metrics

Test code:

Python
def test_classification_metrics():
    """Test classification evaluation metrics."""
    from sklearn.metrics import (
        accuracy_score, precision_score, recall_score,
        f1_score, confusion_matrix, roc_auc_score,
    )

    # Create test data
    y_true = np.array([0, 1, 0, 1, 0, 1, 0, 1])
    y_pred = np.array([0, 1, 0, 0, 0, 1, 1, 1])
    y_proba = np.array([0.1, 0.9, 0.2, 0.4, 0.3, 0.8, 0.7, 0.6])

    # Compute the metrics
    accuracy = accuracy_score(y_true, y_pred)
    precision = precision_score(y_true, y_pred)
    recall = recall_score(y_true, y_pred)
    f1 = f1_score(y_true, y_pred)
    cm = confusion_matrix(y_true, y_pred)
    auc = roc_auc_score(y_true, y_proba)

    # Verify the metric ranges
    assert 0 <= accuracy <= 1, "Accuracy out of range"
    assert 0 <= precision <= 1, "Precision out of range"
    assert 0 <= recall <= 1, "Recall out of range"
    assert 0 <= f1 <= 1, "F1 score out of range"
    assert 0 <= auc <= 1, "AUC out of range"

    # Verify the confusion matrix
    assert cm.shape == (2, 2), "Unexpected confusion matrix shape"

    print("✓ Classification metrics test passed")
    print(f"  Accuracy: {accuracy:.4f}")
    print(f"  Precision: {precision:.4f}")
    print(f"  Recall: {recall:.4f}")
    print(f"  F1 score: {f1:.4f}")
    print(f"  AUC: {auc:.4f}")

Expected result: all metrics within valid ranges


Test Case 4.2: Regression Metrics

Test objective: verify the regression evaluation metrics

Test code:

Python
def test_regression_metrics():
    """Test regression evaluation metrics."""
    from sklearn.metrics import (
        mean_squared_error, mean_absolute_error,
        r2_score, mean_absolute_percentage_error,
    )

    # Create test data
    y_true = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
    y_pred = np.array([1.1, 1.9, 3.2, 3.8, 5.1])

    # Compute the metrics
    mse = mean_squared_error(y_true, y_pred)
    mae = mean_absolute_error(y_true, y_pred)
    r2 = r2_score(y_true, y_pred)
    mape = mean_absolute_percentage_error(y_true, y_pred)

    # Verify the metrics
    assert mse >= 0, "MSE must be non-negative"
    assert mae >= 0, "MAE must be non-negative"
    assert r2 <= 1.0, "R² must be <= 1"
    assert mape >= 0, "MAPE must be non-negative"

    # Verify a perfect prediction
    y_pred_perfect = y_true.copy()
    mse_perfect = mean_squared_error(y_true, y_pred_perfect)
    r2_perfect = r2_score(y_true, y_pred_perfect)

    assert mse_perfect == 0, "MSE of a perfect prediction must be 0"
    assert r2_perfect == 1.0, "R² of a perfect prediction must be 1"

    print("✓ Regression metrics test passed")
    print(f"  MSE: {mse:.4f}")
    print(f"  MAE: {mae:.4f}")
    print(f"  R²: {r2:.4f}")
    print(f"  MAPE: {mape:.4f}")

Expected result: all metrics within valid ranges


5. Performance Tests

Test Case 5.1: Training Speed

Test objective: measure algorithm training speed

Test code:

Python
import time

def test_training_speed():
    """Test training speed."""
    from sklearn.ensemble import RandomForestClassifier

    # Create test data
    X, y = make_classification(
        n_samples=10000,
        n_features=20,
        n_classes=2,
        random_state=42,
    )

    # Train the model
    model = RandomForestClassifier(
        n_estimators=100,
        random_state=42,
        n_jobs=-1,
    )

    # time.perf_counter() is preferable to time.time() for interval timing
    start_time = time.perf_counter()
    model.fit(X, y)
    end_time = time.perf_counter()

    training_time = end_time - start_time

    print("✓ Training speed test passed")
    print(f"  Training time: {training_time:.2f}s")
    print(f"  Samples: {len(X)}")
    print(f"  Samples per second: {len(X)/training_time:.2f}")

    # Verify the training time stays within budget
    assert training_time < 30, "Training took too long"

Expected result: training time < 30 s


Test Case 5.2: Prediction Speed

Test objective: measure algorithm prediction speed

Test code:

Python
def test_prediction_speed():
    """Test prediction speed."""
    from sklearn.ensemble import RandomForestClassifier

    # Create test data
    X_train, y_train = make_classification(
        n_samples=1000,
        n_features=20,
        random_state=42,
    )
    X_test, _ = make_classification(
        n_samples=10000,
        n_features=20,
        random_state=42,
    )

    # Train the model
    model = RandomForestClassifier(
        n_estimators=100,
        random_state=42,
    )
    model.fit(X_train, y_train)

    # Time the prediction
    start_time = time.perf_counter()
    y_pred = model.predict(X_test)
    end_time = time.perf_counter()

    prediction_time = end_time - start_time
    throughput = len(X_test) / prediction_time

    print("✓ Prediction speed test passed")
    print(f"  Prediction time: {prediction_time:.2f}s")
    print(f"  Samples: {len(X_test)}")
    print(f"  Samples per second: {throughput:.2f}")

    # Verify the prediction speed
    assert prediction_time < 5, "Prediction took too long"
    assert throughput > 1000, "Throughput too low"

Expected result: prediction time < 5 s, throughput > 1000 samples/s


📊 Test Execution

Run all tests

Bash
# Run all tests
pytest tests/test_ml_algorithms.py -v

# Run a specific test
pytest tests/test_ml_algorithms.py::test_logistic_regression -v

# Generate a coverage report
pytest tests/test_ml_algorithms.py --cov=sklearn --cov-report=html

✅ Verification Methods

1. Automated Verification

  • Run all test cases
  • Check that all assertions pass
  • Record the test results

2. Performance Benchmarks

  • Establish performance baselines
  • Monitor changes in algorithm performance
  • Tune algorithm parameters
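
One way to enforce a baseline is a tolerance gate on recorded timings; a minimal stdlib sketch (the `within_baseline` helper and the 20% tolerance are illustrative assumptions, not part of the suite):

```python
def within_baseline(value, baseline, tolerance=0.2):
    """Regression gate: value may exceed the recorded baseline by at most
    `tolerance` (default 20%)."""
    return value <= baseline * (1 + tolerance)

# Example: a recorded baseline of 10 s for random forest training
ok_run = within_baseline(8.5, 10.0)     # within budget
slow_run = within_baseline(13.0, 10.0)  # 30% slower than baseline
```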

3. Comparative Analysis

  • Compare the performance of different algorithms
  • Analyze each algorithm's strengths and weaknesses
  • Select the best algorithm
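
The comparison step can be scripted with cross-validation; a minimal sketch (the `compare_classifiers` helper and the model choices are illustrative, not part of the original suite):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

def compare_classifiers(X, y, cv=5):
    """Return mean cross-validated accuracy per candidate model."""
    candidates = {
        "logistic_regression": LogisticRegression(max_iter=1000, random_state=42),
        "decision_tree": DecisionTreeClassifier(max_depth=5, random_state=42),
        "random_forest": RandomForestClassifier(n_estimators=50, random_state=42),
    }
    return {name: cross_val_score(m, X, y, cv=cv).mean()
            for name, m in candidates.items()}

X, y = make_classification(n_samples=200, n_features=10, random_state=42)
scores = compare_classifiers(X, y)
best = max(scores, key=scores.get)
print(f"Best by CV accuracy: {best} ({scores[best]:.2%})")
```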

📝 Test Report

The test report should include:

  1. Test overview
     • Number of test cases
     • Pass/fail statistics
     • Algorithm performance comparison
  2. Detailed results
     • Results for each algorithm
     • Performance metrics
     • Recommended algorithm
  3. Problem analysis
     • Failure root-cause analysis
     • Improvement suggestions
     • Follow-up plan

Completion criterion: all test cases pass. Recommended test frequency: on every algorithm update. Maintenance cycle: weekly.