01-卷积神经网络基础¶
📌 章节定位:本文档隶属于深度学习教程体系,侧重CNN的数学原理与理论推导。

- 本文档重点:卷积运算的数学公式、参数计算推导、感受野计算、各种卷积变体原理
- 应用实践方向:如需了解CNN在图像分类、目标检测等CV任务中的实际应用、预训练模型使用、迁移学习等内容,请参考 计算机视觉/05-卷积神经网络基础.md
- 学习时间: 约 6-8 小时
- 难度级别: ⭐⭐⭐ 中级
- 前置知识: 神经网络基础、反向传播算法、PyTorch 基础
- 学习目标: 深入理解卷积运算原理,掌握CNN核心组件及PyTorch实现
目录¶
- 1. 卷积运算详解
- 2. 卷积层参数计算
- 3. 填充与步长
- 4. 池化层
- 5. 感受野计算
- 6. 转置卷积(反卷积)
- 7. 深度可分离卷积
- 8. 空洞卷积
- 9. 完整CNN结构示例
- 10. 练习与自我检查
1. 卷积运算详解¶
1.1 从全连接到卷积¶
全连接层处理图像的问题:

- 参数爆炸: 一张 \(224 \times 224 \times 3\) 的图像展平后有 150,528 个输入,若隐藏层有 1000 个神经元,需要约 1.5 亿个参数
- 忽略空间结构: 将图像展平丢失了像素间的空间关系
- 无平移不变性: 同一物体在不同位置需要重新学习
卷积层通过三个关键特性解决这些问题:

1. 局部连接(Local Connectivity): 每个神经元只连接输入的一个小区域
2. 参数共享(Parameter Sharing): 同一卷积核在所有位置共享参数
3. 平移等变性(Translation Equivariance): 输入平移 → 输出平移
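下面用几行纯 Python 直观感受一下参数量差距。数字仅作示意:假设隐藏层有 1000 个神经元,对比 1000 个 \(3 \times 3\) 卷积核(此对比配置为本节的假设,非某个具体网络):

```python
# 全连接:224x224x3 输入展平后接 1000 个神经元(不含偏置)
fc_params = 224 * 224 * 3 * 1000          # 150,528,000 ≈ 1.5 亿

# 卷积:1000 个 3x3x3 卷积核(含偏置),参数量与输入尺寸无关
conv_params = 1000 * (3 * 3 * 3 + 1)      # 28,000

print(f"全连接: {fc_params:,}, 卷积: {conv_params:,}")
```

可以看到,参数共享使卷积层的参数量与输入分辨率无关,相差数千倍。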
1.2 一维卷积(1D Convolution)¶
一维卷积常用于序列数据(时间序列、文本等)。
对于输入信号 \(x\) 和核 \(w\),离散卷积定义为:

\[(x * w)(i) = \sum_{j} x(i - j)\, w(j)\]

注:在深度学习中通常使用互相关(Cross-correlation)\((x \star w)(i) = \sum_{j} x(i + j)\, w(j)\),而非数学意义上的卷积(后者需翻转核),但习惯上仍称之为"卷积"。
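为说明两者的区别,下面用纯 Python 对同一信号分别做互相关和翻转核后的数学卷积,仅作示意:

```python
def cross_correlate(x, w):
    """互相关:核不翻转,逐位置做内积"""
    k = len(w)
    return [sum(x[i + j] * w[j] for j in range(k)) for i in range(len(x) - k + 1)]

x = [1.0, 2, 3, 4, 5, 6, 7, 8]
w = [1.0, 0, -1]

corr = cross_correlate(x, w)        # 互相关:x[i] - x[i+2],每处为 -2
conv = cross_correlate(x, w[::-1])  # 翻转核即得数学卷积:每处为 2
print(corr, conv)
```

由于卷积核本身是可学习的,翻转与否不影响网络的表达能力,这也是深度学习框架直接使用互相关的原因。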
import torch
import torch.nn as nn
import torch.nn.functional as F
# ===== 手动实现 1D 卷积 =====
def conv1d_manual(input_signal, kernel):
    """手动实现1D卷积(互相关)"""
    input_len = len(input_signal)
    kernel_len = len(kernel)
    output_len = input_len - kernel_len + 1
    output = torch.zeros(output_len)
    for i in range(output_len):
        output[i] = torch.sum(input_signal[i:i+kernel_len] * kernel)
    return output
# 测试
signal = torch.tensor([1.0, 2, 3, 4, 5, 6, 7, 8])
kernel = torch.tensor([1.0, 0, -1])
print("手动1D卷积:", conv1d_manual(signal, kernel))
# PyTorch 1D 卷积
# 输入: (batch_size, in_channels, length)
conv1d = nn.Conv1d(in_channels=1, out_channels=4, kernel_size=3, padding=1)
x = torch.randn(2, 1, 100) # batch=2, channels=1, length=100
output = conv1d(x)
print(f"Conv1d 输出 shape: {output.shape}") # (2, 4, 100)
1.3 二维卷积(2D Convolution)¶
二维卷积是图像处理中最核心的操作。
对于输入特征图 \(X \in \mathbb{R}^{H \times W}\) 和卷积核 \(K \in \mathbb{R}^{k_h \times k_w}\):

\[Y(i, j) = \sum_{m=0}^{k_h-1} \sum_{n=0}^{k_w-1} X(i+m,\, j+n)\, K(m, n)\]
对于多通道输入 \(X \in \mathbb{R}^{C_{in} \times H \times W}\) 和多通道输出,卷积核 \(K \in \mathbb{R}^{C_{out} \times C_{in} \times k_h \times k_w}\):

\[Y_{c}(i, j) = \sum_{c'=0}^{C_{in}-1} \sum_{m=0}^{k_h-1} \sum_{n=0}^{k_w-1} X_{c'}(i+m,\, j+n)\, K_{c,\, c'}(m, n) + b_c\]
# ===== 手动实现 2D 卷积 =====
def conv2d_manual(input_tensor, kernel, bias=None):
    """
    手动实现2D卷积
    input_tensor: (C_in, H, W)
    kernel: (C_out, C_in, kH, kW)
    """
    C_out, C_in, kH, kW = kernel.shape
    _, H, W = input_tensor.shape
    out_H = H - kH + 1
    out_W = W - kW + 1
    output = torch.zeros(C_out, out_H, out_W)
    for co in range(C_out):
        for i in range(out_H):
            for j in range(out_W):
                # 提取局部区域,与对应核做元素乘积求和
                receptive_field = input_tensor[:, i:i+kH, j:j+kW]
                output[co, i, j] = torch.sum(receptive_field * kernel[co])
        if bias is not None:
            output[co] += bias[co]
    return output
# 测试
x = torch.randn(3, 8, 8) # 3通道, 8x8
k = torch.randn(16, 3, 3, 3) # 16个3x3卷积核
out = conv2d_manual(x, k)
print(f"手动2D卷积输出 shape: {out.shape}") # (16, 6, 6)
# PyTorch 2D 卷积
# 输入: (batch_size, C_in, H, W)
conv2d = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
x = torch.randn(4, 3, 32, 32) # batch=4, 3通道, 32x32
output = conv2d(x)
print(f"Conv2d 输出 shape: {output.shape}") # (4, 16, 32, 32)
1.4 三维卷积(3D Convolution)¶
三维卷积用于视频数据(时间 + 空间)或体积数据(如医学3D扫描):
# 3D 卷积
# 输入: (batch_size, C_in, D, H, W) — D 是深度/时间维
conv3d = nn.Conv3d(in_channels=3, out_channels=64, kernel_size=(3, 3, 3), padding=(1, 1, 1))
x = torch.randn(2, 3, 16, 112, 112) # batch=2, 3通道, 16帧, 112x112
output = conv3d(x)
print(f"Conv3d 输出 shape: {output.shape}") # (2, 64, 16, 112, 112)
2. 卷积层参数计算¶
2.1 输出尺寸公式¶
对于输入尺寸 \(H_{in}\),卷积核大小 \(k\),填充 \(p\),步长 \(s\):

\[H_{out} = \left\lfloor \frac{H_{in} + 2p - k}{s} \right\rfloor + 1\]
2.2 参数数量¶
\[\text{Params} = C_{out} \times (C_{in} \times k_h \times k_w + 1)\]

最后的 \(+1\) 是偏置项(如果有的话)。
2.3 计算量(FLOPs)¶
每个输出位置的乘加次数为 \(C_{in} \times k_h \times k_w\),总计算量:

\[\text{FLOPs} = 2 \times C_{out} \times H_{out} \times W_{out} \times C_{in} \times k_h \times k_w\]

因子 2 是因为每次乘法后跟一次加法。
def calc_conv_params(in_channels, out_channels, kernel_size,
                     input_size, padding=0, stride=1, bias=True):
    """计算卷积层的参数数量和计算量"""
    if isinstance(kernel_size, int):  # isinstance检查对象类型
        kh = kw = kernel_size
    else:
        kh, kw = kernel_size
    if isinstance(padding, int):
        ph = pw = padding
    else:
        ph, pw = padding
    if isinstance(stride, int):
        sh = sw = stride
    else:
        sh, sw = stride
    if isinstance(input_size, int):
        H_in = W_in = input_size
    else:
        H_in, W_in = input_size
    # 输出尺寸
    H_out = (H_in + 2*ph - kh) // sh + 1
    W_out = (W_in + 2*pw - kw) // sw + 1
    # 参数量
    params = out_channels * (in_channels * kh * kw)
    if bias:
        params += out_channels
    # FLOPs(乘加操作)
    flops = out_channels * H_out * W_out * (2 * in_channels * kh * kw)
    print(f"输出尺寸: ({out_channels}, {H_out}, {W_out})")
    print(f"参数量: {params:,}")
    print(f"FLOPs: {flops:,}")
    return {'output_size': (out_channels, H_out, W_out), 'params': params, 'flops': flops}
# 示例:ResNet 的第一个卷积层
calc_conv_params(in_channels=3, out_channels=64, kernel_size=7,
                 input_size=224, padding=3, stride=2)
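也可以手工代入公式,核对 ResNet 第一个卷积层的各项数值(纯 Python 示意):

```python
# ResNet conv1: k=7, s=2, p=3, 输入 224x224, 3→64 通道
H_out = (224 + 2 * 3 - 7) // 2 + 1            # (224+6-7)//2+1 = 112
params = 64 * (3 * 7 * 7 + 1)                 # 9,472(含偏置)
flops = 64 * H_out * H_out * (2 * 3 * 7 * 7)  # 236,027,904

print(H_out, params, flops)
```

三个数值应与 `calc_conv_params` 的输出一致。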
3. 填充与步长¶
3.1 填充(Padding)¶
- Valid 填充(无填充,\(p=0\)):输出尺寸缩小
- Same 填充(\(p = \lfloor k/2 \rfloor\),当 \(s=1\) 且 \(k\) 为奇数时):输出与输入尺寸相同
- Full 填充(\(p = k-1\)):输出尺寸增大
# 不同填充模式
conv_valid = nn.Conv2d(3, 16, kernel_size=3, padding=0) # 输出缩小 2
conv_same = nn.Conv2d(3, 16, kernel_size=3, padding=1) # 输出大小不变
conv_full = nn.Conv2d(3, 16, kernel_size=3, padding=2) # 输出增大 2
x = torch.randn(1, 3, 32, 32)
print(f"Valid: {conv_valid(x).shape}") # (1, 16, 30, 30)
print(f"Same: {conv_same(x).shape}") # (1, 16, 32, 32)
print(f"Full: {conv_full(x).shape}") # (1, 16, 34, 34)
# 零填充 vs 反射填充 vs 复制填充
x = torch.randn(1, 3, 32, 32)
x_pad_zero = F.pad(x, (1, 1, 1, 1), mode='constant', value=0)
x_pad_reflect = F.pad(x, (1, 1, 1, 1), mode='reflect')
x_pad_replicate = F.pad(x, (1, 1, 1, 1), mode='replicate')
3.2 步长(Stride)¶
步长控制卷积核每次移动的距离,\(s > 1\) 会产生下采样效果:
conv_s1 = nn.Conv2d(3, 16, kernel_size=3, stride=1, padding=1)
conv_s2 = nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1)
x = torch.randn(1, 3, 32, 32)
print(f"Stride=1: {conv_s1(x).shape}") # (1, 16, 32, 32)
print(f"Stride=2: {conv_s2(x).shape}") # (1, 16, 16, 16) — 尺寸减半
4. 池化层¶
4.1 最大池化(Max Pooling)¶
取局部区域的最大值,保留最显著的特征:

\[Y(i, j) = \max_{0 \le m,\, n < k} X(i \cdot s + m,\, j \cdot s + n)\]
4.2 平均池化(Average Pooling)¶
取局部区域的平均值:

\[Y(i, j) = \frac{1}{k^2} \sum_{0 \le m,\, n < k} X(i \cdot s + m,\, j \cdot s + n)\]
4.3 全局平均池化(Global Average Pooling)¶
对整个特征图取平均,将 \((C, H, W)\) 变为 \((C, 1, 1)\):

\[Y_c = \frac{1}{HW} \sum_{i=0}^{H-1} \sum_{j=0}^{W-1} X_c(i, j)\]
# ===== 手动实现池化 =====
def max_pool2d_manual(x, kernel_size=2, stride=2):
    """手动实现2D最大池化"""
    C, H, W = x.shape
    out_H = (H - kernel_size) // stride + 1
    out_W = (W - kernel_size) // stride + 1
    output = torch.zeros(C, out_H, out_W)
    for c in range(C):
        for i in range(out_H):
            for j in range(out_W):
                h_start = i * stride
                w_start = j * stride
                region = x[c, h_start:h_start+kernel_size, w_start:w_start+kernel_size]
                output[c, i, j] = region.max()
    return output
# ===== PyTorch 池化层 =====
# 最大池化
max_pool = nn.MaxPool2d(kernel_size=2, stride=2)
# 平均池化
avg_pool = nn.AvgPool2d(kernel_size=2, stride=2)
# 全局平均池化
global_avg_pool = nn.AdaptiveAvgPool2d(output_size=(1, 1))
# 全局最大池化
global_max_pool = nn.AdaptiveMaxPool2d(output_size=(1, 1))
x = torch.randn(1, 64, 32, 32)
print(f"MaxPool: {max_pool(x).shape}") # (1, 64, 16, 16)
print(f"AvgPool: {avg_pool(x).shape}") # (1, 64, 16, 16)
print(f"GAP: {global_avg_pool(x).shape}") # (1, 64, 1, 1)
print(f"GMP: {global_max_pool(x).shape}") # (1, 64, 1, 1)
# 自适应池化(指定输出大小,自动计算 kernel_size 和 stride)
adaptive_pool = nn.AdaptiveAvgPool2d(output_size=(7, 7))
print(f"Adaptive: {adaptive_pool(x).shape}") # (1, 64, 7, 7)
5. 感受野计算¶
5.1 什么是感受野¶
感受野(Receptive Field)是指输出特征图上一个元素所"看到"的输入图像区域大小。感受野越大,网络捕获的上下文信息越丰富。
5.2 感受野计算公式¶
对于第 \(l\) 层:

\[RF_l = RF_{l-1} + (k_l - 1) \times \prod_{i=1}^{l-1} s_i\]

或者等价地,从第一层开始递推(初始 \(RF_0 = 1\)),展开得:

\[RF_L = 1 + \sum_{l=1}^{L} (k_l - 1) \prod_{i=1}^{l-1} s_i\]

其中 \(k_i\) 是第 \(i\) 层的卷积核大小,\(s_i\) 是第 \(i\) 层的步长。
def compute_receptive_field(layers):
    """
    计算感受野
    layers: [(kernel_size, stride), ...] 从浅到深
    """
    rf = 1  # 初始感受野
    stride_product = 1  # 累积步长
    for k, s in layers:
        rf = rf + (k - 1) * stride_product
        stride_product *= s
    return rf
# VGG-16 的前几层感受野
vgg_layers = [
    (3, 1), (3, 1), (2, 2),          # conv-conv-pool
    (3, 1), (3, 1), (2, 2),          # conv-conv-pool
    (3, 1), (3, 1), (3, 1), (2, 2),  # conv-conv-conv-pool
]
print(f"VGG 感受野: {compute_receptive_field(vgg_layers)}")
# ResNet-18 的前几层
resnet_layers = [
    (7, 2),          # conv1
    (3, 2),          # maxpool
    (3, 1), (3, 1),  # res block
    (3, 1), (3, 1),  # res block
]
print(f"ResNet 感受野: {compute_receptive_field(resnet_layers)}")
# 3个3x3卷积 vs 1个7x7卷积
print(f"3个3x3: RF = {compute_receptive_field([(3,1),(3,1),(3,1)])}") # 7
print(f"1个7x7: RF = {compute_receptive_field([(7,1)])}") # 7
# 参数量:3 × (3×3×C²) = 27C² vs 7×7×C² = 49C²,3个3x3更高效!
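上面注释中的参数量对比也可以直接算出来。假设输入输出通道数均为 \(C = 64\)(此处的 \(C=64\) 仅为示意取值):

```python
C = 64
params_three_3x3 = 3 * (3 * 3 * C * C)  # 3 个 3x3 卷积:27C² = 110,592
params_one_7x7 = 7 * 7 * C * C          # 1 个 7x7 卷积:49C² = 200,704
print(params_three_3x3, params_one_7x7)
```

两者感受野同为 7,但 3 个 \(3 \times 3\) 卷积的参数量约为单个 \(7 \times 7\) 卷积的 55%,且中间可以插入非线性激活。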
6. 转置卷积(反卷积)¶
6.1 原理¶
转置卷积(Transposed Convolution)用于上采样,将低分辨率特征图映射回高分辨率。它不是卷积的逆运算,而是卷积运算的转置。
对于普通卷积 \(Y = X * K\)(展开为矩阵运算 \(\mathbf{y} = \mathbf{C}\mathbf{x}\)),转置卷积是 \(\mathbf{x}' = \mathbf{C}^T\mathbf{y}\)。
输出尺寸:

\[H_{out} = (H_{in} - 1) \times s - 2p + k + \text{output\_padding}\]
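以下用纯 Python 代入该公式核对一组典型配置(\(k=4, s=2, p=1\),输入 \(8 \times 8\),与本节示例相同):

```python
H_in, k, s, p, output_padding = 8, 4, 2, 1, 0
H_out = (H_in - 1) * s - 2 * p + k + output_padding
print(H_out)  # 16,尺寸正好翻倍
```

这组配置(kernel_size 能被 stride 整除)也正是缓解棋盘格伪影的常用选择。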
# ===== 转置卷积 =====
# 上采样:从 (1, 64, 8, 8) → (1, 32, 16, 16)
trans_conv = nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1)
x = torch.randn(1, 64, 8, 8)
print(f"转置卷积输出: {trans_conv(x).shape}") # (1, 32, 16, 16)
# 棋盘格伪影(Checkerboard Artifacts)的解决方案
# 方案1:使用 kernel_size 能被 stride 整除的配置
trans_conv_good = nn.ConvTranspose2d(64, 32, kernel_size=4, stride=2, padding=1)
# 方案2:先上采样再卷积(推荐)
upsample_conv = nn.Sequential(
    nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True),
    nn.Conv2d(64, 32, kernel_size=3, padding=1)
)
# 方案3:PixelShuffle(亚像素卷积)
pixel_shuffle = nn.Sequential(
    nn.Conv2d(64, 128, kernel_size=3, padding=1),  # 通道数需为 输出通道×r²
    nn.PixelShuffle(upscale_factor=2)              # (128, H, W) → (32, 2H, 2W)
)
x = torch.randn(1, 64, 8, 8)
print(f"PixelShuffle 输出: {pixel_shuffle(x).shape}") # (1, 32, 16, 16)
7. 深度可分离卷积¶
7.1 原理¶
深度可分离卷积(Depthwise Separable Convolution)将标准卷积分解为两步:
- 深度卷积(Depthwise Convolution): 每个通道独立地进行空间卷积
- 逐点卷积(Pointwise Convolution): 使用 \(1 \times 1\) 卷积混合通道信息
7.2 参数量对比¶
标准卷积参数量: \(C_{out} \times C_{in} \times k \times k\)
深度可分离卷积参数量: \(C_{in} \times k \times k + C_{out} \times C_{in}\)
压缩比: \(\frac{1}{C_{out}} + \frac{1}{k^2}\)
当 \(C_{out}=256\), \(k=3\) 时,参数量仅为标准卷积的 \(\frac{1}{256} + \frac{1}{9} \approx 11.5\%\)
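该压缩比可以直接代入数字验证(\(C_{in}=64\) 为示意取值,不影响比值):

```python
C_in, C_out, k = 64, 256, 3
standard = C_out * C_in * k * k       # 标准卷积:147,456
dw_sep = C_in * k * k + C_out * C_in  # 深度卷积 + 逐点卷积:16,960
ratio = dw_sep / standard
print(f"{ratio:.1%}")  # 与 1/C_out + 1/k² 一致
```

注意压缩比与 \(C_{in}\) 无关:分子分母同除以 \(C_{in}\) 即得 \(\frac{1}{C_{out}} + \frac{1}{k^2}\)。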
# ===== 手动实现深度可分离卷积 =====
class DepthwiseSeparableConv(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size=3, stride=1, padding=1):  # __init__构造方法,创建对象时自动调用
        super().__init__()  # super()调用父类方法
        # 深度卷积:groups=in_channels,每个通道独立卷积
        self.depthwise = nn.Conv2d(
            in_channels, in_channels, kernel_size,
            stride=stride, padding=padding, groups=in_channels
        )
        # 逐点卷积:1x1 卷积混合通道
        self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1)
        self.bn1 = nn.BatchNorm2d(in_channels)
        self.bn2 = nn.BatchNorm2d(out_channels)

    def forward(self, x):
        x = F.relu(self.bn1(self.depthwise(x)))
        x = F.relu(self.bn2(self.pointwise(x)))
        return x
# 参数量对比
standard_conv = nn.Conv2d(64, 128, 3, padding=1)
dw_sep_conv = DepthwiseSeparableConv(64, 128, 3, padding=1)
standard_params = sum(p.numel() for p in standard_conv.parameters())
dw_sep_params = sum(p.numel() for p in dw_sep_conv.parameters())
print(f"标准卷积参数量: {standard_params:,}")
print(f"深度可分离卷积参数量: {dw_sep_params:,}")
print(f"压缩比: {dw_sep_params/standard_params:.2%}")
8. 空洞卷积¶
8.1 原理¶
空洞卷积(Dilated/Atrous Convolution)在卷积核元素之间插入"空洞"(零),以增大感受野而不增加参数量。
膨胀率(dilation rate)\(d\) 控制空洞大小。有效卷积核大小为:

\[k_{eff} = k + (k - 1)(d - 1)\]

当 \(d=1\) 时退化为普通卷积。
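有效核大小的公式可以用几行纯 Python 验证(示意):

```python
def effective_kernel(k, d):
    """空洞卷积的有效核大小:k + (k-1)(d-1)"""
    return k + (k - 1) * (d - 1)

print([effective_kernel(3, d) for d in (1, 2, 4)])  # [3, 5, 9]
```

对 \(3 \times 3\) 核,\(d=2\) 时等效 \(5 \times 5\),\(d=4\) 时等效 \(9 \times 9\),参数量始终只有 9 个权重。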
# ===== 空洞卷积 =====
# 普通卷积 vs 空洞卷积
conv_normal = nn.Conv2d(64, 64, kernel_size=3, padding=1, dilation=1)
conv_dilated2 = nn.Conv2d(64, 64, kernel_size=3, padding=2, dilation=2)
conv_dilated4 = nn.Conv2d(64, 64, kernel_size=3, padding=4, dilation=4)
x = torch.randn(1, 64, 32, 32)
print(f"dilation=1: {conv_normal(x).shape}") # 感受野 3x3
print(f"dilation=2: {conv_dilated2(x).shape}") # 感受野 5x5 (等效)
print(f"dilation=4: {conv_dilated4(x).shape}") # 感受野 9x9 (等效)
# 多尺度空洞卷积(Atrous Spatial Pyramid Pooling - ASPP,用于语义分割)
class ASPP(nn.Module):
    def __init__(self, in_channels, out_channels, rates=[6, 12, 18]):
        super().__init__()
        self.conv1x1 = nn.Sequential(
            nn.Conv2d(in_channels, out_channels, 1),
            nn.BatchNorm2d(out_channels),
            nn.ReLU()
        )
        self.dilated_convs = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_channels, out_channels, 3, padding=r, dilation=r),
                nn.BatchNorm2d(out_channels),
                nn.ReLU()
            ) for r in rates
        ])
        self.global_pool = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_channels, out_channels, 1),
            nn.BatchNorm2d(out_channels),
            nn.ReLU()
        )
        self.project = nn.Sequential(
            nn.Conv2d(out_channels * (2 + len(rates)), out_channels, 1),
            nn.BatchNorm2d(out_channels),
            nn.ReLU()
        )

    def forward(self, x):
        size = x.shape[2:]  # 切片操作取子序列
        features = [self.conv1x1(x)]
        for conv in self.dilated_convs:
            features.append(conv(x))
        global_feat = self.global_pool(x)
        global_feat = F.interpolate(global_feat, size=size, mode='bilinear', align_corners=True)
        features.append(global_feat)
        return self.project(torch.cat(features, dim=1))
9. 完整CNN结构示例¶
import torch
import torch.nn as nn
import torch.nn.functional as F
class CompleteCNN(nn.Module):
    """包含所有核心组件的CNN示例"""
    def __init__(self, num_classes=10):
        super().__init__()
        # 特征提取部分
        self.features = nn.Sequential(
            # Block 1: 标准卷积
            nn.Conv2d(3, 32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(inplace=True),
            nn.Conv2d(32, 32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),
            # Block 2: 深度可分离卷积
            DepthwiseSeparableConv(32, 64, kernel_size=3, padding=1),
            DepthwiseSeparableConv(64, 64, kernel_size=3, padding=1),
            nn.MaxPool2d(kernel_size=2, stride=2),
            # Block 3: 空洞卷积
            nn.Conv2d(64, 128, kernel_size=3, padding=2, dilation=2),
            nn.BatchNorm2d(128),
            nn.ReLU(inplace=True),
            nn.Conv2d(128, 128, kernel_size=3, padding=1),
            nn.BatchNorm2d(128),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),
        )
        # 全局平均池化
        self.global_avg_pool = nn.AdaptiveAvgPool2d((1, 1))
        # 分类头
        self.classifier = nn.Sequential(
            nn.Dropout(0.5),
            nn.Linear(128, num_classes)
        )
        self._initialize_weights()

    def _initialize_weights(self):
        for m in self.modules():
            if isinstance(m, nn.Conv2d):
                nn.init.kaiming_normal_(m.weight, mode='fan_out', nonlinearity='relu')
            elif isinstance(m, nn.BatchNorm2d):
                nn.init.ones_(m.weight)
                nn.init.zeros_(m.bias)

    def forward(self, x):
        x = self.features(x)         # 特征提取
        x = self.global_avg_pool(x)  # (N, 128, 1, 1)
        x = x.view(x.size(0), -1)    # (N, 128)
        x = self.classifier(x)       # (N, num_classes)
        return x
# 测试
model = CompleteCNN(num_classes=10)
x = torch.randn(4, 3, 32, 32)
output = model(x)
print(f"输出 shape: {output.shape}")
# 统计参数量
total_params = sum(p.numel() for p in model.parameters())
trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"总参数量: {total_params:,}")
print(f"可训练参数量: {trainable_params:,}")
10. 练习与自我检查¶
练习题¶
1. 手动卷积: 给定一个 \(5 \times 5\) 矩阵和一个 \(3 \times 3\) 核,手动计算卷积结果,然后用 PyTorch 验证。
2. 参数计算: 画出一个 CNN 架构(3层卷积 + 2层全连接),计算每层的参数量和输出尺寸。
3. 感受野: 设计一个网络,使感受野覆盖整个 \(224 \times 224\) 的输入图像。
4. 深度可分离卷积: 实现一个 MobileNet 风格的网络块,对比标准卷积的参数量和推理速度。
5. 转置卷积: 实现一个简单的编码器-解码器结构(用卷积下采样,转置卷积上采样),在 MNIST 上训练自编码器。
6. CIFAR-10 分类: 使用本章学到的组件构建 CNN,在 CIFAR-10 上达到 90%+ 测试准确率。
自我检查清单¶
- 理解卷积运算的数学定义(1D/2D/3D)
- 能手动计算卷积层的输出尺寸和参数量
- 理解 Max/Average/Global Average Pooling 的区别
- 能计算任意网络的感受野
- 理解转置卷积的上采样原理和棋盘格伪影
- 理解深度可分离卷积及其效率优势
- 了解空洞卷积如何扩大感受野
- 能用 PyTorch 搭建完整的 CNN 结构
下一篇: 02-经典CNN架构 — 学习影响深远的CNN架构设计