17 - Graph Neural Networks (GNN)¶
🌐 What Is Graph Data?¶
Basic Graph Concepts¶
Text Only
Graph G = (V, E)
- V: set of vertices/nodes
- E: set of edges
Attributes:
- Node features: X_v ∈ R^d
- Edge features: e_uv ∈ R^k
- Graph label: y_G
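As a concrete illustration, the definitions above can be written down directly in PyTorch. The 3-node graph and its feature values here are hypothetical, chosen only to make the objects tangible:

```python
import torch

# Hypothetical toy graph: V = {0, 1, 2}, undirected edges E = {(0,1), (1,2)}
adj = torch.tensor([[0., 1., 0.],
                    [1., 0., 1.],
                    [0., 1., 0.]])  # adjacency matrix; symmetric for an undirected graph

# Node features X_v ∈ R^d with d = 2 (values chosen arbitrarily)
x = torch.tensor([[1.0, 0.0],
                  [0.0, 1.0],
                  [1.0, 1.0]])

degrees = adj.sum(dim=1)  # node degrees: [1., 2., 1.]
print(degrees)
```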
Examples of Graph Data¶
Text Only
Social networks:
- Nodes: users
- Edges: friendship relations
- Features: user profiles
Molecular structures:
- Nodes: atoms
- Edges: chemical bonds
- Task: molecular property prediction
Recommender systems:
- Nodes: users, items
- Edges: interactions
- Task: link prediction
Knowledge graphs:
- Nodes: entities
- Edges: relations
- Task: knowledge reasoning
Why Do We Need GNNs?¶
Text Only
Problems with traditional ML:
❌ Cannot handle irregular data structures
❌ Cannot exploit relational information
❌ Nodes have no fixed ordering (permutation invariance)
Advantages of GNNs:
✅ Operate directly on graph-structured data
✅ Learn representations for nodes and edges
✅ Preserve permutation invariance
✅ Support inductive learning
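The permutation-invariance point can be checked directly: with a mean (or sum) aggregator, reordering a node's neighbors does not change the aggregated message. A minimal sketch with made-up neighbor features:

```python
import torch

# Hypothetical features of one node's 3 neighbors (2-dim features)
neighbors = torch.tensor([[1.0, 2.0],
                          [3.0, 4.0],
                          [5.0, 6.0]])

# Permute the neighbor ordering
perm = torch.tensor([2, 0, 1])
shuffled = neighbors[perm]

# Mean aggregation gives the same result regardless of order
agg1 = neighbors.mean(dim=0)
agg2 = shuffled.mean(dim=0)
print(torch.allclose(agg1, agg2))  # True
```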
🧠 GNN Fundamentals¶
Message Passing Framework¶
Text Only
Core idea: neighbor aggregation
For each node v:
1. Collect messages from its neighbors
2. Aggregate the messages
3. Update the node's own representation
Mathematical form:
h_v^(l+1) = UPDATE^(l)(h_v^(l), AGGREGATE^(l)({h_u^(l), ∀u ∈ N(v)}))
where:
- h_v^(l): features of node v at layer l
- N(v): neighbor set of node v
- AGGREGATE: aggregation function (mean, sum, max)
- UPDATE: update function (usually a neural network)
A Generic GNN Framework¶
Python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GNNLayer(nn.Module):  # subclass nn.Module to define a network layer
    def __init__(self, in_dim, out_dim):  # __init__ runs automatically on construction
        super().__init__()  # call the parent-class initializer
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        """
        x: node features (N, in_dim)
        adj: adjacency matrix (N, N)
        """
        # 1. Linear transformation
        x = self.linear(x)
        # 2. Aggregate neighbor information
        x = torch.matmul(adj, x)  # multiplying by the adjacency matrix = neighbor aggregation
        # 3. Nonlinearity
        x = F.relu(x)
        return x
🎯 Classic GNN Models¶
GCN (Graph Convolutional Network)¶
Core Idea¶
A localized first-order approximation of spectral graph convolution
Text Only
Convolution rule:
H^(l+1) = σ(D̃^(-1/2) Ã D̃^(-1/2) H^(l) W^(l))
where:
- Ã = A + I: adjacency matrix with self-loops added
- D̃: degree matrix of Ã
- D̃^(-1/2) Ã D̃^(-1/2): symmetric normalization
- W^(l): learnable parameters
- σ: activation function
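To see the symmetric normalization concretely, take a hypothetical 2-node graph with a single edge: Ã = A + I is the all-ones matrix, both degrees are 2, so every entry of D̃^(-1/2) Ã D̃^(-1/2) is 1/2:

```python
import torch

A = torch.tensor([[0., 1.],
                  [1., 0.]])             # one undirected edge
A_tilde = A + torch.eye(2)               # add self-loops: all-ones matrix
deg = A_tilde.sum(dim=1)                 # degrees: [2., 2.]
d_inv_sqrt = deg.pow(-0.5)               # [1/sqrt(2), 1/sqrt(2)]
A_norm = d_inv_sqrt.unsqueeze(1) * A_tilde * d_inv_sqrt.unsqueeze(0)
print(A_norm)  # every entry equals 0.5
```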
Code Implementation¶
Python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GCNLayer(nn.Module):
    def __init__(self, in_features, out_features):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)

    def forward(self, x, adj):
        """
        x: (N, in_features)
        adj: normalized adjacency matrix (N, N)
        """
        # Linear transformation
        support = self.linear(x)
        # Graph convolution: aggregate neighbors
        output = torch.matmul(adj, support)
        return output

class GCN(nn.Module):
    def __init__(self, nfeat, nhid, nclass, dropout=0.5):
        super().__init__()
        self.gc1 = GCNLayer(nfeat, nhid)
        self.gc2 = GCNLayer(nhid, nclass)
        self.dropout = dropout

    def forward(self, x, adj):
        x = F.relu(self.gc1(x, adj))
        x = F.dropout(x, self.dropout, training=self.training)
        x = self.gc2(x, adj)
        return F.log_softmax(x, dim=1)

# Normalizing the adjacency matrix
def normalize_adj(adj):
    """Symmetrically normalize the adjacency matrix."""
    adj = adj + torch.eye(adj.size(0))  # add self-loops
    deg = adj.sum(dim=1)
    deg_inv_sqrt = torch.pow(deg, -0.5)
    deg_inv_sqrt[torch.isinf(deg_inv_sqrt)] = 0
    # unsqueeze adds a dimension so the scaling broadcasts over rows and columns
    adj_norm = deg_inv_sqrt.unsqueeze(1) * adj * deg_inv_sqrt.unsqueeze(0)
    return adj_norm
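A quick smoke test of the pipeline on a hypothetical 4-node path graph. The normalization and the layer are re-created inline (same logic as above) so the snippet runs on its own; all sizes are made up:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Toy graph: 4 nodes in a path 0-1-2-3, 5 input features, 3 output classes
adj = torch.tensor([[0., 1., 0., 0.],
                    [1., 0., 1., 0.],
                    [0., 1., 0., 1.],
                    [0., 0., 1., 0.]])
x = torch.randn(4, 5)

# Symmetric normalization (same steps as normalize_adj)
adj_sl = adj + torch.eye(4)
d_inv_sqrt = adj_sl.sum(dim=1).pow(-0.5)
adj_norm = d_inv_sqrt.unsqueeze(1) * adj_sl * d_inv_sqrt.unsqueeze(0)

# One GCN layer = linear map followed by normalized aggregation
linear = nn.Linear(5, 3)
out = F.relu(adj_norm @ linear(x))
print(out.shape)  # torch.Size([4, 3])
```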
GraphSAGE¶
Core Innovation¶
Inductive learning + multiple aggregation schemes
Algorithm¶
Text Only
For each node v:
1. Sample: draw a fixed number k of neighbors
2. Aggregate: combine neighbor features
- Mean aggregator
- LSTM aggregator
- Pooling aggregator
3. Concatenate: join the aggregated result with the node's own features
4. Transform: linear map + nonlinear activation
Code Implementation¶
Python
class SAGELayer(nn.Module):
    def __init__(self, in_dim, out_dim, aggregator='mean'):
        super().__init__()
        self.aggregator = aggregator
        self.linear = nn.Linear(in_dim * 2, out_dim)

    def forward(self, x, adj, sample_size=10):
        """
        x: (N, in_dim)
        adj: adjacency matrix (N, N)
        """
        N = x.size(0)
        h_neigh = []
        for i in range(N):
            # Get the neighbors of node i
            neighbors = adj[i].nonzero(as_tuple=True)[0]
            # Sample at most sample_size of them
            if len(neighbors) > sample_size:
                sampled = neighbors[torch.randperm(len(neighbors))[:sample_size]]
            else:
                sampled = neighbors
            # Aggregate neighbor features
            if len(sampled) > 0:
                neigh_feat = x[sampled]
                if self.aggregator == 'mean':
                    agg = neigh_feat.mean(dim=0)
                elif self.aggregator == 'max':
                    agg = neigh_feat.max(dim=0)[0]
                else:
                    agg = torch.zeros_like(x[i])
            else:
                agg = torch.zeros_like(x[i])
            h_neigh.append(agg)
        h_neigh = torch.stack(h_neigh)
        # Concatenate self features with the neighbor summary
        h_concat = torch.cat([x, h_neigh], dim=1)
        # Linear transformation
        h = F.relu(self.linear(h_concat))
        return h
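The aggregate-then-concatenate step can be exercised in isolation. A minimal sketch for a single node with two sampled neighbors; all feature values and dimensions are made up:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

in_dim, out_dim = 4, 8
x_self = torch.randn(in_dim)        # the node's own features
neigh = torch.randn(2, in_dim)      # two sampled neighbor feature vectors

agg = neigh.mean(dim=0)             # mean aggregator
h_concat = torch.cat([x_self, agg]) # concatenate self + neighbor summary: (2 * in_dim,)
linear = nn.Linear(in_dim * 2, out_dim)
h = F.relu(linear(h_concat))
print(h.shape)  # torch.Size([8])
```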
GAT (Graph Attention Network)¶
Core Innovation¶
Introduces an attention mechanism that assigns different weights to different neighbors
Attention Mechanism¶
Text Only
For node i and neighbor j:
Attention coefficient:
e_ij = LeakyReLU(a^T [Wh_i || Wh_j])
Normalization:
α_ij = softmax_j(e_ij) = exp(e_ij) / Σ_k exp(e_ik)
Aggregation:
h_i' = σ(Σ_j α_ij Wh_j)
Multi-head attention:
h_i' = ||_{k=1}^K σ(Σ_j α_ij^k W^k h_j)
Code Implementation¶
Python
class GATLayer(nn.Module):
    def __init__(self, in_dim, out_dim, num_heads=8, dropout=0.6):
        super().__init__()
        self.num_heads = num_heads
        self.out_dim = out_dim
        self.head_dim = out_dim // num_heads
        # Shared linear transformation (split into heads afterwards)
        self.W = nn.Linear(in_dim, out_dim, bias=False)
        # Attention parameters, one vector per head
        self.a = nn.Parameter(torch.Tensor(num_heads, 2 * self.head_dim, 1))
        self.dropout = nn.Dropout(dropout)
        self.leakyrelu = nn.LeakyReLU(0.2)
        nn.init.xavier_uniform_(self.a)

    def forward(self, x, adj):
        """
        x: (N, in_dim)
        adj: adjacency matrix (N, N)
        """
        N = x.size(0)
        # Linear transformation
        h = self.W(x)  # (N, out_dim)
        h = h.view(N, self.num_heads, self.head_dim)  # (N, num_heads, head_dim)
        # Build all (i, j) pairs to compute attention coefficients
        h_i = h.unsqueeze(2).repeat(1, 1, N, 1)  # (N, num_heads, N, head_dim)
        h_j = h.unsqueeze(0).repeat(N, 1, 1, 1).transpose(1, 2)  # (N, num_heads, N, head_dim)
        # Concatenate
        concat = torch.cat([h_i, h_j], dim=-1)  # (N, num_heads, N, 2*head_dim)
        # Attention scores
        e = self.leakyrelu(torch.matmul(concat, self.a)).squeeze(-1)  # (N, num_heads, N)
        # Mask: keep neighbors only
        mask = adj.unsqueeze(1).expand(-1, self.num_heads, -1)
        e = e.masked_fill(mask == 0, float('-inf'))
        # Softmax over neighbors
        alpha = F.softmax(e, dim=2)  # (N, num_heads, N)
        alpha = self.dropout(alpha)
        # Aggregate per head: h'_ik = Σ_j α_ikj h_jk
        h_prime = torch.einsum('ikj,jkd->ikd', alpha, h)  # (N, num_heads, head_dim)
        h_prime = h_prime.reshape(N, -1)  # (N, out_dim)
        return F.elu(h_prime)
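The masking step is worth checking in isolation: filling non-edges with -inf before the softmax guarantees that non-neighbors receive exactly zero attention weight. A standalone single-head check with made-up scores:

```python
import torch
import torch.nn.functional as F

# Raw attention scores from one node to all 4 nodes (hypothetical values)
e = torch.tensor([0.5, 1.0, -0.3, 2.0])
adj_row = torch.tensor([1., 1., 0., 1.])  # node 2 is not a neighbor

e_masked = e.masked_fill(adj_row == 0, float('-inf'))
alpha = F.softmax(e_masked, dim=0)
print(alpha[2].item())   # 0.0 — the non-neighbor gets zero weight
print(alpha.sum().item())  # sums to ~1.0 over the remaining neighbors
```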
📊 GNN Task Types¶
Node Classification¶
Python
# Node classification example
class NodeClassifier(nn.Module):
    def __init__(self, nfeat, nhid, nclass):
        super().__init__()
        self.gcn1 = GCNLayer(nfeat, nhid)
        self.gcn2 = GCNLayer(nhid, nclass)

    def forward(self, x, adj):
        h = F.relu(self.gcn1(x, adj))
        h = self.gcn2(h, adj)
        return F.log_softmax(h, dim=1)

# Training (Cora-style dimensions: 1433 features, 7 classes;
# features, adj_norm, labels and train_mask are assumed to be loaded)
model = NodeClassifier(nfeat=1433, nhid=16, nclass=7)
optimizer = torch.optim.Adam(model.parameters())

for epoch in range(200):
    model.train()            # switch to training mode
    optimizer.zero_grad()    # clear gradients so they do not accumulate
    output = model(features, adj_norm)
    loss = F.nll_loss(output[train_mask], labels[train_mask])
    loss.backward()          # backpropagate to compute gradients
    optimizer.step()         # update the parameters from the gradients
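After training, accuracy on a held-out mask is typically computed in eval mode with gradients disabled. A sketch of the pattern; the linear model and random tensors are stand-ins so the snippet runs on its own:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in pieces (a real run would use the trained GNN and dataset)
model = nn.Linear(16, 7)
features = torch.randn(10, 16)
labels = torch.randint(0, 7, (10,))
test_mask = torch.zeros(10, dtype=torch.bool)
test_mask[7:] = True

model.eval()                 # switch off dropout etc.
with torch.no_grad():        # no gradient tracking during evaluation
    output = model(features)
    pred = output[test_mask].argmax(dim=1)
    acc = (pred == labels[test_mask]).float().mean().item()
print(f"test accuracy: {acc:.4f}")
```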
Graph Classification¶
Python
class GraphClassifier(nn.Module):
    def __init__(self, nfeat, nhid, nclass):
        super().__init__()
        self.conv1 = GCNLayer(nfeat, nhid)
        self.conv2 = GCNLayer(nhid, nhid)
        self.fc = nn.Linear(nhid, nclass)

    def forward(self, x, adj, batch):
        """
        batch: indicates which graph each node belongs to
        """
        # Graph convolutions
        h = F.relu(self.conv1(x, adj))
        h = self.conv2(h, adj)
        # Global mean pooling
        h = global_mean_pool(h, batch)
        # Classification
        out = self.fc(h)
        return F.log_softmax(out, dim=1)

def global_mean_pool(x, batch):
    """Global mean pooling.
    x: node features (N, d)
    batch: batch index (N,), indicating which graph each node belongs to
    """
    # Option 1: use torch_scatter (shipped with PyTorch Geometric)
    # from torch_scatter import scatter_mean
    # return scatter_mean(x, batch, dim=0)
    # Option 2: plain PyTorch implementation
    num_graphs = batch.max().item() + 1
    out = torch.zeros(num_graphs, x.size(1), device=x.device)
    for i in range(num_graphs):
        mask = (batch == i)
        out[i] = x[mask].mean(dim=0)
    return out
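The semantics of the `batch` vector can be checked on a tiny example: nodes 0-1 belong to graph 0 and node 2 to graph 1, so pooling averages the first two rows. The pooling logic is re-stated inline so the snippet runs on its own:

```python
import torch

x = torch.tensor([[1., 2.],
                  [3., 4.],
                  [5., 6.]])
batch = torch.tensor([0, 0, 1])  # node-to-graph assignment

# Same logic as the plain-PyTorch global_mean_pool
num_graphs = batch.max().item() + 1
out = torch.zeros(num_graphs, x.size(1))
for i in range(num_graphs):
    out[i] = x[batch == i].mean(dim=0)

print(out)  # [[2., 3.], [5., 6.]]
```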
Link Prediction¶
Python
class LinkPredictor(nn.Module):
    def __init__(self, nfeat, nhid):
        super().__init__()
        self.encoder = GCN(nfeat, nhid, nhid)
        self.decoder = nn.Linear(nhid * 2, 1)

    def forward(self, x, adj, edge_index):
        # Encode the nodes
        z = self.encoder(x, adj)
        # Decode the edges
        src, dst = edge_index
        z_src = z[src]
        z_dst = z[dst]
        # Concatenate endpoint embeddings and predict
        z_edge = torch.cat([z_src, z_dst], dim=1)
        pred = torch.sigmoid(self.decoder(z_edge))
        return pred
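Training such a predictor usually pairs the observed edges with randomly sampled non-edges (negative sampling) under a binary cross-entropy loss. A hedged sketch of just the loss computation, with random stand-in scores in place of real model outputs:

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Stand-in sigmoid scores for 4 positive and 4 sampled negative edges
pos_pred = torch.sigmoid(torch.randn(4, 1))
neg_pred = torch.sigmoid(torch.randn(4, 1))

pred = torch.cat([pos_pred, neg_pred], dim=0)
target = torch.cat([torch.ones(4, 1), torch.zeros(4, 1)], dim=0)

# Observed edges get label 1, sampled non-edges label 0
loss = F.binary_cross_entropy(pred, target)
print(loss.item())
```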
🛠️ Practical GNN Tips¶
Using PyTorch Geometric¶
Python
import torch
import torch.nn.functional as F
from torch_geometric.nn import GCNConv, GATConv, global_mean_pool
from torch_geometric.data import Data
from torch_geometric.loader import DataLoader  # in PyG 2.0+, DataLoader lives in the loader module

# Define a toy graph
edge_index = torch.tensor([[0, 1, 1, 2],
                           [1, 0, 2, 1]], dtype=torch.long)
x = torch.tensor([[-1], [0], [1]], dtype=torch.float)  # one scalar feature per node
data = Data(x=x, edge_index=edge_index)

# Define a model (`dataset` is assumed to be a loaded PyG dataset, e.g. Planetoid)
class GNN(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = GCNConv(dataset.num_features, 16)
        self.conv2 = GCNConv(16, dataset.num_classes)

    def forward(self, data):
        x, edge_index = data.x, data.edge_index
        x = self.conv1(x, edge_index)
        x = F.relu(x)
        x = F.dropout(x, training=self.training)
        x = self.conv2(x, edge_index)
        return F.log_softmax(x, dim=1)
Handling Large Graphs¶
Text Only
Challenges with large graphs:
- Memory: the full adjacency matrix does not fit
- Compute: a single full-graph forward pass is infeasible
Solutions:
1. Neighbor sampling
2. Graph sampling (GraphSAGE-style)
3. Subgraph sampling (Cluster-GCN)
Python
from torch_geometric.loader import NeighborLoader

# Neighbor sampling
loader = NeighborLoader(
    data,
    num_neighbors=[10, 10],  # sample 10 neighbors at each of the two layers
    batch_size=64,
    input_nodes=train_mask,
)

for batch in loader:
    optimizer.zero_grad()
    out = model(batch.x, batch.edge_index)
    loss = criterion(out, batch.y)
    loss.backward()
    optimizer.step()
📈 GNN Application Examples¶
1. Molecular Property Prediction¶
Python
from torch_geometric.datasets import MoleculeNet
from torch_geometric.nn import GCNConv, global_mean_pool

# Load a molecular dataset
dataset = MoleculeNet(root='./data', name='ESOL')

class MolecularGNN(torch.nn.Module):
    def __init__(self, hidden_dim=64):
        super().__init__()
        self.conv1 = GCNConv(dataset.num_features, hidden_dim)
        self.conv2 = GCNConv(hidden_dim, hidden_dim)
        self.fc = torch.nn.Linear(hidden_dim, 1)

    def forward(self, data):
        x, edge_index, batch = data.x, data.edge_index, data.batch
        x = F.relu(self.conv1(x, edge_index))
        x = self.conv2(x, edge_index)
        # Global pooling
        x = global_mean_pool(x, batch)
        return self.fc(x).squeeze()
2. Recommender Systems¶
Python
class GNNRecommender(torch.nn.Module):
    def __init__(self, num_users, num_items, emb_dim=64):
        super().__init__()
        self.num_users = num_users
        self.user_emb = torch.nn.Embedding(num_users, emb_dim)
        self.item_emb = torch.nn.Embedding(num_items, emb_dim)
        self.conv1 = GCNConv(emb_dim, emb_dim)
        self.conv2 = GCNConv(emb_dim, emb_dim)

    def forward(self, edge_index, user_ids, item_ids):
        # Build node features: user embeddings first, then item embeddings
        x = torch.cat([self.user_emb.weight, self.item_emb.weight], dim=0)
        # Graph convolutions
        x = F.relu(self.conv1(x, edge_index))
        x = self.conv2(x, edge_index)
        # Predict scores via a dot product
        user_emb = x[user_ids]
        item_emb = x[self.num_users + item_ids]
        return (user_emb * item_emb).sum(dim=1)
💡 Summary¶
Text Only
Core ideas of GNNs:
1. Message passing: aggregate neighbor information
2. Permutation invariance: node ordering does not affect the result
3. Inductive capability: generalizes to unseen nodes
Evolution of the classic models:
GCN   →   GraphSAGE   →   GAT
 ↓            ↓            ↓
spectral   inductive   attention
method     learning    mechanism
Application areas:
- Node classification: social network analysis
- Graph classification: molecular property prediction
- Link prediction: recommender systems
- Graph generation: drug discovery
Study tips:
1. First understand the message passing framework
2. Master the core ideas of GCN and GAT
3. Practice with PyTorch Geometric
4. Start with a node classification task
Next step: study 18-NLP与Transformer详解.md to master the core techniques of natural language processing!