# 05 - Model Configuration and Optimization

⚠️ Timeliness note: this chapter mentions cutting-edge models, prices, and leaderboards that can change quickly between releases; always defer to the original papers, official release pages, and API documentation.

Model selection, parameter configuration, and performance optimization
## 📖 Chapter Overview

This chapter takes a deep dive into configuring and optimizing models in Dify: model selection strategy, parameter tuning, performance optimization, cost control, and practical application scenarios. Detailed code examples and hands-on guidance will help you master the core skills of model configuration and optimization.
## 🎯 Learning Objectives

After completing this chapter you will be able to:

- Understand the characteristics and ideal use cases of different models
- Tune model parameters methodically
- Understand the principles and techniques of prompt engineering
- Optimize model performance and reduce cost
- Monitor and evaluate models
- Choose the best configuration for a given requirement
## 1. Model Selection

### 1.1 Model Types in Detail

#### 1.1.1 OpenAI Models

**GPT-4o**
- Characteristics: multimodal, fast, and highly capable (note: OpenAI released the GPT-5 series in 2025, so GPT-4o is no longer the flagship)
- Best for: complex reasoning, code generation, multimodal tasks, specialized domains
- Cost: medium
- Latency: low

**GPT-4o-mini**
- Characteristics: lightweight and efficient, excellent price/performance, the successor to GPT-3.5-turbo
- Best for: general chat, text generation, classification, simple tasks
- Cost: very low
- Latency: very low
Code example – OpenAI model configuration:

```python
import requests
from typing import Dict, Optional


class OpenAIModelConfig:
    """OpenAI model configuration."""

    def __init__(self, api_key: str):
        self.api_key = api_key
        self.headers = {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json"
        }
        self.base_url = "https://api.dify.ai/v1"

    def configure_gpt4o(self, app_id: str, config: Optional[Dict] = None) -> Dict:
        """Configure the GPT-4o model.

        Args:
            app_id: application ID
            config: model parameters
        """
        if config is None:
            config = {
                "temperature": 0.7,
                "top_p": 0.9,
                "max_tokens": 2048,
                "presence_penalty": 0,
                "frequency_penalty": 0
            }
        return self._configure_model(app_id, "openai", "gpt-4o", config)

    def configure_gpt4o_mini(self, app_id: str, config: Optional[Dict] = None) -> Dict:
        """Configure the GPT-4o-mini model.

        Args:
            app_id: application ID
            config: model parameters
        """
        if config is None:
            config = {
                "temperature": 0.7,
                "top_p": 0.9,
                "max_tokens": 2048,
                "presence_penalty": 0,
                "frequency_penalty": 0
            }
        return self._configure_model(app_id, "openai", "gpt-4o-mini", config)

    def _configure_model(self, app_id: str, provider: str,
                         model_name: str, config: Dict) -> Dict:
        """Build a model configuration (conceptual example).

        Note: Dify does not expose an API for changing an app's model.
        Model configuration is done in the Dify web UI:
        App Settings -> Model Configuration -> choose provider and model -> adjust parameters.
        """
        model_config = {
            "provider": provider,
            "model_name": model_name,
            "completion_params": {
                "temperature": config.get("temperature", 0.7),
                "top_p": config.get("top_p", 0.9),
                "max_tokens": config.get("max_tokens", 2048),
                "presence_penalty": config.get("presence_penalty", 0),
                "frequency_penalty": config.get("frequency_penalty", 0)
            }
        }
        print(f"Configure this model in the Dify web UI: {model_config}")
        return model_config

    def get_app_parameters(self, app_id: str) -> Dict:
        """Fetch the app's parameter info (this endpoint really exists)."""
        url = f"{self.base_url}/parameters"
        try:  # catch network errors instead of crashing
            response = requests.get(url, headers=self.headers, timeout=30)
            response.raise_for_status()
            return response.json()
        except requests.exceptions.RequestException as e:
            return {"error": str(e)}


# Usage example
if __name__ == "__main__":
    config = OpenAIModelConfig(api_key="your_api_key_here")

    # Configure GPT-4o
    result = config.configure_gpt4o(
        app_id="app_id_here",
        config={
            "temperature": 0.5,
            "top_p": 0.95,
            "max_tokens": 4096,
            "presence_penalty": 0.1,
            "frequency_penalty": 0.1
        }
    )
    print(f"GPT-4o configuration: {result}")

    # Configure GPT-4o-mini
    result = config.configure_gpt4o_mini(
        app_id="app_id_here",
        config={
            "temperature": 0.8,
            "top_p": 0.9,
            "max_tokens": 2048
        }
    )
    print(f"GPT-4o-mini configuration: {result}")
```
#### 1.1.2 Anthropic Models

**Claude Sonnet 4.6**
- Characteristics: the latest flagship-tier Claude model (as of 2026), excellent performance, very long context, strong complex reasoning
- Best for: complex reasoning, code generation, long-document processing, multimodal tasks
- Cost: medium
- Latency: low

**Claude Opus 4.6**
- Characteristics: top-capability model for the hardest tasks, strongest reasoning
- Best for: advanced research, complex analysis, creative writing
- Cost: high
- Latency: medium

**Claude Haiku 4.6**
- Characteristics: lightweight and fast, built for high-throughput scenarios, excellent price/performance
- Best for: general chat, text generation, classification
- Cost: low
- Latency: very low
Code example – Anthropic model configuration:

```python
from typing import Dict, Optional


class AnthropicModelConfig:
    """Anthropic model configuration."""

    def __init__(self, api_key: str):
        self.api_key = api_key
        self.headers = {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json"
        }
        self.base_url = "https://api.dify.ai/v1"

    def configure_claude_opus(self, app_id: str, config: Optional[Dict] = None) -> Dict:
        """Configure Claude Opus 4.6."""
        if config is None:
            config = {
                "temperature": 0.7,
                "top_p": 0.9,
                "max_tokens": 8192,
                "top_k": 0
            }
        return self._configure_model(app_id, "anthropic", "claude-opus-4-6-20260215", config)

    def configure_claude_sonnet(self, app_id: str, config: Optional[Dict] = None) -> Dict:
        """Configure Claude Sonnet 4.6."""
        if config is None:
            config = {
                "temperature": 0.7,
                "top_p": 0.9,
                "max_tokens": 8192,
                "top_k": 0
            }
        return self._configure_model(app_id, "anthropic", "claude-sonnet-4-6-20260215", config)

    def configure_claude_haiku(self, app_id: str, config: Optional[Dict] = None) -> Dict:
        """Configure Claude Haiku 4.6."""
        if config is None:
            config = {
                "temperature": 0.7,
                "top_p": 0.9,
                "max_tokens": 4096,
                "top_k": 0
            }
        return self._configure_model(app_id, "anthropic", "claude-haiku-4-6-20260215", config)

    def _configure_model(self, app_id: str, provider: str,
                         model_name: str, config: Dict) -> Dict:
        """Build an Anthropic model configuration (conceptual example).

        Note: model configuration is done in the Dify web UI.
        """
        model_config = {
            "provider": provider,
            "model_name": model_name,
            "completion_params": {
                "temperature": config.get("temperature", 0.7),
                "top_p": config.get("top_p", 0.9),
                "max_tokens": config.get("max_tokens", 4096),
                "top_k": config.get("top_k", 0)
            }
        }
        print(f"Configure this model in the Dify web UI: {model_config}")
        return model_config


# Usage example
if __name__ == "__main__":
    config = AnthropicModelConfig(api_key="your_api_key_here")

    # Configure Claude Opus 4.6
    result = config.configure_claude_opus(
        app_id="app_id_here",
        config={
            "temperature": 0.5,
            "top_p": 0.95,
            "max_tokens": 8192
        }
    )
    print(f"Claude Opus 4.6 configuration: {result}")
```
#### 1.1.3 Local Models

**Llama 3**
- Characteristics: open source and free, can be deployed privately
- Best for: sensitive data, cost control, customization needs
- Cost: low (hardware only)
- Latency: depends on hardware

**Qwen**
- Characteristics: strong Chinese-language ability, open source and free
- Best for: Chinese-language applications, on-premise deployment
- Cost: low
- Latency: depends on hardware
Code example – local model configuration:

```python
from typing import Dict, Optional


class LocalModelConfig:
    """Local model configuration."""

    def __init__(self, api_key: str):
        self.api_key = api_key
        self.headers = {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json"
        }
        self.base_url = "https://api.dify.ai/v1"

    def configure_llama3(self, app_id: str, endpoint: str,
                         config: Optional[Dict] = None) -> Dict:
        """Configure a Llama 3 model.

        Args:
            app_id: application ID
            endpoint: model service endpoint
            config: model parameters
        """
        if config is None:
            config = {
                "temperature": 0.7,
                "top_p": 0.9,
                "max_tokens": 2048
            }
        return self._configure_local_model(app_id, "llama3.1", endpoint, config)

    def configure_qwen(self, app_id: str, endpoint: str,
                       config: Optional[Dict] = None) -> Dict:
        """Configure a Qwen model."""
        if config is None:
            config = {
                "temperature": 0.7,
                "top_p": 0.9,
                "max_tokens": 2048
            }
        return self._configure_local_model(app_id, "qwen", endpoint, config)

    def _configure_local_model(self, app_id: str, model_name: str,
                               endpoint: str, config: Dict) -> Dict:
        """Build a local model configuration (conceptual example).

        Note: local models are configured in the Dify web UI:
        Settings -> Model Providers -> add the Ollama provider -> enter the service URL.
        Ollama is its own provider in Dify; it is not configured through the
        OpenAI provider.
        """
        model_config = {
            "provider": "ollama",       # Ollama is a standalone provider
            "model_name": model_name,   # e.g. llama3.1, qwen
            "base_url": endpoint,       # e.g. http://localhost:11434
            "completion_params": {
                "temperature": config.get("temperature", 0.7),
                "top_p": config.get("top_p", 0.9),
                "max_tokens": config.get("max_tokens", 2048)
            }
        }
        print(f"Configure this local model in the Dify web UI: {model_config}")
        return model_config


# Usage example
if __name__ == "__main__":
    config = LocalModelConfig(api_key="your_api_key_here")

    # Configure Llama 3 (Ollama's default service address has no /v1 suffix)
    result = config.configure_llama3(
        app_id="app_id_here",
        endpoint="http://localhost:11434",
        config={
            "temperature": 0.7,
            "top_p": 0.9,
            "max_tokens": 2048
        }
    )
    print(f"Llama 3 configuration: {result}")
```
### 1.2 Model Selection Strategy

Code example – a rule-based model selector:
```python
from typing import Dict, Optional


class ModelSelector:
    """Rule-based model selector."""

    def __init__(self):
        # Rough 1-10 capability scores; adjust to match your own benchmarks
        self.model_capabilities = {
            "gpt-4o": {
                "reasoning": 10, "coding": 10, "creativity": 9,
                "speed": 5, "cost": 1
            },
            "gpt-4o-mini": {
                "reasoning": 7, "coding": 8, "creativity": 7,
                "speed": 9, "cost": 9
            },
            "claude-opus-4-6": {
                "reasoning": 10, "coding": 10, "creativity": 10,
                "speed": 6, "cost": 2
            },
            "claude-sonnet-4-6": {
                "reasoning": 9, "coding": 9, "creativity": 9,
                "speed": 8, "cost": 6
            },
            "claude-haiku-4-6": {
                "reasoning": 7, "coding": 7, "creativity": 7,
                "speed": 10, "cost": 9
            }
        }

    def select_model(self, task_type: str, priority: str = "quality") -> str:
        """Pick a model for a given task type and priority.

        Args:
            task_type: task type (reasoning/coding/creativity/general)
            priority: priority (quality/speed/cost)
        """
        # Weight the capabilities according to the priority
        if priority == "quality":
            weights = {"reasoning": 0.4, "coding": 0.3, "creativity": 0.2,
                       "speed": 0.05, "cost": 0.05}
        elif priority == "speed":
            weights = {"reasoning": 0.2, "coding": 0.2, "creativity": 0.2,
                       "speed": 0.3, "cost": 0.1}
        else:  # cost
            weights = {"reasoning": 0.2, "coding": 0.2, "creativity": 0.2,
                       "speed": 0.1, "cost": 0.3}

        # Weighted score for each model
        scores = {}
        for model, capabilities in self.model_capabilities.items():
            scores[model] = sum(capabilities.get(cap, 0) * w
                                for cap, w in weights.items())

        # Boost models that are strong in the task's primary capability.
        # (A uniform multiplier applied to every model would not change
        # the ranking, so the boost must be model-specific.)
        if task_type in ("reasoning", "coding", "creativity"):
            for model in scores:
                scores[model] += self.model_capabilities[model][task_type] * 0.5

        # Return the highest-scoring model
        return max(scores, key=scores.get)

    def get_model_recommendation(self, task_description: str,
                                 constraints: Optional[Dict] = None) -> Dict:
        """Recommend a model for a task.

        Args:
            task_description: natural-language task description
            constraints: constraints (priority, max_cost, min_speed, etc.)
        """
        task_type = self._analyze_task(task_description)
        priority = constraints.get("priority", "quality") if constraints else "quality"
        model = self.select_model(task_type, priority)
        return {
            "recommended_model": model,
            "task_type": task_type,
            "priority": priority,
            "model_capabilities": self.model_capabilities[model],
            "reason": f"Chosen for a {task_type} task with {priority} priority"
        }

    def _analyze_task(self, task_description: str) -> str:
        """Classify the task with simple keyword matching."""
        task_lower = task_description.lower()
        if any(keyword in task_lower for keyword in
               ["推理", "reasoning", "逻辑", "logic", "分析", "analysis"]):
            return "reasoning"
        elif any(keyword in task_lower for keyword in
                 ["代码", "code", "编程", "programming", "开发", "development"]):
            return "coding"
        elif any(keyword in task_lower for keyword in
                 ["创意", "creative", "写作", "writing", "生成", "generation"]):
            return "creativity"
        else:
            return "general"


# Usage example
if __name__ == "__main__":
    selector = ModelSelector()

    # Example 1: complex reasoning task
    recommendation = selector.get_model_recommendation(
        task_description="Solve a complex mathematical reasoning problem",
        constraints={"priority": "quality"}
    )
    print(f"Recommended model: {recommendation['recommended_model']}")
    print(f"Reason: {recommendation['reason']}")

    # Example 2: code generation task
    recommendation = selector.get_model_recommendation(
        task_description="Generate Python code implementing quicksort",
        constraints={"priority": "speed"}
    )
    print(f"\nRecommended model: {recommendation['recommended_model']}")
    print(f"Reason: {recommendation['reason']}")

    # Example 3: creative writing task
    recommendation = selector.get_model_recommendation(
        task_description="Creative writing: the opening of a science-fiction novel",
        constraints={"priority": "quality"}
    )
    print(f"\nRecommended model: {recommendation['recommended_model']}")
    print(f"Reason: {recommendation['reason']}")
```
## 2. Parameter Configuration

### 2.1 Core Parameters

#### 2.1.1 Temperature

What it does: controls the randomness of the output.

- Range: 0.0 - 2.0
- Low (0.0-0.3): more deterministic, more consistent output
- Medium (0.4-0.7): balances creativity and consistency
- High (0.8-2.0): more random, more creative output

Typical uses:

- Low temperature: code generation, factual Q&A, technical documentation
- Medium temperature: general chat, content generation
- High temperature: creative writing, brainstorming
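To see why temperature behaves this way: the model divides each token's logit by the temperature before applying softmax, so low values sharpen the distribution and high values flatten it. A minimal sketch (the logit values are made up for illustration):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Rescale logits by 1/temperature, then normalize with softmax.

    Lower temperature concentrates probability on the top token
    (more deterministic); higher temperature spreads it out.
    """
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Toy logits for three candidate tokens
logits = [2.0, 1.0, 0.1]
for t in (0.2, 0.7, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

At temperature 0.2 nearly all probability lands on the top token, while at 1.5 the three candidates become much closer, which is exactly the determinism/creativity trade-off described above.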
Code example – temperature tuning:

```python
import requests
from typing import Dict, List


class TemperatureOptimizer:
    """Temperature tuning helper."""

    def __init__(self, api_key: str, app_id: str):
        self.api_key = api_key
        self.app_id = app_id
        self.headers = {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json"
        }
        self.base_url = "https://api.dify.ai/v1"

    def test_temperature(self, prompt: str, temperatures: List[float]) -> Dict:
        """Compare responses across several temperature settings.

        Args:
            prompt: test prompt
            temperatures: temperature values to try
        """
        results = {}
        for temp in temperatures:
            self._set_temperature(temp)
            response = self._send_message(prompt)
            results[temp] = {
                "response": response.get("answer", ""),
                "tokens": response.get("metadata", {})
                                  .get("usage", {})
                                  .get("total_tokens", 0)
            }
        return results

    def _set_temperature(self, temperature: float):
        """Placeholder for setting the temperature.

        Note: model parameters are configured in the Dify web UI, so this
        method only documents the intended workflow: adjust the temperature
        in the UI by hand, then call the API.
        """
        print(f"Set the temperature to {temperature} in the Dify web UI")

    def _send_message(self, prompt: str) -> Dict:
        """Send a chat message through the Dify API."""
        url = f"{self.base_url}/chat-messages"
        payload = {
            "inputs": {},
            "query": prompt,
            "response_mode": "blocking",
            "user": "temperature_test"
        }
        response = requests.post(url, headers=self.headers, json=payload, timeout=60)
        return response.json()

    def recommend_temperature(self, task_type: str) -> float:
        """Recommend a temperature for a task type."""
        recommendations = {
            "code_generation": 0.2,
            "factual_qa": 0.3,
            "technical_writing": 0.4,
            "general_chat": 0.7,
            "creative_writing": 0.9,
            "brainstorming": 1.0
        }
        return recommendations.get(task_type, 0.7)


# Usage example
if __name__ == "__main__":
    optimizer = TemperatureOptimizer(
        api_key="your_api_key_here",
        app_id="app_id_here"
    )

    # Compare several temperatures
    results = optimizer.test_temperature(
        prompt="Write a Python function that computes Fibonacci numbers",
        temperatures=[0.2, 0.5, 0.8, 1.0]
    )
    print("Temperature test results:")
    for temp, result in results.items():
        print(f"\nTemperature {temp}:")
        print(f"Response: {result['response'][:100]}...")
        print(f"Tokens: {result['tokens']}")

    # Recommended temperature for code generation
    recommended_temp = optimizer.recommend_temperature("code_generation")
    print(f"\nRecommended temperature: {recommended_temp}")
```
#### 2.1.2 Top P (Nucleus Sampling)

What it does: controls the diversity of the output.

- Range: 0.0 - 1.0
- Low (0.1-0.3): more focused, more conservative output
- High (0.7-1.0): more diverse, richer output

How it differs from temperature:

- Temperature rescales the whole probability distribution
- Top P restricts which candidate tokens may be sampled at all
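The "nucleus" idea can be sketched in a few lines: sort candidate tokens by probability and keep the smallest prefix whose cumulative probability reaches top_p; sampling then happens only inside that set. The token probabilities here are invented for illustration:

```python
def nucleus_filter(token_probs, top_p):
    """Return the smallest set of tokens whose cumulative probability
    reaches top_p -- the 'nucleus' that sampling is restricted to."""
    ranked = sorted(token_probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, prob in ranked:
        kept.append(token)
        cumulative += prob
        if cumulative >= top_p:
            break
    return kept

probs = {"the": 0.5, "a": 0.3, "this": 0.15, "that": 0.05}
print(nucleus_filter(probs, 0.5))  # low top_p: only the most likely token
print(nucleus_filter(probs, 0.9))  # high top_p: a broader candidate set
```

This is why a low Top P gives conservative output: unlikely tokens are cut out entirely rather than merely down-weighted.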
Code example – Top P tuning:

```python
import requests
from typing import Dict, List


class TopPOptimizer:
    """Top P tuning helper."""

    def __init__(self, api_key: str, app_id: str):
        self.api_key = api_key
        self.app_id = app_id
        self.headers = {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json"
        }
        self.base_url = "https://api.dify.ai/v1"

    def test_top_p(self, prompt: str, top_p_values: List[float]) -> Dict:
        """Compare responses across several Top P settings."""
        results = {}
        for top_p in top_p_values:
            self._set_top_p(top_p)
            response = self._send_message(prompt)
            results[top_p] = {
                "response": response.get("answer", ""),
                "tokens": response.get("metadata", {})
                                  .get("usage", {})
                                  .get("total_tokens", 0)
            }
        return results

    def _set_top_p(self, top_p: float):
        """Placeholder: model parameters are configured in the Dify web UI."""
        print(f"Set Top P to {top_p} in the Dify web UI")

    def _send_message(self, prompt: str) -> Dict:
        """Send a chat message through the Dify API."""
        url = f"{self.base_url}/chat-messages"
        payload = {
            "inputs": {},
            "query": prompt,
            "response_mode": "blocking",
            "user": "top_p_test"
        }
        response = requests.post(url, headers=self.headers, json=payload, timeout=60)
        return response.json()


# Usage example
if __name__ == "__main__":
    optimizer = TopPOptimizer(
        api_key="your_api_key_here",
        app_id="app_id_here"
    )

    # Compare several Top P values
    results = optimizer.test_top_p(
        prompt="Give an overview of the history of artificial intelligence",
        top_p_values=[0.1, 0.5, 0.9, 1.0]
    )
    print("Top P test results:")
    for top_p, result in results.items():
        print(f"\nTop P {top_p}:")
        print(f"Response: {result['response'][:100]}...")
```
### 2.2 Advanced Parameters

#### 2.2.1 Presence Penalty

What it does: penalizes tokens that have already appeared, regardless of how often.

- Range: -2.0 to 2.0
- Positive values: encourage the model to move on to new topics
- Negative values: encourage the model to stay on the same content

Typical uses:

- Positive: when you need variety and want to avoid repetition
- Negative: when you need consistency or want to reinforce key points

#### 2.2.2 Frequency Penalty

What it does: penalizes tokens in proportion to how often they have already appeared.

- Range: -2.0 to 2.0
- Positive values: reduce repeated wording
- Negative values: allow more repeated wording
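The difference between the two penalties is easiest to see in the logit adjustment OpenAI documents: each token's logit is reduced by `count × frequency_penalty` plus a flat `presence_penalty` if the token has appeared at all. A small sketch with made-up logits and history:

```python
from collections import Counter

def apply_penalties(logits, generated_tokens, presence_penalty, frequency_penalty):
    """Adjust logits based on tokens already generated, following the
    commonly documented formula:
    logit - count * frequency_penalty - (1 if seen else 0) * presence_penalty
    """
    counts = Counter(generated_tokens)
    adjusted = {}
    for token, logit in logits.items():
        count = counts.get(token, 0)
        adjusted[token] = (logit
                           - count * frequency_penalty              # scales with repetition
                           - (1 if count > 0 else 0) * presence_penalty)  # flat, once seen
    return adjusted

logits = {"AI": 2.0, "model": 1.8, "new": 1.5}
history = ["AI", "AI", "model"]
print(apply_penalties(logits, history, presence_penalty=0.5, frequency_penalty=0.4))
```

Here "AI" (seen twice) is penalized more than "model" (seen once), while the unseen token "new" keeps its original logit, which is why positive penalties steer the model toward fresh wording.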
Code example – penalty tuning:

```python
import requests
from typing import Dict, List


class PenaltyOptimizer:
    """Penalty-parameter tuning helper."""

    def __init__(self, api_key: str, app_id: str):
        self.api_key = api_key
        self.app_id = app_id
        self.headers = {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json"
        }
        self.base_url = "https://api.dify.ai/v1"

    def test_penalties(self, prompt: str,
                       presence_penalties: List[float],
                       frequency_penalties: List[float]) -> Dict:
        """Test combinations of penalty parameters.

        Args:
            prompt: test prompt
            presence_penalties: presence-penalty values to try
            frequency_penalties: frequency-penalty values to try
        """
        results = {}
        for presence in presence_penalties:
            for frequency in frequency_penalties:
                self._set_penalties(presence, frequency)
                response = self._send_message(prompt)
                results[f"p{presence}_f{frequency}"] = {
                    "response": response.get("answer", ""),
                    "tokens": response.get("metadata", {})
                                      .get("usage", {})
                                      .get("total_tokens", 0)
                }
        return results

    def _set_penalties(self, presence: float, frequency: float):
        """Placeholder: model parameters are configured in the Dify web UI."""
        print(f"Set presence_penalty={presence}, frequency_penalty={frequency} "
              f"in the Dify web UI")

    def _send_message(self, prompt: str) -> Dict:
        """Send a chat message through the Dify API."""
        url = f"{self.base_url}/chat-messages"
        payload = {
            "inputs": {},
            "query": prompt,
            "response_mode": "blocking",
            "user": "penalty_test"
        }
        response = requests.post(url, headers=self.headers, json=payload, timeout=60)
        return response.json()


# Usage example
if __name__ == "__main__":
    optimizer = PenaltyOptimizer(
        api_key="your_api_key_here",
        app_id="app_id_here"
    )

    # Test combinations of penalty parameters
    results = optimizer.test_penalties(
        prompt="Write an article about artificial intelligence",
        presence_penalties=[0, 0.5, 1.0],
        frequency_penalties=[0, 0.5, 1.0]
    )
    print("Penalty test results:")
    for key, result in results.items():
        print(f"\n{key}:")
        print(f"Response: {result['response'][:100]}...")
```
## 3. Performance Optimization

### 3.1 Prompt Optimization

Code example – an iterative prompt optimizer:
```python
import requests
from typing import Dict


class PromptOptimizer:
    """Iterative prompt optimizer."""

    def __init__(self, api_key: str, app_id: str):
        self.api_key = api_key
        self.app_id = app_id
        self.headers = {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json"
        }
        self.base_url = "https://api.dify.ai/v1"

    def optimize_prompt(self, original_prompt: str, iterations: int = 3) -> Dict:
        """Iteratively improve a prompt.

        Args:
            original_prompt: starting prompt
            iterations: number of improvement rounds
        """
        current_prompt = original_prompt
        best_prompt = original_prompt
        best_score = 0.0

        for _ in range(iterations):
            score = self._evaluate_prompt(current_prompt)
            if score > best_score:
                best_score = score
                best_prompt = current_prompt
            # Ask the model itself to rewrite the prompt
            current_prompt = self._improve_prompt(current_prompt, score)

        return {
            "original_prompt": original_prompt,
            "optimized_prompt": best_prompt,
            "improvement": best_score
        }

    def _evaluate_prompt(self, prompt: str) -> float:
        """Score a prompt by sending it and inspecting the response.

        This naive version scores only response length; in practice,
        replace it with a task-specific metric (accuracy, rubric score, etc.).
        """
        response = self._send_message(prompt)
        answer = response.get("answer", "")
        return min(len(answer) / 500, 1.0)

    def _improve_prompt(self, prompt: str, current_score: float) -> str:
        """Ask the model to rewrite the prompt."""
        improvement_prompt = f"""
Please improve the following prompt to make it clearer and more effective:

Current prompt: {prompt}
Current score: {current_score}

Reply with the improved prompt only.
"""
        response = self._send_message(improvement_prompt)
        return response.get("answer", prompt)

    def _send_message(self, prompt: str) -> Dict:
        """Send a chat message through the Dify API."""
        url = f"{self.base_url}/chat-messages"
        payload = {
            "inputs": {},
            "query": prompt,
            "response_mode": "blocking",
            "user": "prompt_optimizer"
        }
        response = requests.post(url, headers=self.headers, json=payload, timeout=60)
        return response.json()


# Usage example
if __name__ == "__main__":
    optimizer = PromptOptimizer(
        api_key="your_api_key_here",
        app_id="app_id_here"
    )

    result = optimizer.optimize_prompt(
        original_prompt="Write a Python function",
        iterations=3
    )
    print(f"Original prompt: {result['original_prompt']}")
    print(f"Optimized prompt: {result['optimized_prompt']}")
    print(f"Best score: {result['improvement']}")
```
### 3.2 Caching Strategy

Code example – cache manager:
```python
import hashlib
import json
import time
from typing import Dict, Optional


class CacheManager:
    """In-memory response cache with TTL."""

    def __init__(self):
        self.cache = {}
        self.cache_stats = {"hits": 0, "misses": 0, "total_requests": 0}

    def get(self, key: str) -> Optional[Dict]:
        """Look up a cached response."""
        self.cache_stats["total_requests"] += 1
        if key in self.cache:
            cache_item = self.cache[key]
            if time.time() < cache_item["expiry"]:  # still fresh?
                self.cache_stats["hits"] += 1
                return cache_item["data"]
            del self.cache[key]  # evict the expired entry
        self.cache_stats["misses"] += 1
        return None

    def set(self, key: str, data: Dict, ttl: int = 3600):
        """Store a response.

        Args:
            key: cache key
            data: cached payload
            ttl: time to live in seconds
        """
        self.cache[key] = {
            "data": data,
            "expiry": time.time() + ttl,
            "created_at": time.time()
        }

    def generate_key(self, prompt: str, model: str, params: Dict) -> str:
        """Build a deterministic cache key from prompt, model, and params.

        Args:
            prompt: prompt text
            model: model name
            params: model parameters
        """
        # sort_keys makes the JSON stable regardless of dict insertion order;
        # MD5 is fine here because the hash is not security-sensitive
        key_string = f"{prompt}|{model}|{json.dumps(params, sort_keys=True)}"
        return hashlib.md5(key_string.encode()).hexdigest()

    def get_stats(self) -> Dict:
        """Report hit rate and cache size."""
        hit_rate = 0.0
        if self.cache_stats["total_requests"] > 0:
            hit_rate = self.cache_stats["hits"] / self.cache_stats["total_requests"]
        return {
            **self.cache_stats,
            "hit_rate": hit_rate,
            "cache_size": len(self.cache)
        }

    def clear(self):
        """Drop all entries and reset statistics."""
        self.cache.clear()
        self.cache_stats = {"hits": 0, "misses": 0, "total_requests": 0}


# Usage example
if __name__ == "__main__":
    cache = CacheManager()

    # Build a cache key
    key = cache.generate_key(
        prompt="Write a Python function",
        model="gpt-4o-mini",
        params={"temperature": 0.7, "max_tokens": 2048}
    )

    # Store and retrieve a response
    cache.set(key, {"answer": "Here is a Python function..."}, ttl=3600)
    cached_data = cache.get(key)
    print(f"Cached data: {cached_data}")

    # Report statistics
    stats = cache.get_stats()
    print(f"Cache stats: {stats}")
```
## 4. Exercises

### Basic

- Configure a model
  - Choose a model
  - Set its parameters
  - Test the results

### Advanced

- Optimize model performance
  - Tune the temperature
  - Optimize the prompt
  - Add caching
## 5. Best Practices

### ✅ Recommended

- Parameter tuning
  - Try different parameter values
  - Record the results of each run
  - Keep the best configuration
- Cost control
  - Monitor token usage
  - Pick the right model for the task
  - Optimize prompts to cut token counts
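The "monitor token usage" advice can be put into practice with a small tracker that accumulates the usage block Dify's blocking responses return under `metadata.usage`. The per-1K-token prices below are placeholders, not real pricing; check your provider's pricing page:

```python
class TokenCostTracker:
    """Accumulate token usage from Dify responses and estimate spend.

    The prices passed in are per 1K tokens and must come from your
    provider's current pricing page -- the values used below are made up.
    """

    def __init__(self, prompt_price_per_1k, completion_price_per_1k):
        self.prompt_price = prompt_price_per_1k
        self.completion_price = completion_price_per_1k
        self.prompt_tokens = 0
        self.completion_tokens = 0

    def record(self, response):
        # Dify blocking responses carry usage under metadata.usage
        usage = response.get("metadata", {}).get("usage", {})
        self.prompt_tokens += usage.get("prompt_tokens", 0)
        self.completion_tokens += usage.get("completion_tokens", 0)

    def estimated_cost(self):
        return (self.prompt_tokens / 1000 * self.prompt_price
                + self.completion_tokens / 1000 * self.completion_price)


# Usage with placeholder prices and a fake response
tracker = TokenCostTracker(prompt_price_per_1k=0.00015,
                           completion_price_per_1k=0.0006)
tracker.record({"metadata": {"usage": {"prompt_tokens": 800,
                                       "completion_tokens": 200}}})
print(f"tokens={tracker.prompt_tokens + tracker.completion_tokens}, "
      f"cost~=${tracker.estimated_cost():.6f}")
```

Feeding every API response through `record()` gives a running total you can alert on before a bill surprises you.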
### ❌ To Avoid

- Over-tuning
  - Tune only as far as actual requirements demand
  - Avoid excessive optimization
  - Balance performance against cost
## 6. FAQ

### Q1: How do I choose the right model?

A: Selection criteria:

- Task requirements: small models for simple tasks, large models for complex tasks
- Performance requirements: small models when latency matters, large models when quality matters
- Cost: balance capability against spend

### Q2: How do I optimize a prompt?

A: Techniques:

- Be clear and explicit: state the requirement precisely
- Provide examples: show the expected output format
- Break tasks down: split a complex task into steps
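As a concrete illustration of those three techniques, here is a vague prompt next to an improved version; the function name and requirements are invented for the example:

```python
# A vague prompt leaves the model guessing about every requirement
vague = "Write a Python function"

# The same request after applying the three techniques:
# be explicit, provide an example, and break the task into steps
improved = """You are a senior Python developer.

Task: write a function `fib(n)` that returns the n-th Fibonacci number.

Requirements:
1. Use an iterative algorithm (O(n) time, O(1) extra space).
2. Raise ValueError for n < 0.
3. Include type hints and a docstring.

Expected output format:

def fib(n: int) -> int:
    ...
"""

print(improved)
```

The improved prompt costs more input tokens but typically saves several clarification round-trips, which is usually the better trade.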
## 7. Summary

This chapter covered the core topics of model configuration and optimization:

- Model selection: OpenAI, Anthropic, and local models
- Parameter configuration: temperature, Top P, and penalty parameters
- Performance optimization: prompt optimization and caching strategies

You should now be able to configure and optimize Dify models with confidence.

## 8. Next Steps

Continue with Chapter 06 - Deployment and Release to learn how to deploy and publish your applications.

---

Last updated: 2026-02-12 · Applies to: Dify hands-on tutorial v2026