Decrypting Prompt Series 57. Agent Context Engineering - A Code Walkthrough of Multi-Agent Frameworks

剩鹄逅 · 8 hours ago

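Following up on the previous post's discussion of Context Engineering, this post focuses on context management practice in multi-agent frameworks. We take a close look at two representative frameworks: Deer-Flow, open-sourced by ByteDance and built on predefined roles and the Supervisor-Worker pattern, and **CoorAgent** from Tsinghua, which builds on it by adding the ability to construct agents dynamically. By taking apart their design ideas and implementation details, we distill the key strategies for managing context efficiently in multi-agent collaboration.

Context Engineering Tips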

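From open-source frameworks and our own day-to-day development, here are some ideas for optimizing context management across multiple agents. (Note: this is just a Practice, not a Best Practice. After all, there is always the stage where "the mountain is no longer a mountain", hahaha.)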

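Reduce context length and complexity (Reduce Length & Complexity)
- Discard Context: e.g. Deer-Flow's Coordinator. If the user's question has nothing to do with the agent's task, the conversation history is cleared after replying to the user (Message History = 0).
- Isolate Context: the core of the Supervisor-Worker pattern. Each Worker agent has its own context, receives only its own task goal as its starting instruction, and is isolated from the Supervisor and the other Workers.
- Decompose Context: the Supervisor (Planner) breaks a complex task into independent steps. Each step is less complex, so the context of the Worker responsible for it is simpler too.
Reduce context noise (Reduce Context Noise)
- Focus on Results, not process: a key strategy! It applies especially to judgment modules such as RePlan and Reflection. There is no need to pass along lengthy execution-process messages; the current result and the target state are enough for further planning or reflection. Passing the raw Messages straight through is convenient but often not the best choice.
- Filter/Select Context: analogous to Re-ranking in RAG. For example, in a Coding Agent the sub-agent returns only the key stdout rather than the full code.
- Compress Context: in Deer-Flow, for example, the Researcher summarizes the results of its search steps, and the Reporter writes the final summary from those compressed summaries rather than from the raw references. This is also common in Coding Agent designs.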
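
To make the strategies above a bit more concrete, here is a minimal sketch of four of them as operations on a message list. The `Message` shape and all function names are assumptions made for illustration, and truncation only stands in for what would normally be an LLM-written summary.

```python
from typing import Dict, List

Message = Dict[str, str]  # assumed shape: {"role": ..., "content": ...}


def discard_context(history: List[Message]) -> List[Message]:
    """Discard: drop the whole history once an off-task turn has been answered."""
    return []


def isolate_context(task_title: str, task_description: str) -> List[Message]:
    """Isolate: a worker starts from its own task only, not the shared history."""
    content = f"# Task\n\n## Title\n\n{task_title}\n\n## Description\n\n{task_description}"
    return [{"role": "user", "content": content}]


def filter_context(history: List[Message], keep_roles: List[str]) -> List[Message]:
    """Filter: keep only the messages whose role matters downstream."""
    return [m for m in history if m["role"] in keep_roles]


def compress_context(history: List[Message], max_chars: int = 200) -> List[Message]:
    """Compress: truncation is a stand-in for an LLM summary in this sketch."""
    return [{**m, "content": m["content"][:max_chars]} for m in history]
```

Each function returns a new list, so the caller decides which view of the history a given node actually receives.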

Framework 1: Deer-Flow

[Figure: Deer Flow]

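Deer Flow is a multi-agent framework from ByteDance, and its influence shows up in many later agent frameworks. Deer Flow is also built with LangGraph and contains the following core nodes: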

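Coordinator node (coordinator): interacts with the user and decides whether to hand off to the planner
Core planning module (Planner): generates the detailed task execution plan
- Pre-planning background investigation node (background_investigator): runs a preliminary web search to gather background information
- Post-planning human feedback node (human_feedback): lets the user review and modify the plan
Research team node (research_team): coordinates the researcher and the coder
- Researcher node (researcher): carries out information gathering and research tasks
- Coder node (coder): carries out data processing and code analysis tasks
Reporter node (reporter): generates the final research report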
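
As a rough picture of how these nodes connect, the sketch below writes the flow down as a plain adjacency map. The edges are read off the descriptions in this post and are an assumption for illustration, not the project's actual graph definition; `FLOW` and `reachable` are hypothetical names.

```python
from typing import Dict, List

# Possible next nodes for each node, inferred from the node descriptions.
# Illustrative only: this is not the project's graph definition.
FLOW: Dict[str, List[str]] = {
    "coordinator": ["background_investigator", "planner", "__end__"],
    "background_investigator": ["planner"],
    "planner": ["human_feedback", "reporter"],
    "human_feedback": ["planner", "research_team"],
    "research_team": ["researcher", "coder", "planner"],
    "researcher": ["research_team"],
    "coder": ["research_team"],
    "reporter": ["__end__"],
}


def reachable(start: str) -> List[str]:
    """Return every node reachable from `start`, in breadth-first order."""
    seen, queue = [start], [start]
    while queue:
        node = queue.pop(0)
        for nxt in FLOW.get(node, []):
            if nxt not in seen:
                seen.append(nxt)
                queue.append(nxt)
    return seen
```

Seen this way, the planner and the research team form the only loop, which is where most of the context pressure discussed later comes from.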

Below we go through the highlights of each part in turn.

Up-front Triage (Coordinator): the Agent's "Firewall"

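Role: intercept and directly handle requests that do not need the full pipeline (greetings such as "hello", capability questions such as "what can you do", inappropriate remarks such as "you are so annoying").
Value: keeps irrelevant context from polluting the core flow, improving efficiency and user experience. After responding, the conversation ends and the message history is cleared (the Discard Context strategy).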
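
A minimal sketch of this triage step follows. In the real framework the routing decision comes from an LLM; the keyword check, the `SMALL_TALK` tuple, and the canned reply here are hypothetical stand-ins so the control flow can be run on its own.

```python
from typing import Dict, List, Tuple

Message = Dict[str, str]  # assumed shape: {"role": ..., "content": ...}

# Hypothetical stand-in for the LLM's routing decision: a keyword check.
SMALL_TALK = ("hello", "hi", "what can you do")


def coordinator(history: List[Message], user_input: str) -> Tuple[str, List[Message], str]:
    """Return (next_node, new_history, reply)."""
    if user_input.strip().lower().rstrip("?!.") in SMALL_TALK:
        # Answer directly, end the run, and discard the history.
        return "__end__", [], "Hello! I can help you research a topic."
    # A real task: keep the message and hand off to the planner.
    return "planner", history + [{"role": "user", "content": user_input}], ""
```

The point of the sketch is the return value: an off-task turn produces an empty history, so nothing from it can leak into a later planning run.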

Core Planning Module (Plan): Structured Reasoning and Context Handling

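Design:
- An outer Plan wrapping inner Steps. Each Step specifies its type (RESEARCH or PROCESSING, one for each of the two agents) and the detailed requirements for that step.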

from enum import Enum
from typing import List, Optional

from pydantic import BaseModel, Field


class StepType(str, Enum):
    RESEARCH = "research"
    PROCESSING = "processing"


class Step(BaseModel):
    need_web_search: bool = Field(
        ..., description="Must be explicitly set for each step"
    )
    title: str
    description: str = Field(..., description="Specify exactly what data to collect")
    step_type: StepType = Field(..., description="Indicates the nature of the step")
    # Filled in after the step has been executed by a worker.
    execution_res: Optional[str] = Field(
        default=None, description="The Step execution result"
    )


class Plan(BaseModel):
    locale: str = Field(
        ..., description="e.g. 'en-US' or 'zh-CN', based on the user's language"
    )
    has_enough_context: bool
    thought: str
    title: str
    steps: List[Step] = Field(
        default_factory=list,
        description="Research & Processing steps to get more context",
    )


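Submodules: Plan contains two submodules
- Human Feedback: after planning, the user judges whether the plan is reasonable.
- Background_Investigator (thoughts): at present the user chooses whether to enable the background investigation, but running it automatically would make more sense, similar to multi-step reasoning in RAG or a Pre-search strategy. For highly time-sensitive content (later than the model's training cutoff) it can improve the quality of the plan steps, though one should watch out for the noise it introduces.
Context handling:
- Background_Investigator: searches on the user query only. The search results are turned into a Human Message and appended to the existing conversation list, with the prefix "background investigation results of user query:\n" added to distinguish them from the original user query.
- Human_Feedback: the user's feedback is appended as a HumanMessage.
- Planner: reasons over all historical Messages.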
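
The context handling above can be sketched with plain dictionaries in place of LangChain message objects. The `search` callable and both function names are assumptions for illustration; only the prefix string is taken from the description above.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]  # assumed shape: {"role": ..., "content": ...}
PREFIX = "background investigation results of user query:\n"


def add_background(messages: List[Message], search: Callable[[str], str]) -> List[Message]:
    """Search on the user query only and append the result as a human message."""
    query = next(m["content"] for m in reversed(messages) if m["role"] == "user")
    return messages + [{"role": "user", "content": PREFIX + search(query)}]


def add_feedback(messages: List[Message], feedback: str) -> List[Message]:
    """Human feedback on the plan is appended as another human message."""
    return messages + [{"role": "user", "content": feedback}]
```

Since the planner then reads the whole list, the prefix is the only thing that tells it which human message is the query and which is search output.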

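Core Execution Module (Research Team): Strict Isolation and Result Sharing

The Research Team node is what executes the Plan. It contains two agents: the researcher, which has search and crawler tools, and the coder, which has a Python code tool. Both use the ReAct agent that LangGraph provides out of the box.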

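Context handling:
- Isolate Context: the input to the Coder and the Researcher contains only the task description of the current Step (title, description). Their execution environments are independent of each other and isolated from the Planner.
- Structured input (Reduce Noise): instead of passing the raw Messages, the Step structure is converted into a clear task-instruction template, which removes referential ambiguity (the problem with passing Messages directly is that the Prompt can hardly point at which message is the task).
  from langchain_core.messages import HumanMessage

  agent_input = {
      "messages": [HumanMessage(content=f"#Task\n\n##title\n\n{step.title}\n\n##description\n\n{step.description}")]
  }
- Result sharing: the Coder/Researcher output is saved in observations and is also added to the global Message list as a HumanMessage (for the Planner's later decisions).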
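
Putting isolation and result sharing together, one pass of the research team might look like the sketch below. It uses a trimmed dataclass stand-in for the Step model shown earlier and plain callables in place of the ReAct agents; `run_next_step` and the `workers` mapping are hypothetical names, not the project's API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Step:  # trimmed stand-in for the Step model shown earlier
    title: str
    description: str
    step_type: str  # "research" or "processing"
    execution_res: Optional[str] = None


def run_next_step(steps: List[Step], workers: Dict[str, Callable[[str], str]],
                  observations: List[str]) -> bool:
    """Run the first unfinished step with the matching worker; False if none is left."""
    step = next((s for s in steps if s.execution_res is None), None)
    if step is None:
        return False
    task = f"#Task\n\n##title\n\n{step.title}\n\n##description\n\n{step.description}"
    worker = workers["researcher" if step.step_type == "research" else "coder"]
    step.execution_res = worker(task)  # the worker sees only its own task
    observations.append(step.execution_res)  # shared result for later nodes
    return True
```

Calling `run_next_step` in a loop until it returns `False` walks the plan in order, and the only thing that crosses the worker boundary in either direction is a single string.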

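One thing to note here: the updated Message list becomes the Plan's context for deciding whether to end the task or keep collecting and processing information, so for complex tasks this step puts quite a lot of pressure on the Plan. There are two reasons:

Length: if the plan contained many steps, this stage carries a lot of message context, which is very long and varied in content.
Complexity: with too much content handed to the plan module, the model's characteristic perfectionism means it can almost always find more things to improve and kicks off further information-gathering tasks.

So the Message context handed back to the Plan could be compressed at this point, returning only the task completion status and a one-sentence summary of what was collected.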
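
The compression suggested above could be sketched like this. It is an assumption about how one might do it, not something the project implements: `planner_digest` and `first_sentence` are hypothetical helpers, and taking the first sentence is a crude stand-in for an LLM-written one-line summary.

```python
from typing import List, Optional, Tuple


def first_sentence(text: str, limit: int = 120) -> str:
    """Crude one-line summary: text up to the first period, capped at `limit` chars."""
    head = " ".join(text.split()).split(". ")[0]
    return head[:limit]


def planner_digest(steps: List[Tuple[str, Optional[str]]]) -> str:
    """One line per step: status plus a short summary instead of the raw output."""
    lines = []
    for title, result in steps:
        if result is None:
            lines.append(f"- [pending] {title}")
        else:
            lines.append(f"- [done] {title}: {first_sentence(result)}")
    return "\n".join(lines)
```

The planner would then receive this digest as a single message, while the full step outputs stay available to the reporter.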
Final Report Writing (Reporter)

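Conditions for entering the Reporter: the maximum number of loops is reached, the Plan fails, or the Planner judges the information sufficient.
Report-writing details
- Citation handling: the context for the final inference is produced jointly by the earlier coder and researcher runs. Those steps append citations at the end of the text in the markdown form - [title](url), and the final inference does the same. As a result there is no inline citation numbering inside paragraphs to deal with. For very long outputs, end-of-text citations are not very reader-friendly, but this is indeed the simplest, most native solution.
- Context handling: the project takes the most basic approach of using the message list directly, which is okay when citations need no processing.
- Task handling: like the coder and researcher, the reporter has its own task context, namely the final summarization task generated by the planner. (One open question: when the maximum loop count is exceeded or planning fails, I could not find the logic that generates this task.)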
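
If one wanted to consolidate the end-of-text citations from several steps before the final write-up, a sketch could look like the following. This is an illustrative extension, not what the project does (it passes the message list through as-is); the regular expression and both function names are assumptions.

```python
import re
from typing import List, Tuple

# Matches a whole line of the form "- [title](url)".
REF_LINE = re.compile(r"^- \[(?P<title>[^\]]+)\]\((?P<url>[^)\s]+)\)\s*$", re.MULTILINE)


def collect_references(step_outputs: List[str]) -> List[Tuple[str, str]]:
    """Collect `- [title](url)` lines from every step output, de-duplicated by URL."""
    seen, refs = set(), []
    for text in step_outputs:
        for match in REF_LINE.finditer(text):
            url = match.group("url")
            if url not in seen:
                seen.add(url)
                refs.append((match.group("title"), url))
    return refs


def render_references(refs: List[Tuple[str, str]]) -> str:
    """Render the merged list in the same end-of-text markdown format."""
    return "\n".join(f"- [{title}]({url})" for title, url in refs)
```

The first title seen for a URL wins, so the merged list keeps the order in which sources were first cited.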

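Framework Summary

Deer-Flow is a typical linear, predefined-role, Supervisor-Worker architecture (a plan, fan out, consolidate pattern). Its context management centers on strict isolation (Coordinator, Researcher, Coder, Background Investigator, and Reporter are each independent) and stage-by-stage result passing (the Reporter can see the Researcher/Coder output). It makes effective use of discarding, isolation, and compression to lighten the context load.

Framework 2: CoorAgent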

[Figure: CoorAgent]

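CoorAgent is a collaborative multi-agent framework from Tsinghua. Building on Deer-Flow, its core innovation is that the research stage uses dynamically generated agents (the Agent Factory pattern) instead of a predefined Researcher/Coder. Its context management strategy is similar to Deer-Flow's, so the rest of this section focuses on the differences. Core nodes:

Coordinator: same as Deer-Flow (triage).
Planner: same as Deer-Flow (generates the plan); the key difference is that the plan includes definitions of new agents.
Publisher: assigns the next task to execute and the agent for it.
Agent Factory: creates agents dynamically according to the Planner's definitions.
Agent Proxy: loads the Agent configuration dynamically and calls LangGraph's prebuilt ReAct agent to execute the task.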

Core Planning Module (Planner): Task Planning

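Design: the Planner not only decomposes the task (Steps) but also directly defines the new agents (NewAgent) needed to execute the steps, along with their detailed configuration (name, role, capabilities, contribution). A Step contains the assigned agent_name, the task title/description, and the output requirements in note.
Thoughts: an alternative design is to push agent creation entirely down to the Agent Factory and have the Planner focus only on task decomposition (goal, output format, constraints). That would decouple the nodes better.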

interface NewAgent {
  name: string;
  role: string;
  capabilities: string;
  contribution: string;
}

interface Step {
  agent_name: string;
  title: string;
  description: string;
  note?: string;
}

interface PlanWithAgents {
  thought: string;
  title: string;
  new_agents_needed: NewAgent[];
  steps: Step[];
}


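Dispatch Node (Publisher): Task Routing

Personally I feel the Publisher node is not really necessary. This step only decides which step to run next and whether to stop; Deer-Flow and OpenManus both simply iterate over the Plan's steps, and if a stop decision is needed it can go directly into the Plan's design. Also, since the new agents are independent of one another, it should be possible to call the Agent Factory concurrently and create several agents at once.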
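
The concurrent creation suggested above could be sketched with `asyncio.gather`. The `create_agent` coroutine is a hypothetical stand-in for one Agent Factory call (the short sleep takes the place of an LLM request), and the agent specs are made-up examples.

```python
import asyncio
from typing import Dict, List


async def create_agent(spec: Dict[str, str]) -> Dict[str, str]:
    """Hypothetical Agent Factory call; the sleep stands in for an LLM request."""
    await asyncio.sleep(0.01)
    return {"agent_name": spec["name"], "prompt": f"You are {spec['role']}."}


async def create_agents(specs: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Create all agents concurrently; results come back in input order."""
    return await asyncio.gather(*(create_agent(s) for s in specs))


specs = [
    {"name": "stock_analyst", "role": "a stock analyst"},
    {"name": "chart_maker", "role": "a chart maker"},
]
agents = asyncio.run(create_agents(specs))
```

This only works because the agent definitions do not depend on one another; if one agent's design needed another's output, the calls would have to be ordered.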
Agent Factory: Dynamic Creation

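Following the Planner's requirements, a model designs the new agent directly, covering the essential elements of an agent: task description, model type (text, vision), tool selection, and Prompt instructions, as in the structure below.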

interface Tool {
  name: string;
  description: string;
}

interface AgentBuilder {
  agent_name: string;
  agent_description: string;
  thought: string;
  llm_type: string;
  selected_tools: Tool[];
  prompt: string;
}


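The Agent Factory's system prompt is relatively complex, mainly because it has to design the new agent's System Prompt. So one could borrow the Meta Prompt idea here: pull the prompt-building part out on its own and do it in two steps, which is cleaner~
A quick word on OpenAI's Meta-Prompting: OpenAI abstracted a set of rules from everyday prompt writing and summarized them into a Meta-Prompt (in the sense of Meta-Learning). Applying the Meta-Prompt to your simple original prompt refines it into a more detailed Prompt task description, as follows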

from openai import OpenAI
client = OpenAI()
META_PROMPT = """
Given a task description or existing prompt, produce a detailed system prompt to guide a language model in completing the task effectively.
# Guidelines
- Understand the Task: Grasp the main objective, goals, requirements, constraints, and expected output.
- Minimal Changes: If an existing prompt is provided, improve it only if it's simple. For complex prompts, enhance clarity and add missing elements without altering the original structure.
- Reasoning Before Conclusions**: Encourage reasoning steps before any conclusions are reached. ATTENTION! If the user provides examples where the reasoning happens afterward, REVERSE the order! NEVER START EXAMPLES WITH CONCLUSIONS!
- Reasoning Order: Call out reasoning portions of the prompt and conclusion parts (specific fields by name). For each, determine the ORDER in which this is done, and whether it needs to be reversed.
- Conclusion, classifications, or results should ALWAYS appear last.
- Examples: Include high-quality examples if helpful, using placeholders [in brackets] for complex elements.
- What kinds of examples may need to be included, how many, and whether they are complex enough to benefit from placeholders.
- Clarity and Conciseness: Use clear, specific language. Avoid unnecessary instructions or bland statements.
- Formatting: Use markdown features for readability. DO NOT USE ``` CODE BLOCKS UNLESS SPECIFICALLY REQUESTED.
- Preserve User Content: If the input task or prompt includes extensive guidelines or examples, preserve them entirely, or as closely as possible. If they are vague, consider breaking down into sub-steps. Keep any details, guidelines, examples, variables, or placeholders provided by the user.
- Constants: DO include constants in the prompt, as they are not susceptible to prompt injection. Such as guides, rubrics, and examples.
- Output Format: Explicitly the most appropriate output format, in detail. This should include length and syntax (e.g. short sentence, paragraph, JSON, etc.)
- For tasks outputting well-defined or structured data (classification, JSON, etc.) bias toward outputting a JSON.
- JSON should never be wrapped in code blocks (```) unless explicitly requested.
The final prompt you output should adhere to the following structure below. Do not include any additional commentary, only output the completed system prompt. SPECIFICALLY, do not include any additional messages at the start or end of the prompt. (e.g. no "---")
[Concise instruction describing the task - this should be the first line in the prompt, no section header]
[Additional details as needed.]
[Optional sections with headings or bullet points for detailed steps.]
# Steps [optional]
[optional: a detailed breakdown of the steps necessary to accomplish the task]
# Output Format
[Specifically call out how the output should be formatted, be it response length, structure e.g. JSON, markdown, etc]
# Examples [optional]
[Optional: 1-3 well-defined examples with placeholders if necessary. Clearly mark where examples start and end, and what the input and output are. User placeholders as necessary.]
[If the examples are shorter than what a realistic example is expected to be, make a reference with () explaining how real examples should be longer / shorter / different. AND USE PLACEHOLDERS! ]
# Notes [optional]
[optional: edge cases, details, and an area to call or repeat out specific important considerations]
""".strip()
def generate_prompt(task_or_prompt: str):
    completion = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {
                "role": "system",
                "content": META_PROMPT,
            },
            {
                "role": "user",
                "content": "Task, Goal, or Current Prompt:\n" + task_or_prompt,
            },
        ],
    )
    return completion.choices[0].message.content


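The earlier Deer-Flow blog post mentions that the project's Prompts were generated by a model using this method. In our tests, Meta-Prompt also comes with some constraints:

The task needs to be fairly common: it suits tasks whose goal, output format, and constraints a person can describe simply and clearly. Otherwise a lot of manual tuning is still needed.
It places demands on the model: models of different capability give wildly different results with meta-prompt. With qwen-plus, compared with Deepseek-v3, you will see that the generated prompt is noticeably shorter and simpler. Essentially, Meta-Prompt constrains the model by giving the task more requirements, paths, limits, and notes, so that the model performs more stably on the task, and a stronger model partitions the task space more finely.
Consistency between the generating model and the inference model: a Prompt generated with Deepseek-v3 did not perform well on qwen, whereas a Prompt generated with R1 performed well on V3. The cause may be whether the models share a lineage, or whether their capabilities are similar.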
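
Coming back to the two-step idea for the Agent Factory, a minimal sketch might look like this: step one designs the agent without a prompt, and step two hands a short brief to a prompt writer such as the `generate_prompt` function listed above. The `AgentSpec` dataclass and `build_agent` are hypothetical names; the prompt writer is passed in as a callable so the generating model can be swapped, which matters given the consistency constraint just described.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class AgentSpec:
    agent_name: str
    agent_description: str
    llm_type: str
    selected_tools: List[str] = field(default_factory=list)
    prompt: str = ""  # filled in by the second step


def build_agent(spec: AgentSpec, prompt_writer: Callable[[str], str]) -> AgentSpec:
    """Step 2: turn the short description from step 1 into a full system prompt."""
    brief = (
        f"Agent: {spec.agent_name}\n"
        f"Task: {spec.agent_description}\n"
        f"Tools: {', '.join(spec.selected_tools) or 'none'}"
    )
    spec.prompt = prompt_writer(brief)
    return spec
```

In use, `build_agent(spec, generate_prompt)` would call the Meta-Prompt path, while a lambda is enough for testing the wiring without a model.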

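Implementation: Dynamic Workflow Compilation

Because CoorAgent generates agents dynamically, its LangGraph implementation needs a little ingenuity, since graph compilation requires nodes to be defined in advance. CoorAgent chose to wrap LangGraph in a custom workflow engine (CompiledWorkflow) that manages the jumps between nodes, as follows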

from typing import Any, Callable, Dict, List

from langgraph.types import Command

# Assumed aliases for this excerpt; the project defines its own State and NodeFunc.
State = Dict[str, Any]
NodeFunc = Callable[[State], Command]


class CompiledWorkflow:
    def __init__(self, nodes: Dict[str, NodeFunc], edges: Dict[str, List[str]], start_node: str):
        self.nodes = nodes
        self.edges = edges
        self.start_node = start_node

    def invoke(self, state: State) -> State:
        current_node = self.start_node
        print(f"CompiledWorkflow current_node: {current_node}")
        # Run nodes one after another until a node routes to "__end__".
        while current_node != "__end__":
            if current_node not in self.nodes:
                raise ValueError(f"Node {current_node} not found in workflow")
            node_func = self.nodes[current_node]
            command = node_func(state)
            # Apply the node's state updates, then follow its goto.
            if hasattr(command, 'update') and command.update:
                for key, value in command.update.items():
                    print(f"update {key} to {value}")
                    state[key] = value
            current_node = command.goto
        return state


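We actually ran into a similar dynamic agent definition problem ourselves. We chose to generate the agent dynamically inside a node, which sidesteps the dynamic-node problem and also lets us make better use of other native LangGraph features such as branch and send.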
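
A rough sketch of the "generate inside a node" approach: the graph keeps one statically registered node, and that node builds whichever agent the state names at run time. Everything here (`AGENT_CONFIGS`, `make_agent`, `agent_proxy_node`, the state keys) is a hypothetical illustration in plain Python, with a lambda standing in for a real ReAct agent.

```python
from typing import Any, Callable, Dict

State = Dict[str, Any]

# Hypothetical registry: agent name -> config produced at run time.
AGENT_CONFIGS: Dict[str, Dict[str, str]] = {}


def make_agent(config: Dict[str, str]) -> Callable[[str], str]:
    """Stand-in for building a ReAct agent from a stored prompt and tool list."""
    return lambda task: f"[{config['prompt']}] handled: {task}"


def agent_proxy_node(state: State) -> State:
    """A single fixed node that builds whichever agent the state names."""
    config = AGENT_CONFIGS[state["next_agent"]]
    agent = make_agent(config)
    result = agent(state["task"])
    return {**state, "observations": state.get("observations", []) + [result]}
```

Because the node itself never changes, the graph can be compiled once, and new agents only add entries to the registry.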
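That's all for multi-agent for now; next up it's time to talk about MCP~
For a fuller collection of LLM papers, fine-tuning and pre-training data, open-source frameworks, and AIGC applications >> DecryPrompt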
