
When product managers talk about building a new generation of intelligent agents with LLM Agents, what are they actually talking about?

Published: 2024-08-18 08:13:38

Author: AI 博物院


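What is an Agent

We know that the strength of a large language model (LLM) lies in its reasoning ability: it takes input, analyzes and reasons over it, and outputs text, code, or media. On its own, however, it does not plan and deliberate the way a person does, cannot use tools to interact with the physical world, and lacks human-like memory.

For example, suppose employee A at a company wants to know how many days of annual leave remain. An LLM alone cannot answer that. But if we let the LLM recognize the user's intent (querying remaining annual leave), extract employee A's details from the question, and then call an interface exposed by the business system to fetch A's remaining leave, the result becomes very useful. This is precisely the idea of an "Agent".

An Agent is an application pattern that combines the reasoning ability of a large language model (LLM) with the ability to call external tools, which lets it complete relatively complex tasks.

An analogy helps: the Agent receives a task and uses the LLM as its "brain" or "thinking tool", relying on that brain to decide which operations the task requires. You can think of the Agent as a commander with strategic vision, who not only knows what each unit on the battlefield can do but also coordinates them effectively to accomplish more complex tasks.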
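The annual-leave scenario can be sketched as a tiny dispatcher. Everything in it is hypothetical: `get_remaining_leave` stands in for the business system's interface, the `LEAVE_BALANCES` data is made up, and `recognize_intent` replaces the LLM call with a keyword check so that the snippet runs offline.

```python
from typing import Callable, Dict, Optional, Tuple

# Made-up data standing in for the HR system (an assumption, not from the article).
LEAVE_BALANCES: Dict[str, int] = {"A": 7, "B": 3}


def get_remaining_leave(employee_id: str) -> str:
    """Hypothetical business-system API: look up remaining annual leave."""
    days = LEAVE_BALANCES[employee_id]
    return f"Employee {employee_id} has {days} days of annual leave left."


# Tool registry: intent name -> callable the agent is allowed to invoke.
TOOLS: Dict[str, Callable[[str], str]] = {"query_annual_leave": get_remaining_leave}


def recognize_intent(question: str, employee_id: str) -> Optional[Tuple[str, str]]:
    """Stand-in for the LLM: map a question to (intent, argument).

    A real agent would ask the model to do this; a keyword check keeps
    the sketch runnable without any model.
    """
    if "annual leave" in question.lower():
        return "query_annual_leave", employee_id
    return None


def answer(question: str, employee_id: str) -> str:
    decision = recognize_intent(question, employee_id)
    if decision is None:
        return "No matching tool; fall back to a plain LLM answer."
    intent, argument = decision
    return TOOLS[intent](argument)


print(answer("How many days of annual leave do I have left?", "A"))
```

The point of the registry is that the model only has to choose a name and an argument; the actual lookup stays in ordinary code that the business system controls.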

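Which business scenarios suit an Agent

If your business scenario meets the following conditions, an Agent architecture is a good fit:

The business process requires multiple steps and involves complex workflow orchestration.
The problem can be split into sub-modules, each with clearly defined inputs, outputs, and function, and for each one it is possible to tell unambiguously whether it has met its goal.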

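Agent architecture flow

An Agent builds on the reasoning ability of the large language model: it uses tools to execute (Action) the plan the model produces (Planning), then observes the result of that execution (Observation), so that the task is actually carried through.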

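ReAct

Today the mainstream framework for implementing an Agent's dialogue capability is ReAct, a prompting method proposed in 2022 by Princeton University and Google that combines thinking and acting. The figure below shows how the framework evolved: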

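Reason Only: this approach uses "Chain-of-thought" to reason over multiple steps. To encourage the model to think coherently rather than jump straight to an answer, the cue "Let's think step by step" is added alongside the question. Its obvious flaw is that Reason Only focuses solely on internal derivation and never interacts with the outside world, so it may reason from wrong or outdated information.
Act-Only: this strategy takes a single-step Action to obtain an Observation. Its weakness is that it acts too quickly, without enough thought, so the final output may not meet the user's actual need.
ReAct: ReAct combines the thinking and acting stages. The system first thinks, then acts, and the result of the action is fed back to the system as the basis for the next round of thinking; the cycle repeats until a final answer is produced.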

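In 2023 a new framework, Reflexion (self-reflection), was introduced; it adds a reflection stage. See the figure below for details:

Below is the example given in the ReAct paper:

Worked example

The ReAct prompt is: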

Answer the following questions as best you can. If it is in order, you can use some tools appropriately. You have access to the following tools:
{tools}
Use the following format:
Question: the input question you must answer
Thought: you should always think about what to do and what tools to use.
Action: the action to take, should be one of {tool_names}
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can be repeated zero or more times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question
Begin!
history: {history}
Question: {input}
Thought: {agent_scratchpad}

agent_scratchpad: the Agent's record of its thinking, that is, the intermediate action and observation steps, which is formatted as """ Observation: {observation} Thought:{action} """
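To make the template mechanics concrete, here is a minimal sketch that fills the placeholders with plain `str.format`. The function names `build_prompt` and `format_scratchpad` are illustrative, the template is abbreviated, and the scratchpad layout (the model's own text, then `Observation:`, then a fresh `Thought:` cue) is an assumption modeled on the rounds walked through later in this article.

```python
from typing import Dict, List, Tuple

# Abbreviated version of the template; the placeholder names match the article's.
TEMPLATE = (
    "You have access to the following tools:\n"
    "{tools}\n"
    "Action: the action to take, should be one of {tool_names}\n"
    "Begin!\n"
    "history: {history}\n"
    "Question: {input}\n"
    "Thought: {agent_scratchpad}"
)


def format_scratchpad(steps: List[Tuple[str, str]]) -> str:
    """Join (model_output, observation) pairs into the running scratchpad.

    Each model_output is the text the model produced after "Thought:",
    i.e. the thought followed by its Action / Action Input lines.
    """
    parts = []
    for model_output, observation in steps:
        parts.append(f"{model_output}\nObservation: {observation}\nThought: ")
    return "".join(parts)


def build_prompt(tools: Dict[str, str], question: str,
                 steps: List[Tuple[str, str]], history: str = "") -> str:
    """Render the template for one round of the conversation."""
    tool_lines = "\n".join(f"{name}: {desc}" for name, desc in tools.items())
    tool_names = "[" + ", ".join(tools) + "]"
    return TEMPLATE.format(
        tools=tool_lines,
        tool_names=tool_names,
        history=history,
        input=question,
        agent_scratchpad=format_scratchpad(steps),
    )


tools = {
    "google-search": "searches the web for public information",
    "llm-calc": "does arithmetic with an LLM and Python",
}
steps = [(
    "I should search for the current gold price.\n"
    "Action: google-search\n"
    "Action Input: current gold price",
    "Gold costs 60 USD per gram.",
)]
print(build_prompt(tools, "What is the current gold price?", steps))
```

Because the prompt always ends with an open `Thought:`, the model's next completion picks up exactly where the previous round left off.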

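Suppose we have:

User question (Question): "What is the current price of gold? If I want to mark it up by 20%, what price should I set?"
Tool library (Tools): {'google-search': a tool that searches the web for publicly available information with Google; 'llm-calc': a tool that does math with a large model and Python}

The input for the first round is then: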

Answer the following questions as best you can. If it is in order, you can use some tools appropriately. You have access to the following tools:
google-search: a tool that searches the web for publicly available information with Google Search
llm-calc: a tool that does math with a large model and Python
Use the following format:
Question: the input question you must answer
Thought: you should always think about what to do and what tools to use.
Action: the action to take, should be one of [google-search, llm-calc]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can be repeated zero or more times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question
Begin!
history: 
Question: What is the current price of gold? If I want to mark it up by 20%, what price should I set?
Thought:

Parse the output to obtain the Thought, Action, and Action Input:

Thought: I should use the search tool to look up the current market price of gold.
Action: google-search
Action Input: current gold price
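Parsing the model's reply is mostly string matching. The sketch below uses a regular expression to pull out the fields and also handles the `Final Answer:` case from the template. `parse_step` and `AgentStep` are illustrative names, and the pattern assumes the model follows the format exactly; real replies often need more forgiving handling.

```python
import re
from typing import NamedTuple, Optional


class AgentStep(NamedTuple):
    thought: str
    action: Optional[str]        # None when the model gives a final answer
    action_input: Optional[str]
    final_answer: Optional[str]


# "Action: <name>" on one line, followed by "Action Input: <text>".
ACTION_RE = re.compile(
    r"Action:\s*(?P<action>.+?)\s*\n\s*Action Input:\s*(?P<input>.*)", re.DOTALL
)


def parse_step(reply: str) -> AgentStep:
    """Split one model reply into a thought plus either an action or a final answer."""
    text = reply.strip()
    if text.startswith("Thought:"):
        text = text[len("Thought:"):].strip()

    if "Final Answer:" in text:
        thought, answer = text.split("Final Answer:", 1)
        return AgentStep(thought.strip(), None, None, answer.strip())

    match = ACTION_RE.search(text)
    if match is None:
        raise ValueError(f"Could not parse model reply: {reply!r}")
    thought = text[:match.start()].strip()
    return AgentStep(thought, match.group("action"), match.group("input").strip(), None)


reply = (
    "Thought: I should use the search tool to look up the current market price of gold.\n"
    "Action: google-search\n"
    "Action Input: current gold price"
)
step = parse_step(reply)
print(step.action, "|", step.action_input)
```

Raising on an unparseable reply is a deliberate choice in this sketch: the caller can then decide whether to retry, re-prompt, or give up, rather than silently calling the wrong tool.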

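Call the google-search tool with the input "current gold price" and get back the Observation: "According to online sources, gold costs 60 USD per gram." Then place all of the above back into the ReAct prompt template to form the input for the second round: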

Answer the following questions as best you can. If it is in order, you can use some tools appropriately. You have access to the following tools:
google-search: a tool that searches the web for publicly available information with Google Search
llm-calc: a tool that does math with a large model and Python
Use the following format:
Question: the input question you must answer
Thought: you should always think about what to do and what tools to use.
Action: the action to take, should be one of [google-search, llm-calc]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can be repeated zero or more times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question
Begin!
history: 
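Question: What is the current price of gold? If I want to mark it up by 20%, what price should I set?
Thought: I should use the search tool to look up the current market price of gold.
Action: google-search
Action Input: current gold price
Observation: According to online sources, gold costs 60 USD per gram.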
Thought:

Parse the output to obtain the Thought, Action, and Action Input:

Thought: I need to calculate the new price after a 20% markup on this price.
Action: llm-calc
Action Input: 60*1.20
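The `Action Input` here is a plain arithmetic expression. The article does not show how llm-calc evaluates it, so the following is only one possible sketch: a small evaluator built on Python's `ast` module that accepts numeric literals and basic operators and rejects everything else. `safe_calc` and `format_observation` are hypothetical names.

```python
import ast
import operator
from typing import Union

Number = Union[int, float]

# Whitelisted operators; any other syntax is rejected.
BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def safe_calc(expression: str) -> Number:
    """Evaluate +, -, *, / on numeric literals without calling eval()."""

    def walk(node: ast.AST) -> Number:
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in BINARY_OPS:
            return BINARY_OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in UNARY_OPS:
            return UNARY_OPS[type(node.op)](walk(node.operand))
        raise ValueError(f"Unsupported expression: {expression!r}")

    return walk(ast.parse(expression, mode="eval").body)


def format_observation(value: Number) -> str:
    """Render whole-number results without a trailing .0, e.g. 72.0 -> "72"."""
    rounded = round(value, 10)
    return str(int(rounded)) if float(rounded).is_integer() else str(rounded)


print(format_observation(safe_calc("60*1.20")))  # prints 72
```

Walking the syntax tree instead of calling `eval` matters because the expression comes from model output, which should be treated as untrusted input.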

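Call the llm-calc tool with the input 60*1.20 and get back the Observation: "72". Then place all of the above back into the ReAct prompt template to form the input for the third round: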

Answer the following questions as best you can. If it is in order, you can use some tools appropriately. You have access to the following tools:
google-search: a tool that searches the web for publicly available information with Google Search
llm-calc: a tool that does math with a large model and Python
Use the following format:
Question: the input question you must answer
Thought: you should always think about what to do and what tools to use.
Action: the action to take, should be one of [google-search, llm-calc]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can be repeated zero or more times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question
Begin!
history: 
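Question: What is the current price of gold? If I want to mark it up by 20%, what price should I set?
Thought: I should use the search tool to look up the current market price of gold.
Action: google-search
Action Input: current gold price
Observation: According to online sources, gold costs 60 USD per gram.
Thought: I need to calculate the new price after a 20% markup on this price.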
Action: llm-calc
Action Input: 60*1.20
Observation: 72
Thought:

Parse the output to obtain the Thought and Final Answer:

Thought: I now know the final answer.
Final Answer: To sell gold at a 20% markup over the current price, set the price at 72 USD per gram.
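Putting the three rounds together, the control flow is a short loop: build the prompt, call the model, parse the reply, run the tool, append the observation, repeat. The sketch below replays the gold-price example with a scripted stand-in for the model (`scripted_llm`) and canned tools, so it runs without any API. All function names are illustrative, and the prompt is reduced to the question plus scratchpad.

```python
from typing import Callable, Dict, List

# Canned tools standing in for google-search and llm-calc (assumptions).
TOOLS: Dict[str, Callable[[str], str]] = {
    "google-search": lambda query: "Gold costs 60 USD per gram.",
    "llm-calc": lambda expr: "72",
}

# Replies the scripted "model" gives in rounds 1, 2 and 3.
SCRIPT: List[str] = [
    "I should search for the current gold price.\n"
    "Action: google-search\nAction Input: current gold price",
    "I need to add 20% to that price.\n"
    "Action: llm-calc\nAction Input: 60*1.20",
    "I now know the final answer.\n"
    "Final Answer: Price it at 72 USD per gram.",
]


def scripted_llm(prompt: str) -> str:
    """Stand-in for a real model: pick the reply by counting observations so far."""
    return SCRIPT[prompt.count("Observation:")]


def read_field(reply: str, name: str) -> str:
    """Return the text after 'name:' on the line that starts with it."""
    for line in reply.splitlines():
        if line.startswith(name + ":"):
            return line[len(name) + 1:].strip()
    raise ValueError(f"Missing field {name!r} in reply: {reply!r}")


def react_loop(question: str, llm: Callable[[str], str], max_rounds: int = 5) -> str:
    scratchpad = ""
    for _ in range(max_rounds):
        prompt = f"Question: {question}\nThought: {scratchpad}"
        reply = llm(prompt).strip()

        if "Final Answer:" in reply:
            return reply.split("Final Answer:", 1)[1].strip()

        tool_name = read_field(reply, "Action")
        tool_input = read_field(reply, "Action Input")
        observation = TOOLS[tool_name](tool_input)

        # Append this round to the scratchpad and cue the next thought.
        scratchpad += f"{reply}\nObservation: {observation}\nThought: "
    raise RuntimeError("No final answer within max_rounds")


print(react_loop("What is the gold price plus 20%?", scripted_llm))
```

The `max_rounds` cap is a design choice of this sketch rather than something the article specifies; without some bound, a model that never emits `Final Answer:` would loop forever.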
