
Building Prompt Templates in LangChain | Few-Shot Optimization for Prompt Engineering

Published: 2024-07-18 04:36:17

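The last few articles covered the basic principles of prompting and recommended prompt templates for different tasks, which are quite useful in practice. This article shows how to apply and wrap such templates in LangChain.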

Types of Prompt Templates

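Prompt templates help turn user input and parameters into instructions for a language model. They guide the model's response, helping it understand the context and generate relevant, coherent language-based output.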

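A prompt template generally takes a dictionary as input, where each key is a variable to fill in the template. Two prompt templates are introduced below, followed by the MessagesPlaceholder helper: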

PromptTemplate
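Invoking this template returns a PromptValue, which can be converted to a string or to a list of messages. This makes it easy to switch between strings and messages.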

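from langchain_core.prompts import PromptTemplate

# The template text reads: "Answer the following question: {question}"
prompt_template = PromptTemplate.from_template("回答下面问题：{question}")
# The question passed in reads: "What is today's date?"
pt = prompt_template.invoke({"question": "今天是几号？"})
print(pt.to_string())
# result
# 回答下面问题：今天是几号？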
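To make the variable substitution concrete, here is a minimal pure-Python sketch of what a string template does conceptually: find the placeholder names, check that the input dictionary supplies them, and fill them in. The class name MiniPromptTemplate and its methods are hypothetical and only mimic the idea; this is not how LangChain implements PromptTemplate.

```python
from string import Formatter


class MiniPromptTemplate:
    """Hypothetical stand-in for a string prompt template (not LangChain code)."""

    def __init__(self, template: str):
        self.template = template
        # Collect the variable names that appear inside {...} in the template.
        self.input_variables = sorted(
            {name for _, name, _, _ in Formatter().parse(template) if name}
        )

    def format(self, values: dict) -> str:
        # Fail early with a clear message if a variable is missing.
        missing = [v for v in self.input_variables if v not in values]
        if missing:
            raise KeyError(f"missing variables: {missing}")
        return self.template.format(**values)


mini = MiniPromptTemplate("Answer the following question: {question}")
print(mini.input_variables)
print(mini.format({"question": "What is today's date?"}))
```

Each key of the input dictionary maps to one placeholder, which is why a missing key is treated as an error in this sketch rather than silently left blank.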

ChatPromptTemplate

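This prompt template formats a list of messages and is itself made up of a list of templates. A common way to build and use a ChatPromptTemplate looks like this: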

from langchain_core.prompts import ChatPromptTemplate

chat_prompt_template = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant"),
    # The user template reads: "Answer the following question: {question}"
    ("user", "回答下面问题：{question}"),
])
cpt = chat_prompt_template.invoke({"question": "今天是几号？"})
print(cpt.to_messages())
# result
# [SystemMessage(content='You are a helpful assistant'), HumanMessage(content='回答下面问题：今天是几号？')]

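In the example above, the ChatPromptTemplate constructs two messages when invoked. The first is a system message with no variables to format. The second is a HumanMessage whose {question} variable is filled with the value passed in by the user. The result is shown at the end of the code above.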
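The same idea can be sketched without LangChain: a chat template is a list of (role, template) pairs, and formatting it means filling each template with the same input dictionary. The function name format_chat and the plain-dict message shape are assumptions made for illustration, not LangChain's message classes.

```python
def format_chat(message_templates, values):
    """Fill each (role, template) pair and return a list of message dicts."""
    return [
        {"role": role, "content": template.format(**values)}
        for role, template in message_templates
    ]


templates = [
    ("system", "You are a helpful assistant"),  # no variables to fill
    ("user", "Answer the following question: {question}"),
]
messages = format_chat(templates, {"question": "What is today's date?"})
for message in messages:
    print(message)
```

The system template contains no placeholders, so it passes through unchanged, while the user template receives the question.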

MessagesPlaceholder

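This prompt template is responsible for adding a list of messages at a specific position. In the ChatPromptTemplate above we saw how to format two messages, each of which is a string. But what if we want to insert messages passed in by the user at a specific position in the message list? This can be done with MessagesPlaceholder: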

from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

chat_prompt_template = ChatPromptTemplate.from_messages([
    ("system", "You are a helpful assistant"),
    MessagesPlaceholder("msgs"),
    # Equivalent shorthand: ("placeholder", "{msgs}")
])
# pt is the PromptValue produced in the PromptTemplate example above.
print(chat_prompt_template.invoke({"msgs": pt.to_messages()}))
# result
# messages=[SystemMessage(content='You are a helpful assistant'), HumanMessage(content='回答下面问题：今天是几号？')]

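The result shows that the message generated by the PromptTemplate above was passed in as well. This is very useful when passing messages around and will come up again when building a chatbot. MessagesPlaceholder also does not have to be written explicitly: the tuple ("placeholder", "{msgs}") can be used as the placeholder instead.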
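Conceptually, a placeholder is just a marked slot in the message layout that gets replaced by a whole list of messages. The sketch below shows that splice in plain Python; the PLACEHOLDER marker, the fill_placeholder function, and the sample conversation are all hypothetical and stand in for what MessagesPlaceholder does inside LangChain.

```python
PLACEHOLDER = object()  # hypothetical marker for "insert the message list here"


def fill_placeholder(layout, history):
    """Return the layout with the placeholder replaced by every message in history."""
    result = []
    for item in layout:
        if item is PLACEHOLDER:
            result.extend(history)  # splice in the whole list, keeping order
        else:
            result.append(item)
    return result


layout = [
    ("system", "You are a helpful assistant"),
    PLACEHOLDER,
    ("user", "Please summarize the conversation so far."),
]
history = [
    ("user", "What is today's date?"),
    ("assistant", "I cannot look up the current date."),
]
conversation = fill_placeholder(layout, history)
for role, content in conversation:
    print(f"{role}: {content}")
```

Because the slot can sit anywhere in the layout, earlier turns of a conversation can be placed between the system message and the newest user message.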

Inserting Few-Shot Examples into a PromptTemplate

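Adding few-shot examples to a prompt is an effective way to improve model performance. How to add them, and how to add them effectively, is the focus of the rest of this article. Let's start with the code: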

from langchain_core.prompts import PromptTemplate
example_prompt = PromptTemplate.from_template("Question: {question}\nAnswer: {answer}")

The candidate set of few-shot examples is given below:

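# Each odd-number question reads "The odd numbers in this group add up to an even number: ...",
# and each answer reads "Adding all the odd numbers (...) gives N. The answer is True/False."
# The last example asks "What is the capital of China?" with the answer "Beijing".
examples = [
    {
        "question": "这组数中的奇数加起来是偶数：4、8、9、15、12、2、1。",
        "answer": "将所有奇数相加（9、15、1）得到25。答案为False。",
    },
    {
        "question": "这组数中的奇数加起来是偶数：17、10、19、4、8、12、24。",
        "answer": "将所有奇数相加（17、19）得到36。答案为True。",
    },
    {
        "question": "这组数中的奇数加起来是偶数：16、11、14、4、8、13、24。",
        "answer": "将所有奇数相加（11、13）得到24。答案为True。",
    },
    {
        "question": "这组数中的奇数加起来是偶数：17、9、10、12、13、4、2。",
        "answer": "将所有奇数相加（17、9、13）得到39。答案为False。",
    },
    {"question": "中国首都是哪里？", "answer": "北京"},
]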

print(example_prompt.invoke(examples[0]).to_string())
# result
# Question: 这组数中的奇数加起来是偶数：4、8、9、15、12、2、1。
# Answer: 将所有奇数相加（9、15、1）得到25。答案为False。
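The reasoning written into the example answers can be checked mechanically, which is a cheap way to make sure few-shot examples do not teach the model wrong arithmetic. The helper odd_sum_is_even below is a hypothetical name introduced only for this check; the number lists are copied from the examples above, and the last one is the question posed to the model later in the article.

```python
def odd_sum_is_even(numbers):
    """Return (sum of the odd numbers, whether that sum is even)."""
    odd_total = sum(n for n in numbers if n % 2 == 1)
    return odd_total, odd_total % 2 == 0


example_lists = [
    [4, 8, 9, 15, 12, 2, 1],
    [17, 10, 19, 4, 8, 12, 24],
    [16, 11, 14, 4, 8, 13, 24],
    [17, 9, 10, 12, 13, 4, 2],
]
for numbers in example_lists:
    print(numbers, odd_sum_is_even(numbers))

# The question asked of the model later in the article.
target = [15, 32, 5, 13, 82, 7, 1]
print(target, odd_sum_is_even(target))  # (41, False)
```

The four results, (25, False), (36, True), (24, True), and (39, False), agree with the answers in the candidate set.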

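The output above shows the first example after it has been placed into the prompt. Next, FewShotPromptTemplate is used to assemble the complete set of few-shot examples into the template: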

from langchain_core.prompts import FewShotPromptTemplate

prompt = FewShotPromptTemplate(
    examples=examples,
    example_prompt=example_prompt,
    suffix="Question: {input}\nAnswer: ",
    input_variables=["input"],
)
print(prompt.invoke({"input": "这组数中的奇数加起来是偶数：15、32、5、13、82、7、1。"}).to_string())

The result is as follows:

Question: 这组数中的奇数加起来是偶数：4、8、9、15、12、2、1。
Answer: 将所有奇数相加（9、15、1）得到25。答案为False。

Question: 这组数中的奇数加起来是偶数：17、10、19、4、8、12、24。
Answer: 将所有奇数相加（17、19）得到36。答案为True。

Question: 这组数中的奇数加起来是偶数：16、11、14、4、8、13、24。
Answer: 将所有奇数相加（11、13）得到24。答案为True。

Question: 这组数中的奇数加起来是偶数：17、9、10、12、13、4、2。
Answer: 将所有奇数相加（17、9、13）得到39。答案为False。

Question: 中国首都是哪里？
Answer: 北京

Question: 这组数中的奇数加起来是偶数：15、32、5、13、82、7、1。
Answer:

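This way all of the provided examples are concatenated into the prompt in the given format. Note that the answers above all use Chain-of-Thought Prompting, which is itself a useful prompt optimization technique; much of the fine-tuning data for large models now adopts this style as well.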
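The assembly step itself is simple string work: format every example with the example template, then join an optional prefix, the formatted examples, and the suffix with a separator. The sketch below assumes a blank-line separator and uses a hypothetical function name, build_few_shot_prompt, with shortened English sample data; it illustrates the idea and is not LangChain's code.

```python
def build_few_shot_prompt(examples, example_template, suffix, prefix="", separator="\n\n"):
    """Join prefix, formatted examples, and suffix into one prompt string."""
    parts = []
    if prefix:
        parts.append(prefix)
    # Each example dict fills the same example template.
    parts.extend(example_template.format(**example) for example in examples)
    parts.append(suffix)
    return separator.join(parts)


demo_examples = [
    {"question": "Do the odd numbers in 4, 9, 1 add up to an even number?",
     "answer": "Adding the odd numbers (9, 1) gives 10. The answer is True."},
    {"question": "Do the odd numbers in 2, 3, 6 add up to an even number?",
     "answer": "Adding the odd numbers (3) gives 3. The answer is False."},
]
prompt_text = build_few_shot_prompt(
    demo_examples,
    example_template="Question: {question}\nAnswer: {answer}",
    suffix="Question: Do the odd numbers in 5, 7, 8 add up to an even number?\nAnswer: ",
)
print(prompt_text)
```

Ending the prompt with an open "Answer: " line invites the model to continue in the same format as the examples before it.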

Adding Better Few-Shot Examples

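In the example above, the question "中国首都是哪里？" ("What is the capital of China?") clearly has no business being among the examples. Choosing examples whose meaning and structure are close to the input usually gets better results for less effort. How can we do this? LangChain provides a class for exactly this purpose, SemanticSimilarityExampleSelector. It relies on an embedding model (if you do not have an API key, see the article "RAG: Using a Local Embedding Model in LangChain") and a vector database.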

from langchain_chroma import Chroma
from langchain_core.example_selectors import SemanticSimilarityExampleSelector
from langchain_openai import OpenAIEmbeddings

example_selector = SemanticSimilarityExampleSelector.from_examples(
    # Candidate example set.
    examples,
    # Embedding model, used to compute similarity.
    OpenAIEmbeddings(),
    # Vector database class, used to store the vectors and run the search.
    Chroma,
    # Number of examples to retrieve.
    k=1,
)

# Select the most similar example to the input.
question = "这组数中的奇数加起来是偶数：15、32、5、13、82、7、1。"
selected_examples = example_selector.select_examples({"question": question})
print(f"Examples most similar to the input: {question}")
for example in selected_examples:
    print("\n")
    for k, v in example.items():
        print(f"{k}: {v}")

Examples most similar to the input: 这组数中的奇数加起来是偶数：15、32、5、13、82、7、1。

answer: 将所有奇数相加（11、13）得到24。答案为True。
question: 这组数中的奇数加起来是偶数：16、11、14、4、8、13、24。
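To show what "most similar" means without an API key, the sketch below swaps the embedding model and vector database for a toy bag-of-words vector and cosine similarity. This is a deliberately crude stand-in, not what SemanticSimilarityExampleSelector does internally; the function names and the English sample data are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Toy 'embedding': lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def select_examples(examples, query, k=1):
    """Return the k examples whose question is most similar to the query."""
    query_vector = bag_of_words(query)
    ranked = sorted(
        examples,
        key=lambda ex: cosine(bag_of_words(ex["question"]), query_vector),
        reverse=True,
    )
    return ranked[:k]


toy_examples = [
    {"question": "The odd numbers in this group add up to an even number: 4, 8, 9, 15, 12, 2, 1.",
     "answer": "Adding all the odd numbers (9, 15, 1) gives 25. The answer is False."},
    {"question": "What is the capital of China?", "answer": "Beijing"},
]
query = "The odd numbers in this group add up to an even number: 15, 32, 5, 13, 82, 7, 1."
selected = select_examples(toy_examples, query, k=1)
print(selected[0]["question"])
```

A real embedding model captures meaning rather than shared words, but the selection logic is the same: score every candidate against the input and keep the top k.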

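Next, let's build the FewShotPromptTemplate. The prefix can hold the task description, and the suffix can hold the lead-in for the question to be answered (this was covered in the previous article, "A Brief Introduction to LLM Prompt Design (2): Effective Advice", and is quite important).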

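prompt = FewShotPromptTemplate(
    example_selector=example_selector,
    example_prompt=example_prompt,
    # The suffix reads: "Question to judge:" followed by the new question.
    suffix="\n待判断问题：\nQuestion: {input}\nAnswer: ",
    # The prefix reads: "Judge whether the following statement is right or wrong,
    # and answer in the format of the example below:" then "Example:"
    prefix="判断下面句子的对错，按下面例子的格式进行输出：\n例子：",
    input_variables=["input"],
)
print(prompt.invoke({"input": "这组数中的奇数加起来是偶数：15、32、5、13、82、7、1。"}).to_string())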

判断下面句子的对错，按下面例子的格式进行输出：
例子：

Question: 这组数中的奇数加起来是偶数：16、11、14、4、8、13、24。
Answer: 将所有奇数相加（11、13）得到24。答案为True。


待判断问题：
Question: 这组数中的奇数加起来是偶数：15、32、5、13、82、7、1。
Answer:

This way we retrieve a similar example to serve as context, which helps steer the model's reasoning and also constrains the format of its output.