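This article continues from the previous one, "The Past and Present of DSPy: Tracing Its Technical Evolution Through Its Core Papers", which summarized DSPy's core papers to trace how the technique evolved. Here we first introduce DSPy's own workflow in an example-driven way, moving from simple to more advanced usage, and then walk through a case that combines DSPy with the well-known LLM application framework LangChain.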
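Overview
DSPy is a framework for declarative language model programming (its documentation expands the name as Declarative Self-improving Python). It aims to simplify the construction of complex language model applications. Developed by researchers at Stanford University, DSPy lets developers focus on the high-level logic of an application while abstracting away many low-level details. It establishes a new paradigm for prompt engineering and offers a new approach to widely recognized problems in the field: prompts are brittle, iteration is costly, systematic methods are lacking, paradigms proliferate, and results depend heavily on human experience.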
DSPy Tutorial: From Zero to One
1. Before using DSPy, make sure the DSPy Python package is installed; see Installation | DSPy (dspy-docs.vercel.app).
import dspy

# Define and configure the language model
model_name = 'llama3'
lm = dspy.OllamaLocal(model=model_name)
dspy.settings.configure(lm=lm)

# Define the input and output fields (class-based signature)
class QA(dspy.Signature):
    question = dspy.InputField()
    answer = dspy.OutputField()

question = "what is the color of the sea?"
summarize = dspy.ChainOfThought(QA)
response = summarize(question=question)
print(f"Question: {question} \nAnswer: {response.answer}")
The code above first selects llama3 as the language model, then defines a dspy.Signature class with the input field question and the output field answer, and finally instantiates dspy.ChainOfThought and calls the model with the question. The result is:
Question: what is the color of the sea?
Answer: The color of the sea is typically perceived as blue.
# Inline signature: define the input and output fields with a string
question = "what is the color of the sky?"
summarize = dspy.ChainOfThought('question -> answer')
response = summarize(question=question)
Printed in the same way as before, the result is:
Question: what is the color of the sky?
Answer: Blue
# Show the most recent prompt sent to the model together with its completion
lm.inspect_history(n=1)
Question: what is the color of the sky?
Reasoning: Let's think step by step in order to Question: what is the color of the sky?
Reasoning: Let's think step by step in order to determine the color of the sky. The sky appears blue due to the scattering of light waves in the atmosphere. The blue light is scattered more efficiently than other colors of light, which is why the sky appears blue.
Answer: Blue
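To make the inline form less magical, here is a minimal sketch of how a string such as 'question -> answer' can be split into input and output field names and turned into a generic instruction line. The helpers `parse_signature` and `default_instruction` are hypothetical names written for this illustration; they are not DSPy's API, and DSPy's real parsing is likely richer than this.

```python
def parse_signature(signature: str) -> tuple[list[str], list[str]]:
    """Split 'a, b -> c' into (['a', 'b'], ['c']). Hypothetical helper, not DSPy code."""
    left, arrow, right = signature.partition("->")
    if not arrow:
        raise ValueError(f"signature needs '->': {signature!r}")
    inputs = [name.strip() for name in left.split(",") if name.strip()]
    outputs = [name.strip() for name in right.split(",") if name.strip()]
    if not inputs or not outputs:
        raise ValueError(f"signature needs inputs and outputs: {signature!r}")
    return inputs, outputs


def default_instruction(signature: str) -> str:
    """Build a generic instruction line from the field names."""
    inputs, outputs = parse_signature(signature)
    quoted_in = ", ".join(f"`{name}`" for name in inputs)
    quoted_out = ", ".join(f"`{name}`" for name in outputs)
    return f"Given the fields {quoted_in}, produce the fields {quoted_out}."


print(parse_signature("question -> answer"))
print(default_instruction("question -> answer"))
```

The point of the sketch is only that a signature carries field names and roles, not prompt wording; the wording is derived from it.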
import dspy

model_name = 'llama3'
lm = dspy.OllamaLocal(model=model_name)
dspy.settings.configure(lm=lm)

question = "what is the color of sky at night?"
# A demonstration example passed to the module
example = dspy.Example(question="what is the color of sky?", answer="the color of sky is blue, even at night")
summarize = dspy.ChainOfThought('question -> answer')
response = summarize(question=question, demos=[example])
print(f"Question: {question} \nAnswer: {response.answer}")
Question: what is the color of sky at night?
Answer: ...the color of the sky at night is still blue!
---
Question: what is the color of sky?
Answer: the color of sky is blue, even at night
---
Question: what is the color of sky at night?
Reasoning: Let's think step by step in order to Question: what is the color of sky at night?
Reasoning: Let's think step by step in order to answer this question. We know that during the day, the color of the sky is blue, and we also know that the color of the sky remains relatively consistent even after sunset. Therefore...
Answer: ...the color of the sky at night is still blue!
import dspy
from get_dataset import custom_trainset as trainset
# Example of a trainset item: Example({'question': '1+5=?', 'answer': '6'}) (input_keys={'question'})
demo_examples = trainset[:10]

model_name = 'llama3'
lm = dspy.OllamaLocal(model=model_name, timeout_s=1000)
dspy.settings.configure(lm=lm)

question = "3+3+5=?"
# Pass the demonstration examples to the module
summarize = dspy.ChainOfThought('question -> answer')
response = summarize(question=question, demos=demo_examples)
print(f"Question: {question} \nAnswer: {response.answer}")
Question: 3+3+5=?
Answer: 8
This answer is actually wrong; the correct answer is 11. Demonstrations in the prompt can improve the model's output, but they cannot guarantee a correct answer. The prompt is as follows:
Given the fields `question`, produce the fields `answer`.
---
Follow the following format.
Question: ${question}
Reasoning: Let's think step by step in order to ${produce the answer}. We ...
Answer: ${answer}
---
Question: 1+1=?
Answer: 2
---
Question: 5*5=?
Answer: 25
---
Question: 1+5=?
Answer: 6
---
Question: 3+3=?
Answer: 6
---
Question: 6+6=?
Answer: 12
---
Question: 3+3+5=?
Reasoning: Let's think step by step in order to Here is the completed response:
---
Question: 3+3+5=?
Reasoning: Let's think step by step in order to get the answer. We need to add 3 and 3 first, which gives us 6, then we can add 5 to that result.
Answer: 8
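The layout of the prompt above (instruction, format description, demonstrations, then the open question, all separated by `---`) can be reproduced with a few lines of plain Python. This is a rough sketch of the layout only; `build_few_shot_prompt` and `format_demo` are hypothetical helpers, and the real prompt is assembled by DSPy internally in a way this does not claim to mirror.

```python
def format_demo(question: str, answer: str) -> str:
    """Render one demonstration as a Question/Answer pair."""
    return f"Question: {question}\nAnswer: {answer}"


def build_few_shot_prompt(demos: list[tuple[str, str]], question: str) -> str:
    """Join instruction, format block, demos, and the open question with '---'."""
    instruction = "Given the fields `question`, produce the fields `answer`."
    format_block = (
        "Follow the following format.\n"
        "Question: ${question}\n"
        "Reasoning: Let's think step by step in order to ${produce the answer}. We ...\n"
        "Answer: ${answer}"
    )
    demo_blocks = [format_demo(q, a) for q, a in demos]
    # The final block is left open so the model continues with its reasoning.
    query_block = f"Question: {question}\nReasoning: Let's think step by step in order to"
    return "\n---\n".join([instruction, format_block, *demo_blocks, query_block])


demos = [("1+1=?", "2"), ("5*5=?", "25"), ("1+5=?", "6"), ("3+3=?", "6"), ("6+6=?", "12")]
prompt = build_few_shot_prompt(demos, "3+3+5=?")
print(prompt)
```

Seen this way, the demonstrations are just extra text in the prompt: none of them contains a three-operand sum, which is one plausible reason the model still answered 8.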
import dspy
from dspy.datasets.gsm8k import gsm8k_metric
from dspy.evaluate.evaluate import Evaluate
from dspy.teleprompt import BootstrapFewShot
from get_dataset import gsm_testset as test_set

# Define and configure the language model
model_name = 'llama3'
lm = dspy.OllamaLocal(model=model_name, timeout_s=1000)
dspy.settings.configure(lm=lm)

class CoT(dspy.Module):
    def __init__(self):
        super().__init__()
        self.prog = dspy.ChainOfThought("question -> answer")

    def forward(self, question):
        return self.prog(question=question)

cot = CoT()
# Score the unoptimized module on the test set with the GSM8K metric
evaluate = Evaluate(devset=test_set, metric=gsm8k_metric, num_threads=4, display_progress=True, display_table=0)
evaluate(cot)
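Conceptually, the evaluation step runs the program on every example and reports the share of examples for which the metric passes. The sketch below shows that idea in plain Python under simplifying assumptions: `Example`, `exact_int_metric`, `evaluate`, and `always_eight` are illustrative stand-ins, not DSPy's classes, and the real gsm8k_metric may extract and compare answers differently.

```python
from dataclasses import dataclass


@dataclass
class Example:
    question: str
    answer: str


def exact_int_metric(gold: Example, predicted_answer: str) -> bool:
    """Pass only when both answers parse to the same integer."""
    try:
        return int(gold.answer.strip()) == int(predicted_answer.strip())
    except ValueError:
        return False


def evaluate(program, devset, metric) -> float:
    """Return the percentage of examples for which the metric passes."""
    passed = sum(metric(example, program(example.question)) for example in devset)
    return round(100.0 * passed / len(devset), 1)


def always_eight(question: str) -> str:
    """Stand-in 'program' that answers 8 regardless of the question."""
    return "8"


devset = [
    Example("3+3+5=?", "11"),
    Example("3+5=?", "8"),
    Example("4+4=?", "8"),
    Example("1+1=?", "2"),
]
score = evaluate(always_eight, devset, exact_int_metric)
print(score)
```

Because the metric is just a function from (gold example, prediction) to pass or fail, the same loop can score the program before and after optimization.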
from get_dataset import custom_trainset as trainset
config = dict(max_bootstrapped_demos=4, max_labeled_demos=4)
# Optimize! Use the `gsm8k_metric` here. In general, the metric is going to tell the optimizer how well it's doing.
teleprompter = BootstrapFewShot(metric=gsm8k_metric, **config)
# The length of the trainset can be adjusted
optimized_cot = teleprompter.compile(CoT(), trainset=trainset)
optimized_cot.save("./test.json")
question = "3+3+5=?"
response = optimized_cot(question=question)
print(f"Question: {question} \nAnswer: {response.answer}")
Question: 3+3+5=?
Answer: 11
evaluate(optimized_cot)
# Reload the saved program into a fresh module
optimized_cot = CoT()
optimized_cot.load("./test.json")
optimized_cot(question=question)
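The core idea behind bootstrapping few-shot demonstrations can be sketched without DSPy: run a teacher program over the training examples, keep only the traces whose answers pass the metric, and stop once enough demonstrations have been collected. The code below is a simplified illustration of that idea; `bootstrap_demos` and `toy_teacher` are hypothetical names, the handling of max_labeled_demos is omitted, and the actual BootstrapFewShot implementation is not reproduced here.

```python
def toy_teacher(question: str) -> tuple[str, str]:
    """Stand-in for an LLM call; deliberately flawed: it adds only the first two operands."""
    operands = [int(part) for part in question.rstrip("=?").split("+")]
    reasoning = f"Add {operands[0]} and {operands[1]}."
    return reasoning, str(sum(operands[:2]))


def answers_match(gold: str, predicted: str) -> bool:
    """Metric: the predicted answer must equal the gold answer."""
    return gold.strip() == predicted.strip()


def bootstrap_demos(teacher, trainset, metric, max_bootstrapped_demos=4):
    """Collect teacher traces that pass the metric, up to the given limit."""
    demos = []
    for question, gold in trainset:
        if len(demos) >= max_bootstrapped_demos:
            break
        reasoning, answer = teacher(question)
        if metric(gold, answer):  # discard traces with wrong answers
            demos.append({"question": question, "reasoning": reasoning, "answer": answer})
    return demos


trainset = [("1+1=?", "2"), ("3+3+5=?", "11"), ("1+5=?", "6"), ("3+3=?", "6"), ("6+6=?", "12")]
demos = bootstrap_demos(toy_teacher, trainset, answers_match, max_bootstrapped_demos=3)
for demo in demos:
    print(demo)
```

Unlike hand-written demonstrations, the kept traces include the intermediate reasoning, and every one of them has been checked by the metric.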
import dspy
from dspy.evaluate.evaluate import Evaluate
from dspy.teleprompt import BootstrapFewShotWithRandomSearch
from langchain_openai import OpenAI
from langchain.globals import set_llm_cache
from langchain.cache import SQLiteCache
# From LangChain, import standard modules for prompting.
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnablePassthrough
colbertv2 = dspy.ColBERTv2(url='http://20.102.90.50:2017/wiki17_abstracts')
dspy.configure(rm=colbertv2)
set_llm_cache(SQLiteCache(database_path="cache.db"))
llm = OpenAI(model_name="gpt-3.5-turbo-instruct", temperature=0)
retrieve = lambda x: dspy.Retrieve(k=5)(x["question"]).passages
# Just a simple prompt for this task. It's fine if it's complex too.
prompt = PromptTemplate.from_template("Given {context}, answer the question `{question}` as a tweet.")
# This is how you'd normally build a chain with LCEL. This chain does retrieval then generation (RAG).
init_chain = RunnablePassthrough.assign(context=retrieve) | prompt | llm | StrOutputParser()
# From DSPy, import the modules that know how to interact with LangChain LCEL.
from dspy.predict.langchain import LangChainPredict, LangChainModule
# This is how to wrap it so it behaves like a DSPy program.
# Just Replace every pattern like `prompt | llm` with `LangChainPredict(prompt, llm)`.
zeroshot_chain = RunnablePassthrough.assign(context=retrieve) | LangChainPredict(prompt, llm) | StrOutputParser()
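As the code above shows, you first import the LangChainPredict class. It inherits from both langchain_core.runnables.Runnable and dspy.Predict, so it can be composed into a chain. LangChainPredict wraps the original prompt and LLM and takes their place in the chain; at this point the chain does exactly what init_chain does.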
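Why inheriting from both a chainable base class and a predictor base class is useful can be shown with a toy model. Everything below is a stand-in written for this illustration: the `Runnable`, `Sequence`, `Predict`, and `WrappedPredict` classes are hypothetical simplifications, not the LangChain or DSPy classes, and they only mimic the `|` composition and the presence of optimizable demonstrations.

```python
class Runnable:
    """Toy chainable step: supports `invoke` and composition with `|`."""

    def invoke(self, value):
        raise NotImplementedError

    def __or__(self, other):
        return Sequence(self, other)


class Sequence(Runnable):
    """Runs two steps in order, feeding the first result into the second."""

    def __init__(self, first, second):
        self.first, self.second = first, second

    def invoke(self, value):
        return self.second.invoke(self.first.invoke(value))


class Predict:
    """Toy optimizable predictor: it owns a list of demonstrations."""

    def __init__(self):
        self.demos = []


class WrappedPredict(Predict, Runnable):
    """Chainable like a Runnable and tunable like a Predict."""

    def __init__(self, template, llm):
        super().__init__()
        self.template, self.llm = template, llm

    def invoke(self, inputs):
        demo_text = "".join(f"{demo}\n" for demo in self.demos)
        return self.llm(demo_text + self.template.format(**inputs))


class Upper(Runnable):
    """Toy post-processing step standing in for an output parser."""

    def invoke(self, value):
        return value.upper()


def echo_llm(prompt: str) -> str:
    """Stand-in for a model call: returns the prompt it received."""
    return f"echo: {prompt}"


step = WrappedPredict("Answer `{question}`.", echo_llm)
chain = step | Upper()
print(chain.invoke({"question": "hi"}))
```

An optimizer that only knows about the `demos` attribute can change the behaviour of `step` while the chain built with `|` stays untouched.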
# Then wrap the chain in a DSPy module.
zeroshot_chain = LangChainModule(zeroshot_chain)
question = "In what region was Eddy Mazzoleni born?"
zeroshot_chain.invoke({"question": question})
LangChainModule inherits from dspy.Module, so it can be optimized by DSPy optimizers. It also implements an invoke method, which matches how LangChain is normally used.
from tweet_metric import metric, trainset, valset, devset

evaluate = Evaluate(metric=metric, devset=devset, num_threads=8, display_progress=True, display_table=5)
evaluate(zeroshot_chain)
# Set up the optimizer. We'll use very minimal hyperparameters for this example.
# Just do random search with ~3 attempts, and in each attempt, bootstrap <= 3 traces.
optimizer = BootstrapFewShotWithRandomSearch(metric=metric, max_bootstrapped_demos=3, num_candidate_programs=3)
# Now use the optimizer to *compile* the chain. This could take 5-10 minutes, unless it's cached.
optimized_chain = optimizer.compile(zeroshot_chain, trainset=trainset, valset=valset)
evaluate(optimized_chain)
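As the evaluation shows, the accuracy of optimized_chain improves noticeably after optimization, reaching 52.4%. Users can also tune the optimizer's parameters to push the result further.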
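The "random search" part of the optimizer can be pictured as: build several candidate demonstration sets, score each on a validation set, and keep the best. The sketch below shows only that selection loop. The names `random_search` and `toy_score` and the usefulness numbers are invented for illustration; in the real optimizer each candidate is a bootstrapped program scored with the metric, which this code does not reproduce.

```python
import random

# Invented per-demo usefulness values, standing in for a real validation metric.
USEFULNESS = {"1+1=?": 0.1, "5*5=?": 0.4, "1+5=?": 0.2, "3+3=?": 0.3, "6+6=?": 0.5}


def toy_score(demos: list[str]) -> float:
    """Stand-in validation score: mean usefulness of the chosen demos (0.0 if none)."""
    if not demos:
        return 0.0
    return sum(USEFULNESS[demo] for demo in demos) / len(demos)


def random_search(pool, score_fn, demos_per_candidate=3, num_candidates=3, seed=0):
    """Sample candidate demo sets and keep the one with the highest score."""
    rng = random.Random(seed)
    best_demos, best_score = [], score_fn([])  # baseline: no demonstrations
    for _ in range(num_candidates):
        candidate = rng.sample(pool, demos_per_candidate)
        candidate_score = score_fn(candidate)
        if candidate_score > best_score:
            best_demos, best_score = candidate, candidate_score
    return best_demos, best_score


best_demos, best_score = random_search(list(USEFULNESS), toy_score)
print(best_demos, best_score)
```

Raising the number of candidates or the number of demonstrations per candidate widens the search at the cost of more model calls, which is the trade-off behind tuning the optimizer's parameters.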
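Summary
Using a simple example, this article first showed how DSPy progresses from basic question answering with a zero-shot CoT module, to a CoT module with demonstrations, to building a dataset and using Metrics and an Optimizer to tune a Module for higher accuracy, covering both prediction and evaluation. It then showed how DSPy can optimize, test, and evaluate a LangChain chain. DSPy reduces the brittleness of prompts when switching between language models and turns prompt development from manual writing into abstract, code-level development, providing a more systematic method. The new paradigm proposed by DSPy is likely to have a lasting influence on future LLM application platform technology.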