
Retrieval Aware Fine-Tuning (RAFT): A New Method for Improving Domain-Specific RAG

Published: 2024-04-25 07:26:05 Views: 2190 Author: AI工程化

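Generally speaking, applying a large language model to a specific industry requires the model to know that industry's knowledge. There are three common ways to bring this knowledge in. The first is to feed domain documents to the model during pre-training and extend the vocabulary with domain terms. The other two, fine-tuning and RAG, are used more often. Fine-tuning trains domain knowledge into the model in the form of question-answer pairs, while RAG adds domain knowledge to the prompt as context so that the model can draw on it to answer domain questions. A vivid analogy: fine-tuning is like a closed-book exam, where no context is added to the prompt (zero-shot), whereas RAG is like an open-book exam. Although RAG beats fine-tuning on cost, content freshness, and flexibility, it suffers from two problems: the supplied context may not contain the answer, or it may contain cluttered, distracting information. Either way, the model ends up unable to answer the question correctly. An approach to the first problem was discussed in the earlier article on RAG 2.0. (RAG 2.0 Is Here: Can It Be a Boon for Production Deployment?)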
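To make the closed-book versus open-book analogy concrete, here is a minimal Python sketch of the two prompt styles. The function names and the prompt wording are assumptions chosen for illustration; they are not taken from any particular library or from the RAFT paper.

```python
def build_closed_book_prompt(question: str) -> str:
    """Closed-book (zero-shot): no context, the model relies on its weights."""
    return f"Question: {question}\nAnswer:"


def build_open_book_prompt(question: str, passages: list[str]) -> str:
    """Open-book (RAG): retrieved passages are placed in the prompt as context."""
    # Number the passages so the model (and a reader) can tell them apart.
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


if __name__ == "__main__":
    q = "The Oberoi Group has its head office in what city?"
    print(build_closed_book_prompt(q))
    print(build_open_book_prompt(
        q, ["The Oberoi Group is a hotel company with its head office in Delhi."]
    ))
```

If the retrieved passages are off-topic or noisy, the open-book prompt carries that noise straight to the model, which is the second problem described above.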

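As for the second problem, researchers from UC Berkeley, Meta, and Microsoft recently applied the idea behind RAG to fine-tuning and proposed RAFT (Retrieval Aware Fine-Tuning) [1], a method that sits between fine-tuning and RAG. In conventional fine-tuning, each sample is a Q->A pair. In RAFT, each sample additionally carries background context, and that context deliberately includes some irrelevant, distracting information. In other words, the model learns during fine-tuning itself to extract the correct answer from among distracting information.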
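The idea of mixing distractors into the fine-tuning context can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' data pipeline: `build_raft_sample`, its parameters, and the dictionary layout are hypothetical names, and the number of distractors is an arbitrary default.

```python
import random


def build_raft_sample(question, answer, oracle_doc, distractor_pool,
                      num_distractors=2, seed=None):
    """Assemble one training sample: the relevant document plus distractors.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    # Never pick the relevant document itself as a distractor.
    candidates = [d for d in distractor_pool if d != oracle_doc]
    distractors = rng.sample(candidates, num_distractors)
    context = [oracle_doc] + distractors
    # Shuffle so the model cannot learn that the answer is always first.
    rng.shuffle(context)
    return {"question": question, "context": context, "answer": answer}


if __name__ == "__main__":
    sample = build_raft_sample(
        question="The Oberoi Group has its head office in what city?",
        answer="Delhi",
        oracle_doc="The Oberoi Group is a hotel company with its head office in Delhi.",
        distractor_pool=[
            "It is operated by The Ritz-Carlton Hotel Company.",
            "The complex has two towers.",
            "It is located in city center of Jakarta.",
        ],
        seed=0,
    )
    print(sample["context"])
```

Shuffling the context is a design choice of this sketch: it keeps the position of the relevant document from becoming a shortcut during training.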

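Each training sample contains a question, context, an instruction, a CoT answer, and a final answer. In the answer, ##begin_quote## and ##end_quote## mark the beginning and end of a quotation copied verbatim from the context. This is an effective way to keep the model from hallucinating and to make it stick to the provided context. Here is an example sample: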

Question: The Oberoi family is part of a hotel company that has a head office in what city?
context: [The Oberoi family is an Indian family that is famous for its involvement in hotels, namely through The Oberoi Group]...[It is located in city center of Jakarta, near Mega Kuningan, adjacent to the sister JW Marriott Hotel. It is operated by The Ritz-Carlton Hotel Company. The complex has two towers that comprises a hotel and the Airlangga Apartment respectively]...[The Oberoi Group is a hotel company with its head office in Delhi.]
Instruction: Given the question, context and answer above, provide a logical reasoning for that answer. Please use the format of: ##Reason: {reason} ##Answer: {answer}.
CoT Answer: ##Reason: The document ##begin_quote## The Oberoi family is an Indian family that is famous for its involvement in hotels, namely through The Oberoi Group. ##end_quote## establishes that the Oberoi family is involved in the Oberoi group, and the document ##begin_quote## The Oberoi Group is a hotel company with its head office in Delhi. ##end_quote## establishes the head office of The Oberoi Group. Therefore, the Oberoi family is part of a hotel company whose head office is in Delhi. ##Answer: Delhi
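Because the quote markers are plain text, a CoT answer in this format can be checked mechanically. The sketch below pulls out the quoted spans and the final answer and reports any quote that does not appear in the context. The helper names are hypothetical, and the normalization (collapsing whitespace and ignoring a trailing period) is an assumption made so that quotes like the ones in the sample above still match.

```python
import re

# Non-greedy match of everything between the two markers.
QUOTE_RE = re.compile(r"##begin_quote##(.*?)##end_quote##", re.DOTALL)


def extract_quotes(cot_answer):
    """Return the spans the answer claims to have copied from the context."""
    return [q.strip() for q in QUOTE_RE.findall(cot_answer)]


def extract_final_answer(cot_answer):
    """Return the text after the last '##Answer:' marker, or None if absent."""
    marker = "##Answer:"
    idx = cot_answer.rfind(marker)
    if idx == -1:
        return None
    return cot_answer[idx + len(marker):].strip()


def ungrounded_quotes(cot_answer, context):
    """Return quotes that cannot be found in the context."""
    def normalize(text):
        # Collapse whitespace and drop a trailing period before comparing.
        return " ".join(text.split()).rstrip(".")

    flat_context = " ".join(context.split())
    return [q for q in extract_quotes(cot_answer)
            if normalize(q) not in flat_context]


if __name__ == "__main__":
    context = "[The Oberoi Group is a hotel company with its head office in Delhi.]"
    cot = ("##Reason: The document ##begin_quote## The Oberoi Group is a hotel "
           "company with its head office in Delhi. ##end_quote## establishes "
           "the head office of The Oberoi Group. ##Answer: Delhi")
    print(extract_quotes(cot))
    print(extract_final_answer(cot))
    print(ungrounded_quotes(cot, context))
```

A check like this could be used to filter generated training samples whose quotes are not actually present in the context, though the source does not say whether the authors do so.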

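This approach is intuitive. The researchers evaluated it on the Medical (PubMed), General-knowledge (HotPotQA), and API (Gorilla) datasets, where RAFT delivered clear gains: it improved on llama2-7b+RAG by 14.5 and outperformed GPT-3.5+RAG.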