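As for the latter, researchers from UC Berkeley, Meta, and Microsoft recently applied the idea of RAG to fine-tuning and proposed RAFT (Retrieval Aware Fine-Tuning) [1], a method that sits between fine-tuning and RAG. Conventional fine-tuning builds its training samples from Q->A pairs. RAFT instead adds background context to each sample and deliberately mixes in irrelevant distractor passages, so that the model learns during fine-tuning to pick out the correct answer from noisy context.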
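The idea of mixing the relevant document with distractors can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' data pipeline: the function name `build_raft_sample`, the `num_distractors` parameter, and the `[...]...[...]` joining format (borrowed from the sample shown in this article) are all hypothetical choices.

```python
import random


def build_raft_sample(question, oracle_doc, distractor_pool, num_distractors=2, seed=None):
    """Assemble one RAFT-style sample: the relevant (oracle) document
    shuffled together with irrelevant distractor documents.

    Assumption: distractor_pool does not contain the oracle document.
    """
    rng = random.Random(seed)
    # Pick distractors without replacement, capped by the pool size.
    k = min(num_distractors, len(distractor_pool))
    distractors = rng.sample(distractor_pool, k)
    # Shuffle so the model cannot rely on the oracle's position.
    docs = [oracle_doc] + distractors
    rng.shuffle(docs)
    # Hypothetical serialization: bracketed documents joined by "...".
    context = "...".join(f"[{doc}]" for doc in docs)
    return {"question": question, "context": context}


sample = build_raft_sample(
    question="The Oberoi family is part of a hotel company that has a head office in what city?",
    oracle_doc="The Oberoi Group is a hotel company with its head office in Delhi.",
    distractor_pool=[
        "It is operated by The Ritz-Carlton Hotel Company.",
        "The complex has two towers.",
        "It is located in city center of Jakarta.",
    ],
    seed=0,
)
```

Shuffling matters here: if the relevant document always sat in the same slot, the model could learn the position instead of learning to read the context.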
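Each training sample contains a question, context, an instruction, a CoT answer, and a final answer. Within the answer, ##begin_quote## and ##end_quote## mark the start and end of text copied verbatim from the context. This is an effective way to discourage hallucination and keep the model grounded in the provided context. Here is a sample data example: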
Question: The Oberoi family is part of a hotel company that has a head office in what city?
context: [The Oberoi family is an Indian family that is famous for its involvement in hotels, namely through The Oberoi Group]...[It is located in city center of Jakarta, near Mega Kuningan, adjacent to the sister JW Marriott Hotel. It is operated by The Ritz-Carlton Hotel Company. The complex has two towers that comprises a hotel and the Airlangga Apartment respectively]...[The Oberoi Group is a hotel company with its head office in Delhi.]
Instruction: Given the question, context and answer above, provide a logical reasoning for that answer. Please use the format of: ##Reason: {reason} ##Answer: {answer}.
CoT Answer: ##Reason: The document ##begin_quote## The Oberoi family is an Indian family that is famous for its involvement in hotels, namely through The Oberoi Group. ##end_quote## establishes that the Oberoi family is involved in the Oberoi group, and the document ##begin_quote## The Oberoi Group is a hotel company with its head office in Delhi. ##end_quote## establishes the head office of The Oberoi Group. Therefore, the Oberoi family is part of a hotel company whose head office is in Delhi. ##Answer: Delhi
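Because the quoted spans are meant to be copied from the context, a CoT answer in this format can be parsed and checked mechanically. The sketch below is a hypothetical helper, not part of the official RAFT tooling: `parse_cot_answer` and `ungrounded_quotes` are names invented for illustration, and the check tolerates whitespace differences and a trailing period, since the sample above adds a period to a quote whose source sentence has none.

```python
import re

# Non-greedy match of everything between the two quote markers.
QUOTE_RE = re.compile(r"##begin_quote##(.*?)##end_quote##", re.DOTALL)


def parse_cot_answer(cot):
    """Split a '##Reason: ... ##Answer: ...' string into its parts."""
    reason_part, _, answer_part = cot.partition("##Answer:")
    reason = reason_part.replace("##Reason:", "", 1).strip()
    quotes = [q.strip() for q in QUOTE_RE.findall(reason_part)]
    return {"reason": reason, "quotes": quotes, "answer": answer_part.strip()}


def ungrounded_quotes(quotes, context):
    """Return the quotes that do not appear in the context.

    Assumption: matching ignores whitespace runs and a trailing period.
    """
    def norm(text):
        return " ".join(text.split()).rstrip(".")

    ctx = norm(context)
    return [q for q in quotes if norm(q) not in ctx]


cot = (
    "##Reason: The document ##begin_quote## The Oberoi Group is a hotel company "
    "with its head office in Delhi. ##end_quote## establishes the head office. "
    "##Answer: Delhi"
)
context = "[The Oberoi Group is a hotel company with its head office in Delhi.]"
parsed = parse_cot_answer(cot)
missing = ungrounded_quotes(parsed["quotes"], context)
```

A non-empty `missing` list would flag a generated sample whose "quote" is not actually present in the context, which is one way to filter a synthetic dataset before fine-tuning.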
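The approach is intuitive. The researchers evaluated it on medical (PubMed), general-knowledge (HotPotQA), and API (Gorilla) datasets, where RAFT showed clear gains: a reported improvement of 14.5 over llama2-7b + RAG, which puts it ahead of GPT-3.5 + RAG.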
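Official training guidance is also available [2], covering the full pipeline of dataset generation, fine-tuning, and evaluation. LlamaIndex has also implemented a dataset-generation toolkit [3] that plugs into a LlamaIndex workflow.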