
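A Full Breakdown of Large-Model Open-Source Strategy: The Llama License Agreement and Acceptable Use Policy

Published: 2024-06-15 19:38:08  Views: 3722  Author: 涌现AI法

Large-model development and deployment has clearly split into two paths, open source and closed source. Open source is critical and accounts for roughly half of the field. Although there are moves in the United States to legislate against open-source exports to China, open source remains, at least for now, a very important path.

Meanwhile, as Chinese large models catch up, companies such as Alibaba and 01.AI (零一万物) have open-sourced their own models. Either way, open source is something the large-model industry has to pay attention to. We take Meta's Llama, currently the most representative open-source large model, as our example and break down its open-source strategy and what it means for developers, translating and commenting clause by clause.

Llama's open-source strategy rests on the Community License Agreement and the Acceptable Use Policy. The agreement governs the relationship between Meta and developers, while the policy governs which uses are prohibited for developers. We work from the Llama 2 version and point out where the latest Llama 3 version differs.

This is a long article and is not recommended for non-specialists, but sharing is encouraged.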

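I. Llama 2 Community License Agreement

1. License Rights and Redistribution

a. Grant of Rights

Original: You are granted a non-exclusive, worldwide, non-transferable and royalty-free limited license under Meta's intellectual property or other rights owned by Meta embodied in the Llama Materials to use, reproduce, distribute, copy, create derivative works of, and make modifications to the Llama Materials.

Translation: You are granted a non-exclusive, worldwide, non-transferable and royalty-free limited license to use, reproduce, distribute, create derivative works of, and modify the Llama Materials.

Commentary: You may use, modify, reproduce and distribute the Llama Materials free of charge, but you may not transfer the license to anyone else, and the rights are not exclusive to you.

b. Redistribution and Use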

Original: If you distribute or make the Llama Materials, or any derivative works thereof, available to a third party, you shall provide a copy of this Agreement to such third party.

If you receive Llama Materials, or any derivative works thereof, from a Licensee as part of an integrated end user product, then Section 2 of this Agreement will not apply to you.

You must retain in all copies of the Llama Materials that you distribute the following attribution notice within a "Notice" text file distributed as a part of such copies: "Llama 2 is licensed under the LLAMA 2 Community License, Copyright (c) Meta Platforms, Inc. All Rights Reserved."

Your use of the Llama Materials must comply with applicable laws and regulations (including trade compliance laws and regulations) and adhere to the Acceptable Use Policy for the Llama Materials (available at https://ai.meta.com/llama/use-policy), which is hereby incorporated by reference into this Agreement.

You will not use the Llama Materials or any output or results of the Llama Materials to improve any other large language model (excluding Llama 2 or derivative works thereof).

Translation: If you distribute the Llama Materials or their derivative works, or make them available to a third party, you must give that third party a copy of this Agreement.

If you receive the Llama Materials or their derivative works as part of an integrated end user product, Section 2 of this Agreement (Additional Commercial Terms) does not apply to you.

Your use must comply with applicable laws and regulations (including trade compliance laws and regulations) and with the Acceptable Use Policy for the Llama Materials.

You may not use the Llama Materials or their outputs to improve any other large language model (other than Llama 2 or its derivative works).

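Commentary: If you only receive the Llama Materials as part of an end user product, the Additional Commercial Terms do not apply; the agreement is, after all, aimed at developers.

The ban on using the Llama Materials to improve other large language models is an interesting requirement. It means you may not use Meta's model to train other models. In practice this will probably be hard to avoid, and we expect disputes to follow.

The Acceptable Use Policy available at https://ai.meta.com/llama/use-policy is also very important and can be treated as part of the license. It focuses on the uses that are prohibited for developers, and Part III of this article reads it in more detail.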
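For developers who redistribute the model, the two mechanical obligations in this clause (a copy of the agreement, plus a "Notice" text file carrying the attribution line) can be scripted. The following is a minimal sketch under stated assumptions: the function name `prepare_distribution` and the file name `LICENSE` are hypothetical choices made for illustration, and the clause itself only names the "Notice" file and the attribution text.

```python
from pathlib import Path

# Attribution line quoted in the redistribution clause above.
LLAMA2_NOTICE = (
    "Llama 2 is licensed under the LLAMA 2 Community License, "
    "Copyright (c) Meta Platforms, Inc. All Rights Reserved."
)


def prepare_distribution(dist_dir: str, license_text: str) -> list[str]:
    """Write the Notice file and a copy of the agreement into dist_dir.

    "LICENSE" is this sketch's own file name (an assumption); the clause
    only requires that a copy of the agreement accompany the materials.
    """
    root = Path(dist_dir)
    root.mkdir(parents=True, exist_ok=True)
    (root / "Notice").write_text(LLAMA2_NOTICE + "\n", encoding="utf-8")
    (root / "LICENSE").write_text(license_text, encoding="utf-8")
    # Return the file names so a release script can verify them.
    return sorted(p.name for p in root.iterdir())
```

A release pipeline could call this once per packaged artifact and fail the build if either file is missing. Whether this is enough to satisfy the clause is a legal question the script cannot answer.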

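2. Additional Commercial Terms

Original: If, on the Llama 2 version release date, the monthly active users of the products or services made available by or for Licensee, or Licensee's affiliates, is greater than 700 million monthly active users in the preceding calendar month, you must request a license from Meta, which Meta may grant to you in its sole discretion, and you are not authorized to exercise any of the rights under this Agreement unless or until Meta otherwise expressly grants you such rights.

Translation: If, on the Llama 2 version release date, the products or services offered by you or your affiliates have more than 700 million monthly active users, you must request a license from Meta, and you are not authorized to exercise any rights under this Agreement until Meta expressly grants them.

Commentary: A licensee whose products or services had more than 700 million monthly active users must request a special license from Meta and cannot use the Llama Materials without it. Note that, as the clause is worded, the figure is measured for the calendar month preceding the Llama 2 release date and covers the products of the licensee and its affiliates, rather than the usage a derivative model later reaches. This deserves close attention: Meta has in effect built in a "circuit breaker," since a business of that scale could affect Meta itself, and it wants to avoid "teaching the apprentice and starving the master." Of course, very few companies reach 700 million users, and one with that operating capability would probably not rely directly on open source from the start; at most it would build something in-house that borrows heavily from the open model.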
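The threshold in this clause is a strict "greater than" comparison, which is easy to get wrong by one. A minimal sketch of the check follows; the function and constant names are hypothetical, and which products and which month count toward the figure is a legal question that the code does not settle.

```python
# "700 million monthly active users" as stated in the clause.
MAU_THRESHOLD = 700_000_000


def needs_separate_license(mau_preceding_month: int) -> bool:
    """Return True when the MAU figure is strictly greater than the threshold.

    The caller is assumed to pass the combined monthly active users of the
    licensee and its affiliates for the calendar month the clause refers to.
    """
    return mau_preceding_month > MAU_THRESHOLD
```

Exactly 700 million users does not trip the check, because the clause says "greater than."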

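3. Disclaimer of Warranty

Original: UNLESS REQUIRED BY APPLICABLE LAW, THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS THEREFROM ARE PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, WITHOUT LIMITATION, ANY WARRANTIES OF TITLE, NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE. YOU ARE SOLELY RESPONSIBLE FOR DETERMINING THE APPROPRIATENESS OF USING OR REDISTRIBUTING THE LLAMA MATERIALS AND ASSUME ANY RISKS ASSOCIATED WITH YOUR USE OF THE LLAMA MATERIALS AND ANY OUTPUT AND RESULTS.

Translation: Unless required by applicable law, the Llama Materials and their outputs are provided "as is," without warranties of any kind, express or implied, including but not limited to warranties of title, non-infringement, merchantability, or fitness for a particular purpose. You bear all risks of using or redistributing the Llama Materials and their outputs.

Commentary: As a base model, Llama can hardly control what derivative models are used for, so Meta disclaims responsibility up front. Developers of derivative models should not expect the base model to cover for them if something goes wrong. We do note that some base-model providers, to attract developers, offer "content guardrails" or backstop measures so that developers are less exposed to infringement claims over generated content, but so far only closed-source providers such as OpenAI do this, and Llama, being open source, may not follow.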

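4. Limitation of Liability

Original: IN NO EVENT WILL META OR ITS AFFILIATES BE LIABLE UNDER ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, TORT, NEGLIGENCE, PRODUCTS LIABILITY, OR OTHERWISE, ARISING OUT OF THIS AGREEMENT, FOR ANY LOST PROFITS OR ANY INDIRECT, SPECIAL, CONSEQUENTIAL, INCIDENTAL, EXEMPLARY OR PUNITIVE DAMAGES, EVEN IF META OR ITS AFFILIATES HAVE BEEN ADVISED OF THE POSSIBILITY OF ANY OF THE FOREGOING.

Translation: In no event will Meta or its affiliates be liable under any theory of liability, whether in contract, tort, negligence, products liability, or otherwise, for any lost profits or any indirect, special, consequential, incidental, exemplary or punitive damages arising out of this Agreement, even if Meta or its affiliates have been advised of the possibility of such damages.

Commentary: Same as for the previous clause.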

5. Intellectual Property

a. No trademark licenses

Original: No trademark licenses are granted under this Agreement, and in connection with the Llama Materials, neither Meta nor Licensee may use any name or mark owned by or associated with the other or any of its affiliates, except as required for reasonable and customary use in describing and redistributing the Llama Materials.

Translation: This Agreement grants no trademark rights. Neither Meta nor the licensee may use any name or mark of the other or its affiliates, except for reasonable and customary use in describing and redistributing the Llama Materials.

b. Ownership of Derivative Works

Original: Subject to Meta's ownership of Llama Materials and derivatives made by or for Meta, with respect to any derivative works and modifications of the Llama Materials that are made by you, as between you and Meta, you are and will be the owner of such derivative works and modifications.

Translation: As between you and Meta, you own any derivative works and modifications that you make.

Commentary: "Ownership" is used loosely here and has to be read against each country's law, but it broadly means copyright. That matches the convention that the developer of a derivative work holds the copyright in the new work; otherwise open-sourcing would be pointless. It also means the developer is responsible for the new model and must not harm the interests of the base model.

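c. Termination for IP Claims

Original: If you institute litigation or other proceedings against Meta or any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Llama Materials or Llama 2 outputs or results, or any portion of any of the foregoing, constitutes an infringement of intellectual property or other rights owned or licensable by you, then any licenses granted to you under this Agreement shall terminate as of the date such litigation or claim is filed or instituted. You will indemnify and hold harmless Meta from and against any claim by any third party arising out of or related to your use or distribution of the Llama Materials.

Translation: If you bring litigation against Meta alleging that the Llama Materials or their outputs infringe your intellectual property, any licenses granted to you under this Agreement terminate as of the date the litigation is filed. You will indemnify Meta and hold it harmless from any third-party claim arising out of your use or distribution of the Llama Materials.

Commentary: Having used someone's model, you cannot then sue them over it, which is reasonable. But if a third party sues the developer and Llama together, the developer has to bear Llama's share of the damages, and that may be problematic: if the model generates infringing content the developer never wanted, should the developer take the blame for the model? At the very least, this clause warns developers that simply taking the model as-is carries risk, and that fine-tuning, training and content guardrails need to be done properly afterwards.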

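6. Term and Termination

Original: The term of this Agreement will commence upon your acceptance of this Agreement or access to the Llama Materials and will continue in full force and effect until terminated in accordance with the terms and conditions herein. Meta may terminate this Agreement if you are in breach of any term or condition of this Agreement. Upon termination of this Agreement, you shall delete and cease use of the Llama Materials. Sections 3, 4 and 7 shall survive the termination of this Agreement.

Translation: The term of this Agreement begins when you accept it or access the Llama Materials and continues until terminated under its terms and conditions. Meta may terminate this Agreement if you breach any of its terms or conditions. Upon termination, you must delete and stop using the Llama Materials. Sections 3, 4 and 7 survive termination.

Commentary: The agreement takes effect when you accept it or access the Llama Materials. If a developer breaches it, Meta may terminate the license, after which the developer must delete the Llama Materials. This clause is potent. Developers should avoid provoking Meta into terminating, or integrate more than one model from the outset as a fallback.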

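7. Governing Law and Jurisdiction

Original: This Agreement will be governed and construed under the laws of the State of California without regard to choice of law principles, and the UN Convention on Contracts for the International Sale of Goods does not apply to this Agreement. The courts of California shall have exclusive jurisdiction of any dispute arising out of this Agreement.

Translation: This Agreement is governed by and construed under the laws of the State of California, without regard to choice of law principles. The UN Convention on Contracts for the International Sale of Goods does not apply. The courts of California have exclusive jurisdiction over any dispute arising out of this Agreement.

Commentary: Developers may have to litigate in California, which raises legal costs.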

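II. Llama 3 Community License Agreement

The Llama 3 Community License Agreement builds on the Llama 2 agreement discussed above. The main differences are:

Redistribution and Use:

Llama 3: requires that "Built with Meta Llama 3" be prominently displayed on a related website, user interface, blog post, about page, or product documentation.
Llama 3: provides that if you use the Llama Materials to create, train, fine-tune, or otherwise improve an AI model that is then distributed, "Llama 3" must appear at the beginning of the AI model's name.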
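The two Llama 3 branding requirements above lend themselves to a pre-release check. The sketch below is illustrative only: the function name `check_llama3_branding` and the idea of passing the documentation as a single string are assumptions of this example, and a plain string match says nothing about whether a display is "prominent" in the sense the license intends.

```python
# Strings taken from the two Llama 3 requirements described above.
REQUIRED_NAME_PREFIX = "Llama 3"
REQUIRED_BADGE = "Built with Meta Llama 3"


def check_llama3_branding(model_name: str, docs_text: str) -> list[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    if not model_name.startswith(REQUIRED_NAME_PREFIX):
        problems.append(
            f'model name {model_name!r} does not begin with "{REQUIRED_NAME_PREFIX}"'
        )
    if REQUIRED_BADGE not in docs_text:
        problems.append(f'documentation does not contain "{REQUIRED_BADGE}"')
    return problems
```

For example, a model named "LegalBot-Llama 3" fails the first check because the required text sits at the end of the name rather than the beginning.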

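Disclaimer of Warranty:

Llama 3: expressly states that Meta disclaims all warranties, express or implied, which Llama 2 did not spell out.

No trademark licenses:

Llama 3: allows the "Llama 3" mark to be used in specified circumstances, provided Meta's brand guidelines are followed; any goodwill arising from that use belongs to Meta.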

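III. Acceptable Use Policy (no substantive difference between Llama 2 and Llama 3)

Llama clearly uses an "open-source license plus use policy" model to govern its relationship with developers: the agreement mainly defines the relationship, and the policy mainly constrains conduct. We go through the policy item by item as well:

Original: Meta is committed to promoting safe and fair use of its tools and features, including Llama 2. If you access or use Llama 2, you agree to this Acceptable Use Policy (“Policy”). The most recent copy of this policy can be found at ai.meta.com/llama/use-policy.

Translation: Meta is committed to promoting the safe and fair use of its tools and features, including Llama 2. By accessing or using Llama 2, you agree to this Acceptable Use Policy (the "Policy"). The latest version of the Policy is available at ai.meta.com/llama/use-policy.

Prohibited Uses

1. Violate the law or others’ rights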

Original: Engage in, promote, generate, contribute to, encourage, plan, incite, or further illegal or unlawful activity or content, such as:

Violence or terrorism
Exploitation or harm to children, including the solicitation, creation, acquisition, or dissemination of child exploitative content or failure to report Child Sexual Abuse Material
Human trafficking, exploitation, and sexual violence
The illegal distribution of information or materials to minors, including obscene materials, or failure to employ legally required age-gating in connection with such information or materials
Sexual solicitation
Any other criminal activity

Engage in, promote, incite, or facilitate the harassment, abuse, threatening, or bullying of individuals or groups of individuals
Engage in, promote, incite, or facilitate discrimination or other unlawful or harmful conduct in the provision of employment, employment benefits, credit, housing, other economic benefits, or other essential goods and services
Engage in the unauthorized or unlicensed practice of any profession including, but not limited to, financial, legal, medical/health, or related professional practices
Collect, process, disclose, generate, or infer health, demographic, or other sensitive personal or private information about individuals without rights and consents required by applicable laws
Engage in or facilitate any action or generate any content that infringes, misappropriates, or otherwise violates any third-party rights, including the outputs or results of any products or services using the Llama 2 Materials
Create, generate, or facilitate the creation of malicious code, malware, computer viruses or do anything else that could disable, overburden, interfere with or impair the proper working, integrity, operation or appearance of a website or computer system

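Translation:

Engage in, promote, generate, contribute to, encourage, plan, incite, or further illegal or unlawful activity or content, such as:

Violence or terrorism
Exploitation of or harm to children, including soliciting, creating, acquiring, or disseminating child exploitative content, or failing to report child sexual abuse material
Human trafficking, exploitation, and sexual violence
Illegally distributing information or materials to minors, including obscene materials, or failing to apply legally required age restrictions to such information or materials
Sexual solicitation
Any other criminal activity

Engage in, promote, incite, or facilitate the harassment, abuse, threatening, or bullying of individuals or groups
Engage in, promote, incite, or facilitate discrimination or other unlawful or harmful conduct in the provision of employment, employment benefits, credit, housing, other economic benefits, or other essential goods and services
Engage in unauthorized or unlicensed professional practice, including but not limited to financial, legal, medical/health, or related professional practice
Collect, process, disclose, generate, or infer health, demographic, or other sensitive personal or private information about individuals without the rights and consents required by law
Engage in or facilitate any action, or generate any content, that infringes, misappropriates, or otherwise violates third-party rights, including the outputs of products or services that use the Llama 2 Materials
Create, generate, or facilitate the creation of malicious code, malware, or computer viruses, or do anything that could disable, overburden, interfere with, or impair the proper working, integrity, operation, or appearance of a website or computer system

Commentary:

Llama 2 may not be used for illegal activity, violence, harm to children, harassment, discrimination, or unauthorized professional practice, nor to generate malicious code or do anything that interferes with computer systems. The ban on unauthorized professional practice is interesting; driverless cars come to mind.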

2. Activities presenting a risk of death or bodily harm

Original:

Engage in, promote, incite, facilitate, or assist in the planning or development of activities that present a risk of death or bodily harm to individuals, including use of Llama 2 related to the following:

Military, warfare, nuclear industries or applications, espionage, use for materials or activities that are subject to the International Traffic Arms Regulations (ITAR) maintained by the United States Department of State
Guns and illegal weapons (including weapon development)
Illegal drugs and regulated/controlled substances
Operation of critical infrastructure, transportation technologies, or heavy machinery
Self-harm or harm to others, including suicide, cutting, and eating disorders
Any content intended to incite or promote violence, abuse, or any infliction of bodily harm to an individual

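Translation:

Engage in, promote, incite, facilitate, or assist in the planning or development of activities that present a risk of death or bodily harm to individuals, including use of Llama 2 related to the following:

Military, warfare, nuclear industries or applications, espionage, or materials or activities subject to the International Traffic in Arms Regulations (ITAR) maintained by the United States Department of State
Guns and illegal weapons (including weapon development)
Illegal drugs and regulated/controlled substances
Operation of critical infrastructure, transportation technologies, or heavy machinery
Self-harm or harm to others, including suicide, cutting, and eating disorders
Any content intended to incite or promote violence, abuse, or the infliction of bodily harm on an individual

Commentary:

Llama 2 may not be used for any activity that could cause death or bodily harm, such as military uses, weapon development, illegal drugs, or self-harm. This is also critical, but compliance will probably depend on the base model's own ability to block such content. At most, developers can add keyword blocking from the outside and refrain from targeted training in these areas.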
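The external keyword blocking mentioned in the commentary could look roughly like the sketch below. Everything here is an assumption made for illustration: the two placeholder terms, the function names, and the choice of whole-phrase, case-insensitive matching. A keyword list of this kind is a coarse filter and is not presented as sufficient for compliance with the policy.

```python
import re

# Hypothetical placeholder terms; a real deployment would maintain its own list.
BLOCKED_TERMS = ["weapon development", "nuclear"]


def build_blocklist_pattern(terms: list[str]) -> re.Pattern[str]:
    """Compile a case-insensitive pattern that matches any term as a whole phrase."""
    if not terms:
        raise ValueError("at least one blocked term is required")
    escaped = "|".join(re.escape(term) for term in terms)
    return re.compile(r"\b(?:" + escaped + r")\b", re.IGNORECASE)


def is_blocked(text: str, pattern: re.Pattern[str]) -> bool:
    """Return True if the prompt or output contains a blocked term."""
    return pattern.search(text) is not None
```

The same check can be run on the user's prompt before it reaches the model and again on the model's output before it reaches the user.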

3. Intentionally deceive or mislead others

Original:

Intentionally deceive or mislead others, including use of Llama 2 related to the following:

Generating, promoting, or furthering fraud or the creation or promotion of disinformation
Generating, promoting, or furthering defamatory content, including the creation of defamatory statements, images, or other content
Generating, promoting, or further distributing spam
Impersonating another individual without consent, authorization, or legal right
Representing that the use of Llama 2 or outputs are human-generated
Generating or facilitating false online engagement, including fake reviews and other means of fake online engagement

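Translation:

Intentionally deceive or mislead others, including use of Llama 2 related to the following:

Generating, promoting, or furthering fraud, or creating or promoting disinformation
Generating, promoting, or further spreading defamatory content, including creating defamatory statements, images, or other content
Generating, promoting, or further distributing spam
Impersonating another person without consent, authorization, or legal right
Representing that Llama 2 or its outputs are human-generated
Generating or facilitating false online engagement, including fake reviews and other means of fake online engagement

Commentary:

Llama 2 may not be used for fraud, disinformation, defamation, impersonation, fake online engagement, and similar conduct. This covers not only the traditional underground and gray-market economy but also a wider range of false content online.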

4. Fail to appropriately disclose known dangers of your AI system

Original:

Fail to appropriately disclose to end users any known dangers of your AI system

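Translation:

Fail to appropriately disclose to end users any known dangers of your AI system

Commentary:

You must disclose to users any known dangers your AI system may present.

Reporting Violations

Original: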

Reporting issues with the model: https://github.com/facebookresearch/llama
Reporting risky content generated by the model: developers.facebook.com/llama_output_feedback
Reporting bugs and security concerns: facebook.com/whitehat/info
Reporting violations of the Acceptable Use Policy or unlicensed uses of Llama: LlamaUseReport@meta.com

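Translation:

Report problems with the model: https://github.com/facebookresearch/llama
Report risky content generated by the model: developers.facebook.com/llama_output_feedback
Report bugs and security issues: facebook.com/whitehat/info
Report violations of the Acceptable Use Policy or unlicensed use: LlamaUseReport@meta.com

Summary

Overall, Llama's open-source terms are fairly reasonable: developers get the rights they should get, Meta shifts liability away from itself thoroughly, and it keeps enough initiative to act against improper conduct by developers when needed. What the agreement does not address is who owns the content a derivative model generates in use. Concretely, once a model developed on Llama is deployed and a user's prompt yields new content (an image, for example), the agreement says nothing about who holds the rights in that work. This is a question of broad interest: it bears on the rights of actual users and on the liability of model developers. We do not know why Llama left it unaddressed; perhaps the law and judicial practice in most countries have not settled it yet.

In addition, the license does not expressly require that new models built by licensees remain open source. In principle, then, whether a Llama-based model is open-sourced is the developer's own decision rather than a requirement of the agreement. Because this matters a great deal for commercial use, it would be better spelled out in the next version of the agreement, and developers who later draft a license for their own derivative models should consider it too.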