
A Roundup of LLM Inference API Prices in China and Abroad

Published: 2024-06-12 21:22:36 Source: 慢慢学 AIGC

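This article surveys the API prices of LLM inference services in China and abroad as of 2024-05-17, along with the features each one supports (context window length, multimodal input support, and so on). Vendors may adjust prices at any time, so treat the latest official figures as authoritative. The summary table is at the end of the article. Entries not yet covered here will be added in later updates; message the author privately if you need the original document.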

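Note: to make side-by-side comparison easier, all prices are quoted per million (1M) tokens, assuming input tokens and output tokens each make up 50% of the total, and both domestic and overseas prices are converted to Chinese yuan (CNY).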
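The normalization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the exchange rate `USD_TO_CNY = 7.2` is a placeholder (the article does not state the rate it used), the `Price` class and `blended_cny_per_million` function are hypothetical names, and the two sample prices are taken from the summary table at the end of the article.

```python
from dataclasses import dataclass

# Assumed exchange rate, for illustration only; the article does not say which rate it used.
USD_TO_CNY = 7.2


@dataclass(frozen=True)
class Price:
    """Per-1M-token list price of one model, in the currency the vendor quotes."""
    input_per_m: float
    output_per_m: float
    currency: str  # "USD" or "CNY"


def blended_cny_per_million(price: Price, input_share: float = 0.5) -> float:
    """Blend input and output prices by token share, then convert the result to CNY."""
    if not 0.0 <= input_share <= 1.0:
        raise ValueError("input_share must be between 0 and 1")
    blended = input_share * price.input_per_m + (1.0 - input_share) * price.output_per_m
    if price.currency == "USD":
        return blended * USD_TO_CNY
    if price.currency == "CNY":
        return blended
    raise ValueError(f"unsupported currency: {price.currency}")


# Sample prices from the summary table (GPT-4o in USD, deepseek-chat in CNY).
gpt_4o = Price(input_per_m=5.0, output_per_m=15.0, currency="USD")
deepseek_chat = Price(input_per_m=1.0, output_per_m=2.0, currency="CNY")

print(round(blended_cny_per_million(gpt_4o), 2))
print(round(blended_cny_per_million(deepseek_chat), 2))
```

Passing a different `input_share` (for example 0.8 for prompt-heavy workloads) shows how sensitive the comparison is to the 50/50 assumption, since many vendors charge more for output tokens than for input tokens.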

OpenAI GPT

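OpenAI is the current industry benchmark. It has the most capable model at the time of writing, GPT-4o, and the broadest user base, so its high prices are not without reason. For details on GPT-4o, see "Breaking: OpenAI Releases GPT-4o with Real-Time Audio and Video Interaction".

Source: https://platform.openai.com/docs/models

Price table: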

Anthropic Claude

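Anthropic and OpenAI share deep roots. After disagreements over OpenAI's direction, and out of concern that Microsoft's initial USD 1 billion investment would push OpenAI down a more commercial path and away from its original mission, a team led by Dario Amodei, then OpenAI's VP of Research, left and founded Anthropic in 2021. In March 2023 Anthropic released Claude, a ChatGPT-like AI chatbot. Claude was reported to have 52 billion parameters and was initially offered as a closed beta through a Slack integration. In July 2023 Anthropic launched Claude 2, a new version of the chatbot with the context window extended to 100K tokens, competing directly with GPT-4. Claude 3 followed in March 2024, keeping close pace with OpenAI.

Source: https://www.anthropic.com/api

Price table: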

Google Gemini

Google updated the Gemini 1.5 Pro model family at Google I/O 2024. For conference highlights, see "Google I/O 2024 Highlights".

Source: https://ai.google.dev/pricing

Price table:

Mistral

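Mistral AI is a France-based startup founded by former Google DeepMind and Meta employees. Its open-source models are Mistral 7B, Mixtral 8x7B, and Mixtral 8x22B; their performance is shown in the figure below.

Besides the open-source models, the official API also offers closed-source models.

Sources:

https://mistral.ai/technology/

https://console.mistral.ai/billing/subscribe/

Price table: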

Groq

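Groq is a startup founded by former members of Google's TPU team. It mainly provides custom chips for LLMs (called Language Processing Units, or LPUs) together with the supporting systems. Its LLaMA 3 70B inference speed exceeds 300 tokens/s.

Source: https://wow.groq.com/

Price table: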

Baidu Qianfan

Source: https://cloud.baidu.com/doc/WENXINWORKSHOP/s/hlrk4akp7

Price table:

Tencent Hunyuan

Source: https://cloud.tencent.com/document/product/1729/97731

Price table:

Alibaba Tongyi Qianwen

Source: https://help.aliyun.com/document_detail/2712573.html

Price table:

iFlytek Spark

Source: https://xinghuo.xfyun.cn/sparkapi

Price table:

ByteDance Doubao

Source: https://www.volcengine.com/docs/82379/1099320

Price table:

Moonshot AI Kimi

Source: https://platform.moonshot.cn/docs/pricing

Price table:

Zhipu AI

Source: https://open.bigmodel.cn/pricing

Price table:

MiniMax

Source: https://www.minimaxi.com/document/price

Price table:

01.AI

Source: https://platform.lingyiwanwu.com/docs#%E4%BA%A7%E5%93%81%E5%AE%9A%E4%BB%B7

Price table:

Baichuan

Source: https://platform.baichuan-ai.com/price

Price table:

DeepSeek

Source: https://platform.deepseek.com/api-docs/zh-cn/pricing/

Price table:

Summary table

The table below summarizes the currently known API prices of LLM inference services:

Vendor	Model version	Knowledge cutoff	Context length (tokens)	Price per 1M tokens	Modality support
OpenAI	GPT-4o	October 2023	128K	Input $5, output $15	TIAV->TA
	GPT-4 Turbo	December 2023	128K	Input $10, output $30	TI->T
	GPT-3.5 Turbo	September 2021	16K	Input $0.5, output $1.5	T->T
Anthropic	Claude 3 Haiku	Unknown	200K	Input $0.25, output $1.25	TI->T
	Claude 3 Sonnet	Unknown	200K	Input $3, output $15	TI->T
	Claude 3 Opus	Unknown	200K	Input $15, output $75	TI->T
	Claude 2.1	Unknown	200K	Input $8, output $24	TI->T
	Claude 2.0	Unknown	100K	Input $8, output $24	T->T
	Claude Instant	Unknown	100K	Input $0.8, output $2.4	T->T
Google	Gemini 1.5 Flash	Unknown	128K	Input $0.35, output $0.53	TI->T
	Gemini 1.5 Flash	Unknown	1M	Input $0.7, output $1.05	TI->T
	Gemini 1.5 Pro	Unknown	128K	Input $3.5, output $10.5	TI->T
	Gemini 1.5 Pro	Unknown	2M	Input $7, output $21	TI->T
	Gemini 1.0 Pro	Unknown	32K	Input $0.5, output $1.5	TI->T
Mistral	open-mixtral-8x7b	Unknown	32K	Input $0.7, output $0.7	T->T
	open-mixtral-8x22b	Unknown	64K	Input $2, output $6	T->T
	mistral-small-2402	Unknown	32K	Input $1, output $3	T->T
	mistral-medium	Unknown	32K	Input $2.7, output $8.1	T->T
	mistral-large-2402	Unknown	32K	Input $4, output $12	T->T
Groq	LLaMA 3 70B	Unknown	8K	Input $0.59, output $0.79	T->T
	Mixtral 8x7B	Unknown	32K	Input $0.24, output $0.24	T->T
	LLaMA 3 8B	Unknown	8K	Input $0.05, output $0.1	T->T
	Gemma 7B	Unknown	8K	Input $0.1, output $0.1	T->T
Baidu	ERNIE 4.0	Unknown	4K	Input ¥120, output ¥120	T->T
	ERNIE 3.5	Unknown	4K	Input ¥12, output ¥12	T->T
Tencent	hunyuan-pro	Unknown	32K	Input ¥100, output ¥100	T->T
	hunyuan-standard	Unknown	32K	Input ¥10, output ¥10	T->T
	hunyuan-standard-256k	Unknown	256K	Input ¥120, output ¥120	T->T
	hunyuan-lite	Unknown	4K	Input ¥8, output ¥8	T->T
Alibaba	qwen-turbo	Unknown	8K	Input ¥8, output ¥8	T->T
	qwen-plus	Unknown	32K	Input ¥20, output ¥20	T->T
	qwen-max	Unknown	8K	Input ¥120, output ¥120	T->T
	qwen-max-longcontext	Unknown	32K	Input ¥120, output ¥120	T->T
MiniMax	abab6.5	Unknown	8K	Input ¥30, output ¥30	T->T
	abab6.5s	Unknown	245K	Input ¥10, output ¥10	T->T
	abab6	Unknown	32K	Input ¥100, output ¥100	T->T
	abab5.5	Unknown	8K	Input ¥15, output ¥15	T->T
	abab5.5s	Unknown	8K	Input ¥5, output ¥5	T->T
01.AI	yi-large	Unknown	16K	Input ¥20, output ¥20	T->T
	yi-large-turbo	Unknown	16K	Input ¥12, output ¥12	T->T
	yi-large-rag	Unknown	16K	Input ¥25, output ¥25	T->T
	yi-medium	Unknown	16K	Input ¥2.5, output ¥2.5	T->T
	yi-medium-200k	Unknown	200K	Input ¥12, output ¥12	T->T
	yi-spark	Unknown	16K	Input ¥1, output ¥1	T->T
	yi-vision	Unknown	4K	Input ¥6, output ¥6	TI->T
Baichuan AI	Baichuan2-Turbo	Unknown	8K	Input ¥8, output ¥8	T->T
	Baichuan2-Turbo-192k	Unknown	192K	Input ¥16, output ¥16	T->T
	Baichuan2-53B	Unknown	?	Input ¥20, output ¥20	T->T
DeepSeek	deepseek-chat	Unknown	32K	Input ¥1, output ¥2	T->T
	deepseek-coder	Unknown	16K	Input ¥1, output ¥2	T->T
Zhipu AI	GLM-4	Unknown	128K	Input ¥100, output ¥100	T->T
	GLM-4(Batch)	Unknown	128K	Input ¥50, output ¥50	T->T
	GLM-4V	Unknown	2K	Input ¥100, output ¥100	TI->T
	GLM-3-Turbo	Unknown	128K	Input ¥1, output ¥1	T->T
	GLM-3-Turbo(Batch)	Unknown	128K	Input ¥0.5, output ¥0.5	T->T
Moonshot AI	moonshot-v1-8k	Unknown	8K	Input ¥12, output ¥12	T->T
	moonshot-v1-32k	Unknown	32K	Input ¥24, output ¥24	T->T
	moonshot-v1-128k	Unknown	128K	Input ¥60, output ¥60	T->T
iFlytek	Spark V3.5	Unknown	?	Input ¥240, output ¥240	T->T
ByteDance Doubao	Doubao-lite-4k	Unknown	4K	Input ¥0.3, output ¥0.6	T->T
	Doubao-lite-32k	Unknown	32K	Input ¥0.3, output ¥0.6	T->T
	Doubao-lite-128k	Unknown	128K	Input ¥0.8, output ¥1	T->T
	Doubao-pro-32k	Unknown	32K	Input ¥0.8, output ¥2	T->T
	Doubao-pro-128k	Unknown	128K	Input ¥5, output ¥9	T->T

Modality legend (letters before -> are input modalities, letters after -> are output modalities):
T: Text
I: Image
A: Audio
V: Video
E: Embedding
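
The compact modality codes in the last column of the table (for example TIAV->TA) can be expanded mechanically. The following is a minimal sketch, not part of any vendor API: the function name `parse_modality` and the `MODALITIES` mapping are hypothetical helpers introduced only to illustrate how the notation reads.

```python
from __future__ import annotations

# Letter codes taken from the legend above.
MODALITIES = {"T": "Text", "I": "Image", "A": "Audio", "V": "Video", "E": "Embedding"}


def parse_modality(spec: str) -> tuple[list[str], list[str]]:
    """Split a code such as 'TI->T' into (input modalities, output modalities)."""
    inputs, arrow, outputs = spec.partition("->")
    if not arrow:
        raise ValueError(f"missing '->' in modality spec: {spec!r}")

    def expand(codes: str) -> list[str]:
        # Map each letter to its full name, rejecting letters not in the legend.
        try:
            return [MODALITIES[code] for code in codes.strip()]
        except KeyError as exc:
            raise ValueError(f"unknown modality code in {spec!r}") from exc

    return expand(inputs), expand(outputs)


# GPT-4o row: text, image, audio, and video in; text and audio out.
print(parse_modality("TIAV->TA"))
```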