01. Overview
02. Training Efficiency and Performance
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

# Load the DocChat tokenizer and model; fp16 halves the memory footprint and
# device_map="auto" places the weights on the available GPU(s).
model_id = "cerebras/Llama3-DocChat-1.0-8B"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16, device_map="auto")
system = "This is a chat between a user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions based on the context. The assistant should also indicate when the answer cannot be found in the context."
instruction = "Please give a full and complete answer for the question."
document = """
# Cerebras Wafer-Scale Cluster
Exa-scale performance, single device simplicity
## AI Supercomputers
Condor Galaxy (CG), the supercomputer built by G42 and Cerebras, is the simplest and fastest way to build AI models in the cloud. With over 16 ExaFLOPs of AI compute, Condor Galaxy trains the most demanding models in hours rather than days. The terabyte scale MemoryX system natively accommodates 100 billion+ parameter models, making large scale training simple and efficient.
| Cluster | ExaFLOPs | Systems | Memory |
| -------- | -------- | -------- | ------ |
| CG1 | 4 | 64 CS-2s | 82 TB |
| CG2 | 4 | 64 CS-2s | 82 TB |
| CG3 | 8 | 64 CS-3s | 108 TB |
"""
question = "How many total CS systems does Condor Galaxy 1, 2, and 3 have combined, and how many flops does this correspond to?"

# Wrap the retrieved document in <context> tags (the grounding format DocChat
# expects), then append the instruction and the question.
user_turn = f"""<context>
{document}
</context>
{instruction} {question}"""
# Assemble the conversation and apply the Llama 3 chat template.
messages = [
    {"role": "system", "content": system},
    {"role": "user", "content": user_turn}
]
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

# Llama 3 marks the end of an assistant turn with <|eot_id|>, so stop
# generation on either that token or the regular EOS token.
terminators = [
    tokenizer.eos_token_id,
    tokenizer.convert_tokens_to_ids("<|eot_id|>")
]
outputs = model.generate(
    input_ids,
    max_new_tokens=256,
    eos_token_id=terminators,
)
# Strip the prompt tokens and decode only the newly generated answer.
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))
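The prompt-assembly step above can be factored into a small pure function, which makes the context-grounded format easy to reuse and unit-test without loading the model. This is a minimal sketch; the helper name `build_user_turn` is my own and not part of the DocChat model card:

```python
# A helper (hypothetical naming) reproducing the prompt convention shown
# above: the retrieved document is wrapped in <context> tags, and the
# instruction is prepended to the question.
def build_user_turn(
    document: str,
    question: str,
    instruction: str = "Please give a full and complete answer for the question.",
) -> str:
    """Assemble a single user turn in DocChat's context-grounded format."""
    return f"<context>\n{document}\n</context>\n{instruction} {question}"

turn = build_user_turn("CG1 has 64 CS-2 systems.", "How many systems does CG1 have?")
print(turn)
```

Keeping this logic separate from the generation code also makes it straightforward to swap in documents retrieved at runtime (e.g. from a vector store) while keeping the prompt format fixed.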
03. Open-Source Commitment
04. Benchmark Comparison
05. Challenges and Future Outlook
References:
https://cerebras.ai/blog/train-a-gpt-4-level-conversational-qa-in-a-few-hours
https://huggingface.co/cerebras/Llama3-DocChat-1.0-8B