AI知识库

53AI知识库

学习大模型的前沿技术与行业应用场景


马斯克的Grok-1开源,3140亿参数目前最大开源模型,最佳实践教程来啦!
发布日期:2024-04-20 17:05:36 浏览次数: 1838 来源:魔搭ModelScope社区


01

导  读


近几天开源社区最大的热点,莫过于埃隆马斯克信守承诺的最大开源模型Grok-1。Grok-1 是一款 314B 大型专家混合 (Mixture of Expert,MoE) Transformer,作为基础模型,基于大量文本数据进行训练,没有针对任何具体任务进行微调,使用 JAX 库和 Rust 语言组成的自定义训练堆栈从头开始训练。


官方提供的详细模型参数如下:

  • 参数量:3140亿

  • 架构:8个混合专家模型(MoE),每个Token使用2个专家

  • 层数:64层

  • 多头注意力Q使用48个注意力头,K/V 使用8个注意力头

  • 嵌入尺寸:6,144

  • 词表大小:131,072个Tokens,采用SentencePiece 分词器

  • 使用RoPE位置编码

  • 支持激活分片和8位量化

  • 最大序列长度(上下文):8,192个Tokens


模型性能方面,Grok-1官方发布的Benchmark超过GPT-3.5 和 LLaMa2 70B(MMLU 为73%,GMSK 为62.9%,HumanEval 为63.2%)



话题中的模型效果如何,老规矩,魔搭社区向大家提供推理实践教程~


02

模型链接和下载


Grok-1模型在ModelScope社区可下载:

模型链接:https://www.modelscope.cn/models/AI-ModelScope/grok-1/summary


社区支持直接下载模型的repo:

from modelscope import snapshot_downloadmodel_dir = snapshot_download("AI-ModelScope/grok-1")


03

Grok推理和评测


环境准备


Grok-1需要8卡A100运行。原始模型是Jax框架编写,因此使用魔搭默认镜像无法工作,为此需要使用Jax的镜像:


docker pull ghcr.io/nvidia/jax:jax# 如果国内用户无法拉取上面的镜像可以使用我们转储的镜像# docker pull registry.cn-wulanchabu.aliyuncs.com/ed/jax:1.0


进入docker实例之后对grok代码库进行克隆:


git clone https://github.com/xai-org/grok-1.gitcd grok-1pip install -r requirements.txt# 修正Jax版本pip install --upgrade "jax[cuda12_local]==0.4.23" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html


运行推理可以直接使用官方提供的样例:

# cd grok-1# 修改样例中的CKPT_PATH = "ModelScope下载的模型路径"python run.py


显存占用



推理结果

temperature=0.01,官方示例:


Query:The answer to life the universe and everything is of courseAnswer:The answer to life the universe and everything is of course 42.
But what is the answer to the question of how to get more people to use public transport?
Well, it’s not 42.
It’s 10.
That’s the number of things that need to be done to get more people to use public transport according to a new report from the Institute for Public Policy Research.
The report, entitled ‘Why we need to re-imagine the bus’, says that buses are the most used


temperature=0.3,自定义样例:

Query:How to make a hamburger?Answer:The hamburger is one of the most popular dishes in the world. It is a sandwich made with a patty of ground meat, usually beef, and served on a bun. The hamburger can be topped with a variety of condiments, such as ketchup, mustard, mayonnaise, and pickles.
There are many different ways to make a hamburger. The most important thing is to use fresh, high-quality ingredients. The meat should be ground fresh and the buns should be soft and fresh. The toppings should be fresh and flavorful.
Here is a basic recipe for a hamburger:

1 pound ground beef
1 egg
1 tablespoon Worcestershire sauce
1 teaspoon salt
1/2 teaspoon black pepper
1/4 cup chopped onion
1/4 cup chopped green pepper
1/4 cup ketchup
1/4 cup mustard
1/4 cup mayonnaise
1/4 cup pickles
4 hamburger buns

1. Preheat the grill to medium-high heat.
2. In a large bowl, mix together the ground beef, egg, Worcestershire sauce, salt, black pepper, onion, green pepper, ketchup, mustard, mayonnaise, and pickles.
3. Divide the mixture into 4 equal portions and shape into patties.
4. Grill the patties for 4-5 minutes per side, or until they are cooked through.
5. Serve on hamburger buns with your favorite toppings.
What is the secret to juicy hamburgers?
There are many secrets to making juicy hamburgers, but the most important one is to use high-quality, fresh ingredients. The meat should be fresh and of good quality, and the buns should be soft and fresh. The toppings should also be fresh and flavorful.
Another important secret to making juicy hamburgers is to cook them properly. The patties should be cooked on a hot grill or in a hot skillet until they are browned and cooked through. Overcooking the burgers will make them dry and tough.
Finally, the burgers should be served immediately after they are cooked. If they are left to sit, they will become dry and tough.
What is the best meat to make hamburgers?
The best meat to make hamburgers is ground beef. It is important to use fresh, high-quality ground beef for the best results. The beef should be ground fresh and should be of good quality.
What is the best way to cook hamburgers?
The best way to cook hamburgers is on a hot grill or in a hot skillet. The burgers should be cooked until they are browned and cooked through. Overcooking the burgers will make them dry and tough.
How do you keep hamburgers moist on the grill?
To keep hamburgers moist on the grill, you can brush them with a little bit of oil or butter before cooking. You can also add a little bit of water to the pan or grill to help keep the burgers moist.
What is the best way to cook a hamburger?
There are many ways to cook a hamburger, but the best way is to grill it. Grilling gives the burger a nice, smoky flavor and it also helps to keep the burger juicy.
What is the best way to cook a hamburger on the stove?
The best way to cook a hamburger on the stove is to use a cast iron skillet. Cast iron skillets are great for cooking burgers because they hold heat well and they help to create a nice, crispy crust on the burger.
What is the best way to cook a hamburger in the oven?
The best way to cook a hamburger in the oven is to use a baking sheet. Baking sheets help to create a nice, even crust on the burger and they also help to keep the burger from sticking to the pan.
What is the best way to cook a hamburger on the grill?
The best way to cook a hamburger on the grill is to use a grill pan. Grill pans help to create a nice, even crust on the burger and they also help to keep the burger from sticking to the grill.
What is the best way to cook a hamburger on the stovetop?
The best way to cook a hamburger on the stovetop is to use a skillet. Skillets help to create a nice, even crust on the burger and they also help to keep the burger from sticking to the pan.
What is the best way to cook a hamburger in the microwave?
The best way to cook a hamburger in the microwave is to use a microwave-safe plate. Microwave-safe plates help to create a nice, even crust on the burger and they also help to keep the burger from sticking to the plate.
What is the best way to cook a hamburger in the air fryer?
The best way to cook a hamburger in the air fryer is to use a baking sheet. Baking sheets help to create a nice, even crust on the burger and they also help to keep the burger from sticking to the air fryer.
How do you make hamburgers taste better?
There are a few things you can do to make


评测

魔搭社区已支持在上述镜像中进行评测,首先安装评测依赖:

由于Jax镜像和PyTorch GPU版本不兼容,因此需要额外安装CPU版本的PyTorch:


pip3 install torch --index-url https://download.pytorch.org/whl/cpu


安装eval-scope (llmuses)评测工具

pip3 install llmuses

# GitHub: https://github.com/modelscope/eval-scope


安装其他依赖:


wget https://github.com/modelscope/eval-scope/blob/dev/custom_infer/requirements/requirements.txt
pip3 install -r requirements.txt

# 如果运行下面的脚本报错No module named 'transformer_engine_extensions',则卸载如下wheel:pip uninstall transformer-engine


运行如下脚本即可进行评测:

# Copyright (c) Alibaba, Inc. and its affiliates.
import osimport timefrom typing import Listimport loggingfrom model import LanguageModelConfig, TransformerConfig, QuantizedWeight8bit as QW8Bitfrom runners import InferenceRunner, ModelRunner, sample_from_modelfrom llmuses.models.custom import CustomModelfrom llmuses.run import run_taskfrom llmuses.constants import DEFAULT_ROOT_CACHE_DIRfrom llmuses.utils import yaml_to_dictfrom llmuses.summarizer import Summarizerfrom llmuses.utils.logger import get_loggerimport time
logger = get_logger()
CKPT_PATH = "/path/to/ckpt_path"

class GrokModel(CustomModel):
def __init__(self, config: dict, **kwargs):
self.grok_1_model = LanguageModelConfig(vocab_size=128 * 1024,pad_token=0,eos_token=2,sequence_len=8192,embedding_init_scale=1.0,output_multiplier_scale=0.5773502691896257,embedding_multiplier_scale=78.38367176906169,model=TransformerConfig(emb_size=48 * 128,widening_factor=8,key_size=128,num_q_heads=48,num_kv_heads=8,num_layers=64,attn_output_multiplier=0.08838834764831845,shard_activations=True,# MoE.num_experts=8,num_selected_experts=2,# Activation sharding.data_axis="data",model_axis="model",),)
self.inference_runner = InferenceRunner(pad_sizes=(1024,),runner=ModelRunner(model=self.grok_1_model,bs_per_device=0.125,checkpoint_path=CKPT_PATH,),name="local",load=CKPT_PATH,tokenizer_path="./tokenizer.model",local_mesh_config=(1, 8),between_hosts_config=(1, 1),)self.inference_runner.initialize()super(GrokModel, self).__init__(config=config, **kwargs)
def predict(self, prompt: str, **kwargs):tokens = self.inference_runner.tokenizer.encode(prompt)gen = self.inference_runner.run()ts = time.time()response = sample_from_model(gen, prompt, max_len=kwargs['infer_cfg']['max_new_tokens'] + len(tokens), temperature=kwargs['infer_cfg']['temperature'])ts = time.time() - tsresponse_tokens = self.inference_runner.tokenizer.encode(response)print('>>>[Query]' + prompt, flush=True)print('>>>[Answer]' + response, flush=True)print(f'>>>Time cost:{ts}, token num: {len(response_tokens)}, infer speed(token/s):{len(response_tokens)/ts}', flush=True)res_d: dict = {'choices': [{'index': 0,'message': {'content': response,'role': 'assistant'}}],'created': time.time(),'model': 'grok','object': 'chat.completion','usage': {'completion_tokens': 0,'prompt_tokens': 0,'total_tokens': 0}}
return res_d

if __name__ == '__main__':from llmuses.config import TaskConfig
grok_model = GrokModel(config={'model_id': 'grok'})task_config: TaskConfig = TaskConfig()print(task_config.list())# ['arc', 'gsm8k']task_config = task_config.load(custom_model=grok_model, tasks=['arc', 'gsm8k', 'bbh_mini', 'mmlu_mini', 'ceval_mini'])task_config.limit = 2# Note: limit the number of each subset to evaluate; default is None
run_task(task_cfg=task_config)
# Get the final report for your evaluation taskfinal_report: List[dict] = Summarizer.get_report_from_cfg(task_cfg=task_config)print(f'*** Final report ***\n {final_report}\n')


ARC-Challenge评测样例:


>>>Cities control the amount of pollution that is allowed to come from cars. How does this most likely help people?A. The air stays cleaner.B. Cars can travel at faster speeds.C. The skills of the drivers improve.D. It becomes safer to drive on the roads.Answer:======================Below is the answer=====================>>>A. The air stays cleaner.
The air stays cleaner.

The correct answer is A. The air stays cleaner.
The government has the power to control the amount of pollution that is allowed to come from cars. This is done by setting standards for emissions and fuel efficiency. By doing this, the government can help to reduce the amount of pollution that is released into the air. This can help to improve the quality of the air and make it safer for people to breathe.
## Explanation:
The government has the power to control the amount of pollution that is allowed to come from cars. This is done by setting standards for emissions and fuel efficiency. By doing this, the government can help to reduce the amount of pollution that is released into the air. This can help to improve the quality of the air and make it safer for people to breathe.
The government also has the power to control the amount of noise that is allowed to come from cars. This is done by setting standards for noise levels. By doing this, the government can help to reduce the amount of noise pollution that is released into the environment. This can help to improve the quality of life for people who live in areas where there is a lot of noise pollution.
The government also has the power to control the amount of traffic that is allowed on the roads. This is done by setting standards for speed limits and road closures. By doing this, the government can help to reduce the amount of traffic congestion that is caused by cars. This can help to improve the quality of life for people who live in areas where there is a lot of traffic congestion.
The government also has the power to control the amount of parking that is allowed in cities. This is done by setting standards for parking spaces and parking fees. By doing this, the government can help to reduce the amount of parking congestion that is caused by cars. This can help to improve the quality of life for people who live in areas where there is a lot of parking congestion.
The government also has the power to control the amount of land that is used for car parks. This is done by setting standards for the size of car parks and the number of car parks that are allowed in a city. By doing this, the government can help to reduce the amount of land that is used for car parks. This can help to improve the quality of life for people who live in areas where there is a lot of land that is used for car parks.
The government also has the power to control the amount of money that is spent on cars. This is done by setting standards for the price of cars and the amount of money that is allowed to be spent on cars. By doing this, the government can help to reduce the amount of money that is spent on cars. This can help to improve the quality of life for people who live

评测速度:

>>>Time cost:299.08s, token num: 577, infer speed(token/s):1.93


5-shot gsm8k评测样例:


>>>Question: Angelo and Melanie want to plan how many hours over the next week they should study together for their test next week. They have 2 chapters of their textbook to study and 4 worksheets to memorize. They figure out that they should dedicate 3 hours to each chapter of their textbook and 1.5 hours for each worksheet. If they plan to study no more than 4 hours each day, how many days should they plan to study total over the next week if they take a 10-minute break every hour, include 3 10-minute snack breaks each day, and 30 minutes for lunch each day?Let's think step by stepAngelo and Melanie think they should dedicate 3 hours to each of the 2 chapters, 3 hours x 2 chapters = 6 hours total.For the worksheets they plan to dedicate 1.5 hours for each worksheet, 1.5 hours x 4 worksheets = 6 hours total.Angelo and Melanie need to start with planning 12 hours to study, at 4 hours a day, 12 / 4 = 3 days.However, they need to include time for breaks and lunch. Every hour they want to include a 10-minute break, so 12 total hours x 10 minutes = 120 extra minutes for breaks.They also want to include 3 10-minute snack breaks, 3 x 10 minutes = 30 minutes.And they want to include 30 minutes for lunch each day, so 120 minutes for breaks + 30 minutes for snack breaks + 30 minutes for lunch = 180 minutes, or 180 / 60 minutes per hour = 3 extra hours.So Angelo and Melanie want to plan 12 hours to study + 3 hours of breaks = 15 hours total.They want to study no more than 4 hours each day, 15 hours / 4 hours each day = 3.75They will need to plan to study 4 days to allow for all the time they need.The answer is 4
Question: Mark's basketball team scores 25 2 pointers, 8 3 pointers and 10 free throws.Their opponents score double the 2 pointers but half the 3 pointers and free throws.What's the total number of points scored by both teams added together?Let's think step by stepMark's team scores 25 2 pointers, meaning they scored 25*2= 50 points in 2 pointers.His team also scores 6 3 pointers, meaning they scored 8*3= 24 points in 3 pointersThey scored 10 free throws, and free throws count as one point so they scored 10*1=10 points in free throws.All together his team scored 50+24+10= 84 pointsMark's opponents scored double his team's number of 2 pointers, meaning they scored 50*2=100 points in 2 pointers.His opponents scored half his team's number of 3 pointers, meaning they scored 24/2= 12 points in 3 pointers.They also scored half Mark's team's points in free throws, meaning they scored 10/2=5 points in free throws.All together Mark's opponents scored 100+12+5=117 pointsThe total score for the game is both team's scores added together, so it is 84+117=201 pointsThe answer is 201
Question: Bella has two times as many marbles as frisbees. She also has 20 more frisbees than deck cards. If she buys 2/5 times more of each item, what would be the total number of the items she will have if she currently has 60 marbles?Let's think step by stepWhen Bella buys 2/5 times more marbles, she'll have increased the number of marbles by 2/5*60 = 24The total number of marbles she'll have is 60+24 = 84If Bella currently has 60 marbles, and she has two times as many marbles as frisbees, she has 60/2 = 30 frisbees.If Bella buys 2/5 times more frisbees, she'll have 2/5*30 = 12 more frisbees.The total number of frisbees she'll have will increase to 30+12 = 42Bella also has 20 more frisbees than deck cards, meaning she has 30-20 = 10 deck cardsIf she buys 2/5 times more deck cards, she'll have 2/5*10 = 4 more deck cards.The total number of deck cards she'll have is 10+4 = 14Together, Bella will have a total of 14+42+84 = 140 itemsThe answer is 140
Question: A group of 4 fruit baskets contains 9 apples, 15 oranges, and 14 bananas in the first three baskets and 2 less of each fruit in the fourth basket. How many fruits are there?Let's think step by stepFor the first three baskets, the number of apples and oranges in one basket is 9+15=24In total, together with bananas, the number of fruits in one basket is 24+14=38 for the first three baskets.Since there are three baskets each having 38 fruits, there are 3*38=114 fruits in the first three baskets.The number of apples in the fourth basket is 9-2=7There are also 15-2=13 oranges in the fourth basketThe combined number of oranges and apples in the fourth basket is 13+7=20The fourth basket also contains 14-2=12 bananas.In total, the fourth basket has 20+12=32 fruits.The four baskets together have 32+114=146 fruits.The answer is 146
Question: Jared is trying to increase his typing speed. He starts with 47 words per minute (WPM). After some lessons the next time he tests his typing speed it has increased to 52 WPM. If he continues to increase his typing speed once more by 5 words, what will be the average of the three measurements?Let's think step by step
Answer:===================Below is the answer====================>>>2 = 12 oranges in the fourth basketThe combined number of oranges and apples in the fourth basket is 13+7=20The fourth basket also contains 14-2=12 bananas.In total, the fourth basket has 20+12=32 fruits.The four baskets together have 32+114=146 fruits. # Read Less
## About this GRE PrepPal
#### The best way to practice
Understanding why you got a question right or wrong is the key to effective learning. With detailed explanations for every question, you'll always know where you stand.
#### See every wrong answer
When you get a question wrong, we'll immediately show you the content you've forgotten and offer clear, step-by-step guidance to help you learn.
#### Make the most of your study time
On average, it takes 150 hours and 3,000 practice questions to achieve a 90th percentile score on the GRE. That's why we built our GRE prep course to be as efficient as possible.
#### Track your progress every step of the way
At the end of each practice session, we'll identify your weaknesses and tell you exactly what you need to study to improve. It's a smarter, more focused way to master the GRE.
#### Learn from the very best
Work with an elite test prep tutor from one of America's top universities. With PrepScholar, you get the combined brainpower of the entire team.
#### Get into your dream school
PrepScholar has helped thousands of students improve their scores and get into their dream schools. We guarantee you'll get a higher score and get into the school of your choice.
## Frequently Asked Questions
#### How does PrepScholar's GRE prep course work?
Our GRE prep course is designed to be efficient and comprehensive. We've broken down the GRE into its key components and developed the best strategies and most effective learning methods for each area, saving you hours of wasted time.
#### How is PrepScholar's GRE prep course different from other courses?
PrepScholar is the world's most advanced, efficient GRE prep course. Rather than wasting your time on repetitive problems and out-of-date strategies, PrepScholar constantly analyzes your progress and creates a study plan customized to your particular strengths and weaknesses.
#### How will PrepScholar's GRE prep course help me improve my score?
PrepScholar is the only GRE prep system that adapts to your strengths and weaknesses, helping you study 10 times more efficiently than with other prep courses. Rather than wasting your time on repetitive problems and out-of-date strategies, PrepScholar constantly analyzes your progress and creates a study plan customized to your particular strengths and weaknesses.
#### How much time should I plan on studying for the GRE?
The amount of time it takes to prepare for the GRE depends on the score you need to achieve your goals. To improve your GRE score, you'll need to put in hard work and targeted preparation. On average, it takes 150 hours and 3,000 practice questions to achieve a 90th percentile score on the GRE.
#### How is my PrepScholar course customized for me?
Every PrepScholar GRE course is customized to your particular strengths and weaknesses. Your course will constantly analyze your progress and create a study plan tailored to your particular area of need.
#### How do I access my GRE prep course?
You can access your GRE prep course on your computer, tablet, or smartphone.
#### How does the higher score guarantee work?
PrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free.
#### What's included with my GRE prep course?
Every PrepScholar GRE course comes with everything you need to raise your score, including 3,000+ practice questions, 100+ lessons, 150+ hours of learning, and 10+ GRE strategy guides.
#### How do I pay for my GRE prep course?
You can pay for your GRE prep course with a credit card or with PayPal. If you don't love your course, we'll refund your tuition no questions asked.
#### How do I know PrepScholar is right for me?
If you're serious about getting a great GRE score, PrepScholar is the world's most efficient prep course. We guarantee you'll get a higher score and get into the school of your choice. # Read Less
## Get a higher score guaranteed
PrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free. # Read Less
## Get a higher score guaranteed
PrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free. # Read Less
## Get a higher score guaranteed
PrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free. # Read Less
## Get a higher score guaranteed
PrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free. # Read Less
## Get a higher score guaranteed
PrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free. # Read Less
## Get a higher score guaranteed
PrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free. # Read Less
## Get a higher score guaranteed
PrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free. # Read Less
## Get a higher score guaranteed
PrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free. # Read Less
## Get a higher score guaranteed
PrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free. # Read Less
## Get a higher score guaranteed
PrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free. # Read Less
## Get a higher score guaranteed
PrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free. # Read Less
## Get a higher score guaranteed
PrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free. # Read Less
## Get a higher score guaranteed
PrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free. # Read Less
## Get a higher score guaranteed
PrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free. # Read Less
## Get a higher score guaranteed
PrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free. # Read Less
## Get a higher score guaranteed
PrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free. # Read Less
## Get a higher score guaranteed
PrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free. # Read Less
## Get a higher score guaranteed
PrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free. # Read Less
## Get a higher score guaranteed
PrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free. # Read Less
## Get a higher score guaranteed
PrepScholar guarantees you'll get a higher score and get into the school of your choice

评测速度:

>>>Time cost:488.51s, token num: 1978, infer speed(token/s):4.05





53AI,企业落地应用大模型首选服务商

产品:大模型应用平台+智能体定制开发+落地咨询服务

承诺:先做场景POC验证,看到效果再签署服务协议。零风险落地应用大模型,已交付160+中大型企业

联系我们

售前咨询
186 6662 7370
预约演示
185 8882 0121

微信扫码

与创始人交个朋友

回到顶部

 
扫码咨询