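Although AIOS is designed as an LLM agent operating system (OS), with the LLM acting as the brain of the OS (an "OS with a soul") and AGI as the goal, from a practical deployment standpoint, setting the OS framing aside, it works quite well as a multi-agent framework.
Figure 1: A motivating example showing how an agent (for example, a travel agent) needs both LLM-level and OS-level resources and functions to complete a task.
Agent Scheduler
Its main role is to manage agent requests efficiently so as to make the best use of the large language model (LLM). The agent scheduler supports different scheduling strategies, such as first-in-first-out (FIFO) and Round Robin, to decide the order in which agent tasks run.
Figure 3: Diagram of the agent scheduler.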
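To make the difference between the two strategies concrete, here is a minimal sketch. It is not the AIOS implementation: `Request`, `fifo`, and `round_robin` are hypothetical names, and each request's work is modeled as a number of equal time slices.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical stand-in for an agent request; `steps` is the remaining work in time slices."""
    agent: str
    steps: int


def fifo(requests):
    """Run each request to completion in arrival order and return the execution trace."""
    trace = []
    for req in requests:
        trace.extend([req.agent] * req.steps)
    return trace


def round_robin(requests, quantum=1):
    """Give each request at most `quantum` slices per turn, then move it to the back of the queue."""
    queue = deque(Request(r.agent, r.steps) for r in requests)  # copy so the input is not mutated
    trace = []
    while queue:
        req = queue.popleft()
        run = min(quantum, req.steps)
        trace.extend([req.agent] * run)
        req.steps -= run
        if req.steps > 0:
            queue.append(req)
    return trace


requests = [Request("A", 3), Request("B", 1), Request("C", 2)]
fifo_trace = fifo(requests)        # A runs to completion before B starts
rr_trace = round_robin(requests)   # A, B, and C take turns
```

Under FIFO the short request B waits for all of A; under Round Robin it finishes after its first turn, at the cost of A being interrupted.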
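Context Manager
The key module that handles context information and state during LLM generation. Its main functions are context snapshot and restoration, plus context window management.
Context snapshot and restoration let the system save the state of the current generation process when the scheduler suspends an agent request, even if the LLM has not finished generating the response to that request. Once resources are available again, the system restores the saved state and continues generating, so a temporary suspension loses no progress and resources are used more efficiently.
Context window management handles long contexts that may exceed what the LLM can process. Using basic text summarization and expansion techniques, the context manager manages the context window, improving the LLM's ability to process and understand large amounts of context while preserving the integrity and relevance of the information.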
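One way to picture summarization-based window management is the sketch below. It is only an illustration under stated assumptions: tokens are approximated by whitespace-separated words, `naive_summary` is a placeholder that truncates instead of calling a real summarizer, and all function names are hypothetical.

```python
def count_tokens(text):
    """Assumption: approximate the token count by the number of whitespace-separated words."""
    return len(text.split())


def naive_summary(text, max_words=8):
    """Placeholder summarizer: keep the first few words. A real system would call an LLM."""
    words = text.split()
    return " ".join(words[:max_words]) + (" ..." if len(words) > max_words else "")


def fit_context(messages, budget, summarize=naive_summary):
    """Keep the newest messages verbatim and replace everything older with one summary line.

    Note: the summary line itself is not counted against `budget` in this sketch.
    """
    kept, used = [], 0
    for msg in reversed(messages):
        cost = count_tokens(msg)
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    older = messages[: len(messages) - len(kept)]
    kept.reverse()
    if older:
        kept.insert(0, "[summary] " + summarize(" ".join(older)))
    return kept


history = [
    "one two three four five six seven eight nine ten",
    "alpha beta",
    "gamma delta epsilon",
]
window = fit_context(history, budget=5)
```

Here the two newest messages fit the budget and stay verbatim, while the oldest one is collapsed into a single summary line.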
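Figure 4: Context snapshot and restoration, using beam search (beam width = 1) as an example search algorithm to illustrate the generative decoding process.
Tool Manager
Figure 5: Tools managed in AIOS; the last column shows the input and output format each tool requires.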
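The snapshot-and-restore idea can be sketched with a toy decoder. This is an assumption-laden illustration, not the AIOS code: `toy_next_token` is a hypothetical deterministic "model" that simply echoes the prompt's words, and the saved state is just the list of tokens generated so far, whereas a real system would have to save the actual search state of the decoder.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class DecodeState:
    """Hypothetical decoding state: the prompt plus the tokens generated so far."""
    prompt: str
    tokens: list = field(default_factory=list)
    done: bool = False


def toy_next_token(prompt, tokens):
    """Hypothetical deterministic 'model': echoes the prompt word by word, then stops."""
    words = prompt.split()
    return words[len(tokens)] if len(tokens) < len(words) else None


def generate(state, max_steps):
    """Greedy decoding (beam width 1) for at most `max_steps` tokens."""
    for _ in range(max_steps):
        token = toy_next_token(state.prompt, state.tokens)
        if token is None:
            state.done = True
            break
        state.tokens.append(token)
    return state


def snapshot(state):
    """Save an independent copy of the decoding state."""
    return copy.deepcopy(state)


def restore(saved):
    """Return a fresh working copy so the snapshot itself stays untouched."""
    return copy.deepcopy(saved)


state = generate(DecodeState("plan a trip to Paris"), max_steps=2)
saved = snapshot(state)                            # the scheduler suspends the request here
resumed = generate(restore(saved), max_steps=10)   # later: resume from the saved state
```

The resumed run continues from the third token instead of starting over, which is the property the context manager is meant to provide.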
https://github.com/agiresearch/AIOS/tree/main/src
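agents: manages the different agent applications. It currently includes a math agent (MathAgent), a story-writing agent (NarrativeAgent), a restaurant and hotel recommendation agent (RecAgent), and a travel-planning agent (TravelAgent).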
{
  "name": "MathAgent",
  "description": "You are an expert who is good at solving mathematical problems, given a mathematical problem, you need to break down this problem into smaller sub-problems. Solve a part of the problem step by step with explanations and finally build up to the final solution."
},
{
  "name": "NarrativeAgent",
  "description": "You are an expert who is good at writing novels, given a theme or background, you need to write a short story with a well-developed plot and characters, develop different sections of the story, such as introduction, rising action, climax, and conclusion."
},
{
  "name": "RecAgent",
  "description": "You are an expert who is good at recommending restraunts or hotels for users, given a request, you need to first determine the right recommendation direction and then provide the recommendation lists."
},
{
  "name": "TravelAgent",
  "description": [
    "You are a proficient planner. ",
    "Based on the provided information and query, please give me a detailed plan, including specifics such as flight numbers (e.g., F0123456 ), restaurant names, and accommodation names. ",
    "Note that all the information in your plan should be derived from the provided data. ",
    "You must adhere to the format given in the example. Additionally, all details should align with commonsense. ",
    "The symbol '-' indicates that information is unnecessary. ",
    "For example, in the provided sample, you do not need to plan after returning to the departure city. ",
    "When you travel to two cities in one day , you should note it in the 'Current City ' section as in the example ( i . e . , from A to B ) ."
  ],
  "flow": [
    "Step 1:::Process:::Based on the input query, determine the duration, departure city, and destination.:::next::step 2",
    "Step 2:::Decision:::Is the destination a state or a city?:::city::step 4:::state::step 3",
    "Step 3:::Process:::Select a city as the new destination city from the destination state:::next::step 4",
    "Step 4:::Process:::Estimate the cost of taking a taxi from departure city to the destination city.:::next::Step 5",
    "Step 5:::Process:::Estimate the cost of self-driving from departure city to the destination city.:::next::Step 6",
    "Step 6:::Process:::Estimate the cost of taking a flight on the start date from departure city to the destination city.:::next::Step 7",
    "Step 7:::Decision:::Is there a reasonable transportation based on the results of taxi, self-driving and flight cost?:::yes::Step 8:::no::Step 3",
    "Step 8:::Process:::Record the most reasonable transportation method from departure city to the first destination city. Move to the first destination city.:::next::Step 9",
    "Step 9:::Process:::Record an unvisited restaurant for today's breakfast at current city:::next::Step 10",
    "Step 10:::Process:::Record an unvisited restaurant for today's lunch at current city:::next::Step 11",
    "Step 11:::Process:::Record an unvisited restaurant for today's dinner at current city:::next::Step 12",
    "Step 12:::Process:::Record an unvisited attraction for today's plan at current city:::next::Step 13",
    "Step 13:::Decision:::Is today the last day of the trip?:::yes::Step 14:::no::Step 19",
    "Step 14:::Process:::Estimate the cost of taking a taxi from current city to the departure city.:::next::Step 15",
    "Step 15:::Process:::Estimate the cost of self-driving from current city to the departure city.:::next::Step 16",
    "Step 16:::Process:::Estimate the cost of taking a flight on the last date from current city to the departure city.:::next::Step 17",
    "Step 17:::Process:::Record the most reasonable transportation method from current city to the departure city.:::next::Step 18",
    "Step 18:::Terminal:::Output all the plans in json.:::",
    "Step 19:::Process:::Find a reasonable accommodation at current city.:::next::Step 20",
    "Step 20:::Decision:::Is there a reasonable accommodation at current city?:::yes::Step 21:::no::Step 3",
    "Step 21:::Process:::Record the accommodation at current city. Start planning the next day. Now, what is the date today?:::next::Step 22",
    "Step 22:::Decision:::Is today the third day of the trip?:::no::Step 23:::yes::Step 24",
    "Step 23:::Decision:::Is today the fifth day of the trip?:::no::Step 9:::yes::Step 24",
    "Step 24:::Process:::Select an unvisited city as the new destination city from the destination state.:::next::step 4"
  ],
  "tool_info": ["Avaiable tools: ", "google_search"]
}
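Each entry in "flow" packs a step name, a step type, an instruction, and zero or more "label::target" branches, all separated by ":::". The sketch below shows one way such a string could be split into a structured record. How AIOS itself parses these strings is not shown in the excerpt, so `parse_step` and its output layout are assumptions made for illustration.

```python
def parse_step(raw):
    """Split one flow entry into its name, type, instruction, and outgoing transitions."""
    name, kind, instruction, *branches = raw.split(":::")
    transitions = {}
    for branch in branches:
        if not branch:  # Terminal steps end with an empty branch field
            continue
        label, target = branch.split("::")
        # The config mixes "step 4" and "Step 5", so targets are normalized to lower case.
        transitions[label] = target.strip().lower()
    return {
        "name": name.strip().lower(),
        "type": kind,
        "instruction": instruction,
        "next": transitions,
    }


process = parse_step(
    "Step 1:::Process:::Based on the input query, determine the duration, "
    "departure city, and destination.:::next::step 2"
)
decision = parse_step(
    "Step 2:::Decision:::Is the destination a state or a city?:::city::step 4:::state::step 3"
)
terminal = parse_step("Step 18:::Terminal:::Output all the plans in json.:::")
```

A Decision step yields one transition per possible answer, a Process step yields a single "next" transition, and a Terminal step yields none.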
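An agent's main capabilities are: tool invocation, tool-argument parsing, prompt construction, LLM execution (the agent_process is executed by the scheduler module), answer summarization, and flow (workflow) execution over the specified steps.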
import time

from src.agents.agent_process import (
    AgentProcess,
)

# Excerpt: __init__, listen, and CustomizedThread (apparently a thread whose
# join() returns the target's result) are defined elsewhere in the repo.


class BaseAgent:
    def get_response(self, prompt, temperature=0.0):
        # Wrap the prompt in an AgentProcess and put it on the scheduler's queue.
        agent_process = AgentProcess(self.agent_name, prompt, temperature)
        agent_process.set_created_time(time.time())
        self.agent_process_queue.put(agent_process)

        # Wait in a helper thread until the scheduler has produced a response.
        thread = CustomizedThread(target=self.listen, args=(agent_process,))
        thread.start()
        result = thread.join()

        waiting_time = agent_process.get_start_time() - agent_process.get_created_time()
        turnaround_time = agent_process.get_end_time() - agent_process.get_created_time()
        result = result.replace("\n", "")
        return result, waiting_time, turnaround_time

    def check_tool_use(self, prompt, tool_info, temperature=0.):
        prompt = f'You are allowed to use the following tools: \n\n```{tool_info}```\n\n' \
                 f'Do you think the response ```{prompt}``` calls any tool?\n' \
                 f'Only answer "Yes" or "No".'
        while True:
            # get_response returns (text, waiting_time, turnaround_time); only the text is needed.
            response, _, _ = self.get_response(prompt, temperature)
            temperature += .5
            print(f'Tool use check: {response}')
            if 'yes' in response.lower():
                return True
            if 'no' in response.lower():
                return False
            # No usable answer: retry at a higher temperature, up to a limit.
            print(f'Temperature: {temperature}')
            if temperature > 2:
                break
        print('No valid format output when calling "Tool use check".')
        return None

    def get_prompt(self, tool_info, flow_ptr, task_description, cur_progress):
        # Build the prompt for the current flow step from the progress so far.
        progress_str = '\n'.join(cur_progress)
        prompt = f'{tool_info}\n\nCurrent Progress:\n{progress_str}\n\nTask description: {task_description}\n\n' \
                 f'Question: {flow_ptr.get_instruction()}\n\nOnly answer the current instruction and do not be verbose.'
        return prompt

    def get_tool_arg(self, prompt, tool_info, selected_tool):
        prompt = f'{tool_info}\n\n' \
                 f'You attempt to use the tool ```{selected_tool}```. ' \
                 f'What is the input argument to call tool for this step: ```{prompt}```? ' \
                 f'Respond "None" if no arguments are needed for this tool. Separate by comma if there are multiple arguments. Do not be verbose!'
        # Only the text part of the (text, waiting_time, turnaround_time) tuple is returned.
        response, _, _ = self.get_response(prompt)
        print(f'Parameters: {response}')
        return response

    def get_final_result(self, prompt):
        prompt = f"Given the interaction history: {prompt}, give the answer to the task input and don't be verbose!"
        final_result, waiting_time, turnaround_time = self.get_response(prompt)
        # str.replace returns a new string, so the result must be assigned back.
        final_result = final_result.replace("\n", "")
        return final_result, waiting_time, turnaround_time
llm: handles access to different base LLMs, such as gemma-2b-it, Llama-2-13b-chat, and Mixtral-8x7B.
{"model_type": "causal_lm","open_sourced": true,"model_name": "google/gemma-2b-it"}
scheduler: schedules the different agents and calls the LLM to carry out each request.
import logging
import time
from threading import Thread

from src.agents.agent_process import AgentProcess

# Assumption: the excerpt does not show where `logger` is defined; a standard
# module-level logger is used here.
logger = logging.getLogger(__name__)


class BaseScheduler:
    def __init__(self, llm):
        self.active = False  # start/stop the scheduler
        self.thread = Thread(target=self.run)
        self.llm = llm

    def run(self):
        # Subclasses implement the actual scheduling loop (e.g. FIFO, Round Robin).
        pass

    def start(self):
        """start the scheduler"""
        self.active = True
        self.thread.start()

    def stop(self):
        """stop the scheduler"""
        self.active = False
        self.thread.join()

    def execute_request(self, agent_process: AgentProcess):
        # Run one agent request on the LLM and record its timing and status.
        agent_process.set_status("Executing")
        logger.info(f"[{agent_process.agent_name}] is executing.")
        agent_process.set_start_time(time.time())
        response = self.llm.address_request(agent_process.prompt)
        agent_process.set_response(response)
        agent_process.set_end_time(time.time())
        agent_process.set_status("Done")
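BaseScheduler leaves run() empty, so a concrete strategy has to supply the loop. As a rough, self-contained illustration of what a FIFO loop could look like, the sketch below uses hypothetical stand-ins (`EchoLLM` for the LLM wrapper, `Job` for AgentProcess) rather than the repo's classes; only the method name address_request is taken from the excerpt above.

```python
import queue
import threading


class EchoLLM:
    """Hypothetical stand-in for the LLM wrapper; only address_request mirrors the excerpt."""

    def address_request(self, prompt):
        return f"echo: {prompt}"


class Job:
    """Hypothetical, trimmed-down stand-in for AgentProcess."""

    def __init__(self, agent_name, prompt):
        self.agent_name = agent_name
        self.prompt = prompt
        self.response = None
        self.done = threading.Event()


class FIFOScheduler:
    """Serve jobs strictly in the order they were put on the queue."""

    def __init__(self, llm, jobs):
        self.llm = llm
        self.jobs = jobs
        self.active = False
        self.thread = threading.Thread(target=self.run, daemon=True)

    def run(self):
        while self.active:
            try:
                # The timeout lets the loop notice stop() even when the queue is empty.
                job = self.jobs.get(timeout=0.05)
            except queue.Empty:
                continue
            job.response = self.llm.address_request(job.prompt)
            job.done.set()

    def start(self):
        self.active = True
        self.thread.start()

    def stop(self):
        self.active = False
        self.thread.join()


jobs = queue.Queue()
scheduler = FIFOScheduler(EchoLLM(), jobs)
scheduler.start()
submitted = [Job("MathAgent", "1+1?"), Job("TravelAgent", "plan a trip")]
for job in submitted:
    jobs.put(job)
for job in submitted:
    job.done.wait(timeout=2)  # block until the scheduler has answered this job
scheduler.stop()
responses = [job.response for job in submitted]
```

A Round Robin variant would differ only inside run(): it would process a job for a bounded slice and put it back on the queue if unfinished.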
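memory/storage: short-term and long-term memory are not covered in detail here, since the implementation is simple. Short-term memory is stored and retrieved in memory through a dict; long-term memory is stored and retrieved through a database or a file.
tool: eight tools are implemented, such as arXiv paper search and web search (Bing/Google). Each tool's implementation consists mainly of the API endpoint URL, parameter configuration, execution, and result parsing.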
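The dict-versus-file split can be sketched as follows. The class and method names are hypothetical and the on-disk format (a single JSON file) is an assumption made for illustration; the repo's actual memory and storage classes are not shown in this article.

```python
import json
import os
import tempfile


class ShortTermMemory:
    """In-process memory: a nested dict keyed by agent name, lost when the process exits."""

    def __init__(self):
        self._store = {}

    def save(self, agent, key, value):
        self._store.setdefault(agent, {})[key] = value

    def load(self, agent, key, default=None):
        return self._store.get(agent, {}).get(key, default)


class FileLongTermMemory:
    """Persistent memory: the same layout, written to a JSON file (format is an assumption)."""

    def __init__(self, path):
        self.path = path

    def _read(self):
        if not os.path.exists(self.path):
            return {}
        with open(self.path, encoding="utf-8") as f:
            return json.load(f)

    def save(self, agent, key, value):
        data = self._read()
        data.setdefault(agent, {})[key] = value
        with open(self.path, "w", encoding="utf-8") as f:
            json.dump(data, f)

    def load(self, agent, key, default=None):
        return self._read().get(agent, {}).get(key, default)


short_term = ShortTermMemory()
short_term.save("TravelAgent", "last_query", "3-day trip")

with tempfile.TemporaryDirectory() as tmp:
    long_term = FileLongTermMemory(os.path.join(tmp, "memory.json"))
    long_term.save("TravelAgent", "home_city", "Boston")
    # A new object pointing at the same file sees the saved value.
    reloaded = FileLongTermMemory(long_term.path).load("TravelAgent", "home_city")
```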
from typing import List

import requests

# Excerpt: BaseTool and get_from_env are imported elsewhere in the original file.


class BingSearch(BaseTool):
    """Bing Search Tool, refactored from langchain.
    In order to set this up, follow instructions at:
    https://levelup.gitconnected.com/api-tutorial-how-to-use-bing-web-search-api-in-python-4165d5592a7e
    """

    def __init__(self):
        super().__init__()
        self.url = "https://api.bing.microsoft.com/v7.0/search"  # temporarily
        self.bing_subscription_key = get_from_env("BING_SUBSCRIPTION_KEY")
        self.k: int = 10  # topk searched results
        # search_kwargs: dict

    def _bing_search_results(self, search_term: str, count: int) -> List[dict]:
        # Call the Bing Web Search API and return the list of web-page results.
        headers = {"Ocp-Apim-Subscription-Key": self.bing_subscription_key}
        params = {
            "q": search_term,
            "count": count,
            "textDecorations": True,
            "textFormat": "HTML",
            # **self.search_kwargs,
        }
        response = requests.get(
            self.url,  # the excerpt used self.bing_search_url, which is never defined
            headers=headers,
            params=params,  # type: ignore
        )
        response.raise_for_status()
        search_results = response.json()
        if "webPages" in search_results:
            return search_results["webPages"]["value"]
        return []

    def run(self, query: str) -> str:
        """Run query through BingSearch and parse result."""
        response = self._bing_search_results(query, count=self.k)
        result = self.parse_result(response)
        return result

    def parse_result(self, response):
        # Concatenate the snippet of every result into one string.
        snippets = []
        if len(response) == 0:
            return "No good Bing Search Result was found"
        for result in response:
            snippets.append(result["snippet"])
        return " ".join(snippets)
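The result-parsing half of the tool can be tried without a subscription key or network access. The sketch below re-implements just that logic as two standalone functions (hypothetical names) and feeds it a made-up payload shaped like the "webPages" / "value" structure the class above reads; the snippets are invented test data, not real search results.

```python
def extract_web_pages(payload):
    """Return the list under payload["webPages"]["value"], or an empty list if absent."""
    return payload.get("webPages", {}).get("value", [])


def join_snippets(results):
    """Concatenate result snippets, mirroring parse_result in the class above."""
    if not results:
        return "No good Bing Search Result was found"
    return " ".join(item["snippet"] for item in results)


# Made-up payload for illustration only.
fake_payload = {
    "webPages": {
        "value": [
            {"snippet": "First made-up snippet."},
            {"snippet": "Second made-up snippet."},
        ]
    }
}

summary = join_snippets(extract_web_pages(fake_payload))
empty = join_snippets(extract_web_pages({}))
```

Keeping the HTTP call and the parsing separate, as the tool class already does, is what makes this kind of offline check possible.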
From Agent to Multimodal Agent to Multimodal Multi-Agent Systems: Development and Case Studies (12,000 characters, 20+ references, 27 figures)
AIOS: LLM Agent Operating System https://arxiv.org/pdf/2403.18243.pdf