Google ADK Framework: An Open-Source Toolkit for Building Multi-Agent Systems

Published: 2025-04-27 18:03:25

Author: AI真香笔记

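AI is going through an agent revolution. As large language models (LLMs) grow more capable, building agents on top of them that can plan, decide, and act autonomously has become a major industry focus. Building a full-featured, reliable agent system is still hard, though: tool calling, multi-agent coordination, and evaluation and testing all have to be solved. Google's recently released open-source project, the Agent Development Kit (ADK), targets exactly these problems. It provides a Python toolkit that helps developers build, evaluate, and deploy complex agent systems quickly. This article walks through ADK's architecture and core features to help readers get started with it.

P.S. The GitHub repository is https://github.com/google/adk-python; it had 7.8k stars at the time of writing.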

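I. Framework Overview

ADK (Agent Development Kit) is an open-source framework developed by Google that provides a code-first Python library for agent development. Compared with other agent frameworks, ADK aims to make agent development feel more like conventional software development: it offers flexible, modular components and supports agent architectures ranging from simple tasks to complex workflows.

Although ADK is optimized for Google's Gemini models and the Google ecosystem, its design is model-agnostic and deployment-agnostic. It is compatible with other frameworks and supports multiple LLM backends, so developers can pick the model and deployment target that fit their needs. Since many readers cannot access Gemini models, the quick-start example later in this article uses Alibaba's Tongyi Qianwen (Qwen) model instead.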

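II. Core Architecture

ADK has a modular design built from the following key components:

1. Agents

Agents are the core of ADK. The main agent types are:

BaseAgent: the base class of all agents; it defines an agent's basic attributes and behavior.
LlmAgent: an agent backed by a large language model. It is the most commonly used type and supports tool calling, input/output control, and more.
SequentialAgent: an agent that runs several sub-agents in order.
ParallelAgent: an agent that runs several sub-agents in parallel.
LoopAgent: an agent that runs its sub-agents in a loop.
LangGraphAgent: an agent built with LangGraph; it takes a compiled LangGraph graph as input.

Agents can be composed hierarchically into complex multi-agent systems in which a coordinator agent manages several specialized sub-agents. Interaction and transfer of control between agents are decided automatically by the model, which lets the system adapt to a wide range of tasks.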
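To make the sequential, parallel, and loop patterns concrete, here is a minimal sketch in plain asyncio. It is not ADK's implementation: the `Step` type and the `sequential`, `parallel`, and `loop` helpers are hypothetical names that only mirror the idea behind SequentialAgent, ParallelAgent, and LoopAgent, with each "agent" reduced to an async function from text to text.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical stand-in for an agent: an async callable that maps text to text.
Step = Callable[[str], Awaitable[str]]


def sequential(*steps: Step) -> Step:
    """Run steps one after another, feeding each output into the next."""
    async def run(text: str) -> str:
        for step in steps:
            text = await step(text)
        return text
    return run


def parallel(*steps: Step) -> Step:
    """Run steps concurrently on the same input and join their outputs."""
    async def run(text: str) -> str:
        results = await asyncio.gather(*(step(text) for step in steps))
        return " | ".join(results)
    return run


def loop(step: Step, max_iterations: int) -> Step:
    """Apply one step repeatedly, exactly max_iterations times."""
    async def run(text: str) -> str:
        for _ in range(max_iterations):
            text = await step(text)
        return text
    return run


async def upper(text: str) -> str:
    return text.upper()


async def exclaim(text: str) -> str:
    return text + "!"


# Compose the three patterns: a parallel stage followed by a looped stage.
pipeline = sequential(parallel(upper, exclaim), loop(exclaim, max_iterations=2))
composed_output = asyncio.run(pipeline("hi"))
print(composed_output)  # HI | hi!!!
```

Because every combinator returns another `Step`, the pieces nest freely, which is the same property that lets a coordinator agent treat a whole sub-workflow as a single sub-agent.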

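2. Tools

ADK provides a rich tool system that lets agents carry out real-world tasks:

Function Tools: turn ordinary Python functions into tools an agent can call.
Built-in Tools: built-in Google Search and code execution tools (each agent can use only one built-in tool, and an agent that uses a built-in tool cannot be a sub-agent of the root agent).
Agent Tools: call other agents as tools, enabling collaboration between agents.
OpenAPI Tools: generate tools automatically from an OpenAPI specification, making it easy to integrate API services.
Retrieval Tools: tools for retrieval-augmented generation (RAG) that retrieve information from documents.
MCP Tools: integrate with the Model Context Protocol (MCP) to extend the tool set dynamically.

The tool system exposes a uniform interface, so developers can add custom tools for their own application scenarios.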
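As an illustration of how a plain function can become a tool, the sketch below derives a simple tool description from a function's signature and docstring using the standard `inspect` module. This is an assumption about the general technique, not ADK's actual conversion code; `describe_tool` and `convert_currency` are hypothetical names.

```python
import inspect
from typing import Any, Callable


def describe_tool(func: Callable[..., Any]) -> dict[str, Any]:
    """Build a simple tool description from a function's signature and docstring."""
    signature = inspect.signature(func)
    parameters = {}
    for name, param in signature.parameters.items():
        annotation = param.annotation
        # Classes such as float or str expose __name__; fall back to str() otherwise.
        type_name = getattr(annotation, "__name__", str(annotation))
        parameters[name] = {
            "type": type_name,
            # A parameter without a default value must be supplied by the caller.
            "required": param.default is inspect.Parameter.empty,
        }
    doc = inspect.getdoc(func) or ""
    return {
        "name": func.__name__,
        "description": doc.splitlines()[0] if doc else "",
        "parameters": parameters,
    }


def convert_currency(amount: float, currency_to: str = "EUR") -> float:
    """Convert an amount into another currency (hypothetical example tool)."""
    return amount


tool_spec = describe_tool(convert_currency)
print(tool_spec)
```

The point of the sketch is that type hints and docstrings carry the information a model needs to call the function, which is why the quick-start tools later in this article are written with full annotations and docstrings.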

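3. Models

ADK supports multiple LLM backends, mainly:

GoogleLLM: integrates the Google Gemini model family.
AnthropicLLM: supports Anthropic's Claude models.
LiteLLM: supports OpenAI, Azure OpenAI, and many other models through the LiteLLM library.

Models sit behind an abstract interface, which separates agent logic from any specific model implementation and keeps code portable and flexible.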
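The separation between agent logic and model backend can be sketched with a small structural interface. Everything here is hypothetical (`TextModel`, `EchoModel`, `ShoutModel`, `TinyAgent`); it only illustrates the design idea and does not reproduce ADK's model classes.

```python
from typing import Protocol


class TextModel(Protocol):
    """Minimal model interface: anything with generate(prompt) -> str qualifies."""

    def generate(self, prompt: str) -> str: ...


class EchoModel:
    """Hypothetical backend that repeats the prompt."""

    def generate(self, prompt: str) -> str:
        return f"echo: {prompt}"


class ShoutModel:
    """Hypothetical backend that upper-cases the prompt."""

    def generate(self, prompt: str) -> str:
        return prompt.upper()


class TinyAgent:
    """Agent logic that depends only on the TextModel interface."""

    def __init__(self, model: TextModel, instruction: str) -> None:
        self.model = model
        self.instruction = instruction

    def ask(self, question: str) -> str:
        return self.model.generate(f"{self.instruction} {question}")


# The same agent code runs against either backend.
echo_reply = TinyAgent(EchoModel(), "Be brief.").ask("hello")
shout_reply = TinyAgent(ShoutModel(), "Be brief.").ask("hello")
print(echo_reply)   # echo: Be brief. hello
print(shout_reply)  # BE BRIEF. HELLO
```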

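4. Runners

ADK's runners execute agents and manage sessions, state, and resources:

InMemoryRunner: an in-memory runner for local testing.
Runner: the standard runner, which supports integration with various services.

A runner handles the full lifecycle of an agent execution, including message handling, event generation, and interaction with the services it is connected to.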

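5. Sessions

ADK's session management maintains the conversation history between an agent and a user:

Session: represents one conversation between a user and the agent system.
SessionService: manages creating, fetching, and updating sessions.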
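The idea behind an in-memory session service can be sketched as a dictionary keyed by application, user, and session ID. The class and method names below (`ChatSession`, `InMemorySessionStore`, `create`, `get`, `append`) are hypothetical and simplified; they are not ADK's Session or SessionService API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ChatSession:
    """One conversation; history holds (role, text) pairs in order."""
    app_name: str
    user_id: str
    session_id: str
    history: list[tuple[str, str]] = field(default_factory=list)


class InMemorySessionStore:
    """Keeps sessions in a dict keyed by (app_name, user_id, session_id)."""

    def __init__(self) -> None:
        self._sessions: dict[tuple[str, str, str], ChatSession] = {}

    def create(self, app_name: str, user_id: str, session_id: str) -> ChatSession:
        key = (app_name, user_id, session_id)
        if key in self._sessions:
            raise ValueError(f"session already exists: {key}")
        self._sessions[key] = ChatSession(app_name, user_id, session_id)
        return self._sessions[key]

    def get(self, app_name: str, user_id: str, session_id: str) -> Optional[ChatSession]:
        return self._sessions.get((app_name, user_id, session_id))

    def append(self, session: ChatSession, role: str, text: str) -> None:
        session.history.append((role, text))


store = InMemorySessionStore()
session = store.create("multi_app", "user_1", "session_001")
store.append(session, "user", "hello")
store.append(session, "agent", "hi there")
print(store.get("multi_app", "user_1", "session_001").history)
```

Everything is lost when the process exits, which is why an in-memory store suits local testing and a persistent store is needed for real deployments.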

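6. Evaluation

ADK provides evaluation tools that help developers test and improve agent performance:

AgentEvaluator: evaluates how an agent performs on test cases.
ResponseEvaluator: evaluates the quality of an agent's responses.
TrajectoryEvaluator: evaluates whether the agent's decision trajectory is reasonable.

The evaluation system supports automatically generating test cases, using an LLM to score agent performance, and comparing against human evaluation results, giving agent development broad quality coverage.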
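To show what evaluating a decision trajectory can mean in practice, the sketch below compares the sequence of tools an agent actually called with the sequence a test case expected. Both scoring functions are hypothetical illustrations under that assumption; they are not the metrics implemented by ADK's TrajectoryEvaluator.

```python
def trajectory_exact_match(expected: list[str], actual: list[str]) -> float:
    """Return 1.0 only when the tool calls match exactly, in order."""
    return 1.0 if expected == actual else 0.0


def trajectory_in_order_score(expected: list[str], actual: list[str]) -> float:
    """Greedy left-to-right match: fraction of expected calls found in order."""
    if not expected:
        return 1.0
    position = 0  # index in `actual` from which the next search starts
    matched = 0
    for tool in expected:
        try:
            index = actual.index(tool, position)
        except ValueError:
            continue  # this expected call is missing; try the next one
        matched += 1
        position = index + 1
    return matched / len(expected)


expected_calls = ["get_forecast", "get_exchange_rate", "summarize"]
actual_calls = ["get_forecast", "search", "summarize"]

print(trajectory_exact_match(expected_calls, actual_calls))     # 0.0
print(trajectory_in_order_score(expected_calls, actual_calls))  # 2 of 3 expected calls matched
```

An exact-match score is strict and easy to interpret, while the in-order score gives partial credit when the agent takes a detour but still reaches the expected steps.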

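III. Technical Features

1. Code-First Development

ADK takes a code-first approach rather than one driven by configuration files. The example below defines an agent directly in Python; the advantages of this approach are listed after it: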

from google.adk.agents import Agent
from google.adk.tools import google_search

root_agent = Agent(
    name="search_assistant",
    model="gemini-2.0-flash",
    instruction="You are a helpful assistant. Use Google Search to answer user questions when needed.",
    description="An assistant that can search the web.",
    tools=[google_search]
)

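Flexibility: the full power of Python and its ecosystem is available directly.
Testability: unit tests and integration tests are easy to write.
Version control: the code can be managed with Git and similar tools.
Integration: it is easy to integrate with existing Python code and systems.

2. Multi-Agent Collaboration

ADK supports multi-agent systems that decompose complex tasks and solve them collaboratively: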

from google.adk.agents import LlmAgent

# Define the specialized agents ("..." stands for arguments elided in the original)
greeter = LlmAgent(name="greeter", model="gemini-2.0-flash", ...)
task_executor = LlmAgent(name="task_executor", model="gemini-2.0-flash", ...)

# Create the coordinator agent
coordinator = LlmAgent(
    name="Coordinator",
    model="gemini-2.0-flash",
    description="I coordinate greetings and task execution.",
    sub_agents=[greeter, task_executor]
)

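With sub-agents, ADK can build a hierarchical agent structure in which each agent focuses on a specific task and all of them work toward a shared goal. The ADK engine and the model steer the agents so that tasks are decomposed and executed automatically.

3. A Rich Tool Ecosystem

ADK ships with many prebuilt tools and supports several ways to extend them:

Custom functions: turn ordinary Python functions into agent tools.
OpenAPI integration: generate tools automatically from an OpenAPI specification.
Google ecosystem integration: built-in support for Google Search, Vertex AI, and other services.
MCP integration: obtain tools directly over the Model Context Protocol (MCP).

The tool system uses a uniform interface, which keeps tools consistent and extensible.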

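4. Flexible Deployment

ADK supports several deployment options for different scenarios:

Local development: run and test agents locally with InMemoryRunner.
Container deployment: package an agent as a container and deploy it to services such as Cloud Run.
Serverless deployment: integrate with serverless platforms such as Google Cloud Functions.
Vertex AI integration: integrate with Vertex AI Agent Engine for large-scale deployment.

5. Development and Evaluation Tooling

ADK provides development and evaluation tools that speed up the agent development workflow:

Development UI: a built-in web interface for testing, debugging, and demonstrating agents.
Evaluation framework: automated evaluation of agent performance with multiple metrics.
Testing tools: support for unit, integration, and end-to-end tests.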

IV. Quick Start

This walkthrough uses uv to manage the Python project.

1. Initialize the project

uv init adk-demo --python 3.12.0
cd adk-demo
# Create the multi_agent directory (an underscore keeps it importable as a Python package)
mkdir multi_agent
cd multi_agent
# Create agent.py
touch agent.py
# Create __init__.py
touch __init__.py
# Set the contents of __init__.py to:
from . import agent
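For readers on systems without `mkdir` and `touch` semantics identical to the commands above, the same package layout can be created from Python. This is a convenience sketch, not part of the original walkthrough: `scaffold` is a hypothetical helper, and it assumes the package is named `multi_agent` (with an underscore, so that it is importable and matches the run command used later).

```python
import tempfile
from pathlib import Path


def scaffold(root: Path, package: str = "multi_agent") -> Path:
    """Create <root>/<package>/ with an empty agent.py and an __init__.py that imports it."""
    pkg = root / package
    pkg.mkdir(parents=True, exist_ok=True)
    (pkg / "agent.py").touch()
    (pkg / "__init__.py").write_text("from . import agent\n", encoding="utf-8")
    return pkg


# Demonstrate in a throwaway directory; pass the real project root in practice.
with tempfile.TemporaryDirectory() as tmp:
    created = scaffold(Path(tmp))
    created_files = sorted(p.name for p in created.iterdir())
    print(created_files)  # ['__init__.py', 'agent.py']
```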

2. Install google-adk, litellm, and httpx

uv add google-adk litellm httpx

3. In agent.py, import the modules and create the LLM backend with the Tongyi Qianwen model

import asyncio
from typing import Any
from google.adk.agents import LlmAgent
from google.adk.models.lite_llm import LiteLlm
from google.adk.sessions import InMemorySessionService
from google.adk.runners import Runner
from google.genai import types
import httpx

llm = LiteLlm(
    model="openai/qwen-max-2025-01-25",
    api_key='your api key',
    base_url='https://dashscope.aliyuncs.com/compatible-mode/v1',
    temperature=0.3,
)
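Hard-coding an API key in source files makes it easy to leak through version control. One common alternative, sketched below, is to read the key from an environment variable. The helper `read_api_key` and the variable name `DASHSCOPE_API_KEY` are assumptions made for this example; use whatever name fits your setup.

```python
import os


def read_api_key(env_var: str = "DASHSCOPE_API_KEY") -> str:
    """Return the API key stored in env_var, or raise a clear error if it is unset."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable before running the agent."
        )
    return key


# Usage (assumed): pass api_key=read_api_key() when constructing the model above
# instead of a literal string.
```

Failing early with an explicit message is usually easier to debug than an authentication error returned later by the model endpoint.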

4. Build the weather forecast agent

API_BASE = "https://devapi.qweather.com/v7"
API_KEY = "your QWeather API key"


async def query_weather(url: str) -> dict[str, Any] | None:
    """Make a request to the QWeather API with basic error handling."""
    headers = {
        "X-QW-Api-Key": API_KEY,
    }
    async with httpx.AsyncClient() as client:
        try:
            response = await client.get(url, headers=headers, timeout=30.0)
            response.raise_for_status()
            return response.json()
        except Exception:
            # Any network, HTTP, or JSON error is reported to the caller as None
            return None

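async def get_forecast(latitude: float, longitude: float) -> str:
    """Get the weather forecast for a location.

    Args:
        latitude: Latitude of the location
        longitude: Longitude of the location
    """
    # Query the 7-day daily forecast; the location parameter is "longitude,latitude"
    weather_url = f"{API_BASE}/weather/7d?location={longitude},{latitude}"
    weather_data = await query_weather(weather_url)

    if not weather_data or ('code' in weather_data and weather_data['code'] != '200'):
        # Report the API status code; the original listing read 'fxLink' here, which is not an error field
        error_msg = f"API returned code {weather_data['code']}" if weather_data else 'could not reach the weather API'
        return f"Failed to fetch weather data: {error_msg}"

    if 'daily' not in weather_data or not weather_data['daily']:
        return "Malformed weather data; unable to parse it"

    forecasts = []
    for period in weather_data['daily']:  # One entry per forecast day
        forecast = f"""
{period['fxDate']} {period['textDay']}:
Temperature: {period['tempMin']}~{period['tempMax']}°C
Wind: {period['windSpeedDay']} {period['windDirDay']}
"""
        forecasts.append(forecast)

    return "\n---\n".join(forecasts)


# The coordinator below references weather_agent, but the original listing never
# defines it. This definition is an assumption that mirrors rate_agent; the
# description and instruction wording are illustrative.
weather_agent = LlmAgent(
    model=llm,
    name="weather_agent",
    description="Can respond to user when they ask about the weather forecast for a location.",
    instruction="Use the 'get_forecast' tool to answer questions about the weather.",
    tools=[get_forecast],
)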
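The formatting step of the forecast tool can be checked without a network call or an API key by running it against a small sample payload. The sketch below is an offline illustration: `format_daily` and `SAMPLE_RESPONSE` are hypothetical, and the sample values are invented. Only the field names (`fxDate`, `textDay`, `tempMin`, `tempMax`, `windSpeedDay`, `windDirDay`) come from the code above.

```python
from typing import Any

# Invented sample data shaped like the fields that get_forecast reads.
SAMPLE_RESPONSE: dict[str, Any] = {
    "code": "200",
    "daily": [
        {"fxDate": "2025-04-27", "textDay": "Sunny", "tempMin": "14",
         "tempMax": "24", "windSpeedDay": "12", "windDirDay": "SE"},
        {"fxDate": "2025-04-28", "textDay": "Cloudy", "tempMin": "15",
         "tempMax": "22", "windSpeedDay": "9", "windDirDay": "E"},
    ],
}


def format_daily(weather_data: dict[str, Any], max_days: int = 5) -> str:
    """Format at most max_days daily entries, one line per day."""
    days = weather_data.get("daily") or []
    lines = [
        f"{d['fxDate']} {d['textDay']}: {d['tempMin']}~{d['tempMax']}°C, "
        f"wind {d['windSpeedDay']} {d['windDirDay']}"
        for d in days[:max_days]
    ]
    return "\n".join(lines)


first_day = format_daily(SAMPLE_RESPONSE, max_days=1)
print(first_day)  # 2025-04-27 Sunny: 14~24°C, wind 12 SE
```

Separating parsing and formatting from the HTTP call in this way keeps the tool easy to unit test, and the `max_days` slice is one way to cap the output if fewer days are wanted.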

5. Build the exchange-rate agent

def get_exchange_rate(
    currency_from: str = "USD",
    currency_to: str = "EUR",
    currency_date: str = "latest",
):
    """Use this to get current exchange rate.

    Args:
        currency_from: The currency to convert from (e.g., "USD").
        currency_to: The currency to convert to (e.g., "EUR").
        currency_date: The date for the exchange rate or "latest". Defaults to "latest".

    Returns:
        A dictionary containing the exchange rate data, or an error message if the request fails.
    """    
    try:
        response = httpx.get(
            f"https://api.frankfurter.app/{currency_date}",
            params={"from": currency_from, "to": currency_to},
        )
        response.raise_for_status()

        data = response.json()
        if "rates" not in data:
            return {"error": "Invalid API response format."}
        return data
    except httpx.HTTPError as e:
        return {"error": f"API request failed: {e}"}
    except ValueError:
        return {"error": "Invalid JSON response from API."}

# Each fragment ends with a space so the sentences do not run together when concatenated
SYSTEM_INSTRUCTION = (
    "You are a specialized assistant for currency conversions. "
    "Your sole purpose is to use the 'get_exchange_rate' tool to answer questions about currency exchange rates. "
    "If the user asks about anything other than currency conversion or exchange rates, "
    "politely state that you cannot help with that topic and can only assist with currency-related queries. "
    "Do not attempt to answer unrelated questions or use tools for other purposes. "
    "Set response status to input_required if the user needs to provide more information. "
    "Set response status to error if there is an error while processing the request. "
    "Set response status to completed if the request is complete."
)

rate_agent = LlmAgent(
    model=llm,
    name="rate_agent",
    instruction=SYSTEM_INSTRUCTION,
    description="Can respond to user when they ask about currency exchange rates.",
    tools=[get_exchange_rate]
)
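The tool above returns either the parsed response or a dictionary with an `error` key, so any caller has to handle both shapes. The sketch below shows one way to do that and to turn a rate into a converted amount. `convert_amount` and `SAMPLE_RATES` are hypothetical, the rate value is invented rather than a real quote, and only the `rates` and `error` keys are taken from the code above.

```python
from typing import Any

# Invented sample shaped like a successful response; the rate is not a real quote.
SAMPLE_RATES: dict[str, Any] = {"base": "USD", "rates": {"CNY": 7.0}}


def convert_amount(amount: float, data: dict[str, Any], currency_to: str) -> float:
    """Convert amount using the rate in data, raising ValueError on failure."""
    if "error" in data:
        raise ValueError(data["error"])
    rates = data.get("rates", {})
    if currency_to not in rates:
        raise ValueError(f"No rate available for {currency_to}.")
    return round(amount * rates[currency_to], 2)


converted = convert_amount(100, SAMPLE_RATES, "CNY")
print(converted)  # 700.0
```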

6. Build the coding-assistant agent

code_agent = LlmAgent(
    model=llm,
    name="code_agent",
    description="Can respond to user when they ask about code.",
    instruction="You can write code and fix bugs.",
)

7. Build the multi-agent system

root_agent = LlmAgent(
    model=llm,
    name="coordinator",
    instruction="Route user requests: Use rate_agent to answer questions about currency exchange rates, use code_agent to write code and fix bugs, use weather_agent to get weather forecast for a location.",
    description="Main agent for routing user requests to the appropriate sub-agents.",
    sub_agents=[rate_agent, code_agent, weather_agent],
)

APP_NAME = "multi_app"
USER_ID = "user_1"
SESSION_ID = "session_001"  # Using a fixed ID for simplicity

session_service = InMemorySessionService()

session = session_service.create_session(
    app_name=APP_NAME,
    user_id=USER_ID,
    session_id=SESSION_ID
)

runner = Runner(
    agent=root_agent,
    app_name=APP_NAME,
    session_service=session_service,
)

async def call_agent_async(query: str, runner, user_id, session_id):
    """Sends a query to the agent and prints the final response."""
    print(f"\n>>> User Query: {query}")

    final_response_text = "Agent did not produce a final response."  # Default

    # Prepare the user's message in ADK format
    content = types.Content(role="user", parts=[types.Part(text=query)])

    # Key Concept: run_async executes the agent logic and yields Events.
    # We iterate through events to find the final answer.
    async for event in runner.run_async(
        user_id=user_id, session_id=session_id, new_message=content
    ):
        # You can uncomment the line below to see *all* events during execution
        # print(f"  [Event] Author: {event.author}, Type: {type(event).__name__}, Final: {event.is_final_response()}, Content: {event.content}")

        # Key Concept: is_final_response() marks the concluding message for the turn.
        if event.is_final_response():
            if event.content and event.content.parts:
                # Assuming text response in the first part
                final_response_text = event.content.parts[0].text
            elif (
                event.actions and event.actions.escalate
            ):  # Handle potential errors/escalations
                final_response_text = (
                    f"Agent escalated: {event.error_message or 'No specific message.'}"
                )
            # Add more checks here if needed (e.g., specific error codes)
            break  # Stop processing events once the final response is found

    print(f"<<< Agent Response: {final_response_text}")

async def run_conversation():
    # Each query should be routed to a different sub-agent by the coordinator
    await call_agent_async(
        "What is the weather like in Shanghai today?",
        runner=runner,
        user_id=USER_ID,
        session_id=SESSION_ID,
    )

    await call_agent_async(
        "What is the current USD to CNY exchange rate?",
        runner=runner,
        user_id=USER_ID,
        session_id=SESSION_ID,
    )

    await call_agent_async(
        "Write a piece of Python code that computes 1+1",
        runner=runner,
        user_id=USER_ID,
        session_id=SESSION_ID,
    )

if __name__ == "__main__":
    asyncio.run(run_conversation())
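The event loop in `call_agent_async` can be hard to follow without a running model, so here is a self-contained sketch of the same pattern using fake events. `FakeEvent`, `fake_run`, and `last_answer` are hypothetical stand-ins; only the idea of iterating an async stream and stopping at the event for which `is_final_response()` returns true is taken from the code above.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator


@dataclass
class FakeEvent:
    """Hypothetical stand-in for an event; only the fields used below."""
    author: str
    text: str
    final: bool = False

    def is_final_response(self) -> bool:
        return self.final


async def fake_run() -> AsyncIterator[FakeEvent]:
    """Yield a fixed sequence of events, mimicking a coordinator handing off work."""
    yield FakeEvent("coordinator", "transferring to rate_agent")
    yield FakeEvent("rate_agent", "calling get_exchange_rate")
    yield FakeEvent("rate_agent", "final answer text", final=True)
    yield FakeEvent("rate_agent", "never consumed")  # the loop below stops earlier


async def last_answer() -> str:
    answer = "Agent did not produce a final response."  # default if no final event
    async for event in fake_run():
        if event.is_final_response():
            answer = event.text
            break  # stop at the first final event
    return answer


final_text = asyncio.run(last_answer())
print(final_text)  # final answer text
```

Intermediate events (handoffs, tool calls) are skipped here, but printing them, as the commented-out line in the original listing suggests, is a useful way to see which sub-agent handled a request.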

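8. Run the system

# Run these commands from the adk-demo project root
# Activate the virtual environment (Windows; on macOS/Linux use: source .venv/bin/activate)
.\.venv\Scripts\activate
# Run in web UI mode (then open http://localhost:8000/)
adk web
# Run in CLI mode
uv run .\multi_agent\agent.py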

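Conclusion

Google's ADK provides a complete, flexible toolchain for agent development that covers the whole lifecycle, from agent definition and tool integration to evaluation and deployment. Whether you are building a simple conversational assistant or a complex multi-agent system, ADK supplies the pieces you need.

ADK is evolving quickly. Combined with the currently popular MCP and Google's own A2A protocol, it makes building distributed multi-agent systems considerably simpler. If you are thinking about building an agent application, ADK is well worth trying. Visit the GitHub project (https://github.com/google/adk-python) to get started. Master AI, and anyone can be its commander. Good luck!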
