
The Anatomy of an ADK Agent

Explore the fundamental components of an AI agent built with the Google ADK framework. Learn how the LlmAgent acts as the agent's brain, how tools extend its capabilities, and how the Runner orchestrates execution. Understand how these parts work together to create modular, scalable AI agents.

In software development, building any complex system requires a clear understanding of its fundamental components. Just as a web application is composed of distinct parts like a database, a server-side framework, and a user interface, an AI agent built with a professional framework is also made of well-defined, interconnected components.

Having seen a basic agent run, we will now explore its architectural anatomy. A solid grasp of these core building blocks is essential for moving beyond simple examples and beginning to design and build powerful, custom agentic applications. This lesson breaks down the essential Python classes of the Google Agent Development Kit, focusing on the three primary components: the LlmAgent (the brain), the Tools (the capabilities), and the Runner (the engine).

The core component: LlmAgent

At the very center of any intelligent application built with the ADK is the agent itself. The primary class we will work with for this purpose is the LlmAgent.

The LlmAgent, also known as Agent, is a core component in the ADK that acts as the “thinking” part of an application. Its primary function is to leverage the power of an LLM for reasoning, understanding natural language, making decisions, generating responses, and interacting with tools. It is the component where we define the agent’s identity and its core logic. When we create an instance of the LlmAgent class, we configure its behavior through a series of parameters.

Here is the code snippet of the LlmAgent class, demonstrating the use of its primary parameters:

```python
from google.adk.agents.llm_agent import LlmAgent

root_agent = LlmAgent(
    name='greeting_agent',
    model='gemini-2.5-flash',
    description='An agent that provides a friendly greeting in a specified language.',
    instruction='You are a friendly agent. Greet the user in their specified language.',
)
```

Let’s explore the parameters and their usage:

  • name (Required): Every agent needs a unique string identifier. This name is crucial for internal operations, especially in multi-agent systems where different agents need a way to refer to or delegate tasks to each other. It also serves as a clear label in logs and debugging outputs. It is best to choose a descriptive name that reflects the agent’s function.

  • model (Required): This parameter specifies the underlying LLM that will power the agent’s reasoning. The choice of model directly impacts the agent’s capabilities, performance, and cost. Different models have different strengths, so selecting the right one is a key design decision.

  • description: This parameter is a concise, human-readable summary of the agent’s capabilities. While it may seem secondary in a system with only one agent, its importance grows significantly in multi-agent architectures. It is primarily used by other agents to determine if they should route a task. For example, if a manager agent receives a user query, it will look at the descriptions of all the worker agents it controls to decide which specialist is best suited for the job.

  • instruction: This parameter is the agent’s core directive. It is a string that serves as the system prompt, sent to the LLM at the beginning of every interaction. A well-crafted instruction is the primary tool we have for guiding the agent. It is used to define:

    • Its core task or goal.

    • Its personality or persona.

    • Constraints on its behavior.

    • How and when to use its tools.

    • The desired format for its output.
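For illustration, a hypothetical instruction string touching each of these elements might look like the following (the wording and the `create_greeting` tool name are invented for this example):

```python
# A hypothetical instruction covering task, persona, constraints,
# tool usage, and output format (illustrative wording only).
INSTRUCTION = (
    "You are a cheerful multilingual greeter. "                       # persona
    "Your goal is to greet the user in the language they request. "   # core task
    "Only produce greetings; politely decline unrelated requests. "   # constraint
    "When the `create_greeting` tool is available, call it instead "
    "of composing the greeting yourself. "                            # tool usage
    "Reply with a single sentence containing only the greeting."      # output format
)
```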

Beyond these core parameters, the LlmAgent offers several other optional arguments for better control over its behavior.

LLM response generation

We can control how the underlying LLM generates responses using the generate_content_config parameter, which accepts a configuration object. Through it, we can adjust settings like temperature (to control randomness), max_output_tokens (to limit response length), and safety settings.

```python
from google.adk.agents.llm_agent import LlmAgent
from google.genai import types

root_agent = LlmAgent(
    name='greeting_agent',
    model='gemini-2.5-flash',
    description='An agent that provides a friendly greeting in a specified language.',
    instruction='You are a friendly agent. Greet the user in their specified language.',
    generate_content_config=types.GenerateContentConfig(
        temperature=0.2,
        max_output_tokens=250,
        safety_settings=[
            types.SafetySetting(
                category=types.HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT,
                threshold=types.HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,
            )
        ],
    ),
)
```

Structured data

The ADK also provides the ability to define expected input and desired output formats. Here are the parameters that help us achieve this.

  • output_schema: For scenarios requiring structured data, this parameter can be used to define a schema (often a Pydantic BaseModel) for the agent’s final output. If set, the ADK will ensure that the agent’s response is a JSON object conforming to this schema.

  • input_schema: This is the counterpart to output_schema. It defines the expected structure of the message content passed to the agent. If set, the input must be a JSON string that conforms to this schema, which is useful for ensuring structured data flow in agent-to-agent communication.

  • output_key: This parameter provides a string key. If set, the final text response from the agent will be automatically saved to the session’s state dictionary under this key. This is the primary mechanism for passing results between different agents or steps in a workflow.

```python
from google.adk.agents.llm_agent import LlmAgent
from pydantic import BaseModel, Field

# Define schemas for structured input and output
class GreetingRequest(BaseModel):
    """Input schema for specifying the language of the greeting."""
    language: str = Field(description="The language to greet the user in.")

class GreetingResponse(BaseModel):
    """Output schema for the structured greeting response."""
    greeting_message: str = Field(description="The final, formatted greeting.")

root_agent = LlmAgent(
    name='greeting_agent',
    model='gemini-2.5-flash',
    description='A helpful assistant for user questions.',
    instruction='Answer user questions to the best of your knowledge',
    input_schema=GreetingRequest,
    output_schema=GreetingResponse,
    output_key='final_greeting',
)
```

Context management

The ADK provides the include_contents parameter to control whether the agent receives the prior conversation history. It can be set to 'default' (the agent receives all relevant history) or 'none' (the agent receives no history). Setting it to 'none' is useful for creating stateless agents that should only consider the current input.

```python
from google.adk.agents.llm_agent import LlmAgent

root_agent = LlmAgent(
    name='greeting_agent',
    model='gemini-2.5-flash',
    description='An agent that provides a friendly greeting in a specified language.',
    instruction='You are a friendly agent. Greet the user in their specified language.',
    include_contents='none',
)
```

Note: By default, the include_contents parameter is automatically set to 'default'. This means that unless explicitly configured otherwise, the ADK will always provide the relevant conversation history to the agent, enabling stateful, multi-turn conversations without any additional setup.

Planning

To enable multi-step reasoning and planning before execution, we use the planner parameter. There are two main types:

  • BuiltInPlanner: It leverages the model’s native thinking or planning capabilities (like those in Gemini).

    • thinking_budget: It sets the maximum number of tokens the model may use for internal thinking when generating a response.

    • include_thoughts: It controls whether the model returns its internal reasoning process along with the final answer.

```python
from google.adk.agents.llm_agent import LlmAgent
from google.adk.planners import BuiltInPlanner
from google.genai import types

root_agent = LlmAgent(
    name='greeting_agent',
    model='gemini-2.5-flash',
    description='An agent that provides a friendly greeting in a specified language.',
    instruction='You are a friendly agent. Greet the user in their specified language.',
    planner=BuiltInPlanner(
        thinking_config=types.ThinkingConfig(
            include_thoughts=True,
            thinking_budget=1024,
        )
    ),
)
```
  • PlanReActPlanner: It instructs the model to follow a specific Plan -> Action -> Reason structure, which is useful for models that do not have a built-in thinking feature.

```python
from google.adk.agents.llm_agent import LlmAgent
from google.adk.planners import PlanReActPlanner

root_agent = LlmAgent(
    name='greeting_agent',
    model='gemini-2.0-flash',
    description='An agent that provides a friendly greeting in a specified language.',
    instruction='You are a friendly agent. Greet the user in their specified language.',
    planner=PlanReActPlanner(),
)
```

Code execution

The ADK provides the code_executor parameter, enabling the agent to execute blocks of code (e.g., Python) generated as part of its response. By providing an instance of a built-in code executor, like the BuiltInCodeExecutor, we give the agent the ability to perform tasks like calculations, data manipulation, or running small scripts.

```python
from google.adk.agents.llm_agent import LlmAgent
from google.adk.code_executors import BuiltInCodeExecutor

root_agent = LlmAgent(
    name='greeting_agent',
    model='gemini-2.5-flash',
    description='An agent that provides a friendly greeting in a specified language.',
    instruction='You are a friendly agent. Greet the user in their specified language.',
    code_executor=BuiltInCodeExecutor(),
)
```

Now that we have defined the agent’s brain, let’s explore how to give it capabilities to act.

The agent’s capabilities: Understanding tools

An agent’s ability to reason is powerful, but its true utility comes from its capacity to act. An LLM’s knowledge is confined to its training data. To perform useful, real-world tasks, an agent needs to be able to interact with external systems, fetch real-time data, and execute specific actions. In the ADK, this is achieved through tools. A tool is a capability provided to an agent that allows it to perform actions beyond the LLM’s built-in knowledge. This could be anything from fetching a web page to querying a database or calling a proprietary enterprise API.

The ADK supports a flexible ecosystem of tools, allowing us to grant capabilities in several ways.

Custom function tools

One of the most powerful and elegant features of the ADK in Python is how it handles the creation of custom tools. Any regular Python function can be transformed into a tool that an LlmAgent can use. We don’t need to write complex wrapper classes or API definitions. The framework handles this transformation automatically by inspecting the Python function’s signature.

```python
def create_greeting(name: str, language: str = "English") -> str:
    """Creates a personalized greeting for a user in a specified language.

    This function's docstring is crucial. The ADK framework reads this
    description and the arguments below to create a schema that the LLM
    can understand and decide when to use this tool.

    Args:
        name (str): The name of the person to greet.
        language (str): The language for the greeting. Defaults to English.
    """
    if language.lower() == "spanish":
        return f"¡Hola, {name}! ¿Cómo estás?"
    else:
        return f"Hello, {name}! How are you?"
```
The tool definition

When we provide the above function in the tools list of an LlmAgent, the ADK examines its:

  • Name: The function’s name (e.g., create_greeting) becomes the name of the tool.

  • Parameters and type hints: The function’s arguments (e.g., name: str) define the parameters the LLM must provide when calling the tool.

  • Docstring: This is the most critical piece. The function’s docstring is used as the description of the tool. A detailed docstring that clearly explains what the tool does, what each parameter means, and what it returns is essential for the LLM to understand when and how to use the tool correctly.
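Conceptually, this automatic wrapping can be approximated with Python's standard inspect module. The sketch below is plain stdlib code, not the actual ADK implementation, and the schema shape is invented for illustration:

```python
import inspect

def create_greeting(name: str, language: str = "English") -> str:
    """Creates a personalized greeting for a user in a specified language."""
    return f"Hello, {name}!"

def describe_tool(func):
    # Derive a tool-style schema the way a framework might:
    # name from the function, description from the docstring,
    # parameters from the signature's type hints and defaults.
    sig = inspect.signature(func)
    params = {
        p.name: {
            "type": p.annotation.__name__,
            "required": p.default is inspect.Parameter.empty,
        }
        for p in sig.parameters.values()
    }
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func),
        "parameters": params,
    }

schema = describe_tool(create_greeting)
print(schema["name"])                    # create_greeting
print(schema["parameters"]["language"])  # {'type': 'str', 'required': False}
```

This is why the docstring and type hints matter so much: they are the only description of the tool the LLM ever sees.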

The complete code is given below:

```python
from google.adk.agents.llm_agent import LlmAgent

def create_greeting(name: str, language: str = "English") -> str:
    """Creates a personalized greeting for a user in a specified language.

    Args:
        name (str): The name of the person to greet.
        language (str): The language for the greeting. Defaults to English.
    """
    if language.lower() == "spanish":
        return f"¡Hola, {name}! ¿Cómo estás?"
    else:
        return f"Hello, {name}! How are you?"

# Instantiate an agent and provide the Python function directly to the tools list
root_agent = LlmAgent(
    name='greeting_tool_agent',
    model='gemini-2.5-flash',
    instruction="""You are a helpful greeter. When the user asks for a greeting,
use the `create_greeting` tool to generate it.""",
    tools=[create_greeting],  # The framework automatically wraps this function as a tool
)
```

We can empower our agents with complex capabilities simply by writing clean, well-documented Python functions.

Built-in tools

The ADK provides a library of ready-to-use tools for common functionalities. The most prominent example is the google_search tool, which allows an agent to perform real-time Google searches without any custom coding. We simply import the tool and add it to the agent's tools list.

```python
from google.adk.agents.llm_agent import LlmAgent
from google.adk.tools import google_search

# Instantiate an agent and provide the imported built-in tool
root_agent = LlmAgent(
    name='basic_search_agent',
    model='gemini-2.5-flash',
    instruction="Answer user questions by searching the internet.",
    tools=[google_search],
)
```

Some other tools include:

  • BuiltInCodeExecutor: It allows an agent to run generated code in a secure environment.

  • VertexAiSearchTool and VertexAiRagRetrieval: They enable search across private, configured data stores and documents.

  • BigQuery: It is a set of tools for asking questions about data in BigQuery tables using natural language.

  • Spanner: It is a set of tools for interacting with and querying Spanner databases.

Combining built-in and custom tools:

By default, the ADK enforces a limitation where a single agent can have either one built-in tool (like google_search) or multiple custom tools, but not a mix of both. However, the framework provides a specific and essential workaround for GoogleSearchTool and VertexAiSearchTool in Python. By setting the bypass_multi_tools_limit=True parameter when instantiating the LlmAgent, we can successfully combine these powerful built-in capabilities with our custom functions in the same agent, as our project will require.

Agents-as-tools

This is an advanced but powerful concept for building hierarchical agent systems. The ADK allows an entire, fully-defined agent to be wrapped inside a special AgentTool class. This wrapped agent can then be given as a tool to another, high-level agent. This enables a manager agent to delegate complex, multi-step sub-tasks to a specialized worker agent.

```python
from google.adk.agents.llm_agent import LlmAgent
from google.adk.tools.agent_tool import AgentTool

# Define the specialized worker agent that will be used as a tool.
greeting_expert = LlmAgent(
    name='greeting_expert_agent',
    model='gemini-2.5-flash',
    description='This agent is an expert at creating personalized greetings in different languages.',
    instruction="""You are a greeting expert. A user will provide a name and a language.
Create a friendly, personalized greeting. For example: Hola, Maria!""",
)

# Define the main manager agent that will use the worker.
root_agent = LlmAgent(
    name='delegator_agent',
    model='gemini-2.5-flash',
    instruction="""You are a helpful assistant. If the user asks for any kind of greeting,
delegate the task to the `greeting_expert_agent` tool. Forward the user's
request exactly as you receive it.""",
    # Wrap the worker agent in AgentTool and provide it as a tool to the manager.
    tools=[AgentTool(agent=greeting_expert)],
)
```

Agent-as-a-tool vs. sub-agent:

It is important to distinguish the Agent-as-a-tool pattern from the concept of a sub-agent. When an agent is used as a tool, it executes its task and returns a result to the calling agent, which then decides how to proceed. In contrast, transferring control to a sub-agent means the calling agent is removed from the loop, and the sub-agent takes over the conversation with the user directly. The key difference is the flow of control: a tool returns a result, while a sub-agent permanently takes over the interaction.
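The difference in control flow can be sketched in plain Python (a conceptual analogy only, not ADK code; the function names are invented):

```python
def greeting_expert(request: str) -> str:
    """A 'worker' that produces a greeting."""
    return f"Hola! ({request})"

def manager_with_tool(request: str) -> str:
    # Agent-as-a-tool: the worker's result comes back to the manager,
    # which stays in control of the final reply to the user.
    result = greeting_expert(request)
    return f"Manager replies using: {result}"

def manager_with_subagent(request: str) -> str:
    # Sub-agent transfer: the manager hands the conversation over;
    # the worker's output goes directly to the user.
    return greeting_expert(request)

print(manager_with_tool("greet Maria"))
print(manager_with_subagent("greet Maria"))
```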

The execution engine: The Runner

We have defined our agent and discussed how to empower it with capabilities using tools, but one thing is missing: the engine that makes it run. In the ADK, this component is the Runner.

The Runner is the underlying engine that powers an agent application. It is responsible for orchestrating the entire execution flow in response to user input, managing the conversation state, and handling the back-and-forth communication between the LlmAgent and its tools.

Note: It is important to understand that the Runner is separate from the agent’s definition. We define what our agent is by creating an LlmAgent instance. We make it run by passing that agent instance to a Runner.

Role and the event loop

The Runner's primary role is to manage the event loop, which is the fundamental pattern governing how ADK executes an agent’s code. This is a cooperative, back-and-forth communication cycle:

  1. The Runner receives a user’s query.

  2. It kicks off the agent’s logic.

  3. The agent’s logic runs until it needs to communicate something—like a final answer, a request to call a tool, or a change in state. At this point, the agent’s code pauses and yields an Event object back to the Runner.

  4. The Runner receives this Event, processes it (e.g., executes a requested tool or commits a state change), and forwards the result upstream.

  5. Only after the Runner has finished processing the event does it signal the agent's logic to resume from exactly where it left off, now aware of the event's outcome.

This cooperative yield, pause, process, resume cycle is the heartbeat of the ADK runtime. It ensures that actions like tool calls and state updates are handled consistently and that the agent is always working with the most up-to-date information.
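The cycle above can be illustrated with plain Python generators (a conceptual sketch only; the real ADK Runner and Event classes are far more involved, and the event shapes here are invented):

```python
def agent_logic():
    # The agent pauses by yielding an event; the Runner resumes it
    # with the outcome via send().
    tool_result = yield {"type": "tool_call", "tool": "add", "args": (2, 3)}
    yield {"type": "final", "text": f"The sum is {tool_result}"}

def run(agent):
    gen = agent()
    event = next(gen)  # steps 1-2: receive query, kick off the agent's logic
    while True:
        if event["type"] == "tool_call":
            # step 4: the Runner processes the event (executes the tool)...
            result = sum(event["args"]) if event["tool"] == "add" else None
            # step 5: ...then resumes the agent with the outcome.
            event = gen.send(result)
        elif event["type"] == "final":
            return event["text"]

print(run(agent_logic))  # The sum is 5
```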

The ADK runtime

The Runner also manages the conversation’s session, which holds the history of all events and the current state dictionary for a specific user interaction.

Executing an agent with the runner

The following code demonstrates how to combine all the components. We will define a simple greeting_agent, instantiate a Runner to manage it, and then use the Runner to execute a user query and print the final result.

The runnable example uses the following environment variables:

```
GOOGLE_GENAI_USE_VERTEXAI=0
GOOGLE_API_KEY={{GOOGLE_API_KEY}}
```

Code explanation

  • Lines 1–6: We import the necessary libraries, including asyncio for running our asynchronous code, LlmAgent to define our agent’s blueprint, InMemoryRunner, which is a specific type of runner that manages sessions in memory, and genai_types for constructing message objects.

  • Lines 10–18: We define our chat_agent by creating an instance of the LlmAgent class. This is the static blueprint of our agent, defining its model, name, description, and the core instruction that will guide its behavior.

  • Line 21: We define our main asynchronous function, main(), which will contain the entire logic for setting up and running our chat application.

  • Lines 22–23: This is a check to ensure that the GOOGLE_API_KEY environment variable has been set. The runner needs this key to authenticate with the Gemini model on behalf of the agent.

  • Lines 26–29: We create an instance of InMemoryRunner. This is a convenient type of Runner that handles session and state management entirely in memory, which is perfect for simple scripts and testing. We pass our chat_agent blueprint to it, telling the runner which agent it is responsible for executing.

  • Lines 32–35: Before the chat begins, we explicitly create a session by calling runner.session_service.create_session(). A session is the container for a single conversation’s history and state. This call establishes a unique session_id that we will use to ensure continuity in the conversation.

  • Lines 41–82: This while True block creates a persistent chat loop in the terminal. It waits for the user to type a message and handles exit or quit commands to gracefully shut down the application.

    • Lines 56–59: We take the user’s raw text input and construct a Content object. This is the standardized format the ADK uses for all messages. We specify the role as "user" and place the text inside a Part.

    • Lines 65–69: This is the core execution step. We call runner.run_async(), providing the user_id and session_id to identify the correct conversation, along with the new_message. This method kicks off the event loop and returns an asynchronous stream of events from the agent.

    • Lines 70–74: We loop through each event yielded by the runner. We check whether the event contains message content (event.content) and whether that content has parts. If it does, we extract the text from each part and append it to our reply_chunks list. This correctly handles streaming responses, where the final answer may arrive in multiple pieces.

    • Lines 77–82: After the event loop for the current turn is complete, we check if we received any reply chunks. If we did, we join them together into a single reply_text and print it to the console for the user to see.

  • Lines 85–86: This is the entry point that starts the entire application. Because our main() function is an async function, we cannot call it like a regular function. The asyncio.run() command starts the Python asynchronous event loop, tells it to run our main() function to completion, and then cleanly closes the loop.
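The overall flow described above can be sketched with a stub runner in place of the real ADK classes (the class and event shapes here are illustrative assumptions, not the actual InMemoryRunner API):

```python
import asyncio

class StubRunner:
    """Stands in for a runner: yields reply chunks as a stream of events."""
    async def run_async(self, user_id, session_id, new_message):
        # A real runner would drive the LLM; here we fake two chunks
        # to mimic a streaming response.
        for chunk in [f"Echo: {new_message}", " (end)"]:
            yield {"content": {"parts": [{"text": chunk}]}}

async def one_turn(runner, user_text: str) -> str:
    # Collect text parts from the event stream, as in the chat loop above.
    reply_chunks = []
    async for event in runner.run_async("user-1", "session-1", user_text):
        for part in event["content"]["parts"]:
            if part.get("text"):
                reply_chunks.append(part["text"])
    return "".join(reply_chunks)

print(asyncio.run(one_turn(StubRunner(), "hello")))  # Echo: hello (end)
```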

Note: It is important to understand that the Runner class and its event loop, as demonstrated in the example above, are a core part of the Google ADK framework itself. We do not need to implement this execution machinery. Our primary role as developers is to define the agent's blueprint (the LlmAgent) and its capabilities (the tools); the framework then provides the Runner as the powerful, prebuilt engine to execute that blueprint.

The architecture of the ADK is built on a clear and powerful separation of concerns. By decoupling the declarative definition of an agent’s logic from the imperative machinery of its execution, the framework provides a robust and scalable foundation for building agentic systems. This modular design, where the brain, capabilities, and engine are distinct components, is the key to creating complex applications that remain organized, testable, and maintainable as they grow in scope and intelligence.