Tool Calling, Explained: How AI Agents Decide What to Do Next
Discover how AI agents use tool calling to decide their next action. This article breaks down the decision-making process, from function selection to execution, with practical examples.
Large language models (LLMs) have evolved far beyond simple text generation. Today’s AI agents don’t just answer questions—they take actions. They search the web, run code, query databases, and control APIs. This capability is made possible by a mechanism called **tool calling** (also known as function calling). In this article, we’ll demystify tool calling: what it is, how it works under the hood, and how you can implement it in your own AI agents with concrete, step-by-step instructions.
What Is Tool Calling?
Tool calling is the process by which an LLM decides to invoke an external function or API, rather than generating a purely textual response. The model outputs a structured request (usually in JSON) that specifies which tool to use and with what parameters. The calling application then executes the tool, returns the result to the model, and the model incorporates that result into its final answer.
This mechanism transforms LLMs from passive text generators into active problem-solvers. For example, when you ask an AI agent “What’s the weather in London?”, the model might call a `get_weather` tool with the parameter `location="London"`, receive the current weather data, and then compose a human-readable response like “The weather in London is 15°C and cloudy.”
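To make that round trip concrete, here is a minimal sketch with no API involved. The JSON string stands in for what a model might emit; its `name`/`arguments` layout, the stubbed `get_weather` function, and the `execute_tool_call` helper are assumptions for illustration, not any provider's exact wire format.

```python
import json


def get_weather(location: str) -> str:
    # Hypothetical stub; a real tool would query a weather service.
    return {"London": "15°C, cloudy"}.get(location, "Weather data not available")


# Stand-in for the structured request a model might produce (assumed shape).
model_output = '{"name": "get_weather", "arguments": {"location": "London"}}'


def execute_tool_call(raw: str) -> str:
    """Parse a JSON tool request and run the matching function."""
    request = json.loads(raw)
    if request["name"] != "get_weather":
        raise ValueError(f"Unknown tool: {request['name']}")
    return get_weather(**request["arguments"])


tool_result = execute_tool_call(model_output)
print(tool_result)
```

In a real agent, `tool_result` would be sent back to the model, which would then phrase the final answer for the user.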
Why Tool Calling Matters
Without tool calling, an LLM can only rely on its training data, which becomes stale over time. With tool calling, agents can:
- Access real-time information (stock prices, news, weather)
- Perform calculations or run code
- Interact with databases and internal systems
- Trigger workflows and send notifications
- Retrieve private or domain-specific data
Providers such as OpenAI and Anthropic expose tool calling in their APIs, and it is a foundational capability for building autonomous AI agents. It bridges the gap between language understanding and real-world action.
How Tool Calling Works: The Decision Process
The core decision process involves three steps:
1. **The model receives a user query and a list of available tools** (each defined by a name, description, and parameter schema).
2. **The model analyzes the query and decides if a tool is needed.** If yes, it outputs a structured request (e.g., `{"tool": "search_web", "args": {"query": "latest AI news"}}`).
3. **The application executes the tool, returns the result to the model, and the model generates a final response** incorporating the tool’s output.
This decision is not hard-coded; it emerges from the LLM’s training. The model has learned to recognize when a question requires external data and to output the appropriate tool call. The key is that the model *chooses* whether to call a tool; the developer supplies the tool definitions and executes the calls the model requests.
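The control flow of those three steps can be sketched without any network access. In the sketch below, `fake_model` is a hypothetical stand-in whose "decision" is a hard-coded rule, used only so the loop can run offline; in a real agent that decision comes from the LLM. The message layout and the `search_web` stub are likewise assumptions for illustration.

```python
from typing import Callable


def fake_model(messages: list[dict]) -> dict:
    """Hypothetical stand-in for an LLM: asks for a tool once, then answers."""
    last = messages[-1]
    if last["role"] == "tool":
        return {"content": f"Here is what I found: {last['content']}", "tool_call": None}
    return {
        "content": None,
        "tool_call": {"tool": "search_web", "args": {"query": "latest AI news"}},
    }


def search_web(query: str) -> str:
    # Hypothetical stub tool.
    return f"(simulated results for '{query}')"


TOOLS: dict[str, Callable[..., str]] = {"search_web": search_web}


def answer(user_query: str) -> str:
    # Step 1: the model receives the query (the tool list is implicit in this stub).
    messages = [{"role": "user", "content": user_query}]
    reply = fake_model(messages)
    # Step 2: the model either answers directly or requests a tool.
    if reply["tool_call"] is None:
        return reply["content"]
    call = reply["tool_call"]
    # Step 3: the application runs the tool and hands the result back to the model.
    result = TOOLS[call["tool"]](**call["args"])
    messages.append({"role": "tool", "content": result})
    return fake_model(messages)["content"]


print(answer("What's new in AI?"))
```

The application never decides which tool to run; it only executes whatever the model's reply names and returns the result.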
Requirements
Before we dive into implementation, ensure you have the following:
- **Python 3.9+** installed on your system
- **An API key** from OpenAI, Anthropic, or another provider that supports function calling
- **Basic familiarity** with the command line and Python
- **pip** (Python package manager) up to date
We’ll use OpenAI’s API as our primary example, as it offers mature tool calling support. The concepts apply equally to Anthropic’s Claude and other models.
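To illustrate how a tool definition carries over between providers, the sketch below converts an OpenAI-style function tool into the flat layout that, to our understanding, Anthropic's Messages API expects (`name`, `description`, `input_schema`). The helper name `to_anthropic_tool` is our own, and the field names should be checked against each provider's current documentation.

```python
def to_anthropic_tool(openai_tool: dict) -> dict:
    """Convert an OpenAI-style function tool into an Anthropic-style tool dict."""
    fn = openai_tool["function"]
    return {
        "name": fn["name"],
        "description": fn.get("description", ""),
        # The JSON Schema for the parameters is reused unchanged.
        "input_schema": fn["parameters"],
    }


openai_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a location",
        "parameters": {
            "type": "object",
            "properties": {"location": {"type": "string"}},
            "required": ["location"],
        },
    },
}

anthropic_tool = to_anthropic_tool(openai_tool)
print(anthropic_tool)
```

Only the envelope changes; the parameter schema, which is where most of the design effort goes, stays the same.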
Step-by-Step Installation
1. Set Up a Python Virtual Environment
Isolate your project dependencies to avoid conflicts.
    python -m venv tool-calling-env

Activate the environment:
- On macOS/Linux: `source tool-calling-env/bin/activate`
- On Windows: `tool-calling-env\Scripts\activate`
2. Install the OpenAI Python Library
This library provides the client for interacting with OpenAI’s API.
    pip install openai

3. Install Additional Dependencies
We’ll need `requests` for making HTTP calls and `python-dotenv` to manage our API key securely.
    pip install requests python-dotenv

4. Set Up Your API Key
Create a `.env` file in your project root:
    OPENAI_API_KEY=your-api-key-here

Replace `your-api-key-here` with your actual OpenAI API key. Never hard-code keys in your source code.
5. Verify Installation
Run a quick check to ensure everything works.
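    # test_import.py
    import os

    import openai
    import requests
    from dotenv import load_dotenv

    load_dotenv()
    print("All imports successful.")
    # Report only whether the key is present; avoid printing any part of it,
    # and avoid a TypeError when the variable is missing.
    print(f"API key loaded: {bool(os.getenv('OPENAI_API_KEY'))}")

Execute: `python test_import.py`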
Usage Examples
Now we’ll build a practical AI agent that uses tool calling to fetch current time and weather data. This agent will demonstrate the full decision-making process.
Example 1: A Simple Time and Weather Agent
Create a file `agent.py`:
    import json
    import os
    from datetime import datetime

    from dotenv import load_dotenv
    from openai import OpenAI

    load_dotenv()
    client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

    # Define our tools
    tools = [
        {
            "type": "function",
            "function": {
                "name": "get_current_time",
                "description": "Get the current date and time",
                "parameters": {"type": "object", "properties": {}}
            }
        },
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Get the current weather for a location",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "location": {
                            "type": "string",
                            "description": "City name, e.g., 'London'"
                        }
                    },
                    "required": ["location"]
                }
            }
        }
    ]

    # Tool implementations
    def get_current_time():
        return datetime.now().strftime("%Y-%m-%d %H:%M:%S")

    def get_weather(location):
        # Simulated weather data - in production, call a real API
        weather_data = {
            "London": "15°C, cloudy",
            "New York": "22°C, sunny",
            "Tokyo": "18°C, rainy"
        }
        return weather_data.get(location, "Weather data not available")

    def run_agent(user_query):
        messages = [{"role": "user", "content": user_query}]
        # First API call: model decides whether to call a tool
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # Cost-effective model with tool calling
            messages=messages,
            tools=tools,
            tool_choice="auto"  # Let the model decide
        )
        assistant_message = response.choices[0].message
        # Check if the model wants to call a tool
        if assistant_message.tool_calls:
            # Record the assistant turn once, before any tool results;
            # appending it per tool call would break the message sequence
            messages.append(assistant_message)
            for tool_call in assistant_message.tool_calls:
                function_name = tool_call.function.name
                arguments = json.loads(tool_call.function.arguments)
                print(f"Agent called: {function_name} with args: {arguments}")
                # Execute the appropriate tool
                if function_name == "get_current_time":
                    result = get_current_time()
                elif function_name == "get_weather":
                    result = get_weather(arguments["location"])
                else:
                    result = "Unknown tool"
                # Add the tool result to the conversation
                messages.append({
                    "role": "tool",
                    "tool_call_id": tool_call.id,
                    "content": str(result)
                })
            # Second API call: model generates final response with tool results
            final_response = client.chat.completions.create(
                model="gpt-4o-mini",
                messages=messages,
                tools=tools
            )
            return final_response.choices[0].message.content
        # If no tool call, return the direct response
        return assistant_message.content

    # Test the agent
    if __name__ == "__main__":
        print(run_agent("What's the current time?"))
        print("---")
        print(run_agent("What's the weather in London?"))
        print("---")
        print(run_agent("What's the weather in Tokyo and what time is it?"))

Run the agent:
    python agent.py

You should see output similar to the following (the timestamp, the wording of the answers, and the order of the tool calls can vary between runs):

    Agent called: get_current_time with args: {}
    The current time is 2025-03-25 14:32:18.
    ---
    Agent called: get_weather with args: {'location': 'London'}
    The weather in London is 15°C, cloudy.
    ---
    Agent called: get_weather with args: {'location': 'Tokyo'}
    Agent called: get_current_time with args: {}
    In Tokyo, the weather is 18°C and rainy. The current time is 2025-03-25 14:32:18.

Notice that the model requested two tools for the third query. This is the core of tool calling: the model autonomously chooses which tools to invoke and in what order.
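As the number of tools grows, an `if`/`elif` chain becomes harder to maintain. One common alternative, sketched here as a suggestion rather than as part of the agent above, is a registry that maps tool names to functions. The names `TOOL_REGISTRY` and `dispatch` are hypothetical; the two tool functions repeat the simulated ones from this example so the snippet runs on its own.

```python
import json
from datetime import datetime


def get_current_time() -> str:
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")


def get_weather(location: str) -> str:
    # Simulated data, as in the agent example.
    weather_data = {
        "London": "15°C, cloudy",
        "New York": "22°C, sunny",
        "Tokyo": "18°C, rainy",
    }
    return weather_data.get(location, "Weather data not available")


# Map tool names to implementations; adding a tool means adding one entry.
TOOL_REGISTRY = {
    "get_current_time": get_current_time,
    "get_weather": get_weather,
}


def dispatch(function_name: str, raw_arguments: str) -> str:
    """Look up a tool by name and call it with JSON-decoded keyword arguments."""
    handler = TOOL_REGISTRY.get(function_name)
    if handler is None:
        return f"Unknown tool: {function_name}"
    arguments = json.loads(raw_arguments or "{}")
    return str(handler(**arguments))


print(dispatch("get_weather", '{"location": "Tokyo"}'))
```

Inside the agent loop, `dispatch(tool_call.function.name, tool_call.function.arguments)` would then replace the branching code.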
Example 2: A Web Search Agent with Error Handling
Let’s build a more robust agent that searches the web (using a simulated search) and handles errors gracefully.
Create `search_agent.py`:
    import ast
    import json
    import operator
    import os

    from dotenv import load_dotenv
    from openai import OpenAI

    load_dotenv()
    client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

    tools = [
        {
            "type": "function",
            "function": {
                "name": "web_search",
                "description": "Search the web for current information",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "query": {
                            "type": "string",
                            "description": "Search query"
                        }
                    },
                    "required": ["query"]
                }
            }
        },
        {
            "type": "function",
            "function": {
                "name": "calculate",
                "description": "Perform a mathematical calculation",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "expression": {
                            "type": "string",
                            "description": "Mathematical expression, e.g., '2 + 2'"
                        }
                    },
                    "required": ["expression"]
                }
            }
        }
    ]

    def web_search(query):
        # Simulated search - in production, use a real search API.
        # Keys are lowercase because the query is lowercased before lookup.
        results = {
            "latest ai news": "AI agents are becoming more autonomous with tool calling capabilities.",
            "openai news": "OpenAI recently released GPT-4o with improved function calling.",
            "microsoft ai blog": "Microsoft announced new AI tools for enterprise customers."
        }
        return results.get(query.strip().lower(), f"No results found for '{query}'")

    # Arithmetic operators the calculator accepts; anything else is rejected
    ALLOWED_OPERATORS = {
        ast.Add: operator.add,
        ast.Sub: operator.sub,
        ast.Mult: operator.mul,
        ast.Div: operator.truediv,
        ast.USub: operator.neg,
    }

    def evaluate_node(node):
        # Walk the parsed expression instead of calling eval() on model output
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in ALLOWED_OPERATORS:
            return ALLOWED_OPERATORS[type(node.op)](
                evaluate_node(node.left), evaluate_node(node.right)
            )
        if isinstance(node, ast.UnaryOp) and type(node.op) in ALLOWED_OPERATORS:
            return ALLOWED_OPERATORS[type(node.op)](evaluate_node(node.operand))
        raise ValueError("Unsupported expression")

    def calculate(expression):
        try:
            tree = ast.parse(expression, mode="eval")
            return str(evaluate_node(tree.body))
        except Exception as e:
            return f"Error: {str(e)}"

    def run_agent_with_error_handling(user_query):
        messages = [{"role": "user", "content": user_query}]
        max_iterations = 3  # Prevent infinite loops
        for _ in range(max_iterations):
            response = client.chat.completions.create(
                model="gpt-4o-mini",
                messages=messages,
                tools=tools,
                tool_choice="auto"
            )
            assistant_message = response.choices[0].message
            if not assistant_message.tool_calls:
                # No more tools needed, return final response
                return assistant_message.content
            # Record the assistant turn once, before its tool results
            messages.append(assistant_message)
            for tool_call in assistant_message.tool_calls:
                function_name = tool_call.function.name
                try:
                    arguments = json.loads(tool_call.function.arguments)
                    print(f"Calling: {function_name}({arguments})")
                    if function_name == "web_search":
                        result = web_search(arguments["query"])
                    elif function_name == "calculate":
                        result = calculate(arguments["expression"])
                    else:
                        result = "Unknown tool"
                except (json.JSONDecodeError, KeyError) as e:
                    # Report malformed or missing arguments back to the model
                    result = f"Error: invalid arguments ({e})"
                messages.append({
                    "role": "tool",
                    "tool_call_id": tool_call.id,
                    "content": str(result)
                })
        return "Max iterations reached. Please try a simpler query."

    if __name__ == "__main__":
        print(run_agent_with_error_handling("Search for latest AI news"))
        print("---")
        print(run_agent_with_error_handling("What is 1234 * 5678?"))
        print("---")
        print(run_agent_with_error_handling("Search for OpenAI news and calculate 100/3"))

Run it:
    python search_agent.py

You’ll see the agent call tools over one or more iterations. Errors such as invalid expressions are returned to the model as tool results instead of crashing the program.
Advanced Considerations
Tool Descriptions Matter
The model relies heavily on tool descriptions to decide when to call a tool. A vague description like “Get data” will confuse the model. Be explicit: “Get the current weather for a given city. Returns temperature in Celsius and conditions.”
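The contrast looks like this in an OpenAI-style definition. The tool names are illustrative, and `is_description_too_short` is a hypothetical lint helper with an arbitrary word threshold; it is a rough heuristic of our own, not a rule from any provider.

```python
vague_tool = {
    "type": "function",
    "function": {
        "name": "get_data",
        "description": "Get data",
        "parameters": {"type": "object", "properties": {}},
    },
}

precise_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": (
            "Get the current weather for a given city. "
            "Returns temperature in Celsius and conditions."
        ),
        "parameters": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "City name, e.g., 'London'"}
            },
            "required": ["location"],
        },
    },
}


def is_description_too_short(tool: dict, min_words: int = 6) -> bool:
    """Heuristic: flag descriptions with fewer than min_words words."""
    description = tool["function"].get("description", "")
    return len(description.split()) < min_words


for tool in (vague_tool, precise_tool):
    print(tool["function"]["name"], is_description_too_short(tool))
```

A length check cannot tell whether a description is accurate, but it can catch placeholders before they reach the model.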
Parameter Schemas
Use clear, descriptive parameter names and include `required` fields. The model understands JSON Schema, so leverage features like `enum` for constrained values and `minimum`/`maximum` for numeric ranges.
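As a sketch of such a schema, the hypothetical forecast parameters below use `enum`, `minimum`, and `maximum`. Because a schema alone does not guarantee that the model's arguments comply, the snippet also includes a deliberately small hand-written checker, `validate_arguments`, which is our own helper and covers only those keywords plus `required`. For complete JSON Schema validation, a dedicated library such as `jsonschema` would be a better fit.

```python
forecast_parameters = {
    "type": "object",
    "properties": {
        "location": {"type": "string", "description": "City name, e.g., 'London'"},
        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        "days": {"type": "integer", "minimum": 1, "maximum": 7},
    },
    "required": ["location"],
}


def validate_arguments(arguments: dict, schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the arguments pass.

    Checks only `required`, `enum`, `minimum`, and `maximum`.
    """
    problems = []
    for name in schema.get("required", []):
        if name not in arguments:
            problems.append(f"missing required parameter: {name}")
    for name, value in arguments.items():
        spec = schema["properties"].get(name)
        if spec is None:
            problems.append(f"unexpected parameter: {name}")
            continue
        if "enum" in spec and value not in spec["enum"]:
            problems.append(f"{name} must be one of {spec['enum']}")
        if "minimum" in spec or "maximum" in spec:
            if not isinstance(value, (int, float)):
                problems.append(f"{name} must be a number")
                continue
            if "minimum" in spec and value < spec["minimum"]:
                problems.append(f"{name} must be >= {spec['minimum']}")
            if "maximum" in spec and value > spec["maximum"]:
                problems.append(f"{name} must be <= {spec['maximum']}")
    return problems


good = validate_arguments({"location": "London", "unit": "celsius", "days": 3}, forecast_parameters)
bad = validate_arguments({"unit": "kelvin", "days": 10}, forecast_parameters)
print(good)
print(bad)
```

Returning the list of problems to the model as the tool result gives it a chance to correct the call on the next turn.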
Security and Validation
Never trust tool call arguments blindly. Always validate inputs before executing the tool. For example, if a tool runs shell commands, sanitize the input to prevent injection attacks.
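As one way to apply this, the sketch below validates a model-supplied hostname before it goes anywhere near a command. The `build_ping_command` helper and the hostname pattern are illustrative assumptions, and the `-c` flag assumes a Linux or macOS `ping`. The function only builds the argument list, so nothing is executed here.

```python
import re

# Conservative pattern: starts and ends with a letter or digit,
# with only letters, digits, dots, and hyphens in between.
HOSTNAME_PATTERN = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9.-]{0,251}[A-Za-z0-9])?")


def build_ping_command(host: str) -> list[str]:
    """Validate a model-supplied host and return an argv list for subprocess.run."""
    if not isinstance(host, str) or not HOSTNAME_PATTERN.fullmatch(host):
        raise ValueError(f"Rejected host argument: {host!r}")
    # Passing an argv list (without shell=True) hands the host over as a single
    # argument, so shell metacharacters are never interpreted.
    return ["ping", "-c", "1", host]


print(build_ping_command("example.com"))

try:
    build_ping_command("example.com; rm -rf /")
except ValueError as error:
    print(error)
```

Two defenses are combined: an allowlist pattern rejects unexpected input, and the argv list avoids the shell entirely even if the pattern turns out to be too permissive.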
Cost Management
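Tool calling increases API costs because each round of tool calls requires an additional model request, and the tool definitions and tool results are sent as input tokens with every request. Use cheaper models (like `gpt-4o-mini`) for simple decisions, and where possible let the model request several tools in one response so that their results go back in a single follow-up request.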
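To see where tokens go, you can total the usage reported with each response. The sketch below assumes response objects that expose `usage.prompt_tokens` and `usage.completion_tokens`, as OpenAI's chat completion responses do. `UsageTracker` is a hypothetical helper, and the two stand-in responses carry made-up numbers so the snippet runs offline.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class UsageTracker:
    api_calls: int = 0
    prompt_tokens: int = 0
    completion_tokens: int = 0

    def record(self, response) -> None:
        """Add one response's token usage to the running totals."""
        self.api_calls += 1
        self.prompt_tokens += response.usage.prompt_tokens
        self.completion_tokens += response.usage.completion_tokens


# Stand-in responses with made-up token counts (not real measurements).
first = SimpleNamespace(usage=SimpleNamespace(prompt_tokens=120, completion_tokens=18))
second = SimpleNamespace(usage=SimpleNamespace(prompt_tokens=165, completion_tokens=40))

tracker = UsageTracker()
tracker.record(first)   # request in which the model asks for a tool
tracker.record(second)  # follow-up request carrying the tool result
print(tracker)
```

Calling `tracker.record(response)` after each `client.chat.completions.create(...)` call in an agent loop would show how much each extra round trip adds.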
Conclusion
Tool calling is the engine that powers modern AI agents. By allowing LLMs to autonomously decide when and how to invoke external functions, we unlock capabilities far beyond text generation: real-time data access, computation, automation, and integration with existing systems.
In this article, we walked through the complete process: from understanding the decision-making mechanism to building working Python agents that call tools for time, weather, and search. The key takeaway is that the model *chooses* which tool to use based on your descriptions—your job is to define the tools clearly and handle the results safely.
As you build your own agents, remember these principles:
- **Describe tools precisely** so the model understands their purpose.
- **Validate all inputs** before executing tools.
- **Handle errors gracefully**—tools can fail.
- **Iterate and test**—tool calling improves with better definitions and prompt engineering.
The future of AI agents is autonomous, tool-enabled, and increasingly capable. With the steps and examples provided here, you’re now equipped to build agents that don’t just talk—they act.



