Tool Calling, Explained: How AI Agents Decide What to Do Next

Discover how AI agents use tool calling to decide their next action. This article breaks down the decision-making process, from function selection to execution, with practical examples.

Large language models (LLMs) have evolved far beyond simple text generation. Today’s AI agents don’t just answer questions—they take actions. They search the web, run code, query databases, and control APIs. This capability is made possible by a mechanism called **tool calling** (also known as function calling). In this article, we’ll demystify tool calling: what it is, how it works under the hood, and how you can implement it in your own AI agents with concrete, step-by-step instructions.

What Is Tool Calling?

Tool calling is the process by which an LLM decides to invoke an external function or API, rather than generating a purely textual response. The model outputs a structured request (usually in JSON) that specifies which tool to use and with what parameters. The calling application then executes the tool, returns the result to the model, and the model incorporates that result into its final answer.

This mechanism transforms LLMs from passive text generators into active problem-solvers. For example, when you ask an AI agent “What’s the weather in London?”, the model might call a `get_weather` tool with the parameter `location="London"`, receive the current weather data, and then compose a human-readable response like “The weather in London is 15°C and cloudy.”
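
To make that round trip concrete, here is a minimal sketch of the application side: it parses a structured request and runs the matching function. The JSON shape (`tool` and `args` keys), the `TOOLS` table, and the stubbed `get_weather` are assumptions for illustration, not any provider’s actual wire format.

```python
import json


def get_weather(location: str) -> str:
    # Hypothetical stub: a real tool would query a weather service.
    return f"15°C and cloudy in {location}"


# Map tool names (as the model would emit them) to Python callables.
TOOLS = {"get_weather": get_weather}


def execute_tool_request(raw_request: str) -> str:
    """Parse a JSON tool request and run the matching function."""
    request = json.loads(raw_request)
    tool = TOOLS[request["tool"]]  # raises KeyError for an unknown tool
    return tool(**request["args"])


# Pretend the model produced this structured request instead of prose.
model_output = '{"tool": "get_weather", "args": {"location": "London"}}'
result = execute_tool_request(model_output)
print(result)
```

In a real agent, `result` would be sent back to the model so it can phrase the final answer.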

Why Tool Calling Matters

Without tool calling, an LLM can only rely on its training data, which becomes stale over time. With tool calling, agents can:

  • Access real-time information (stock prices, news, weather)
  • Perform calculations or run code
  • Interact with databases and internal systems
  • Trigger workflows and send notifications
  • Retrieve private or domain-specific data

Both OpenAI and Anthropic expose tool calling in their APIs, and it is a foundational capability for building reliable, autonomous AI agents. It bridges the gap between language understanding and real-world action.

How Tool Calling Works: The Decision Process

The core decision process involves three steps:

1. **The model receives a user query and a list of available tools** (each defined by a name, description, and parameter schema).
2. **The model analyzes the query and decides if a tool is needed.** If yes, it outputs a structured request (e.g., `{"tool": "search_web", "args": {"query": "latest AI news"}}`).
3. **The application executes the tool, returns the result to the model, and the model generates a final response** incorporating the tool’s output.

This decision is not hard-coded; it emerges from the LLM’s training. The model has learned to recognize when a question requires external data and to output the appropriate tool call. The key is that the model *chooses* to call a tool; the developer supplies the tool definitions and executes whatever call the model requests.
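
The three steps can be sketched without any API access by swapping the LLM for a stand-in. In the sketch below, `fake_model` is a hypothetical keyword rule used only to show the control flow; a real model makes this decision from its training, not from an `if` statement. The `search_web` stub and the message shapes are likewise assumptions.

```python
def fake_model(messages, tool_names):
    """Stand-in for an LLM: a hypothetical rule, not real model behaviour."""
    last = messages[-1]
    if last["role"] == "tool":
        # Step 3: compose a final answer from the tool output.
        return {"content": f"Here is what I found: {last['content']}"}
    if "news" in last["content"].lower() and "search_web" in tool_names:
        # Step 2: the "model" decides a tool is needed.
        return {"tool": "search_web", "args": {"query": last["content"]}}
    return {"content": "I can answer that directly."}


def search_web(query):
    # Hypothetical stub standing in for a real search API.
    return f"(stub results for '{query}')"


TOOLS = {"search_web": search_web}


def answer(user_query):
    # Step 1: the model receives the query and the available tools.
    messages = [{"role": "user", "content": user_query}]
    reply = fake_model(messages, list(TOOLS))
    if "tool" in reply:
        # The application, not the model, executes the requested tool.
        output = TOOLS[reply["tool"]](**reply["args"])
        messages.append({"role": "tool", "content": output})
        reply = fake_model(messages, list(TOOLS))
    return reply["content"]


print(answer("latest AI news"))
print(answer("hello"))
```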

Requirements

Before we dive into implementation, ensure you have the following:

  • **Python 3.9+** installed on your system
  • **An API key** from OpenAI, Anthropic, or another provider that supports function calling
  • **Basic familiarity** with the command line and Python
  • **pip** (Python package manager) up to date

We’ll use OpenAI’s API as our primary example, as it offers mature tool calling support. The concepts apply equally to Anthropic’s Claude and other models.

Step-by-Step Installation

1. Set Up a Python Virtual Environment

Isolate your project dependencies to avoid conflicts.

python -m venv tool-calling-env

Activate the environment:

  • On macOS/Linux: `source tool-calling-env/bin/activate`
  • On Windows: `tool-calling-env\Scripts\activate`

2. Install the OpenAI Python Library

This library provides the client for interacting with OpenAI’s API.

pip install openai

3. Install Additional Dependencies

We’ll need `requests` for making HTTP calls and `python-dotenv` to manage our API key securely.

pip install requests python-dotenv

4. Set Up Your API Key

Create a `.env` file in your project root:

OPENAI_API_KEY=your-api-key-here

Replace `your-api-key-here` with your actual OpenAI API key. Never hard-code keys in your source code.

5. Verify Installation

Run a quick check to ensure everything works.

# test_import.py
import openai
import requests
from dotenv import load_dotenv
import os

load_dotenv()
print("All imports successful.")
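# Report only whether the key is present; slicing a missing key would raise
# a TypeError, and printing part of a secret is best avoided.
print(f"API key loaded: {'yes' if os.getenv('OPENAI_API_KEY') else 'no'}")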

Execute: `python test_import.py`

Usage Examples

Now we’ll build a practical AI agent that uses tool calling to fetch current time and weather data. This agent will demonstrate the full decision-making process.

Example 1: A Simple Time and Weather Agent

Create a file `agent.py`:

import json
import os
from datetime import datetime
from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

# Define our tools
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_current_time",
            "description": "Get the current date and time",
            "parameters": {"type": "object", "properties": {}}
        }
    },
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather for a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "City name, e.g., 'London'"
                    }
                },
                "required": ["location"]
            }
        }
    }
]

# Tool implementations
def get_current_time():
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")

def get_weather(location):
    # Simulated weather data - in production, call a real API
    weather_data = {
        "London": "15°C, cloudy",
        "New York": "22°C, sunny",
        "Tokyo": "18°C, rainy"
    }
    return weather_data.get(location, "Weather data not available")

def run_agent(user_query):
    messages = [{"role": "user", "content": user_query}]
    
    # First API call: model decides whether to call a tool
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # Cost-effective model with tool calling
        messages=messages,
        tools=tools,
        tool_choice="auto"  # Let the model decide
    )
    
    assistant_message = response.choices[0].message
    
    # Check if the model wants to call a tool
    if assistant_message.tool_calls:
        # Record the assistant turn once, before any tool results.
        # Appending it inside the loop would duplicate it for multi-tool replies.
        messages.append(assistant_message)

        for tool_call in assistant_message.tool_calls:
            function_name = tool_call.function.name
            arguments = json.loads(tool_call.function.arguments)

            print(f"Agent called: {function_name} with args: {arguments}")

            # Execute the appropriate tool
            if function_name == "get_current_time":
                result = get_current_time()
            elif function_name == "get_weather":
                result = get_weather(arguments["location"])
            else:
                result = "Unknown tool"

            # Add this tool's result, linked to its call by ID
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": str(result)
            })
        
        # Second API call: model generates final response with tool results
        final_response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages,
            tools=tools
        )
        
        return final_response.choices[0].message.content
    
    # If no tool call, return the direct response
    return assistant_message.content

# Test the agent
if __name__ == "__main__":
    print(run_agent("What's the current time?"))
    print("---")
    print(run_agent("What's the weather in London?"))
    print("---")
    print(run_agent("What's the weather in Tokyo and what time is it?"))

Run the agent:

python agent.py

You should see output like the following (the exact wording, timestamps, and order of tool calls will vary between runs):

Agent called: get_current_time with args: {}
The current time is 2025-03-25 14:32:18.
---
Agent called: get_weather with args: {'location': 'London'}
The weather in London is 15°C, cloudy.
---
Agent called: get_weather with args: {'location': 'Tokyo'}
Agent called: get_current_time with args: {}
In Tokyo, the weather is 18°C and rainy. The current time is 2025-03-25 14:32:18.

Notice how the model requested two tools for the third query, both in a single response. This is the core of tool calling: the model chooses which tools to invoke and with what arguments, and the application runs them and reports the results back.
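
The `if`/`elif` chain in `run_agent` grows by one branch per tool. One possible alternative is a lookup table keyed by tool name. The sketch below is self-contained; `TOOL_REGISTRY` and `dispatch` are hypothetical names introduced here, and the tool bodies are trimmed copies of the ones in `agent.py`.

```python
import json
from datetime import datetime


def get_current_time():
    return datetime.now().strftime("%Y-%m-%d %H:%M:%S")


def get_weather(location):
    # Simulated data, as in agent.py
    return {"London": "15°C, cloudy"}.get(location, "Weather data not available")


# One entry per tool; adding a tool means adding one line here.
TOOL_REGISTRY = {
    "get_current_time": get_current_time,
    "get_weather": get_weather,
}


def dispatch(function_name, raw_arguments):
    """Look up a tool by name and call it with JSON-decoded arguments."""
    tool = TOOL_REGISTRY.get(function_name)
    if tool is None:
        return "Unknown tool"
    arguments = json.loads(raw_arguments or "{}")
    return str(tool(**arguments))


print(dispatch("get_weather", '{"location": "London"}'))
print(dispatch("delete_everything", "{}"))
```

Inside the loop, the branch chain would then reduce to a single call such as `dispatch(tool_call.function.name, tool_call.function.arguments)`.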

Example 2: A Web Search Agent with Error Handling

Let’s build a more robust agent that searches the web (using a simulated search) and handles errors gracefully.

Create `search_agent.py`:

import json
import os
from dotenv import load_dotenv
from openai import OpenAI

load_dotenv()
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

tools = [
    {
        "type": "function",
        "function": {
            "name": "web_search",
            "description": "Search the web for current information",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "Search query"
                    }
                },
                "required": ["query"]
            }
        }
    },
    {
        "type": "function",
        "function": {
            "name": "calculate",
            "description": "Perform a mathematical calculation",
            "parameters": {
                "type": "object",
                "properties": {
                    "expression": {
                        "type": "string",
                        "description": "Mathematical expression, e.g., '2 + 2'"
                    }
                },
                "required": ["expression"]
            }
        }
    }
]

def web_search(query):
    # Simulated search - in production, use a real search API
    # Keys are lowercase so the case-insensitive lookup below can match them
    results = {
        "latest ai news": "AI agents are becoming more autonomous with tool calling capabilities.",
        "openai news": "OpenAI recently released GPT-4o with improved function calling.",
        "microsoft ai blog": "Microsoft announced new AI tools for enterprise customers."
    }
    return results.get(query.strip().lower(), f"No results found for '{query}'")

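def calculate(expression):
    # WARNING: eval() on model-supplied text can execute arbitrary code.
    # The character whitelist below is a minimal guard for this demo only;
    # use a real expression parser in production.
    allowed = set("0123456789+-*/(). ")
    if not set(expression) <= allowed or "**" in expression:
        return "Error: unsupported characters in expression"
    try:
        result = eval(expression, {"__builtins__": {}}, {})
        return str(result)
    except Exception as e:
        return f"Error: {str(e)}"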

def run_agent_with_error_handling(user_query):
    messages = [{"role": "user", "content": user_query}]
    max_iterations = 3  # Prevent infinite loops
    
    for _ in range(max_iterations):
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages,
            tools=tools,
            tool_choice="auto"
        )
        
        assistant_message = response.choices[0].message
        
        if not assistant_message.tool_calls:
            # No more tools needed, return final response
            return assistant_message.content
        
        # Record the assistant turn once, before any tool results
        messages.append(assistant_message)

        for tool_call in assistant_message.tool_calls:
            function_name = tool_call.function.name
            arguments = json.loads(tool_call.function.arguments)

            print(f"Calling: {function_name}({arguments})")

            if function_name == "web_search":
                result = web_search(arguments["query"])
            elif function_name == "calculate":
                result = calculate(arguments["expression"])
            else:
                result = "Unknown tool"

            # Add this tool's result, linked to its call by ID
            messages.append({
                "role": "tool",
                "tool_call_id": tool_call.id,
                "content": result
            })
    
    return "Max iterations reached. Please try a simpler query."

if __name__ == "__main__":
    print(run_agent_with_error_handling("Search for latest AI news"))
    print("---")
    print(run_agent_with_error_handling("What is 1234 * 5678?"))
    print("---")
    print(run_agent_with_error_handling("Search for OpenAI news and calculate 100/3"))

Run it:

python search_agent.py

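You’ll see the agent call one or more tools per query. The loop keeps going until the model stops requesting tools or `max_iterations` is reached, and `calculate` returns an error string rather than raising an exception when an expression is invalid.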
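
Tools can also fail before they run, for instance when the model emits malformed JSON or leaves out an argument. One way to keep the loop alive is to convert every failure into a string that is returned to the model as the tool result. The sketch below assumes that approach; `safe_execute` and `divide` are hypothetical helpers, not part of the OpenAI library.

```python
import json


def safe_execute(tool, raw_arguments):
    """Run a tool and turn any failure into a message the model can read."""
    try:
        arguments = json.loads(raw_arguments)
    except json.JSONDecodeError as exc:
        return f"Error: arguments were not valid JSON ({exc.msg})"
    if not isinstance(arguments, dict):
        return "Error: arguments must be a JSON object"
    try:
        return str(tool(**arguments))
    except TypeError as exc:
        # Typically raised for missing or unexpected keyword arguments.
        return f"Error: bad arguments ({exc})"
    except Exception as exc:
        return f"Error: tool failed ({exc})"


def divide(a, b):
    # Hypothetical tool used only to exercise the wrapper.
    return a / b


print(safe_execute(divide, '{"a": 6, "b": 3}'))
print(safe_execute(divide, '{"a": 1, "b": 0}'))
print(safe_execute(divide, 'not json'))
```

Returning the error text as the tool result gives the model a chance to retry with corrected arguments or to explain the failure to the user.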

Advanced Considerations

Tool Descriptions Matter

The model relies heavily on tool descriptions to decide when to call a tool. A vague description like “Get data” will confuse the model. Be explicit: “Get the current weather for a given city. Returns temperature in Celsius and conditions.”
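
As an illustration, here are the two descriptions side by side in the same definition format used by the examples in this article. Both definitions are hypothetical, and `get_data` is an invented name.

```python
# Hypothetical definitions for comparison only.
vague_tool = {
    "type": "function",
    "function": {
        "name": "get_data",
        "description": "Get data",
        "parameters": {"type": "object", "properties": {}},
    },
}

explicit_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": (
            "Get the current weather for a given city. "
            "Returns temperature in Celsius and conditions."
        ),
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City name, e.g., 'London'",
                }
            },
            "required": ["location"],
        },
    },
}

# The explicit version says what the tool does, for what input, and what it returns.
for tool in (vague_tool, explicit_tool):
    fn = tool["function"]
    print(f"{fn['name']}: {fn['description']}")
```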

Parameter Schemas

Use clear, descriptive parameter names and include `required` fields. The model understands JSON Schema, so leverage features like `enum` for constrained values and `minimum`/`maximum` for numeric ranges.
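
A sketch of what such a schema might look like, with a small hand-written check on the application side. The `units` and `days` parameters are hypothetical, and `check_arguments` covers only `required`, `enum`, `minimum`, and `maximum`; it is not a full JSON Schema validator.

```python
FORECAST_PARAMETERS = {
    "type": "object",
    "properties": {
        "location": {"type": "string", "description": "City name, e.g., 'London'"},
        "units": {"type": "string", "enum": ["celsius", "fahrenheit"]},
        "days": {"type": "integer", "minimum": 1, "maximum": 7},
    },
    "required": ["location"],
}


def check_arguments(arguments, schema):
    """Return a list of problems found in the model-supplied arguments."""
    problems = [
        f"missing required field: {name}"
        for name in schema.get("required", [])
        if name not in arguments
    ]
    for name, value in arguments.items():
        rules = schema["properties"].get(name)
        if rules is None:
            problems.append(f"unexpected field: {name}")
            continue
        if "enum" in rules and value not in rules["enum"]:
            problems.append(f"{name} must be one of {rules['enum']}")
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            if "minimum" in rules and value < rules["minimum"]:
                problems.append(f"{name} must be >= {rules['minimum']}")
            if "maximum" in rules and value > rules["maximum"]:
                problems.append(f"{name} must be <= {rules['maximum']}")
    return problems


print(check_arguments({"location": "London", "units": "celsius", "days": 3}, FORECAST_PARAMETERS))
print(check_arguments({"units": "kelvin", "days": 9}, FORECAST_PARAMETERS))
```

The schema guides the model toward valid values, but checking again in the application is still worthwhile because the model’s output is not guaranteed to conform.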

Security and Validation

Never trust tool call arguments blindly. Always validate inputs before executing the tool. For example, if a tool runs shell commands, sanitize the input to prevent injection attacks.
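
The same concern applies to a calculator tool that passes model-supplied text to `eval`. A possible replacement, sketched below, parses the expression with Python’s `ast` module and allows only numbers and basic arithmetic. The name `safe_calculate` and the choice of permitted operators are assumptions; exponentiation is left out on purpose because it can produce extremely large results.

```python
import ast
import operator

# Only these operators are permitted; anything else is rejected.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def _evaluate(node):
    """Recursively evaluate a parsed expression made of numbers and + - * /."""
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
        return _BINARY[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
        return _UNARY[type(node.op)](_evaluate(node.operand))
    raise ValueError("unsupported expression")


def safe_calculate(expression):
    """Evaluate simple arithmetic without executing arbitrary code."""
    try:
        tree = ast.parse(expression, mode="eval")
        return str(_evaluate(tree.body))
    except (SyntaxError, ValueError, ZeroDivisionError) as exc:
        return f"Error: {exc}"


print(safe_calculate("1234 * 5678"))
print(safe_calculate("__import__('os').system('ls')"))
```

Function calls, attribute access, and names never reach an evaluator, so the second call returns an error string instead of running a shell command.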

Cost Management

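Tool calling increases API costs: the tool definitions are sent as input tokens with every request, and each round of tool calls requires an additional request that carries the whole conversation so far. Use cheaper models (like `gpt-4o-mini`) for simple decisions, and when the model requests several tools in one response, return all of their results in a single follow-up request.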

Conclusion

Tool calling is the engine that powers modern AI agents. By allowing LLMs to autonomously decide when and how to invoke external functions, we unlock capabilities far beyond text generation: real-time data access, computation, automation, and integration with existing systems.

In this article, we walked through the complete process: from understanding the decision-making mechanism to building working Python agents that call tools for time, weather, and search. The key takeaway is that the model *chooses* which tool to use based on your descriptions—your job is to define the tools clearly and handle the results safely.

As you build your own agents, remember these principles:

  • **Describe tools precisely** so the model understands their purpose.
  • **Validate all inputs** before executing tools.
  • **Handle errors gracefully**—tools can fail.
  • **Iterate and test**—tool calling improves with better definitions and prompt engineering.

The future of AI agents is autonomous, tool-enabled, and increasingly capable. With the steps and examples provided here, you’re now equipped to build agents that don’t just talk—they act.
