Structured Outputs with LLMs: JSON Mode, Function Calling, and When to Use Each

Learn how to get structured data from large language models using JSON mode and function calling. This guide compares both approaches with practical examples and helps you choose the right method for your application.

Large language models (LLMs) have become indispensable tools for developers, but extracting reliable, structured data from their free-form text outputs remains a persistent challenge. Whether you're building a chatbot, an automated data pipeline, or a decision-support system, you need your LLM to return data in a predictable format. This article explores two primary techniques for achieving structured outputs: JSON Mode and Function Calling. We'll compare their mechanics and use cases, then give practical guidance on when to use each.

Understanding the Challenge

LLMs generate text token by token, so nothing in the generation process itself guarantees a particular structure. A simple prompt like "List the capital cities of Europe" might yield a bulleted list, a table, or a paragraph depending on the model's training and on sampling randomness. For production systems, this variability is unacceptable. You need a schema: a predefined structure that the model must adhere to. JSON Mode and Function Calling both push the model toward such a structure, but they work differently and suit different scenarios.

Requirements

Before diving into implementation, ensure you have the following:

  • Python 3.8 or later installed on your system.
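  • Access to an LLM API that supports structured outputs. We'll use OpenAI with GPT-4 Turbo as the reference implementation; JSON Mode in particular requires a model that accepts the `response_format` parameter, so check your model's documentation if you substitute another one. Other providers (e.g., Anthropic, Google) offer similar capabilities under different parameter names.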
  • An API key from your LLM provider. For OpenAI, sign up at platform.openai.com and create a key.
  • Basic familiarity with Python and command-line tools.

Step-by-Step Installation

We'll set up a Python environment and install the necessary libraries. These steps assume you're using a Unix-like terminal (Linux, macOS, or WSL on Windows).

1. Create a Virtual Environment

Isolating dependencies prevents conflicts with other projects. Run the following commands:

python3 -m venv llm-structured
source llm-structured/bin/activate

The first command creates a virtual environment named `llm-structured`. The second activates it, so any Python packages you install will be contained within this environment.

2. Install the OpenAI Python Library

The OpenAI library provides a clean interface for interacting with their API. Install it with pip:

pip install openai

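This installs the latest version of the `openai` package, which includes support for JSON Mode and Function Calling. The examples in this guide use the `openai.OpenAI()` client interface introduced in version 1.0 of the library, so an older installation needs upgrading first (`pip install --upgrade openai`).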

3. Set Your API Key

You need to authenticate with OpenAI. Set your API key as an environment variable for security:

export OPENAI_API_KEY="your-api-key-here"

Replace `your-api-key-here` with your actual key. For persistent use, add this line to your `.bashrc` or `.zshrc` file.
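A script can also fail early with a clear message when the variable is missing, rather than surfacing an authentication error later. The following is a minimal sketch; `require_api_key` is a hypothetical helper name used for illustration, not part of the OpenAI library:

```python
import os


def require_api_key(env=os.environ, name="OPENAI_API_KEY"):
    """Return the API key from the environment, or raise a clear error."""
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before running the examples.")
    return key
```

Calling `require_api_key()` at the top of a script is enough; `openai.OpenAI()` reads `OPENAI_API_KEY` from the environment on its own, so the key does not need to be passed explicitly.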

4. Verify Installation

Test that everything works by running a simple script:

python3 -c "import openai; print('OpenAI library installed successfully')"

If no errors appear, you're ready to proceed.
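Because the examples rely on the `openai.OpenAI()` client, it can help to check which version is installed as well. This is a small sketch using only the standard library; the helper names `installed_version` and `major_version` are illustrative:

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def major_version(version_string):
    """Extract the leading major version number from a string such as '1.30.1'."""
    return int(version_string.split(".")[0])


version = installed_version("openai")
if version is None:
    print("openai is not installed in this environment")
elif major_version(version) < 1:
    print(f"openai {version} found; the examples here assume 1.0 or later")
else:
    print(f"openai {version} found")
```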

JSON Mode: Structured Outputs Without Functions

JSON Mode instructs the LLM to respond with a syntactically valid JSON object. It does not check the response against a schema; the structure you want is conveyed through the prompt. It's simple, flexible, and works with standard chat completions.

How JSON Mode Works

When you enable JSON Mode, the model is constrained to generate output that parses as JSON. You describe the expected structure in the system message or user prompt, typically by listing the keys or giving a schema description. With OpenAI's API, the word "JSON" must also appear somewhere in the messages, or the request is rejected. The model does not execute any code; it merely formats its response as JSON.

Usage Example: Extracting Product Information

Suppose you want to extract product details from a user's description. Here's a complete Python script:

import openai
import json

client = openai.OpenAI()

response = client.chat.completions.create(
    model="gpt-4-turbo",
    response_format={"type": "json_object"},
    messages=[
        {"role": "system", "content": "Extract product information into JSON with keys: name, price, category, in_stock."},
        {"role": "user", "content": "I bought a premium wireless mouse for $49.99 from the electronics section. It's currently in stock."}
    ]
)

# Parse the JSON response
product_data = json.loads(response.choices[0].message.content)
print(product_data)

**Explanation:**

  • The `response_format={"type": "json_object"}` parameter activates JSON Mode.
  • The system message tells the model the expected JSON structure.
  • The model returns a string that you can parse with `json.loads`.
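The parsing step can be made a little more defensive. The sketch below assumes you pass in the reply text and the choice's `finish_reason` from the response; `parse_json_reply` is a hypothetical helper, not an OpenAI function:

```python
import json


def parse_json_reply(content, finish_reason):
    """Parse a JSON Mode reply, rejecting truncated or non-object output."""
    if finish_reason == "length":
        # The reply hit the token limit, so the JSON may be cut off mid-object.
        raise ValueError("Response was truncated; raise max_tokens or shorten the prompt.")
    try:
        data = json.loads(content)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("Expected a JSON object at the top level.")
    return data
```

In the script above, this would replace the bare `json.loads` call: `parse_json_reply(response.choices[0].message.content, response.choices[0].finish_reason)`.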

When to Use JSON Mode

JSON Mode is ideal when:

  • You need a simple, flat data structure (e.g., a single object with a few fields).
  • The output schema is static and doesn't depend on user input.
  • You want minimal overhead—no need to define functions or handle tool calls.
  • The model's response is the final output, not an intermediate step.

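**Limitations:** JSON Mode guarantees valid JSON syntax, not adherence to your schema. Keys may be missing, misnamed, or of the wrong type, and the risk grows with nested objects or arrays with conditional structures. The output can also be cut off mid-object if the response reaches the token limit, so check for truncation and handle parse errors.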
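Since the schema lives only in the prompt, a lightweight check after parsing catches most surprises. The sketch below assumes `price` should be a number and `in_stock` a boolean; the original prompt names the keys but not their types, so `PRODUCT_FIELDS` and `validate_product` are illustrative assumptions:

```python
import json

# Expected keys and Python types (assumed; the prompt only names the keys).
PRODUCT_FIELDS = {
    "name": str,
    "price": (int, float),
    "category": str,
    "in_stock": bool,
}


def validate_product(data):
    """Return a list of problems found in a parsed product dict (empty if none)."""
    problems = []
    for key, expected in PRODUCT_FIELDS.items():
        if key not in data:
            problems.append(f"missing key: {key}")
        elif isinstance(data[key], bool) and expected is not bool:
            # bool is a subclass of int in Python, so reject it for other fields.
            problems.append(f"wrong type for {key}: bool")
        elif not isinstance(data[key], expected):
            problems.append(f"wrong type for {key}: {type(data[key]).__name__}")
    return problems


# Hypothetical reply in which the model returned the price as a string.
sample = json.loads(
    '{"name": "Wireless Mouse", "price": "49.99", "category": "electronics", "in_stock": true}'
)
print(validate_product(sample))  # ['wrong type for price: str']
```

For larger schemas, a dedicated validation library is usually a better fit than hand-written checks like these.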

Function Calling: Executing Structured Actions

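Function Calling (also known as tool use) allows the LLM to request the execution of predefined functions. The model returns a structured JSON object that specifies which function to call and with what arguments; the model never runs the function itself. This is more powerful than JSON Mode because the model can decide when to invoke a function based on conversation context. The examples here use OpenAI's original `functions` and `function_call` parameters, which OpenAI has since deprecated in favor of `tools` and `tool_choice`; the concepts are the same.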

How Function Calling Works

You define functions with parameters and descriptions. The model evaluates the conversation and, if appropriate, returns a function call request. Your code then executes the function and passes the result back to the model for further processing.
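The "your code then executes the function" step is often implemented as a lookup from function name to a local callable. The following is a minimal sketch of that idea; `REGISTRY` and `dispatch` are hypothetical names, and `get_weather` is the same kind of stand-in used in the example that follows:

```python
import json


def get_weather(city, unit="celsius"):
    """Stand-in for a real weather lookup."""
    return f"The weather in {city} is 22° {unit}."


# Registry mapping function names the model may request to local callables.
REGISTRY = {"get_weather": get_weather}


def dispatch(name, arguments_json, registry=REGISTRY):
    """Run the requested function and return its result as a string."""
    if name not in registry:
        return f"Error: unknown function {name!r}"
    try:
        arguments = json.loads(arguments_json)
        return str(registry[name](**arguments))
    except (json.JSONDecodeError, TypeError) as exc:
        # Malformed JSON or unexpected/missing parameters end up here.
        return f"Error: bad arguments for {name}: {exc}"
```

Returning an error string instead of raising lets you send the problem back to the model as the function result, which gives it a chance to correct the call.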

Usage Example: Retrieving Weather Data

Let's create a weather assistant that uses Function Calling:

import openai
import json

client = openai.OpenAI()

# Define a function for the model to call
functions = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "The city name, e.g., San Francisco"
                },
                "unit": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "description": "Temperature unit"
                }
            },
            "required": ["city"]
        }
    }
]

def get_weather(city, unit="celsius"):
    """Simulate a weather API call"""
    return f"The weather in {city} is 22° {unit}."

# Send the conversation
response = client.chat.completions.create(
    model="gpt-4-turbo",
    messages=[
        {"role": "user", "content": "What's the weather in Paris?"}
    ],
    functions=functions,
    function_call="auto"
)

message = response.choices[0].message

# Check if the model wants to call a function
if message.function_call:
    function_name = message.function_call.name
    arguments = json.loads(message.function_call.arguments)
    
    # Execute the function
    result = get_weather(arguments["city"], arguments.get("unit", "celsius"))
    
    # Send the function result back to the model
    second_response = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[
            {"role": "user", "content": "What's the weather in Paris?"},
            message,
            {
                "role": "function",
                "name": function_name,
                "content": result
            }
        ],
        functions=functions
    )
    
    print(second_response.choices[0].message.content)
else:
    print(message.content)

**Explanation:**

  • The `functions` list defines one function with its parameters and descriptions.
  • `function_call="auto"` lets the model decide when to call the function.
  • The model returns a `function_call` object with the function name and arguments.
  • You execute the function and feed the result back to the model for a human-readable response.
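The `functions` and `function_call` parameters used above are the original form of this feature; OpenAI has since deprecated them in favor of `tools` and `tool_choice`. The function definitions themselves carry over unchanged, wrapped in one extra layer. A small sketch of that conversion, where `to_tools` is a hypothetical helper name:

```python
def to_tools(functions):
    """Wrap legacy function definitions in the `tools` format."""
    return [{"type": "function", "function": fn} for fn in functions]


legacy = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }
]

tools = to_tools(legacy)
print(tools[0]["type"], tools[0]["function"]["name"])  # function get_weather
```

With this form, the request takes `tools=tools` and `tool_choice="auto"`, the reply exposes a `tool_calls` list rather than a single `function_call`, and each result is sent back as a message with role `tool` and the matching `tool_call_id`. Check the current API reference before migrating, since details may differ between library versions.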

When to Use Function Calling

Function Calling is best when:

  • You need the model to interact with external systems (e.g., APIs, databases).
  • The output schema is dynamic or depends on conversation context.
  • You want the model to perform multi-step reasoning (e.g., calling a function, then using the result in a follow-up call).
  • The structured output is an intermediate step in a larger workflow.

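**Limitations:** Function Calling requires more code to handle the call-return cycle, and the arguments the model supplies are generated text that should still be validated. It also adds latency whenever you send the function result back to the model, because each round trip is another API call.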
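Part of that extra code is checking the arguments before executing anything, since the model may omit a required field or pick a value outside an `enum`. The sketch below covers only `required`, unknown keys, and `enum`; `check_arguments` is a hypothetical helper and not a full JSON Schema validator:

```python
def check_arguments(arguments, parameters):
    """Return a list of problems with model-supplied arguments (empty if none)."""
    problems = []
    properties = parameters.get("properties", {})
    for name in parameters.get("required", []):
        if name not in arguments:
            problems.append(f"missing required argument: {name}")
    for name, value in arguments.items():
        if name not in properties:
            problems.append(f"unexpected argument: {name}")
            continue
        allowed = properties[name].get("enum")
        if allowed is not None and value not in allowed:
            problems.append(f"invalid value for {name}: {value!r}")
    return problems


weather_parameters = {
    "type": "object",
    "properties": {
        "city": {"type": "string"},
        "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
    },
    "required": ["city"],
}

print(check_arguments({"unit": "kelvin"}, weather_parameters))
# ['missing required argument: city', "invalid value for unit: 'kelvin'"]
```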

Comparing JSON Mode and Function Calling

| Feature | JSON Mode | Function Calling |
|---|---|---|
| Output structure | Single JSON object | JSON arguments for a named function |
| Schema enforcement | Via prompt instructions | Via explicit function definitions |
| Complexity | Low | Medium to high |
| Latency | One API call | Two or more API calls when results are fed back |
| Dynamic schemas | Limited | Fully supported |
| External interactions | No | Yes (via function execution in your code) |
| Error handling | Manual (parse and validate JSON) | Manual (parse and validate arguments, handle missing calls) |

Choosing the Right Approach

The decision hinges on your use case:

  • **Use JSON Mode** for simple data extraction tasks where the output is the final product. Examples: parsing user queries, generating configuration files, or formatting logs.
  • **Use Function Calling** when the LLM needs to trigger real-world actions or when the output schema depends on user input. Examples: building a chatbot that books appointments, querying a database, or orchestrating multi-step workflows.

Hybrid Approach

You can combine both techniques. For instance, use Function Calling to retrieve data from an API, then ask the model to format the result as a specific JSON structure using JSON Mode. This gives you the best of both worlds: external interaction plus consistent formatting.

Best Practices

1. **Always validate output**: Even with structured modes, models can make mistakes. Use `try-except` blocks when parsing JSON and validate fields against your schema.
2. **Provide clear instructions**: The more specific you are about the expected structure, the more reliable the output. Include examples in the system message.
3. **Handle edge cases**: Function Calling may return multiple function calls or none. Your code should handle all scenarios gracefully.
4. **Monitor costs**: Function Calling requires multiple API calls, which increases token usage. Profile your application to ensure it stays within budget.
5. **Test with multiple models**: Different models (e.g., GPT-3.5 vs. GPT-4) may have varying reliability. Test thoroughly before production deployment.
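The first practice is often paired with a bounded retry. The sketch below keeps the model call abstract: `generate` is any zero-argument callable returning the raw reply text, and `validate` returns a list of problems. Both names, and `generate_with_retry` itself, are hypothetical:

```python
import json


def generate_with_retry(generate, validate, max_attempts=3):
    """Call `generate` until its output parses as JSON and passes `validate`."""
    last_error = None
    for _ in range(max_attempts):
        raw = generate()
        try:
            data = json.loads(raw)
        except json.JSONDecodeError as exc:
            last_error = f"invalid JSON: {exc}"
            continue
        problems = validate(data)
        if not problems:
            return data
        last_error = "; ".join(problems)
    raise ValueError(f"No valid output after {max_attempts} attempts ({last_error})")
```

Each retry is another API call, so keep `max_attempts` small and log failures; frequent retries usually point to a prompt that needs clearer instructions.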

Conclusion

JSON Mode and Function Calling are powerful tools for extracting structured outputs from LLMs, but they serve different purposes. JSON Mode offers simplicity and speed for static data extraction, while Function Calling provides flexibility and external integration for dynamic workflows. By understanding their strengths and limitations, you can choose the right approach for your application—or combine them for maximum effect. Start with JSON Mode for straightforward tasks, and graduate to Function Calling when your system demands interaction with the outside world. The key is to match the technique to the complexity of your problem, ensuring reliable, maintainable, and efficient AI-powered applications.
