The bridge between "chat with your data" and "act on your data." Here is how to connect any LLM to your internal tools and APIs.
Function calling is what turns an LLM from a text generator into an action taker. Instead of just answering "Your subscription expires on March 15," the LLM can call your billing API to extend it, send a renewal email, or create a support ticket. This tutorial covers everything from defining your first tool schema to handling parallel function calls, error recovery, and security boundaries. By the end, you'll have a working pattern for connecting any LLM to any API.
Function calling does not mean the LLM executes code. It means the LLM decides which function to call and generates the arguments as structured JSON. Your application code then actually executes the function and feeds the result back to the LLM for further reasoning.
The flow is: User query -> LLM decides to call a function -> LLM generates function name + arguments -> Your code executes the function -> Result sent back to LLM -> LLM generates final response.
The tool schema tells the LLM what functions are available, what they do, and what arguments they accept. The quality of this schema directly determines how accurately the LLM will use your tools.
```python
# Tool definition for OpenAI function calling
tools = [{
    "type": "function",
    "function": {
        "name": "get_customer_subscription",
        "description": (
            "Retrieve the current subscription status and plan details for a "
            "customer. Use this when the user asks about their plan, billing, "
            "or subscription status."
        ),
        "parameters": {
            "type": "object",
            "properties": {
                "customer_id": {
                    "type": "string",
                    "description": "The unique customer identifier (e.g., 'cust_abc123')"
                }
            },
            "required": ["customer_id"]
        }
    }
}]
```
Schema Writing Tips:
The description field is the most important part. Include: (1) what the function does, (2) when to use it, (3) when NOT to use it. The LLM uses this description to decide tool selection. Vague descriptions lead to wrong tool calls.
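To make those three ingredients concrete, here is the same schema pattern with explicit "use this when" and "do NOT use this for" guidance baked into the description (the `create_support_ticket` name and fields are hypothetical, for illustration only):

```python
# Hypothetical tool whose description spells out when to use it and
# when not to -- this description is what the model selects on.
ticket_tool = {
    "type": "function",
    "function": {
        "name": "create_support_ticket",
        "description": (
            "Create a support ticket for a customer issue. "
            "Use this when the user reports a problem that cannot be "
            "resolved in the conversation. Do NOT use this for billing "
            "or subscription questions."
        ),
        "parameters": {
            "type": "object",
            "properties": {
                "customer_id": {"type": "string", "description": "Customer identifier"},
                "summary": {"type": "string", "description": "One-line issue summary"},
            },
            "required": ["customer_id", "summary"],
        },
    },
}
```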
```python
import json

# Process the LLM response and execute any requested function calls
response = client.chat.completions.create(
    model="gpt-4o",
    messages=messages,
    tools=tools,
)

message = response.choices[0].message
if message.tool_calls:
    # The assistant message carrying the tool calls must be appended
    # before the tool results, or the follow-up request will be rejected.
    messages.append(message)
    for tool_call in message.tool_calls:
        func_name = tool_call.function.name
        func_args = json.loads(tool_call.function.arguments)
        # Execute the actual function
        result = available_functions[func_name](**func_args)
        # Feed the result back to the LLM
        messages.append({
            "role": "tool",
            "tool_call_id": tool_call.id,
            "content": json.dumps(result),
        })
    # Send the updated conversation back so the model can
    # generate its final, user-facing response
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        tools=tools,
    )
```
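The `available_functions` registry referenced above is just a dict mapping tool names to callables. A minimal sketch, with a hypothetical stand-in implementation and a guard against tool names the model invents:

```python
import json

def get_customer_subscription(customer_id: str) -> dict:
    # Hypothetical stand-in for a real billing lookup.
    return {"customer_id": customer_id, "plan": "pro", "status": "active"}

available_functions = {
    "get_customer_subscription": get_customer_subscription,
}

def dispatch(func_name: str, raw_arguments: str) -> dict:
    # Reject tool names the model hallucinated before touching arguments.
    if func_name not in available_functions:
        return {"error": f"Unknown tool: {func_name}"}
    args = json.loads(raw_arguments)
    return available_functions[func_name](**args)
```

Returning an error dict for unknown tool names (rather than raising) keeps the failure inside the same result channel the LLM already reads.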
Modern LLMs can request multiple function calls in a single response. If the user asks "What's my subscription status and my latest invoice?", the LLM will generate two parallel tool calls. Your code should execute them concurrently for faster responses.
```python
# Handle parallel tool calls with asyncio
import asyncio
import json

async def execute_parallel_tools(tool_calls):
    # Assumes each entry in available_functions is an async function
    tasks = []
    for tool_call in tool_calls:
        func = available_functions[tool_call.function.name]
        args = json.loads(tool_call.function.arguments)
        tasks.append(func(**args))
    # Results come back in the same order as the tool calls
    return await asyncio.gather(*tasks)
```
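The gather-based version assumes each tool function is a coroutine. If your tools are ordinary synchronous functions (a blocking HTTP client, a database driver), `asyncio.to_thread` keeps them off the event loop. A sketch under that assumption, with hypothetical tool functions:

```python
import asyncio

def lookup_subscription(customer_id: str) -> dict:
    # Hypothetical synchronous tool (e.g. a blocking HTTP call).
    return {"customer_id": customer_id, "plan": "pro"}

def lookup_invoice(customer_id: str) -> dict:
    # Hypothetical synchronous tool.
    return {"customer_id": customer_id, "amount_due": 42}

async def run_sync_tools_concurrently(calls):
    # calls: list of (function, kwargs) pairs parsed from tool calls.
    # to_thread runs each blocking function in the default thread pool.
    tasks = [asyncio.to_thread(fn, **kwargs) for fn, kwargs in calls]
    return await asyncio.gather(*tasks)

results = asyncio.run(run_sync_tools_concurrently([
    (lookup_subscription, {"customer_id": "cust_abc123"}),
    (lookup_invoice, {"customer_id": "cust_abc123"}),
]))
```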
Tool calls fail. APIs time out, authentication expires, inputs are invalid. When a tool call fails, pass the error message back to the LLM so it can reason about what went wrong and try a different approach.
```python
# Return errors to the LLM for recovery
try:
    result = available_functions[func_name](**func_args)
except PermissionError:
    result = {"error": "Access denied. The user does not have permission for this action."}
except TimeoutError:
    result = {"error": "The service is temporarily unavailable. Try again."}
except ValueError as e:
    result = {"error": f"Invalid input: {str(e)}"}
```
The LLM is surprisingly good at recovering from tool errors when given clear error messages. It will rephrase queries, try alternative tools, or inform the user about the limitation. This pattern is fundamental to building production AI agents.
Function calling introduces a new attack surface. The LLM decides which functions to call, but your application must enforce the security boundaries: validate every argument, check the requesting user's permissions before executing, and require explicit confirmation for destructive actions. Never rely on the prompt to enforce these rules.
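One concrete boundary is a per-user allowlist checked in application code before any tool executes. A minimal sketch (the role names and tool sets are assumptions for illustration):

```python
# Hypothetical mapping of user roles to the tools they may trigger.
ROLE_TOOL_ALLOWLIST = {
    "viewer": {"get_customer_subscription"},
    "admin": {"get_customer_subscription", "extend_subscription"},
}

def is_tool_allowed(user_role: str, tool_name: str) -> bool:
    # Enforce the boundary in application code, not in the prompt:
    # the model can ask for any tool, but only allowed ones run.
    return tool_name in ROLE_TOOL_ALLOWLIST.get(user_role, set())
```

When the check fails, return an `{"error": ...}` result to the LLM (as in the recovery pattern above) so it can explain the limitation to the user instead of silently stalling.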
Function calling is not limited to OpenAI. Models like Llama 3, Mistral, and Qwen support tool use with the same schema format. The implementation differs slightly per framework, but the pattern is identical: define tools, let the model generate calls, execute them, and feed results back.
For deploying open-source models with function calling support, see our guides on deploying LLMs on AWS Lambda and Terraform with AWS Bedrock.
GPT-4o supports up to 128 tools per request. In practice, keep it under 20. Each tool definition consumes tokens, and too many tools confuse the model's selection. Group related operations into fewer, more flexible tools.
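One way to consolidate is to replace several single-purpose tools with one tool that takes an action enum, which keeps the tool count low while the enum still constrains the model to valid operations (the `manage_subscription` name and actions here are hypothetical):

```python
# One flexible tool instead of three narrow ones (lookup/extend/cancel).
subscription_tool = {
    "type": "function",
    "function": {
        "name": "manage_subscription",
        "description": "Look up, extend, or cancel a customer's subscription.",
        "parameters": {
            "type": "object",
            "properties": {
                "customer_id": {"type": "string", "description": "Customer identifier"},
                "action": {
                    "type": "string",
                    "enum": ["lookup", "extend", "cancel"],
                    "description": "Which subscription operation to perform",
                },
            },
            "required": ["customer_id", "action"],
        },
    },
}
```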
If the model keeps selecting the wrong tool, improve the function description: add explicit "use this when..." and "do NOT use this for..." guidelines. Also consider adding a confirmation step in which the LLM explains which tool it plans to use before execution.
Function calling is the mechanism; agents are the architecture. An agent uses function calling inside a reasoning loop that can plan, execute, observe, and iterate. Read about this in our production agent blueprint and LangChain vs LangGraph comparison.
We build tool-using AI agents that connect to your internal systems. CRM, billing, support, operations -- all accessible through natural language.
Discuss Your Integration