It's 10 a.m. You type into a Slack-connected assistant: "What's blocking the mobile release?" The assistant replies with the right answer - a specific ticket, its status, the person assigned. It didn't know that from training. It went and got it. But how, exactly?
The phrase people reach for is "the AI looked it up." That's not wrong, but it hides something interesting. The model never made an HTTP request. It doesn't have a network stack. What it did was write a very particular kind of JSON - and then something else did the rest.
The model's only job: decide and describe
In a regular API call, your code deterministically decides which endpoint to hit. With tool calling (also called function calling), the LLM decides which function to invoke based on natural-language input and outputs structured JSON, while your code handles execution.
That's the key distinction. Function calling enables a language model to reason about what action to take and return a structured API-like JSON object that a client - your app - can then execute. Instead of parsing raw text, your system gets machine-readable output.
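To make that split concrete, here is a minimal sketch in Python. Everything named in it is an assumption for illustration: the `get_release_blockers` tool, its canned ticket data, and the `name`/`arguments` shape of the JSON. Real providers each use their own wire format, and the model output here is a hard-coded string rather than a live model response.

```python
import json

# Hypothetical tool. A real app would query an issue tracker here;
# this version returns canned, made-up data so the sketch is self-contained.
def get_release_blockers(release: str) -> list[dict]:
    fake_tickets = {
        "mobile": [{"id": "MOB-123", "status": "In Review", "assignee": "dana"}],
    }
    return fake_tickets.get(release, [])

# Registry mapping the names the model may emit to real Python functions.
TOOLS = {"get_release_blockers": get_release_blockers}

def execute_tool_call(raw_model_output: str):
    """Parse the model's JSON and run the function it names."""
    call = json.loads(raw_model_output)
    name = call["name"]
    if name not in TOOLS:
        raise ValueError(f"model requested unknown tool: {name}")
    # The app, not the model, performs the actual call.
    return TOOLS[name](**call["arguments"])

# Stand-in for what a model might emit (assumed shape, not a provider's format).
model_output = '{"name": "get_release_blockers", "arguments": {"release": "mobile"}}'

result = execute_tool_call(model_output)
print(result)
```

The model's contribution ends at the JSON string. The lookup in `TOOLS`, the function call, and any network access all happen in application code, which is also where unknown tool names or malformed arguments can be rejected.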
So when you asked about the mobile release, the model didn't answer right away. It paused and emitted something like: