The Simplest Agent Loop


Every AI agent framework — LangChain, LlamaIndex, Microsoft Agent Framework, CrewAI — wraps the same idea. Strip them down and you find the same beating heart: a while loop.

Here it is, in full:

import anthropic

client = anthropic.Anthropic()

tools = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a location.",
        "input_schema": {
            "type": "object",
            "properties": {
                "location": {"type": "string", "description": "City name"}
            },
            "required": ["location"],
        },
    }
]

def get_weather(location: str) -> str:
    # Imagine a real API call here
    return f"Sunny, 72°F in {location}"

messages = [{"role": "user", "content": "What's the weather in Tokyo?"}]

while True:
    response = client.messages.create(
        model="claude-opus-4-6",
        max_tokens=1024,
        tools=tools,
        messages=messages,
    )

    messages.append({"role": "assistant", "content": response.content})

    if response.stop_reason == "end_turn":
        break

    # Handle tool calls
    tool_results = []
    for block in response.content:
        if block.type == "tool_use":
            # Only one tool here; with several, dispatch on block.name
            result = get_weather(**block.input)
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": result,
            })

    messages.append({"role": "user", "content": tool_results})

print(next(b.text for b in response.content if hasattr(b, "text")))

That’s it. No framework. No magic. Just a loop, a list of messages, and a conditional break.

What’s actually happening

The loop runs as long as the model has more work to do. Each iteration, you send the full conversation — including any tool results — back to the model. The model either calls another tool or it doesn’t.

The critical insight is in the break condition:

if response.stop_reason == "end_turn":
    break

The model controls the stop. Not your code. Not a timeout. The model decides, on each turn, whether it needs more information or whether it has enough to answer. When it’s ready, it sets stop_reason to "end_turn" and produces a final text response. Until then, it returns "tool_use" and your job is to execute the tools and feed the results back.
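
In a production loop it pays to branch on the stop reason explicitly rather than assuming anything that isn't "end_turn" is a tool call. A minimal sketch, using the stop reasons the Messages API reports ("end_turn", "tool_use", "max_tokens"):

```python
def should_continue(stop_reason: str) -> bool:
    """Decide whether the agent loop runs another iteration."""
    if stop_reason == "tool_use":
        return True   # the model requested a tool; run it and loop again
    if stop_reason == "end_turn":
        return False  # the model produced its final answer
    # Anything else (e.g. "max_tokens") means the response was cut off;
    # silently looping would feed the model a malformed turn.
    raise RuntimeError(f"Unhandled stop_reason: {stop_reason!r}")
```

In the loop above, you would break when this returns False instead of comparing against "end_turn" inline.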

Under the hood, stop_reason maps directly to special tokens — the structural delimiters baked into the model’s vocabulary that signal when to stop generating and why. If you haven’t read The Grammar of LLM Special Tokens, it’s worth a look: the end_turn / tool_use distinction the API surfaces is really just an abstraction over these tokens.

This is what makes a language model an agent: not fancy orchestration, but the ability to decide its own next action, including the decision to stop. That choice is what gives your agent agency.

The message loop is the memory

Notice that messages grows with every iteration. The full history — the original question, every tool call, every tool result — gets sent back each time. The model has no persistent memory between API calls; the conversation list is the memory.
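
Concretely, after one tool round the list looks like this (the "toolu_01" id and the content values are illustrative):

```python
# Shape of `messages` after one tool round:
messages = [
    {"role": "user", "content": "What's the weather in Tokyo?"},
    {"role": "assistant", "content": [          # turn 1: model asks for a tool
        {"type": "tool_use", "id": "toolu_01", "name": "get_weather",
         "input": {"location": "Tokyo"}},
    ]},
    {"role": "user", "content": [               # your code feeds the result back
        {"type": "tool_result", "tool_use_id": "toolu_01",
         "content": "Sunny, 72°F in Tokyo"},
    ]},
]

# Roles strictly alternate, and the whole list is resent on every API call.
assert [m["role"] for m in messages] == ["user", "assistant", "user"]
```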

This is also why context length matters for agents. A complex task with many tool calls can fill the context window quickly, and because the full history is resent on every call, the total tokens processed grow faster than linearly with the number of turns.
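
When the history threatens to overflow, a common mitigation is to trim or summarize older turns. A naive character-budget sketch (real systems count tokens, and must keep tool_use/tool_result pairs together, since dropping one half of a pair produces an invalid conversation):

```python
def trim_history(messages: list[dict], max_chars: int = 40_000) -> list[dict]:
    """Keep the first (task) message plus as many recent messages as fit.

    Naive: budgets by character count, not tokens, and ignores
    tool_use/tool_result pairing. Illustration only.
    """
    def size(m: dict) -> int:
        return len(str(m["content"]))

    head, tail = messages[:1], messages[1:]
    budget = max_chars - size(head[0])
    kept = []
    for m in reversed(tail):          # walk backward from the newest message
        if size(m) > budget:
            break
        kept.append(m)
        budget -= size(m)
    return head + list(reversed(kept))
```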

What frameworks add

Frameworks build on this pattern by adding:

  • Tool registries — so you can define many tools and dispatch calls automatically
  • Streaming — so you can show partial output as it arrives
  • Error handling — retries, malformed tool calls, API failures
  • Multi-agent coordination — routing between multiple models or specialized sub-agents
  • State management — persisting conversation history across sessions
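
The first of these is easy to sketch. A hypothetical registry is just a dict from tool name to callable, so the loop can dispatch any tool_use block by block.name and feed errors back as tool results instead of crashing:

```python
# Hypothetical helpers, not part of any framework or SDK.
TOOLS: dict[str, callable] = {}

def tool(fn):
    """Register a function as a tool under its own name."""
    TOOLS[fn.__name__] = fn
    return fn

@tool
def get_weather(location: str) -> str:
    # Imagine a real API call here
    return f"Sunny, 72°F in {location}"

def dispatch(name: str, arguments: dict) -> str:
    """Run a registered tool; return error text instead of raising,
    so failures go back to the model as tool results."""
    if name not in TOOLS:
        return f"Error: unknown tool {name!r}"
    try:
        return TOOLS[name](**arguments)
    except Exception as exc:  # malformed arguments, tool bugs, ...
        return f"Error: {exc}"
```

Inside the loop, the call site becomes `result = dispatch(block.name, block.input)`, which scales to any number of tools without an if/elif chain.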

All useful. But none of it changes the fundamental shape: a while loop where the LLM decides when to stop.

When you understand the loop, the frameworks become much easier to reason about. You can look at any agent system and ask: where’s the while loop? what triggers the break? who controls the stop? The answers tell you most of what you need to know.
