---
title: "How to Build an AI Agent from Scratch"
description: "Build an AI agent from scratch in 2026: what an agent is, the model-plus-tool-calling loop, a minimal working example, autonomy levels and production."
type: "guide"
locale: "en"
category: "Build"
canonical: "https://agenticschool.dev/guides/how-to-build-an-ai-agent"
datePublished: "2026-06-13"
dateModified: "2026-06-13"
---

# How to Build an AI Agent from Scratch

> Build an AI agent from scratch in 2026: what an agent is, the model-plus-tool-calling loop, a minimal working example, autonomy levels and production.

An AI agent is a language model wrapped in a loop that can call tools, read the results and decide what to do next, repeating until a goal is reached. Building one from scratch is far simpler than the hype suggests. At its core it is a loop around a model that supports tool calling: you hand the model a goal and a set of functions, it asks to run one, you run it, you feed the result back, and it goes again until it is done. This guide takes you from the definition to a concrete, minimal agent you can run today, then up through the levels of autonomy and what changes when you put an agent in production. We will build the loop by hand first so you understand exactly what is happening, then point you at the SDKs that do this for you. Everything here is current as of June 2026.

## What an AI agent actually is

An AI agent is software that uses a language model to decide and act in a loop, rather than just answering once. The model is the brain, but a brain with no hands cannot do anything, so you give it tools: functions it can call to read a file, query a database, search the web or hit an API. The agent runs a loop: the model receives the goal and the list of available tools, it either answers or asks to call a tool, your code runs that tool and returns the result, and the model uses that result to decide its next move. That loop is the whole idea. "Agentic AI" is the broader term for systems built this way; an "AI agent" is one such system. For the precise definitions, see the glossary entries on AI agent, agentic AI, tool calling and the agent harness.

- Model: the reasoning core that decides what to do (an LLM that supports tool calling).
- Tools: functions the model can call to act on the world, each with a name, a description and an input schema.
- Loop: model decides, your code runs the chosen tool, the result goes back, repeat until done.
- Glossary: see the entries for AI agent, agentic AI, tool calling and agent harness for the formal definitions.

## The build loop, step by step

Every agent, from a ten-line script to Claude Code, runs the same loop. You send the model the conversation so far plus the tool definitions. The model replies in one of two ways: with a final answer (it is done), or with a request to call one or more tools. If it asks for a tool, your code executes that tool, captures the output, appends it to the conversation as a tool result, and sends everything back. The model reads the result and decides again. You keep looping until the model returns a final answer or you hit a safety limit on iterations. The two non-negotiable guardrails are a maximum number of turns, so a confused agent cannot loop forever, and validation of tool inputs, because the model is asking you to run real code with arguments it chose.

- Send the goal, conversation history and tool definitions to the model.
- If the model returns a final answer, stop and return it.
- If it requests a tool, validate the input, run the tool, append the result, and loop.
- Always cap the number of iterations and validate tool arguments before executing.
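
The four steps above can be sketched without any API key. This is a minimal illustration, not a provider integration: `fake_model`, the message shapes and the `add` tool are hypothetical stand-ins invented for this example so the control flow can be run and read in isolation.

```python
from typing import Callable

# Hypothetical stand-in for a real model call: it requests the "add" tool
# once, then gives a final answer as soon as a tool result is in the history.
def fake_model(messages: list[dict]) -> dict:
    if messages[-1]["role"] == "tool":
        return {"type": "final", "text": f"The answer is {messages[-1]['content']}."}
    return {"type": "tool_call", "name": "add", "args": {"a": 2, "b": 3}}

def add(a: int, b: int) -> str:
    return str(a + b)

TOOLS: dict[str, Callable[..., str]] = {"add": add}

def run_loop(goal: str, model=fake_model, max_turns: int = 5) -> str:
    messages = [{"role": "user", "content": goal}]
    for _ in range(max_turns):               # guardrail 1: cap the turns
        reply = model(messages)
        if reply["type"] == "final":         # final answer: stop and return it
            return reply["text"]
        tool = TOOLS.get(reply["name"])
        args = reply["args"]
        # guardrail 2: check the tool exists and every argument is an int
        if tool is None or not all(isinstance(v, int) for v in args.values()):
            result = "error: unknown tool or invalid arguments"
        else:
            try:
                result = tool(**args)
            except TypeError as exc:         # wrong argument names
                result = f"error: {exc}"
        messages.append({"role": "tool", "content": result})
    return "stopped: hit the turn limit"

print(run_loop("What is 2 + 3?"))  # prints: The answer is 5.
```

Swapping `fake_model` for a real model call changes the message format but not the shape of the loop, and a model that never stops asking for tools still exits at `max_turns`.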

## A minimal agent you can build

Here is the smallest agent that does something real: a model with one tool (a calculator) running the tool-calling loop by hand against the Anthropic Messages API. The pattern is identical for any provider that supports tool calling. The model gets the question and the tool definition; when it replies with stop_reason "tool_use", we run the tool, send back a tool_result, and loop until it gives a plain text answer. Read it once and the magic disappears: an agent is a loop, a model and a dictionary of functions.

```python
# pip install anthropic
# A minimal agent: one tool, the tool-calling loop by hand.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the env

# 1) Define the tools: a name, a description, and an input schema.
tools = [
    {
        "name": "calculator",
        "description": "Evaluate a basic arithmetic expression.",
        "input_schema": {
            "type": "object",
            "properties": {"expression": {"type": "string"}},
            "required": ["expression"],
        },
    }
]

# 2) Map tool names to the real functions that run them.
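def calculator(expression: str) -> str:
    # Real code: validate hard. A toy eval is fine only for a demo.
    allowed = set("0123456789+-*/(). ")
    # "**" passes the character check but huge exponents can hang the process.
    if not set(expression) <= allowed or "**" in expression:
        return "error: invalid characters"
    try:
        return str(eval(expression))  # demo only; never eval untrusted input in prod
    except Exception as exc:  # e.g. "1/0" or "1+" must not crash the agent
        return f"error: {exc}"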

TOOLS = {"calculator": calculator}

# 3) The loop.
def run_agent(goal: str, max_turns: int = 8) -> str:
    messages = [{"role": "user", "content": goal}]
    for _ in range(max_turns):
        resp = client.messages.create(
            model="claude-sonnet-4-5",
            max_tokens=1024,
            tools=tools,
            messages=messages,
        )
        if resp.stop_reason != "tool_use":
            return "".join(b.text for b in resp.content if b.type == "text")
        messages.append({"role": "assistant", "content": resp.content})
        results = []
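        for block in resp.content:
            if block.type == "tool_use":
                fn = TOOLS.get(block.name)  # the model may name a tool we lack
                try:
                    out = fn(**block.input) if fn else f"error: unknown tool {block.name}"
                except Exception as exc:  # bad arguments must not crash the loop
                    out = f"error: {exc}"
                results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": out,
                })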
        messages.append({"role": "user", "content": results})
    return "stopped: hit the turn limit"

print(run_agent("What is 4321 * 1234, then add 99?"))
```
A complete minimal agent in Python: one tool, the model-plus-tool-calling loop by hand. The same shape works with any tool-calling model.

That is genuinely all an agent is. To make it useful you add more tools (read a file, call your API, query a database), give each a precise description so the model knows when to use it, and harden the execution path. The eval in the calculator is for the demo only; never run model-chosen code or expressions without strict validation or a sandbox.
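
One way to harden the calculator without `eval` is to parse the expression and walk only the syntax nodes you explicitly allow. The sketch below uses the standard-library `ast` module; `safe_calculator` and the 200-character limit are illustrative choices, not part of the example above.

```python
import ast
import operator

# Only these operators are evaluated; anything else is rejected.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}

def _eval(node: ast.AST) -> float:
    # Plain int/float literals only (bool and str constants are rejected).
    if isinstance(node, ast.Constant) and type(node.value) in (int, float):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
        return _BINARY[type(node.op)](_eval(node.left), _eval(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
        return _UNARY[type(node.op)](_eval(node.operand))
    raise ValueError("unsupported expression")

def safe_calculator(expression: str, max_len: int = 200) -> str:
    if len(expression) > max_len:            # illustrative size limit
        return "error: expression too long"
    try:
        tree = ast.parse(expression, mode="eval")
        return str(_eval(tree.body))
    except (SyntaxError, ValueError, ZeroDivisionError, OverflowError) as exc:
        return f"error: {exc}"

print(safe_calculator("4321 * 1234 + 99"))  # prints: 5332213
```

Because names, calls and the power operator never match an allowed node, inputs such as `__import__('os')` or `2 ** 10` come back as error strings instead of being executed.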

## Use a framework once you understand the loop

Building the loop by hand once is the best way to understand agents, but in production you reach for a framework that handles the loop, retries, streaming, sessions and permissions for you. In 2026 the two most direct paths are the Claude Agent SDK and the OpenAI Agents SDK. The Claude Agent SDK exposes the same agent loop, tool set and context management that power Claude Code (install @anthropic-ai/claude-agent-sdk for TypeScript or claude-agent-sdk for Python). The OpenAI Agents SDK is a lightweight Python and TypeScript framework that turns any function into a tool with automatic schema generation (pip install openai-agents). Both give you tool calling, multi-step loops, human-in-the-loop checkpoints, subagents and first-class MCP support out of the box. The principle is the same one you just built; the SDK removes the plumbing.

- Claude Agent SDK: the same loop and tools that run Claude Code, programmable in Python and TypeScript, with built-in MCP and subagents.
- OpenAI Agents SDK: a lightweight multi-agent framework that turns any function into a validated tool (pip install openai-agents).
- Both handle the loop, retries, streaming, sessions and permissions you would otherwise write by hand.
- Connect external tools through MCP rather than bespoke glue; see What Is an MCP Server.

## The levels of autonomy

Not every agent should be fully autonomous, and choosing the right level is a design decision, not a default. Think of a ladder. At the bottom the model only suggests and a human does everything. One rung up it drafts and a human approves each action. Higher, it acts autonomously on low-risk steps but pauses for approval on anything sensitive (a human-in-the-loop checkpoint). At the top it runs an entire workflow unattended. The right level depends on the cost of a mistake: the more an error hurts, the more human oversight you keep. Most reliable production agents sit in the middle, fully autonomous on safe, reversible actions and gated on the rest. The Automation and Agentic Systems course covers this as the five levels of LLM autonomy.

- Suggest only: the agent proposes, a human does everything. Lowest risk, lowest leverage.
- Draft and approve: the agent prepares the action, a human confirms before it runs.
- Autonomous with checkpoints: it acts on safe steps and pauses for approval on risky ones.
- Fully unattended: it runs the whole workflow alone; reserve this for low-stakes, reversible tasks.
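
The ladder can be expressed as a small gate that runs before every tool call. This is a sketch under assumptions: the `Autonomy` enum, the `RISKY_TOOLS` set and the tool names in it are hypothetical, and in a real agent you would label each tool's risk yourself.

```python
from enum import IntEnum

class Autonomy(IntEnum):
    SUGGEST_ONLY = 0        # the agent proposes, a human does everything
    DRAFT_AND_APPROVE = 1   # a human confirms every action
    CHECKPOINTS = 2         # a human confirms risky actions only
    UNATTENDED = 3          # no approval step

# Hypothetical risk labels for illustration.
RISKY_TOOLS = {"send_email", "delete_record"}

def decide(level: Autonomy, tool_name: str) -> str:
    """Return 'suggest', 'ask_human' or 'run' for a requested tool call."""
    if level == Autonomy.SUGGEST_ONLY:
        return "suggest"
    if level == Autonomy.DRAFT_AND_APPROVE:
        return "ask_human"
    if level == Autonomy.CHECKPOINTS and tool_name in RISKY_TOOLS:
        return "ask_human"
    return "run"

print(decide(Autonomy.CHECKPOINTS, "read_file"))   # prints: run
print(decide(Autonomy.CHECKPOINTS, "send_email"))  # prints: ask_human
```

Keeping the decision in one function means the autonomy level becomes a configuration value you can raise gradually as the agent earns trust, rather than logic scattered through the loop.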

## Productionizing your agent

A demo agent and a production agent differ in everything around the loop. The model and the tools are the easy part; reliability is the work. Validate every tool input, because the model is choosing the arguments. Run anything that executes code or touches the outside world in a sandbox with timeouts and resource limits, never on a machine you care about. Log every step (the goal, each tool call, each result) so you can see what the agent did and debug it when it goes sideways. Cap iterations and cost so a confused agent cannot loop forever or run up a bill. And keep a human in the loop for irreversible or sensitive actions. These are the same lessons learned the hard way in the founder builds: CallAssistant gave its voice agent tightly defined tools because there is no "are you sure?" on a phone call, and CodeCourier ran untrusted code only inside a disposable sandbox.

- Validate tool inputs and run code-executing tools in a sandbox with timeouts and limits.
- Log the goal, every tool call and every result so the agent is observable and debuggable.
- Cap iterations and spend so a runaway loop cannot cost you time or money.
- Gate irreversible or sensitive actions behind a human-in-the-loop approval step.
- Learn from real builds: CallAssistant (tight tools) and CodeCourier (sandboxing) on the Builds page.
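
Three of these guardrails (logging, timeouts and a spend cap) fit in a few lines of standard-library Python. Treat this as a sketch: `Budget`, `run_tool_logged` and the default limits are hypothetical names and numbers chosen for illustration, and the thread-based timeout only stops waiting for a tool, it does not kill it. Real isolation still needs a subprocess or sandbox.

```python
import logging
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent")

class Budget:
    """Tracks turns and spend; the default limits are illustrative."""
    def __init__(self, max_turns: int = 8, max_cost_usd: float = 0.50):
        self.max_turns, self.max_cost_usd = max_turns, max_cost_usd
        self.turns, self.cost_usd = 0, 0.0

    def charge(self, cost_usd: float) -> bool:
        """Record one turn; return False once either limit is exceeded."""
        self.turns += 1
        self.cost_usd += cost_usd
        return self.turns <= self.max_turns and self.cost_usd <= self.max_cost_usd

def run_tool_logged(fn, args: dict, timeout_s: float = 5.0) -> str:
    """Run one tool call with a timeout, logging the call and its result."""
    log.info("tool call: %s args=%r", getattr(fn, "__name__", repr(fn)), args)
    start = time.monotonic()
    pool = ThreadPoolExecutor(max_workers=1)
    try:
        result = str(pool.submit(fn, **args).result(timeout=timeout_s))
    except FutureTimeout:
        result = "error: tool timed out"
    except Exception as exc:  # a failing tool becomes a result, not a crash
        result = f"error: {exc}"
    finally:
        pool.shutdown(wait=False)  # do not block on a stuck worker
    log.info("tool result after %.2fs: %s", time.monotonic() - start, result)
    return result
```

In the agent loop you would call `budget.charge(...)` once per model turn and stop when it returns `False`, and route every tool execution through `run_tool_logged` so the log shows each call next to its result.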

## Steps

### 1. Pick a tool-calling model

Choose a model that supports tool calling (for example, a Claude or GPT model) and get an API key. The agent loop is identical across providers that support tools.

### 2. Define your tools

For each action the agent needs, write a function and a tool definition with a name, a clear description and an input schema. The description is what the model reads to decide when to call it.
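
To keep the function and its definition from drifting apart, you can derive the definition from the function itself. The helper below is a hypothetical sketch, not any SDK's API: `tool_definition`, the `_JSON_TYPES` mapping and the `get_weather` stub are invented for this example, and only simple scalar annotations are handled.

```python
import inspect
from typing import Callable, get_type_hints

# Assumed mapping from Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}

def tool_definition(fn: Callable) -> dict:
    """Build a tool definition from a function's signature and docstring."""
    hints = get_type_hints(fn)
    properties, required = {}, []
    for name, param in inspect.signature(fn).parameters.items():
        # Unknown or missing annotations fall back to "string".
        properties[name] = {"type": _JSON_TYPES.get(hints.get(name), "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)    # no default value means required
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "input_schema": {
            "type": "object",
            "properties": properties,
            "required": required,
        },
    }

def get_weather(city: str, days: int = 1) -> str:
    """Return a short weather forecast for a city. Use for weather questions only."""
    return f"(stub) forecast for {city}, {days} day(s)"

print(tool_definition(get_weather)["input_schema"]["required"])  # prints: ['city']
```

The docstring becomes the description the model reads, which is a useful nudge to write it for the model: say what the tool does and when to use it.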

### 3. Write the loop

Send the goal, conversation and tool definitions to the model. If it returns a final answer, stop. If it requests a tool, validate the input, run the tool, append the result, and send everything back.

### 4. Add guardrails

Cap the number of iterations, validate every tool argument, and run any code-executing tool in a sandbox with timeouts. Log each step so you can see what the agent did.
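
Argument validation can start as a simple check of the model's arguments against the tool's input schema before anything runs. The function below is a minimal sketch that covers only required keys, unexpected keys and scalar types; `validate_args` is a hypothetical helper, and a full JSON Schema validator library would be the more complete choice.

```python
_PY_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}

def validate_args(schema: dict, args: dict) -> list[str]:
    """Return a list of problems; an empty list means the arguments pass."""
    problems = []
    properties = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in args:
            problems.append(f"missing required argument: {name}")
    for name, value in args.items():
        if name not in properties:
            problems.append(f"unexpected argument: {name}")
            continue
        kind = properties[name].get("type")
        expected = _PY_TYPES.get(kind)
        # bool is a subclass of int in Python, so reject it for non-boolean fields
        is_bool_mismatch = isinstance(value, bool) and kind != "boolean"
        if expected and (is_bool_mismatch or not isinstance(value, expected)):
            problems.append(f"{name}: expected {kind}")
    return problems

schema = {
    "type": "object",
    "properties": {"expression": {"type": "string"}},
    "required": ["expression"],
}
print(validate_args(schema, {"expression": 5}))  # prints: ['expression: expected string']
```

Returning the problems as text lets you send them back to the model as the tool result, so it can correct its own arguments on the next turn instead of failing silently.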

### 5. Choose an autonomy level

Decide which actions the agent may take unattended and which need human approval, based on the cost of a mistake. Gate irreversible or sensitive actions behind a checkpoint.

### 6. Move to an SDK for production

Once the loop is clear, adopt the Claude Agent SDK or the OpenAI Agents SDK to get retries, streaming, sessions, permissions and MCP support without writing the plumbing yourself.

## FAQ

### What is an AI agent?

An AI agent is a language model wrapped in a loop that can call tools, read the results and decide what to do next, repeating until it reaches a goal. The model is the brain, the tools are its hands, and the loop is what makes it act rather than just answer once.

### How do I build an AI agent from scratch?

Pick a model that supports tool calling, define your tools as functions with a name, description and input schema, then write a loop: send the goal and tools to the model, run any tool it requests, feed the result back, and repeat until it returns a final answer. Add a turn cap and input validation as guardrails.

### Do I need a framework like LangChain to build an agent?

No. An agent is a loop around a tool-calling model, and you can build a working one in a few dozen lines. Frameworks like the Claude Agent SDK or the OpenAI Agents SDK are worth adopting once you understand the loop, because they handle retries, streaming, sessions, permissions and MCP for you, not because the core idea is hard.

### What is the difference between an AI agent and a chatbot?

A chatbot answers a message and stops. An agent runs a loop: it can call tools to act on the world, read the results, and take multiple steps toward a goal before responding. The presence of tools and a decision loop is what makes something an agent rather than a chatbot.

### How do I make an AI agent safe to run in production?

Validate every tool input because the model chooses the arguments, run any code-executing tool in a sandbox with timeouts and resource limits, cap iterations and spend, log every step so the agent is observable, and keep a human in the loop for irreversible or sensitive actions.

### How many tools should an AI agent have?

As few as the task needs. Each tool adds to the context and to the chance the model picks the wrong one, so a handful of sharp, well-described tools beats a large pile. Give each tool a precise description, because that description is how the model decides when to use it.