O2: AI Agents Deep Dive
An LLM is a brain in a jar: impressive reasoning, zero ability to act. An AI agent wraps that brain with memory, tools, and a planning loop so it can observe the world, decide what to do, execute actions, and learn from results. This module covers the full spectrum from chatbot to multi-agent swarm. For orchestration primitives agents build on, see O1: Semantic Kernel. For the tool protocols agents use, see O3: MCP & Tools.
What Is an AI Agent?
Agent = LLM + Memory + Tools + Planning

| Component | Role | Example |
|---|---|---|
| LLM | Reasoning engine: understands language, generates plans | GPT-4o, Claude, Llama 3 |
| Memory | Short-term (conversation) + long-term (vector store) context | Chat history, Cosmos DB, Redis |
| Tools | Functions the agent can invoke to affect the real world | Search API, database query, email sender |
| Planning | Strategy for decomposing goals into steps | ReAct, Chain-of-Thought, Tree-of-Thought |
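The four components can be sketched as a single structure. This is a minimal illustration, not any framework's API; every name below (`Agent`, `remember`, `planner`) is made up for the example:

```python
from dataclasses import dataclass, field
from typing import Callable

# Illustrative container for the four agent components (hypothetical names).
@dataclass
class Agent:
    llm: Callable[[list], str]                          # reasoning engine: messages -> reply
    memory: list = field(default_factory=list)          # short-term conversation state
    tools: dict = field(default_factory=dict)           # tool name -> callable
    planner: str = "react"                              # planning strategy label

    def remember(self, role: str, content: str) -> None:
        self.memory.append({"role": role, "content": content})

# A stub LLM and a stub tool so the sketch runs without a real model.
agent = Agent(
    llm=lambda msgs: "ok",
    tools={"search": lambda q: f"results for {q}"},
)
agent.remember("user", "book a flight")
```

Swap the stubs for a real model client and real tool functions and the shape stays the same; the rest of this module is about what happens in the loop that drives it.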
The Autonomy Heuristic
Talks → Assistant · Suggests → Copilot · Acts → Agent. If it waits for every instruction, it's an assistant. If it proactively suggests next steps, it's a copilot. If it takes action on your behalf, it's an agent.
Chatbot vs Agent
| Dimension | Chatbot | Agent |
|---|---|---|
| Interaction | Reactive Q&A: user asks, bot answers | Goal-driven: user sets objective, agent pursues it |
| Decision making | Template matching or single LLM call | Multi-step reasoning with planning loops |
| Tool access | None or scripted integrations | Dynamic tool selection and chaining |
| State | Stateless or simple session memory | Rich short-term + long-term memory |
| Autonomy | Low: follows scripts | High: decomposes goals, adapts, retries |
| Error handling | "I don't understand" | Retries, alternative tools, escalation |
Agent vs Copilot vs Assistant
| Trait | Assistant | Copilot | Agent |
|---|---|---|---|
| Autonomy | Low | Medium | High |
| Initiative | Responds to commands | Proactively suggests | Acts independently |
| Scope | Single task | Workflow augmentation | End-to-end goal completion |
| Human role | Driver | Co-driver | Passenger (with override) |
| Example | Siri setting a timer | GitHub Copilot suggesting code | Agent booking flights + hotel for a trip |
The Evolution of AI Systems
```
Rule-Based Bot → LLM Chatbot → RAG Chatbot → Tool-Using Assistant → AI Agent → Multi-Agent System
      │              │             │                  │                 │              │
  Hard-coded     Free-form     Grounded in       Can call APIs     Autonomous      Agents
   if/else       responses     your data         and functions   planning loop   collaborate
```

Each stage adds a capability: natural language → knowledge → action → autonomy → collaboration.
The Core Agent Loop
Every agent, regardless of framework, runs a variation of this loop:
```
┌──────────────────────────────────────┐
│ 1. OBSERVE → User goal or env        │
│ 2. THINK   → LLM reasons + plans     │
│ 3. ACT     → Execute tool/action     │
│ 4. OBSERVE → Check results           │
│ 5. REPEAT or STOP                    │
└──────────────────────────────────────┘
```

The ReAct pattern (Reason + Act) is the most common implementation: the LLM produces a Thought → Action → Observation cycle until the task is complete or a stop condition is met.
```python
# Simplified agent loop (pseudocode)
def agent_loop(goal: str, tools: list, max_hops: int = 10):
    memory = [{"role": "user", "content": goal}]
    for hop in range(max_hops):
        response = llm.chat(memory, tools=tools)
        if response.finish_reason == "stop":
            return response.content  # Done
        # Record the model's tool request, execute it, feed the result back
        memory.append(response.message)
        result = execute_tool(response.tool_calls[0])
        memory.append({"role": "tool", "content": result})
    raise TimeoutError("Agent exceeded max hops")
```

Always set max_hops (or equivalent). Without it, an agent can loop forever, burning tokens and money. Start with 10 hops and increase only if your use case demands it.
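The `tools` list the loop passes to the model is typically a set of JSON schemas describing each callable function. The shape below follows the common OpenAI-style tool convention; the `send_email` tool itself is a made-up example:

```python
# One entry in the `tools` list: a JSON schema the model reads to decide
# when and how to call the function. OpenAI-style convention; the
# send_email tool is hypothetical.
send_email_tool = {
    "type": "function",
    "function": {
        "name": "send_email",
        "description": "Send an email on the user's behalf",
        "parameters": {
            "type": "object",
            "properties": {
                "to": {"type": "string", "description": "Recipient address"},
                "subject": {"type": "string"},
                "body": {"type": "string"},
            },
            "required": ["to", "subject", "body"],
        },
    },
}

tools = [send_email_tool]
```

The description fields matter: they are the only signal the model has for choosing between tools, so write them the way you would explain the function to a new teammate.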
Agent Frameworks Comparison
| Framework | Language | Strength | Pattern | Best For |
|---|---|---|---|---|
| AutoGen | Python | Multi-agent conversations, code execution | ConversableAgent + GroupChat | Research, coding tasks, multi-agent debate |
| CrewAI | Python | Task delegation, role-based agents | Crew → Agent → Task with delegation | Business workflows, content pipelines |
| LangChain | Python/JS | LCEL chains, extensive tool ecosystem | AgentExecutor, LangGraph for cycles | RAG, tool-heavy pipelines, prototyping |
| Semantic Kernel | C#/Python/Java | Enterprise-grade, Azure-native | Plugins + Planners + Filters | Production .NET/Java apps, Azure integration |
| Microsoft Agent Framework | Python | Production SDK, Azure Foundry integration | Agent Service with tools + state | Enterprise deployment with eval + monitoring |
When to use what
- Prototyping → LangChain (fastest ecosystem, most examples)
- Multi-agent research → AutoGen (built for agent conversations)
- Enterprise .NET → Semantic Kernel (first-class C# support)
- Production Python → Microsoft Agent Framework (Foundry integration)
- Business process → CrewAI (intuitive role/task model)
Multi-Agent Patterns
Supervisor Pattern
One orchestrator agent delegates to specialist agents and aggregates results.
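A minimal sketch of that delegation, with stub specialists standing in for LLM-backed agents (all function names here are illustrative):

```python
# Supervisor pattern sketch: the orchestrator fans a goal out to
# specialist agents and aggregates their answers. Real specialists
# would wrap LLM calls; stubs keep the sketch runnable.
def researcher(goal: str) -> str:
    return f"notes on {goal}"

def coder(goal: str) -> str:
    return f"code for {goal}"

def reviewer(goal: str) -> str:
    return f"review of {goal}"

def supervisor(goal: str) -> dict:
    specialists = {"research": researcher, "code": coder, "review": reviewer}
    # Delegate to every specialist, then aggregate the results.
    return {name: agent(goal) for name, agent in specialists.items()}

report = supervisor("rate limiter")
```

A production supervisor would also decide which specialists are relevant, retry failures, and route follow-up questions rather than fanning out unconditionally.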
```
        ┌────────────┐
        │ Supervisor │
        └──┬───┬───┬─┘
           │   │   │
     ┌─────┘   │   └─────┐
     ▼         ▼         ▼
Researcher   Coder   Reviewer
```

Swarm Pattern
Agents self-organize without central control. Each agent decides when to hand off to another.
Pipeline Pattern
Sequential handoff: Agent A → Agent B → Agent C. Each agent transforms and passes output forward. Ideal for content generation (research → write → edit → publish).
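Structurally, a pipeline is just a fold of the input through an ordered list of agents; the stages below are stand-ins for LLM-backed steps:

```python
from functools import reduce

# Pipeline pattern: each agent transforms the previous agent's output.
def research(topic: str) -> str:
    return f"facts about {topic}"

def write(facts: str) -> str:
    return f"draft based on {facts}"

def edit(draft: str) -> str:
    return f"polished {draft}"

pipeline = [research, write, edit]
article = reduce(lambda text, stage: stage(text), pipeline, "agent patterns")
```

Because each stage only sees its predecessor's output, pipelines are the easiest multi-agent pattern to test: assert on each stage in isolation, then on the composed result.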
Debate Pattern
Two or more agents argue opposing positions, and a judge agent synthesizes the best answer. Useful for complex analysis where multiple perspectives improve accuracy.
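The skeleton is two advocates plus a judge; the stubs below mark the seams where LLM-backed agents would plug in (all names are illustrative):

```python
# Debate pattern sketch: two stub advocates argue opposing positions,
# and a stub judge synthesizes. Real versions would be LLM-backed agents.
def advocate_for(question: str) -> str:
    return f"pro: {question}"

def advocate_against(question: str) -> str:
    return f"con: {question}"

def judge(positions: list) -> str:
    # A real judge agent would weigh the arguments; here we just combine them.
    return " | ".join(positions)

verdict = judge([advocate_for("adopt agents"), advocate_against("adopt agents")])
```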
Production Guardrails
| Guardrail | Why | Implementation |
|---|---|---|
| Max hops | Prevent infinite loops | max_iterations=10 in agent config |
| Token budget | Control cost per request | max_tokens per hop + total budget |
| Timeout | Prevent hung agents | 60s per tool call, 5min per task |
| Audit trail | Compliance + debugging | Log every thought/action/observation |
| Human-in-the-loop | Safety for destructive actions | Require approval for writes/deletes |
| Sandboxing | Prevent code execution escapes | Docker containers for code agents |
| Content safety | Block harmful outputs | Azure Content Safety on every response |
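Several of these guardrails compose naturally around the agent loop. The wrapper below sketches max hops, a per-task wall-clock deadline, and an audit trail; the thresholds and names are illustrative, not from any framework:

```python
import time

# Guardrail sketch: bounded hops, task deadline, audit trail.
def guarded_loop(step, max_hops: int = 10, task_timeout_s: float = 300.0):
    """Run step(hop) until it returns a final answer or a guardrail trips.

    `step` stands in for one think/act hop and returns (done, output).
    """
    audit = []  # every hop is logged for compliance and debugging
    deadline = time.monotonic() + task_timeout_s
    for hop in range(max_hops):
        if time.monotonic() > deadline:
            raise TimeoutError("task deadline exceeded")
        done, output = step(hop)
        audit.append({"hop": hop, "output": output})
        if done:
            return output, audit
    raise RuntimeError("max hops exceeded")

# A stub step that finishes on its third hop.
result, trail = guarded_loop(lambda hop: (hop == 2, f"hop {hop}"))
```

Token budgets, human-in-the-loop approval, and content safety checks slot into the same place: inside the loop, before or after each `step`, where they can stop the run without losing the audit trail.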
For infrastructure to run agents at scale, see O5: AI Infrastructure. For evaluation of agent quality, see O4: Azure AI Foundry.
Key Takeaways
- Agent = LLM + Memory + Tools + Planning: each component is necessary
- The agent loop (Observe → Think → Act) is universal across all frameworks
- Multi-agent patterns (Supervisor, Swarm, Pipeline, Debate) solve different coordination needs
- Production agents need guardrails: max hops, token budgets, timeouts, audit trails
- Choose your framework based on language, deployment target, and collaboration pattern, not hype