
O2: AI Agents Deep Dive

An LLM is a brain in a jar: impressive reasoning, zero ability to act. An AI agent wraps that brain with memory, tools, and a planning loop so it can observe the world, decide what to do, execute actions, and learn from the results. This module covers the full spectrum from chatbot to multi-agent swarm. For the orchestration primitives agents build on, see O1: Semantic Kernel. For the tool protocols agents use, see O3: MCP & Tools.

What Is an AI Agent?

Agent = LLM + Memory + Tools + Planning
| Component | Role | Example |
| --- | --- | --- |
| LLM | Reasoning engine: understands language, generates plans | GPT-4o, Claude, Llama 3 |
| Memory | Short-term (conversation) + long-term (vector store) context | Chat history, Cosmos DB, Redis |
| Tools | Functions the agent can invoke to affect the real world | Search API, database query, email sender |
| Planning | Strategy for decomposing goals into steps | ReAct, Chain-of-Thought, Tree-of-Thought |
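The four components can be sketched as one data structure. This is a minimal illustration, not any framework's actual API; the `Agent` and `Tool` names here are invented:

```python
# Minimal sketch of the four agent components as a data structure.
# Illustrative only: these class names come from no specific framework.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Tool:
    name: str
    description: str                  # shown to the LLM so it can pick the tool
    func: Callable[[str], str]        # the real-world action

@dataclass
class Agent:
    llm: Callable[[list], str]                   # reasoning engine: messages -> reply
    memory: list = field(default_factory=list)   # short-term conversational context
    tools: dict = field(default_factory=dict)    # name -> Tool
    planner: str = "react"                       # planning strategy label

    def register(self, tool: Tool) -> None:
        self.tools[tool.name] = tool

agent = Agent(llm=lambda messages: "ok")
agent.register(Tool("search", "Web search", lambda q: f"results for {q}"))
print(sorted(agent.tools))  # ['search']
```

Long-term memory (the vector store in the table) would hang off `memory` in a real system; it is omitted here to keep the sketch small.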
💡

The Autonomy Heuristic

Talks → Assistant · Suggests → Copilot · Acts → Agent. If it waits for every instruction, it's an assistant. If it proactively suggests next steps, it's a copilot. If it takes action on your behalf, it's an agent.

Chatbot vs Agent

| Dimension | Chatbot | Agent |
| --- | --- | --- |
| Interaction | Reactive Q&A: user asks, bot answers | Goal-driven: user sets objective, agent pursues it |
| Decision making | Template matching or single LLM call | Multi-step reasoning with planning loops |
| Tool access | None or scripted integrations | Dynamic tool selection and chaining |
| State | Stateless or simple session memory | Rich short-term + long-term memory |
| Autonomy | Low: follows scripts | High: decomposes goals, adapts, retries |
| Error handling | "I don't understand" | Retries, alternative tools, escalation |

Agent vs Copilot vs Assistant

| Trait | Assistant | Copilot | Agent |
| --- | --- | --- | --- |
| Autonomy | Low | Medium | High |
| Initiative | Responds to commands | Proactively suggests | Acts independently |
| Scope | Single task | Workflow augmentation | End-to-end goal completion |
| Human role | Driver | Co-driver | Passenger (with override) |
| Example | Siri setting a timer | GitHub Copilot suggesting code | Agent booking flights + hotel for a trip |

The Evolution of AI Systems

```
Rule-Based Bot → LLM Chatbot → RAG Chatbot → Tool-Using Assistant → AI Agent → Multi-Agent System
      ↓              ↓             ↓                 ↓                  ↓              ↓
  Hard-coded     Free-form     Grounded in     Can call APIs      Autonomous      Agents
  if/else        responses     your data       and functions      planning loop   collaborate
```

Each stage adds a capability: natural language → knowledge → action → autonomy → collaboration.

The Core Agent Loop

Every agent, regardless of framework, runs a variation of this loop:

```
┌──────────────────────────────────────┐
│ 1. OBSERVE ← User goal or env        │
│ 2. THINK   ← LLM reasons + plans     │
│ 3. ACT     ← Execute tool/action     │
│ 4. OBSERVE ← Check results           │
│ 5. REPEAT or STOP                    │
└──────────────────────────────────────┘
```

The ReAct pattern (Reason + Act) is the most common implementation: the LLM produces a Thought → Action → Observation cycle until the task is complete or a stop condition is met.
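A single ReAct turn can be made concrete by parsing the model's Thought/Action text before executing the tool. A hedged sketch: the `Action: name[input]` format below is one common prompting convention, not a standard, and real parsers need error handling for malformed completions:

```python
# One ReAct turn: the LLM emits Thought/Action text; the runtime parses it,
# runs the named tool, and feeds the result back as an Observation.
# Assumes the prompt instructed the model to use "Action: name[input]".
import re

def parse_react(completion: str):
    """Extract (thought, action, action_input) from a ReAct-style completion."""
    thought = re.search(r"Thought:\s*(.+)", completion).group(1)
    action = re.search(r"Action:\s*(\w+)\[(.*)\]", completion)
    return thought, action.group(1), action.group(2)

completion = "Thought: I need the current weather.\nAction: search[weather in Oslo]"
thought, tool_name, tool_input = parse_react(completion)
print(tool_name, "|", tool_input)  # search | weather in Oslo
```

With native tool calling (structured `tool_calls` in the API response), this text parsing disappears, but the Thought → Action → Observation control flow stays the same.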

```python
# Simplified agent loop (pseudocode)
def agent_loop(goal: str, tools: list, max_hops: int = 10):
    memory = [{"role": "user", "content": goal}]
    for hop in range(max_hops):
        response = llm.chat(memory, tools=tools)
        if response.finish_reason == "stop":
            return response.content  # Done
        # Record the assistant's tool-call turn, then execute the tool call
        memory.append({"role": "assistant", "content": response.content,
                       "tool_calls": response.tool_calls})
        result = execute_tool(response.tool_calls[0])
        memory.append({"role": "tool", "content": result})
    raise TimeoutError("Agent exceeded max hops")
```
⚠️

Always set max_hops (or equivalent). Without it, an agent can loop forever, burning tokens and money. Start with 10 hops, and increase only if your use case demands it.

Agent Frameworks Comparison

| Framework | Language | Strength | Pattern | Best For |
| --- | --- | --- | --- | --- |
| AutoGen | Python | Multi-agent conversations, code execution | ConversableAgent + GroupChat | Research, coding tasks, multi-agent debate |
| CrewAI | Python | Task delegation, role-based agents | Crew → Agent → Task with delegation | Business workflows, content pipelines |
| LangChain | Python/JS | LCEL chains, extensive tool ecosystem | AgentExecutor, LangGraph for cycles | RAG, tool-heavy pipelines, prototyping |
| Semantic Kernel | C#/Python/Java | Enterprise-grade, Azure-native | Plugins + Planners + Filters | Production .NET/Java apps, Azure integration |
| Microsoft Agent Framework | Python | Production SDK, Azure Foundry integration | Agent Service with tools + state | Enterprise deployment with eval + monitoring |
ℹ️

When to use what

  • Prototyping → LangChain (fastest ecosystem, most examples)
  • Multi-agent research → AutoGen (built for agent conversations)
  • Enterprise .NET → Semantic Kernel (first-class C# support)
  • Production Python → Microsoft Agent Framework (Foundry integration)
  • Business process → CrewAI (intuitive role/task model)

Multi-Agent Patterns

Supervisor Pattern

One orchestrator agent delegates to specialist agents and aggregates results.

```
        ┌─────────────┐
        │ Supervisor  │
        └──┬───┬───┬──┘
           │   │   │
      ┌────┘   │   └────┐
      ▼        ▼        ▼
 Researcher  Coder   Reviewer
```
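The delegate-and-aggregate flow can be sketched with stub functions standing in for LLM-backed specialists. The hard-coded plan is illustrative only; a real supervisor would ask the LLM to decompose the goal:

```python
# Supervisor pattern sketch: one orchestrator routes subtasks to specialist
# agents and aggregates their outputs. The specialists are stubs.
def researcher(task): return f"[research notes for: {task}]"
def coder(task):      return f"[code draft for: {task}]"
def reviewer(task):   return f"[review of: {task}]"

SPECIALISTS = {"research": researcher, "code": coder, "review": reviewer}

def supervisor(goal: str) -> str:
    # A real supervisor would have the LLM produce this plan dynamically;
    # here it is fixed to keep the sketch self-contained.
    plan = [("research", goal), ("code", goal), ("review", goal)]
    results = [SPECIALISTS[role](task) for role, task in plan]
    return "\n".join(results)   # aggregate specialist outputs

print(supervisor("add retry logic"))
```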

Swarm Pattern

Agents self-organize without central control. Each agent decides when to hand off to another.
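Decentralized handoff can be sketched as agents that each return either a final answer or the name of the agent to hand off to. The agent names and routing rules below are invented for illustration:

```python
# Swarm-style handoff sketch: no central orchestrator; each agent decides
# whether to finish or hand off by naming the next agent.
def triage(msg):
    return ("billing", msg) if "invoice" in msg else ("support", msg)

def billing(msg): return (None, f"billing handled: {msg}")
def support(msg): return (None, f"support handled: {msg}")

AGENTS = {"triage": triage, "billing": billing, "support": support}

def run_swarm(message: str, start: str = "triage") -> str:
    current = start
    while current is not None:               # follow handoffs until an agent finishes
        current, message = AGENTS[current](message)
    return message

print(run_swarm("invoice #42 is wrong"))  # billing handled: invoice #42 is wrong
```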

Pipeline Pattern

Sequential handoff: Agent A → Agent B → Agent C. Each agent transforms and passes output forward. Ideal for content generation (research → write → edit → publish).
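The sequential handoff reduces to function composition, with stubs standing in for the research → write → edit agents:

```python
# Pipeline pattern sketch: each stage transforms the previous stage's output.
# Stages are stubs for LLM-backed agents.
def research(topic): return f"notes on {topic}"
def write(notes):    return f"draft based on {notes}"
def edit(draft):     return f"polished {draft}"

def pipeline(topic: str, stages=(research, write, edit)) -> str:
    out = topic
    for stage in stages:        # sequential handoff: A -> B -> C
        out = stage(out)
    return out

print(pipeline("AI agents"))  # polished draft based on notes on AI agents
```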

Debate Pattern

Two+ agents argue opposing positions. A judge agent synthesizes the best answer. Useful for complex analysis where multiple perspectives improve accuracy.
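The coordination shape of a debate, two advocates plus a judge, can be sketched with stubs; in practice all three roles are LLM calls with opposing system prompts:

```python
# Debate pattern sketch: two stub agents argue opposing positions and a
# stub judge synthesizes. All names and outputs here are illustrative.
def pro(question):  return f"Pro: {question} is beneficial"
def con(question):  return f"Con: {question} carries risk"

def judge(arg_a, arg_b):
    # A real judge agent would weigh both arguments with an LLM call.
    return f"Verdict after weighing [{arg_a}] vs [{arg_b}]"

question = "autonomous deployment"
print(judge(pro(question), con(question)))
```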

Production Guardrails

| Guardrail | Why | Implementation |
| --- | --- | --- |
| Max hops | Prevent infinite loops | max_iterations=10 in agent config |
| Token budget | Control cost per request | max_tokens per hop + total budget |
| Timeout | Prevent hung agents | 60 s per tool call, 5 min per task |
| Audit trail | Compliance + debugging | Log every thought/action/observation |
| Human-in-the-loop | Safety for destructive actions | Require approval for writes/deletes |
| Sandboxing | Prevent code execution escapes | Docker containers for code agents |
| Content safety | Block harmful outputs | Azure Content Safety on every response |
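Several of these guardrails (max hops, token budget, timeout, audit trail) can be enforced by a single wrapper checked on every hop. A sketch with invented class names; the limits mirror the table above and should be tuned per use case:

```python
# Guardrail sketch: enforce hop, token, and wall-clock limits on every hop,
# and keep an audit trail of actions. Class names are illustrative.
import time

class BudgetExceeded(Exception):
    pass

class Guardrails:
    def __init__(self, max_hops=10, max_tokens=50_000, max_seconds=300):
        self.max_hops, self.max_tokens, self.max_seconds = max_hops, max_tokens, max_seconds
        self.hops = 0
        self.tokens = 0
        self.start = time.monotonic()
        self.audit = []                      # audit trail: every action taken

    def check(self, tokens_used: int, action: str) -> None:
        self.hops += 1
        self.tokens += tokens_used
        self.audit.append(action)            # log for compliance + debugging
        if self.hops > self.max_hops:
            raise BudgetExceeded("max hops")
        if self.tokens > self.max_tokens:
            raise BudgetExceeded("token budget")
        if time.monotonic() - self.start > self.max_seconds:
            raise BudgetExceeded("timeout")

g = Guardrails(max_hops=2)
g.check(1000, "search")
try:
    g.check(1000, "search")
    g.check(1000, "search")
except BudgetExceeded as e:
    print("stopped:", e)  # stopped: max hops
```

The agent loop calls `check` after every LLM response; any raised `BudgetExceeded` ends the run, and `audit` is what gets persisted for the audit trail.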

For infrastructure to run agents at scale, see O5: AI Infrastructure. For evaluation of agent quality, see O4: Azure AI Foundry.

Key Takeaways

  1. Agent = LLM + Memory + Tools + Planning: each component is necessary
  2. The agent loop (Observe → Think → Act) is universal across all frameworks
  3. Multi-agent patterns (Supervisor, Swarm, Pipeline, Debate) solve different coordination needs
  4. Production agents need guardrails: max hops, token budgets, timeouts, audit trails
  5. Choose your framework based on language, deployment target, and collaboration pattern, not hype