# O4: Azure AI Foundry

Azure AI Foundry is Microsoft's unified platform for building, evaluating, and deploying AI applications at enterprise scale. Think of it as the control plane for your entire AI lifecycle, from model selection through production monitoring. For the orchestration SDKs that build on Foundry, see O1: Semantic Kernel. For the infrastructure underpinning Foundry deployments, see O5: AI Infrastructure.

:::tip Think of it this way
If Azure Resource Manager (ARM) is the control plane for Azure infrastructure, Azure AI Foundry is the control plane for AI workloads: model deployment, prompt management, evaluation, and safety all in one place.
:::

## Evolution

| Azure ML Studio (2015) | Azure AI Studio (2023) | Azure AI Foundry (GA 2024+) |
|---|---|---|
| ML training only | AI + ML unified | Full AI lifecycle platform |
| | Preview | Production-ready |

Each generation expanded scope: ML-only → AI experiments → enterprise AI lifecycle management.

## Three Interfaces, One Platform

| Interface | Best For | Example |
|---|---|---|
| Portal (ai.azure.com) | Exploration, visual evaluation, prompt playground | Try models, compare outputs, review evals |
| Python SDK (`azure-ai-projects`) | Programmatic access, CI/CD integration, automation | Build evaluation pipelines, deploy models |
| CLI (`az ml`) | Scripting, infrastructure automation | Provision hubs/projects in pipelines |

All three interfaces manage the same underlying resources; choose based on your workflow.

## Hub and Project Model

Foundry uses a two-tier workspace hierarchy that separates shared infrastructure from team workspaces:

```
┌────────────────────────────────────────┐
│ AI Hub                                 │
│ Shared: connections, compute,          │
│ networking, security policies          │
│                                        │
│ ┌──────────────┐  ┌──────────────┐     │
│ │ Project A    │  │ Project B    │     │
│ │ Team Alpha   │  │ Team Beta    │     │
│ │ - Endpoints  │  │ - Endpoints  │     │
│ │ - Evals      │  │ - Evals      │     │
│ │ - Flows      │  │ - Flows      │     │
│ └──────────────┘  └──────────────┘     │
└────────────────────────────────────────┘
```
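The inheritance relationship in the diagram can be sketched as a small conceptual model. This is purely illustrative (the class names, fields, and methods are invented for this sketch, not part of any Azure SDK): projects hold their own endpoints but resolve shared connections by falling through to the hub.

```python
from dataclasses import dataclass, field

@dataclass
class Hub:
    """Organization-level workspace: owns shared connections (conceptual sketch)."""
    name: str
    connections: dict = field(default_factory=dict)  # e.g. {"aoai": "<endpoint url>"}
    projects: list = field(default_factory=list)

    def create_project(self, name: str) -> "Project":
        project = Project(name=name, hub=self)
        self.projects.append(project)
        return project

@dataclass
class Project:
    """Team-level workspace: owns endpoints, inherits the hub's connections."""
    name: str
    hub: Hub
    endpoints: list = field(default_factory=list)

    def resolve_connection(self, key: str) -> str:
        # Projects do not own connections; lookups fall through to the hub
        return self.hub.connections[key]

hub = Hub(name="org-hub", connections={"aoai": "https://example.openai.azure.com"})
team_a = hub.create_project("project-alpha")
team_b = hub.create_project("project-beta")
print(team_a.resolve_connection("aoai"))  # both teams resolve the same shared connection
```

Both projects see the same `aoai` connection while keeping separate endpoint lists, which is the isolation-plus-sharing trade-off the Hub/Project model is designed around.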

### Hub (Organization Level)

| Responsibility | What It Manages |
|---|---|
| Connections | Azure OpenAI, AI Search, Storage; shared across projects |
| Compute | Shared compute instances for training and inference |
| Networking | Private endpoints, VNet integration, firewall rules |
| Security | RBAC policies, managed identity, Key Vault integration |
| Governance | Content safety policies, model access controls |

### Project (Team Level)

| Responsibility | What It Manages |
|---|---|
| Endpoints | Model deployments specific to this team |
| Evaluations | Quality metrics, test datasets, evaluation runs |
| Prompt Flow | RAG/agent orchestration flows |
| Data | Datasets, indexes, and data connections |
| Artifacts | Prompt versions, flow snapshots, evaluation history |

:::info Hub ↔ Project relationship
One Hub → many Projects. Projects inherit the Hub's connections and security. Teams get isolation (their own endpoints, evals, data) while sharing expensive infrastructure (compute, networking).
:::

## Model Catalog

Foundry's Model Catalog provides access to 1,700+ models from multiple providers:

| Provider | Notable Models | Deployment Type |
|---|---|---|
| OpenAI | GPT-4o, GPT-4o-mini, o1, o3 | Managed (Azure OpenAI) |
| Meta | Llama 3.1 (8B/70B/405B) | Serverless API or Managed Compute |
| Mistral | Mistral Large, Mixtral | Serverless API |
| Cohere | Command R, Command R+ | Serverless API |
| Microsoft | Phi-3, Phi-3.5, Florence | Serverless API or Managed Compute |

### Deployment Types

| Type | How It Works | Billing | Best For |
|---|---|---|---|
| Serverless API | Pay-per-token, no infrastructure to manage | Token-based pricing | Variable/unpredictable workloads |
| Managed Compute | Dedicated VM(s) running the model | VM hourly rate | Consistent high throughput, custom models |
| Global | Microsoft-hosted, multi-region | Token-based pricing | Highest availability, lowest latency |

For GPU sizing and PTU vs PAYG decisions, see O5: AI Infrastructure.
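The serverless-versus-managed decision often comes down to a break-even throughput: below it, pay-per-token is cheaper; above it, a dedicated VM wins. A minimal sketch of that arithmetic, using entirely hypothetical rates (check the Azure pricing pages for real numbers):

```python
def breakeven_tokens_per_hour(vm_rate_per_hour: float, price_per_1k_tokens: float) -> float:
    """Tokens/hour above which a dedicated VM costs less than pay-per-token."""
    return vm_rate_per_hour / price_per_1k_tokens * 1_000

# Hypothetical rates for illustration only
vm_rate = 6.50      # $/hour for a managed-compute GPU VM (assumed)
serverless = 0.01   # $ per 1K tokens on a serverless API (assumed)

threshold = breakeven_tokens_per_hour(vm_rate, serverless)
print(f"Managed compute wins above {threshold:,.0f} tokens/hour sustained")
```

If your sustained throughput sits well below the break-even point, or is bursty, serverless avoids paying for idle capacity; well above it, managed compute (or PTUs, covered in O5) caps the bill.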

## Evaluation Pipelines

Foundry provides built-in evaluation for measuring AI quality, which is critical for production deployments:

### Built-in Metrics

| Metric | What It Measures | Scale | Target |
|---|---|---|---|
| Groundedness | Are claims supported by the provided context? | 1–5 | ≥ 4.0 |
| Relevance | Does the response address the user's question? | 1–5 | ≥ 4.0 |
| Coherence | Is the response logically structured and readable? | 1–5 | ≥ 4.0 |
| Fluency | Is the language natural and grammatically correct? | 1–5 | ≥ 4.0 |
| Similarity | How close is the response to a ground truth answer? | 1–5 | ≥ 3.5 |

### Running Evaluations

```python
from azure.identity import DefaultAzureCredential
from azure.ai.projects import AIProjectClient
from azure.ai.evaluation import GroundednessEvaluator, RelevanceEvaluator

# Connect to the Foundry project (conn_str comes from the project's overview page)
project = AIProjectClient.from_connection_string(
    conn_str=conn_str,
    credential=DefaultAzureCredential(),
)

# Create AI-assisted evaluators, each backed by a judge-model configuration
groundedness = GroundednessEvaluator(model_config)
relevance = RelevanceEvaluator(model_config)

# Evaluate a JSONL dataset of query/response/context rows
results = project.evaluations.create(
    data="test_dataset.jsonl",
    evaluators={
        "groundedness": groundedness,
        "relevance": relevance,
    },
)
print(results.metrics)
# {"groundedness": 4.3, "relevance": 4.1}
```
:::warning
Never deploy an AI application without running evaluations first. Evaluation is not optional: it is the quality gate between development and production. Set minimum thresholds and fail the pipeline if they're not met.
:::
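In CI/CD, that quality gate reduces to comparing evaluation results against the target floors from the metrics table. A minimal, self-contained sketch (the function and threshold names are invented for illustration; in practice you would feed it the `metrics` dict returned by your evaluation run):

```python
# Minimum scores, matching the targets in the built-in metrics table
THRESHOLDS = {"groundedness": 4.0, "relevance": 4.0, "coherence": 4.0, "fluency": 4.0}

def quality_gate(metrics: dict, thresholds: dict = THRESHOLDS) -> list:
    """Return the names of metrics that fall below their minimum threshold."""
    return [name for name, floor in thresholds.items()
            if metrics.get(name, 0.0) < floor]

# Example run: relevance misses its 4.0 floor
failures = quality_gate({"groundedness": 4.3, "relevance": 3.7,
                         "coherence": 4.5, "fluency": 4.8})
if failures:
    print(f"Quality gate failed: {failures}")
    # raise SystemExit(1)  # fail the CI job so the deploy stage never runs
```

A missing metric counts as a failure (it defaults to 0.0), which keeps the gate conservative: a broken evaluation run cannot silently pass.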

## Prompt Flow

Prompt Flow is Foundry's visual orchestration tool for building RAG and agent pipelines:

```
┌───────────┐    ┌─────────────┐    ┌────────────┐    ┌────────────┐
│ Input     │───►│ Embedding   │───►│ Search     │───►│ LLM        │──► Output
│ (query)   │    │ (vectorize) │    │ (retrieve) │    │ (generate) │
└───────────┘    └─────────────┘    └────────────┘    └────────────┘
```
| Feature | Description |
|---|---|
| Visual editor | Drag-and-drop DAG for prompt chains |
| Variants | A/B test different prompts side by side |
| Bulk testing | Run flows against test datasets |
| Deployment | One-click deploy to managed endpoint |
| Tracing | Step-by-step execution trace for debugging |
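Stripped of the visual editor, a flow like the diagram above is just a chain of typed nodes. A minimal sketch where every function body is a stub standing in for the real embedding, search, and LLM calls (all names here are invented for illustration):

```python
def embed(query: str) -> list[float]:
    # Stand-in for an embedding-model call: query text -> vector
    return [float(ord(c)) for c in query[:4]]

def search(vector: list[float], index: dict) -> str:
    # Stand-in for a vector-search retrieval step against an index
    return index.get("best_match", "")

def generate(query: str, context: str) -> str:
    # Stand-in for the LLM generation step, grounded on retrieved context
    return f"Answering '{query}' using context: {context}"

def run_flow(query: str, index: dict) -> str:
    # input -> embed -> search -> generate, matching the DAG above
    return generate(query, search(embed(query), index))

index = {"best_match": "Foundry hubs share connections across projects."}
print(run_flow("What do hubs share?", index))
```

Prompt Flow adds what this sketch lacks: per-node tracing, prompt variants for A/B testing, bulk runs over datasets, and one-click deployment of the whole chain as an endpoint.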

## When to Use What

| Scenario | Use |
|---|---|
| Quick model experimentation | Portal playground |
| Building a RAG pipeline | Prompt Flow + AI Search connection |
| Automated evaluation in CI/CD | Python SDK + evaluation pipeline |
| Deploying to production | Managed endpoint with content safety |
| Multi-team AI development | Hub + per-team Projects |

## Key Takeaways

1. Azure AI Foundry is the unified control plane for the AI lifecycle: build, evaluate, deploy, monitor.
2. The Hub/Project model separates shared infrastructure from team workspaces.
3. The Model Catalog provides 1,700+ models with serverless or managed deployment options.
4. Evaluation pipelines with built-in metrics are essential quality gates before production.
5. Prompt Flow provides visual orchestration for RAG and agent pipelines.
6. Use the Portal for exploration, the SDK for automation, and the CLI for infrastructure scripting.