Building Multi-Agent Solutions with Azure AI Foundry

Introduction

Most enterprises have already cleared the first hurdle - proving that AI agents work. The harder question is what comes next: how do you coordinate multiple specialized agents to run a compliance check, process a loan, or manage a customer onboarding workflow that spans five systems and three departments?

Single-agent proofs of concept are deceptively easy to build. Production-scale automation is not. The gap between the two involves orchestration logic, state management, governance controls, and cross-domain specialization that no single agent can carry alone.

This article covers the architecture of multi-agent systems in Foundry Agent Service, Microsoft's managed platform for enterprise agent development. It examines the two primary orchestration patterns, Connected Agents and Multi-Agent Workflows: how each works, and which enterprise scenarios each is built for.

TLDR

Single agents break down when processes span multiple domains, involve multiple decision points, or require persistent state
Foundry Agent Service offers two patterns: Connected Agents (lightweight delegation) and Multi-Agent Workflows (stateful orchestration)
Connected Agents need no custom orchestrator; workflows handle long-running, complex processes
Both patterns support governance by design - built-in tracing, evaluators, RBAC, and auditability make them viable for regulated industries

Why Single Agents Fall Short in the Real World

A single agent handling a customer onboarding workflow quickly becomes a liability. It needs to authenticate the user, retrieve account data, run a credit check, verify compliance requirements, and route the case - all while maintaining context across every step. Add a failure at any one point and the whole process breaks. There's no isolation, no recovery path, and no clean way to debug which step went wrong.

That's an architecture problem, not a model quality problem. Three properties explain why multi-agent systems solve what single agents cannot:

Horizontal scalability - additional agents handle additional task volume without creating bottlenecks in a single processing thread
Specialization - each agent is tuned for a narrow domain, which means better performance and easier maintenance than a general-purpose agent stretched across roles
Composability - purpose-built agents can be reused across different workflows, reducing duplication and accelerating future development

[Figure: The three core multi-agent system properties: scalability, specialization, and composability]

The enterprise AI adoption data shows the stakes. McKinsey's 2025 State of AI survey found that 23% of organizations are actively scaling an agentic AI system somewhere in the enterprise - a signal that production deployments have moved well past the pilot stage.

At the same time, Gartner predicts over 40% of agentic AI projects will be canceled by end of 2027 due to escalating costs, unclear value, or inadequate risk controls. The architectural choices made now - particularly around orchestration and governance - will determine which category an organization falls into.

Foundry Agent Service: What's Under the Hood

Foundry Agent Service is Microsoft's fully managed platform for building, deploying, and scaling AI agents within the Azure AI Foundry ecosystem. It handles hosting, tool connectivity, observability, and state management so developers can focus on agent logic rather than infrastructure plumbing.

The platform draws from the Foundry model catalog - GPT-4o, Llama, and DeepSeek are all supported - and offers three agent types: Prompt agents (generally available), Workflow agents (preview), and Hosted agents (preview).

Four Platform Pillars for Multi-Agent Development

Pillar	What It Provides
Tool catalog	Built-in tools (web search, file search, code interpreter) plus custom tools via MCP, OpenAPI, Azure Functions, and A2A endpoints
Protocol support	MCP for standardized tool connectivity; A2A for cross-platform agent interoperability
Orchestration frameworks	Microsoft Agent Framework combining AutoGen's agent abstractions with Semantic Kernel's enterprise features
Observability	OpenTelemetry-based tracing, built-in evaluators, Agent Monitoring Dashboard

Development Surfaces

Teams can engage the platform at whatever level of abstraction suits them:

Foundry portal - no-code/low-code visual interface for building and testing agents
VS Code extension - YAML-based workflow design with visual and code views synchronized
SDKs - Python (Azure AI Agents client library) and .NET (Azure.AI.Agents.Persistent)

Connected Agents: Delegating Tasks Without Custom Orchestration

How the Pattern Works

Connected Agents let a primary agent delegate tasks to specialized sub-agents using natural language routing. No hardcoded orchestration logic required - the main agent interprets user intent, selects the appropriate sub-agent as a tool, and compiles the final response. Sub-agent responses are visible only to the main agent, not the end user.

A Sales Assist Agent illustrates the pattern well. Rather than one overloaded agent trying to handle every sales query, the main agent delegates to:

A Market Research Agent for industry and segment data
A Competitive Analysis Agent for positioning and competitor comparisons
A Customer Insights Agent for account history and behavioral signals
A Financial Analysis Agent for deal economics and forecasting

Each sub-agent stays focused on its domain. The main agent assembles the final output. Swapping or adding a sub-agent doesn't require touching the main agent's logic.
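The hub-and-spoke delegation described above can be sketched in a few lines of Python. This is a toy illustration, not the Foundry SDK: in the service, the main agent's model reads each sub-agent's description and decides where to route, whereas here a simple word-overlap score stands in for that decision. The SubAgent class, the route and main_agent functions, and the handler stubs are all hypothetical names introduced for this example.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass
class SubAgent:
    name: str
    description: str               # routing description read by the main agent
    handle: Callable[[str], str]   # stub standing in for the sub-agent's work


def _words(text: str) -> set:
    return set(re.findall(r"[a-z]+", text.lower()))


def route(query: str, sub_agents: list) -> SubAgent:
    """Pick the sub-agent whose description shares the most words with the query.

    Assumption: word overlap is a stand-in for the model-driven routing
    that the real service performs.
    """
    query_words = _words(query)
    return max(sub_agents, key=lambda a: len(query_words & _words(a.description)))


def main_agent(query: str, sub_agents: list) -> str:
    chosen = route(query, sub_agents)
    sub_response = chosen.handle(query)  # seen only by the main agent
    return f"[{chosen.name}] {sub_response}"


SUB_AGENTS = [
    SubAgent("market_research", "industry and segment data",
             lambda q: "segment overview"),
    SubAgent("competitive_analysis", "positioning and competitor comparisons",
             lambda q: "competitor summary"),
    SubAgent("customer_insights", "account history and behavioral signals",
             lambda q: "account summary"),
    SubAgent("financial_analysis", "deal economics and forecasting",
             lambda q: "deal forecast"),
]

print(main_agent("Summarize the account history for this customer", SUB_AGENTS))
```

Adding a fifth sub-agent in this sketch means appending one entry to SUB_AGENTS; main_agent itself does not change, which mirrors the property the pattern is meant to provide.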

[Figure: Connected Agents hub-and-spoke delegation pattern with four specialized sub-agents]

Key Features for Enterprise Use

Routing is handled through natural language descriptions - no custom orchestrator needed
Add new sub-agents without modifying the main agent's logic
Focused responsibilities per agent make debugging substantially cleaner
Deploy via the Foundry portal UI, Python SDK, or .NET SDK

Technical Setup (High-Level)

Initialize the agent client
Create the sub-agent with its specific capabilities
Define a ConnectedAgentToolDefinition with a descriptive name and routing description
Create the main agent with the connected agent registered as a tool
Create a thread, run the agent, collect responses
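To make the order of the five steps concrete, here is a minimal sketch that walks through them against an in-memory stand-in. It deliberately does not call the Azure SDK: InMemoryAgentClient, Agent, and ConnectedTool are hypothetical classes written for this illustration, and their method names are assumptions rather than the real client's API. Consult Microsoft's SDK documentation for the actual types.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Agent:
    id: str
    name: str
    instructions: str
    tools: list = field(default_factory=list)


@dataclass
class ConnectedTool:
    """Hypothetical stand-in for a connected-agent tool definition."""
    agent_id: str
    name: str
    description: str


class InMemoryAgentClient:
    """Toy client that keeps agents and threads in dictionaries."""

    def __init__(self):
        self._ids = count(1)
        self.agents = {}
        self.threads = {}

    def create_agent(self, name, instructions, tools=None):
        agent = Agent(f"agent_{next(self._ids)}", name, instructions, list(tools or []))
        self.agents[agent.id] = agent
        return agent

    def create_thread(self):
        thread_id = f"thread_{next(self._ids)}"
        self.threads[thread_id] = []
        return thread_id

    def run(self, thread_id, agent_id, user_message):
        messages = self.threads[thread_id]
        messages.append(("user", user_message))
        main = self.agents[agent_id]
        # Assumption: the toy run calls every registered tool; the real
        # service lets the model decide which connected agent to invoke.
        for tool in main.tools:
            sub = self.agents[tool.agent_id]
            messages.append(("assistant", f"{main.name} used {tool.name} ({sub.name})"))
        return messages


# 1. Initialize the client
client = InMemoryAgentClient()
# 2. Create the sub-agent
research = client.create_agent("market_research_agent",
                               "Answer questions about industries and segments.")
# 3. Define the connected tool with a name and routing description
tool = ConnectedTool(research.id, "market_research", "Industry and segment data")
# 4. Create the main agent with the connected agent registered as a tool
main = client.create_agent("sales_assist_agent",
                           "Delegate to tools and compile the answer.", tools=[tool])
# 5. Create a thread, run the agent, collect responses
thread_id = client.create_thread()
messages = client.run(thread_id, main.id, "How large is the retail segment?")
```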

Microsoft documents both ConnectedAgentToolDefinition and ConnectedAgentTool for Python and .NET. Microsoft's .NET samples include Sample23_PersistentAgents_Connected_Agent as a reference implementation.

Limitations to Plan Around

Three constraints matter for enterprise planning:

No local function calling - use OpenAPI tools or Azure Functions instead
Citation passthrough is not guaranteed - sub-agent citations may not surface in the main agent's response
Maximum delegation depth of 2 levels - sub-agents cannot spawn their own sub-agents
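The depth limit is easy to check at design time before anything is deployed. The sketch below is a plain-Python validator, not a platform feature: it assumes the planned delegation tree is written down as a dictionary mapping each agent name to the agents it delegates to, and it interprets "2 levels" as the main agent plus one layer of sub-agents. The function names and agent names are hypothetical.

```python
MAX_DEPTH = 2  # assumption: main agent plus one layer of sub-agents


def delegation_depth(agent: str, connections: dict, _path: tuple = ()) -> int:
    """Return the number of levels in the delegation tree rooted at `agent`."""
    if agent in _path:
        raise ValueError(f"cycle detected at {agent!r}")
    children = connections.get(agent, [])
    if not children:
        return 1
    return 1 + max(delegation_depth(c, connections, _path + (agent,)) for c in children)


def validate_plan(root: str, connections: dict) -> int:
    """Raise if the planned tree is deeper than MAX_DEPTH; otherwise return its depth."""
    depth = delegation_depth(root, connections)
    if depth > MAX_DEPTH:
        raise ValueError(f"delegation depth {depth} exceeds the limit of {MAX_DEPTH}")
    return depth


valid_plan = {"sales_assist": ["market_research", "financial_analysis"]}
invalid_plan = {
    "sales_assist": ["market_research"],
    "market_research": ["data_collector"],  # a sub-agent with its own sub-agent
}
```

Calling validate_plan("sales_assist", valid_plan) passes, while the same call with invalid_plan raises a ValueError because a sub-agent has a sub-agent of its own.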

Multi-Agent Workflows: Stateful Orchestration for Complex Enterprise Processes

How Workflows Differ from Connected Agents

Multi-Agent Workflows are the structured, stateful orchestration layer built on the Microsoft Agent Framework. They're designed for long-running, multi-step processes where agents must maintain context across turns, failures must be recoverable, and execution must follow explicit conditional logic.

Where Connected Agents handle discrete delegable tasks, workflows handle processes with multiple checkpoints - think customer billing support that moves through authentication, triage, and resolution stages, with different agents responsible at each stage.

Orchestration Concepts

The workflow engine is built around four core constructs: nodes (execution points that can run agents in parallel), edges (connections between nodes), conditions (rule-based or model-driven transition logic), and variables (scoped data passed between stages without overwrites).
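A small executor shows how the four constructs fit together. This is a simplified sketch rather than the workflow engine itself: it runs nodes one at a time (no parallelism), keeps variables in a single shared dictionary, and uses plain Python callables for conditions. Node, Edge, run_workflow, and the billing-support stages are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Callable


def always(variables: dict) -> bool:
    return True


@dataclass
class Edge:
    target: str
    condition: Callable[[dict], bool] = always  # transition rule


@dataclass
class Node:
    name: str
    action: Callable[[dict], dict]  # returns updates to the workflow variables
    edges: list = field(default_factory=list)


def run_workflow(nodes: dict, start: str, variables: dict, max_steps: int = 20) -> list:
    """Run nodes from `start`, following the first edge whose condition holds."""
    path = []
    current = start
    while current is not None and len(path) < max_steps:
        node = nodes[current]
        path.append(node.name)
        variables.update(node.action(variables))
        current = next((e.target for e in node.edges if e.condition(variables)), None)
    return path


KNOWN_USERS = {"u-100"}

BILLING_SUPPORT = {
    "authentication": Node(
        "authentication",
        lambda v: {"authenticated": v["user_id"] in KNOWN_USERS},
        [Edge("triage", lambda v: v["authenticated"])],
    ),
    "triage": Node(
        "triage",
        lambda v: {"category": "refund" if "refund" in v["request"].lower() else "general"},
        [Edge("resolution")],
    ),
    "resolution": Node("resolution", lambda v: {"status": "resolved:" + v["category"]}),
}
```

With an authenticated user the run visits all three stages; with an unknown user the condition on the first edge fails and the workflow stops after authentication, which is the kind of explicit branching the constructs are meant to express.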

Templates accelerate setup:

Sequential - linear step-by-step execution
Human-in-the-loop - approval gates that pause for human review
Group Chat - multi-agent collaborative discussion patterns

Data passes between agents through structured JSON schema outputs and scoped variables, which matters for enterprise data integrity - especially in regulated workflows where you need a clear record of what data was passed where.
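One way to picture a structured handoff is a check that runs between two agents: parse the sender's JSON output, verify it against the expected fields, and record what was passed to whom. The sketch below uses only the standard library and a deliberately simple field-to-type mapping in place of a full JSON Schema validator. The schema fields, the hand_off function, and the agent names are assumptions made for illustration.

```python
import json

# Hypothetical output schema for a clause-extraction agent.
CLAUSE_SCHEMA = {"clause_id": str, "text": str, "risk_score": float}

handoff_log = []  # record of what was passed where


def hand_off(sender: str, receiver: str, raw_output: str, schema: dict) -> dict:
    """Validate a sender's JSON output and log the handoff before passing it on."""
    payload = json.loads(raw_output)
    missing = [k for k in schema if k not in payload]
    if missing:
        raise ValueError(f"{sender} output is missing fields: {missing}")
    wrong = [k for k, t in schema.items() if not isinstance(payload[k], t)]
    if wrong:
        raise ValueError(f"{sender} output has wrong types for: {wrong}")
    handoff_log.append({"from": sender, "to": receiver, "fields": sorted(payload)})
    return payload


good_output = json.dumps({"clause_id": "c-7", "text": "Termination clause", "risk_score": 0.8})
clause = hand_off("clause_extraction", "compliance_validation", good_output, CLAUSE_SCHEMA)
```

A payload that omits risk_score or carries it as a string is rejected before the receiving agent ever sees it, and handoff_log keeps the kind of trail a regulated workflow would need.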

[Figure: The four core constructs of a Multi-Agent Workflow: nodes, edges, conditions, and variables]

Workflow Toolset

Tool	Description
Visual drag-and-drop builder	No orchestration code required; available in Foundry portal
YAML workflow definitions	Synchronized with visual view for code-visual parity
Power Fx expressions	Condition logic and computed values
Version history	Immutable snapshot created on every save

Enterprise Use Cases That Require Stateful Orchestration

These are the scenarios where stateful orchestration isn't optional - it's the only viable architecture:

Software development lifecycle - story generation flows into test case design, then QA validation, with human-in-the-loop gates before each stage advances. State must persist across sessions spanning days.
Contract review and compliance - clause extraction, compliance validation, and risk flagging require strict sequential handoffs. Losing state at any stage means a missed compliance check.
B2B SaaS renewal management - multi-week cycles coordinating customer success, billing, and account management agents require durable, multi-party state that Connected Agents alone cannot sustain.
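What "state must persist across sessions" means in practice can be sketched with a checkpoint file and an approval gate. This is not how the platform stores workflow state; it is a minimal stand-in that assumes state is kept as JSON on disk and that a human decision arrives as a boolean. The stage names follow the software development lifecycle example above, and load_state and advance are hypothetical helpers.

```python
import json
from pathlib import Path

STAGES = ["story_generation", "test_case_design", "qa_validation"]


def load_state(path: Path) -> dict:
    """Read the checkpoint, or start fresh if none exists yet."""
    if path.exists():
        return json.loads(path.read_text())
    return {"next_stage": 0, "completed": []}


def advance(path: Path, approved: bool) -> dict:
    """Complete the next stage only if a human approved it; persist state either way."""
    state = load_state(path)
    if approved and state["next_stage"] < len(STAGES):
        state["completed"].append(STAGES[state["next_stage"]])
        state["next_stage"] += 1
    path.write_text(json.dumps(state))
    return state
```

Because every call reloads the checkpoint from disk, a process that restarts days later picks up at the stage where the last approval left off, and a rejected approval leaves the workflow paused at the same gate.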

When to Use Connected Agents vs. Multi-Agent Workflows

Decision Framework

Dimension	Connected Agents	Multi-Agent Workflows
Process complexity	Discrete, delegable tasks	Multi-step with conditional branching
State requirements	Stateless per interaction	Persistent state across turns
Control needs	Natural language routing is sufficient	Fine-grained execution control needed
Time horizon	Short-lived task completion	Long-running or durable processes

When to Choose Each

Connected Agents fit when:

The primary value is modular specialization without complex sequencing
Development speed matters and orchestration overhead is undesirable
Tasks can be fully described in natural language routing instructions

Multi-Agent Workflows fit when:

Processes involve regulatory checkpoints or human approval gates
Agents need to run in parallel and sync before proceeding
The workflow spans hours, days, or weeks - not a single session
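The decision framework can be encoded as a small helper for design reviews. The rule used here is an assumption layered on top of the table: if any of the four dimensions points to workflows, recommend workflows; otherwise recommend Connected Agents. ProcessProfile and recommend_pattern are hypothetical names, and real decisions will weigh more than four booleans.

```python
from dataclasses import dataclass


@dataclass
class ProcessProfile:
    conditional_branching: bool   # process complexity
    persistent_state: bool        # state requirements
    fine_grained_control: bool    # control needs
    long_running: bool            # time horizon


def recommend_pattern(profile: ProcessProfile) -> str:
    """Recommend workflows if any dimension calls for them, else Connected Agents."""
    needs_workflow = (
        profile.conditional_branching
        or profile.persistent_state
        or profile.fine_grained_control
        or profile.long_running
    )
    return "Multi-Agent Workflows" if needs_workflow else "Connected Agents"


sales_lookup = ProcessProfile(False, False, False, False)
contract_review = ProcessProfile(True, True, True, False)
```

Under this rule, recommend_pattern(sales_lookup) returns "Connected Agents" and recommend_pattern(contract_review) returns "Multi-Agent Workflows".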

Combining Both in a Single System

A workflow can incorporate Connected Agents within its nodes. A team building a procurement automation system might use a workflow to manage the approval lifecycle while Connected Agents handle the research and vendor comparison tasks within individual stages.

Starting with Connected Agents for simpler delegation, then scaling up to full workflows as process complexity grows, is a common path for teams to take.

Governance, Observability, and Enterprise-Grade Deployment

Observability Stack

Foundry Agent Service tracing captures every agent call, input, output, tool usage, retry, and latency - using OpenTelemetry semantic conventions. Multi-agent spans like agent_to_agent_interaction and agent_planning provide visibility into cross-agent execution paths. Tracing is generally available for Prompt agents and in preview for Workflow agents.
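To show what nested multi-agent spans capture, here is a standard-library sketch of a tracer that records a name, a parent, attributes, a status, and a duration for each span. It is not the OpenTelemetry API and not the platform's tracing implementation; the Tracer class is hypothetical, and only the two span names are taken from the text above.

```python
import time
from contextlib import contextmanager


class Tracer:
    """Toy tracer that records finished spans in a list."""

    def __init__(self):
        self.spans = []
        self._stack = []

    @contextmanager
    def span(self, name: str, **attributes):
        record = {
            "name": name,
            "parent": self._stack[-1] if self._stack else None,
            "attributes": attributes,
            "status": "ok",
        }
        self._stack.append(name)
        start = time.perf_counter()
        try:
            yield record
        except Exception as exc:
            record["status"] = "error"
            record["attributes"]["error"] = repr(exc)
            raise
        finally:
            record["duration_s"] = time.perf_counter() - start
            self._stack.pop()
            self.spans.append(record)  # spans are stored in the order they finish


tracer = Tracer()
with tracer.span("agent_planning", agent="sales_assist"):
    with tracer.span("agent_to_agent_interaction", target="market_research"):
        pass  # the delegated call would happen here
```

The inner span finishes first and records agent_planning as its parent, so the cross-agent execution path can be rebuilt from the parent links even when spans arrive out of start order.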

The Agent Monitoring Dashboard tracks:

Token usage and latency
Success rates and error patterns
Evaluation results for quality and safety metrics
Scheduled evaluations and red-team scans (public preview)

Built-in evaluators cover task adherence, coherence, tool-call accuracy, and task completion - the signals that actually matter for regulated deployments.

Security and Governance Controls

Microsoft's platform provides several governance layers out of the box:

RBAC - Foundry roles (Foundry User, Project Manager, Account Owner, Foundry Owner) with Microsoft Entra ID recommended for production
Data privacy - customer data is stored in the customer's Azure tenant, protected by AES-256 encryption by default, with customer-managed key options available
Data isolation - prompts and generated content are not shared with other customers or used to train third-party providers' models
Auditability - diagnostic logging to Azure Monitor, token usage logs, and Microsoft Purview Audit integration

[Figure: Foundry Agent Service enterprise governance layers: RBAC, encryption, tracing, and auditability]

For organizations in healthcare, financial services, or energy, these controls address the baseline requirements. Closing the gap between "platform controls available" and "compliance achieved" depends on how governance is embedded at the architecture level.

Cross-Platform Interoperability

Platform-level governance only goes so far when agents need to coordinate across systems outside Azure. Foundry Agent Service addresses this with A2A (Agent-to-Agent) endpoints as a custom tool option in preview. The A2A protocol - announced by Google in April 2025 as an open standard - enables agents across enterprise systems to communicate and coordinate actions. SAP has announced A2A support in Joule, expanding the interoperability ecosystem beyond any single vendor's platform.

MCP support standardizes tool connectivity across the agent ecosystem. Together, A2A and MCP reduce the risk of building in ways that create vendor lock-in.

Moving from Architecture to Production

For organizations that need to go from design to production-ready deployment with governance built in from the start - rather than retrofitted - working with an AI engineering partner like Cybic can compress that timeline. Cybic embeds RBAC, encrypted data handling, audit trails, and compliance alignment (SOC 2, HIPAA, ISO, GDPR) at the infrastructure level - not as a final checklist, but as part of the initial architecture across every multi-agent engagement.

Frequently Asked Questions

What is the difference between Connected Agents and Multi-Agent Workflows in Azure AI Foundry?

Connected Agents use natural language task delegation from a primary agent to sub-agents - stateless, no orchestration code required. Multi-Agent Workflows provide a stateful, structured execution layer with explicit nodes, conditions, and failure recovery, suited for complex, long-running processes with regulatory or approval checkpoints.

Can Foundry Agent Service multi-agent systems work with non-Microsoft platforms and frameworks?

Yes. Via A2A protocol support (currently in preview), Foundry agents can expose endpoints that external orchestrators - including SAP Joule and Google's infrastructure - can call directly. MCP support adds standardized tool connectivity across frameworks. Verify current certified interoperability against Microsoft documentation before committing to specific third-party integrations.

What programming languages and SDKs are supported?

Python and .NET are the primary SDK options. Beyond code-first development, several interfaces are available:

No-code: Foundry portal
YAML-based design: VS Code extension
Orchestration framework: Microsoft Agent Framework, which builds on Semantic Kernel (Python and .NET)

What are the known limitations of Connected Agents?

Three constraints to plan for: local function calling is not supported (use OpenAPI tools or Azure Functions instead), citation passthrough from sub-agents to the main agent is not guaranteed, and the maximum delegation depth is 2 levels - sub-agents cannot spawn their own sub-agents.

How does Foundry Agent Service handle governance and security for multi-agent deployments?

Built-in controls include RBAC via Foundry roles, AES-256 encryption with customer-managed key options, OpenTelemetry-based execution tracing, and post-deployment monitoring with scheduled safety evaluations. Microsoft Entra ID is recommended for production authentication. Additional governance at the architecture level remains the organization's responsibility.

How should enterprises evaluate readiness to move from single-agent POC to multi-agent production?

Assess readiness across four dimensions before expanding to production:

Does the target process span multiple specialized roles or decision points?
Does state need to persist across interactions?
Are human-in-the-loop approvals required?
Is observability and governance tooling in place to monitor agent behavior at scale?