
The Three Protocols Every AI Agent Will Speak

The AI agent protocol stack has three layers: MCP for tools, A2A for agent-to-agent communication, and WebMCP for browser interaction. A practitioner's guide to how they work together in production.

Dean Grover, Co-founder
March 20, 2026
18 min read
Three-layer protocol stack diagram showing MCP, A2A, and WebMCP working together for AI agents

A customer asks your AI agent: "Return the laptop I ordered last week, and find me a replacement under $1,000."

That single request touches three systems. The agent needs to call your returns API (a tool). It needs to hand off the product search to a specialist agent built by another team (agent coordination). And that specialist needs to browse your e-commerce catalog the way a human would, except reliably (browser interaction).

Twelve months ago, each of those connections required custom integration code. Today, each has a protocol. We run MCP in production for tool execution across voice, chat, and messaging. So when we say the agent protocol stack has crystallized into three layers, it's not a prediction. It's what we build against daily.

Here's the stack: MCP for tools. A2A for agent coordination. WebMCP for browser interaction. This article follows that laptop return request through all three layers -- from the first tool call to the final recommendation.


Layer 1: MCP -- The Universal Tool Interface

Back to our laptop return. The first thing the returns agent needs is order data -- it has to call your order management system. That's MCP's job: connecting agents to tools, databases, APIs, and services through a standard interface.

Anthropic open-sourced MCP in November 2024. By March 2026 it had been donated to the Agentic AI Foundation (AAIF) under the Linux Foundation, which Anthropic co-founded with OpenAI and Block. The ecosystem: 10,000+ active servers, 97 million monthly SDK downloads, and client support in Claude, ChatGPT, Cursor, Gemini, Microsoft Copilot, and VS Code.

How It Works

MCP uses a client-server architecture on JSON-RPC 2.0. A client (in your agent or IDE) connects to a server that exposes four capability types: Tools (executable actions), Resources (read-only data), Prompts (reusable templates), and Sampling (a reverse channel where the server requests LLM completions from the client).

Here's an MCP server that exposes the order lookup tool our returns agent needs:

typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";
 
// Each MCP server declares a name and version for discovery
const server = new McpServer({
  name: "customer-service",
  version: "1.0.0",
});
 
// Register a tool: name, description (the LLM reads this), schema, handler.
// The TypeScript SDK takes the input schema as a Zod shape.
server.tool(
  "lookup_customer",
  "Find a customer by email address and return their account details",
  {
    email: z.string().describe("Customer email address"),
  },
  async ({ email }) => {
    // The handler does the actual work -- any async operation
    const customer = await db.customers.findOne({ email });
    if (!customer) {
      return {
        content: [{ type: "text", text: `No customer found for ${email}` }],
      };
    }
    // Return structured data; the LLM decides how to present it
    return {
      content: [{
        type: "text",
        text: JSON.stringify({
          name: customer.name,
          plan: customer.plan,
          accountAge: customer.accountAgeDays,
          openTickets: customer.openTickets,
        }),
      }],
    };
  }
);
 
// stdio transport: messages flow through stdin/stdout (local tools)
// Production alternative: Streamable HTTP for cloud deployments
const transport = new StdioServerTransport();
await server.connect(transport);

Three transport options: stdio for local tools (stdin/stdout), SSE for remote HTTP streaming, and Streamable HTTP for production cloud deployments with bidirectional support. New to MCP? Our hands-on tutorial walks through building your first server.
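Whatever the transport, the payload is the same JSON-RPC 2.0 envelope. As a rough sketch (the `tools/call` method name follows the MCP spec; the helper function and the email argument are our own, tied to the hypothetical `lookup_customer` tool above):

```typescript
// Sketch of the JSON-RPC 2.0 envelope MCP puts on the wire.
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number | string;
  method: string;
  params?: Record<string, unknown>;
}

// Hypothetical helper: wrap a tool invocation in the MCP tools/call method
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): JsonRpcRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

const request = buildToolCall(1, "lookup_customer", { email: "kim@example.com" });

// Over stdio this is one newline-delimited JSON line; over HTTP it's the POST body
const wire = JSON.stringify(request);
console.log(wire);
```

The same envelope travels over all three transports; only the framing around it changes.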

The 72% Problem

143,000 of 200,000 tokens -- gone before the user says a word.

That's the number Perplexity CTO Denis Yarats dropped at Ask 2026 on March 11, announcing the company was moving away from MCP. Three MCP servers consumed 72% of the context window just with tool schemas, leaving only 57,000 tokens for conversation, documents, and reasoning.

Every tool description, parameter schema, and response format eats into the model's working memory. For agents making many tool calls across long conversations, the overhead compounds.
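You can estimate this overhead before it bites. A back-of-envelope sketch, using the rough 4-characters-per-token heuristic (not a real tokenizer) and hypothetical schemas:

```typescript
// Back-of-envelope estimate of how much context tool schemas consume.
// CHARS_PER_TOKEN is a rough heuristic, not a real tokenizer.
const CHARS_PER_TOKEN = 4;

function schemaTokens(toolSchemas: object[]): number {
  const chars = toolSchemas.reduce((n, s) => n + JSON.stringify(s).length, 0);
  return Math.ceil(chars / CHARS_PER_TOKEN);
}

function contextBudget(windowTokens: number, toolSchemas: object[]) {
  const overhead = schemaTokens(toolSchemas);
  return {
    overhead,
    remaining: windowTokens - overhead,
    pct: Math.round((overhead / windowTokens) * 100),
  };
}

// Hypothetical: 40 tools, each with ~1.3KB of description + parameter schema
const schemas = Array.from({ length: 40 }, (_, i) => ({
  name: `tool_${i}`,
  description: "x".repeat(600),
  inputSchema: {
    type: "object",
    properties: { q: { type: "string", description: "x".repeat(700) } },
  },
}));

console.log(contextBudget(200_000, schemas));
```

Run this against your actual tool manifest before connecting a third MCP server, not after.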

But the MCP maintainers already know. The 2026 roadmap (published March 9 by lead maintainer David Soria Parra) targets a .well-known metadata format for capability discovery -- agents won't load every schema upfront, they'll discover what's available and load on demand. We've seen this movie before: early REST APIs returned everything in one payload, then came pagination, sparse fieldsets, and GraphQL. MCP is on the same optimization trajectory.

The Session State Trap

Context consumption gets the headlines, but session state bites harder in production. MCP sessions are stateful by default, and the spec doesn't define how to persist them.

A Streamable HTTP connection gets a Mcp-Session-Id. The server maintains session state -- negotiated tools, resource subscriptions, initialization handshake. Fine for a single process. Breaks behind a load balancer. Session affinity is the common workaround, but it defeats horizontal scaling.

Our production fix: make every request stateless. Tool schemas resolve per-request from the database, not session memory. The Mcp-Session-Id is for logging correlation only. This adds 15-30ms per tool call but lets us scale horizontally without sticky sessions.
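A minimal sketch of that stateless pattern, with an in-memory map standing in for our per-request database lookup (all names hypothetical):

```typescript
// Stateless MCP request handling: nothing lives in process memory between
// requests, so any replica behind the load balancer can serve any call.
type ToolDef = { name: string; description: string };

// Stand-in for the database that holds per-agent tool schemas
const toolRegistry = new Map<string, ToolDef[]>([
  ["returns-agent", [{ name: "lookup_order", description: "Find an order by ID" }]],
]);

async function handleRequest(
  agentId: string,
  sessionId: string | undefined,
  method: string
) {
  // Resolve schemas per request -- the database hop, not session memory
  const tools = toolRegistry.get(agentId) ?? [];

  // The session id correlates log lines only; it never gates behavior
  console.log(`[${sessionId ?? "no-session"}] ${agentId} -> ${method}`);

  if (method === "tools/list") return { tools };
  throw new Error(`unsupported method: ${method}`);
}
```

The trade-off is explicit: one extra lookup per call buys the freedom to route any request to any replica.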

Real latency profile for a production MCP tool call:

  • Schema resolution: 15-30ms (database lookup)
  • Network transport: 5-20ms same region, 50-150ms cross-region
  • Tool execution: 50ms (database read) to 2-5s (external API)
  • JSON-RPC overhead: <1ms (negligible)

The protocol overhead is minimal. What kills you is what's behind it.

What's Coming in 2026

The roadmap targets four areas: transport scalability (fixing stateful sessions without adding new transports), agent communication (the Tasks primitive for agent-initiated workflows), governance (delegated Working Groups instead of bottleneck reviews), and enterprise features (audit trails, SSO auth, gateway behavior).

Security proposals worth watching: SEP-1932 (DPoP proof of key possession) and SEP-1933 (Workload Identity Federation). For more on how MCP is reshaping the agent ecosystem, see our piece on MCP's industry standardization.

Layer 2: A2A -- Agent Coordination

Our customer service agent now has the order data (via MCP). But it also needs to find a replacement laptop -- and that's a different agent's job. The product specialist was built by another team, runs a different framework, and has its own tool stack. A2A is how these agents find each other and hand off work.

Google launched A2A in April 2025. IBM's competing Agent Communication Protocol merged into it by August. By December, both MCP and A2A lived under the Linux Foundation's Agentic AI Foundation, with 100+ enterprises on board.

The key difference from MCP: MCP is client-server (agent calls tool). A2A is peer-to-peer (agent delegates to agent, and both sides can reason).

Agent Cards: Zero-Config Discovery

A2A's most elegant feature is the Agent Card -- a JSON manifest at a well-known URL that describes what an agent can do:

typescript
// Any agent can discover this at /.well-known/agent.json -- no prior config needed
const agentCard = {
  name: "Product Specialist",
  description: "Search catalogs, compare specs, and recommend products by budget",
  url: "https://products.example.com/a2a",  // Where to send A2A tasks
  version: "2.1.0",
  defaultInputModes: ["text"],
  defaultOutputModes: ["text"],
  capabilities: {
    streaming: true,           // Supports real-time updates via SSE
    pushNotifications: true,   // Can notify when async work completes
  },
  skills: [
    {
      id: "find-replacement",
      name: "Find Replacement Product",
      description: "Search for alternatives within a budget and category",
      tags: ["products", "search", "recommendations"],
    },
    {
      id: "compare-specs",
      name: "Compare Product Specs",
      description: "Side-by-side comparison of product specifications",
      tags: ["products", "comparison"],
    },
  ],
  authentication: {
    schemes: ["bearer"],       // How to authenticate A2A requests
    credentials: "oauth2",
  },
};

Think of it as an OpenAPI spec, but for agents. Our customer service agent fetches this card, sees the product specialist can find replacements within a budget, and decides to delegate -- all programmatically.
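Once you've fetched cards from several agents, delegation becomes a matching problem. A sketch (the card shape follows the example above; matching by skill tag is our own convention, not something the spec mandates):

```typescript
// Pick the agent whose advertised skills match the work we need to delegate.
type Skill = { id: string; name: string; tags: string[] };
type AgentCard = { name: string; url: string; skills: Skill[] };

function findAgentForTag(cards: AgentCard[], tag: string): AgentCard | undefined {
  return cards.find((c) => c.skills.some((s) => s.tags.includes(tag)));
}

// Hypothetical cards fetched from each agent's /.well-known/agent.json
const cards: AgentCard[] = [
  {
    name: "Product Specialist",
    url: "https://products.example.com/a2a",
    skills: [{ id: "find-replacement", name: "Find Replacement Product", tags: ["products", "search"] }],
  },
  {
    name: "Returns Agent",
    url: "https://returns.example.com/a2a",
    skills: [{ id: "process-return", name: "Process Return", tags: ["returns"] }],
  },
];

console.log(findAgentForTag(cards, "search")?.name); // → "Product Specialist"
```

In production you'd cache the cards and re-fetch on a TTL, since agents version their capabilities independently.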

Task Lifecycle

A2A models work as a state machine. When a client agent sends a task to a remote agent, that task moves through defined states:

A task moves: submitted → working → completed (or failed / canceled), with a detour through input-required when the remote agent needs clarification and the client provides input. Streaming updates arrive via SSE; artifacts are returned on success.
A2A task lifecycle: from submission through processing to completion with streaming support

The input-required state is particularly interesting. It handles the case where the remote agent needs clarification mid-task -- something that happens constantly in real multi-agent workflows. Instead of failing, the protocol has a built-in negotiation loop.
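The lifecycle can be written down as an explicit transition table. A sketch of the states as the spec defines them, with the transitions as we read them (the table itself is our own encoding):

```typescript
// A2A task states as a transition table -- a client can validate
// incoming status events against it instead of trusting blindly.
type TaskState =
  | "submitted" | "working" | "input-required"
  | "completed" | "failed" | "canceled";

const transitions: Record<TaskState, TaskState[]> = {
  submitted: ["working", "canceled"],
  working: ["input-required", "completed", "failed", "canceled"],
  "input-required": ["working", "canceled"], // client supplied input, work resumes
  completed: [], // terminal
  failed: [],    // terminal
  canceled: [],  // terminal
};

function canTransition(from: TaskState, to: TaskState): boolean {
  return transitions[from].includes(to);
}
```

Terminal states have no outgoing edges, which is exactly what makes the negotiation loop safe: once a task completes or fails, no late event can reopen it.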

Delegating a Task

Here's how our customer service agent delegates the replacement search:

typescript
import { A2AClient } from "@a2a-protocol/sdk";
 
// Point at the product specialist's base URL -- it hosts /.well-known/agent.json
const client = new A2AClient("https://products.example.com");
 
// Fetch the Agent Card to confirm this agent can do what we need
const card = await client.getAgentCard();
 
// Send a task: plain text, structured as message parts (like chat messages)
const task = await client.sendTask({
  id: crypto.randomUUID(),
  message: {
    role: "user",
    parts: [
      {
        type: "text",
        text: "Find laptop replacements under $1000, in stock, similar to order ORD-8837",
      },
    ],
  },
});
 
// Stream results via SSE as the specialist agent works
for await (const event of client.streamTask(task.id)) {
  if (event.type === "status") {
    console.log(`Status: ${event.status.state}`); // working, input-required, completed...
  }
  if (event.type === "artifact") {
    console.log(`Result: ${event.artifact.parts[0].text}`); // The recommendations
  }
}

A2A uses JSON-over-HTTP with SSE -- standard web infrastructure, no custom transports. The design trades performance optimization for adoption speed.

A2A Is a Negotiation Protocol

Here's the insight most people miss: the input-required state turns A2A into a negotiation protocol, not just a dispatch mechanism. A remote agent can pause mid-execution, explain what it needs, and wait. The client agent reasons about that request, gathers the missing information (possibly via other agents or MCP tools), and resumes.

In our scenario, the product specialist might pause: "I found three options. Two are refurbished. Should I include refurbished?" The customer service agent can answer that without going back to the customer. The SSE stream stays open, events flow as they happen -- no polling, no webhook callbacks.

Performance overhead: discovery is one cold HTTP call (~100-300ms, cached after). Task submission is ~50-200ms per round-trip. SSE streaming adds zero per-message overhead once connected. Total protocol cost for a three-agent workflow: roughly 300-800ms, dwarfed by actual agent processing time.

Layer 3: WebMCP -- The Browser API

Now our product specialist agent needs to search the company's e-commerce catalog. The catalog doesn't have an MCP server -- it's a website. Traditionally, the agent would screenshot the page and parse pixels, or scrape HTML meant for human eyes. Both approaches are expensive, fragile, and break on every UI change.

89% fewer tokens, 98% accuracy. Early WebMCP benchmarks show an 89% token efficiency improvement over screenshot-based methods and 67% less computational overhead. The accuracy gap is even starker: ~98% for WebMCP tool calls vs. ~85% for vision-based screen scraping.

WebMCP flips the model. The website declares its capabilities as typed JavaScript functions, and the agent calls them directly through a browser API. Google and Microsoft published it as a W3C Draft Community Group Report on February 10, 2026. Chrome 146 Canary shipped a preview behind an experimental flag within weeks.

WebMCP introduces navigator.modelContext -- the e-commerce site registers tools that our product specialist can call:

typescript
// The WEBSITE registers this -- not the agent. The site controls what's exposed.
navigator.modelContext.registerTool({
  name: "search_products",
  description: "Search the product catalog by query, category, or price range",
  inputSchema: {
    type: "object",
    properties: {
      query: { type: "string", description: "Search keywords" },
      category: {
        type: "string",
        enum: ["electronics", "clothing", "home", "sports"],
      },
      maxPrice: { type: "number", description: "Maximum price in USD" },
    },
    required: ["query"],
  },
  // execute() runs in the browser context -- same-origin, sandboxed
  async execute(args) {
    const results = await catalog.search(args);
    // Return structured data, not rendered HTML -- the agent gets clean JSON
    return {
      content: [{
        type: "text",
        text: JSON.stringify(
          results.map((p) => ({
            name: p.name, price: p.price, rating: p.rating, url: p.url,
          }))
        ),
      }],
    };
  },
});
 
// Second tool: cart actions. Same pattern as MCP -- deliberate API symmetry.
navigator.modelContext.registerTool({
  name: "add_to_cart",
  description: "Add a product to the shopping cart by product ID",
  inputSchema: {
    type: "object",
    properties: {
      productId: { type: "string", description: "Product ID" },
      quantity: { type: "number", description: "Quantity to add", default: 1 },
    },
    required: ["productId"],
  },
  async execute({ productId, quantity }) {
    const result = await cart.addItem(productId, quantity);
    return {
      content: [{ type: "text", text: `Added ${quantity}x ${result.productName} to cart. Total: $${result.cartTotal}` }],
    };
  },
});

The API surface mirrors MCP's tool definition format deliberately. If you've built MCP tools, you already know how to build WebMCP tools.

The Trust Model Is Inverted

Here's the architectural shift: the website, not the agent, decides what's available. MCP and A2A are agent-initiated (the agent reaches out). WebMCP is site-initiated (the website declares its capabilities). The agent can only call what the site explicitly registers.

This means WebMCP tools are sandboxed by the browser's security model -- same-origin policy, CORS, CSP all apply. An AI agent on a banking site can only call the tools that bank registers. No privilege escalation, no cross-origin access, no DOM state the site didn't expose.

The performance gap against screen scraping is dramatic. A screenshot approach (like Anthropic's Computer Use) captures a ~500KB-2MB viewport image, sends it to a vision model (300-800ms processing plus token costs for image encoding), and attempts to click UI elements with ~85% accuracy. A WebMCP call is a JavaScript function: zero transport overhead, 5-15ms execution, 98% accuracy. No interpretation step, no pixels through the model.

Maturity Check

Let's be direct: WebMCP is early. Chrome 146 Canary only, experimental flag required, no other browsers. The W3C spec is a Draft, not a Recommendation. The benchmarks come from early tests, not production at scale.

But the trajectory is clear. Google and Microsoft are co-developing it. WordPress already has a plugin (webmcp-abilities). If you build agents that interact with web apps, prototype against it now.

All Three Layers, One Workflow

We've been following a single customer request through each layer. Now let's see the full picture -- all three protocols composing in one workflow.

How MCP, A2A, and WebMCP compose in a single customer service workflow
  1. The customer service agent uses A2A to delegate the return to a specialist returns agent and the product search to a product specialist agent.
  2. The returns agent uses MCP to look up the order in the order management system and create a return authorization.
  3. The product specialist uses WebMCP to search the company's e-commerce catalog (exposed as browser-native tools) and MCP to check real-time inventory.
  4. Both specialist agents return results via A2A, and the customer service agent composes the final response.

Here's the orchestrator -- the code that ties all three protocols together:

typescript
import { A2AClient } from "@a2a-protocol/sdk";
 
// KEY INSIGHT: The orchestrator only speaks A2A.
// Specialist agents handle MCP and WebMCP internally.
async function handleReturnAndReplace(orderId: string, maxBudget: number) {
  // Discover specialist agents via their well-known URLs
  const returnsAgent = new A2AClient("https://returns.internal.example.com");
  const productAgent = new A2AClient("https://products.internal.example.com");
 
  // Fire both tasks in parallel -- they're independent workflows
  const [returnTask, searchTask] = await Promise.all([
    // Returns agent: internally uses MCP to call order mgmt + returns systems
    returnsAgent.sendTask({
      id: crypto.randomUUID(),
      message: {
        role: "user",
        parts: [{ type: "text", text: `Process return for order ${orderId}` }],
      },
    }),
    // Product agent: internally uses WebMCP for catalog + MCP for inventory
    productAgent.sendTask({
      id: crypto.randomUUID(),
      message: {
        role: "user",
        parts: [{
          type: "text",
          text: `Find laptop replacements under $${maxBudget} that are in stock`,
        }],
      },
    }),
  ]);
 
  const results = { returnStatus: null, recommendations: null };
 
  // Stream from the returns agent -- watch for negotiation requests
  for await (const event of returnsAgent.streamTask(returnTask.id)) {
    if (event.type === "status" && event.status.state === "input-required") {
      // A2A negotiation: the returns agent needs a reason. We provide it
      // without going back to the customer -- the orchestrator decides.
      await returnsAgent.sendTask({
        id: returnTask.id,
        message: {
          role: "user",
          parts: [{ type: "text", text: "Reason: upgrading to newer model" }],
        },
      });
    }
    if (event.type === "artifact") {
      results.returnStatus = JSON.parse(event.artifact.parts[0].text);
    }
  }
 
  // Collect product recommendations from the specialist
  for await (const event of productAgent.streamTask(searchTask.id)) {
    if (event.type === "artifact") {
      results.recommendations = JSON.parse(event.artifact.parts[0].text);
    }
  }
 
  return results; // Both results compose into the customer's final response
}

Three protocols, three layers, one workflow. The orchestrator only speaks A2A. Specialist agents use whatever they need internally -- MCP for backend tools, WebMCP for browser interactions. The protocols compose without leaking abstractions across boundaries.

Side-by-Side Comparison

| | MCP | A2A | WebMCP |
|---|---|---|---|
| Purpose | Agent-to-tool communication | Agent-to-agent coordination | Agent-to-browser interaction |
| Created by | Anthropic (Nov 2024) | Google (Apr 2025) | Google + Microsoft (Feb 2026) |
| Governance | AAIF / Linux Foundation | AAIF / Linux Foundation | W3C Community Group |
| Protocol | JSON-RPC 2.0 | JSON-over-HTTP + SSE | Browser JavaScript API |
| Transport | stdio, SSE, Streamable HTTP | HTTP + SSE | navigator.modelContext |
| Discovery | Manual config (.well-known planned) | Agent Cards at /.well-known/agent.json | Browser-native tool registration |
| Auth | OAuth proposals in progress (SEP-1932/1933); varies by transport today | Bearer tokens, OAuth 2.0 declared in Agent Cards | Browser security model (same-origin, CSP) |
| State model | Stateful sessions (scaling pain point); stateless workarounds common | Built-in task state machine (6 states) with negotiation | Stateless per tool invocation |
| Typical overhead | 15-30ms schema resolution + transport | 100-300ms discovery (cached), 50-200ms per task exchange | <15ms (in-browser JS execution) |
| Ecosystem | 10,000+ servers, 97M monthly SDK downloads, 6 major clients | 100+ enterprise supporters, TSC at Linux Foundation | Chrome 146 Canary only (experimental flag) |
| Maturity | Production (16 months old), active 2026 roadmap | Production for orchestration (11 months old) | Early preview (1 month old) |
| Biggest risk | Context window consumption (72% in worst case) | Thin ecosystem of A2A-native agents | Single browser, experimental flag, no stable timeline |
| Best for | Connecting agents to APIs, databases, SaaS tools | Multi-agent workflows across teams/orgs | Web application interaction, replacing screen scraping |

The Governance Story

In early 2025, this looked like a standards war. Anthropic had MCP. Google had A2A. IBM had ACP. Twelve months later, they all live under one roof.

  • March 2025: IBM launches ACP, donates it to the Linux Foundation
  • April 2025: Google launches A2A with broad industry backing
  • August 2025: IBM's ACP merges into A2A
  • December 2025: Linux Foundation creates the Agentic AI Foundation (AAIF), co-founded by Anthropic, OpenAI, and Block. MCP and A2A find their permanent home
  • February 2026: Google and Microsoft jointly publish WebMCP as a W3C draft

From three competing protocols to one unified foundation in under a year. HTTP/2 took five years from proposal to RFC. The AI agent stack doesn't have that kind of time.

Which Protocol Do You Need?

Start with what your agent needs to do. The protocol choice follows.

  • Talk to external tools, APIs, or databases? Use MCP. If you also need agent-to-agent coordination, add A2A; otherwise MCP alone is sufficient.
  • Coordinate with other independent agents? If a single team controls all agents in one codebase, direct function calls may suffice -- consider A2A if teams diverge later. Cross-team or cross-org: use A2A.
  • Interact with web applications on behalf of a user? If the site exposes WebMCP tools, use WebMCP. If not, you're back to screen scraping or a site-specific API integration.
  • None of the above? You may not need a protocol yet.
Protocol decision tree: match your agent's needs to the right protocol layer

The key insight: most production agents today need MCP and nothing else. A2A becomes necessary when you have multiple agent teams that evolve independently. WebMCP becomes relevant when your agent needs to act inside web applications where you don't control the backend.

What to Adopt Now

MCP: Ship it. It's production-ready.

10,000+ servers, SDKs in every major language, client support across all major platforms. Start by connecting to existing MCP servers for tools you already use (GitHub, Slack, CRMs), then build custom servers for your domain. Our MCP tutorial covers the first server step by step.

Three things to know before production: Context consumption is manageable -- scope tool access by agent role, don't expose every tool to every agent. Auth is the wild west -- the spec doesn't mandate a mechanism; most deployments use gateway tokens or API keys until the OAuth proposals (SEP-1932, SEP-1933) land. Session state bites at scale -- pick stateless, sticky sessions, or external session store early; migrating under load is painful.
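Scoping tool access by agent role is a one-function change. A sketch of the pattern, with illustrative role names and tool lists (none of these come from the MCP spec):

```typescript
// Role-scoped tool exposure: each agent sees only the tools its role
// needs, keeping schema overhead (and blast radius) small.
const roleTools: Record<string, string[]> = {
  "returns-agent": ["lookup_order", "create_rma"],
  "product-agent": ["search_products", "check_inventory"],
};

function toolsForAgent(role: string, allTools: { name: string }[]) {
  const allowed = new Set(roleTools[role] ?? []);
  return allTools.filter((t) => allowed.has(t.name));
}

// Hypothetical full registry: four tools, but each agent sees only two
const allTools = [
  { name: "lookup_order" },
  { name: "create_rma" },
  { name: "search_products" },
  { name: "check_inventory" },
];

console.log(toolsForAgent("returns-agent", allTools).map((t) => t.name));
// → ["lookup_order", "create_rma"]
```

The filter runs wherever you handle tools/list -- in the server itself or in a gateway in front of it.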

A2A: Adopt for multi-agent systems.

Start by publishing Agent Cards for your existing agents, even if they don't talk to each other yet. This forces you to define capabilities explicitly. Then add A2A task delegation where you have clear orchestrator-specialist patterns. If all your agents run in one app and you control all the code, A2A may be overkill -- its value scales with the number of teams and frameworks involved.

WebMCP: Prototype now, ship later.

Too new for production. But if you build web applications, start thinking about which user-facing actions could become WebMCP tools: product search, account management, settings. You don't need to ship it yet, but you'll want the mental model ready when stable browser support lands.

The Economics Have Changed

Before these protocols, connecting an agent to ten tools meant ten custom integrations. Coordinating three agents meant a custom orchestration layer. Browser interaction meant fragile screen scrapers. Now each has a standard solution with growing ecosystem support.

Build your agent tools as MCP servers. Define agent capabilities as A2A Agent Cards. Plan for WebMCP when browser support matures. The agents that win in production won't have the best prompts -- they'll have the best connections.

The protocols are young. But the authentication story is getting solved, the context consumption problem is being addressed, enterprise features are coming, and for once, the entire industry is pulling in the same direction.

Connect your agents to any tool with MCP

Chanl's MCP runtime lets you build, connect, and monitor AI agent tools in production. Configure tool access per agent, track execution in real time, and ship with confidence.

Explore MCP on Chanl
Dean Grover, Co-founder
Building the platform for AI agents at Chanl -- tools, testing, and observability for customer experience.

