Chanl
Tools & MCP

Every Website Just Became an AI Agent Tool

Chrome 146 ships navigator.modelContext, a browser-native API that lets websites expose structured tools to AI agents. 89% fewer tokens, 98% task accuracy, zero server infrastructure. Here's how it works and what it means for agent builders.

Dean Grover, Co-founder
March 20, 2026
14 min read
Browser window with structured tool definitions flowing between a website and an AI agent

Our agent was screenshotting web pages. Every interaction started with a 500KB PNG, 300ms of vision model processing, and an 85% chance the agent would click the right button. On a good day.

The screenshots ate tokens. A single page capture consumed 1,500-2,000 tokens just to describe what a human could see in a glance. Multiply that by every step in a checkout flow, a flight search, a form submission. Our agent burned through context windows like they were free.

Then navigator.modelContext shipped in Chrome 146 Canary.

Instead of sending a screenshot and asking "what do you see?", the website now tells the agent: "Here are my tools, here are their parameters, here is how to call them." Structured JSON. A few hundred bytes. 89% fewer tokens. 98% task accuracy.

This is WebMCP. And it changes the economics of every browser-based AI agent.


What WebMCP Actually Is

WebMCP (Web Model Context Protocol) is a W3C Draft Community Group Report, jointly developed by Google and Microsoft through the Web Machine Learning Community Group. Released as an early preview on February 10, 2026, it introduces navigator.modelContext, a browser-native JavaScript API that lets websites expose structured, callable tools to AI agents.

The core idea: instead of agents reverse-engineering a page's functionality from its DOM or pixels, the page declares what it can do.

[Diagram: the website calls registerTool({ name, schema, execute }); the browser advertises available tools ([searchFlights]) to the agent; the agent invokes searchFlights({ origin: "SFO", dest: "JFK" }); the browser runs the execute handler and returns a structured result ({ flights: [...] })]
WebMCP tool registration and invocation flow

The website registers tools with names, descriptions, JSON schemas, and execute handlers. The browser acts as a secure proxy between the website and the agent. When an agent needs to act, it discovers available tools, picks the right one, and invokes it with structured parameters. No pixel guessing. No DOM parsing. No fragile CSS selectors.

Patrick Brosset, who helped shape the proposal, clarified in his February 2026 update that the API naming evolved from window.agent to navigator.modelContext, and the spec now includes requestUserInteraction() for explicit user confirmation before sensitive actions.

Screenshot vs. WebMCP

Here is the comparison that matters. Every metric below represents the difference between sending a 500KB screenshot through a vision model versus sending a 200-byte JSON tool schema.

| Metric | Screenshot Approach | WebMCP |
| --- | --- | --- |
| Tokens per interaction | 1,500-2,000 | ~150-200 |
| Token reduction | Baseline | 89% fewer |
| Task accuracy | ~85% (best case) | ~98% |
| Computational overhead | Full vision model inference | 67% reduction |
| Latency | 300-800ms (screenshot + vision) | <50ms (JSON schema) |
| Breaks on UI change | Yes, constantly | No (schema-driven) |
| Handles dynamic content | Poorly | Natively |
| Auth session access | Requires cookie injection | Inherits user session |
| Infrastructure needed | Headless browser + vision API | None (browser-native) |

The 89% token reduction alone changes the economics. An agent that processes 1,000 web interactions per day goes from burning ~1.8M tokens on screenshots to ~180K on structured schemas. At current API pricing, that is real money.
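The arithmetic is easy to verify. Using ~1,800 and ~180 tokens per interaction, drawn from the ranges in the table above:

```typescript
// Back-of-envelope daily token cost, using figures from the table above.
const interactionsPerDay = 1_000;
const screenshotTokensEach = 1_800; // within the 1,500-2,000 range
const webmcpTokensEach = 180;       // within the ~150-200 range

const screenshotDaily = interactionsPerDay * screenshotTokensEach;
const webmcpDaily = interactionsPerDay * webmcpTokensEach;
const savings = 1 - webmcpDaily / screenshotDaily;

console.log(`${screenshotDaily.toLocaleString()} vs ${webmcpDaily.toLocaleString()} tokens/day`);
console.log(`${Math.round(savings * 100)}% reduction`);
```

With these round numbers the reduction works out to 90%, in line with the 89% headline figure.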

Conventional wisdom says browser agents need vision models to understand web pages. The data says otherwise: structured tool schemas achieve 98% task accuracy versus 85% for screenshot-based approaches. Screenshot agents fail when a button moves, when content loads dynamically, when a modal overlays the target element. WebMCP tools are schema contracts. The website says "I accept these parameters and return these results." UI changes don't break the contract.

The Declarative API

The simplest path to WebMCP requires zero JavaScript. If your website already has HTML forms, you are a few attributes away from being agent-ready.

```html
<!-- Adding three attributes turns an existing form into a WebMCP tool.
     The browser auto-generates a tool schema from the form fields.
     toolautosubmit lets the agent submit without the user clicking. -->
<form
  toolname="searchFlights"
  tooldescription="Search for available flights between two airports"
  toolautosubmit="true"
  action="/api/flights/search"
  method="GET"
>
  <!-- toolparamdescription gives the agent context about each field.
       Without it, the agent only sees the field name "origin". -->
  <label for="origin">Origin Airport</label>
  <input
    id="origin"
    name="origin"
    type="text"
    required
    pattern="[A-Z]{3}"
    toolparamdescription="Three-letter IATA airport code (e.g., SFO, JFK, LAX)"
  />

  <label for="destination">Destination Airport</label>
  <input
    id="destination"
    name="destination"
    type="text"
    required
    pattern="[A-Z]{3}"
    toolparamdescription="Three-letter IATA destination airport code"
  />

  <label for="date">Travel Date</label>
  <input id="date" name="date" type="date" required />

  <!-- min/max constraints become schema validation rules automatically -->
  <label for="passengers">Passengers</label>
  <input id="passengers" name="passengers" type="number" min="1" max="9" value="1" />

  <button type="submit">Search Flights</button>
</form>
```

The browser reads the toolname and tooldescription attributes, infers input parameters from form field names and types, and registers a tool that agents can discover and invoke. The required attribute becomes a required field in the JSON schema. The pattern attribute becomes a validation constraint. min/max on number inputs become numeric bounds.
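The spec's exact generated-schema shape isn't shown here, so treat this as an assumption, but the JSON Schema the browser infers from the form above would plausibly look like:

```typescript
// Hypothetical auto-generated JSON Schema for the searchFlights form above.
// Parameter names come from `name` attributes; constraints come from HTML
// validation attributes (required, pattern, min/max, input type).
const inferredSchema = {
  type: "object",
  properties: {
    origin: {
      type: "string",
      pattern: "[A-Z]{3}",
      description: "Three-letter IATA airport code (e.g., SFO, JFK, LAX)",
    },
    destination: {
      type: "string",
      pattern: "[A-Z]{3}",
      description: "Three-letter IATA destination airport code",
    },
    date: { type: "string", format: "date" },
    passengers: { type: "integer", minimum: 1, maximum: 9, default: 1 },
  },
  required: ["origin", "destination", "date"],
};
```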

When an agent invokes searchFlights, the browser fills in the form fields with the provided values. Without toolautosubmit, it waits for the user to click submit, keeping the human in the loop. With it, the form submits automatically.

This is the path for the 80% of web interactions that are already form-based. Contact forms, search bars, checkout flows, booking systems. Add the attributes, and your forms become tools.

The Imperative API

For interactions that go beyond form submission (dynamic workflows, multi-step processes, or anything that requires JavaScript execution), the Imperative API gives you full control.

```typescript
// Guard against browsers without WebMCP support.
// This check is essential -- the API only exists in Chrome 146+ with the flag enabled.
if ('modelContext' in navigator) {
  // Register a tool that adds items to a shopping cart.
  // The execute handler runs in the browser context with full access
  // to the page's state, DOM, and the user's authenticated session.
  navigator.modelContext.registerTool({
    name: "addToCart",

    // This description is what the agent reads to decide when to use the tool.
    // Be specific. "Add item" is too vague -- the agent won't know what "item" means.
    description: "Add a product to the user's shopping cart by SKU and quantity",

    // JSON Schema defines the contract. The agent sends structured params,
    // not free-text. This is why accuracy hits 98%.
    inputSchema: {
      type: "object",
      properties: {
        sku: {
          type: "string",
          description: "Product SKU identifier (e.g., 'SHOE-RED-42')"
        },
        quantity: {
          type: "integer",
          minimum: 1,
          maximum: 10,
          description: "Number of items to add (1-10)"
        }
      },
      required: ["sku", "quantity"]
    },

    // The execute handler is an async function.
    // It receives validated params and a client object for user interaction.
    execute: async (params, client) => {
      const { sku, quantity } = params;

      // Call your existing cart API -- no new backend needed.
      const response = await fetch('/api/cart/add', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ sku, quantity })
      });

      const result = await response.json();

      // Return structured data the agent can reason about.
      return {
        type: "text",
        text: JSON.stringify({
          success: true,
          cartTotal: result.total,
          itemCount: result.itemCount
        })
      };
    }
  });
} else {
  console.log('WebMCP not available in this browser');
}
```

The execute handler runs in the page's JavaScript context. It has access to everything your frontend code already has: the DOM, fetch, localStorage, the user's cookies and session. No separate backend server. No API proxy. No headless browser infrastructure. Remember that 500KB screenshot pipeline from our opening? This replaces the entire thing with a function call.

For tools that modify data or cost money, the client parameter provides requestUserInteraction():

```typescript
// For sensitive actions, pause and ask the user before proceeding.
// The browser shows a native confirmation dialog -- not a custom modal
// that an agent could dismiss programmatically.
execute: async (params, client) => {
  // requestUserInteraction() pauses agent execution and shows
  // a browser-native prompt. The agent cannot bypass this.
  const confirmed = await client.requestUserInteraction({
    message: `Complete purchase of ${params.itemName} for $${params.price}?`
  });

  if (!confirmed) {
    return { type: "text", text: JSON.stringify({ cancelled: true }) };
  }

  // Only proceeds after explicit user consent
  const result = await processPayment(params);
  return { type: "text", text: JSON.stringify(result) };
}
```

You can also provide ambient context without registering a tool. This is useful for giving agents background information about the current page state:

```typescript
// provideContext() shares read-only information with agents.
// No tool invocation, no user prompt -- just structured context.
navigator.modelContext.provideContext({
  name: "currentUserProfile",
  description: "The logged-in user's profile and preferences",
  data: {
    name: "Jane Smith",
    tier: "premium",
    preferredLanguage: "en",
    recentOrders: 12
  }
});
```

The Security Model

WebMCP's security model is the reason this can work at all. The browser is the trust boundary, not the agent.

HTTPS required. The API only works in secure contexts. No HTTP, no file://, no exceptions.

Same-origin policy. Tools inherit the page's origin boundary. A tool registered on flights.example.com cannot access data from bank.example.com. This is the same security model that protects every other browser API.

User consent is mandatory. The browser mediates every tool invocation. For sensitive actions, requestUserInteraction() shows a native browser prompt that the agent cannot dismiss or bypass. The user sees exactly what the agent wants to do and decides whether to allow it.

Session inheritance, not injection. Tools run within the user's existing authenticated session. If the user is logged into their bank, the tool operates with those credentials. No cookie injection. No credential passing. But also no access beyond what the user already has.

Per-invocation permissions. The Permission and Consent Manager ensures only tools with explicit user approval can execute. This is the same pattern as geolocation, camera, and microphone access: the browser asks, the user decides.

This design specifically mitigates what researchers call the "deadly triad" scenario, where an agent has simultaneous access to multiple sensitive tabs. WebMCP's domain-level isolation means a tool on one origin cannot reach another, even if the same agent is interacting with both.

WebMCP vs. Server-Side MCP

If you have been building with MCP, the natural question is: does WebMCP replace it? No. They solve different problems.

| Dimension | Server-Side MCP | WebMCP |
| --- | --- | --- |
| Runs where | Backend server | Browser (client-side) |
| Protocol | JSON-RPC 2.0 | JavaScript API |
| Transport | Streamable HTTP, stdio | Browser internal |
| Auth model | OAuth 2.1, API keys | User's browser session |
| User present | No (headless) | Yes (always) |
| Primitives | Tools, Resources, Prompts | Tools, Context |
| Infrastructure | MCP server deployment | Zero (browser-native) |
| Use case | Service-to-service, backend automation | Browser-based, user-facing |

The Chrome for Developers blog put it clearly: backend MCP handles service-to-service automation where no browser UI is needed. WebMCP handles interactions where the user is present and the agent benefits from shared visual context.

In practice, most production systems will use both. A travel company maintains a server-side MCP server for direct API integrations with Claude, ChatGPT, and other platforms. Simultaneously, it implements WebMCP tools on its consumer website so browser-based agents can interact with the booking flow in the user's authenticated session.

The layering looks like this:

[Diagram: one AI agent spanning two layers. Backend (server-side MCP): an MCP server fronting flight search, booking, and payment APIs. Browser (WebMCP): navigator.modelContext exposing search form, add-to-cart, and checkout tools inside the user's session]
WebMCP and server-side MCP complement each other

Server-side MCP for headless automation. WebMCP for user-present browser interactions. Same agent, same tools conceptually, different execution environments.
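One way to keep those two layers in sync is a single catalog of tool definitions consumed by both. This is a sketch under assumed names (`ToolSpec`, `toolCatalog`, and `registerInBrowser` are illustrative, not part of either spec, and the server side is only indicated):

```typescript
// A single catalog of tool definitions shared by both layers.
interface ToolSpec {
  name: string;
  description: string;
  inputSchema: object;
}

const toolCatalog: ToolSpec[] = [
  {
    name: "searchFlights",
    description: "Search for available flights between two airports",
    inputSchema: {
      type: "object",
      properties: {
        origin: { type: "string" },
        destination: { type: "string" },
      },
      required: ["origin", "destination"],
    },
  },
];

// Browser entry point: register every catalog entry as a WebMCP tool,
// guarded so non-supporting browsers simply skip registration.
function registerInBrowser(execute: (params: unknown) => Promise<unknown>): number {
  const nav = (globalThis as any).navigator;
  if (!nav || !("modelContext" in nav)) return 0;
  for (const spec of toolCatalog) {
    nav.modelContext.registerTool({ ...spec, execute });
  }
  return toolCatalog.length;
}

// Server entry point: hand the same specs to your MCP server's tool
// registry (implementation depends on your MCP SDK, omitted here).
```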

What This Means for Agents

WebMCP shifts the agent-website relationship from adversarial to cooperative. Today, agents scrape, guess, and break. Tomorrow, websites publish tool contracts and agents call them reliably.

For agent builders: Your browser automation stack gets simpler. No more maintaining screenshot pipelines, vision model integrations, or brittle CSS selectors. If the target website supports WebMCP, you get structured tool access with guaranteed parameter validation and typed responses. The tool management patterns you already use for server-side MCP translate directly.

For website owners: WebMCP is the new structured data. Just as Schema.org markup made your content machine-readable for search engines, WebMCP tool registration makes your functionality machine-callable for agents. Early adopters report that agent-ready websites get preferential treatment from agent platforms because they are cheaper (89% fewer tokens) and more reliable (98% accuracy) to interact with.

For platform teams: If you are building agent infrastructure, WebMCP tools should be first-class citizens alongside server-side MCP tools. An agent orchestrating a customer workflow might call a backend MCP server to check inventory, then invoke a WebMCP tool on the retailer's website to complete the purchase in the user's session. Your monitoring and analytics need to track both.

For the agentic web: Dan Petrovic called WebMCP "the biggest shift in technical SEO since structured data." When agents choose which websites to interact with, they will prefer sites that publish tool contracts over sites that require screen scraping. The cost differential alone (89% token savings) makes WebMCP-enabled sites the economically rational choice for every agent platform.

Getting Started Today

WebMCP is early. Chrome 146 Canary only, behind a flag. The spec is a draft, and breaking changes are expected. But the developer experience is ready for prototyping.

Step 1: Enable the flag. Open chrome://flags in Chrome Canary (146+), search for "WebMCP for testing," and enable it.

Step 2: Start with the Declarative API. Pick one form on your site. Add toolname, tooldescription, and toolparamdescription attributes. Verify with the Model Context Tool Inspector Chrome extension.

Step 3: Graduate to the Imperative API. For dynamic interactions, register tools with navigator.modelContext.registerTool(). Always guard with if ('modelContext' in navigator) for graceful fallback.

Step 4: Test with real agents. Connect a browser-based agent (or use Chrome's built-in Gemini integration) and verify that tool discovery, invocation, and result handling work end-to-end.

Step 5: Layer with server-side MCP. Your existing MCP server handles headless workflows. WebMCP handles browser interactions. Same tool catalog, different transports.
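Putting steps 3 and 4 together, a minimal prototype might look like this (the `ping` tool and all names here are illustrative, not from the spec):

```typescript
// The execute handler for a hypothetical "ping" tool, factored out so it
// can be unit-tested outside the browser.
async function pingExecute(params: { message: string }) {
  return { type: "text", text: JSON.stringify({ echo: params.message }) };
}

// Feature-detect WebMCP, register the tool, and log the outcome.
function initWebMcpPrototype(): boolean {
  const nav = (globalThis as any).navigator;
  if (!nav || !("modelContext" in nav)) {
    console.log("WebMCP not available -- check Chrome 146+ Canary and the flag.");
    return false;
  }
  nav.modelContext.registerTool({
    name: "ping",
    description: "Health check: echoes a message back to the agent",
    inputSchema: {
      type: "object",
      properties: { message: { type: "string" } },
      required: ["message"],
    },
    execute: pingExecute,
  });
  console.log("Registered WebMCP tool: ping");
  return true;
}
```

Once registered, the Model Context Tool Inspector extension mentioned in step 2 should show the tool, and an agent invoking it should get the echoed message back as structured JSON.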

The spec lives at github.com/webmachinelearning/webmcp. The W3C Community Group is active and accepting feedback. If you are building agents that interact with websites, now is the time to shape the standard.


WebMCP turns the web from something agents scrape into something agents use. That is not an incremental improvement. It is a category change.

The screenshot-and-pray era had a good run. Structured tool contracts are better in every dimension: cost, accuracy, reliability, security. The only question is adoption speed, and with Google and Microsoft jointly pushing the spec through W3C, the infrastructure is moving fast.

Our agent stopped screenshotting web pages. Yours should too.

Build agents with production tool infrastructure

Chanl gives your AI agents managed tools, MCP integration, and monitoring. When WebMCP-enabled websites become the norm, your agents will be ready.

Explore Chanl Tools

Co-founder

Building the platform for AI agents at Chanl — tools, testing, and observability for customer experience.

Learn Agentic AI

One lesson a week — practical techniques for building, testing, and shipping AI agents. From prompt engineering to production monitoring. Learn by doing.

500+ engineers subscribed
