
Your agent has 30 tools and no idea when to use them

MCP tools give agents external capabilities. Skills give agents behavioral expertise. Learn the architecture of both, build them in TypeScript, and understand when to use each — and when you need both.

Dean Grover, Co-founder
March 13, 2026
14 min read
Watercolor illustration of two interlocking systems — tools and behavioral instructions — powering an AI agent

Your agent has 30 tools and no idea when to use them. It calls the refund API when a customer asks about return policies. It schedules a callback when it should just check order status. Every tool works. The agent still fails.

Or flip it: your agent has meticulous behavioral instructions — it knows exactly how to handle angry customers, when to escalate, how to phrase apologies. But when a customer asks "where's my package?" it can only say "Let me look into that for you" and then do nothing, because it has no tools to actually look anything up.

This is the tension between MCP tools and agent skills. Tools give agents the ability to act. Skills give agents the judgment to act well. The question everyone asks is "which pattern wins?" The answer: neither, because they're not competing. They're two halves of the same architecture.

Prerequisites: Node.js 20+ with TypeScript, an Anthropic API key, and basic familiarity with MCP concepts (see MCP Explained if needed). Install dependencies: npm install @modelcontextprotocol/sdk @anthropic-ai/sdk zod

What are MCP tools and agent skills?

MCP tools are schema-driven capabilities exposed through a standardized protocol that let agents execute actions on external systems — looking up orders, querying databases, calling APIs. Agent skills are domain-specific instruction sets, typically written in markdown, that guide how an agent behaves when handling tasks — when to escalate, what tone to use, how to sequence multi-step workflows.

MCP tools: the capability layer

MCP (Model Context Protocol) is an open standard built on JSON-RPC 2.0 that defines how AI applications discover and invoke external tools. Each tool has a name, a description the LLM reads for selection, and an input schema that validates parameters. The protocol handles capability negotiation, transport, and execution semantics.

A tool definition looks like this:

typescript
// MCP tool: deterministic, schema-validated, external
{
  name: "lookup_order",
  description: "Retrieves order status, tracking number, and ETA by order ID",
  inputSchema: {
    type: "object",
    properties: {
      orderId: { type: "string", description: "Order ID (ORD-XXXXX)" }
    },
    required: ["orderId"]
  }
}

The key property of a tool is determinism. Given the same input, the tool either succeeds or fails. There's no ambiguity about what lookup_order does — it hits an API and returns structured data. The LLM can't misinterpret the execution; it can only misinterpret when to call it.

Agent skills: the behavior layer

A skill is a structured set of instructions that an agent loads into its context when handling a specific type of task. Unlike tools, skills don't execute code or call APIs. They shape how the agent reasons about a problem.

markdown
# Skill: Refund Handling
 
## When to activate
Customer mentions refund, return, money back, or chargeback.
 
## Rules
1. Always check order status before discussing refund eligibility
2. Orders within 30 days: offer full refund immediately
3. Orders 30-90 days: offer store credit, escalate if customer insists on cash refund
4. Orders beyond 90 days: apologize, explain policy, offer goodwill discount on next order
5. Never promise a refund timeline — say "within 5-7 business days" only after refund is confirmed
 
## Tone
Empathetic but direct. Acknowledge frustration before explaining policy.

The key property of a skill is adaptability. The same skill produces different agent behavior depending on the conversation context. A refund skill applied to an angry customer on their third call produces different reasoning than the same skill applied to a first-time caller with a simple return request.

The fundamental asymmetry

Here's what makes this distinction load-bearing in production:

| Property | MCP tools | Agent skills |
|---|---|---|
| Executes code | Yes — calls APIs, queries DBs | No — shapes reasoning only |
| Deterministic | Yes — same input, same execution | No — context-dependent behavior |
| Schema-validated | Yes — JSON Schema for inputs/outputs | No — natural language, open to interpretation |
| Discovery | Protocol-level (MCP handshake) | Context-level (loaded into prompt) |
| Failure mode | API errors, timeouts, auth failures | Misinterpretation, hallucination, instruction drift |
| Governance | MCP Policy Controls | Prompt testing, eval frameworks |

Why this distinction matters now

The problem isn't that teams don't know about either pattern. It's that they're conflating them.

Over-tooled agents have 40 MCP tools and no behavioral guidance. The agent can look up orders, check inventory, process refunds, schedule callbacks, and search the knowledge base. But it doesn't know when to use which. It calls the refund API when the customer just wants to know their return window. It schedules a callback when it could resolve the issue itself.

Over-skilled agents have detailed behavioral instructions for every scenario but no tools to act on them. The agent knows the refund policy inside out, knows exactly when to escalate, crafts empathetic responses — and then says "I'll make sure our team looks into that" because it can't actually look anything up.

[Diagram] A refund request flowing across the User, Agent (LLM), Skills (Behavior), MCP Tools (Capabilities), and External Systems lanes:

1. User: "I want a refund for order ORD-48291"
2. Agent loads the refund-handling skill: check order status first, 30-day policy, empathetic tone
3. Reasoning (skill): check status before discussing the refund
4. Tool: lookup_order({ orderId: "ORD-48291" }) → GET /orders/ORD-48291 → { status: "delivered", date: "2026-02-20", total: 89.99 }
5. Reasoning (skill): delivered 21 days ago — within the 30-day window, so offer a full refund
6. Tool: process_refund({ orderId: "ORD-48291", amount: 89.99 }) → POST /refunds → { refundId: "REF-1234", status: "processing" }
7. Agent: "I've initiated a full refund of $89.99. You'll see it within 5-7 business days."
How tools and skills serve different roles in the agent reasoning loop

Notice how the skill and tool interleave. The skill tells the agent to check the order before discussing refunds. The tool does the checking. The skill applies the 30-day policy to the result. The tool processes the refund. Neither layer works alone.

How MCP tools work under the hood

MCP tools are external capabilities exposed through a JSON-RPC 2.0 protocol with schema-driven discovery, structured input validation, and deterministic execution. The protocol handles capability negotiation at connection time, so the agent learns what tools exist before it needs them.

Building an MCP tool server

Here's an MCP server that exposes customer support tools — the same tools you'd compose with skills later:

typescript
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";
 
const server = new McpServer({
  name: "customer-support-tools",
  version: "1.0.0",
});
 
// --- Stub implementations (replace with real API calls in production) ---
 
async function fetchOrder(orderId: string) {
  return {
    id: orderId,
    status: "delivered",
    carrier: "UPS",
    estimatedDelivery: "2026-03-05",
    deliveredDate: "2026-03-01",
  };
}
 
async function initiateRefund(
  orderId: string,
  amount: number,
  reason?: string,
) {
  return { id: `REF-${Date.now()}`, orderId, amount, reason };
}
 
async function createEscalation(
  reason: string,
  priority: string,
  sentiment?: string,
) {
  return {
    id: `TKT-${Date.now()}`,
    estimatedWait: priority === "high" ? "2 minutes" : "10 minutes",
    team: priority === "high" ? "senior-support" : "general-support",
  };
}
 
// Tool 1: Order lookup
server.tool(
  "lookup_order",
  "Retrieves current status, tracking number, carrier, and estimated " +
    "delivery date for a customer order. Use when a customer asks about " +
    "shipping, delivery, or order status.",
  {
    orderId: z.string().describe("Order ID in format ORD-XXXXX"),
  },
  async ({ orderId }) => {
    const order = await fetchOrder(orderId);
    if (!order) {
      return {
        content: [{ type: "text", text: `Order ${orderId} not found` }],
        isError: true,
      };
    }
    return {
      content: [{
        type: "text",
        text: JSON.stringify({
          orderId,
          status: order.status,
          carrier: order.carrier,
          eta: order.estimatedDelivery,
          deliveredDate: order.deliveredDate,
        }),
      }],
    };
  }
);
 
// Tool 2: Refund processing
server.tool(
  "process_refund",
  "Initiates a refund for a delivered order. Returns refund ID and " +
    "processing timeline. Only call after confirming order eligibility.",
  {
    orderId: z.string().describe("Order ID to refund"),
    amount: z.number().describe("Refund amount in dollars"),
    reason: z.string().optional().describe("Reason for refund"),
  },
  async ({ orderId, amount, reason }) => {
    const refund = await initiateRefund(orderId, amount, reason);
    return {
      content: [{
        type: "text",
        text: JSON.stringify({
          refundId: refund.id,
          status: "processing",
          estimatedCompletion: "5-7 business days",
        }),
      }],
    };
  }
);
 
// Tool 3: Escalation
server.tool(
  "escalate_to_human",
  "Transfers the conversation to a human support agent. Use when the " +
    "customer's issue requires judgment beyond automated handling, or " +
    "when the customer explicitly requests a human.",
  {
    reason: z.string().describe("Summary of why escalation is needed"),
    priority: z.enum(["low", "medium", "high"]).describe("Urgency level"),
    customerSentiment: z.enum(["neutral", "frustrated", "angry"])
      .optional()
      .describe("Customer's emotional state"),
  },
  async ({ reason, priority, customerSentiment }) => {
    const ticket = await createEscalation(reason, priority, customerSentiment);
    return {
      content: [{
        type: "text",
        text: JSON.stringify({
          ticketId: ticket.id,
          estimatedWait: ticket.estimatedWait,
          assignedTeam: ticket.team,
        }),
      }],
    };
  }
);
 
async function main() {
  const transport = new StdioServerTransport();
  await server.connect(transport);
}
 
main().catch(console.error);

Each tool is self-contained: name, description, typed schema, deterministic handler. The MCP client discovers all three tools through the protocol handshake. The LLM reads descriptions to decide which tool to call. Inputs are validated against the schema before the handler executes.
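To make that validation step concrete, here's a hand-rolled checker in the spirit of JSON Schema. It's purely illustrative — the real SDK validates through the zod shapes passed to server.tool() — but it shows the contract: reject the call before the handler ever runs.

```typescript
// Illustrative pre-handler input validation against a JSON-Schema-like
// shape. NOT the SDK's actual mechanism — the MCP SDK validates via zod.
type PropSchema = { type: "string" | "number"; description?: string };
type InputSchema = {
  type: "object";
  properties: Record<string, PropSchema>;
  required?: string[];
};

function validateInput(
  schema: InputSchema,
  input: Record<string, unknown>
): string[] {
  const errors: string[] = [];
  // Every required property must be present...
  for (const key of schema.required ?? []) {
    if (!(key in input)) errors.push(`missing required property "${key}"`);
  }
  // ...and every supplied property must be declared and well-typed.
  for (const [key, value] of Object.entries(input)) {
    const prop = schema.properties[key];
    if (!prop) errors.push(`unexpected property "${key}"`);
    else if (typeof value !== prop.type)
      errors.push(`"${key}" should be ${prop.type}, got ${typeof value}`);
  }
  return errors;
}

const lookupOrderSchema: InputSchema = {
  type: "object",
  properties: { orderId: { type: "string", description: "Order ID" } },
  required: ["orderId"],
};

console.log(validateInput(lookupOrderSchema, { orderId: "ORD-48291" })); // → []
console.log(validateInput(lookupOrderSchema, { orderId: 42 })); // one type error
```

The handler only sees inputs that passed this gate, which is what makes tool failures legible: a bad call fails at the boundary with a named error, not deep inside business logic.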

Progressive discovery

Before the MCP specification introduced progressive discovery, loading 50 MCP tools could consume tens of thousands of tokens — tool descriptions and full schemas were loaded into context immediately. Progressive discovery changed this with a two-stage approach:

  1. Stage 1: Load compact summaries — name and description only, 20-50 tokens per tool
  2. Stage 2: When the LLM selects a tool, load the full input schema, output schema, and extended description

In our testing with a 50-tool setup, this reduced token usage from roughly 77,000 tokens to around 8,700 — a reduction of nearly 89%. This is the breakthrough that made large tool inventories practical without drowning the context window.
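The arithmetic is easy to sketch. Assuming roughly 1,540 tokens per fully-loaded tool and 35 per summary (illustrative figures consistent with the totals above):

```typescript
// Back-of-envelope token budget for progressive discovery.
// The per-tool figures are illustrative assumptions, not measurements.
const TOOL_COUNT = 50;
const FULL_TOKENS_PER_TOOL = 1_540; // name + description + full schemas
const SUMMARY_TOKENS_PER_TOOL = 35; // name + short description only

// Eager loading: every tool fully in context up front.
const eagerCost = TOOL_COUNT * FULL_TOKENS_PER_TOOL;

// Progressive: all summaries, plus full detail only for selected tools.
function progressiveCost(selectedTools: number): number {
  return (
    TOOL_COUNT * SUMMARY_TOKENS_PER_TOOL +
    selectedTools * FULL_TOKENS_PER_TOOL
  );
}

console.log(eagerCost); // 77000
console.log(progressiveCost(4)); // 7910
console.log(((1 - progressiveCost(4) / eagerCost) * 100).toFixed(0) + "% reduction"); // 90% reduction
```

The savings scale with how few tools a single turn actually needs — which for most conversations is a handful, not fifty.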

How agent skills work under the hood

Agent skills are structured instruction sets — typically markdown files with YAML metadata — that an agent loads into its context on demand. Unlike tools, skills don't execute anything. They modify how the agent reasons about a problem by providing domain expertise, decision rules, and behavioral guidelines.

Anatomy of a skill

A skill is a directory containing a definition file (usually SKILL.md), optional scripts, and reference materials:

text
skills/
  refund-handling/
    SKILL.md           # Instructions + metadata
    refund-policy.md   # Reference: current policy document
    escalation-tree.md # Reference: who handles what
  tone-and-empathy/
    SKILL.md
    examples.md        # Good/bad response examples

The skill definition uses YAML frontmatter for metadata and markdown for instructions:

markdown
---
name: refund-handling
description: >
  Guides the agent through refund eligibility checks, policy application,
  and customer communication for return and refund requests.
triggers:
  - refund
  - return
  - money back
  - chargeback
dependencies:
  tools:
    - lookup_order
    - process_refund
    - escalate_to_human
---
 
# Refund Handling
 
## Decision Flow
 
1. **Always check order status first** — call lookup_order before discussing any refund
2. **Apply the 30/90 rule:**
   - Delivered within 30 days → offer full refund immediately
   - Delivered 30-90 days ago → offer store credit; escalate if customer insists on cash
   - Delivered beyond 90 days → explain policy, offer 15% goodwill discount
3. **Never promise a timeline** until the refund is confirmed in the system
4. **Never deny a refund without offering an alternative** (store credit, exchange, discount)
 
## Escalation Triggers
 
Escalate to human immediately if:
- Customer mentions legal action or regulatory complaint
- Customer has called about same issue 3+ times (check interaction history)
- Refund amount exceeds $500
- Product is flagged for safety recall
 
## Tone Rules
 
- Acknowledge frustration before explaining policy
- Use the customer's name after the first exchange
- Never say "unfortunately" — rephrase as what you CAN do
- End with a specific next step, not a vague promise

Loading skills in code

Here's a skill loader that discovers and applies skills at runtime:

typescript
import { readFileSync, readdirSync, existsSync } from "fs";
import { join } from "path";
 
interface SkillMetadata {
  name: string;
  description: string;
  triggers: string[];
  dependencies?: {
    tools?: string[];
    skills?: string[];
  };
}
 
interface Skill {
  metadata: SkillMetadata;
  instructions: string;
  references: Map<string, string>;
}
 
class SkillRegistry {
  private skills = new Map<string, Skill>();
 
  loadFromDirectory(skillsDir: string): void {
    const entries = readdirSync(skillsDir, { withFileTypes: true });
 
    for (const entry of entries) {
      if (!entry.isDirectory()) continue;
 
      const skillPath = join(skillsDir, entry.name, "SKILL.md");
      if (!existsSync(skillPath)) continue;
 
      const raw = readFileSync(skillPath, "utf-8");
      const { metadata, body } = this.parseFrontmatter(raw);
 
      // Load reference files
      const refs = new Map<string, string>();
      const refDir = join(skillsDir, entry.name);
      for (const file of readdirSync(refDir)) {
        if (file !== "SKILL.md" && file.endsWith(".md")) {
          refs.set(file, readFileSync(join(refDir, file), "utf-8"));
        }
      }
 
      this.skills.set(metadata.name, {
        metadata,
        instructions: body,
        references: refs,
      });
    }
  }
 
  // Match skills based on user message content
  findRelevantSkills(userMessage: string): Skill[] {
    const lower = userMessage.toLowerCase();
    return Array.from(this.skills.values()).filter((skill) =>
      skill.metadata.triggers.some((trigger) => lower.includes(trigger))
    );
  }
 
  // All registered skills (for dependency validation)
  all(): Skill[] {
    return Array.from(this.skills.values());
  }
 
  // Build the context injection for the LLM
  buildContext(skills: Skill[]): string {
    return skills
      .map((skill) => {
        let context = `## Skill: ${skill.metadata.name}\n\n`;
        context += skill.instructions;
        for (const [name, content] of skill.references) {
          context += `\n\n### Reference: ${name}\n\n${content}`;
        }
        return context;
      })
      .join("\n\n---\n\n");
  }
 
  private parseFrontmatter(raw: string): {
    metadata: SkillMetadata;
    body: string;
  } {
    const match = raw.match(/^---\n([\s\S]*?)\n---\n([\s\S]*)$/);
    if (!match) throw new Error("Invalid SKILL.md format");
    // Simple YAML parsing — production code should use a YAML library
    const yamlStr = match[1];
    const metadata = this.parseYaml(yamlStr) as SkillMetadata;
    return { metadata, body: match[2].trim() };
  }
 
  private parseYaml(yaml: string): Record<string, unknown> {
    // Simplified: handles flat `key: value` pairs and one level of
    // `- item` lists. Nested mappings (e.g. dependencies.tools) and
    // folded scalars (`description: >`) need the 'yaml' package.
    const result: Record<string, unknown> = {};
    let currentKey = "";
    let currentList: string[] = [];
 
    for (const line of yaml.split("\n")) {
      const trimmed = line.trim();
      if (trimmed.startsWith("- ")) {
        currentList.push(trimmed.slice(2).trim());
        result[currentKey] = currentList;
      } else if (trimmed.includes(":")) {
        currentList = [];
        const [key, ...valueParts] = trimmed.split(":");
        currentKey = key.trim();
        const value = valueParts.join(":").trim();
        if (value) result[currentKey] = value;
      }
    }
    return result;
  }
}
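findRelevantSkills boils down to substring matching over triggers. A self-contained sketch of the same logic, minus the filesystem loading:

```typescript
// Self-contained sketch of the trigger-matching logic behind
// findRelevantSkills, with the filesystem loading stripped out.
interface MiniSkill {
  name: string;
  triggers: string[];
}

const sampleSkills: MiniSkill[] = [
  { name: "refund-handling", triggers: ["refund", "return", "money back"] },
  { name: "shipping-inquiry", triggers: ["tracking", "package", "delivery"] },
];

function matchSkills(userMessage: string, registry: MiniSkill[]): string[] {
  const lower = userMessage.toLowerCase();
  return registry
    .filter((s) => s.triggers.some((t) => lower.includes(t)))
    .map((s) => s.name);
}

console.log(matchSkills("Where is my package?", sampleSkills)); // ["shipping-inquiry"]
console.log(matchSkills("I want my money back", sampleSkills)); // ["refund-handling"]
console.log(matchSkills("Return the package, please", sampleSkills)); // both match
```

Substring matching is deliberately crude — "unreturned" would still trigger the refund skill — so production systems often swap in embedding similarity or an LLM-based classifier for the matching step while keeping the same registry shape.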

Four composition patterns

Skills don't exist in isolation. An agent handling a customer conversation might need refund knowledge, tone guidelines, and product-specific policies simultaneously. How you compose skills determines how the agent reasons:

Flat composition — all skills load into the same context. The agent sees every instruction at once and reasons across them. Simple, but context-heavy for large skill sets.

Hierarchical composition — a parent skill delegates to child skills. A "customer-support" meta-skill might invoke "refund-handling" for refund cases and "shipping-inquiry" for tracking questions. Keeps context focused.

Sequential composition — skills form a pipeline. A "triage" skill classifies the request, passes it to a domain-specific skill, which passes it to a "response-formatting" skill. Each skill transforms the output for the next.

Parallel composition — multiple skills provide simultaneous input. A "product-knowledge" skill and a "customer-history" skill both inform the agent's reasoning about the same request. The agent synthesizes their guidance.

Most production agents use flat or hierarchical composition. Sequential is common for multi-step workflows. Parallel suits situations where the agent needs to balance competing concerns — like a compliance skill and a customer-satisfaction skill that sometimes conflict.
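Sequential composition in particular is easy to picture as a pipeline of context transformers. The stage names and transformations below are hypothetical, not a framework API:

```typescript
// Sketch of sequential skill composition: each stage receives the
// accumulated context and returns a transformed version. Stage names
// and logic are illustrative only.
type Stage = (context: string) => string;

const triageStage: Stage = (ctx) =>
  ctx + "\n[triage] classified as: refund-request";

const refundStage: Stage = (ctx) =>
  ctx + "\n[refund-handling] apply 30/90 rule, check order first";

const formatStage: Stage = (ctx) =>
  ctx + "\n[response-formatting] empathetic tone, end with next step";

// Fold the input through each stage in order.
function runPipeline(stages: Stage[], input: string): string {
  return stages.reduce((ctx, stage) => stage(ctx), input);
}

const result = runPipeline(
  [triageStage, refundStage, formatStage],
  "User: I want a refund"
);
console.log(result);
```

Each stage only needs to know its own contract, which is why sequential composition stays debuggable even as the pipeline grows: you can inspect the context after any stage.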

The decision matrix: tools vs skills vs both

Use MCP tools when the agent needs to take deterministic action on external systems. Use skills when the agent needs domain expertise, workflow guidance, or behavioral rules. Use both when the agent is customer-facing and needs to act reliably while following business logic.

| Scenario | Use MCP tools | Use skills | Use both |
|---|---|---|---|
| Look up order status | Yes | | |
| Know when to offer a refund vs store credit | | Yes | |
| Handle a refund request end-to-end | | | Yes |
| Call a third-party API | Yes | | |
| Follow a multi-step escalation policy | | Yes | |
| Process a return while following company tone guidelines | | | Yes |
| Execute a database query | Yes | | |
| Decide which of 30 tools to use for a given request | | Yes | |
| Customer-facing support agent | | | Yes |

The pattern: tools for actions, skills for decisions, both for production agents.

One useful heuristic: if the capability can be tested with a unit test (input → expected output), it's a tool. If it needs a conversation-level eval to assess quality, it's a skill.
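In code terms: a tool handler like the lookup_order stub above passes the unit-test bar, while a skill never can. This sketch mirrors the earlier stub:

```typescript
// A tool passes the unit-test bar: deterministic input → output.
// fetchOrder here mirrors the stub from the server code above.
async function fetchOrder(orderId: string) {
  return { id: orderId, status: "delivered", deliveredDate: "2026-03-01" };
}

async function testLookupOrder(): Promise<void> {
  const order = await fetchOrder("ORD-48291");
  if (order.id !== "ORD-48291") throw new Error("wrong order returned");
  if (order.status !== "delivered") throw new Error("unexpected status");
  console.log("lookup_order: ok");
}

// A skill has no equivalent check: "was this response empathetic?" and
// "did the agent escalate at the right moment?" need conversation-level
// evals, not equality assertions.
testLookupOrder();
```

If you find yourself writing an equality assertion for a skill, it's probably a business rule that belongs in a tool.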

Building the composed agent

Here's a complete agent that uses MCP tools for actions and skills for behavioral guidance. The agent handles customer support requests — it discovers available tools via MCP, loads relevant skills based on the conversation, and interleaves tool calls with skill-guided reasoning.

typescript
import Anthropic from "@anthropic-ai/sdk";
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";
 
interface ConversationMessage {
  role: "user" | "assistant";
  content: string;
}
 
async function createSupportAgent() {
  // 1. Connect to MCP server for tools
  const mcpClient = new Client({
    name: "support-agent",
    version: "1.0.0",
  });
 
  const transport = new StdioClientTransport({
    command: "node",
    args: ["./customer-support-tools.js"],
  });
  await mcpClient.connect(transport);
 
  // 2. Discover available tools
  const { tools: mcpTools } = await mcpClient.listTools();
  const anthropicTools = mcpTools.map((tool) => ({
    name: tool.name,
    description: tool.description ?? "",
    input_schema: tool.inputSchema as Anthropic.Tool["input_schema"],
  }));
 
  // 3. Load skills
  const skillRegistry = new SkillRegistry();
  skillRegistry.loadFromDirectory("./skills");
 
  // 4. Create the agent loop
  const anthropic = new Anthropic();
  const conversationHistory: ConversationMessage[] = [];
 
  async function handleMessage(userMessage: string): Promise<string> {
    conversationHistory.push({ role: "user", content: userMessage });
 
    // Find relevant skills for this message
    const relevantSkills = skillRegistry.findRelevantSkills(userMessage);
    const skillContext = skillRegistry.buildContext(relevantSkills);
 
    // Build system prompt with skill context
    const systemPrompt = buildSystemPrompt(skillContext);
 
    // Agent loop: call Claude, execute tools, repeat until done
    let messages = conversationHistory.map((m) => ({
      role: m.role as "user" | "assistant",
      content: m.content,
    }));
 
    while (true) {
      const response = await anthropic.messages.create({
        model: "claude-sonnet-4-20250514",
        max_tokens: 4096,
        system: systemPrompt,
        tools: anthropicTools,
        messages,
      });
 
      // If the model wants to use a tool, execute it via MCP
      if (response.stop_reason === "tool_use") {
        const toolUseBlock = response.content.find(
          (block) => block.type === "tool_use"
        );
 
        if (toolUseBlock && toolUseBlock.type === "tool_use") {
          console.log(`  [Tool call: ${toolUseBlock.name}]`);
 
          const toolResult = await mcpClient.callTool({
            name: toolUseBlock.name,
            arguments: toolUseBlock.input as Record<string, unknown>,
          });
 
          // Feed result back to Claude
          messages = [
            ...messages,
            { role: "assistant" as const, content: response.content },
            {
              role: "user" as const,
              content: [
                {
                  type: "tool_result" as const,
                  tool_use_id: toolUseBlock.id,
                  content: JSON.stringify(toolResult.content),
                },
              ],
            },
          ];
          continue; // Let Claude reason about the tool result
        }
      }
 
      // Extract final text response
      const textBlock = response.content.find(
        (block) => block.type === "text"
      );
      const assistantMessage = textBlock?.type === "text"
        ? textBlock.text
        : "";
      conversationHistory.push({ role: "assistant", content: assistantMessage });
      return assistantMessage;
    }
  }
 
  return { handleMessage, close: () => mcpClient.close() };
}
 
function buildSystemPrompt(skillContext: string): string {
  let prompt = `You are a customer support agent. Be helpful, concise, and empathetic.`;
 
  if (skillContext) {
    prompt += `\n\n# Active Skills\n\n`;
    prompt += `Follow these instructions carefully when handling the customer's request:\n\n`;
    prompt += skillContext;
  }
 
  prompt += `\n\n# Tool Usage Rules\n`;
  prompt += `- Always check order status before discussing refund eligibility\n`;
  prompt += `- Never call process_refund without first confirming with the customer\n`;
  prompt += `- If unsure, ask the customer for clarification rather than guessing\n`;
 
  return prompt;
}

The key architectural choice: tools are discovered via MCP protocol, skills are loaded based on message content. The agent doesn't load all skills for every message — it matches triggers and loads only what's relevant. This mirrors how progressive discovery works for tools: start with summaries, load details on demand.


What breaks: five failure modes

Both patterns have failure modes that only surface in production. Knowing them in advance saves you debugging sessions.

1. Tool sprawl without skills

The agent has 40 MCP tools and no behavioral guidance. Tool selection accuracy degrades above 15-20 tools because the LLM has to read every description and pick the right one. Without skills to narrow the scope — "for refund requests, use these three tools in this order" — the agent makes wrong selections.

Symptom: The agent calls check_inventory when the customer asked about a refund.

Fix: Add a triage skill that maps request types to tool subsets. Or use toolsets to group related tools and only expose the relevant group per conversation context.
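That triage-to-toolset mapping can be sketched in a few lines. Tool and category names follow this article's examples; the keyword triage is an illustrative stand-in for LLM-based classification:

```typescript
// Sketch: map triage categories to tool subsets so the agent only
// sees tools relevant to the current request. Category names and the
// keyword-based triage are illustrative assumptions.
const toolsets: Record<string, string[]> = {
  refund: ["lookup_order", "process_refund", "escalate_to_human"],
  shipping: ["lookup_order"],
  inventory: ["check_inventory"],
};

function triage(userMessage: string): string {
  const lower = userMessage.toLowerCase();
  if (/(refund|return|money back)/.test(lower)) return "refund";
  if (/(tracking|package|delivery|shipping)/.test(lower)) return "shipping";
  return "inventory";
}

// Only the matched subset is exposed to the LLM for this turn.
function exposedTools(userMessage: string): string[] {
  return toolsets[triage(userMessage)] ?? [];
}

console.log(exposedTools("I want a refund")); // the three refund tools
console.log(exposedTools("Where is my package?")); // ["lookup_order"]
```

Narrowing the exposed set per turn attacks the selection problem directly: the LLM picks among three descriptions instead of forty.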

2. Skills without tools

The agent has rich behavioral instructions but no way to act on them. It knows the refund policy but can't look up the order. It knows when to escalate but can't create a ticket.

Symptom: Every response ends with "Let me look into that for you" followed by silence.

Fix: Audit skills for tool dependencies. Every skill that references an action ("check the order", "process the refund") should declare dependencies.tools in its metadata. Validate at startup that every declared tool exists in the MCP registry.

typescript
// Validate skill-tool dependencies at startup
function validateDependencies(
  skills: SkillRegistry,
  mcpTools: string[]
): string[] {
  const missing: string[] = [];
  for (const skill of skills.all()) {
    const deps = skill.metadata.dependencies?.tools ?? [];
    for (const toolName of deps) {
      if (!mcpTools.includes(toolName)) {
        missing.push(`Skill "${skill.metadata.name}" requires ` +
          `tool "${toolName}" which is not registered`);
      }
    }
  }
  return missing;
}

3. Skill-tool mismatch

The skill says "check order status" but the tool is named get_order_details. The skill says "escalate to a manager" but the tool only supports escalate_to_human without role targeting. The instructions and capabilities are out of sync.

Symptom: The agent calls the wrong tool because the skill's language doesn't match the tool's description, or it hallucinates a tool that doesn't exist.

Fix: Skills should reference tools by exact name, not by description. Include tool names in skill instructions:

markdown
## Tools Available
- `lookup_order` — get order status (call this, not "check order status")
- `process_refund` — initiate refund (requires orderId and amount)
- `escalate_to_human` — transfer to human agent (no role targeting available)

4. Context budget overflow

Too many tools plus too many skills exceeds the context window. Progressive discovery helps for tools, but skills are still loaded as full text. A skill with policy documents, examples, and reference materials can consume thousands of tokens.

Symptom: The agent starts ignoring later instructions because they were truncated or deprioritized by the LLM's attention mechanism.

Fix: Apply progressive discovery to skills too. Load only the frontmatter (name, description, triggers) initially. Load full instructions only when a skill is matched. Keep reference materials out of the main context — load them as follow-up context only when the agent needs them.

typescript
// Lightweight skill matching — metadata only
function getSkillSummaries(skills: Skill[]): string {
  return skills
    .map((s) => `- ${s.metadata.name}: ${s.metadata.description}`)
    .join("\n");
}
 
// Full context — only for matched skills
function getFullSkillContext(skill: Skill): string {
  return skill.instructions; // Load references on demand
}

5. Security gaps: skills bypassing tool governance

MCP tools have governance infrastructure — policy controls can restrict which tools are permitted per organization. Skills have no equivalent governance layer. A skill that says "always process the refund immediately without checking eligibility" can override the behavioral guardrails that the tool governance assumes.

This isn't hypothetical. In January 2026, Asana discovered that its MCP server feature contained a logic flaw in access control that allowed data belonging to one organization to be accessed by other organizations. The agent could call the tool — nothing governed which tenant's data it should touch. A month later, Invariant Labs demonstrated a prompt-injection attack against the official GitHub MCP server where a malicious public GitHub issue hijacked an AI assistant, pulling data from private repositories including salary information. The agent had the right tools and the right permissions. What it lacked was the behavioral skill to recognize when an input was manipulating its decision-making. Both incidents illustrate the same gap: tools without skills governing when and how to use them are a liability, not a capability.

Symptom: The agent bypasses business rules because a skill instruction overrides tool-level controls.

Fix: Treat skill files as code — version control, code review, access control. Test skills against your business rules using an eval framework. Consider making critical business rules tool-level validations (the refund tool itself checks eligibility) rather than skill-level instructions (the skill tells the agent to check).
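A sketch of that last fix — the refund tool enforcing eligibility itself, so no skill instruction (or injected prompt) can talk the agent past the rule. The 30-day window and $500 escalation threshold come from the skill examples above; everything else is illustrative:

```typescript
// Sketch: the refund tool enforces eligibility itself, so a skill
// instruction (or a prompt injection) cannot bypass the business rule.
// Thresholds follow the refund-handling skill examples in this article.
interface Order {
  id: string;
  status: string;
  deliveredDate: string; // ISO date
}

// "now" is pinned to a constant for determinism in this sketch;
// production code would default to new Date().
function daysSince(isoDate: string, now: Date = new Date("2026-03-13")): number {
  const then = new Date(isoDate).getTime();
  return Math.floor((now.getTime() - then) / (1000 * 60 * 60 * 24));
}

function processRefundGuarded(order: Order, amount: number) {
  if (order.status !== "delivered") {
    return { ok: false as const, error: "order not delivered" };
  }
  if (daysSince(order.deliveredDate) > 30) {
    return { ok: false as const, error: "outside 30-day refund window" };
  }
  if (amount > 500) {
    return { ok: false as const, error: "amount exceeds $500 — escalate" };
  }
  return { ok: true as const, refundId: `REF-${order.id}` };
}

const recent: Order = { id: "48291", status: "delivered", deliveredDate: "2026-02-20" };
const stale: Order = { id: "10001", status: "delivered", deliveredDate: "2025-11-01" };
console.log(processRefundGuarded(recent, 89.99)); // ok: true
console.log(processRefundGuarded(stale, 20)); // outside 30-day window
```

The skill still decides when and how to offer the refund; the tool guarantees an ineligible one can never go through — defense in depth across both layers.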

What's next: the convergence

MCP and skills are converging in practice even as they remain separate in the protocol spec. The 2026 MCP roadmap pushes the protocol toward territory that overlaps with skills.

Transport scalability means MCP servers can run as remote services, which opens the door for skills-as-a-service: shared behavioral packages deployed alongside tool servers.

Agent communication addresses async patterns and agent-to-agent interaction. When one agent can delegate to another, the boundary between "using a tool" and "invoking a skill" gets thinner. Google's A2A protocol is exploring this same space from a different angle.

Enterprise readiness brings audit trails, SSO-integrated auth, and gateway behavior. Today these apply to MCP tools. Tomorrow they'll need to cover skill provenance too — who wrote this skill, when was it last reviewed, is it approved for customer-facing use.

Governance maturation under the Linux Foundation means working groups, contributor ladders, and formal processes. Skills don't have any of this infrastructure yet. The frameworks that implement skills — Claude Code, Spring AI, LangChain — each have their own format. There's no standard skill schema, no interoperability protocol, no policy control layer.

The prediction: within 12 months, we'll see a proposal to make skills a fourth MCP primitive alongside tools, resources, and prompts. The architectural similarity is too strong to maintain two separate systems indefinitely.

Where to go from here

The answer to "which pattern wins" is: the one you compose correctly. MCP tools give agents the ability to reach into external systems and take action. Skills give agents the knowledge of when, why, and how to use those capabilities well. Neither is optional for a production customer-facing agent.

Start with tools. Build the three or four MCP tools your agent needs to take real actions.

Then add skills. Write the behavioral instructions that encode your business logic — refund policies, escalation criteria, tone guidelines. Start with one skill and observe how it changes the agent's reasoning. Add more as you identify gaps.

Then compose. Wire the agent loop so tools are discovered via MCP and skills are loaded based on conversation context. Validate that every skill's tool dependencies exist. Test with real conversations, not just unit tests.

Give your agents both tools and judgment

Chanl manages MCP tool hosting, execution, and monitoring so you can focus on building agents that act and reason correctly.
