Platform overview

The Agent Architecture Platform

SkillsWorkbench transforms prompts into production-grade AI workflows that can be evaluated, optimized, and deployed across agent runtimes — Claude active today, GPT and Gemini arriving next.

Five-phase workflow · Adversarial evaluation · 78% token reduction · Three deployment paths


The workflow

Five phases, one reliable workflow

Every skill built in SkillsWorkbench passes through the same rigorous pipeline before it touches production.

Brainstorm

Phase 1

Describe your workflow in plain language. Starter chips suggest prompts for each industry, and for healthcare and finance workflows the platform asks clarifying questions before drafting begins.

Covers

Free-form prompting · Starter chips · Industry selection · Requirements gathering

"Build a prior-auth assistant for oncology workflows that checks CPT codes against payer policies"

Draft

Phase 2

The industry architect generates a structured SKILL.md file with trigger phrases, tool declarations, step-by-step instructions, and token strategy annotations — all tuned for your vertical.

Covers

SKILL.md generation · Trigger optimisation · YAML frontmatter · Allowed-tools injection

Output: a production-ready SKILL.md with subagent planning, compliance constraints, and audit hooks
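To make the shape of that output concrete, here is a minimal sketch of how a SKILL.md with YAML frontmatter and numbered instructions could be assembled. The function name and the frontmatter keys (`name`, `description`, `allowed-tools`) are assumptions for illustration; the page does not list the exact fields the industry architect emits.

```python
def render_skill_md(name: str, description: str,
                    allowed_tools: list[str], steps: list[str]) -> str:
    """Render a SKILL.md string: YAML frontmatter, then numbered steps.

    The frontmatter keys are illustrative assumptions, not a confirmed schema.
    """
    frontmatter = "\n".join([
        "---",
        f"name: {name}",
        f"description: {description}",
        f"allowed-tools: {', '.join(allowed_tools)}",
        "---",
    ])
    # Number the instructions starting at 1, one per line.
    body = "\n".join(f"{i}. {step}" for i, step in enumerate(steps, start=1))
    return f"{frontmatter}\n\n## Instructions\n\n{body}\n"


if __name__ == "__main__":
    print(render_skill_md(
        name="healthcare-prior-auth-assistant",
        description="Checks CPT codes against payer policies",
        allowed_tools=["bash", "file_edit"],
        steps=["Read the prior-auth request", "Look up the payer policy"],
    ))
```

The description line doubles as the trigger text in this sketch, so it is worth keeping it short and specific to the workflow.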

Evaluate

Phase 3

Run the skill against adversarially generated mock data. The Stress Test Sandbox embeds hidden traps and measures accuracy, token usage, and latency. The Eval Lab generates A/B test cases.

Covers

Adversarial mock data · Hidden traps · Accuracy scoring · Token benchmarking

Result: 94% accuracy on 8 adversarial scenarios, ~650 tokens per run, 1.2s estimated latency
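A rough sketch of how accuracy and latency figures like these could be computed is shown below. It is not the Stress Test Sandbox itself: `Scenario`, `evaluate`, and the stub skill are hypothetical names, and exact-match scoring is an assumed metric.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass
class Scenario:
    prompt: str    # adversarial input, possibly containing a hidden trap
    expected: str  # the answer a correct skill should return


def evaluate(skill: Callable[[str], str], scenarios: list[Scenario]) -> dict:
    """Run every scenario and report exact-match accuracy and mean latency."""
    if not scenarios:
        raise ValueError("at least one scenario is required")
    correct = 0
    start = time.perf_counter()
    for scenario in scenarios:
        answer = skill(scenario.prompt)
        if answer.strip().lower() == scenario.expected.strip().lower():
            correct += 1
    elapsed = time.perf_counter() - start
    return {
        "accuracy": correct / len(scenarios),
        "mean_latency_s": elapsed / len(scenarios),
    }


def stub_skill(prompt: str) -> str:
    """Stand-in for a real skill run: denies only when the policy has expired."""
    return "deny" if "expired" in prompt else "approve"


if __name__ == "__main__":
    report = evaluate(stub_skill, [
        Scenario("policy expired last month", "deny"),
        Scenario("valid CPT code, active policy", "approve"),
        Scenario("duplicate claim for the same visit", "deny"),  # the trap
    ])
    print(report)
```

The third scenario is the kind of hidden trap the phase describes: the stub approves it, so accuracy drops to two out of three.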

Optimise

Phase 4

Review the token economy dashboard. Identify which tasks are overusing expensive models and refine the skill instructions to route bulk work to Haiku and critical reasoning to Opus.

Covers

Haiku-first delegation · Sonnet escalation · Opus for ambiguity · Monthly savings projection

Before: 2,100 tokens at Opus rates. After: 450 tokens with tiered delegation, roughly 78% fewer tokens

Deploy

Phase 5

Export to Claude Code, push to Managed Agents, or integrate with enterprise systems via MCP. The SKILL.md file is portable — deploy once, reuse everywhere.

Covers

Claude Code · Managed Agents · Enterprise MCP · Internal APIs

# Drop into your project: .claude/skills/healthcare-prior-auth-assistant.md

Token economy

The tiered token-saving strategy

Route each task to the cheapest capable model. Reserve expensive models for the work only they can do.

Claude Haiku · ~80% of calls

Bulk scanning & classification

~78% cheaper than Sonnet

File parsing · Pattern matching · Data extraction · Initial triage

Claude Sonnet · ~15% of calls

Synthesis & structured output

Balanced cost/capability

Report generation · Code suggestions · Policy lookup · Structured summaries

Claude Opus · ~5% of calls

Critical reasoning & ambiguity

Reserved for highest stakes

Legal interpretation · Financial edge cases · Clinical ambiguity · Security reasoning
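The three tiers above amount to a lookup from task type to model tier. A minimal sketch, assuming task types are passed as snake_case labels and that unknown tasks fall back to the middle tier (both are assumptions; the page does not say how routing is keyed or what the default is):

```python
# Task labels mirror the categories listed for each tier.
# The tier names are plain labels, not API model identifiers.
ROUTES = {
    "file_parsing": "haiku",
    "pattern_matching": "haiku",
    "data_extraction": "haiku",
    "initial_triage": "haiku",
    "report_generation": "sonnet",
    "code_suggestions": "sonnet",
    "policy_lookup": "sonnet",
    "structured_summaries": "sonnet",
    "legal_interpretation": "opus",
    "financial_edge_cases": "opus",
    "clinical_ambiguity": "opus",
    "security_reasoning": "opus",
}


def route(task_type: str) -> str:
    """Return the cheapest tier listed for a task type.

    Unknown task types fall back to "sonnet" (an assumed default).
    """
    return ROUTES.get(task_type, "sonnet")


if __name__ == "__main__":
    for task in ("initial_triage", "policy_lookup", "clinical_ambiguity"):
        print(task, "->", route(task))
```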

Token comparison — same task

Healthcare prior-auth check · typical run

Standard (Opus-only): 2,100 tokens

SkillsWorkbench tiered: 450 tokens

78% reduction in token cost with tiered delegation
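The headline figure follows directly from the two token counts. The short check below computes the reduction in token count only; a reduction in spend would also depend on per-model pricing, which this page does not state.

```python
def reduction_pct(before: int, after: int) -> float:
    """Percentage reduction going from `before` tokens to `after` tokens."""
    return (1 - after / before) * 100


if __name__ == "__main__":
    # 450 / 2,100 is about 0.214, so about 78.6% of the tokens are saved.
    print(f"{reduction_pct(2100, 450):.1f}%")  # prints 78.6%
```

The page quotes this as 78%, which matches the computed value truncated to a whole percent.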

Guard rails

max_spawn_depth = 2

Prevents recursive subagent cost explosion

context_mode = fork

Isolates each session — no memory bleed

haiku_threshold = bulk tasks

Automatic Haiku routing for scanning
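One way to read the first two guard rails is sketched below: a depth counter that refuses to spawn past the limit, and a forked context that gives each child its own copy of the parent's state. The `Session` class and `spawn` function are hypothetical, and treating "fork" as a deep copy is an assumed interpretation.

```python
import copy
from dataclasses import dataclass, field

MAX_SPAWN_DEPTH = 2  # mirrors max_spawn_depth = 2


@dataclass
class Session:
    depth: int = 0
    context: dict = field(default_factory=dict)


def spawn(parent: Session) -> Session:
    """Create a child session with a forked (copied) context.

    Raises RuntimeError once the parent already sits at the depth limit,
    which stops subagents from spawning subagents without bound.
    """
    if parent.depth >= MAX_SPAWN_DEPTH:
        raise RuntimeError(f"spawn depth limit of {MAX_SPAWN_DEPTH} reached")
    # Deep copy: writes in the child never reach the parent's context.
    return Session(depth=parent.depth + 1, context=copy.deepcopy(parent.context))


if __name__ == "__main__":
    root = Session(context={"notes": ["patient intake"]})
    child = spawn(root)
    child.context["notes"].append("child-only scratch work")
    print(root.context["notes"])   # the parent's notes are unchanged
    grandchild = spawn(child)      # depth 2 is still allowed
    try:
        spawn(grandchild)          # a third level is refused
    except RuntimeError as err:
        print(err)
```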

Architecture

How the platform is structured

SkillsWorkbench is a layered system — each layer is independently testable and the interfaces between them are well-defined.

◎ User

Browser-based SPA

↓

⬡ SkillsWorkbench

Next.js App Router · Streaming API routes

↓

✦ Industry Architect

SaaS · Healthcare (HIPAA) · Finance (GAAP)

↓

◈ Skill Generator

SKILL.md drafting · Trigger optimiser · Token economy

↓

⊕ Evaluation Sandbox

Stress test · Eval lab · A/B comparison

↓

⟶ Deployment Layer

Claude Code · Managed Agents · Enterprise MCP

Anthropic API

· Claude Haiku

· Claude Sonnet

· Claude Opus

Tool Surface

· bash · file_edit

· github_api · search

· MCP servers

Security

· API key server-only

· No browser exposure

· Rate limiting

· Streaming proxied

Use cases

Skills built for real workflows

Each skill is tuned to a specific industry and workflow — not a generic assistant.

SaaS · /audit-security

SaaS Security Architect

Scans codebases for RLS leaks, middleware vulnerabilities, API key exposure, and authentication gaps. Reports findings with line-level provenance and severity ratings.

Tools

· file_edit

· bash

· github_api

Model routing

Haiku scan → Sonnet synthesis

Example output

Detected 3 RLS bypass vectors in middleware.ts lines 44, 78, 112
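For a sense of what "line-level provenance" means in practice, here is a toy scanner that flags possible hard-coded credentials and records the file, line number, and severity for each hit. It is only an illustration: the regular expression, the `Finding` type, and the fixed "high" severity are all assumptions, and the real skill delegates scanning to a model rather than a single pattern.

```python
import re
from dataclasses import dataclass


@dataclass
class Finding:
    path: str
    line: int      # 1-based line number, the "provenance" of the finding
    severity: str
    message: str


# Hypothetical heuristic: a quoted literal of 8+ characters assigned to a
# name containing "api_key", "apikey", or "secret" (case-insensitive).
SECRET_PATTERN = re.compile(
    r"""(?i)\b\w*(api_?key|secret)\w*\s*[:=]\s*["'][^"']{8,}["']"""
)


def scan_source(path: str, text: str) -> list[Finding]:
    """Return one Finding per line that matches the credential heuristic."""
    return [
        Finding(path, number, "high", "possible hard-coded credential")
        for number, line in enumerate(text.splitlines(), start=1)
        if SECRET_PATTERN.search(line)
    ]


if __name__ == "__main__":
    sample = 'const name = "demo";\nconst API_KEY = "abcd1234efgh";\n'
    for finding in scan_source("middleware.ts", sample):
        print(f"{finding.path}:{finding.line} [{finding.severity}] {finding.message}")
```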

Deployment

Three ways to deploy a skill

Skills are portable markdown files. Deploy them to any environment that can run Claude.

Local / Developer

Claude Code

The fastest path for individual developers and engineering teams. Download the SKILL.md from the workbench and place it in your project — no CLI commands or registry needed.

1. Click ↓ Download SKILL.md in the workbench toolbar

2. Create .claude/skills/ in your project root if it doesn't exist

3. Move the downloaded file into that folder

4. Claude Code loads it automatically on the next session

# Place in your project root:
.claude/skills/your-skill-name.md

Best for: Engineers, local iteration, rapid prototyping
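If you would rather script the Claude Code steps than do them by hand, a small helper along these lines would work. `install_skill` is a hypothetical name, and the target layout simply follows the `.claude/skills/your-skill-name.md` path shown above.

```python
import shutil
from pathlib import Path


def install_skill(downloaded: Path, project_root: Path) -> Path:
    """Move a downloaded skill file into <project_root>/.claude/skills/."""
    skills_dir = project_root / ".claude" / "skills"
    # Create the folder (and .claude/) only if it does not exist yet.
    skills_dir.mkdir(parents=True, exist_ok=True)
    target = skills_dir / downloaded.name
    shutil.move(str(downloaded), str(target))
    return target


if __name__ == "__main__":
    import sys

    # Usage: python install_skill.py ~/Downloads/your-skill-name.md .
    print(install_skill(Path(sys.argv[1]).expanduser(), Path(sys.argv[2])))
```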

Server / Automated

Managed Agents

The Deploy panel assembles an Anthropic Managed Agents-compatible payload. Copy it from the browser console and submit it to the Anthropic API to create a hosted, long-running agent session.

1. Click Deploy in the workbench

2. The payload is logged to the browser console

3. Submit it to the Anthropic Managed Agents API

4. Sessions stream events via SSE

{ "anthropic-beta": "managed-agents-2026-04-01",
  "skill": { "markdown": "..." } }

Best for: Long-running workflows, enterprise automation, multi-step pipelines
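As an illustration of the payload shape shown above, the sketch below builds the same two-key object from a skill's markdown. It mirrors the snippet as printed and nothing more: the endpoint URL, authentication, and whether the beta flag travels as an HTTP header or in the body are not specified on this page, so none of them are assumed here. `build_payload` is a hypothetical helper name.

```python
import json


def build_payload(skill_markdown: str) -> str:
    """Serialise the payload in the same shape as the snippet on this page."""
    payload = {
        "anthropic-beta": "managed-agents-2026-04-01",
        "skill": {"markdown": skill_markdown},
    }
    return json.dumps(payload, indent=2)


if __name__ == "__main__":
    print(build_payload("---\nname: demo-skill\n---\n\n1. Do the thing\n"))
```

In normal use you would copy the payload the Deploy panel logs rather than build it yourself; the sketch is only meant to show what that payload contains.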

Integration Layer

Enterprise / MCP

Integrate skills into existing enterprise systems using Model Context Protocol (MCP) servers, internal REST APIs, or purpose-built adapters for healthcare, finance, and legal platforms.

1. Export skill as portable SKILL.md

2. Wrap in MCP tool definition or API handler

3. Connect to ERP, FHIR, SEC Edgar, or GitHub

4. Trigger via webhook, schedule, or event stream

# Connect via MCP or your internal API gateway

Best for: Enterprise teams, compliance workflows, regulated industry integrations
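Step 2 can be pictured as describing the skill the way an MCP server advertises a tool: a name, a description, and a JSON Schema for its input. The sketch below builds such a definition as a plain dictionary. The `request` parameter and the helper name are assumptions, and wiring it into an actual MCP server is left out.

```python
import json


def skill_tool_definition(name: str, description: str) -> dict:
    """Describe a skill as an MCP-style tool: name, description, inputSchema."""
    return {
        "name": name,
        "description": description,
        "inputSchema": {
            "type": "object",
            "properties": {
                # Hypothetical single free-text input for the skill.
                "request": {
                    "type": "string",
                    "description": "The case or document the skill should process",
                },
            },
            "required": ["request"],
        },
    }


if __name__ == "__main__":
    tool = skill_tool_definition(
        "healthcare-prior-auth-assistant",
        "Checks CPT codes against payer policies",
    )
    print(json.dumps(tool, indent=2))
```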

Ready to build your first skill?

Open the workbench, pick your industry, and describe the workflow you want to automate. The platform will guide the rest.
