TeamSpec is the open standard for defining AI agent teams. Write your agent spec once — tools, permissions, behaviors — and deploy reliably on any platform that supports it. For developers, businesses, and individuals alike.
Without a shared standard, every developer reinvents permissions, tool bindings, failure handling, and observability. The result is fragile, expensive, and impossible to share.
Every developer building AI agents makes the same set of decisions independently: what tools the agent can use, what it can do autonomously, how it handles failures, how you know what it did. None of this work is reusable. When it breaks in production — a task that runs unchecked, a failure that cascades silently — there is no shared framework to diagnose or fix it. You start over.
TeamSpec defines a standard format for describing an agent or a team of agents — the tools they use, the permissions they hold, how they handle failure, and how they communicate. Agent developers write a TeamSpec-compatible config once, and any platform that supports TeamSpec can execute it. The standard is grounded in peer-reviewed research on AI delegation (arXiv:2602.11865) and refined through practitioner input.
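As an illustration of what such a config might contain, here is a minimal sketch in Python. The field names and structure are assumptions made for this example, not the actual TeamSpec schema — consult the spec itself for the real format.

```python
# Illustrative sketch of a TeamSpec-style config.
# All field names here are hypothetical, not the normative schema.
team_spec = {
    "version": "1.0",
    "agents": [
        {
            "name": "researcher",
            "tools": ["web_search", "summarize"],
            "permissions": {
                "autonomous": ["read"],
                "requires_approval": ["external_send"],
            },
            "failure_handling": {
                "tool_error": "retry",
                "budget_exhausted": "escalate",
            },
        }
    ],
    "budget": {"team_cap_usd": 50.0, "per_agent_cap_usd": 10.0},
}

def validate(spec: dict) -> bool:
    """Minimal structural check: every agent declares tools,
    permissions, and failure handling."""
    required = {"name", "tools", "permissions", "failure_handling"}
    return all(required <= set(agent) for agent in spec["agents"])

print(validate(team_spec))  # True
```

The point of the format is exactly this kind of checkability: a platform can validate a config structurally before ever running an agent.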
TeamSpec covers seven dimensions that every well-built agent needs. Each is defined, measurable, and independently assessable — giving developers a clear checklist and platforms a clear target.
The agent must break complex, ambiguous goals into concrete, actionable sub-tasks. It must adapt dynamically when conditions change mid-execution — not just follow a fixed script. Allocation includes cost-aware routing: simpler sub-tasks should use cheaper models; expensive model calls should be reserved for tasks that require them.
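A cost-aware allocation rule can be sketched as follows. The model names, prices, and complexity score are illustrative assumptions, not part of the spec:

```python
# Hypothetical model catalog -- names and prices are assumptions.
MODELS = {
    "small": {"cost_per_call": 0.001},
    "large": {"cost_per_call": 0.03},
}

def route(subtask_complexity: float, threshold: float = 0.5) -> str:
    """Send simple sub-tasks to the cheap model; reserve the
    expensive model for sub-tasks that require it."""
    return "large" if subtask_complexity > threshold else "small"
```

For example, `route(0.2)` picks the cheap model while `route(0.9)` escalates to the expensive one.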
Clear, enforceable authority boundaries. What the agent can do autonomously. What requires human approval. Who is accountable for each decision. Budget limits are a first-class authority boundary — spending thresholds must be defined and enforced structurally, triggering approval gates when exceeded, just like any other high-stakes action.
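A spending gate of this kind might be enforced with a check like the following sketch; the function name and thresholds are hypothetical:

```python
def requires_approval(action_cost: float, spent: float,
                      budget_cap: float, approval_threshold: float) -> bool:
    """An action needs human sign-off if it alone exceeds the
    per-action threshold, or would push total spend past the cap."""
    return (action_cost > approval_threshold
            or spent + action_cost > budget_cap)
```

A $1 action with $2 already spent passes under a $10 cap, while a $6 action trips the $5 per-action threshold and a $3 action trips the cap once $8 is spent — the gate is structural, not advisory.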
Verification systems for task completion, data access scoping to prevent information leakage, content guardrails to prevent harmful outputs, and prompt injection protection. The highest-weight dimension — enterprise deployment requires structural safety, not best-effort.
The agent detects when things go wrong — tool failures, API outages, ambiguous inputs, conflicting goals, and budget exhaustion. It has defined recovery strategies for each: retry, queue, escalate to human, or gracefully halt with clear user communication. Running out of budget is a failure condition, not a silent stop.
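One minimal way to encode such a failure-to-recovery mapping is a lookup table. The failure names mirror the list above; the table itself is an illustrative assumption, not the spec's wording:

```python
# Hypothetical mapping from failure condition to recovery strategy.
RECOVERY = {
    "tool_failure": "retry",
    "api_outage": "queue",
    "ambiguous_input": "escalate",
    "conflicting_goals": "escalate",
    "budget_exhausted": "halt",  # an explicit stop, never a silent one
}

def recover(failure: str) -> str:
    # Unknown failure conditions escalate to a human by default,
    # rather than failing silently.
    return RECOVERY.get(failure, "escalate")
```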
Users can clearly specify what they want. The agent communicates its state, progress, and decisions in plain language. Ambiguous instructions are clarified before execution — not silently interpreted.
The agent system handles multiple concurrent agents, delegation chains, and parallel sub-task execution. Coordination protocols prevent conflicts, duplication, and runaway resource consumption. Running parallel agents multiplies cost — the spec requires per-team budget caps, per-agent cost limits, and real-time spend awareness so the team operates within defined financial boundaries.
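Per-team and per-agent caps could be enforced with a tracker along these lines; the class and field names are assumptions for illustration:

```python
class TeamBudget:
    """Tracks spend across concurrent agents against a per-agent
    cap and a team-wide cap (names are illustrative)."""

    def __init__(self, team_cap: float, per_agent_cap: float):
        self.team_cap = team_cap
        self.per_agent_cap = per_agent_cap
        self.spend = {}  # agent name -> cumulative spend

    def charge(self, agent: str, amount: float) -> bool:
        """Record spend; refuse (return False) if either cap
        would be breached."""
        agent_total = self.spend.get(agent, 0.0) + amount
        team_total = sum(self.spend.values()) + amount
        if agent_total > self.per_agent_cap or team_total > self.team_cap:
            return False
        self.spend[agent] = agent_total
        return True
```

Because every charge is checked against both caps before it is recorded, a runaway parallel agent is stopped at the boundary rather than discovered on the invoice.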
Full audit trail of every action: what was done, what tool was called, what data was accessed, what decision was made autonomously, what was escalated. Cost is part of the record — token usage, model spend, and total cost per run must be logged alongside execution events so budget vs. actual is always visible and accountable.
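A single audit record with cost attached might look like this sketch; the field set is an assumption about what a compliant trail could log, not a normative schema:

```python
import json
import time

def audit_event(agent: str, action: str, tool: str, autonomous: bool,
                tokens: int, cost_usd: float) -> str:
    """Serialize one structured audit record (hypothetical fields)."""
    return json.dumps({
        "ts": time.time(),
        "agent": agent,
        "action": action,
        "tool": tool,
        "autonomous": autonomous,
        "tokens": tokens,
        "cost_usd": cost_usd,
    })

record = json.loads(
    audit_event("siggy", "draft_email", "email_client",
                autonomous=True, tokens=1200, cost_usd=0.004)
)
```

Emitting cost in the same record as the action is what makes budget-versus-actual reporting a query over the audit log rather than a separate reconciliation step.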
TeamSpec ships everything you need to go from idea to deployed agent: the spec to define your agents, the tools to run and manage them, and a reference implementation that shows how it all fits together.
Seven dimensions. 100-point scale. Platform-agnostic. The canonical definition of what a compliant enterprise AI agent must do. Read the spec →

The open-source execution layer. Reads configs from AgentHub and launches compliant agents with runtime controls, live status, and full execution logs. Explore Forge →

The open-source config registry. Define, version, and govern every agent configuration in one structured, auditable repository. Explore AgentHub →

The reference implementation. An executive AI assistant that exercises every TeamSpec dimension in production-realistic workflows. See Siggy →

The best way to see TeamSpec in action is a real agent doing real work. Siggy is an executive assistant built to the full TeamSpec standard, and it has been benchmarked on three platforms so you can see exactly what each one does well and where it falls short.
FastBytes ran Siggy on three platforms and published an honest, dimension-by-dimension comparison.
TeamSpec improves when developers with real-world agent experience contribute. If you've run into a deployment problem the spec doesn't cover, your experience belongs here.
Propose new dimensions, refine scoring rubrics, or challenge existing definitions with evidence from production deployments. All spec changes go through an open RFC process.
Add integrations, fix bugs, write tests, improve documentation. Forge and AgentHub are both Apache 2.0 licensed — fork, extend, and contribute back.
Open source projects need sustainable support. FastBytes funds TeamSpec's infrastructure, tooling, and community operations — and provides commercial implementation services for organizations adopting the standard.
Read the spec, define your agents in AgentHub, run them with Forge, or study Siggy to see how it all fits together. Everything is open source and free.