Security Model

ToolClad inverts the security model of sandbox-based approaches. Instead of letting an LLM generate arbitrary commands and intercepting dangerous ones, ToolClad constrains the LLM to fill typed parameters that are validated against a manifest. The dangerous action cannot be expressed because the interface does not permit it.

Allow-List vs Deny-List

Approach	Model	Weakness
Sandbox (deny-list)	LLM generates command -> sandbox intercepts -> allow/deny	Deny-lists have gaps by definition. Novel attack patterns bypass rules.
ToolClad (allow-list)	LLM fills typed parameters -> policy gate -> executor validates -> constructs command	Only declared operations are possible. The attack surface is the manifest.

The sandbox approach requires the security layer to understand every possible dangerous command. ToolClad requires the manifest author to declare every permitted command. The allow-list is finite and auditable.

Shell Injection Prevention

All string-based types reject shell metacharacters by default:

; | & $ ` ( ) { } [ ] < > ! \n \r

This is not optional sanitization -- it is the type system's default behavior. Any value containing these characters is rejected before it reaches command construction. The sanitize = ["injection"] annotation is the default for string types; it does not need to be declared explicitly.

Newline (\n) and carriage return (\r) are blocked across all string-based types. Blocking them prevents:

Command injection via newline splitting (arg1\nmalicious_command)
Header injection in HTTP backends
Log injection in evidence transcripts
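
The default rejection rule can be sketched in a few lines of Python. This is an illustration only: the helper name validate_string is hypothetical and not part of any ToolClad implementation; the character set is copied from the list above.

```python
# Hypothetical helper; the blocked set mirrors the metacharacter list in this section.
SHELL_METACHARACTERS = frozenset(";|&$`(){}[]<>!\n\r")


def validate_string(value: str) -> str:
    """Return the value unchanged, or raise ValueError if it contains a blocked character."""
    blocked = sorted(ch for ch in set(value) if ch in SHELL_METACHARACTERS)
    if blocked:
        # Reject outright instead of stripping or escaping the characters.
        raise ValueError(f"rejected: contains shell metacharacters {blocked!r}")
    return value
```

Rejecting instead of escaping keeps the rule easy to audit: a value either passes unchanged or never reaches command construction.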

Array-Based Execution

ToolClad never invokes sh -c with a command string. Commands are dispatched via direct execve with an argument array. The preferred exec format declares the argv directly; the legacy template format is split via a quote-aware parser before dispatch:

# exec format (preferred) — maps directly to execve:
exec = ["nmap", "-sT", "-sV", "--max-rate", "1000", "10.0.1.0/24"]

# What ToolClad never does (unsafe):
sh -c "nmap -sT -sV --max-rate 1000 10.0.1.0/24"

Array-based execution is defense in depth: validation already rejects metacharacters, but even if one slipped through, nothing would interpret it, because no shell is ever invoked.
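
To see why an argument array is inert, consider this small Python sketch. It is not ToolClad's executor; run_argv is a hypothetical helper, and the Python interpreter stands in for a real tool so the example runs anywhere.

```python
import subprocess
import sys


def run_argv(argv: list[str]) -> str:
    """Run a command from an argument array; no shell ever parses the arguments."""
    # shell=False is the default: each list element reaches the child as one argument.
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout


# A value that would be dangerous inside `sh -c "..."` is just a literal string here.
hostile = "hello; echo injected"
out = run_argv([sys.executable, "-c", "import sys; print(sys.argv[1])", hostile])
```

The child receives the semicolon as ordinary data and prints it back verbatim; no second command runs.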

HTTP Body JSON-Escaping

When values are interpolated into HTTP body templates ([http].body_template), they are JSON-escaped before substitution. This prevents injection attacks where an agent-supplied value could break out of a JSON string field and alter the structure of the request body. Quotes, backslashes, newlines, and control characters are all escaped.
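
A minimal sketch of the escaping step, assuming {name} placeholders that sit inside JSON string quotes in the template. The render_body helper and the placeholder syntax are assumptions for illustration, not the documented template engine.

```python
import json
import re

# Hypothetical placeholder syntax: {name} inside a quoted JSON string.
_PLACEHOLDER = re.compile(r"\{([A-Za-z_][A-Za-z0-9_]*)\}")


def render_body(template: str, values: dict[str, str]) -> str:
    """Substitute placeholders in one pass, JSON-escaping every value."""
    def substitute(match: re.Match) -> str:
        # json.dumps produces a quoted JSON string; drop the outer quotes
        # because the template already supplies them.
        return json.dumps(values[match.group(1)])[1:-1]

    return _PLACEHOLDER.sub(substitute, template)


template = '{"query": "{q}", "limit": 10}'
body = render_body(template, {"q": 'x", "admin": true, "y": "'})
parsed = json.loads(body)
```

The hostile value stays inside the query string: the parsed body still has exactly the two keys the template declared.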

Platform-Aware Evidence Directories

Evidence output directories use platform-appropriate temporary directories (/tmp on Linux/macOS, %TEMP% on Windows) when no explicit output_dir is configured. This ensures evidence capture works correctly across operating systems without hardcoded paths.
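
One way to get this fallback in Python is the standard tempfile module. The evidence_dir helper and the toolclad-evidence subdirectory name below are hypothetical; note also that tempfile.gettempdir() honors environment overrides such as TMPDIR, so the resolved path may differ from the defaults named above.

```python
import tempfile
from pathlib import Path
from typing import Optional


def evidence_dir(output_dir: Optional[str] = None) -> Path:
    """Use the configured output_dir, else a folder under the platform temp directory."""
    if output_dir:
        base = Path(output_dir)
    else:
        # "toolclad-evidence" is an illustrative name, not a documented default.
        base = Path(tempfile.gettempdir()) / "toolclad-evidence"
    base.mkdir(parents=True, exist_ok=True)
    return base
```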

HTTP Error Semantics

HTTP backend responses are classified by status code:

2xx: success status in the evidence envelope
4xx: client_error status -- the request was malformed or unauthorized
5xx: server_error status -- the upstream service failed

This classification gives LLM agents actionable error semantics for self-correction.
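
The mapping is small enough to sketch directly. The function name is hypothetical, and the "other" fallback for 1xx and 3xx codes is an assumption, since this section does not say how those are reported.

```python
def classify_status(status_code: int) -> str:
    """Map an HTTP status code to the evidence-envelope status described above."""
    if 200 <= status_code <= 299:
        return "success"
    if 400 <= status_code <= 499:
        return "client_error"
    if 500 <= status_code <= 599:
        return "server_error"
    # Assumption: codes outside 2xx/4xx/5xx are not covered by this section.
    return "other"
```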

Process Group Kill

Tools are spawned in a new process group (PGID). When a timeout fires, the executor kills the entire process group, not just the top-level process. This prevents:

Orphaned child processes that outlive the timeout
Background processes spawned by the tool that continue running
Resource leaks from tools that fork internally
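
On POSIX systems the pattern looks roughly like the following Python sketch. It is not the ToolClad executor; run_with_timeout is a hypothetical helper, and the sleeping interpreter stands in for a tool that overruns its timeout.

```python
import os
import signal
import subprocess
import sys


def run_with_timeout(argv: list[str], timeout_s: float) -> int:
    """Run argv in its own process group; on timeout, kill the whole group (POSIX only)."""
    # start_new_session=True calls setsid() in the child, so the child becomes
    # the leader of a new process group whose ID equals its PID.
    proc = subprocess.Popen(argv, start_new_session=True)
    try:
        return proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        try:
            # Signal every process in the group, including forked children.
            os.killpg(proc.pid, signal.SIGKILL)
        except ProcessLookupError:
            pass  # the group exited between the timeout and the kill
        return proc.wait()


# Stand-in tool that would otherwise sleep for 30 seconds.
rc = run_with_timeout([sys.executable, "-c", "import time; time.sleep(30)"], timeout_s=0.5)
```

A negative return code means the child was terminated by a signal; here it is the SIGKILL sent to the group.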

Absolute Path Blocking

The path type rejects:

Path traversal: ../, ..\\
Absolute paths: /etc/shadow, C:\Windows\System32
Null bytes: \0

Paths are constrained to relative locations within the project directory. A tool cannot read or write outside its intended scope.
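
The three rejection rules can be sketched as follows. The helper name is hypothetical, and this version is deliberately strict: it rejects any ".." component, not only a leading one.

```python
import re

_WINDOWS_DRIVE = re.compile(r"^[A-Za-z]:")


def validate_relative_path(value: str) -> str:
    """Return the path unchanged, or raise ValueError if it could escape the project."""
    if "\0" in value:
        raise ValueError("null byte in path")
    # Leading separator (POSIX or Windows) or a drive letter means an absolute path.
    if value.startswith(("/", "\\")) or _WINDOWS_DRIVE.match(value):
        raise ValueError("absolute paths are not allowed")
    # Split on both separator styles so ..\ is caught as well as ../
    if ".." in re.split(r"[/\\]", value):
        raise ValueError("path traversal is not allowed")
    return value
```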

Scope Enforcement

Parameters with type scope_target, url (with scope_check = true), cidr, or ip_address are automatically validated against the project's scope definition.

Scope enforcement supports:

IP addresses: Exact match against allowed IPs
CIDR ranges: Membership check against allowed networks
Domain names: Pattern matching against allowed domains
URL hosts: Host extraction and domain matching

The scope check runs in the executor, after Cedar authorization but before command execution. This is defense in depth: even if a Cedar policy bug allows an out-of-scope target, the type system catches it.

For browser mode, URL scope enforcement extends to navigation, redirects, and link clicks. See Browser Mode for details.
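
The IP, CIDR, domain, and URL checks listed above can be approximated with Python's standard library. This is a sketch under stated assumptions: the function names are hypothetical, scope is passed in as two plain lists, and the "*." wildcard syntax for domains is an assumption; the project's actual scope format is not shown in this section.

```python
import ipaddress
from urllib.parse import urlsplit


def in_scope(target: str, allowed_networks: list[str], allowed_domains: list[str]) -> bool:
    """Check an IP, CIDR range, or hostname against the allowed scope."""
    nets = [ipaddress.ip_network(n) for n in allowed_networks]
    try:
        addr = ipaddress.ip_address(target)
        return any(addr in net for net in nets)  # exact IPs are /32 or /128 networks
    except ValueError:
        pass
    try:
        candidate = ipaddress.ip_network(target, strict=True)
        # subnet_of raises TypeError on mixed IPv4/IPv6, so compare versions first.
        return any(candidate.version == net.version and candidate.subnet_of(net) for net in nets)
    except ValueError:
        pass
    host = target.lower().rstrip(".")
    for pattern in (p.lower() for p in allowed_domains):
        if pattern.startswith("*."):
            if host.endswith(pattern[1:]):  # suffix match on ".example.com"
                return True
        elif host == pattern:
            return True
    return False


def url_in_scope(url: str, allowed_networks: list[str], allowed_domains: list[str]) -> bool:
    """Extract the host from a URL and apply the same scope check."""
    host = urlsplit(url).hostname
    return host is not None and in_scope(host, allowed_networks, allowed_domains)
```

Matching on the ".example.com" suffix, not the bare "example.com" string, is what keeps a lookalike such as evilexample.com out of scope.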

Cedar Policy Integration

The [tool.cedar] section declares the Cedar resource and action for the tool:

[tool.cedar]
resource = "PenTest::ScanTarget"
action = "execute_tool"

The runtime's Gate phase builds a Cedar authorization request from the manifest metadata plus the agent's runtime context (phase, environment, agent identity). Cedar policies in policies/ evaluate this request and produce ALLOW, DENY, or PENDING (for human approval).

Automatic Policy Generation

Because the manifest declares the tool's risk tier, parameter types, and Cedar resource, the runtime can generate baseline Cedar policies:

// Auto-generated from nmap_scan.clad.toml (risk_tier = "low")
permit (
    principal,
    action == PenTest::Action::"execute_tool",
    resource
)
when {
    resource.tool_name == "nmap_scan"
};

Teams then refine the generated policies with phase restrictions, environment constraints, and human approval gates.

Session and Browser Cedar Context

For session and browser modes, Cedar policies receive additional context:

resource.session_state -- current state of the interactive tool
resource.interaction_count -- number of interactions so far
resource.command -- the specific session/browser command being invoked
resource.page_state.* -- browser page state (URL, domain, forms, auth status)

This enables state-aware and time-aware governance on interactive tools.

SchemaPin Integration

SchemaPin signs .clad.toml files directly as first-class artifacts:

schemapin-sign tools/nmap_scan.clad.toml

The signature covers the entire behavioral contract. If anyone tampers with a command template, validation rule, scope constraint, output schema, or session command pattern, the hash changes and verification fails.

Verification Flow

Runtime loads nmap_scan.clad.toml from tools/
Runtime hashes the manifest content (SHA-256)
Runtime looks up the tool's provider domain (from toolclad.toml or symbiont.toml)
Runtime fetches .well-known/schemapin.json from the provider domain
Runtime verifies the hash against the published signature using SchemaPin's TOFU pinning
If verification fails, the manifest is rejected and the tool is not registered

No [tool.schemapin] section is needed in the manifest. The manifest stays clean. SchemaPin uses its existing .well-known/schemapin.json discovery infrastructure.
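
The hashing step of the flow above is easy to illustrate; signature verification, discovery, and TOFU pinning are SchemaPin's job and are left out. The helper names are hypothetical, and hashing the raw file bytes is an assumption: SchemaPin may canonicalize the content before hashing.

```python
import hashlib
import hmac


def manifest_digest(content: bytes) -> str:
    """SHA-256 hex digest of the manifest bytes (assumes no canonicalization)."""
    return hashlib.sha256(content).hexdigest()


def digest_matches(content: bytes, expected_hex: str) -> bool:
    """Constant-time comparison of the computed digest with an expected one."""
    return hmac.compare_digest(manifest_digest(content), expected_hex.lower())
```

Changing a single byte of the manifest produces a different digest, which is why any edit to a template or validation rule invalidates the signature.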

What the Signature Protects

The signature covers more than an MCP JSON Schema would:

Protected	MCP Schema Only	ToolClad Manifest
Parameter names and types	Yes	Yes
Validation rules (regex, ranges)	No	Yes
Command template	No	Yes
Scope constraints	No	Yes
Session command patterns	No	Yes
Browser scope domains	No	Yes
Output schema	Partial	Yes
Risk tier and Cedar mappings	No	Yes

SchemaPin v1.4 Hardening (alpha)

SchemaPin v1.4-alpha (released 2026-04-30 / 2026-05-01) adds three optional, additive fields and one cross-channel verification mechanism that ToolClad publishers SHOULD adopt for defense in depth. None of them requires manifest changes; they are signing-time options on schemapin-sign.

Feature	What it adds	ToolClad use
Signature expiration (`expires_at`)	Optional RFC 3339 expiry timestamp on the signature. Past the expiry, verifiers degrade (warn) rather than fail.	Sign with a 6-month TTL; runtime policy can refuse `risk_tier = "high"` tools whose signature is stale.
Schema version binding (`schema_version` + `previous_hash`)	Optional semver tag plus a hash chain to the prior signed version. Defends against rug-pull substitutions.	Set `schema_version` to your manifest's `[tool] version`. Set `previous_hash` to the prior release's `skill_hash`.
DNS TXT cross-verification (`_schemapin.{vendor-domain}`)	Optional second-channel trust signal independent of HTTPS hosting.	Publish the TXT record alongside `.well-known/schemapin.json`. A divergence between the two indicates compromise of one channel.

Recommended sign command for a versioned ToolClad release:

# Read [tool] version from the manifest (semver string already required)
TOOL_VERSION=$(awk -F\" '/^version[[:space:]]*=/ {print $2; exit}' tools/nmap_scan.clad.toml)

# For the first release of a tool: no previous_hash
schemapin-sign tools/nmap_scan.clad.toml \
    --expires-in 6mo \
    --schema-version "$TOOL_VERSION"

# For subsequent releases: chain to the prior signed version
PREV_HASH=$(jq -r '.skill_hash' tools/nmap_scan.clad.toml.sig.prior)
schemapin-sign tools/nmap_scan.clad.toml \
    --expires-in 6mo \
    --schema-version "$TOOL_VERSION" \
    --previous-hash "$PREV_HASH"

Verifiers (Symbiont, custom runtimes) SHOULD:

Treat expired = true as a policy signal — refuse high-risk tools, prompt on medium-risk, log on low-risk.
Maintain a per-tool latest_known_hash next to the TOFU pin. On encountering a signature whose previous_hash doesn't match latest_known_hash, prompt the operator (similar to the TOFU key-rotation prompt) before rolling forward.
When the vendor publishes a _schemapin.{vendor-domain} TXT record, fetch it and require the fingerprint to match the discovery-document key. Mismatch is a hard fail (DOMAIN_MISMATCH).
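
The first recommendation amounts to a small decision table. The sketch below is illustrative: the function name and the returned action strings are hypothetical, and refusing unknown risk tiers is an assumption (fail closed), not something this section specifies.

```python
def expiry_decision(expired: bool, risk_tier: str) -> str:
    """Turn the expired flag into a policy action based on the tool's risk tier."""
    if not expired:
        return "allow"
    actions = {"high": "refuse", "medium": "prompt", "low": "log"}
    # Assumption: an unrecognized tier is treated like high risk.
    return actions.get(risk_tier, "refuse")
```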

See the canonical SchemaPin guides for protocol detail.

Backward compatibility: all three features are additive and optional. SchemaPin v1.3 verifiers ignore the new fields entirely; ToolClad publishers can adopt them at their own pace without breaking older runtimes.

No-Eval Guarantee

Conditional evaluators in [command.conditionals] use closed-vocabulary parsers:

[command.conditionals]
service_port = { when = "port != 0", template = "-s {port}" }

The when expression supports only:

Variable references (declared parameter names)
Comparison operators (==, !=)
Logical operators (and, or)
String literals and numeric literals

SECURITY REQUIREMENT: Implementations MUST NOT use eval(), Function(), exec(), or any dynamic code execution mechanism to resolve conditions. Doing so creates a Remote Code Execution (RCE) vulnerability if an attacker can influence manifest content or argument values. All four reference implementations enforce this with a regex-based comparison parser.
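
A closed-vocabulary evaluator in this spirit might look like the Python sketch below. It is not one of the reference implementations; the function names are hypothetical, and for brevity it does not support string literals that themselves contain " and " or " or " (such expressions are rejected, which fails closed).

```python
import re

# One comparison: <name> (== | !=) <number | "string" | 'string'>
_COMPARISON = re.compile(
    r"""^\s*([A-Za-z_][A-Za-z0-9_]*)\s*(==|!=)\s*(?:(-?\d+(?:\.\d+)?)|"([^"]*)"|'([^']*)')\s*$"""
)


def _compare(clause: str, params: dict) -> bool:
    match = _COMPARISON.match(clause)
    if match is None:
        raise ValueError(f"unsupported condition: {clause!r}")
    name, op, number, dq, sq = match.groups()
    if name not in params:
        raise ValueError(f"undeclared parameter: {name!r}")
    if number is not None:
        try:
            equal = float(params[name]) == float(number)
        except (TypeError, ValueError):
            equal = False  # a non-numeric value never equals a numeric literal
    else:
        equal = str(params[name]) == (dq if dq is not None else sq)
    return equal if op == "==" else not equal


def evaluate_when(expr: str, params: dict) -> bool:
    """Evaluate a `when` expression; `and` binds tighter than `or`. Never calls eval()."""
    return any(
        all(_compare(clause, params) for clause in re.split(r"\s+and\s+", group))
        for group in re.split(r"\s+or\s+", expr)
    )
```

Anything outside the grammar, such as a function call or attribute access, fails the regex and raises ValueError instead of being executed.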

Cross-Language Scope Validation

Scope validation involves non-trivial logic: CIDR containment math, IPv4/IPv6 normalization, DNS wildcard suffix matching. Re-implementing this in four languages creates a risk of security drift.

To mitigate this, ToolClad provides:

Normative test vectors (tests/scope_vectors.json) — a shared set of test cases that all implementations must pass
Centralization path — production deployments can use a single scope validation endpoint (gRPC/HTTP) or compile a Rust-based validator to WebAssembly for cross-language use

Static Analysis

Because ToolClad manifests are declarative, you can determine what any tool can possibly do before it ever runs:

Parameter space: Enumerable for enum types, bounded for numeric types, regex-constrained for string types
Command shape: The template defines the exact command structure; only placeholder values vary
Risk profile: Declared risk tier, Cedar resource/action, human approval requirements
Scope constraints: Which targets and domains the tool can reach
Output shape: The exact JSON structure the tool will produce

This enables formal verification: you can prove properties about valid invocations without executing anything.

The Trust Chain

SchemaPin verifies the manifest has not been modified
  -> The manifest constrains what the tool can accept
    -> Cedar authorizes whether this invocation is allowed
      -> The executor validates arguments against manifest types
        -> The executor constructs and runs the command
          -> Each layer trusts the one before it

Each layer in the chain has a single responsibility:

SchemaPin: Integrity -- the manifest is authentic and unmodified
Manifest: Interface -- only declared operations with typed parameters
Cedar: Authorization -- this agent, in this context, is allowed to invoke this tool
Executor: Validation -- all parameter values satisfy their type constraints
Runtime: Execution -- array-based dispatch with timeout and process isolation