Security Model¶
ToolClad inverts the security model of sandbox-based approaches. Instead of letting an LLM generate arbitrary commands and intercepting dangerous ones, ToolClad constrains the LLM to fill typed parameters that are validated against a manifest. The dangerous action cannot be expressed because the interface does not permit it.
Allow-List vs Deny-List¶
| Approach | Model | Weakness |
|---|---|---|
| Sandbox (deny-list) | LLM generates command -> sandbox intercepts -> allow/deny | Deny-lists have gaps by definition. Novel attack patterns bypass rules. |
| ToolClad (allow-list) | LLM fills typed parameters -> policy gate -> executor validates -> constructs command | Only declared operations are possible. The attack surface is the manifest. |
The sandbox approach requires the security layer to understand every possible dangerous command. ToolClad requires the manifest author to declare every permitted command. The allow-list is finite and auditable.
Shell Injection Prevention¶
All string-based types reject shell metacharacters by default:
This is not optional sanitization -- it is the type system's default behavior. Any value containing these characters is rejected before it reaches command construction. The sanitize = ["injection"] annotation is the default for string types; it does not need to be declared explicitly.
Newline (\n) and carriage return (\r) are blocked across all string-based types. These prevent:
- Command injection via newline splitting (
arg1\nmalicious_command) - Header injection in HTTP backends
- Log injection in evidence transcripts
Array-Based Execution¶
ToolClad never invokes sh -c with a command string. Commands are dispatched via direct execve with an argument array. The preferred exec format declares the argv directly; the legacy template format is split via a quote-aware parser before dispatch:
# exec format (preferred) — maps directly to execve:
exec = ["nmap", "-sT", "-sV", "--max-rate", "1000", "10.0.1.0/24"]
# What ToolClad never does (unsafe):
sh -c "nmap -sT -sV --max-rate 1000 10.0.1.0/24"
Array-based execution means that even if a metacharacter somehow passed validation (it cannot, but defense in depth), the shell would never interpret it. There is no shell.
HTTP Body JSON-Escaping¶
When values are interpolated into HTTP body templates ([http].body_template), they are JSON-escaped before substitution. This prevents injection attacks where an agent-supplied value could break out of a JSON string field and alter the structure of the request body. Quotes, backslashes, newlines, and control characters are all escaped.
Platform-Aware Evidence Directories¶
Evidence output directories use platform-appropriate temporary directories (/tmp on Linux/macOS, %TEMP% on Windows) when no explicit output_dir is configured. This ensures evidence capture works correctly across operating systems without hardcoded paths.
HTTP Error Semantics¶
HTTP backend responses are classified by status code:
- 2xx:
successstatus in the evidence envelope - 4xx:
client_errorstatus -- the request was malformed or unauthorized - 5xx:
server_errorstatus -- the upstream service failed
This classification gives LLM agents actionable error semantics for self-correction.
Process Group Kill¶
Tools are spawned in a new process group (PGID). When a timeout fires, the executor kills the entire process group, not just the top-level process. This prevents:
- Zombie child processes that outlive the timeout
- Background processes spawned by the tool that continue running
- Resource leaks from tools that fork internally
Absolute Path Blocking¶
The path type rejects:
- Path traversal:
../,..\\ - Absolute paths:
/etc/shadow,C:\Windows\System32 - Null bytes:
\0
Paths are constrained to relative locations within the project directory. A tool cannot read or write outside its intended scope.
Scope Enforcement¶
Parameters with type scope_target, url (with scope_check = true), cidr, or ip_address are automatically validated against the project's scope definition.
Scope enforcement supports:
- IP addresses: Exact match against allowed IPs
- CIDR ranges: Membership check against allowed networks
- Domain names: Pattern matching against allowed domains
- URL hosts: Host extraction and domain matching
The scope check runs in the executor, after Cedar authorization but before command execution. This is defense in depth: even if a Cedar policy bug allows an out-of-scope target, the type system catches it.
For browser mode, URL scope enforcement extends to navigation, redirects, and link clicks. See Browser Mode for details.
Cedar Policy Integration¶
The [tool.cedar] section declares the Cedar resource and action for the tool:
The runtime's Gate phase builds a Cedar authorization request from the manifest metadata plus the agent's runtime context (phase, environment, agent identity). Cedar policies in policies/ evaluate this request and produce ALLOW, DENY, or PENDING (for human approval).
Automatic Policy Generation¶
Because the manifest declares the tool's risk tier, parameter types, and Cedar resource, the runtime can generate baseline Cedar policies:
# Auto-generated from nmap_scan.clad.toml (risk_tier = "low")
permit (
principal,
action == PenTest::Action::"execute_tool",
resource
)
when {
resource.tool_name == "nmap_scan"
};
Teams then refine the generated policies with phase restrictions, environment constraints, and human approval gates.
Session and Browser Cedar Context¶
For session and browser modes, Cedar policies receive additional context:
resource.session_state-- current state of the interactive toolresource.interaction_count-- number of interactions so farresource.command-- the specific session/browser command being invokedresource.page_state.*-- browser page state (URL, domain, forms, auth status)
This enables state-aware and time-aware governance on interactive tools.
SchemaPin Integration¶
SchemaPin signs .clad.toml files directly as first-class artifacts:
The signature covers the entire behavioral contract. If anyone tampers with a command template, validation rule, scope constraint, output schema, or session command pattern, the hash changes and verification fails.
Verification Flow¶
- Runtime loads
nmap_scan.clad.tomlfromtools/ - Runtime hashes the manifest content (SHA-256)
- Runtime looks up the tool's provider domain (from
toolclad.tomlorsymbiont.toml) - Runtime fetches
.well-known/schemapin.jsonfrom the provider domain - Runtime verifies the hash against the published signature using SchemaPin's TOFU pinning
- If verification fails, the manifest is rejected and the tool is not registered
No [tool.schemapin] section is needed in the manifest. The manifest stays clean. SchemaPin uses its existing .well-known/schemapin.json discovery infrastructure.
What the Signature Protects¶
The signature covers more than an MCP JSON Schema would:
| Protected | MCP Schema Only | ToolClad Manifest |
|---|---|---|
| Parameter names and types | Yes | Yes |
| Validation rules (regex, ranges) | No | Yes |
| Command template | No | Yes |
| Scope constraints | No | Yes |
| Session command patterns | No | Yes |
| Browser scope domains | No | Yes |
| Output schema | Partial | Yes |
| Risk tier and Cedar mappings | No | Yes |
SchemaPin v1.4 Hardening (alpha)¶
SchemaPin v1.4-alpha (released 2026-04-30 / 2026-05-01) adds three additive optional fields and one cross-channel verification mechanism that ToolClad publishers SHOULD adopt for defense-in-depth. None of them require manifest changes — they are signing-time options on schemapin-sign.
| Feature | What it adds | ToolClad use |
|---|---|---|
Signature expiration (expires_at) |
Optional RFC 3339 TTL on the signature. Past the expiry, verifiers degrade (warn) rather than fail. | Sign with a 6-month TTL; runtime policy can refuse risk_tier = "high" tools whose signature is stale. |
Schema version binding (schema_version + previous_hash) |
Optional semver tag plus a hash chain to the prior signed version. Defends against rug-pull substitutions. | Set schema_version to your manifest's [tool] version. Set previous_hash to the prior release's skill_hash. |
DNS TXT cross-verification (_schemapin.{vendor-domain}) |
Optional second-channel trust signal independent of HTTPS hosting. | Publish the TXT record alongside .well-known/schemapin.json. A divergence between the two indicates compromise of one channel. |
Recommended sign command for a versioned ToolClad release:
# Read [tool] version from the manifest (semver string already required)
TOOL_VERSION=$(awk -F\" '/^version[[:space:]]*=/ {print $2; exit}' tools/nmap_scan.clad.toml)
# For the first release of a tool: no previous_hash
schemapin-sign tools/nmap_scan.clad.toml \
--expires-in 6mo \
--schema-version "$TOOL_VERSION"
# For subsequent releases: chain to the prior signed version
PREV_HASH=$(jq -r '.skill_hash' tools/nmap_scan.clad.toml.sig.prior)
schemapin-sign tools/nmap_scan.clad.toml \
--expires-in 6mo \
--schema-version "$TOOL_VERSION" \
--previous-hash "$PREV_HASH"
Verifiers (Symbiont, custom runtimes) SHOULD:
- Treat
expired = trueas a policy signal — refuse high-risk tools, prompt on medium-risk, log on low-risk. - Maintain a per-tool
latest_known_hashnext to the TOFU pin. On encountering a signature whoseprevious_hashdoesn't match that pin, prompt the operator (similar to the TOFU key-rotation prompt) before rolling forward. - When the vendor publishes a
_schemapin.{vendor-domain}TXT record, fetch it and require the fingerprint to match the discovery-document key. Mismatch is a hard fail (DOMAIN_MISMATCH).
See the canonical SchemaPin guides for protocol detail:
Backward compatibility: all three are additive optional. SchemaPin v1.3 verifiers ignore the new fields entirely; ToolClad publishers can adopt them at their own pace without breaking older runtimes.
No-Eval Guarantee¶
Conditional evaluators in [command.conditionals] use closed-vocabulary parsers:
The when expression supports only:
- Variable references (declared parameter names)
- Comparison operators (
==,!=) - Logical operators (
and,or) - String literals and numeric literals
SECURITY REQUIREMENT: Implementations MUST NOT use eval(), Function(), exec(), or any dynamic code execution mechanism to resolve conditions. Doing so creates a Remote Code Execution (RCE) vulnerability if an attacker can influence manifest content or argument values. All four reference implementations enforce this with a regex-based comparison parser.
Cross-Language Scope Validation¶
Scope validation involves non-trivial logic: CIDR containment math, IPv4/IPv6 normalization, DNS wildcard suffix matching. Re-implementing this in four languages creates a risk of security drift.
To mitigate this, ToolClad provides:
- Normative test vectors (
tests/scope_vectors.json) — a shared set of test cases that all implementations must pass - Centralization path — production deployments can use a single scope validation endpoint (gRPC/HTTP) or compile a Rust-based validator to WebAssembly for cross-language use
Static Analysis¶
Because ToolClad manifests are declarative, you can determine what any tool can possibly do before it ever runs:
- Parameter space: Enumerable for enum types, bounded for numeric types, regex-constrained for string types
- Command shape: The template defines the exact command structure; only placeholder values vary
- Risk profile: Declared risk tier, Cedar resource/action, human approval requirements
- Scope constraints: Which targets and domains the tool can reach
- Output shape: The exact JSON structure the tool will produce
This enables formal verification: you can prove properties about valid invocations without executing anything.
The Trust Chain¶
SchemaPin verifies the manifest has not been modified
-> The manifest constrains what the tool can accept
-> Cedar authorizes whether this invocation is allowed
-> The executor validates arguments against manifest types
-> The executor constructs and runs the command
-> Each layer trusts the one before it
Each layer in the chain has a single responsibility:
- SchemaPin: Integrity -- the manifest is authentic and unmodified
- Manifest: Interface -- only declared operations with typed parameters
- Cedar: Authorization -- this agent, in this context, is allowed to invoke this tool
- Executor: Validation -- all parameter values satisfy their type constraints
- Runtime: Execution -- array-based dispatch with timeout and process isolation