Session Mode¶
Session mode governs interactive CLI tools -- programs that stay alive across multiple interactions and maintain internal state. Tools like psql, msfconsole, redis-cli, kubectl exec, and gdb accept commands over time, where each command changes the tool's internal state and carries a different risk level.
What Session Mode Solves¶
The oneshot model (validate, construct, execute, parse) does not fit interactive tools because:
- The tool stays alive across multiple agent interactions
- Each interaction changes the tool's internal state (loaded module, connected database, current context)
- The agent needs intermediate output to decide its next command
- Different commands within the same session carry different risk levels
Sandboxing an interactive tool treats the entire session as a black box. An agent with an open psql connection could DROP TABLE as easily as SELECT * because both are text sent to a PTY. ToolClad session mode declares which commands are allowed, validates each one independently, and gates every interaction through Cedar policy evaluation.
Session Manifest Example¶
A psql session manifest with read and write commands at different risk tiers:
```toml
# tools/psql_session.clad.toml
[tool]
name = "psql_session"
binary = "psql"
mode = "session"
description = "Governed PostgreSQL interactive session"
risk_tier = "medium"

[tool.cedar]
resource = "Data::Database"
action = "query"

[tool.evidence]
output_dir = "{evidence_dir}/{session_id}-psql"
capture = true
hash = "sha256"

# --- Session Lifecycle ---
[session]
startup_command = "psql -h localhost -U agent_user -d appdb"
ready_pattern = "^[a-zA-Z_]+=>[\\s]*$"
startup_timeout_seconds = 10
idle_timeout_seconds = 300
session_timeout_seconds = 1800
max_interactions = 200

[session.interaction]
input_sanitize = ["injection"]
output_max_bytes = 1048576
output_wait_ms = 2000

# --- Session Commands ---
[session.commands.select_query]
pattern = "^SELECT .+$"
description = "Run a read-only SELECT query"
risk_tier = "low"

[session.commands.insert]
pattern = "^INSERT INTO [a-zA-Z_]+ .+$"
description = "Insert rows into a table"
risk_tier = "medium"

[session.commands.update]
pattern = "^UPDATE [a-zA-Z_]+ SET .+$"
description = "Update rows in a table"
risk_tier = "medium"
human_approval = true

[session.commands.drop_table]
pattern = "^DROP TABLE .+$"
description = "Drop a table (destructive)"
risk_tier = "high"
human_approval = true

# --- Output Schema ---
[output.schema]
type = "object"

[output.schema.properties.prompt]
type = "string"
description = "Current psql prompt (indicates connection state)"

[output.schema.properties.output]
type = "string"
description = "Query result or command output"

[output.schema.properties.session_state]
type = "string"
enum = ["ready", "transaction", "error"]
description = "Inferred session state from prompt analysis"

[output.schema.properties.interaction_count]
type = "integer"
description = "Number of interactions in this session so far"
```
Session Configuration¶
[session] Section¶
| Field | Type | Description |
|---|---|---|
| startup_command | string | Command to launch the interactive tool |
| ready_pattern | regex | Pattern that matches the tool's input prompt |
| startup_timeout_seconds | integer | Max time to wait for the first prompt after launch |
| idle_timeout_seconds | integer | Max time between interactions before session is terminated |
| session_timeout_seconds | integer | Max total session lifetime |
| max_interactions | integer | Maximum number of command/response rounds |
[session.interaction] Section¶
| Field | Type | Description |
|---|---|---|
| input_sanitize | array | Sanitization rules applied to all command input |
| output_max_bytes | integer | Maximum bytes to capture from a single interaction |
| output_wait_ms | integer | Time to wait for output after sending a command |
[session.commands.*] Section¶
Each entry under [session.commands] declares a permitted operation:
| Field | Type | Description |
|---|---|---|
| pattern | regex | Regex the command text must match |
| description | string | Human-readable description (shown in MCP tool listing) |
| risk_tier | string | "low", "medium", or "high" |
| human_approval | boolean | Whether this command requires human approval before execution |
| extract_target | boolean | If true, extract the target value for scope checking |
A command that does not match any declared pattern is rejected before it reaches the PTY.
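The rejection rule above can be sketched in a few lines of Python. This is a minimal illustration, not ToolClad's implementation: the `COMMAND_PATTERNS` dict and `validate_command` function are hypothetical names, the patterns are copied from the psql manifest, and the use of `re.fullmatch` (which also rejects a trailing newline) is an assumption about how strictly the match is applied.

```python
import re

# Patterns copied from the psql_session manifest in this document.
COMMAND_PATTERNS = {
    "select_query": r"^SELECT .+$",
    "insert": r"^INSERT INTO [a-zA-Z_]+ .+$",
    "update": r"^UPDATE [a-zA-Z_]+ SET .+$",
    "drop_table": r"^DROP TABLE .+$",
}


def validate_command(command_name: str, text: str) -> bool:
    """Return True only if `text` matches the pattern declared for `command_name`."""
    pattern = COMMAND_PATTERNS.get(command_name)
    if pattern is None:
        # Undeclared command: rejected before anything is written to the PTY.
        return False
    # Assumption: the whole command must match, so "." not crossing newlines
    # also blocks a second statement smuggled in on a new line.
    return re.fullmatch(pattern, text) is not None
```

With these patterns, `validate_command("select_query", "DROP TABLE users")` is rejected because the text does not match the pattern of the command it was submitted under.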
Session Commands as Typed MCP Tools¶
Each [session.commands.*] entry becomes a separate MCP tool visible to the LLM. The naming convention is {tool_name}.{command_name}:
| MCP Tool Name | LLM Sees | Agent Provides |
|---|---|---|
| psql_session.select_query | "Run a read-only SELECT query" | { "command": "SELECT * FROM users LIMIT 10" } |
| psql_session.insert | "Insert rows into a table" | { "command": "INSERT INTO logs (msg) VALUES ('test')" } |
| psql_session.drop_table | "Drop a table (destructive)" | { "command": "DROP TABLE temp_data" } |
The LLM never sees a free-text input field. It picks from typed operations. The command text is validated against the pattern regex, scope-checked if extract_target = true, and policy-gated at the command's declared risk_tier.
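As a rough sketch of the naming convention, the following Python derives one tool entry per declared command. The `mcp_tools` function and the shape of the returned dicts (including the `input_schema` field) are assumptions made for illustration; only the `{tool_name}.{command_name}` naming and the single `command` argument come from the text above.

```python
def mcp_tools(tool_name: str, commands: dict) -> list:
    """Build one tool descriptor per [session.commands.*] entry."""
    tools = []
    for command_name, spec in commands.items():
        tools.append({
            # Naming convention from the docs: {tool_name}.{command_name}
            "name": f"{tool_name}.{command_name}",
            "description": spec["description"],
            "risk_tier": spec["risk_tier"],
            # Hypothetical input schema: a single "command" string.
            "input_schema": {
                "type": "object",
                "properties": {"command": {"type": "string"}},
                "required": ["command"],
            },
        })
    return tools


# Subset of the psql manifest, written as a plain dict.
PSQL_COMMANDS = {
    "select_query": {"description": "Run a read-only SELECT query", "risk_tier": "low"},
    "insert": {"description": "Insert rows into a table", "risk_tier": "medium"},
    "drop_table": {"description": "Drop a table (destructive)", "risk_tier": "high"},
}

TOOLS = mcp_tools("psql_session", PSQL_COMMANDS)
```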
Per-Interaction ORGA Gating¶
Session mode turns each round of interaction into its own ORGA cycle:
```
Iteration 1:
  Observe: "psql is ready, showing dbname=> prompt"
  Reason:  LLM proposes "SELECT * FROM users LIMIT 10"
  Gate:    Cedar checks: is SELECT allowed? Is this agent read-only?
           ToolClad validates: command matches select_query pattern
  Act:     SessionExecutor sends command to PTY, waits for next prompt

Iteration 2:
  Observe: "Query returned 10 rows, showing dbname=> prompt"
  Reason:  LLM proposes "DROP TABLE users"
  Gate:    Cedar checks: is DROP allowed? -> DENY (or PENDING human approval)
  Act:     (blocked, denial fed back to agent)
```
Every command passes through Cedar policy evaluation and ToolClad pattern validation. The LLM cannot free-type into a terminal.
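The gate step can be approximated as a function that returns one of three decisions. This is a sketch under stated assumptions: `policy_allows` is a hypothetical callback standing in for Cedar evaluation (it is not the Cedar API), the `gate` function and `COMMANDS` dict are illustrative names, and the ordering (pattern first, then policy, then human approval) is one plausible reading of the flow described here.

```python
import re
from typing import Callable

# Two commands from the psql manifest, written as a plain dict.
COMMANDS = {
    "select_query": {"pattern": r"^SELECT .+$", "risk_tier": "low"},
    "drop_table": {
        "pattern": r"^DROP TABLE .+$",
        "risk_tier": "high",
        "human_approval": True,
    },
}


def gate(command_name: str, text: str, commands: dict,
         policy_allows: Callable[[str], bool]) -> str:
    """Return "ALLOW", "DENY", or "PENDING" for one proposed command."""
    spec = commands.get(command_name)
    # Pattern validation: undeclared or non-matching input never proceeds.
    if spec is None or re.fullmatch(spec["pattern"], text) is None:
        return "DENY"
    # Policy evaluation (stand-in for Cedar, keyed on the command name only).
    if not policy_allows(command_name):
        return "DENY"
    # Commands flagged human_approval wait for a person before execution.
    if spec.get("human_approval", False):
        return "PENDING"
    return "ALLOW"


def read_only(name: str) -> bool:
    """Hypothetical policy for a read-only agent."""
    return name == "select_query"
```

A read-only agent proposing `DROP TABLE users` gets `"DENY"`; an agent whose policy permits the drop still gets `"PENDING"` because the manifest marks that command for human approval.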
Prompt-Based State Inference¶
The ready_pattern regex tells the SessionExecutor when the tool is waiting for input. Different prompts reveal different internal states:
| Prompt | Inferred State |
|---|---|
| msf6 > | ready (no module loaded) |
| msf6 exploit(ms17_010) > | module_loaded |
| dbname=> | ready (psql, standard user) |
| dbname=# | ready (psql, superuser -- higher risk tier) |
The executor parses prompt changes and reports session_state in the output schema. Cedar policies can reference session state for authorization decisions.
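A minimal sketch of prompt-based inference, covering only the four prompts in the table above. The regexes, the `infer_state` function, the `"unknown"` fallback, and the boolean "elevated" flag for the superuser prompt are all assumptions for illustration; the real executor may model state differently.

```python
import re

# (pattern, state, elevated) triples; order matters because the first match wins.
PROMPT_RULES = [
    (re.compile(r"^msf6 \w+\([^)]+\) >\s*$"), "module_loaded", False),
    (re.compile(r"^msf6 >\s*$"), "ready", False),
    (re.compile(r"^[a-zA-Z_]+=#\s*$"), "ready", True),   # psql superuser prompt
    (re.compile(r"^[a-zA-Z_]+=>\s*$"), "ready", False),  # psql standard user
]


def infer_state(prompt: str) -> tuple:
    """Map a prompt string to (session_state, elevated)."""
    for pattern, state, elevated in PROMPT_RULES:
        if pattern.match(prompt):
            return state, elevated
    # Unrecognized prompt: report it rather than guessing a state.
    return "unknown", False
```

Returning the elevated flag separately keeps `session_state` within a small set of values while still letting a policy treat a superuser prompt as a higher risk tier.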
Session Lifecycle¶
```
spawn -> startup -> ready -> interact -> ... -> terminate
  |                   |         |                    |
  |  startup_command  |  agent  |  idle_timeout      |
  |  wait for         |  sends  |  session_timeout   |
  |  ready_pattern    |  cmds   |  max_interactions  |
  |                   |         |  explicit close    |
  |  startup_timeout  |         |  agent terminates  |
  |  exceeded? FAIL   |         |                    |
```
Sessions are scoped to a single agent ORGA loop. When the loop completes (task done, timeout, or error), the SessionExecutor terminates the process and finalizes the evidence transcript. Sessions do not persist across agent restarts -- a session is an ephemeral execution context, not durable state.
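The termination conditions can be expressed as a single check run before each interaction. The sketch below is hypothetical: `SessionLimits` and `termination_reason` are illustrative names, the defaults mirror the psql manifest, and treating each limit as inclusive (`>=`) is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SessionLimits:
    # Defaults mirror the [session] values in the psql manifest.
    idle_timeout_seconds: float = 300
    session_timeout_seconds: float = 1800
    max_interactions: int = 200


def termination_reason(limits: SessionLimits, started_at: float,
                       last_interaction_at: float, interaction_count: int,
                       now: float) -> Optional[str]:
    """Return why the session should end, or None if it may continue.

    All times are seconds on the same clock (for example time.monotonic()).
    """
    if now - started_at >= limits.session_timeout_seconds:
        return "session_timeout"
    if now - last_interaction_at >= limits.idle_timeout_seconds:
        return "idle_timeout"
    if interaction_count >= limits.max_interactions:
        return "max_interactions"
    return None
```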
SessionExecutor Flow¶
```
Agent ORGA Loop            SessionExecutor              Tool Process (PTY)
     |                            |                              |
     |  propose command           |                              |
     |--------------------------->|                              |
     |                            |  validate against pattern    |
     |                            |  check scope (if extract)    |
     |                            |  Cedar policy evaluation     |
     |                            |                              |
     |                      [if allowed]                         |
     |                            |  write to PTY stdin          |
     |                            |----------------------------->|
     |                            |                              |
     |                            |  read until ready_pattern    |
     |                            |<-----------------------------|
     |                            |                              |
     |                            |  parse prompt -> state       |
     |                            |  frame output                |
     |                            |  validate against schema     |
     |                            |  wrap in evidence envelope   |
     |                            |                              |
     |  {prompt, output, state}   |                              |
     |<---------------------------|                              |
     |                            |                              |
[next ORGA iteration]
```
The SessionExecutor handles:
- PTY allocation: Spawns the tool in a pseudo-terminal, manages process lifecycle
- Prompt detection: Watches stdout for ready_pattern matches to know when the tool is waiting for input
- Output framing: Extracts meaningful output between the sent command and the next prompt, stripping echoed input and ANSI escape codes
- State inference: Parses prompt changes to update session_state
- Timeout enforcement: Kills the session if idle_timeout, session_timeout, or max_interactions are exceeded
- Evidence capture: Logs the full session transcript with timestamps, command/response pairs, and policy decisions
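Output framing is the least obvious of these responsibilities, so here is a small sketch of it. Everything in it is an assumption for illustration: the `frame_output` function, the ANSI regex (which handles CSI sequences only), and the choice to cut output at `output_max_bytes` on a byte boundary. It expects the raw text read from the PTY between sending a command and seeing the next prompt.

```python
import re

# Matches ANSI CSI escape sequences such as "\x1b[1m" (other escape types are not handled).
ANSI_CSI = re.compile(r"\x1b\[[0-9;?]*[ -/]*[@-~]")
# ready_pattern from the psql manifest.
READY_PATTERN = re.compile(r"^[a-zA-Z_]+=>\s*$")


def frame_output(raw: str, sent_command: str, max_bytes: int = 1048576) -> tuple:
    """Return (output, prompt) extracted from raw PTY text."""
    text = ANSI_CSI.sub("", raw).replace("\r\n", "\n")
    lines = text.split("\n")
    # A PTY normally echoes the input; drop it if it is the first line.
    if lines and lines[0].strip() == sent_command.strip():
        lines = lines[1:]
    # The final line is the next prompt if it matches ready_pattern.
    prompt = ""
    if lines and READY_PATTERN.match(lines[-1]):
        prompt = lines.pop().strip()
    output = "\n".join(lines).strip("\n")
    # Enforce output_max_bytes without leaving a broken multi-byte character.
    output = output.encode("utf-8")[:max_bytes].decode("utf-8", errors="ignore")
    return output, prompt
```

The returned prompt can then be passed to state inference, and the framed output placed in the `output` field of the schema.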
Evidence Transcript Logging¶
The SessionExecutor captures a timestamped transcript of every interaction:
```
[00:00.0] STARTUP psql -h localhost -U agent_user -d appdb
[00:01.2] READY prompt="appdb=>" state=ready
[00:01.5] INPUT "SELECT * FROM users LIMIT 10" cedar=ALLOW pattern=select_query
[00:02.1] OUTPUT 10 rows returned prompt="appdb=>" state=ready
[00:02.3] INPUT "DROP TABLE users" cedar=DENY pattern=drop_table
```
This transcript is part of the evidence envelope and provides a complete audit trail of every command the agent sent, whether it was allowed, and what the tool returned.
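A transcript line in the format shown above could be produced as follows. The `format_entry` and `transcript_digest` functions are hypothetical; the timestamp layout is inferred from the sample, and hashing the joined lines with SHA-256 is only a guess at how the `hash = "sha256"` evidence setting might be applied.

```python
import hashlib


def format_entry(elapsed_seconds: float, kind: str, detail: str) -> str:
    """Format one transcript line as [MM:SS.t] KIND detail."""
    # Work in whole tenths of a second to avoid rounding up to "60.0".
    tenths = round(elapsed_seconds * 10)
    minutes, rem = divmod(tenths, 600)
    return f"[{minutes:02d}:{rem // 10:02d}.{rem % 10}] {kind} {detail}"


def transcript_digest(lines: list) -> str:
    """SHA-256 hex digest over the newline-joined transcript (assumed scheme)."""
    return hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()


TRANSCRIPT = [
    format_entry(0.0, "STARTUP", "psql -h localhost -U agent_user -d appdb"),
    format_entry(1.5, "INPUT",
                 '"SELECT * FROM users LIMIT 10" cedar=ALLOW pattern=select_query'),
]
```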
Cedar Policy Examples¶
Read-only agent¶
```cedar
// Allow only SELECT queries for read-only agents
permit (
    principal == Data::Agent::"reader",
    action == Data::Action::"query",
    resource
)
when {
    resource.tool_name == "psql_session.select_query"
};
```
State-aware gating¶
```cedar
// Only allow "run" when module is configured
permit (
    principal == PenTest::Phase::"exploit",
    action == PenTest::Action::"execute_tool",
    resource
)
when {
    resource.tool_name == "msfconsole_session.run" &&
    resource.session_state == "configured"
};
```
Interaction count limits¶
```cedar
// Block all commands after 50 interactions
forbid (
    principal,
    action == Data::Action::"query",
    resource
)
when {
    resource.interaction_count >= 50
};
```
Session Mode Applications¶
| Tool | Session Commands | Governance Value |
|---|---|---|
| psql / mysql | select_query, insert, update, create_table, drop_table | Read-only agents cannot mutate. DDL requires human approval. |
| redis-cli | get, set, del, keys, flushdb | Agents can read/write keys but flushdb requires approval. |
| kubectl exec | get, describe, logs, delete, apply | Agents can inspect but destructive operations are gated. |
| gdb / lldb | info, backtrace, print, set, continue, kill | Agents can inspect state but set and kill are gated. |
| python3 / node | REPL commands with import restrictions | Agent can compute but cannot import os, subprocess, socket. |