The Four Controls
Overview

The four controls

An AI agent has exactly three things an attacker can reach: the arguments it passes to tools, the prompt it is fed, and the output it produces. Each is untrusted. laravel-ai-guardrails places one deterministic control on each surface, plus a human gate on the most dangerous action class.

flowchart TB subgraph Inbound P[User prompt] --> B[B · Input Screening + Audit] end B -->|allowed| LLM[laravel/ai model] B -.->|blocked| Refuse[Pre-model refusal + audit] subgraph Model decisions LLM --> TC[Tool call] LLM --> Txt[Text / structured output] end TC --> A[A · Tool Firewall] A --> D[D · HITL Bridge] Txt --> C[C · Output Handler] A --> Run[Tool executes] D -.->|destructive| Approve[Human approval] C --> UI[Rendered safely]

The mapping

	Control	Untrusted surface	Threat it closes
A	Tool Firewall	model-chosen tool arguments	confused-deputy / IDOR
B	Input Screening + Audit	user prompts	jailbreak / exfiltration prompts
C	Output Handler	model output (text + structured)	stored-XSS / data-exfil / PII leakage
D	HITL Bridge	destructive tool calls	unauthorized destructive actions

Composability

The controls are independent and individually toggleable. A master kill-switch (ai-guardrails.enabled) degrades the whole package to pass-through; each control also has its own enabled flag and an enforce / monitor / off mode. Nothing shares state — you can adopt Control B alone, or all four.

Controls A, B, C are deterministic and offline — no model call, no network. Only Control D reaches out (to laravel-flow for human approval). That is what makes the whole stack reproducible and unit-testable.

The audit is the product

Control B appends every screening attempt — blocked and allowed — to an immutable store. The list of patterns you can argue about; the append-only forensic record you cannot. That record is the value proposition, surfaced through the HTTP admin API and domain events.

Where to go next

Each control page below explains its theory, data model, decision records, and a worked example.
The threat model frames why each posture is chosen.
The architecture overview shows how the pieces wire together at boot.

Last updated: Edit this page

The four controls

The mapping

Composability

The audit is the product

Where to go next

onThisPage