Runtime Behavioral Security for AI Agents

The pause between intent and action.

Agent threats live at the semantic layer — which tools got called, which files got read, whether behavior matches stated intent. That's only visible in-process. Sentinel ships as a hooks-based SDK that enforces deterministic security checks at the moments that matter.

"The thing protecting you does not inherit the failure modes of the thing it's protecting."
No tokens · No external API · No LLM in the monitoring pipeline · $0 runtime cost · On-prem by default · Policy-less out of the box
Attack Demos: 6 · runnable live, not slides
Checkpoints: 4 · deterministic hooks across every agent action
SDK Modules: 27 · cross-vendor, no platform dependency
Runtime Cost: $0 · fully local, no tokens consumed

Architecture validated by Palo Alto Networks:
"If I had spare time, I would build that."
— Director of Engineering, AI Security · Palo Alto Networks

Every existing safety tool is either locked to one vendor or built on the same AI it's meant to protect. Nobody has built an independent, cross-vendor, deterministic agent security layer. The company that does owns the governance category.


// The Problem · Three Failure Modes
Intent Drift

User asked. Agent did something else.

User: summarize this code
Agent: rm -rf ./src

No vendor's safety layer can compare intent to action. The platform never knew what the user asked for.

Boundary Violation

Agent reads what it shouldn't.

.env · ~/.ssh/id_rsa
/etc/production

Each vendor's safety ends at its own session boundary. Anything outside is invisible — and exfiltratable.

Cross-Vendor

Two agents. One attack.

Claude reads credentials.
30s later
Cursor exfiltrates.

No platform sees both halves. The attack lives in the seam between vendors that don't talk to each other.


// The Architecture

Policy-less by default. Four deterministic checkpoints.

Sentinel ships policy-less out of the box. No YAML to write. No rules to configure. Behavioral baselines do the work — Sentinel learns what normal looks like and flags what isn't. For teams that want explicit control, a single YAML file adds custom policy on top of the baseline. But the default state is zero configuration beyond install.

Hooks block, allow, or guide. No probabilistic classifiers. No LLM judgment calls. Deterministic enforcement at every checkpoint — the only architecture that can make a hard security guarantee.

"A security guard that can be talked into unlocking the vault." That's what an LLM in the security pipeline is.
01 · pre_prompt — intent capture
02 · pre_execution — action gate
03 · post_execution — drift detection
04 · mcp_call — tool boundary

The flow: the agent runtime (any vendor) carries the user prompt — the stated intent — into LLM reasoning, which decides the tool and target. Sentinel's deterministic enforcement layer sits between that decision and execution, evaluating every action at the four checkpoints against the behavioral baseline (rolling window) and 25 sensitivity rules (optional YAML on top). ALLOW passes the action through to tool execution: file · API · command · MCP. BLOCK triggers enforcement escalation: restrict → quarantine → audit.

No LLM in the monitoring pipeline · $0 runtime cost · append-only SHA-256 audit trail
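
In code, the architecture is four hook registrations. A minimal sketch (the checkpoint names are Sentinel's; the agent id and callback bodies are illustrative, not the SDK's actual surface):

const sentinel = await Sentinel.init('coding-agent')

// 01 · capture the user's stated intent for later comparison
sentinel.on('coding-agent', 'pre_prompt', (ctx) => { /* record intent */ })

// 02 · gate the action before it runs: allow | block | guide
sentinel.on('coding-agent', 'pre_execution', (ctx) => { /* evaluate action */ })

// 03 · compare what ran against what was asked
sentinel.on('coding-agent', 'post_execution', (ctx) => { /* score drift */ })

// 04 · evaluate MCP tool calls at the tool boundary
sentinel.on('coding-agent', 'mcp_call', (ctx) => { /* check tool + target */ })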

// Deployment

On-prem by default. Your infrastructure, your keys, your data.

Sentinel runs entirely in-process. No data leaves your environment. No external API calls. No cloud dependency. BYOK — your team pays infrastructure costs directly to your LLM providers. Sentinel never touches your keys, your credentials, or your agent data.

On-premises · BYOK · Zero data egress · In-process SDK · No SaaS dependency

Enterprise security teams won't trust SaaS with security data at this stage. We know. Sentinel was built for that reality from day one — no hosted component, no data retention, no sub-processor disclosures required.


// For Security Leaders

What Sentinel actually prevents.

No code required to understand this section. If your organization is deploying AI agents in production — Cursor, Claude Code, internal LangChain agents, MCP tools — here's what changes with Sentinel installed.

Credential exfiltration

Agents compromised by prompt injection read .env files, SSH keys, and production secrets — then POST them externally. Sentinel blocks the file read before it executes. The credential is never seen.

Cross-vendor coordination attacks

Agent A reads credentials. Agent B exfiltrates 30 seconds later. No single vendor sees both halves. Sentinel monitors all agents from one layer, regardless of vendor, and correlates behavior across the fleet.

Intent drift

"Summarize this code" becomes "delete the source directory." Sentinel scores every action against the declared task. When alignment drops, enforcement kicks in — before the destructive action runs.

Compliance audit trail

Every agent action is logged to an append-only, SHA-256 hash-chained audit trail. Structured for SOC 2, GDPR, HIPAA, and EU AI Act. When the auditor asks "what did your agents do," you hand them a report — not a shrug.

Deployment model

On-premises. In-process. BYOK. No data leaves your environment. No hosted component. No SaaS trust questions. No sub-processor disclosures. Your infrastructure, your keys, your control. Sentinel installs as an npm package and configures from a single YAML file — or runs policy-less with behavioral baselines active from minute one.


// The Numbers
80%+
Attack success rate for behavioral control traps across five real agent frameworks. Shapira et al. (2025), cited in Franklin et al., "AI Agent Traps," Google DeepMind, 2026: web-use agents driven to exfiltrate local files, passwords, and secrets via task-aligned injections.
4 of 4
Sentinel detection layers caught the attack independently in a live LLM demo — real Claude Sonnet 4 agent, poisoned file, no scripted behavior. Side by side, the vulnerable agent reads .env.production and POSTs it externally across three consecutive runs; the Sentinel-protected agent, on identical prompts, is blocked at every layer independently.
48%
of cybersecurity professionals rank agentic AI as the #1 attack vector for 2026. Above phishing. Above ransomware. Above supply chain. The attack surface grew before the security layer existed.
0
Published NIST AI agent security standards. No finalized standards exist. Companies that build audit infrastructure now arrive at compliance ahead of the mandate, with 12 months of data already in place when requirements land.
// Deadline

EU AI Act — August 2, 2026

Automatic logging required for high-risk AI. Penalty: up to €35M or 7% of global revenue.

// Market Signal

Palo Alto Prisma AIRS 3.0

Announced at RSA 2026. Agent security is now a board-level priority — the window to move first is closing.


// Why Existing Tools Fail

01 — Vendor-locked safety ends at the session boundary.

Claude's safety layer knows nothing about what Cursor did five minutes ago. Each vendor protects their own surface — and most enterprises run more than one agent. Cross-vendor coordination attacks are invisible to existing tools. Sentinel monitors all agents from a single layer.
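
Correlating the two halves requires no model in the loop; it is set intersection over a time window. A sketch under assumed event shapes (kind, agentId, and ts are illustrative field names, and the 60-second window mirrors the demo above):

// Did any agent POST externally shortly after a *different* agent read credentials?
function seesCrossVendorExfil(events, windowMs = 60_000) {
  const reads = events.filter((e) => e.kind === 'credential_read')
  return events
    .filter((e) => e.kind === 'network_write')
    .some((send) =>
      reads.some((read) =>
        read.agentId !== send.agentId &&   // e.g. Claude reads, Cursor sends
        send.ts > read.ts &&
        send.ts - read.ts < windowMs
      )
    )
}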

02 — LLM classifiers inherit the failure mode they were built to stop.

The dominant approach runs an LLM to classify whether a prompt is dangerous. A prompt injection that gets through the agent also gets through the classifier. You're using the compromised surface to evaluate itself. Sentinel has no LLM in the monitoring pipeline. Deterministic enforcement cannot be prompt-injected.

03 — Sandboxes contain blast radius. They're blind to intent drift inside the box.

Sandbox tools give agents isolated execution environments. They cannot tell you that the agent inside is doing something different from what the user asked. A monitoring tool is a security camera. Sentinel's pre_execution hook is a locked door.
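
Intent-to-action comparison, by contrast, can be fully deterministic. An illustrative token-overlap score (not Sentinel's actual scorer):

// Fraction of the action's tokens that appear in the declared task.
function alignmentScore(intent, action) {
  const tokens = (s) => new Set(s.toLowerCase().split(/\W+/).filter(Boolean))
  const task = tokens(intent)
  const act = [...tokens(action)]
  return act.length ? act.filter((w) => task.has(w)).length / act.length : 1
}

alignmentScore('summarize this code', 'rm -rf ./src')  // 0: zero overlap, enforce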

04 — Policy-based tools suffer death by a thousand cuts.

Traditional security products require manual policy configuration. Policies get out of date. They require constant maintenance. They create friction developers route around. Sentinel ships policy-less by default — behavioral baselines do the work instead of manual policy management.


// How It Works
Step 01 // Install

One command. No account.

Install the SDK from npm. Fully local — your data never leaves your environment. No cloud signup. No procurement cycle.

# Install
$ npm install @tuent/sentinel

// Initialize — policy-less, behavioral baseline active
import { Sentinel } from '@tuent/sentinel'
const sentinel = await Sentinel.init('agent-id')
Step 02 // Run (policy-less)

Sentinel observes immediately.

25 built-in sensitivity rules block credentials, SSH keys, and system files automatically. Behavioral baselines build over 30 days. No YAML required.

// Sentinel learns what normal looks like.
// No policy file. No configuration. Just install and run.
// .env, .ssh, /etc — blocked automatically.

// Optional: add explicit policy when you want it
const sentinel = await Sentinel.fromPolicy('.sentinel.yaml')

Optional — when your team wants explicit control:

# .sentinel.yaml — single file, full configuration
agent:
  id: coding-agent
  role: AI Coding Assistant
policy:
  allow:
    actions: [file_read, file_write]
    targets: ["src/**", "docs/**"]
  # .env, SSH, /etc — blocked automatically.
enforcement:
  restrictAfter: 2
  quarantineAfter: 3
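
For intuition, the built-in sensitivity rules reduce to deterministic path checks of this general shape. A hypothetical sketch, not the shipped rule set (the ctx fields and the block return value are assumptions):

// Hypothetical pre_execution hook: pure regex matching, nothing to prompt-inject.
const SENSITIVE = [/\.env(\..+)?$/, /(^|\/)\.ssh\//, /^\/etc\//]

sentinel.on(agentId, 'pre_execution', (ctx) => {
  // Assumed ctx shape: { action: 'file_read', target: '/path/to/file' }
  if (ctx.action === 'file_read' && SENSITIVE.some((r) => r.test(ctx.target))) {
    return 'block'  // the read never executes; nothing to exfiltrate
  }
})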
Step 03 // Enforce

Register hooks. Block before execution.

Hook into any checkpoint. Sentinel evaluates the action before it executes. Violations escalate automatically: restricted at 2, quarantined at 3.

sentinel.on(agentId, 'pre_execution', (ctx) => {
  // ctx.evaluationResult → allow | block | guide
  // Behavioral baseline handles the defaults.
  // Add hooks for application-specific policy.
})
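
The escalation ladder is a pure function of the violation count. An illustrative version matching the defaults above (the state names are assumptions):

function escalationState(violations) {
  if (violations >= 3) return 'quarantine'   // quarantineAfter: 3
  if (violations >= 2) return 'restrict'     // restrictAfter: 2
  return 'observe'                           // log to the audit trail only
}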

// How Sentinel Compares

Three things enterprises actually do today to mitigate agent risk — and where each one stops working.

Claude Auto Mode
  What it does: ask-before-acting approval flow within one vendor's agent
  Cross-vendor: no — Claude only
  Sees intent drift: no — no comparison of intent to action
  Blocks before execution: sometimes — approval fatigue bypasses it
  LLM in security layer: yes — the LLM polices itself
  Behavioral baseline: no
  Deployment: built-in (vendor lock-in)
  Configuration: vendor-defined, limited customization
  Runtime cost: tokens for approval flow
  Audit trail: vendor logs only

Sandbox (Docker, E2B)
  What it does: isolates agent execution to limit blast radius
  Cross-vendor: partial — per-container
  Sees intent drift: no — blind inside the sandbox
  Blocks before execution: no — contains damage after the fact
  LLM in security layer: no
  Behavioral baseline: no
  Deployment: infra team manages containers
  Configuration: container policies, network rules
  Runtime cost: compute per container
  Audit trail: container logs — unstructured

LLM Classifiers (Lakera, etc.)
  What it does: scans prompts and outputs for malicious content
  Cross-vendor: partial — per-model
  Sees intent drift: partial — prompt-level only
  Blocks before execution: no — scans at submission, not runtime
  LLM in security layer: yes — inherits the prompt injection surface
  Behavioral baseline: no
  Deployment: SaaS API — data leaves your environment
  Configuration: hundreds of rules, manual maintenance
  Runtime cost: tokens per classification call
  Audit trail: classification logs

Sentinel
  What it does: wraps every agent action with deterministic hooks, cross-vendor
  Cross-vendor: yes — one SDK, all agents
  Sees intent drift: yes — scores every action against the declared task
  Blocks before execution: yes — the pre_execution hook fires before the action runs
  LLM in security layer: no — fully deterministic
  Behavioral baseline: yes — per-agent, 30-day rolling window
  Deployment: npm install. On-prem. BYOK. 10 minutes.
  Configuration: policy-less by default. Optional YAML.
  Runtime cost: $0 — no external API, no tokens
  Audit trail: append-only, SHA-256 hash-chained. SOC 2, EU AI Act.
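
On the audit-trail row: a hash-chained log is cheap to produce and cheap to verify. A minimal Node sketch (illustrative entry format, not Sentinel's actual schema):

import { createHash } from 'node:crypto'

// Each entry commits to the previous hash, so editing any past record
// invalidates every hash after it. Append-only by construction.
function appendEntry(chain, event) {
  const prev = chain.length ? chain[chain.length - 1].hash : 'GENESIS'
  const payload = JSON.stringify({ prev, event, ts: Date.now() })
  chain.push({ payload, hash: createHash('sha256').update(payload).digest('hex') })
  return chain
}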

// The Secret Sauce

Every competitor builds the observation layer from scratch. Tuent already had one.

Sentinel's enforcement logic sits on top of a complete behavioral observation engine — session classification, deviation detection, rolling baselines, cross-vendor data model. That engine was production-tested on human behavior for months before it ever touched an agent. Porting it took two weeks.

That's the moat. Not speed. Not a team. An observation foundation that any company entering this category has to build before they can ship their first line of enforcement logic. By the time they have it, Sentinel has 12 months of behavioral data inside customer environments.

Test Suite: 1,108 tests
SDK Modules: 27
Sensitivity Rules: 25 built-in
Time to Port: 2 weeks (humans → agents)

// What's Built · Capability, Not Commercial
Live Attack Demos: 6
Hook Checkpoints: 4
SDK Modules: 27
Tests Passing: 1,108

This is capability traction, not commercial traction. Commercial traction is what we're raising to build next —
a design-partner cohort of five named enterprise security and AI-platform teams deploying agents in production.


// Founding Story

We spent months building a system that watched developer filesystem activity and rendered behavior as a living data structure. We thought we were building a productivity tool.

A semiconductor executive deploying AI at scale interrupted a demo fifteen minutes in: the same observation layer would work on agents, and no one was building it. Companies that had rolled out Cursor or Claude Code were running thousands of agents in production with no behavioral monitoring. The existing tools were vendor-locked, LLM-based, or blind to intent.

"The same observation layer would work on AI agents — and no one is doing it."
// The founding moment

Best friends since high school. Second venture together. Roles swapped. Charlie built the core codebase — hooks engine, evaluation pipeline, framework adapters — in three weeks. James leads GTM, business development, and partnership strategy. Same team, different seats, sharper the second time.

"We're not asking you to bet on potential. We're asking you to bet on the second rep."


// Why Free Is the Right First Move

Enterprise security is bought top-down: CISO mandate, procurement cycle, six-month deployment. That requires the problem to already be board-acknowledged. It isn't yet — not for AI agents.

The way this market gets built is bottom-up. A developer installs Sentinel, hooks their agent, and has enforcement running in 10 minutes. They demo it in a PR review. The team lead asks for shared visibility. The VP Engineering asks for SSO and compliance exports. The enterprise deal closes itself — and by then, Sentinel has 12 months of behavioral baselines no competitor can replicate without starting over. Sentinel is at step zero of this motion. The free SDK is the entry point.

// Phase 01

Individual Developer

npm install. Hooks their agent. Enforcement running in 10 minutes. Free, no account. Policy-less — behavioral baselines active immediately.

Trigger: demo in a PR review. Team sees it working.
// Phase 02

Team Lead

Asks for fleet visibility. Alerts in Slack. One dashboard across all agents.

Trigger: second agent incident in the same week.
// Phase 03

VP Engineering / CISO

SSO, compliance exports, policy repository, SLA. Procurement opens. Deal closes itself.

Trigger: SOC 2 audit, EU AI Act, or board security review.
// Pilot Structure

Free pilot. BYOK. Clear boundaries.

Customers cover their own infrastructure costs via bring-your-own-key. Clear data disclosure and LOI required. Minimum 1 month for meaningful evaluation — enterprise cycles can extend to 6 months. Pricing flexibility matters more than revenue at this stage.

// Sales Cycle

Enterprise security sells in months, not weeks.

Sales cycles of up to 6 months are typical for enterprise security products. The free SDK compresses the bottom-up adoption cycle. By the time procurement opens, Sentinel has months of behavioral data the buyer can't get from a fresh install of a competitor.

// First User

Director of Security Engineering or Head of AI Platform at a company deploying AI agents in production. Trigger event: an agent did something unexpected, unauthorized, or destructive — and leadership wants to know how to prevent it.


// Why This Compounds
Data Gravity

Baselines are the moat.

Sentinel gets more accurate with every day it monitors an agent. Behavioral baselines built over 30 days catch deviations a new deployment misses. Switching means losing months of behavioral history — and starting blind again.
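
Mechanically, the check is classic anomaly detection over per-agent history; the value is the history feeding it. A sketch with illustrative parameters (the window size and 3-sigma threshold are assumptions, not the shipped defaults):

// Flag today's activity if it sits far outside the agent's rolling norm.
function deviates(window, today, sigmas = 3) {
  const mean = window.reduce((s, n) => s + n, 0) / window.length
  const sd = Math.sqrt(window.reduce((s, n) => s + (n - mean) ** 2, 0) / window.length)
  return Math.abs(today - mean) > sigmas * Math.max(sd, 1)  // guard sd = 0
}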

Regulatory Tailwind

Early movers win compliance.

EU AI Act deadline is August 2026. NIST frameworks are coming. Companies running Sentinel now arrive at compliance with a year of audit history already built — for free.

Architecture Advantage

Determinism can't be retrofitted.

LLM-based competitors cannot remove the LLM from their pipeline without rebuilding their product. Sentinel's deterministic hooks were designed in from day one. That's not a feature gap — it's structural.


// The Window

Now — 2026

The breach window is open. 88% of orgs already report incidents. Standards don't exist. Monitoring is sparse. Companies who deploy Sentinel now build 12+ months of behavioral baselines before compliance becomes mandatory.

August 2, 2026

EU AI Act automatic logging requirement activates. Penalty: up to €35M or 7% of global revenue. Sentinel customers hand auditors structured compliance reports. Everyone else scrambles.

2026–2027

NIST standards formalize. Market consolidates. Enterprise buyers standardize on 2–3 vendors. Developer-first tools with the deepest behavioral data win. Baselines become the product.

Too Late

The breach that sets the precedent. One high-profile AI agent incident changes every procurement conversation permanently. "Why didn't you have monitoring?" has no good answer.


// Recent · Last 30 Days
Apr 27
Architecture validated by Director of Engineering, AI Security at Palo Alto Networks — confirmed on-prem/BYOK, policy-less UX, and CISO intro pathway.
Apr 26
BVA pitch competition submission shipped — deck, video, teleprompter script delivered on hard deadline.
Apr 15
Shipped hooks-based architecture per advisor recommendation — replaces wrapper middleware with deterministic checkpoints.
Apr 11
Sent comprehensive Sentinel technical brief to Director of Engineering at top-5 enterprise security vendor, who manages 3,500+ developers on Claude.
Apr 09
Tuent v28 brand and product system shipped — editorial discipline, code-native lockup.

See what your agents have already done.

Free while we build the design-partner cohort. No tiers. No contracts. Direct founder access.

"We're not asking you to bet on potential. We're asking you to bet on the second rep."
Request Free Access
// Free during beta · On-prem · Policy-less · Direct founder access