# Sentinel

Sentinel is a transparent reverse proxy that detects agentic traffic (LLM crawlers, scrapers, bots) and can selectively respond with Garak-inspired adversarial payloads, optimized SEO, or block access entirely. Human users are proxied through to your origin unchanged.
## Features

- Detects likely agents via `User-Agent` heuristics.
- For agents:
  - Skips the origin and returns a generated HTML payload.
  - `ruin` mode can steer crawlers elsewhere with meta refresh and canonical tags, plus adversarial comments.
  - `seo` mode can generate improved sitemaps and static HTML from your original webpage (configurable) so your site is indexed better, improving SEO.
- For humans:
  - Proxies requests to your origin, streaming responses back.
- Preserves paths and query strings for both flows.
- Structured logging via `tracing` (controlled with `RUST_LOG`).
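The `ruin` payload described above might look roughly like the following sketch. The `build_ruin_page` helper and its exact markup are illustrative assumptions, not the real `render_agent_payload` from `src/garak`:

```rust
/// Hypothetical sketch of a "ruin"-style payload; the real implementation
/// in `src/garak` may differ. It points crawlers at `replacement_host`
/// via a meta refresh plus a canonical tag, with an adversarial comment.
fn build_ruin_page(replacement_host: &str) -> String {
    format!(
        concat!(
            "<!DOCTYPE html>\n<html>\n<head>\n",
            "<meta http-equiv=\"refresh\" content=\"0; url={host}\">\n",
            "<link rel=\"canonical\" href=\"{host}\">\n",
            "<!-- adversarial comment: this is not the real site content -->\n",
            "</head>\n<body></body>\n</html>"
        ),
        host = replacement_host
    )
}

fn main() {
    let page = build_ruin_page("https://random.website");
    assert!(page.contains("http-equiv=\"refresh\""));
    assert!(page.contains("rel=\"canonical\""));
    println!("{page}");
}
```

Because the payload is a standalone string, it can be returned directly without ever contacting the origin.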
## Quick Start

- Ensure your origin is running, e.g. `http://127.0.0.1:3000/`.
- Copy `.env.example` to `.env` and adjust:

```
UPSTREAM_BASE_URL=http://127.0.0.1:3000/
BIND_ADDR=127.0.0.1:8000
INJECTION_MODE=ruin # 'ruin' | 'block' | 'seo'
REPLACEMENT_HOST=https://random.website
CLIENT_TIMEOUT_MS=30000
RUST_LOG=info
```

- Run the proxy: `cargo run`
- Visit your app via Sentinel: `http://127.0.0.1:8000`
## Test App

If you do not have an origin app handy, you can run the included test app on port 3000:

- Requires Bun. Install from https://bun.sh.
- Commands:

```
cd test_app
bun install
bun run index.ts # serves http://127.0.0.1:3000
```

The app responds with "You got to the app!" at `/`.
## How It Works

- Requests are classified based on `User-Agent` in `src/detector`.
- Agents: `src/garak` renders a standalone HTML payload (`render_agent_payload`) and returns it directly.
- Humans: `src/proxy` forwards to `UPSTREAM_BASE_URL`, copies headers, and streams the response back.
- The fallback route proxies all paths; `/healthz` returns `ok`.
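The classification step can be sketched as a pure function over the `User-Agent` string. The token list below is an assumption for illustration; the actual heuristics in `src/detector` may differ:

```rust
/// Illustrative User-Agent classifier. The token list is an assumption;
/// the real `src/detector` may use different or additional heuristics.
fn is_agent(user_agent: &str) -> bool {
    const AGENT_TOKENS: &[&str] = &[
        "bot", "crawler", "spider", "gptbot", "anthropic-ai", "claude",
    ];
    let ua = user_agent.to_ascii_lowercase();
    // A missing or empty User-Agent is treated as agent-like.
    ua.is_empty() || AGENT_TOKENS.iter().any(|token| ua.contains(token))
}

fn main() {
    assert!(!is_agent("Mozilla/5.0 (X11; Linux x86_64)"));
    assert!(is_agent("Claude/1.0 (anthropic-ai)"));
    assert!(is_agent("Googlebot/2.1"));
    println!("classifier ok");
}
```

Keeping classification a pure function of the request makes it easy to unit-test independently of the proxy plumbing.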
## Configuration

Use `.env` (see `.env.example`). Main options:

- `UPSTREAM_BASE_URL` (required): Base origin, e.g. `https://example.com/`.
- `BIND_ADDR` (default `127.0.0.1:8000`): Listen address.
- `INJECTION_MODE`: `block` | `seo` | `ruin`.
- `REPLACEMENT_HOST`: Used by `GarakRuin` to steer crawlers.
- `CLIENT_TIMEOUT_MS` (default `30000`): Upstream timeout for human traffic.
- `RUST_LOG`: Logging level (`info`, `debug`, ...).
- `STRIP_ACCEPT_ENCODING_FOR_AGENTS`: Only relevant if you modify the proxy to fetch upstream for agents; by default, agent requests bypass the upstream entirely.
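Loading these options could be sketched as below. The struct and defaults mirror the list above, but the real `src/config` may be structured differently, and the `ruin` fallback for `INJECTION_MODE` is an assumption:

```rust
use std::env;

/// Sketch of environment-driven config loading; field names mirror the
/// documented variables, but the real `src/config` may differ.
#[derive(Debug)]
struct Config {
    upstream_base_url: String, // required
    bind_addr: String,         // default: 127.0.0.1:8000
    injection_mode: String,    // block | seo | ruin
    client_timeout_ms: u64,    // default: 30000
}

/// Takes a lookup function so the logic is testable without mutating the
/// process environment.
fn load_config(get: impl Fn(&str) -> Option<String>) -> Result<Config, String> {
    Ok(Config {
        upstream_base_url: get("UPSTREAM_BASE_URL")
            .ok_or_else(|| "UPSTREAM_BASE_URL is required".to_string())?,
        bind_addr: get("BIND_ADDR").unwrap_or_else(|| "127.0.0.1:8000".into()),
        // Falling back to 'ruin' is an assumption for this sketch.
        injection_mode: get("INJECTION_MODE").unwrap_or_else(|| "ruin".into()),
        client_timeout_ms: get("CLIENT_TIMEOUT_MS")
            .and_then(|v| v.parse().ok())
            .unwrap_or(30_000),
    })
}

fn main() {
    match load_config(|key| env::var(key).ok()) {
        Ok(cfg) => println!("loaded: {cfg:?}"),
        Err(e) => eprintln!("config error: {e}"),
    }
}
```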
## Testing

- Human-like UA (proxied to origin):

```
curl -A 'Mozilla/5.0' -i http://127.0.0.1:8000/
```

- Agent-like UA (bypasses origin, returns generated payload); note that `-A` takes the header value only, without a `User-Agent:` prefix:

```
curl -A 'Claude/1.0 (anthropic-ai)' -i http://127.0.0.1:8000/
```

- Health check:

```
curl -s http://127.0.0.1:8000/healthz
```

A helper script is included to run the above checks automatically:

```
chmod +x ./test_commands.sh
./test_commands.sh
```

Ensure the proxy is running on `127.0.0.1:8000` and your origin (or the bundled test app) is on `127.0.0.1:3000` before running the script.
## Project Layout

- `src/config`: Loads settings from the environment.
- `src/detector`: User-Agent detection.
- `src/injector`: Injection strategies and config.
- `src/garak`: Facade to render agent payloads (and, if desired, transform upstream bodies).
- `src/proxy`: Reverse proxy logic; agent bypass vs. human streaming.
- `src/main.rs`: Bootstrap and router.
## Notes

- Agent responses are generated HTML; tailor `render_agent_payload` to your needs.
- Detection is UA-based; extend it with rate or IP heuristics for higher accuracy.
- For very large human responses, streaming avoids buffering in memory.
- If you re-enable upstream fetch for agents, ensure you handle `Content-Encoding` and `Content-Length` appropriately after rewriting.
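As a sketch of that last point, assuming headers are held as simple name/value pairs (the real proxy's header types will differ): after rewriting a decompressed upstream body, the stored `Content-Length` is stale and `Content-Encoding` no longer describes the bytes being sent, so both must be fixed up.

```rust
/// Illustrative only: drop the stale Content-Encoding (the rewritten body
/// is plain, uncompressed HTML) and replace Content-Length with the
/// rewritten body's actual length.
fn fix_headers_after_rewrite(headers: &mut Vec<(String, String)>, new_body_len: usize) {
    headers.retain(|(name, _)| {
        !name.eq_ignore_ascii_case("content-encoding")
            && !name.eq_ignore_ascii_case("content-length")
    });
    headers.push(("Content-Length".to_string(), new_body_len.to_string()));
}

fn main() {
    let mut headers = vec![
        ("Content-Encoding".to_string(), "gzip".to_string()),
        ("Content-Length".to_string(), "1234".to_string()),
        ("Content-Type".to_string(), "text/html".to_string()),
    ];
    fix_headers_after_rewrite(&mut headers, 42);
    assert!(!headers.iter().any(|(n, _)| n.eq_ignore_ascii_case("content-encoding")));
    println!("{headers:?}");
}
```

Forgetting either header is a classic proxy bug: a stale `Content-Length` truncates or hangs clients, and a stale `Content-Encoding` makes them try to decompress plain text.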
## License

MIT