The problem
LLM-powered products are vulnerable to prompt injection — users can smuggle instructions that override your system prompt, exfiltrate data, or hijack tool calls. Most teams either ignore it until an incident, or bolt on heavyweight security that slows every request and gives no explanation when something is blocked.
At the NatWest × Google Cloud hackathon, the brief was clear: build something a real product team could drop inline, not a research demo. The guardrail had to be fast, explainable, and shippable.
What we built
SentryML is a security layer that sits between user input and your LLM. A three-line Python SDK — configure, guard, ship — scans every message before it reaches the model. If an attack is detected, it raises a structured exception with severity, attack type, and token-level SHAP explanations so engineers know *why* something was blocked.
Under the hood: a fine-tuned DistilBERT classifier for speed, semantic similarity against known attack archetypes as a fallback, and a FastAPI service deployed to GCP Cloud Run. A React dashboard gives real-time visibility into threats, latency, and carbon cost per scan.
Results
97% detection accuracy on our evaluation set, with sub-millisecond inference latency — fast enough to run inline on every request without users noticing. Pitched to NatWest and Google Cloud industry judges, then published as open-source Python and JavaScript SDKs on PyPI and npm.
The design choice that mattered most was explainability. Blocking without context trains teams to disable the guardrail. SHAP token attribution and a human-readable threat summary mean security engineers can trust and tune the system instead of fighting it.
Stack
More writings
2026 · 12 min
Building a draggable card-canvas hero (with the code)
A breakdown of this site's hero — copy-paste code for the developer ID badge, the now-playing card, the live clock, and the floating-card primitive behind them all.
2025 — Present · 7 min
Automating Azure DevOps → Jira migration at enterprise scale
Architecture notes on Migrayt — a multi-tenant B2B SaaS with AI-driven data sanitisation and strict Zero Data Retention.