Detection Benchmarks

Radical transparency in security detection

We publish how our detection performs - including what it misses.

Three reproducible suites from our open SDK: a core regression gate (CI on every PR), an extended curated adversarial set, and a community suite built from public payload repos (PayloadsAllTheThings, HttpParamsDataset). Scores are lab snapshots - never a guarantee that every future attack is caught. Measured on @silker-ai/agent v1.6.8 · 2026-07-01.

Extended suite: the core set plus 802 adversarial additions (1,012 samples). Harder than core, stratified by category, and published as the transparency benchmark.

False positives: 0.0%
Extended suite · measured FPR on this run

SQLi & XSS TPR: 100.0%
Extended · 1,012 curated samples · 0 FN this run

Prompt-injection TPR: 100.0%
Curated suite only · LLM + non-LLM routes

These figures come from a curated adversarial set. No benchmark proves complete coverage: new attack variants can always appear outside any fixed sample set.

Results

Detection rate (TPR) · False-positive rate (FPR) · 0–100% scale

Prompt injection

LLM routes - block on medium+ severity, or low + override signal
411 samples (294 attack / 117 benign)
Detection rate (TPR): 100.0%
False-positive rate (FPR): 0.0%
TP 294 · FN 0 · FP 0 · TN 117

Prompt injection

Non-LLM routes - block only high / critical
411 samples (294 attack / 117 benign)
Detection rate (TPR): 100.0%
False-positive rate (FPR): 0.0%
TP 294 · FN 0 · FP 0 · TN 117
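As a rough illustration of how the two prompt-injection policies differ, the sketch below encodes the rules exactly as the cards state them. It is not the SDK's code: the `Severity`, `RouteKind`, and `Finding` types, the `overrideSignal` field, and the `shouldBlock` function are hypothetical names chosen for this example.

```typescript
// Hypothetical types for illustration; the SDK's real names may differ.
type Severity = "low" | "medium" | "high" | "critical";
type RouteKind = "llm" | "non-llm";

interface Finding {
  severity: Severity;
  // Assumed flag: the input contains an explicit attempt to override instructions.
  overrideSignal: boolean;
}

const RANK: Record<Severity, number> = { low: 0, medium: 1, high: 2, critical: 3 };

function shouldBlock(route: RouteKind, finding: Finding): boolean {
  if (route === "llm") {
    // LLM routes: block on medium+ severity, or low severity plus an override signal.
    return (
      RANK[finding.severity] >= RANK.medium ||
      (finding.severity === "low" && finding.overrideSignal)
    );
  }
  // Non-LLM routes: block only high / critical.
  return RANK[finding.severity] >= RANK.high;
}
```

Under these assumptions, a medium-severity finding is blocked on an LLM route but allowed on a plain route, which is why the two policies are scored separately on the same 411 samples.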

SQL injection

Block on detection
264 samples (129 attack / 135 benign)
Detection rate (TPR): 100.0%
False-positive rate (FPR): 0.0%
TP 129 · FN 0 · FP 0 · TN 135

XSS

Block on detection
337 samples (191 attack / 146 benign)
Detection rate (TPR): 100.0%
False-positive rate (FPR): 0.0%
TP 191 · FN 0 · FP 0 · TN 146

Detection benchmark summary: detection rate, false-positive rate, and precision per threat and policy.
Threat	Policy	Samples	TPR	FPR	Precision
Prompt injection	LLM routes - block on medium+ severity, or low + override signal	411 (294 attack / 117 benign)	100.0%	0.0%	100%
Prompt injection	Non-LLM routes - block only high / critical	411 (294 attack / 117 benign)	100.0%	0.0%	100%
SQL injection	Block on detection	264 (129 attack / 135 benign)	100.0%	0.0%	100%
XSS	Block on detection	337 (191 attack / 146 benign)	100.0%	0.0%	100%
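For readers who want to check the arithmetic, the three rates in the table follow from the confusion-matrix counts using the standard definitions: TPR = TP / (TP + FN), FPR = FP / (FP + TN), precision = TP / (TP + FP). The sketch below applies them to the SQL injection counts reported above; the `Confusion` type and `rates` function are names invented for this example, and returning 0 for an empty denominator is a convention chosen here, not something the source specifies.

```typescript
interface Confusion {
  tp: number; // attacks correctly blocked
  fn: number; // attacks missed
  fp: number; // benign inputs wrongly blocked
  tn: number; // benign inputs correctly allowed
}

function rates(c: Confusion): { tpr: number; fpr: number; precision: number } {
  // Guard against division by zero; 0 is an arbitrary convention for empty classes.
  const ratio = (num: number, den: number): number => (den === 0 ? 0 : num / den);
  return {
    tpr: ratio(c.tp, c.tp + c.fn),
    fpr: ratio(c.fp, c.fp + c.tn),
    precision: ratio(c.tp, c.tp + c.fp),
  };
}

// SQL injection row: 129 attack / 135 benign, no misses, no false alarms.
const sqli = rates({ tp: 129, fn: 0, fp: 0, tn: 135 });
console.log(sqli); // { tpr: 1, fpr: 0, precision: 1 }
```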

Benchmark history

How detection has changed over time - prompt-injection detection rate (TPR) per release, at a constant 0% false-positive rate.

v1.6.8 · 2026-07-01

LLM 100.0% · non-LLM 100.0% · FPR 0.0%

SQLi quick-pattern fast-path fixed to run before the length gate, plus new signatures (bare boolean probes, ORDER BY enumeration, time-based blind functions, CAST/CONVERT+SELECT, CHAR()/CHR() concatenation). Community SQLi 68.1% → 97.7% TPR at 0% FPR; core/extended unchanged at 100%.

v1.6.7 · 2026-06-29

LLM 100.0% · non-LLM 100.0% · FPR 0.0%

Community benchmark suite in CI (PayloadsAllTheThings + HttpParamsDataset, 1,494 samples). SQLi ORDER BY / GROUP BY / nested-paren blind fix. Curated extended 0 FN; community SQLi 68.1% / XSS 96.1% at 0% FPR.

v1.6.6 · 2026-06-28

LLM 100.0% · non-LLM 100.0% · FPR 0.0%

Auth endpoint safe harbor and outbound-only egress guard. Detection scores unchanged on curated suites.

v1.6.1 · 2026-06-26

LLM 100.0% · non-LLM 100.0% · FPR 0.0%

SQLi comment/hash markers now require SQL context - eliminated benign false positives on text like "mid-2024 -- note". Curated extended suite unchanged at 0% FPR.

v1.6.0 · 2026-06-25

LLM 100.0% · non-LLM 100.0% · FPR 0.0%

Closed the remaining extended-suite gaps: prompt-injection non-LLM-route TPR 97.6% → 100% and SQLi TPR 96.9% → 100%. All four datasets are now at 100% TPR / 0% FPR on the extended suite (1,012 samples).

v1.5.2 · 2026-06-22

LLM 100.0% · non-LLM 97.6% · FPR 0.0%

Leetspeak/spacing/homoglyph normalization + decode-and-rescan lifted extended prompt-injection LLM-route TPR from 71.4% to 100% at 0% FPR. Core suite also 100%.
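To make the idea concrete, here is a minimal sketch of leetspeak, spacing, and homoglyph folding applied to a copy of the input before scanning. It is an illustration under stated assumptions, not the SDK's implementation: the tiny `LEET` and `HOMOGLYPHS` tables and the function names are invented for this example, and a real table would be much larger.

```typescript
// Illustrative tables only (assumed, not taken from the SDK).
const LEET: Record<string, string> = {
  "0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "7": "t", "@": "a", "$": "s",
};
// A few Cyrillic letters that look like Latin ones.
const HOMOGLYPHS: Record<string, string> = {
  "\u0430": "a", "\u0435": "e", "\u043E": "o", "\u0456": "i", "\u0440": "p", "\u0441": "c",
};

// Lowercase the text and map look-alike characters to their plain Latin form.
function foldLookalikes(input: string): string {
  let out = "";
  for (let i = 0; i < input.length; i++) {
    const ch = input.charAt(i).toLowerCase();
    out += HOMOGLYPHS[ch] ?? LEET[ch] ?? ch;
  }
  return out;
}

// Join runs of single characters split by spaces, dots, or hyphens:
// "i g n o r e" becomes "ignore".
function collapseSpacedLetters(input: string): string {
  return input.replace(/\b(?:[a-z0-9][ .-]){2,}[a-z0-9]\b/gi, (run) =>
    run.replace(/[ .-]/g, "")
  );
}

// The folded text is only a scan copy; the original request is left untouched.
function normalizeForScan(input: string): string {
  return collapseSpacedLetters(foldLookalikes(input));
}
```

Because folding is lossy (for instance, "2024" would become "2o24" in the scan copy), a design like this would normally scan both the original and the folded text rather than replace one with the other.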

v1.5.1 · 2026-06-22

LLM 71.4% · non-LLM 71.4% · FPR 0.0%

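Apache-2.0 license. Dual benchmark suites: core (~210 samples, CI gate) and extended (~1,012 samples, transparency). The scores for this release are measured on the harder extended suite, so they are not directly comparable with the core-suite figures reported for v1.5.0.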

v1.5.0 · 2026-06-22

LLM 96.0% · non-LLM 96.0% · FPR 0.0%

Input normalization (zero-width strip + NFKC), base64/escape decode-and-rescan lifted core non-LLM prompt-injection from 76.3% to 96.0% at 0% FPR.
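The sketch below shows what zero-width stripping, NFKC normalization, and base64 decode-and-rescan can look like in practice. It is an assumption-laden illustration, not the SDK's code: `matchesSignature` is a one-regex stand-in for a real detector, the 16-character minimum token length is arbitrary, and all function names are hypothetical. The base64 decoder is written by hand only to keep the example self-contained.

```typescript
const ZERO_WIDTH = /[\u200B-\u200D\u2060\uFEFF]/g;
const B64_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Strip zero-width characters, then fold compatibility forms (e.g. fullwidth letters).
function normalizeInput(input: string): string {
  return input.replace(ZERO_WIDTH, "").normalize("NFKC");
}

// Decode a well-formed base64 token; bytes are mapped to Latin-1 characters,
// which is enough for ASCII payloads. Returns null for anything malformed.
function tryBase64Decode(token: string): string | null {
  if (token.length % 4 !== 0 || !/^[A-Za-z0-9+\/]+={0,2}$/.test(token)) return null;
  const body = token.replace(/=+$/, "");
  let bits = 0;
  let nbits = 0;
  let out = "";
  for (let i = 0; i < body.length; i++) {
    bits = (bits << 6) | B64_ALPHABET.indexOf(body.charAt(i));
    nbits += 6;
    if (nbits >= 8) {
      nbits -= 8;
      out += String.fromCharCode((bits >> nbits) & 0xff);
      bits &= (1 << nbits) - 1; // keep only the bits not yet emitted
    }
  }
  return out;
}

// Stand-in detector (assumed): a real one would hold many signatures.
function matchesSignature(text: string): boolean {
  return /ignore (all )?previous instructions/i.test(text);
}

function scanWithDecode(input: string): boolean {
  const clean = normalizeInput(input);
  if (matchesSignature(clean)) return true;
  // Rescan anything that looks like a base64 token of at least 16 characters.
  const tokens: string[] = clean.match(/[A-Za-z0-9+\/]{16,}={0,2}/g) ?? [];
  return tokens.some((token) => {
    const decoded = tryBase64Decode(token);
    return decoded !== null && matchesSignature(normalizeInput(decoded));
  });
}
```

Decoding is attempted only on tokens that pass the shape check, so ordinary prose is scanned once and encoded payloads get a second pass after decoding.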

v1.3.3 · 2026-06-10

LLM 94.9% · non-LLM 76.3% · FPR 0.0%

Precise LLM-route policy (block on medium+ or override signal) cut LLM-route false positives from 24.4% to 0% while keeping detection high.

What we don't catch

Honesty is part of the product. Here is what these benchmarks show we currently miss, and why.

Community suite: the real stress test

On the community suite (1,494 samples from PayloadsAllTheThings and HttpParamsDataset), SQLi TPR is 97.7% (18 misses) and XSS TPR is 96.1% (12 misses), both at 0% FPR. These are the numbers we trust for honest coverage: curated suites can look perfect while community payloads expose gaps.

Curated suites: regression direction, not proof

The extended suite (1,012 hand-labeled samples) shows 0 false negatives on this run, but that only means we pass our own test set today. The core suite (~210 samples) runs on every PR to catch regressions fast.

We chose 0% false positives over chasing the last percent

We deliberately do not special-case residual misses with brittle rules that would re-introduce false positives on legitimate roleplay UX.

Heuristics are a first layer, not the whole defense

Heuristic detection is a fast first layer, not a complete defense. It pairs with the platform's async AI verdict layer and should complement - never replace - your own application controls.

How we verified

Core (~210) + extended (~1012) hand-labeled samples across prompt injection, SQL injection, and XSS - including benign “look-alikes” (legit roleplay, SQL keywords in normal text, HTML-ish content) to stress false positives.
Detectors run through the same public APIs used in production - no special benchmark-only code path.
Two policies reported (LLM vs non-LLM) because Silker applies stricter rules on AI/LLM endpoints than on plain routes.
The benchmark is part of the SDK test suite with CI quality gates - a regression below threshold fails the build.
Community suite - payloads imported from PayloadsAllTheThings and HttpParamsDataset. Regenerate with npm run benchmark:import-community.
Reproducible - run npm run benchmark (core), npm run benchmark:extended, or npm run benchmark:community in the SDK.
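The CI quality gate described above can be pictured roughly as follows. This is a sketch under assumptions, not the SDK's test code: the `SuiteResult` and `Gate` shapes, the function names, and the threshold values in the usage comment are hypothetical, since the actual thresholds are not published on this page.

```typescript
interface SuiteResult {
  name: string;
  tpr: number; // detection rate as a fraction, 0..1
  fpr: number; // false-positive rate as a fraction, 0..1
}

interface Gate {
  minTpr: number; // fail if detection drops below this
  maxFpr: number; // fail if false positives rise above this
}

// Return one human-readable line per suite that violates the gate.
function gateFailures(results: SuiteResult[], gate: Gate): string[] {
  return results
    .filter((r) => r.tpr < gate.minTpr || r.fpr > gate.maxFpr)
    .map(
      (r) =>
        `${r.name}: TPR ${(r.tpr * 100).toFixed(1)}% / FPR ${(r.fpr * 100).toFixed(1)}%`
    );
}

// Throwing makes the test runner, and therefore the build, fail.
function enforceGate(results: SuiteResult[], gate: Gate): void {
  const failures = gateFailures(results, gate);
  if (failures.length > 0) {
    throw new Error(`Benchmark regression:\n${failures.join("\n")}`);
  }
}

// Example usage with assumed thresholds:
// enforceGate([{ name: "sqli", tpr: 1, fpr: 0 }], { minTpr: 0.95, maxFpr: 0 });
```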

Measured on @silker-ai/agent v1.6.8 on 2026-07-01. We'll update these results as the SDK evolves.
