The Security scanner analyzes prompt text against categories such as injection, data exfiltration, role confusion, and instruction override. You get a risk score, findings with severity, and recommendations before you share or publish.
Routes:
- Personal: /dashboard/scanner, pending at /dashboard/scanner/pending, results at /dashboard/scanner/{scanId}
- Organization: /dashboard/organization/{orgId}/security-scanner with the same pending and detail pattern
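The route pattern can be sketched as a small path builder. This is an illustration only: the type and function names (`ScannerScope`, `scannerBase`, `pendingRoute`, `detailRoute`) are hypothetical, and the organization pending and detail paths are inferred from "the same pending and detail pattern" rather than documented explicitly.

```typescript
// Hypothetical helper: names are illustrative, not part of the product.
type ScannerScope =
  | { kind: "personal" }
  | { kind: "organization"; orgId: string };

// Base scanner route for the given scope, as listed in the routes above.
function scannerBase(scope: ScannerScope): string {
  return scope.kind === "personal"
    ? "/dashboard/scanner"
    : `/dashboard/organization/${encodeURIComponent(scope.orgId)}/security-scanner`;
}

// Pending page shown while a scan starts.
function pendingRoute(scope: ScannerScope): string {
  return `${scannerBase(scope)}/pending`;
}

// Detail page for a scan. For organizations this path is an assumption
// based on the "same pending and detail pattern" wording.
function detailRoute(scope: ScannerScope, scanId: string): string {
  return `${scannerBase(scope)}/${encodeURIComponent(scanId)}`;
}
```

Modelling the scope as a discriminated union keeps `orgId` required only where it applies, so a personal route cannot be built with a stray organization ID.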
Policies may set requireScanBeforePublish so lifecycle approval fails until a passing scan exists.
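As a rough sketch of how such a gate might behave, the check below blocks approval when the policy flag is set and no passing scan exists. Only `requireScanBeforePublish` comes from this page; the `PublishPolicy` and `ScanRecord` shapes, the `passed` field, and `canApprovePublish` are assumptions, and the real definition of a "passing" scan may differ (for example, it may be tied to a specific prompt version).

```typescript
// Assumed shapes for illustration; only requireScanBeforePublish is documented.
interface PublishPolicy {
  requireScanBeforePublish: boolean;
}

interface ScanRecord {
  scanId: string;
  passed: boolean; // hypothetical field: what counts as "passing" is not specified here
}

// Returns true when lifecycle approval may proceed under the policy.
function canApprovePublish(policy: PublishPolicy, scans: ScanRecord[]): boolean {
  if (!policy.requireScanBeforePublish) {
    return true; // policy does not gate publishing on scans
  }
  // With the flag set, at least one passing scan must exist.
  return scans.some((scan) => scan.passed);
}
```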
Step 1
Paste or load prompt text
Open the scanner, then either enter up to 4,000 characters of prompt text or prefill the input from a library prompt via query params.
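A client-side check for the 4,000-character limit might look like the sketch below. The names `MAX_PROMPT_CHARS`, `InputCheck`, and `checkPromptInput` are hypothetical, and two behaviours are assumptions: that empty input is rejected, and that the limit is measured with JavaScript's `length` (UTF-16 code units). The scanner may count characters differently.

```typescript
// The limit comes from this page; everything else here is illustrative.
const MAX_PROMPT_CHARS = 4000;

type InputCheck =
  | { ok: true; text: string }
  | { ok: false; reason: string };

// Validates prompt text before submitting a scan.
function checkPromptInput(raw: string): InputCheck {
  // Assumption: whitespace-only input is treated as empty.
  if (raw.trim().length === 0) {
    return { ok: false, reason: "Prompt text is empty." };
  }
  // Assumption: length is counted in UTF-16 code units.
  if (raw.length > MAX_PROMPT_CHARS) {
    const over = raw.length - MAX_PROMPT_CHARS;
    return { ok: false, reason: `Prompt is ${over} characters over the limit.` };
  }
  return { ok: true, text: raw };
}
```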
Step 2
Choose target model
Select the model label the scan simulates against (e.g. GPT-4o, Claude). This affects how findings are framed.
Step 3
Start scan
Submit to queue a session. You are redirected to a **pending** page while the scan starts, then to the scan detail route.
Step 4
Watch progress
Results stream in with progress events. Findings appear as they are detected; the session completes with a summary and risk badge.
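One way to picture the streaming behaviour is a reducer that folds incoming events into view state. The event names and fields below (`progress`, `finding`, `complete`, `percent`, `riskScore`, and so on) are assumptions made for illustration; this page does not specify the actual event schema or transport.

```typescript
// Hypothetical event and state shapes; the real schema is not documented here.
type Severity = "low" | "medium" | "high" | "critical";

interface Finding {
  category: string;
  severity: Severity;
  message: string;
}

type ScanEvent =
  | { type: "progress"; percent: number }
  | { type: "finding"; finding: Finding }
  | { type: "complete"; summary: string; riskScore: number };

interface ScanState {
  percent: number;
  findings: Finding[];
  done: boolean;
  summary?: string;
  riskScore?: number;
}

const initialState: ScanState = { percent: 0, findings: [], done: false };

// Folds one streamed event into the current state without mutating it.
function applyEvent(state: ScanState, event: ScanEvent): ScanState {
  switch (event.type) {
    case "progress":
      // Never move the progress bar backwards if events arrive out of order.
      return { ...state, percent: Math.max(state.percent, event.percent) };
    case "finding":
      // Findings are appended as they are detected.
      return { ...state, findings: [...state.findings, event.finding] };
    case "complete":
      // Completion carries the summary and the score behind the risk badge.
      return {
        ...state,
        percent: 100,
        done: true,
        summary: event.summary,
        riskScore: event.riskScore,
      };
  }
}
```

A UI could apply this with `events.reduce(applyEvent, initialState)` for a recorded session, or call `applyEvent` once per message as events arrive.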
Step 5
Act on results
Use **Fix with Refine** to open Refine Agent with context, test in **Playground**, or return to the library. Re-scan after edits.

