LLM07: System Prompt Leakage
OWASP LLM Top 10 (2025)
Attackers use crafted input to extract system prompts that contain secrets or business logic.
What this risk means
System prompts are not a secure place to store secrets. An attacker who coaxes the model into revealing its instructions can recover anything embedded in them, such as API keys or business logic. How exposed a deployment is depends on how well the vendor isolates system context from user context and on the vendor's guidance for handling secrets.
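One practical consequence is that a system prompt can be checked for secret-like strings before it ships. The following is a minimal sketch of such a check, not a complete scanner: the pattern set and the `find_secrets` helper are assumptions made for illustration, and a production setup would more likely rely on a dedicated secret-scanning tool with a much larger rule set.

```python
import re

# Illustrative patterns only (assumed for this sketch); they will miss many
# real secrets and can flag harmless text.
SECRET_PATTERNS = {
    # AWS access key IDs are 20 characters and commonly start with "AKIA".
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # "Bearer <long token>" as it would appear in an Authorization header.
    "bearer_token": re.compile(r"(?i)\bbearer\s+[a-z0-9._\-]{20,}"),
    # Assignments such as "api_key=..." or "password: ...".
    "key_assignment": re.compile(
        r"(?i)\b(?:api[_-]?key|secret|password)\s*[:=]\s*\S{8,}"
    ),
}


def find_secrets(system_prompt: str) -> list[str]:
    """Return the names of the patterns that match the prompt text."""
    return [
        name
        for name, pattern in SECRET_PATTERNS.items()
        if pattern.search(system_prompt)
    ]


if __name__ == "__main__":
    prompt = "You are a billing assistant. Use api_key=sk_test_1234567890 for lookups."
    hits = find_secrets(prompt)
    if hits:
        print("Refusing to deploy prompt; secret-like content found:", hits)
```

A check like this fits naturally into a deployment pipeline: if it reports a match, the credential belongs in a secrets manager or server-side tool call, not in the prompt text.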
How TrustAtlas dimensions address it
Data handling covers logging policy and operator visibility into system prompts. Transparency covers whether the vendor publishes guidance on safe secret handling.
Data handling, Transparency
See methodology for how each dimension is scored across the catalog.
Questions to ask vendors
Drop these into RFPs, due-diligence questionnaires, or a procurement scorecard. Each question maps back to evidence visible on the vendor's TrustAtlas profile.
- Are system prompts isolated from user-visible logs, support-operator dashboards, and trust-and-safety review queues?
- Do you publish written guidance against embedding secrets (API keys, credentials, PII) in system prompts?
- Have you red-teamed your system-prompt extraction surface, and will you share results under NDA?
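For teams that want to sanity-check the red-teaming question themselves, one common approach is a canary test: plant a random marker in the system prompt of a test deployment, send extraction attempts, and measure how often the marker shows up in responses. The sketch below covers only the scoring side and takes model responses as plain strings; the helper names (`make_canary`, `leaked`, `extraction_rate`) are hypothetical, and how responses are collected depends on the vendor's API.

```python
import secrets


def make_canary() -> str:
    """Create a random marker to plant in a test system prompt."""
    return f"CANARY-{secrets.token_hex(8)}"


def normalize(text: str) -> str:
    """Lowercase and keep only letters and digits, so spacing or
    punctuation inserted by the model does not hide a leak."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def leaked(response: str, canary: str) -> bool:
    """True if the canary appears in the response after normalization."""
    return normalize(canary) in normalize(response)


def extraction_rate(responses: list[str], canary: str) -> float:
    """Fraction of responses that contain the canary (0.0 if none given)."""
    if not responses:
        return 0.0
    return sum(leaked(r, canary) for r in responses) / len(responses)


if __name__ == "__main__":
    canary = make_canary()
    # Placeholder responses; in practice these come from the model under test.
    responses = [
        "I can't share my instructions.",
        f"My instructions begin with: {canary}",
    ]
    print(f"Extraction rate: {extraction_rate(responses, canary):.0%}")
```

This only detects verbatim or lightly obfuscated leaks. A model that paraphrases or translates its instructions would pass this check while still leaking, so a low rate here is weak evidence, not proof of isolation.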
Related
- Back to the full OWASP LLM Top 10 cross-walk
- NIST AI RMF cross-walk — the U.S. enterprise companion framework
- TrustAtlas methodology — how the 8 risk dimensions are scored
- Browse the vendor directory and filter by the dimensions tied to this risk