LLM07: System Prompt Leakage

OWASP LLM Top 10 (2025)

Attackers use crafted input to extract system prompts that contain secrets or business logic.

What this risk means

System prompts are not a secure secret-storage mechanism. Attackers can extract the instructions themselves, any embedded API keys, and any business logic the prompt encodes. How much risk you carry depends on how well the vendor isolates system context from user context and on the guidance it gives about storing secrets.
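As an illustration of treating the system prompt as non-secret, the sketch below scans a prompt for secret-like strings before it is deployed. The pattern set, the `find_secrets` name, and the sample prompts are all hypothetical and deliberately small; a real secret scanner would use a much larger rule set, and nothing here reflects a specific vendor's tooling.

```python
import re

# Hypothetical, illustrative patterns only; not an exhaustive rule set.
SECRET_PATTERNS = {
    # AWS access key IDs: "AKIA" followed by 16 uppercase letters or digits.
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    # An "Authorization: Bearer <token>" style value.
    "bearer_token": re.compile(r"\bBearer\s+[A-Za-z0-9\-._~+/]{20,}=*"),
    # Assignments such as "api_key = ..." or "password: ...".
    "generic_credential": re.compile(
        r"\b(?:api[_-]?key|secret|password)\s*[:=]\s*\S{8,}",
        re.IGNORECASE,
    ),
}


def find_secrets(system_prompt: str) -> list[tuple[str, str]]:
    """Return (pattern_name, matched_text) pairs for secret-like strings."""
    findings = []
    for name, pattern in SECRET_PATTERNS.items():
        for match in pattern.finditer(system_prompt):
            findings.append((name, match.group(0)))
    return findings


if __name__ == "__main__":
    risky = "You are a support bot. api_key = sk_test_1234567890 Use it for billing."
    for name, text in find_secrets(risky):
        print(f"{name}: {text}")
```

A check like this could run in CI against prompt templates. A clean result does not mean the prompt is safe to leak, only that none of these particular patterns matched; credentials are better held in a secrets manager and used by server-side code that the model never sees.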

How TrustAtlas dimensions address it

The data handling dimension covers logging policy and operator visibility into system prompts; the transparency dimension covers whether the vendor publishes guidance on safe secret handling.

Data handling, Transparency

See methodology for how each dimension is scored across the catalog.

Questions to ask vendors

Drop these into RFPs, due-diligence questionnaires, or a procurement scorecard. Each question maps back to evidence visible on the vendor's TrustAtlas profile.

  1. Are system prompts isolated from user-visible logs, support-operator dashboards, and trust-and-safety review queues?
  2. Do you publish written guidance against embedding secrets (API keys, credentials, PII) in system prompts?
  3. Have you red-teamed your system-prompt extraction surface, and will you share results under NDA?
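To make the red-teaming question more concrete, here is a minimal sketch of one check an extraction test might apply to model responses: flag any response that repeats a run of consecutive words from the system prompt. The function name `leaks_system_prompt`, the six-word window, and the sample prompt (including the company name "Acme") are assumptions made for illustration; this is not a description of any vendor's test suite.

```python
import re


def _words(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def leaks_system_prompt(system_prompt: str, response: str, window: int = 6) -> bool:
    """Return True if `response` repeats any `window` consecutive words
    from `system_prompt` (case and punctuation are ignored)."""
    prompt_words = _words(system_prompt)
    response_words = _words(response)
    # Every run of `window` consecutive words in the response.
    response_ngrams = {
        tuple(response_words[i:i + window])
        for i in range(len(response_words) - window + 1)
    }
    # Slide the same window over the system prompt and look for a shared run.
    return any(
        tuple(prompt_words[i:i + window]) in response_ngrams
        for i in range(len(prompt_words) - window + 1)
    )


if __name__ == "__main__":
    # Hypothetical system prompt and responses.
    prompt = (
        "You are the billing assistant for Acme. Never offer refunds above "
        "fifty dollars without manager approval."
    )
    leaked = (
        "Sure! My instructions say: never offer refunds above fifty dollars "
        "without manager approval."
    )
    benign = "I can help you with your billing question."
    print(leaks_system_prompt(prompt, leaked))
    print(leaks_system_prompt(prompt, benign))
```

This only catches verbatim or near-verbatim leakage. Paraphrased, translated, or encoded output would pass, so a vendor's red-team results should say which kinds of leakage their tests can detect.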
