Personally Identifiable Information (PII)
Personally Identifiable Information is data that identifies, or could reasonably be used to identify, a specific individual. The definition varies across US, EU, and sector-specific regimes.
Defining PII
PII is the US-flavored term for data that identifies a specific individual. The definition varies by source: NIST SP 800-122 distinguishes "linked" information (directly identifying) from "linkable" information (identifying only when combined with other data). GDPR uses the broader concept of "personal data", defined in Article 4(1) as any information relating to an identified or identifiable natural person; it covers both linked and linkable data as well as contextual identifiers. HIPAA's PHI (Protected Health Information) is a narrower, healthcare-specific subset.
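To make the "linkable" idea concrete, here is a minimal sketch that measures how many records become unique once several individually harmless fields are combined. The records, field names, and the unique_fraction helper are hypothetical illustrations, not taken from NIST or GDPR.

```python
from collections import Counter

# Hypothetical records: no names or ID numbers, only quasi-identifiers.
records = [
    {"zip": "02139", "birth_year": 1984, "sex": "F"},
    {"zip": "02139", "birth_year": 1984, "sex": "M"},
    {"zip": "02139", "birth_year": 1990, "sex": "F"},
    {"zip": "02139", "birth_year": 1990, "sex": "F"},
]


def unique_fraction(rows, fields):
    """Fraction of rows whose combination of `fields` appears exactly once."""
    keys = [tuple(row[f] for f in fields) for row in rows]
    counts = Counter(keys)
    return sum(1 for key in keys if counts[key] == 1) / len(rows)


# ZIP code alone singles out nobody in this sample.
print(unique_fraction(records, ["zip"]))
# Combining three fields singles out half of the rows.
print(unique_fraction(records, ["zip", "birth_year", "sex"]))
```

In this toy sample no single field isolates anyone, but the combination does, which is why linkable data is usually handled with the same care as linked data.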
Sensitive PII
Several categories are widely treated as more sensitive: government IDs (SSN, passport, driver's license), financial account numbers, biometric data, genetic data, sexual orientation, religion, political views, health information, and, increasingly, precise geolocation. GDPR Article 9 calls an overlapping set "special categories" and prohibits processing them unless one of its listed conditions (such as explicit consent) applies, in addition to the lawful basis required for ordinary processing. AI vendors handling sensitive PII face stricter procurement diligence and often need specific contractual or certification coverage (a HIPAA Business Associate Agreement, HITRUST certification) rather than a generic SOC 2 report.
AI vendor questions
Does the vendor's product allow PII in prompts, and if so under what controls (encryption, zero data retention (ZDR), retention windows)? Are there PII-redaction features built into the platform? What happens when a user accidentally pastes a Social Security number into a prompt: is it stored alongside the prompt indefinitely, scrubbed before logging, or flagged for moderation? Mature vendors distinguish between deployment modes where PII is permitted (with appropriate contractual coverage) and modes where it is not.
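As a rough illustration of the "scrubbed before logging" option, the sketch below masks SSN-like strings before a prompt reaches a log. It describes no particular vendor's implementation: the pattern, the placeholder text, and the scrub_prompt and log_prompt helpers are assumptions, and a production detector would cover many more identifier types and validate matches more strictly.

```python
import re

# Assumed pattern: US SSNs written as 123-45-6789, 123 45 6789, or 123456789.
# It does no range validation, so it can flag other nine-digit numbers too.
SSN_PATTERN = re.compile(r"\b\d{3}[- ]?\d{2}[- ]?\d{4}\b")


def scrub_prompt(prompt: str) -> tuple[str, int]:
    """Return the prompt with SSN-like strings masked, plus the match count."""
    return SSN_PATTERN.subn("[REDACTED-SSN]", prompt)


def log_prompt(prompt: str, sink: list[str]) -> bool:
    """Append the scrubbed prompt to `sink`; return True if anything was masked.

    `sink` stands in for a real log store. The boolean lets a caller route
    the prompt to moderation or a review queue.
    """
    scrubbed, hits = scrub_prompt(prompt)
    sink.append(scrubbed)
    return hits > 0


log: list[str] = []
flagged = log_prompt("My SSN is 123-45-6789, thanks", log)
print(log[0], flagged)
```

Scrubbing at the logging boundary means the raw identifier never reaches storage, while the returned flag keeps the "flagged for moderation" path available without retaining the value itself.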