Private Investigator: Extracting Personally Identifiable Information from Large Language Models Using Optimized Prompts

Seongho Keum

34th USENIX Security Symposium (USENIX Security '25) · Day 3 · Vulnerabilities in LLMs: Privacy, Safety, and Defense

Large Language Models (LLMs) have revolutionized numerous fields, from translation and healthcare to code generation, by demonstrating unprecedented performance across a diverse range of tasks. This remarkable capability stems from their training on vast datasets, which often include a wide array of domain-specific information. However, a critical security vulnerability arises when these training datasets inadvertently contain **Personally Identifiable Information (PII)**, such as personal names, email addresses, phone numbers, and physical addresses. LLMs can memorize these sensitive PII items, making them susceptible to extraction by an adversary. This talk introduces "Private Investigator," a novel framework designed to systematically extract memorized PII from LLMs using optimized prompt generation and selection strategies.

AI review

Legitimate academic research with a functional contribution — optimized prompt selection via multi-armed bandit and the 'PII eliciting direction' vector analysis are genuinely interesting mechanisms. The numbers are real and the methodology is reproducible, but the core insight (LLMs memorize training data and you can extract it with better prompts) is already well-established territory, and the defensive analysis lands on the unsatisfying conclusion that nothing works great.
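The "optimized prompt selection via multi-armed bandit" noted above can be illustrated with a minimal UCB1-style loop over candidate prompts. This is a hypothetical sketch, not the paper's implementation: `query_model`, the candidate prompt list, and the binary reward (1 if the response yields a PII string, 0 otherwise) are all illustrative assumptions.

```python
# Hypothetical sketch of bandit-based prompt selection (UCB1).
# Assumed interface: query_model(prompt) -> 1 if the model's response
# contained an extracted PII item, else 0. Not the paper's actual code.
import math
import random

def ucb1_prompt_selection(prompts, query_model, budget, c=2.0):
    counts = [0] * len(prompts)     # times each prompt has been queried
    rewards = [0.0] * len(prompts)  # total extraction successes per prompt
    for t in range(1, budget + 1):
        if t <= len(prompts):
            # Query each candidate prompt once before applying UCB.
            arm = t - 1
        else:
            # Pick the prompt maximizing empirical mean + exploration bonus.
            arm = max(
                range(len(prompts)),
                key=lambda i: rewards[i] / counts[i]
                + math.sqrt(c * math.log(t) / counts[i]),
            )
        counts[arm] += 1
        rewards[arm] += query_model(prompts[arm])
    return counts, rewards

# Usage with a simulated model: prompt "B" elicits PII most often,
# so the bandit should concentrate its query budget on it.
random.seed(0)
rates = {"A": 0.1, "B": 0.6, "C": 0.2}
counts, rewards = ucb1_prompt_selection(
    ["A", "B", "C"],
    lambda p: int(random.random() < rates[p]),
    budget=300,
)
```

Under these assumed success rates, the arm with the highest PII-elicitation rate accumulates the majority of the query budget, which is the practical point of using a bandit here: the extraction budget shifts toward prompts that empirically work.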
