Lost in Translation: Exploiting Unicode Normalization

Black Hat USA 2025 · Day 1 · Briefings

John Barnett and his daughter Isabella ("Angel Hacker"), a cybersecurity engineering student, present a systematic taxonomy of Unicode normalization vulnerabilities. These flaws let attackers bypass security filters by submitting characters that look benign to the security layer but that the application layer later transforms into dangerous content. The talk covers four attack classes (multi-byte decoding errors, overlong encoding, byte truncation, and combining/confusable characters), each tied to a real-world case study, and is accompanied by updates to ActiveScan++ and Recollapse that detect them.
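The filter-then-transform gap described above can be illustrated with a minimal Python sketch. It is not taken from the talk: the function names (`naive_filter`, `app_layer_normalize`, `truncate_to_byte`, `strict_utf8_rejects`) and the payloads are hypothetical, chosen only to show how a check on raw input can be undone by a later normalization or lossy conversion step, and how a strict decoder treats an overlong sequence.

```python
import unicodedata


def naive_filter(value: str) -> bool:
    """Hypothetical security-layer check: allow input with no ASCII angle brackets."""
    return "<" not in value and ">" not in value


def app_layer_normalize(value: str) -> str:
    """Hypothetical application-layer step: NFKC compatibility normalization."""
    return unicodedata.normalize("NFKC", value)


def truncate_to_byte(value: str) -> str:
    """Hypothetical lossy conversion that keeps only the low byte of each code point."""
    return "".join(chr(ord(ch) & 0xFF) for ch in value)


def strict_utf8_rejects(raw: bytes) -> bool:
    """Return True if Python's strict UTF-8 decoder refuses the byte sequence."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError:
        return True
    return False


# Confusable/compatibility case: fullwidth brackets U+FF1C and U+FF1E pass the
# filter, then NFKC maps them to ASCII '<' and '>'.
fullwidth_payload = "\uFF1Cscript\uFF1E"
passes_before = naive_filter(fullwidth_payload)
after_nfkc = app_layer_normalize(fullwidth_payload)

# Byte-truncation case: U+013C and U+013E share their low byte with '<' (0x3C)
# and '>' (0x3E), so dropping the high byte recreates the ASCII characters.
truncation_payload = "\u013Cscript\u013E"
passes_before_truncation = naive_filter(truncation_payload)
after_truncation = truncate_to_byte(truncation_payload)

# Overlong-encoding case: 0xC0 0xAF is a two-byte overlong form of '/'.
# A strict decoder rejects it; the risk arises with lenient decoders.
overlong_slash = b"\xc0\xaf"
rejected = strict_utf8_rejects(overlong_slash)

print(passes_before, after_nfkc)
print(passes_before_truncation, after_truncation)
print(rejected)
```

In this sketch the safer ordering is to normalize or convert first and filter afterwards, so the security check sees the same string the application will eventually use.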

AI review

Competent systematic taxonomy of Unicode normalization attack classes, with tool releases to match. The account-takeover-via-database-collation case study and the combining-character XSS variant are the highlights. The rest is well-documented historical ground re-tilled with updated examples. Good practitioner content, not novel research.
