LOKI: Proactively Discovering Online Scam Websites by Mining Toxic Search Queries

Pujan Paudel

Network and Distributed System Security (NDSS) Symposium 2026 · Day 3 · Attacks

Online scams cause mounting financial losses worldwide, yet the discovery pipeline for scam websites has a critical bottleneck: sophisticated classifiers can identify scam sites once they are found, but **sourcing candidate websites** for analysis remains largely reactive, relying on user complaints and community-maintained lists. This talk presents **LOKI**, a data-driven framework that flips the discovery model from reactive to proactive by identifying **toxic search queries**, that is, search engine queries that disproportionately surface scam websites in their results. Starting from 1,600 known scam websites, LOKI extracts **1.2 million keywords** via the Google Ad Keywords API, trains a teacher-student distillation model to score query toxicity, and uses the top-ranked queries to discover **52,000 new scam websites** from 270,000 search results across **four search engines** (including Google, Bing, and Naver). Fewer than **5%** of these scam websites were flagged by **Google Safe Browsing**, highlighting a massive gap in current platform-level protections.
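To make the notion of a toxic query concrete, the following is a minimal sketch that assumes toxicity can be approximated as the fraction of a query's search results that land on already-labeled scam domains. This is an illustration of the quantity being ranked, not LOKI's scoring model, which the talk describes as a trained teacher-student distillation model. The function names (`domain_of`, `toxicity`, `rank_queries`), the example queries, and the `.example` domains are all hypothetical.

```python
from urllib.parse import urlparse


def domain_of(url: str) -> str:
    """Return the lowercased host of a URL, without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def toxicity(result_urls, scam_domains):
    """Fraction of a query's results whose domain is a labeled scam domain.

    Assumption: a simple labeled-fraction proxy, not the learned score
    used in the talk.
    """
    if not result_urls:
        return 0.0
    hits = sum(1 for url in result_urls if domain_of(url) in scam_domains)
    return hits / len(result_urls)


def rank_queries(results_by_query, scam_domains):
    """Return (query, toxicity) pairs sorted from most to least toxic."""
    scored = [
        (query, toxicity(urls, scam_domains))
        for query, urls in results_by_query.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical seed labels and search results, for illustration only.
SCAM_DOMAINS = {"shop-a.example", "deals-b.example"}
RESULTS = {
    "designer bag care tips": [
        "https://news.example/a",
        "https://blog.example/b",
    ],
    "cheap designer bags 90% off": [
        "https://www.shop-a.example/x",
        "https://deals-b.example/y",
        "https://news.example/z",
        "https://shop-a.example/w",
    ],
}

ranked = rank_queries(RESULTS, SCAM_DOMAINS)
for query, score in ranked:
    print(f"{score:.2f}  {query}")
```

In a proactive pipeline of the kind described above, the highest-ranked queries would be the ones re-issued to search engines to collect new candidate sites, while the unlabeled domains they return would still need a separate scam classifier before being counted as discoveries.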

AI review

A practical scam website discovery framework that uses data-driven query toxicity scoring to proactively find scam sites via search engines. The finding that Google Safe Browsing misses 95%+ of discovered scams is the most compelling result. Not offensive security, but useful intelligence tradecraft for anyone tracking fraud infrastructure.
